Unweaving the Rainbow: Science, Delusion and the Appetite for Wonder - Richard Dawkins (2000)
Chapter 11. REWEAVING THE WORLD
Since my education began I have always had things described to me with their colors and sounds, by one with keen senses and a fine feeling for the significant. Therefore, I habitually think of things as colored and resonant. Habit accounts for part. The soul sense accounts for another part. The brain with its five-sensed construction asserts its right and accounts for the rest. Inclusive of all, the unity of the world demands that color be kept in it whether I have cognizance of it or not. Rather than be shut out, I take part in it by discussing it, happy in the happiness of those near to me who gaze at the lovely hues of the sunset or the rainbow.
HELEN KELLER, The Story of My Life (1902)
Where the gene pool of a species is sculpted into a set of models of ancestral worlds, the brain of an individual houses a parallel set of models of the animal's own world. Both are equivalent to descriptions of the past, and both are used to aid survival into the future. The difference is one of timescale and of relative privacy. The genetic description is a collective memory belonging to the species as a whole, going back into the indefinite past. The memory of the brain is private and contains the individual's experiences since it was born.
Our subjective knowledge of a familiar place does indeed feel to us like a model of the place. Not an accurate scale model, certainly less accurate than we think it is, but a serviceable model for the purposes required. One way to approach this idea was proposed some years ago by the Cambridge physiologist Horace Barlow, incidentally a direct descendant of Charles Darwin. Barlow is especially interested in vision and his argument starts from the realization that to recognize an object is a much more difficult problem than we, who seem to see so effortlessly, ordinarily understand.
For we are blissfully unaware of what a formidably clever thing we do every second of our waking lives when we see and recognize objects. The sense organs' task of unweaving the physical stimuli that bombard them is easy compared with the brain's task of reweaving an internal model of the world that it can then make use of. The argument holds for all our sensory systems, but I'll stick mostly to vision because that is the one that means the most to us.
Think what a problem our brain solves when it recognizes something, say a letter A. Or think of the problem of recognizing a particular person's face. By long in-group convention, the hypothetical face we are talking about is assumed to belong to the grandmother of the distinguished neurobiologist J. Lettvin, but substitute any face you know, or indeed any object you can recognize. We are not concerned here with subjective consciousness, with the philosophically hard problem of what it means to be aware of your grandmother's face. Just a cell in the brain which fires if and only if the grandmother's face appears on the retina will do nicely for a start, and it is very difficult to arrange. It would be easy if we could assume that the face would always fall exactly on a particular part of the retina. There could be a keyhole arrangement, with a grandmother-shaped region of cells on the retina wired up to a grandmother-signalling cell in the brain. Other cells—members of the 'anti-keyhole'—would have to be wired up in inhibitory fashion, otherwise the central nervous cell would respond to a white sheet just as strongly as to the grandmother's face which—together with all other conceivable images—it would necessarily 'contain'. The essence of responding to a key image is to avoid responding to everything else.
The keyhole strategy is ruled out by sheer force of numbers. Even if Lettvin needed to recognize nothing but his grandmother, how could he cope when her image falls on a different part of the retina? How cope with her image's changing size and shape as she approaches or recedes, as she turns sideways, or cants to the rear, as she smiles or as she frowns? If we add up all possible combinations of keyholes and anti-keyholes, the number enters the astronomical range. When you realize that Lettvin can recognize not only his grandmother's face but hundreds of other faces, the other bits of his grandmother and of other people, all the letters of the alphabet, all the thousands of objects to which a normal person can instantly give a name, in all possible orientations and apparent sizes, the explosion of triggering cells gets rapidly out of hand. The American psychologist Fred Attneave, who had come up with the same general idea as Barlow, dramatized the point by the following calculation: if there were just one brain cell to cope, keyhole fashion, with each image that we can distinguish in all its presentations, the volume of the brain would have to be measured in cubic light years.
How then, with a brain capacity measured only in hundreds of cubic centimetres, do we do it? The answer was proposed in the 1950s by Barlow and Attneave independently. They suggested that nervous systems exploit the massive redundancy in all sensory information. Redundancy is jargon from the world of information theory, originally developed by engineers concerned with the economics of telephone line capacity. Information, in the technical sense, is surprise value, measured as the inverse of expected probability. Redundancy is the opposite of information, a measure of unsurprisingness, of old-hatitude. Redundant messages or parts of messages are not informative because the receiver, in some sense, already knows what is coming. Newspapers do not carry headlines saying, 'The sun rose this morning'. That would convey almost zero information. But if a morning came when the sun did not rise, headline writers would, if any survived, make much of it. The information content would be high, measured as the surprise value of the message. Much of spoken and written language is redundant—hence possible condense telegraphese: redundancy lost, information preserved.
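To put rough numbers on 'surprise value': in Shannon's formulation the information in a message of probability p is usually taken as the logarithm of the inverse probability, log2(1/p) bits, rather than the bare inverse. The sketch below assumes that definition. The function name and the two headline probabilities are invented purely for illustration.

```python
import math

def surprisal_bits(p: float) -> float:
    """Information content, in bits, of a message with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return math.log2(1.0 / p)

# Illustrative, invented probabilities (not measurements).
p_sun_rose = 0.999999      # near-certain message
p_sun_failed = 0.000001    # wildly unexpected message

print(surprisal_bits(p_sun_rose))    # close to zero bits
print(surprisal_bits(p_sun_failed))  # roughly 20 bits
```

On this measure the expected headline carries almost nothing, while the unexpected one carries about as much as twenty fair coin tosses.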
Everything that we know about the world outside our skulls comes to us via nerve cells whose impulses chatter like machine guns. What passes along a nerve cell is a volleying of 'spikes', impulses whose voltage is fixed (or at least irrelevant) but whose rate of arriving varies meaningfully. Now let's think about coding principles. How would you translate information from the outside world, say, the sound of an oboe or the temperature of a bath, into a pulse code? A first thought is a simple rate code: the hotter the bath, the faster the machine gun should fire. The brain, in other words, would have a thermometer calibrated in pulse rates. Actually, this is not a good code because it is uneconomical with pulses. By exploiting redundancy, it is possible to devise codes that convey the same information at a cost of fewer pulses. Temperatures in the world mostly stay the same for long periods at a time. To signal 'It is hot, it is hot, it is still hot...' by a continuously high rate of machine-gun pulses is wasteful; it is better to say, 'It has suddenly become hot' (now you can assume that it will stay the same until further notice).
And, satisfyingly, this is what nerve cells mostly do, not just for signalling temperature but for signalling almost everything about the world. Most nerve cells are biased to signal changes in the world. If a trumpet plays a long sustained note, a typical nerve cell telling the brain about it would show the following pattern of impulses: Before the trumpet starts, low firing rate; immediately after the trumpet starts, high firing rate; as the trumpet carries on sustaining its note, the firing rate dies away to an infrequent mutter; at the moment when the trumpet stops, high firing rate, dying away to a resting mutter again. Or there might be one class of nerve cells that fire only at the onset of sounds and a different class of cells that fire only when sounds go off. Similar exploitation of redundancy—screening out of the sameness in the world—goes on in cells that tell the brain about changes in light, changes in temperature, changes in pressure. Everything about the world is signalled as change, and this is a major economy.
But you and I don't seem to hear the trumpet die away. To us the trumpet seems to carry on at the same volume and then to stop abruptly. Yes, of course. That's what you'd expect because the coding system is ingenious. It doesn't throw away information, it only throws away redundancy. The brain is told only about changes, and it is then in a position to reconstruct the rest. Barlow doesn't put it like this, but we could say that the brain constructs a virtual sound, using the messages supplied by the nerves coming from the ears. The reconstructed virtual sound is complete and unabridged, even though the messages themselves are economically stripped down to information about changes. The system works because the state of the world at a given time is usually not greatly different from the preceding second. Only if the world changed capriciously, randomly and frequently, would it be economical for sense organs to signal continuously the state of the world. As it is, sense organs are set up to signal, economically, the discontinuities in the world; and the brain, assuming correctly that the world doesn't change capriciously and at random, uses the information to construct an internal virtual reality in which the continuity is restored.
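The logic of signalling only changes, and leaving the receiver to restore the continuity, can be sketched in a few lines. This is a toy under stated assumptions, not a model of real neurones: the loudness samples are invented, the function names are hypothetical, and a real nerve cell signals change by a burst of firing rather than by an explicit (time, value) pair.

```python
def encode_changes(samples):
    """Return (index, new_value) pairs only where the signal changes."""
    events = []
    previous = None
    for i, value in enumerate(samples):
        if i == 0 or value != previous:
            events.append((i, value))
        previous = value
    return events

def reconstruct(events, length):
    """Rebuild the full signal by holding each value until the next change."""
    pending = dict(events)
    signal = []
    current = None
    for i in range(length):
        if i in pending:
            current = pending[i]
        signal.append(current)
    return signal

# Hypothetical loudness samples: silence, a sustained note, silence again.
trumpet = [0, 0, 0, 7, 7, 7, 7, 7, 7, 0, 0, 0]
events = encode_changes(trumpet)
print(events)  # [(0, 0), (3, 7), (9, 0)]
assert reconstruct(events, len(trumpet)) == trumpet
```

Twelve samples are carried by three messages, and nothing is lost. The saving vanishes if the samples change at random from moment to moment, which is the point about a capricious world.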
The world presents an equivalent kind of redundancy in space, and the nervous system uses the corresponding trick. Sense organs tell the brain about edges and the brain fills in the boring bits between. Suppose you are looking at a black rectangle on a white background. The whole scene is projected on to your retina—you can think of the retina as a screen covered with a dense carpet of tiny photocells, the rods and cones. In theory, each photocell could report to the brain the exact state of the light falling upon it. But the scene we are looking at is massively redundant. Cells registering black are overwhelmingly likely to be surrounded by other cells registering black. Cells registering white are nearly all surrounded by other white-signalling cells. The important exceptions are cells on edges. Those on the white side of an edge signal white themselves and so do their neighbours that sit further into the white area. But their neighbours on the other side are in the black area. The brain can theoretically reconstruct the whole scene if just the retinal cells on edges fire. If this could be achieved there would be massive savings in nerve impulses. Once again, redundancy is removed and only information gets through.
Elegantly, the economy is achieved in practice by the mechanism known as 'lateral inhibition'. Here's a simplified version of the principle, using our analogy of the screen of photocells. Each photocell sends one long wire to the central computer (brain) and also short wires to its immediate neighbours in the photocell screen. The short connections to the neighbours inhibit them, that is, turn down their firing rate. It is easy to see that maximal firing will come only from cells that lie along edges, for they are inhibited from one side only. Lateral inhibition of this kind is common among the low-level units of both vertebrate and invertebrate eyes.
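A one-dimensional toy version of the photocell screen shows why the edge cells fire hardest. It assumes each cell's output is its own light input minus a fixed fraction k of each immediate neighbour's input; the value k = 0.25, the clipping at zero and the padding at the ends of the strip are all choices made for the illustration, not measured properties of any eye.

```python
def lateral_inhibition(light, k=0.25):
    """Each cell's output is its own input minus k times each neighbour's input.

    The ends are padded with their own value so the borders of the array are
    not mistaken for edges. Negative results are clipped to zero, since a
    cell cannot fire at a negative rate.
    """
    padded = [light[0]] + list(light) + [light[-1]]
    out = []
    for i in range(1, len(padded) - 1):
        drive = padded[i] - k * (padded[i - 1] + padded[i + 1])
        out.append(max(0.0, drive))
    return out

# A strip across the scene: white (1), a black bar (0), white again.
strip = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
response = lateral_inhibition(strip)
print(response)
# [0.5, 0.5, 0.5, 0.75, 0.0, 0.0, 0.0, 0.0, 0.75, 0.5, 0.5, 0.5]
```

Cells deep in the white area are inhibited from both sides and settle at 0.5. The two white cells bordering the black bar are inhibited from one side only and reach 0.75, so the strongest signals mark the edges.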
Once again, we could say that the brain constructs a virtual world which is more complete than the picture relayed to it by the senses. The information which the senses supply to the brain is mostly information about edges. But the model in the brain is able to reconstruct the bits between the edges. As in the case of discontinuities in time, an economy is achieved by the elimination—and later reconstruction in the brain—of redundancy. This economy is possible only because uniform patches exist in the world. If the shades and colours in the world were randomly dotted about, no economical remodelling would be possible.
Another kind of redundancy stems from the fact that many lines in the real world are straight, or curved in smooth and therefore predictable (or mathematically reconstructable), ways. If the ends of a line are specified, the middle can be filled in using a simple rule that the brain already 'knows'. Among the nerve cells that have been discovered in the brains of mammals are the so-called 'line-detectors', neurones that fire whenever a straight line, aligned in a particular direction, falls on a particular place in the retina, the so-called 'retinal field' of the brain cell. Each of these line-detector cells has its own preferred direction. In the cat brain, there are only two preferred directions, horizontal and vertical, with an approximately equal number favouring each direction; however, in monkeys other angles are accommodated. From the point of view of the redundancy argument, what is going on here is as follows. In the retina, all the cells along a straight line fire and most of these impulses are redundant. The nervous system economizes by using a single cell to register the line, labelled with its angle. Straight lines are economically specified by their position and direction alone, or by their ends, not by the light value of every point along their length. The brain reweaves a virtual line in which the points along the line are reconstructed.
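The 'simple rule' for filling in the middle of a straight line from its ends can be written down directly. The sketch below uses ordinary linear interpolation as a stand-in; nobody is suggesting the brain computes it this way, and the function name and coordinates are hypothetical.

```python
def reconstruct_line(start, end, n_points):
    """Fill in n_points evenly spaced points between two endpoints (inclusive)."""
    if n_points < 2:
        raise ValueError("need at least the two endpoints")
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(n_points):
        t = i / (n_points - 1)  # 0.0 at the start, 1.0 at the end
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

print(reconstruct_line((0, 0), (4, 2), 5))
# [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.5), (4.0, 2.0)]
```

Two points go in and as many as are wanted come out, which is the economy: the intermediate points never needed to be transmitted.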
However, if a part of a scene suddenly detaches itself from the rest and starts to crawl over the background, it is news and should be signalled. Biologists have indeed discovered nerve cells that are silent until something moves against a still background. These cells don't respond when the entire scene moves—that would correspond to the sort of apparent movement the animal would see when it itself moves. But movement of a small object against a still background is information-rich and there are nerve cells tuned to detect it. The most famous of these are the so-called 'bug-detectors' discovered in frogs by Lettvin (he of the grandmother) and his colleagues. A bug-detector is a cell which is apparently blind to everything except the movement of small objects against their background. As soon as an insect moves in the field covered by a bug-detector, the cell immediately initiates massive signalling and the frog's tongue is likely to shoot out to catch the insect. To a sufficiently sophisticated nervous system, though, even the movement of a bug is redundant if it is movement in a straight line. Once you've been told that a bug is moving steadily in a northerly direction, you can assume that it will continue to move in this direction until further notice. Carrying the logic a step further, we should expect to find higher-order movement detector cells in the brain that are especially sensitive to change in movement, say, change in direction or change in speed. Lettvin and his colleagues found a cell that seems to do this, again in the frog. In their paper in Sensory Communication (1961) they describe a particular experiment as follows:
Let us begin with an empty gray hemisphere for the visual field. There is usually no response of the cell to turning on and off the illumination. It is silent. We bring in a small dark object, say 1 to 2 degrees in diameter, and at a certain point in its travel, almost anywhere in the field, the cell suddenly 'notices' it. Thereafter, wherever that object is moved it is tracked by the cell. Every time it moves, with even the faintest jerk, there is a burst of impulses that dies down to a mutter that continues as long as the object is visible. If the object is kept moving, the bursts signal discontinuities in the movement, such as the turning of corners, reversals, and so forth, and these bursts occur against a continuous background mutter that tells us the object is visible to the cell...
To summarize, it is as if the nervous system is tuned at successive hierarchical levels to respond strongly to the unexpected, weakly or not at all to the expected. What happens at successively higher levels is that the definition of that which is expected becomes progressively more sophisticated. At the lowest level, every spot of light is news. At the next level up, only edges are 'news'. At a higher level still, since so many edges are straight, only the ends of edges are news. Higher again, only movement is news. Then only changes in rate or direction of movement. In Barlow's terms derived from the theory of codes, we could say that the nervous system uses short, economical words for messages that occur frequently and are expected; long, expensive words for messages that occur rarely and are not expected. It is a bit like language, in which (the generalization is called Zipf's Law) the shortest words in the dictionary are the ones most often used in speech. To push the idea to an extreme, most of the time the brain does not need to be told anything because what is going on is the norm. The message would be redundant. The brain is protected from redundancy by a hierarchy of filters, each filter tuned to remove expected features of a certain kind.
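The principle of short words for frequent messages and long words for rare ones is what engineers build into variable-length codes. As a hedged illustration, here is Huffman coding, a standard textbook technique chosen for this sketch and not something attributed to Barlow. The four 'messages' and their frequencies are invented.

```python
import heapq
from itertools import count

def huffman_code(frequencies):
    """Build a prefix code in which frequent symbols get shorter codewords."""
    tiebreak = count()  # unique counter so the heap never compares the dicts
    heap = [(freq, next(tiebreak), {symbol: ""})
            for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        (_, _, only), = heap
        return {symbol: "0" for symbol in only}
    while len(heap) > 1:
        # Merge the two rarest groups, prefixing their codewords with 0 and 1.
        f1, _, codes1 = heapq.heappop(heap)
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Invented message frequencies for a hypothetical sensory channel.
freqs = {"nothing new": 90, "edge": 6, "movement": 3, "change of direction": 1}
codes = huffman_code(freqs)
for message, word in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(f"{word:>4}  {message}")
```

With these made-up frequencies the commonest message gets a one-symbol word and the two rarest get three symbols each, so the average cost per message is far below what a fixed-length code would need.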
It follows that the set of nervous filters constitutes a kind of summary description of the norm, of the statistical properties of the world in which the animal lives. It is the nervous equivalent of our insight of the previous chapter: that the genes of a species come to constitute a statistical description of the worlds in which its ancestors were naturally selected. Now we see that the sensory coding units with which the brain confronts the environment also constitute a statistical description of that environment. They are tuned to discount the common and emphasize the rare. Our hypothetical zoologist of the future should therefore be able, by inspecting the nervous system of an unknown animal and measuring the statistical biases in its tuning, to reconstruct the statistical properties of the world in which the animal lived, to read off what is common and what rare in the animal's world.
The inference would be indirect, in the same way as for the case of the genes. We would not be reading the animal's world as a direct description. Rather, we'd infer things about the animal's world by inspecting the glossary of abbreviations that its brain used to describe it. Civil servants love acronyms like CAP (Common Agricultural Policy) and HEFCE (Higher Education Funding Council for England); fledgling bureaucrats surely need a glossary of such abbreviations, a codebook. If you find such a codebook dropped in the street, you could work out which ministry it came from by seeing which phrases have been granted abbreviations, presumably because they are commonly used in that ministry. An intercepted codebook is not a particular message about the world, but it is a statistical summary of the kind of world which this code was designed to describe economically.
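The inference from codebook to world can be made concrete under one loudly stated assumption: that the code is an efficient binary one, in which a codeword of n symbols suggests a message probability of about 2 to the power of minus n. The codebook below is invented, and the function name is hypothetical.

```python
def implied_probabilities(codebook):
    """Estimate how common each phrase is from the length of its codeword.

    Assumes an efficient binary code, where a codeword of n symbols suggests
    a probability of about 2 ** -n. Results are normalised to sum to 1.
    """
    raw = {phrase: 2.0 ** -len(code) for phrase, code in codebook.items()}
    total = sum(raw.values())
    return {phrase: weight / total for phrase, weight in raw.items()}

# A hypothetical intercepted codebook (phrases and codes are invented).
codebook = {
    "no change": "1",
    "edge": "01",
    "movement": "001",
    "change of direction": "000",
}
for phrase, p in implied_probabilities(codebook).items():
    print(f"{phrase}: {p:.3f}")
```

Nothing in the codebook is a message about the world on any particular occasion, yet the lengths of its entries alone suggest which events its designer expected to be common and which rare.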
We can think of each brain as equipped with a store cupboard of basic images, useful for modelling important or common features of the animal's world. Although, following Barlow, I have emphasized learning as the means by which the store cupboard is stocked, there is no reason why natural selection itself, working on genes, should not do some of the work of filling up the cupboard. In this case, following the logic of the previous chapter, we should say that the store cupboard in the brain contains images from the ancestral past of the species. We could call it a collective unconscious, if the phrase had not become tarnished by association.
But the biases of the image kit in the cupboard will not only reflect what is statistically unexpected in the world. Natural selection will ensure that the repertoire of virtual representations is also well endowed with images that are of particular salience or importance in the life of the particular kind of animal and in the world of its ancestors, even if these are not especially common. An animal may need only once in its life to recognize a complicated pattern, say the shape of a female of its species, but on that occasion it is vitally important to get it right, and do so without delay. For humans, faces are of special importance, as well as being common in our world. The same is true of social monkeys. Monkey brains have been found to possess a special class of cells which fire at full strength only when presented with a complete face. We've already seen that humans with particular kinds of localized brain damage experience a very peculiar, and revealing, kind of selective blindness. They can't recognize faces. They can see everything else, apparently normally, and they can see that a face has a shape, with features. They can describe the nose, the eyes and the mouth. But they can't recognize the face even of the person they love best in all the world.
Normal people not only recognize faces. We seem to have an almost indecent eagerness to see faces, whether they are really there or not. We see faces in damp patches on the ceiling, in the contours of a hillside, in clouds or in Martian rocks. Generations of moongazers have been led, by the most unpromising of raw materials, to invent a face in the pattern of craters on the moon. The Daily Express (London) of 15 January 1998 bestowed most of a page, complete with banner headline, on the story that an Irish cleaning woman saw the face of Jesus in her duster: 'Now a stream of pilgrims is expected at her semi-detached home ... The woman's parish priest said, "I've never seen anything like it before in my 34 years in the priesthood."' The accompanying photograph shows a pattern of dirty polish on a cloth which slightly resembles a face of some kind: there is a faint suggestion of an eye on one side of what could be a nose; there is also a sloping eyebrow on the other side which gives it a look of Harold Macmillan, although I suppose even Harold Macmillan might look like Jesus to a suitably prepared mind. The Express reminds us of similar stories, including the 'nun bun' served up in a Nashville café, which 'resembled the face of Mother Teresa, 86' and caused great excitement until 'the aged nun wrote to the café demanding the bun be removed'.
The eagerness of the brain to construct a face, when offered the slightest encouragement, fosters a remarkable illusion. Get an ordinary mask of a human face—President Clinton's face, or whatever is on sale for fancy dress parties. Stand it up in a good light and look at it from the far side of the room. If you look at it the normal way round, not surprisingly it looks solid. But now turn the mask so that it is facing away from you and look at the hollow side from across the room. Most people see the illusion immediately. If you don't, try adjusting the light. It may help if you shut one eye, but it is by no means necessary. The illusion is that the hollow side of the mask looks solid. The nose, brows and mouth stick out towards you and seem nearer than the ears. It is even more striking if you move from side to side, or up and down. The apparently solid face seems to turn with you, in an odd, almost magical way.
I'm not talking about the ordinary experience we have when the eyes of a good portrait seem to follow you around the room. The hollow mask illusion is far more spooky. It seems to hover, luminously, in space. The face really really seems to turn. I have a mask of Einstein's face mounted in my room, hollow side out, and visitors gasp when they glimpse it. The illusion is most strikingly displayed if you set the mask on a slowly rotating turntable. As the solid side turns before you, you'll see it move in a sensible 'normal reality' way. Now the hollow side comes into view and something extraordinary happens. You see another solid face, but it is rotating in the opposite direction. Because one face (say, the real solid face) is turning clockwise while the other, pseudo-solid face appears to be turning anticlockwise, the face that is rotating into view seems to swallow up the face that is rotating away from view. As the turning continues, you then see the really hollow but apparently solid face rotating firmly in the wrong direction for a while, before the really solid face reappears and swallows up the virtual face. The whole experience of watching the illusion is quite unsettling and it remains so no matter how long you go on watching it. You don't get used to it and don't lose the illusion.
What is happening? We can take the answer in two stages. First, why do we see the hollow mask as solid? And second, why does it seem to rotate in the wrong direction? We've already agreed that the brain is very good at—and very keen on—constructing faces in its internal simulation room. The information that the eyes are feeding to the brain is of course compatible with the mask's being hollow, but it is also compatible—just—with an alternative hypothesis, that it is solid. And the brain, in its simulation, goes for the second alternative, presumably because of its eagerness to see faces. So it overrules the messages from the eyes that say, 'This is hollow'; instead, it listens to the messages that say, 'This is a face, this is a face, face, face, face.' Faces are always solid. So the brain takes a face model out of its cupboard which is, by its nature, solid.
But having constructed its apparently solid face model, the brain is caught in a contradiction when the mask starts to rotate. To simplify the explanation, suppose that the mask is that of Oliver Cromwell and that his famous warts are visible from both sides of the mask. When looking at the hollow interior of the nose, which is really pointing away from the viewer, the eye looks straight across to the right side of the nose where there is a prominent wart. But the constructed virtual nose is apparently pointing towards the viewer, not away, and the wart is on what, from the virtual Cromwell's point of view, would be his left side, as if we were looking at Cromwell's mirror image. As the mask rotates, if the face were really solid, our eye would see more of the side that it expected to see more of and less of the side that it expected to see less of. But because the mask is actually hollow, the reverse happens. The relative proportions of the retinal image change in the way the brain would expect if the face were solid but rotating in the opposite direction. And that is the illusion that we see. The brain resolves the inevitable contradiction, as one side gives way to the other, in the only way possible, given its stubborn insistence on the mask's being a solid face: it simulates a virtual model of one face swallowing up the other face.
The rare brain disorder that destroys our ability to recognize faces is called prosopagnosia. It is caused by injury to specific parts of the brain. This very fact supports the importance of a 'face cupboard' in the brain. I don't know, but I'd bet that prosopagnosics wouldn't see the hollow mask illusion. Francis Crick discusses prosopagnosia in his book The Astonishing Hypothesis (1994), together with other revealing clinical conditions. For instance, one patient found the following condition very frightening which, as Crick observes, is not surprising:
...objects or persons she saw in one place suddenly appeared in another without her being aware they were moving. This was particularly distressing if she wanted to cross a road, since a car that at first seemed far away would suddenly be very close ... She experienced the world rather as some of us might see the dance floor in the strobe lighting of a discotheque.
This woman had a mental cupboard full of images for assembling her virtual world, just as we all do. The images themselves were probably perfectly good. But something had gone wrong with her software for deploying them in a smoothly changing virtual world. Other patients have lost their ability to construct virtual depth. They see the world as though it was made of flat, cardboard cut-outs. Yet other patients can recognize objects only if they are presented from a familiar angle. The rest of us, having seen, say, a saucepan from the side, can effortlessly recognize it from above. These patients have presumably lost some ability to manipulate virtual images and turn them around. The technology of virtual reality gives us a language to think about such skills, and this will be my next topic.
I shall not dwell on the details of today's virtual reality which is certain, in any case, to become obsolete. The technology changes as rapidly as everything else in the world of computers. Essentially what happens is as follows. You don a headset which presents to each of your eyes a miniature computer screen. The images on the two screens are nearly the same as each other, but offset to give the stereo illusion of three dimensions. The scene is whatever has been programmed into the computer: the Parthenon, perhaps, intact and in its original garish colours; an imagined landscape on Mars; the inside of a cell, hugely magnified. So far, I might have been describing an ordinary 3-D movie. But the virtual reality machine provides a two-way street. The computer doesn't just present you with scenes, it responds to you. The headset is wired up to register all turnings of your head, and other body movements, which would, in the normal course of events, affect your viewpoint. The computer is continuously informed of all such movements and—here is the cunning part—it is programmed to change the scene presented to the eyes, in exactly the way it would change if you were really moving your head. As you turn your head, the pillars of the Parthenon, say, swing round and you find yourself looking at a statue which, previously, had been 'behind' you.
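The 'cunning part' amounts to redrawing every object at its position relative to wherever the head is now pointing. A minimal sketch for rotation about the vertical axis only, assuming bearings measured in degrees in the room's frame; the function name and the statue's position are hypothetical, and a real system would track all three axes and the eyes' positions as well.

```python
def relative_bearing(object_bearing_deg, head_yaw_deg):
    """Angle of an object relative to the direction the head is facing.

    Returns a value in [-180, 180): 0 means straight ahead, -180 means
    directly behind. Bearings are in degrees in the room's frame.
    """
    return (object_bearing_deg - head_yaw_deg + 180.0) % 360.0 - 180.0

statue = 180.0  # hypothetical: the statue starts directly behind the viewer
for yaw in (0.0, 90.0, 180.0):
    print(yaw, relative_bearing(statue, yaw))
```

With the head at 0 degrees the statue sits at -180, directly behind. After a half turn its relative bearing is 0 and it is drawn straight ahead.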
A more advanced system might have you in a body stocking, laced with strain gauges to monitor the positions of all your limbs. The computer can now tell whenever you take a step, whenever you sit down, stand up, or wave your arms. You can now walk from one end of the Parthenon to the other, watching the pillars pass by as the computer changes the images in sympathy with your steps. Tread carefully because, remember, you are not really in the Parthenon but in a cluttered computer room. Present day virtual reality systems, indeed, are likely to tether you to the computer by a complicated umbilicus of cables, so let's postulate a future tangle-free radio link, or infrared data beam. Now you can walk freely in an empty real world and explore the fantasy virtual world that has been programmed for you. Since the computer knows where your body stocking is, there is no reason why it shouldn't represent you to yourself as a complete human form, an avatar, allowing you to look down at your 'legs', which might be very different from your real legs. You can watch your avatar's hands as they move in imitation of your real hands. If you use these hands to pick up a virtual object, say a Grecian urn, the urn will seem to rise into the air as you 'lift' it.
If somebody else, who could be in another country, dons another set of kit hooked up to the same computer, in principle you should be able to see their avatar and even shake hands—though with present day technology you might find yourself passing through each other like ghosts. The technicians and programmers are still working on how to create the illusion of texture and the 'feel' of solid resistance. When I visited England's leading virtual reality company, they told me they get many letters from people wanting a virtual sexual partner. Perhaps in the future, lovers separated by the Atlantic will caress each other over the Internet, albeit incommoded by the need to wear gloves and a body stocking wired up with strain gauges and pressure pads.
Now let's take virtual reality a shade away from dreams and closer to practical usefulness. Present day doctors have recourse to the ingenious endoscope, a sophisticated tube that is inserted into a patient's body through, say, the mouth or the rectum and used for diagnosis and even surgical intervention. By the equivalent of pulling wires, the surgeon steers the long tube round the bends of the intestine. The tube itself has a tiny television camera lens at its tip and a light pipe to illuminate the way. The tip of the tube may also be furnished with various remote-control instruments which the surgeon can control, such as micro-scalpels and forceps.
In conventional endoscopy, the surgeon sees what he is doing using an ordinary television screen, and he operates the remote controls using his fingers. But as various people have realized (not least Jaron Lanier, who coined the phrase 'virtual reality' itself) it is in principle possible to give the surgeon the illusion of being shrunk and actually inside the patient's body. This idea is in the research stage, so I shall resort to a fantasy of how the technique might work in the next century. The surgeon of the future has no need to scrub up, for she need not go near her patient. She stands in a wide open area, connected by radio to the endoscope inside the patient's intestine. The miniature screens in front of her two eyes present a magnified stereo image of the interior of the patient immediately in front of the tip of the endoscope. When she moves her head to the left, the computer automatically swivels the tip of the endoscope to the left. The angle of view of the camera inside the intestine faithfully moves to follow the surgeon's head movements in all three planes. She drives the endoscope forward along the intestine by her footsteps. Slowly, slowly, for fear of damaging the patient, the computer pushes the endoscope forwards, its direction always controlled by the direction in which, in a completely different room, the surgeon is walking. It feels to her as though she is actually walking through the intestine. It doesn't even feel claustrophobic. Following present day endoscopic practice, the gut has been carefully inflated with air, otherwise the walls would press in upon the surgeon and force her to crawl rather than walk.
When she finds what she is looking for, say a malignant tumour, the surgeon selects an instrument from her virtual toolbag. Perhaps it is most convenient to model it as a chainsaw, whose image is generated in the computer. Looking through the stereo screens in her helmet at the enlarged 3-D tumour, the surgeon sees the virtual chainsaw in her virtual hands and goes to work, excising the tumour, as though it were a tree stump needing to be removed from the garden. Inside the real patient, the mirrored equivalent of the chainsaw is an ultrafine laser beam. As if by a pantograph, the gross movements of the surgeon's whole arm as she hefts the chainsaw are geared down, by the computer, to equivalent tiny movements of the laser gun in the tip of the endoscope.
For my purposes I need say only that it is theoretically possible to create the illusion of walking through somebody's intestine using the techniques of virtual reality. I do not know whether it will actually help surgeons. I suspect that it will, although a present day hospital consultant whom I have asked is a little sceptical. This same surgeon refers to himself and his fellow gastroenterologists as glorified plumbers. Plumbers themselves sometimes use larger-scale versions of endoscopes for exploring pipes and in America they even send down mechanical 'pigs' to eat their way through blockages in drains. Obviously the methods I imagined for a surgeon would work for a plumber. The plumber could 'tramp' (or 'swim'?) down the virtual water pipe with a virtual miner's lamp on his helmet and a virtual pickaxe in his hand for clearing blockages.
The Parthenon of my first example existed nowhere but in the computer. The computer could as well have introduced you to angels, harpies or winged unicorns. My hypothetical endoscopist and plumber, on the other hand, were walking through a virtual world that was constrained to resemble a mapped portion of reality, the real interior of a drain or a patient's intestine. The virtual world that was presented to the surgeon on her stereo screens was admittedly constructed in a computer, but it was constructed in a disciplined way. There was a real laser gun being controlled, albeit represented as a chainsaw because this would feel like a natural tool to excise a tumour whose apparent size was comparable to the surgeon's own body. The shape of the virtual construction reflected, in the way most convenient to the surgeon's operation, a detail of the real world inside the patient. Such constrained virtual reality is pivotal in this chapter. I believe that every species that has a nervous system uses it to construct a model of its own particular world, constrained by continuous updating through the sense organs. The nature of the model may depend upon how the species concerned is going to use it, at least as much as upon what we might think of as the nature of the world itself.
Think of a gliding gull adroitly riding the winds off a sea cliff. It may not be flapping its wings, but this doesn't mean that its wing muscles are idle. They and the tail muscles are constantly making tiny adjustments, sensitively fine-tuning the bird's flight surfaces to every eddy, every nuance of the air around it. If we fed information about the state of all the nerves controlling these muscles into a computer, from moment to moment, the computer could in principle reconstruct every detail of the air currents through which the bird was gliding. It would do this by assuming that the bird was well designed to stay aloft and on that assumption construct a continuously updated model of the air around it. It would be a dynamic model, like a weather forecaster's model of the world's weather system, which is continuously revised by new data supplied by weather ships, satellites and ground stations and can be extrapolated to predict the future. The weather model advises us about tomorrow's weather; the gull model is theoretically capable of 'advising' the bird on the anticipatory adjustments that it should make to its wing and tail muscles in order to glide on into the next second.
The point we are working towards, of course, is that although no human programmer has yet constructed a computer model to advise gulls on how to adjust their wing and tail muscles, just such a model is surely being run continuously in the brain of our gull and of every other bird in flight. Similar models, pre-programmed in outline by genes and past experience, and continuously updated by new sense data from millisecond to millisecond, are running inside the skull of every swimming fish, every galloping horse, every echo-ranging bat.
That ingenious inventor Paul MacCready is best known for his superbly economical flying machines, the man-powered Gossamer Condor and Gossamer Albatross, and the sun-powered Solar Challenger. He also, in 1985, constructed a half-sized flying replica of the giant Cretaceous pterosaur Quetzalcoatlus. This huge flying reptile, with a wingspan comparable to that of a light aircraft, had almost no tail and was therefore highly unstable in the air. John Maynard Smith, who trained as an aero-engineer before switching to zoology, pointed out that this would have given advantages of manoeuvrability, but it demands accurate moment-to-moment control of the flight surfaces. Without a fast computer to adjust its trim continuously, MacCready's replica would have crashed. The real Quetzalcoatlus must have had an equivalent computer in its head, for the same reason. Earlier pterosaurs had long tails, in some cases terminated by what looks like a ping-pong bat, which would have given great stability, at a cost in manoeuvrability. It seems that, in the evolution of late, almost tailless pterosaurs like Quetzalcoatlus, there was a shift from stable but unmanoeuvrable to manoeuvrable but unstable. The same trend can be seen in the evolution of manmade aeroplanes. In both cases, the trend is made possible only by increasing computer power. As in the case of the seagull, the pterosaur's on-board computer inside its skull must have run a simulation model of the animal and the air through which it flew.
You and I, we humans, we mammals, we animals, inhabit a virtual world, constructed from elements that are, at successively higher levels, useful for representing the real world. Of course, we feel as if we are firmly placed in the real world—which is exactly as it should be if our constrained virtual reality software is any good. It is very good, and the only time we notice it at all is on the rare occasions when it gets something wrong. When this happens we experience an illusion or a hallucination, like the hollow mask illusion we talked about earlier.
The British psychologist Richard Gregory has paid special attention to visual illusions as a means of studying how the brain works. In his book Eye and Brain (fifth edition 1998), he regards seeing as an active process in which the brain sets up hypotheses about what is going on out there, then tests those hypotheses against the data coming in from the sense organs. One of the most familiar of all visual illusions is the Necker cube. This is a simple line drawing of a hollow cube, like a cube made of steel rods. The drawing is a two-dimensional pattern of ink on paper. Yet a normal human sees it as a cube. The brain has made a three-dimensional model based upon the two-dimensional pattern on the paper. This is, indeed, the kind of thing the brain does almost every time you look at a picture. The flat pattern of ink on paper is equally compatible with two alternative three-dimensional brain models. Stare at the drawing for some seconds and you will see it flip. The facet that had previously seemed nearest to you will now appear farthest. Carry on looking, and it will flip back to the original cube. The brain could have been designed to stick, arbitrarily, to one of the two cube models, say the first of the two that it hit upon, even though the other model would have been equally compatible with the information from the retinas. But in fact the brain takes the other option of running each model, or hypothesis, alternately for a few seconds at a time. Hence the apparent cube alternates, which gives the game away. Our brain constructs a three-dimensional model. It is virtual reality in the head.
When we are looking at an actual wooden box, our simulation software is provided with additional information, which enables it to arrive at a clear preference for one of the two internal models. We therefore see the box in one way only, and there is no alternation. But this does not diminish the truth of the general lesson we learn from the Necker cube. Whenever we look at anything, there is a sense in which what our brain actually makes use of is a model of that thing in the brain. The model in the brain, like the virtual Parthenon of my earlier example, is constructed. But, unlike the Parthenon (and perhaps the visions we see in dreams), it is, like the surgeon's computer model of the inside of her patient, not entirely invented: it is constrained by information fed in from the outside world.
A more powerful illusion of solidity is conveyed by stereoscopy, the slight discrepancy between the two images seen by the left and the right eyes. It is this that is exploited by the two screens in a virtual reality helmet. Hold up your right hand, with the thumb towards you, about one foot in front of your face, and look at some distant object, say a tree, with both eyes open. You'll see two hands. These correspond to the images seen by your two eyes. You can quickly find out which is which by first shutting one, then the other, eye. The two hands appear to be in slightly different places because your two eyes converge from different angles and the images on the two retinas are correspondingly, and tellingly, different. The two eyes get a slightly different view of the hand, too. The left eye sees a bit more of the palm, the right eye sees a bit more of the back of the hand.
Now, instead of looking at the distant tree, look at your hand, again with both eyes open. Instead of two hands in the foreground and one tree in the background, you'll see one solid-looking hand and two trees. Yet the hand image is still falling on different places on your two retinas. What this means is that your simulation software has constructed a single model of the hand, a model in 3-D. What's more, the single three-dimensional model has made use of information from both eyes. The brain subtly amalgamates both sets of information and puts together a useful model of a single, three-dimensional, solid hand. Incidentally, all retinal images of course are upside down, but this doesn't matter because the brain constructs its simulation model in the way that best suits its purpose and defines this model as the right way up.
The computational tricks used by the brain to construct a three-dimensional model from two two-dimensional images are astonishingly sophisticated, and are the basis of perhaps the most impressive of all illusions. These date back to a discovery by the Hungarian psychologist Bela Julesz in 1959. A normal stereoscope presents the same photograph to the left and the right eye but taken from suitably different angles. The brain puts the two together and sees an impressively three-dimensional scene. Julesz did the same thing, except that his pictures were random pepper and salt dots. The left and the right eye were shown the same random pattern, but with a crucial difference. In a typical Julesz experiment, an area of the pattern, say, a square, has its random dots displaced to one side, the appropriate distance to create the stereoscopic illusion. And the brain sees the illusion—a square patch stands out—even though there is not the smallest trace of a square in either of the two pictures. The square is present only in the discrepancy between the two pictures. The square looks very real to the viewer, but it really is nowhere but in the brain. The Julesz Effect is the basis of the 'Magic Eye' illusions so popular today. In a tour de force of the explainer's art, Steven Pinker devotes a small section of How the Mind Works (1998) to the principle underlying these pictures. I won't even try to better his explanation.
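The construction of a Julesz pair can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Julesz's own procedure: the image size, the position of the square, the two-dot shift and the names `make_stereo_pair` and `best_shift` are all invented for the example. The second function is merely a check that the displacement is recoverable from the pair; it is told where the square is, which the brain is not.

```python
import random


def make_stereo_pair(width=40, height=20, box=(12, 6, 16, 8), shift=2, seed=0):
    """Return (left, right) random-dot images as lists of rows of 0/1 dots.

    The two images are identical except that the dots inside `box`
    (x0, y0, w, h) are displaced `shift` columns to the left in the
    right-eye image. All sizes here are illustrative assumptions.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    x0, y0, w, h = box
    for y in range(y0, y0 + h):
        # Displace the square's dots sideways in the right-eye image.
        for x in range(x0, x0 + w):
            right[y][x - shift] = left[y][x]
        # Fill the strip the square vacated with fresh random dots.
        for x in range(x0 + w - shift, x0 + w):
            right[y][x] = rng.randint(0, 1)
    return left, right


def best_shift(left, right, box, candidates=range(0, 5)):
    """Return the candidate displacement under which the square matches best."""
    x0, y0, w, h = box

    def matches(s):
        return sum(left[y][x] == right[y][x - s]
                   for y in range(y0, y0 + h)
                   for x in range(x0, x0 + w))

    return max(candidates, key=matches)


if __name__ == "__main__":
    box = (12, 6, 16, 8)
    left, right = make_stereo_pair(box=box)
    for row in left:
        print("".join("#" if dot else "." for dot in row))
    print("recovered displacement:", best_shift(left, right, box))
```

Printed on its own, either image is just noise; the square exists only as a relationship between the two.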
There is an easy way to demonstrate that the brain works as a sophisticated virtual reality computer. First, look about you by moving your eyes. As you swivel your eyes, the images on your retinas move as if you were in an earthquake. But you don't see an earthquake. To you, the scene seems as steady as a rock. I am leading up, of course, to saying that the virtual model in your brain is constructed to remain steady. But there is more to the demonstration, because there's another way to make the image on your retina move. Gently poke your eyeball through the skin of the eyelid. The retinal image will move in the same kind of way as before. Indeed you could, given sufficient skill with your finger, mimic the effect of shifting your gaze. But now you really will think you see the earth move. The whole scene shifts, as if you were witnessing an earthquake.
What is the difference between these two cases? It is that the brain computer has been set up to take account of normal eye movements and make allowance for them in constructing its computed model of the world. Apparently the brain model makes use of information, not only from the eyes, but also from the instructions to move the eyes. Whenever the brain issues an order to the eye muscles to move the eye, a copy of that order is sent to the part of the brain that is constructing the internal model of the world. Then, when the eyes move, the virtual reality software of the brain is warned to expect the retinal images to move just the right amount, and it makes the model compensate. So the constructed model of the world is seen to stay still, although it may be viewed from another angle. If the earth moves at any time other than when the model is told to expect movement, the virtual model moves accordingly. This is fine, because there really might be an earthquake. Except that you can fool the system by poking your eyeball.
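The bookkeeping described here can be written as a one-line subtraction. The following Python sketch is a deliberately crude illustration, not a claim about how neurons do it: the function name, the one-dimensional treatment and the sign convention (positive means rightward) are assumptions made for the example.

```python
def perceived_world_shift(retinal_shift, commanded_eye_move):
    """Return how far the modelled world appears to move.

    A commanded eye movement of +d is expected to slide the retinal image
    by -d. The model subtracts that expected slide from the slide actually
    reported by the retina; whatever is left over is seen as movement of
    the world itself.
    """
    expected_retinal_shift = -commanded_eye_move
    return retinal_shift - expected_retinal_shift


if __name__ == "__main__":
    # Normal glance to the right: the image slides left, as predicted.
    print("glance:    ", perceived_world_shift(-5, 5))
    # Eyeball poked with a finger: same slide, but no command was issued.
    print("poke:      ", perceived_world_shift(-5, 0))
    # Real earthquake with the eyes held still.
    print("earthquake:", perceived_world_shift(3, 0))
```

In the first case the leftover is zero and the world stays put; in the other two the unexplained slide is attributed to the world, rightly for the earthquake and wrongly for the poke.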
As the final demonstration using yourself as guinea pig, make yourself giddy by spinning round and round. Now stand still and look fixedly at the world. It will appear to spin even though your reason tells you that it is not getting anywhere in its rotation. Your retinal images are not moving, but the accelerometers in your ears (which work by detecting the movements of fluid in the so-called semicircular canals) are telling the brain that you are spinning. The brain instructs the virtual reality software to expect to see the world spinning. When the images on the retina do not spin, therefore, the model registers the discrepancy and spins itself in the opposite direction. To put it in subjective language, the virtual reality software says to itself, 'I know I'm spinning from what the ears are telling me; therefore, in order to hold the model still, it will be necessary to put the opposite spin on the model, relative to the data that the eyes are sending in.' But the retinas actually report no spin, so the compensating spin of the model in the head is what you seem to see. In Barlow's terms, it is the unexpected, it is 'news', and that is why we see it.
Birds have an additional problem which humans ordinarily are spared. A bird perched on a tree branch is constantly being blown up and down, to and fro, and its retinal images seesaw accordingly. It is like living through a permanent earthquake. Birds keep their heads, and hence their view of the world, steady by diligent use of the neck muscles. If you film a bird on a windblown branch, you can almost imagine that the head is nailed to the background, while the neck muscles use the head as a fulcrum to move the rest of the body. When a bird walks, it employs the same trick to keep its perceived world steady. That is why walking chickens jerk their heads back and forth in what can seem to us quite a comical fashion. It is actually rather clever. As the body moves forward, the neck draws the head backwards in a controlled way so that the retinal images remain steady. Then the head shoots forward to allow the cycle to repeat. I can't help wondering whether, as an untoward consequence of the bird way of doing things, a bird might be unable to see a real earthquake because its neck muscles would automatically compensate. More seriously, we might say that the bird is using its neck muscles in a Barlow-style exercise: holding the non-newsworthy part of the world constant so that genuine movement stands out.
Insects and many other animals seem to have a similar habit of working to keep their visual world constant. Experimenters have demonstrated this in a so-called 'optomotor apparatus', where the insect is placed on a table and surrounded by a hollow cylinder painted on the inside with vertical stripes. If you now rotate the cylinder, the insect will use its legs to turn, keeping up with the cylinder. It is working to keep its visual world constant.
Normally, an insect has to tell its simulating software to expect movement when it walks, otherwise it would start compensating for its own movements, and then where would it be? This thought prompted two ingenious Germans, Erich von Holst and Horst Mittelstaedt, to a diabolically cunning experiment. If you've ever watched a fly washing its face with its hands, you will know that flies are capable of flicking their head completely upside down. Von Holst and Mittelstaedt succeeded in fixing a fly's head in the inverted position using glue. You have already guessed the consequence. Normally, whenever a fly turns its body, the model in its brain is told to expect a corresponding movement of the visual world. But as soon as it took a step, the wretched fly with its head upside down received data suggesting that the world had moved in the opposite direction to the one expected. It therefore moved its legs further in the same direction in order to compensate. This caused the apparent position of the world to move even further. The fly ended up spinning round and round like a top, at ever-increasing speed—well, within obvious practical limits.
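The runaway can be imitated with a toy feedback loop. This Python sketch is an assumption-laden caricature rather than a model of the real experiment: the gain, the starting discrepancy, the speed limit and the name `simulate_turning` are all invented, and the only thing it is meant to show is that flipping the sign of the visual report turns a self-correcting loop into a self-amplifying one.

```python
def simulate_turning(sign, steps=20, gain=0.5, initial_slip=1.0, max_rate=10.0):
    """Return the fly's turning speed at each step.

    sign=+1 stands for a normal head, sign=-1 for a head glued upside
    down, so that the eyes report the discrepancy with the wrong sign.
    All numbers are illustrative assumptions.
    """
    slip = initial_slip  # mismatch between expected and actual visual motion
    rates = []
    for _ in range(steps):
        seen = sign * slip                         # what the eyes report
        turn = gain * seen                         # turn to cancel what is seen
        turn = max(-max_rate, min(max_rate, turn)) # legs have a top speed
        slip = slip - turn                         # real effect of the turn
        rates.append(abs(turn))
    return rates


if __name__ == "__main__":
    print("normal head:  ", [round(r, 3) for r in simulate_turning(+1)[:8]])
    print("inverted head:", [round(r, 3) for r in simulate_turning(-1)[:8]])
```

With the normal sign each turn halves the discrepancy and the fly settles; with the inverted sign each turn enlarges it, and the turning speed climbs until it hits the assumed limit.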
The same Erich von Holst also pointed out that we should expect a similar confusion if our own voluntary instructions to move our eyes are neutralized, for example by narcotizing the eye-moving muscles. Normally, if you give your eyes the command to move to the right, your retinal images will signal a move to the left. To compensate and create the appearance of stability, the model in the head has to be moved to the right. But if the eye-moving muscles are narcotized, the model should move to the right in anticipation of what turns out to be a non-existent retinal movement. Let von Holst himself take up the story, in his paper 'The Behavioural Physiology of Animals and Man' (1973):
This is indeed the case! It has been known for many years from people with paralysed eye muscles and it has been established exactly from the experiments of Kornmüller on himself that every intended but unfulfilled eye movement results in the perception of a quantitative movement of the surroundings in the same direction.
We are so used to living in our simulated world and it is kept so beautifully in synchrony with the real world that we don't realize it is a simulated world. It takes clever experiments like those of von Holst and his colleagues to bring it home to us.
And it has its dark side. A brain that is good at simulating models in imagination is also, almost inevitably, in danger of self-delusion. How many of us as children have lain in bed, terrified because we thought we saw a ghost or a monstrous face staring in at the bedroom window, only to discover that it was a trick of the light? I've already discussed how eagerly our brain's simulation software will construct a solid face where the reality is a hollow face. It will just as eagerly make a ghostly face where the reality is a collection of moonlit folds in a white net curtain.
Every night of our lives we dream. Our simulation software sets up worlds that do not exist; people, animals and places that never existed, perhaps never could exist. At the time, we experience these simulations as though they were reality. Why should we not, given that we habitually experience reality in the same way—as simulation models? The simulation software can delude us when we are awake, too. Illusions like the hollow face are in themselves harmless, and we understand how they work. But our simulation software can also, if we are drugged, or feverish, or fasting, produce hallucinations. Throughout history, people have seen visions of angels, saints and gods; and these have seemed very real to them. Well, of course they would seem real. They are models, put together by the normal simulation software. The simulation software is using the same modelling techniques as it uses ordinarily when it presents its continuously updated edition of reality. No wonder these visions have been so influential. No wonder they have changed people's lives. So if ever we hear a story that somebody has seen a vision, been visited by an archangel, or heard voices in the head, we should immediately be suspicious of taking it at face value. Remember that all our heads contain powerful and ultra-realistic simulation software. Our simulation software could knock up a ghost or a dragon or a saintly virgin in no time flat. It would be child's play for software of that sophistication.
A word of warning. The metaphor of virtual reality is beguiling and, in many ways, apt. But there is a danger of its misleading us into thinking that there is a 'little man' or 'homunculus' in the brain watching the virtual reality show. As philosophers such as Daniel Dennett have pointed out, you have explained precisely nothing if you suggest that the eye is wired to the brain in such a way that a little cinema screen, somewhere in the brain, continuously relays whatever is projected on the retina. Who looks at the screen? The question now raised is no smaller than the original question you think you have answered. You might as well let the little man look at the retina directly, which is clearly no solution to anything. The same problem arises if we take the virtual reality metaphor literally and imagine that some agent locked inside the head is 'experiencing' the virtual reality performance.
The problems raised by subjective consciousness are perhaps the most baffling in all philosophy, and solving them is far beyond my ambition. My suggestion is the more modest one that each species, in each situation, needs to deploy its information about the world in whatever way is most useful for taking action. 'Constructing a model in the head' is a helpful way to express how it is done, and comparing it to virtual reality is especially helpful in the case of humans. As I have argued before, the model of the world used by a bat is likely to be similar to the model used by a swallow, even though one is connected to the real world via the ears, the other via the eyes. The brain constructs its model world in the way most suited for action. Since the actions of day-flying swallows and night-flying bats are similar—navigating at high speed in three dimensions, avoiding solid obstacles and catching insects on the wing—they are likely to use the same models. I do not postulate a 'little bat in the head' or a 'little swallow in the head' to watch the model. Somehow the model is used to control the wing muscles, and that is as far as I go.
Nevertheless, each of us humans knows that the illusion of a single agent sitting somewhere in the middle of the brain is a powerful one. I suspect that the case may be parallel to the 'selfish cooperator' model of genes coming together, although they are fundamentally independent agents, to create the illusion of a unitary body. I'll briefly return to the idea near the end of the next chapter.
This chapter has developed the thesis that brains have taken over from DNA part of the role of recording the environment—environments, rather, for they are many and spread out over the near and the distant past. Having a record of the past is useful only in so far as it helps in predicting the future. The animal's body represents a kind of prediction that the future will resemble the ancestral past, in broad outline. The animal is likely to survive to the extent that this turns out to be true. And simulation models of the world allow the animal to act as if in anticipation of what that world is likely to throw its way in the next few seconds, hours or days. For completeness we must note that the brain itself, and its virtual reality software, are ultimately the products of natural selection of ancestral genes. We could say that the genes can predict a limited amount, because only in a general way will the future resemble the past. For the details and the subtleties, they provide the animal with nervous hardware and virtual reality software which will constantly update and revise its predictions to fit high-speed changes in circumstances. It is as if the genes say, 'We can model the basic shape of the environment, the things that don't change over the generations. But for the fast changes, over to you, brain.'
We move through a virtual world of our own brains' making. Our constructed models of rocks and of trees are a part of the environment in which we animals live, no less than the real rocks and trees that they represent. And, intriguingly, our virtual worlds must also be seen as part of the environment in which our genes are naturally selected. We have pictured camel genes as denizens of ancestral worlds, selected to survive in ancient deserts and even more ancient seas, selected to survive in companionship with compatible cartels of other camel genes. All that is true, and equivalent stories of Miocene trees and Pliocene savannahs can be told of our genes. What we must now add is that, among the worlds in which genes have survived, are virtual worlds constructed inside ancestral brains.
In the case of highly social animals like ourselves and our ancestors, our virtual worlds are, at least in part, group constructions. Especially since the invention of language and the rise of artifact and technology, our genes have had to survive in complex and changing worlds for which the most economical description we can find is shared virtual reality. It is a startling thought that, just as genes can be said to survive in deserts or forests, and just as they can be said to survive in the company of other genes in the gene pool, genes can also be said to survive in the virtual, even poetic worlds created by brains. It is to the enigma of the human brain itself that we turn in the final chapter.