From Eternity to Here: The Quest for the Ultimate Theory of Time - Sean Carroll (2010)



1 Wikipedia contributors (2009).

2 Let’s emphasize the directions here, because they are easily confused: Entropy measures disorder, not order, and it increases with time, not decreases. We informally think “things wind down,” but the careful way of saying that is “entropy goes up.”


3 In an effort not to be too abstract, we will occasionally lapse into a kind of language that assumes the directionality of time—“time passes,” we “move into the future,” stuff like that. Strictly speaking, part of our job is to explain why that language seems so natural, as opposed to phrasings along the lines of “there is the present, and there is also the future,” which seems stilted. But it’s less stressful to occasionally give in to the “tensed” way of speaking, and question the assumptions behind it more carefully later on.

4 Because the planets orbit in ellipses rather than perfect circles, their velocity around the Sun is not strictly constant, and the actual angle that the Earth describes in its orbit every time Mars completes a single revolution will depend on the time of year. These are details that are easy to take care of when we actually sit down to carefully define units of time.

5 The number of vibrations per second is fixed by the size and shape of the crystal. In a watch, the crystal is tuned to vibrate 32,768 times per second, which happens to be equal to 2 to the 15th power. That number is chosen so that it’s easy for the watch’s inner workings to divide successively by 2 to obtain a frequency of exactly once per second, appropriate for driving the second hand of a watch.
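As a quick illustration (my own sketch, not from the book), the divide-by-two cascade can be written as a loop, with each pass playing the role of one frequency-divider stage in the watch circuitry:

```python
# Toy sketch of a quartz watch's frequency-divider chain (illustrative only).
# Start from the crystal frequency, 32,768 Hz = 2**15, and halve repeatedly.
freq = 32768
stages = 0
while freq > 1:
    freq //= 2   # one divide-by-two stage
    stages += 1
print(freq, stages)  # 1 Hz after 15 stages, suitable for driving the second hand
```

Because 32,768 is an exact power of 2, the cascade lands on exactly 1 Hz; any other starting frequency would leave a remainder somewhere along the chain.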

6 Alan Lightman’s imaginative novel Einstein’s Dreams presents a series of vignettes that explore what the world would be like if time worked very differently than it does in the real world.

7 See for example Barbour (1999) or Rovelli (2008).

8 There is a famous joke, attributed to Einstein: “When a man sits with a pretty girl for an hour, it seems like a minute. But let him sit on a hot stove for a minute and it’s longer than any hour. That’s relativity.” I don’t know whether Einstein actually ever said those words. But I do know that’s not relativity.

9 Here is a possible escape clause, if we were really committed to restoring the scientific integrity of Baker’s fantasy: Perhaps time in the rest of the world didn’t completely stop, but just slowed down by a tremendous factor, and still ticked along at a sufficient rate that light could travel from the objects Arno was looking at to his eyes. Close, but no cigar. Even if that happened, the fact that the light was slowed down would lead to an enormous redshift—what looked like visible light in the ordinary world would appear to Arno as radio waves, which his poor eyes wouldn’t be able to see. Perhaps X-rays would be redshifted down to visible wavelengths, but X-ray flashlights are hard to come by. (It does, admittedly, provoke one into thinking how interesting a more realistic version of this scenario might be.)

10 Temporal: of or pertaining to time. It’s a great word that we’ll be using frequently. Sadly, an alternative meaning is “pertaining to the present life or this world”—and we’ll be roaming very far away from that meaning.

11 As a matter of historical accuracy, while Einstein played a central role in the formulation of special relativity, it was legitimately a collaborative effort involving the work of a number of physicists and mathematicians, including George FitzGerald, Hendrik Lorentz, and Henri Poincaré. It was eventually Hermann Minkowski who took Einstein’s final theory and showed that it could be understood in terms of a four-dimensional spacetime, which is often now called “Minkowski space.” His famous 1909 quote was “The views of space and time which I wish to lay before you have sprung from the soil of experimental physics, and therein lies their strength. They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality” (Minkowski, 1909).

12 Pirsig (1974), 375.

13 Price (1996), 3.

14 Vonnegut (1969), 34. Quoted in Lebowitz (2008).

15 Augustine (1998), 235.

16 Good discussions of these issues can be found in Callender (2005), Lockwood (2005), and Davies (1995).

17 Philosophers often discuss different conceptions of time in terms laid out by J. M. E. McTaggart in his famous paper “The Unreality of Time” (1908). There, McTaggart distinguished between three different notions of time, which he labeled as different “series” (see also Lockwood, 2005). The A-series is a series of events measured relative to now, that move through time—“one year ago” doesn’t denote a fixed moment, but one that changes as time passes. The B-series is the sequence of events with permanent temporal labels, such as “October 12, 2009.” And the C-series is simply an ordered list of events—“x happens before y but after z”—without any time stamps at all. McTaggart argued—very roughly—that the B-series and C-series are fixed arrays, lacking the crucial element of change, and therefore insufficient to describe time. But the A-series itself is incoherent, as any specific event will be classified simultaneously as “past,” “present,” and “future,” from the point of view of different moments in time. (The moment of your birth is in the past to you now but was in the future to your parents when they first met.) Therefore, he concludes, time doesn’t exist.

If you get the feeling that this purported contradiction seems more like a problem with language than one with the nature of time, you are on the right track. To a physicist, there seems to be no contradiction between stepping outside the universe and thinking of all of spacetime at once, and admitting that from the point of view of any individual inside the universe time seems to flow.


18 Amis (1991), 11.

19 Fitzgerald (1922).

20 Carroll, L. (2000), 175.

21 Obviously.

22 Diedrick (1995) lists a number of stories that feature time reversals in one form or another, in addition to the ones mentioned here: Lewis Carroll’s Sylvie and Bruno, Jean Cocteau’s Le Testament d’Orphée, Brian Aldiss’s An Age, and Philip K. Dick’s Counter-Clock World. In T. H. White’s The Once and Future King, the character of Merlyn experiences time backward, although White doesn’t try very hard to consistently maintain the conceit. More recently, the technique has been used by Dan Simmons in Hyperion, and serves as a major theme in Andrew Sean Greer’s The Confessions of Max Tivoli and in Greg Egan’s short story “The Hundred-Year Diary.” Vonnegut’s Slaughterhouse-Five includes a brief description of the firebombing of Dresden in reversed order, which Amis credits in the Afterword to Time’s Arrow.

23 Stoppard (1999), 12.

24 In addition to the First Law of Thermodynamics (“the total energy remains constant in any physical process”) and the Second Law (“the entropy of a closed system never decreases”), there is also a Third Law: As the temperature of a system is lowered, there is a minimum value (absolute zero) for which the entropy is also a minimum. These three laws have been colorfully translated as: “You can’t win; you can’t break even; and you can’t even get out of the game.” There is also a Zeroth Law: If two systems are both in thermal equilibrium with a third system, they are in thermal equilibrium with each other. Feel free to invent your own whimsical sporting analogies.

25 Eddington (1927), 74.

26 Snow (1998), 15.

27 In fact, it would be fair to credit Sadi Carnot’s father, French mathematician and military officer Lazare Carnot, with the first glimmerings of this concept of entropy and the Second Law. In 1784, Lazare Carnot wrote a treatise on mechanics in which he argued that perpetual motion was impossible, because any realistic machine would dissipate useful energy through the rattling and shaking of its component parts. He later became a successful leader of the French Revolutionary Army.

28 Not strictly true, actually. Einstein’s general theory of relativity, which explains gravitation in terms of the curvature of spacetime, implies that what we ordinarily call “energy” is not really conserved, for example, in an expanding universe. We’ll talk about that in Chapter Five. But for the purposes of most combustion engines, the expansion of the universe can be neglected, and energy really is conserved.

29 Specifically, by “measures the number of ways we can rearrange the individual parts,” we mean “is proportional to the logarithm of the number of ways we can rearrange the individual parts.” See the Appendix for a discussion of logarithms, and Chapter Nine for a detailed discussion of the statistical definition of entropy.
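To make the logarithmic definition concrete, here is a toy calculation of my own (Boltzmann’s S = log W, with the proportionality constant set to one):

```python
import math

# Toy system: 10 coins, exactly 4 showing heads.
# W = number of ways to rearrange which coins are heads; S = log W.
W = math.comb(10, 4)
S = math.log(W)
print(W)            # 210 distinct arrangements
print(round(S, 2))  # entropy of about 5.35 in natural units
```

Taking the logarithm is what makes entropy additive: doubling the system squares the number of arrangements but merely doubles the entropy.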

30 The temperature of the surface of the Sun is approximately 5,800 Kelvin. (One Kelvin is the same as one degree Celsius, except that zero Kelvin corresponds to -273 degrees C—absolute zero, the lowest possible temperature.) Room temperature is approximately 300 Kelvin. Space—or, more properly, the cosmic background radiation that suffuses space—is at about 3 Kelvin. There is a nice discussion of the role of the Sun as a hot spot in a cold sky in Penrose (1989).

31 You will sometimes hear claims by creationists to the effect that evolution according to Darwinian natural selection is incompatible with the growth of entropy, since the history of life on Earth has involved increasingly complex organisms purportedly descending from less complex forms. This is crazy on several levels. The most basic level is simply: The Second Law refers to closed systems, and an organism (or a species, or the biosphere) is not a closed system. We’ll discuss this a bit more in Chapter Nine, but that’s basically all there is to it.

32 Thomson (1862).

33 Pynchon (1984), 88.


34 In fact there was a literal debate—the “Great Debate” between astronomers Harlow Shapley and Heber Curtis was held in 1920 at the Smithsonian in Washington, D.C. Shapley defended the position that the Milky Way was the entirety of the universe, while Curtis argued that the nebulae (or at least some of them, and in particular the Andromeda nebula M31) were galaxies like our own. Although Shapley ended up on the losing side of the big question, he did correctly understand that the Sun was not at the center of the Milky Way.

35 That’s a bit of poetic license. As we will explain later, the cosmological redshift is conceptually distinct from the Doppler effect, despite their close similarity. The former arises from the expansion of space through which the light is traveling, while the latter arises from the motion of the sources through space.

36 After decades of heroic effort, modern astronomers have finally been able to pin down the actual value of this all-important cosmological parameter: 72 kilometers per second per Megaparsec (Freedman et al., 2001). That is, for every million parsecs of distance between us and some galaxy, we will observe an apparent recession velocity of 72 km/sec. For comparison, the current size of the observable universe is about 28 billion parsecs across. A parsec is about 3.26 light years, or 30 trillion kilometers.
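The linear relation behind this number is easy to check in a line or two of arithmetic (the 100-megaparsec galaxy is my own illustrative choice, not from the note):

```python
# Hubble's law: apparent recession velocity = H0 * distance.
H0 = 72.0   # km/s per megaparsec, the value quoted above
d = 100.0   # distance to an illustrative galaxy, in megaparsecs
v = H0 * d  # recession velocity in km/s
print(v)    # 7200.0 km/s for a galaxy 100 megaparsecs away
# For scale: one parsec is about 3.26 light-years, or 30 trillion km.
```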

37 Strictly speaking, we should say “every sufficiently distant galaxy . . .” Nearby galaxies could be bound into pairs or groups or clusters under the influence of their mutual gravitational attraction. Such groups, like any bound systems, do not expand along with the universe; we say that they have “broken away from the Hubble flow.”

38 Admittedly, it’s a bit subtle. Just two footnotes prior, we said the observable universe was “28 billion parsecs” across. It’s been 14 billion years since the Big Bang, so you might think there are 14 billion light-years from here to the edge of the observable universe, which we can multiply by two to get the total diameter—28 billion light-years, or about 9 billion parsecs, right? Is that a typo, or can the two numbers be reconciled? The point is that distances are complicated by the fact that the universe is expanding, and in particular because it is being accelerated by dark energy. The physical distance today to the most distant galaxies within our observable universe is actually larger than 14 billion light-years. If you go through the math, the farthest point that was ever within our observable patch of universe is now 46 billion light-years, or 14 billion parsecs, distant.
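The unit conversion that reconciles the two figures is a one-liner (a rough consistency check of my own, using 3.26 light-years per parsec):

```python
# Radius of the observable universe: 46 billion light-years, in parsecs.
ly_per_pc = 3.26
radius_pc = 46e9 / ly_per_pc   # about 14 billion parsecs
diameter_pc = 2 * radius_pc    # about 28 billion parsecs, matching note 36
print(round(radius_pc / 1e9, 1))   # ~14.1
print(round(diameter_pc / 1e9))    # ~28
```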

39 The idea that particles aren’t created out of empty space should be clearly labeled as an assumption, although it seems to be a pretty good one—at least, within the current universe. (Later we’ll see that particles can very rarely appear from the vacuum in an accelerating universe, in a process analogous to Hawking radiation around black holes.) The old Steady State theory explicitly assumed the opposite, but had to invoke new kinds of physical processes to make it work (and it never really did).

40 To be careful about it, the phrase Big Bang is used in two different ways. One way is as we’ve just defined it—the hypothetical moment of infinite density at the beginning of the universe, or at least conditions in the universe very, very close to that moment in time. But we also speak of the “Big Bang model,” which is simply the general framework of a universe that expands from a hot, dense state according to the rules of general relativity; and sometimes we drop the model. So you might read newspaper stories about cosmologists “testing the predictions of the Big Bang.” You can’t test the predictions of some moment in time; you can only test predictions of a model. Indeed, the two concepts are fairly independent—we will be arguing later in the book that a complete theory of the universe will have to replace the conventional Big Bang singularity by something better, but the Big Bang model of the evolution of the universe over the last 14 billion years is well established and not going anywhere.

41 The microwave background has a messy history. George Gamow, Ralph Alpher, and Robert Herman wrote a series of papers in the late 1940s and early 1950s that clearly predicted the existence of relic microwave radiation from the Big Bang, but their work was subsequently largely forgotten. In the 1960s, Robert Dicke at Princeton and A. G. Doroshkevich and Igor Novikov in the Soviet Union independently recognized the existence and detectability of the radiation. Dicke went so far as to assemble a talented group of young cosmologists (including David Wilkinson and P. J. E. Peebles, who would go on to become leaders in the field) to build an antenna and search for the microwave background themselves. They were scooped by Penzias and Wilson, just a few miles away, who were completely unaware of their work. Gamow passed away in 1968, but it remains mysterious why Alpher and Herman never won the Nobel Prize for their predictions. They told their side of the story in a book, Genesis of the Big Bang (Alpher and Herman, 2001). In 2006, John Mather and George Smoot were awarded the Prize for their measurements of the blackbody spectrum and temperature anisotropies in the microwave background, using NASA’s Cosmic Background Explorer (COBE) satellite.

42 The full story is told by Farrell (2006).

43 Bondi and Gold (1948); Hoyle (1948).

44 See for example Wright (2008).

45 Needless to say, that’s making a long story very short. Type Ia supernovae are believed to be the result of the catastrophic thermonuclear explosion of white dwarf stars. A white dwarf is a star that has used up all of its nuclear fuel and just sits there quietly, supported by the basic fact that electrons take up space. But some white dwarfs have companion stars, from which matter can slowly dribble onto the dwarf. Eventually the white dwarf hits a point—the Chandrasekhar Limit, named after Subrahmanyan Chandrasekhar—where the outward pressure due to electrons can no longer compete with the gravitational pull, and a runaway thermonuclear explosion tears the star apart as a supernova. Because the Chandrasekhar Limit is approximately the same for every white dwarf in the universe, the brightness of the resulting explosions is approximately the same for every Type Ia supernova. (There are other types of supernovae, which don’t involve white dwarfs at all.) But astronomers have learned how to correct for the differences in brightness by using the empirical fact that brighter supernovae take longer to decline in brightness after the peak luminosity. The story of how astronomers search for such supernovae, and how they eventually discovered the acceleration of the universe, is told in Goldsmith (2000), Kirshner (2004), and Gates (2009); the original papers are Riess et al. (1998) and Perlmutter et al. (1999).

46 Another subtle point needs to be explained. The expansion rate of the universe is measured by the Hubble constant, which relates distance to redshift. It’s not really a “constant”; in the early universe the expansion was much faster, and what we might call the Hubble “parameter” was a lot larger than our current Hubble constant. We might expect that the phrase the universe is accelerating means “the Hubble parameter is increasing,” but that’s not true—it just means “it’s not decreasing very fast.” The “acceleration” refers to an increase in the apparent velocity of any particular galaxy over time. But that velocity is equal to the Hubble parameter times the distance, and the distance is increasing as the universe expands. So an accelerating universe is not necessarily one in which the Hubble parameter is increasing, just one in which the product of the Hubble parameter with the distance to any particular galaxy is increasing. It turns out that, even with a cosmological constant, the Hubble parameter never actually increases; it decreases more slowly as the universe expands and dilutes, until it approaches a fixed constant value after all the matter has gone away and there’s nothing left but cosmological constant.
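A toy Friedmann-equation model (my own sketch, using the matter and dark-energy fractions quoted in note 47) shows both behaviors at once: H falls monotonically, while the velocity H times distance grows once dark energy dominates:

```python
import math

# H(a)/H0 for a universe of matter plus cosmological constant.
Om, OL = 0.26, 0.74   # matter and dark-energy fractions (note 47, rounded)

def hubble(a):
    """Hubble parameter (in units of H0) at scale factor a."""
    return math.sqrt(Om / a**3 + OL)

x = 1.0   # fixed comoving distance to some particular galaxy
for a in (0.5, 1.0, 2.0, 4.0):   # scale factor: past -> today -> future
    H = hubble(a)    # always decreasing...
    v = H * (a * x)  # ...but velocity = H times physical distance grows
    print(round(H, 3), round(v, 3))
# H approaches the constant sqrt(0.74) ~ 0.86 while v keeps increasing.
```

The design point is exactly the one in the note: “acceleration” is a statement about v = H·d, not about H itself.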

47 We’re being careful to distinguish between two forms of energy that are important for the evolution of the contemporary universe: “matter,” made of slowly moving particles that dilute away as the universe expands, and “dark energy,” some mysterious stuff that doesn’t dilute away at all, but maintains a constant energy density. But matter itself comes in different forms: “ordinary matter,” including all of the kinds of particles we have ever discovered in experiments here on Earth, and “dark matter,” some other kind of particle that can’t be anything we’ve yet directly seen. The mass (and therefore energy) in ordinary matter is mostly in the form of atomic nuclei—protons and neutrons—but electrons also contribute. So ordinary matter includes you, me, the Earth, the Sun, stars, and all the gas and dust and rocks in space. We know how much of that stuff there is, and it’s not nearly enough to account for the gravitational fields observed in galaxies and clusters. So there must be dark matter, and we’ve ruled out all known particles as candidates; theorists have invented an impressive menu of possibilities, including “axions” and “neutralinos” and “Kaluza-Klein particles.” All told, ordinary matter makes up about 4 percent of the energy in the universe, dark matter makes up about 22 percent, and dark energy makes up about 74 percent. Trying to create or detect dark matter directly is a major goal of modern experimental physics. See Hooper (2007), Carroll (2007), or Gates (2009) for more details.

48 So how much energy is there in the dark energy, anyway? It’s about 10^-8 ergs per cubic centimeter (an erg is one ten-millionth of a joule). Note that the “calories” used to measure the energy content of food are actually kilocalories (1,000 standard calories). So if we took all of the cubic centimeters within the volume of Lake Michigan, their total dark energy content would be roughly equal to the nutritional energy content of one Big Mac. Seen another way, if we converted all of the dark energy in all the cubic centimeters within the volume of the Earth into electricity, it would be roughly equal to the electricity usage of an average American over one year. The point is, there’s not all that much dark energy per cubic centimeter—it’s spread very thinly throughout the universe. Of course, we cannot convert dark energy into useful energy of this form—dark energy is completely useless. (Why? Because it’s in a high-entropy state.)

49 Planck wasn’t really doing quantum gravity. In 1899, in attempting to understand some mysteries of blackbody radiation, he had hit upon the need for a new fundamental constant of nature, now known as “Planck ’s constant,” ħ.Taking that new quantity and multiplying and dividing in appropriate ways by the speed of light c and Newton’s constant of gravitation G, Planck invented a system of fundamental units that we now think of as characteristic of quantum gravity: the Planck length LP = 1.6 × 10-35 meters, the Planck time tP = 5.4 × 10-44 seconds, and the Planck mass MP = 2.2 × 10-8 kilograms, along with the Planck energy. Interestingly, Planck’s first thought was that the universal nature of these quantities—based in physics, rather than determined by human convention—could someday help us communicate with extraterrestrial civilizations.
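The combinations Planck found are simple to reproduce (these are the standard formulas, which the note doesn’t spell out, evaluated with present-day values of the constants):

```python
import math

hbar = 1.0546e-34   # reduced Planck constant, joule-seconds
G = 6.674e-11       # Newton's constant of gravitation, m^3 kg^-1 s^-2
c = 2.998e8         # speed of light, meters per second

L_P = math.sqrt(hbar * G / c**3)   # Planck length, ~1.6e-35 m
t_P = L_P / c                      # Planck time,   ~5.4e-44 s
M_P = math.sqrt(hbar * c / G)      # Planck mass,   ~2.2e-8 kg
print(L_P, t_P, M_P)
```

These are the unique length, time, and mass you can build from ħ, G, and c alone, which is why they need no reference to human conventions.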

50 Fred Adams and Greg Laughlin devoted an entire book to the subject, well worth reading (Adams and Laughlin, 1999).

51 Huw Price has diagnosed this tendency very convincingly (Price, 1996). He accuses cosmologists of an implicit double standard, applying criteria of naturalness to the early universe that they would never apply to the late universe, and vice versa. Price suggests that a consistent cosmology governed by time-symmetric laws should have time-symmetric evolution. Given that the Big Bang has a low entropy, this implies that the future should feature eventual re-collapse to a Big Crunch that also has low entropy—the Gold universe, first contemplated by Thomas Gold (of Steady State fame). In such a universe, the arrow of time would reverse when the universe reached its maximum size, and entropy would begin to decrease toward the Crunch. This kind of scenario seems less likely now that we have discovered dark energy. (The way we will meet Price’s challenge in this book is to imagine that the universe is indeed time-symmetric on large scales, with high entropy toward both the far past and the far future, which can obviously be achieved only if the Big Bang is not really the beginning.)

52 The universe is not actually going to collapse into one big black hole. As discussed, it’s going to empty out. Remarkably, however, in the presence of dark energy even empty space has entropy, and we obtain the same number (10^120) for the maximum entropy of the observable universe. Note that 10^120 was also the discrepancy between the theoretical estimate of the vacuum energy and its observed value. This apparent coincidence of two different numbers is actually the same coincidence as that between the current density of matter (which is related to the maximum entropy) and the energy density in a vacuum. In both cases, the numbers work out to be given by taking the size of the observable universe—roughly 10 billion light years—dividing by the Planck length, and squaring the result.
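That final recipe is easy to evaluate numerically (order-of-magnitude arithmetic of my own, not from the note):

```python
# (size of observable universe / Planck length)^2, as described above.
light_year_m = 9.46e15        # meters in one light-year
size = 1e10 * light_year_m    # "roughly 10 billion light-years"
planck_length = 1.6e-35       # meters
ratio = size / planck_length
print(f"{ratio**2:.1e}")      # lands within an order of magnitude or two of 10^120
```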


53 On the other hand, the achievements for which Paris Hilton is famous are also pretty mysterious.

54 Einstein’s “miraculous year” was 1905, when he published a handful of papers that individually would have capped the career of almost any other scientist: the definitive formulation of special relativity, the explanation of the photoelectric effect (implying the existence of photons and laying the groundwork for quantum mechanics), proposing a theory of Brownian motion in terms of random collisions at the atomic level, and uncovering the equivalence between mass and energy. For most of the next decade he concentrated on the theory of gravity; his ultimate answer, the general theory of relativity, was completed in 1915, when Einstein was thirty-six years old. He died in 1955 at the age of seventy-six.

55 We should also mention Dutch physicist Hendrik Antoon Lorentz, who beginning in 1892 developed the idea that times and distances were affected when objects moved near the speed of light, and derived the “Lorentz transformations,” relating measurements obtained by observers moving with respect to each other. To Lorentz, velocities were measured with respect to a background of aether; Einstein was the one who first realized that the aether was an unnecessary fiction.

56 Galison (2003). One gets the impression from Galison’s book that he finds the case of Poincaré to actually be more interesting than that of Einstein. However, when an author has a chance to put Einstein in a book title, his name will generally go first. Einstein is box office.

57 George Johnson (2008), in reviewing Leonard Susskind’s book The Black Hole Wars (2008), laments the fate of the modern reader of popular physics books.

I was eager to learn how, in the end, Susskind and company showed that Hawking was probably wrong—that information is indeed conserved. But first I had to get through a sixty-six-page crash course on relativity and quantum mechanics. Every book about contemporary physics seems to begin this way, which can be frustrating to anyone who reads more than one. (Imagine if every account of the 2008 presidential campaign had to begin with the roots of Athenian democracy and the heritage of the French Enlightenment.)

The solution is obvious: The basics of relativity and quantum mechanics should be a regular part of secondary education, just like the roots of Athenian democracy and the heritage of the French Enlightenment. In the meantime, this chapter will be part of the inevitable crash course, but by concentrating in particular on the role of “time” we’ll hopefully be able to avoid the most shopworn ways of explaining things.

58 Science fiction movies and television shows tend to flagrantly disregard this feature of reality, mostly for the practical reason that it’s very hard to fake weightlessness. (Star Trek: Enterprise did feature one amusing scene in which the ship “lost its gravity” while Captain Archer was taking a shower.) The artificial gravity you need to make the captain and crew stride purposefully about the ship’s bridge doesn’t seem compatible with the laws of physics as we know them. If you’re not accelerating, the only way to make that much gravity is to carry around a small planet’s worth of mass, which doesn’t seem practical.

59 Velocity is just the rate of change of position, and acceleration is the rate of change of velocity. In terms of calculus, velocity is the first derivative of the position, and acceleration is the second derivative. It is a deep feature of classical mechanics that the information one can specify about the state of a particle is its position and velocity; the acceleration is then determined by the local conditions and the appropriate laws of physics.
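A finite-difference sketch (my own example, not from the book) makes the derivative chain concrete. For the position x(t) = 5t², the velocity should be 10t and the acceleration a constant 10:

```python
# Velocity as the first derivative of position, acceleration as the second,
# approximated by finite differences.
def x(t):
    return 5 * t**2

dt = 1e-5
t = 2.0
v = (x(t + dt) - x(t - dt)) / (2 * dt)          # central difference: ~10*t = 20
a = (x(t + dt) - 2 * x(t) + x(t - dt)) / dt**2  # second difference:  ~10
print(round(v, 3), round(a, 3))
```

Note the division of labor the footnote describes: position and velocity are the freely specifiable state, while the acceleration is what the laws of physics then dictate.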

60 Left as exercises for the reader: Can we imagine a world in which absolute orientation in space were observable? What about a world in which position, velocity, and acceleration were all unobservable, but the rate of change of acceleration were observable?

61 Don’t get lost in the hypotheticals here. Today we strongly believe that there is not any medium pervading space, with respect to which we could measure our velocity. But they did believe that in the late nineteenth century; that’s the aether we’ll be talking about. On the other hand, we do believe that there are fields defined at every point in space, and some of those fields (such as the hypothetical Higgs field) might even have nonzero values in empty space. We now believe that waves, electromagnetic and otherwise, are propagating oscillations in these fields. But a field doesn’t really count as a “medium,” both because it can have a zero value, and because we can’t measure our velocity with respect to it.

On the third hand, it’s possible that we don’t know everything, and some imaginative theoretical physicists have been wondering whether there actually might be fields that do define a rest frame, and with respect to which we could imagine measuring our velocities (see, for example, Mattingly, 2005). Such fields have been whimsically dubbed “aether,” but they are not really the kind of aether that was being proposed in the nineteenth century. In particular, they have nothing to do with the propagation of electromagnetic waves, and are perfectly consistent with the underlying principles of relativity.

62 For some of the historical background, see Miller (1981). Many of the original papers concerning relativity are reprinted in Einstein (1923).

63 To actually experience length contraction or time dilation, we need either to have incredibly exquisite measuring devices, or to be moving at velocities close to the speed of light. Neither such devices nor such velocities are part of our everyday lives, which is why special relativity seems so counterintuitive to us. Of course, the fact that most objects around us have relative velocities that are small compared to the speed of light is an interesting fact about the world, which a complete theory of the universe should try to explain.

64 You might be suspicious that this argument doesn’t really demonstrate the impossibility of moving faster than light, only the impossibility of taking something moving slower than light and accelerating it to move faster than light. We might imagine that there exist objects that are always moving faster than light, so they don’t have to be accelerated. And that certainly is a logical possibility; such hypothetical particles are known as “tachyons.” But as far as we know, tachyons do not exist in the real world, and it’s a good thing, too; the ability to send signals faster than light would entail the ability to send signals backward in time, and that would wreak havoc with our notions of causality.

65 You will sometimes hear that special relativity is unable to deal with accelerating bodies, and you need general relativity to take acceleration into account. That is complete rubbish. General relativity is required when (and only when) gravity becomes important and spacetime is curved. Far away from any gravitational fields, where spacetime is flat, special relativity applies, no matter what is going on—including accelerating bodies. It’s true that freely falling (unaccelerated) trajectories have a special status in special relativity, as they are all created equal. But it is entirely incorrect to leap from there to the idea that accelerated trajectories cannot even be described within the language of special relativity.

66 Apologies for the sloppy lapse into temporal chauvinism (by presuming that one moves forward in time), not to mention giving in to the metaphor of “moving” through time. Rather than saying “Every object moves through spacetime,” it would be less prejudicial to say “The history of every object describes a world line that extends through spacetime.” But sometimes it’s just too tedious to be so pedantically precise all the time.

67 One way of relating relativity to Newtonian spacetime is to imagine “letting the speed of light get infinitely large.” Then the light cones we draw would become wider and wider, and the spacelike region would be squeezed down to a single surface, just as in the Newtonian setup. This is a suggestive picture but not terribly respectable. For one thing, we can always choose units in which the speed of light is unity; just measure time in years, and distance in light-years. So what we would actually try to do is change all of the constants of nature so that other velocities diminished with respect to the speed of light. Even if we did that, the process is highly non-unique; we have to make an arbitrary choice about how to take the limit so that the light cones converge to some particular surfaces of constant time.

68 That is, at least three dimensions of space. It is quite possible, and taken for granted in certain corners of the theoretical-physics community, that there exist additional dimensions of space that for some reason are invisible to us, at least at the low energies to which we have ready access. There are a number of ways in which extra spatial dimensions could be hidden; see Greene (2000), or Randall (2005). Extra hidden timelike dimensions are considered much less likely, but you never know.

69 Both are reprinted in Einstein (1923).


70 Special relativity grew out of the incompatibility of Newtonian mechanics and Maxwellian electrodynamics, while general relativity grew out of the incompatibility of special relativity and Newtonian gravity. Right now, physics faces another troublesome incompatibility: general relativity and quantum mechanics. We are all hopeful that someday they will be united into a theory of quantum gravity. String theory is the leading candidate at present, but matters are not yet settled.

71 It might seem crazy that tension, which pulls things together, is responsible for the acceleration of the universe, which pushes things apart. The point is that the tension from dark energy is equal at every point throughout space, and precisely cancels, so there is no direct pulling. Instead, we are left with the indirect effect of the dark energy on the curvature of spacetime. That effect is to impart a perpetual push to the universe, because the dark energy density does not dilute away.

72 Here is another way of thinking about it. The fact that energy is conserved in Newtonian mechanics is a reflection of an underlying symmetry of the theory: time-translation invariance. The background spacetime in which particles move is fixed once and for all. But in general relativity that’s no longer true; the background is dynamical, pushing things around, changing their energies.

73 See Michell (1784); Laplace’s essay is reprinted as an appendix in Hawking and Ellis (1974). It is occasionally pointed out, with great raising of eyebrows and meaningful murmurs, that the radius of a “black star” as calculated according to Newtonian gravity is precisely the same size as the predicted Schwarzschild radius of a black hole in general relativity (2GM/c², where G is Newton’s constant of gravitation, M is the mass of the object, and c is the speed of light). This coincidence is completely accidental, due primarily to the fact that there aren’t many ways you can create a quantity with units of length out of G, M, and c.
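As a sanity check on that formula, here is the calculation for the Sun; the values of G, the solar mass, and c are standard reference figures, not numbers taken from the note itself.

```python
# Schwarzschild radius r = 2GM/c^2, evaluated for the Sun.
# Constants are standard reference values (SI units), not from the text.
G = 6.674e-11       # Newton's constant, m^3 kg^-1 s^-2
M_sun = 1.989e30    # mass of the Sun, kg
c = 2.998e8         # speed of light, m/s

r_s = 2 * G * M_sun / c**2
print(f"Schwarzschild radius of the Sun: {r_s / 1000:.2f} km")  # about 2.95 km
```

Note that GM/c² is essentially the only length you can build from those constants; the dimensionless factor of 2 is where the two calculations could have differed, and accidentally don’t.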

74 For purposes of this chapter, we are assuming the validity of classical general relativity, even though we know that it must be replaced by a better theory when it comes to singularities. For more on these issues, see Hawking (1988) or Thorne (1994).

75 Feel free to construct your own moral lessons.


76 Referring, of course, to the time machines in George Pal’s 1960 movie version of H. G. Wells’s The Time Machine; Robert Zemeckis’s 1985 film Back to the Future; and the long-running BBC serial Doctor Who, respectively.

77 In the interest of getting on with our story, we’re not being completely fair to the subject of tachyons. Allowing objects that travel faster than light opens the door to paradoxes, but that doesn’t necessarily force us to walk through the door. We might imagine models that allowed for tachyons, but only in self-consistent ways. For some discussion see Feinberg (1967) or Nahin (1999). To make things more confusing, in quantum field theory the word “tachyon” often simply refers to a momentarily unstable configuration of a field, where nothing is actually traveling faster than light.

78 Gödel (1949). In doing research for their massive textbook Gravitation (1973), Charles Misner, Kip Thorne, and John Wheeler visited Gödel to talk about general relativity. What Gödel wanted to ask them, however, was whether contemporary astronomical observations had provided any hints for an overall rotation in the universe. He remained interested in the possible relevance of his solution to the real world.

79 Kerr (1963). The Kerr solution is discussed at a technical level in any modern textbook on general relativity, and at a popular level in Thorne (1994). Thorne relates the story of how Kerr presented his solution at the first Texas Symposium on Relativistic Astrophysics, only to be completely (and somewhat rudely) ignored by the assembled astrophysicists, who were busily arguing about quasars. To be fair, at the time Kerr found his solution he didn’t appreciate that it represented a black hole, although he knew it was a spinning solution to Einstein’s equation. Later on, astrophysicists would come to understand that quasars are powered by spinning black holes, described by Kerr’s spacetime.

80 Tipler (1974). The solution for the curvature of spacetime around an infinite cylinder was actually found by the Dutch physicist (and bomber pilot) Willem Jacob van Stockum in 1937, but Van Stockum didn’t notice that his solution contained closed timelike curves. An excellent overview of both research into time machines in general relativity, and the appearance of time travel in fiction, can be found in Nahin (1999).

81 Erwin Schrödinger, one of the pioneers of quantum mechanics, proposed a famous thought experiment to illustrate the bizarre nature of quantum superposition. He imagined placing a cat in a sealed box containing a radioactive source that, in some fixed time interval, had a 50 percent chance of decaying and activating a mechanism that would release poison gas into the box. According to the conventional view of quantum mechanics, the resulting system is in an equal superposition of “alive cat” and “dead cat,” at least until someone observes the cat; see Chapter Eleven for discussion.

82 Kip Thorne has pointed out that the “grandfather paradox” seems a bit squeamish, with the introduction of the extra generation and all, not to mention that it’s somewhat patriarchal. He suggests we should be contemplating the “matricide” paradox.

83 This rule is sometimes raised to the status of a principle; see discussions in Novikov (1983) or Horwich (1987). Philosophers such as Hans Reichenbach (1958) and Hilary Putnam (1962) have also emphasized that closed timelike curves do not necessitate the introduction of paradoxes, so long as the events in spacetime are internally consistent. Really, it’s just common sense. It’s perfectly obvious that there are no paradoxes in the real world; the interesting question is how Nature manages to avoid them.

84 In Chapter Eleven we’ll backtrack from this statement just a bit, when we discuss quantum mechanics. In quantum mechanics, the real world may include more than one classical history. David Deutsch (1997) has suggested that we might take advantage of multiple histories to include one in which you were in the Ice Age, and one in which you were not. (And an infinite number of others.)

85 Back to the Future was perhaps the least plausible time-travel movie ever. Marty McFly travels from the 1980s back to the 1950s, and commences to change the past right and left. What is worse, whenever he interferes with events that supposedly already happened, ramifications of those changes propagate “instantaneously” into the future, and even into a family photograph that Marty has carried with him. It is hard to imagine how that notion of “instantaneous” could be sensibly defined. Although perhaps not impossible—one would have to posit the existence of an additional dimension with many of the properties of ordinary time, through which Marty’s individual consciousness was transported by the effects of his actions. There is probably a good Ph.D. thesis in there somewhere: “Toward a Consistent Ontology of Time and Memory in Back to the Future, et seq.” I’m not sure what department it would belong to, however.

86 More or less the final word in consistent histories in the presence of closed timelike curves was explored in Robert A. Heinlein’s story “All You Zombies—” (1959). Through a series of time jumps and one sex-change operation, the protagonist manages to be his/her own father, mother, and recruiter into the Temporal Corps. Note that the life story is not, however, a self-contained closed loop; the character ages into the future.

87 For a discussion of this point see Friedman et al. (1990).

88 Actually, we are committed determinists. Human beings are made of particles and fields that rigidly obey the laws of physics, and in principle (although certainly not in practice) we could forget that we are human and treat ourselves as complicated collections of elementary particles. But that doesn’t mean we should shrink from facing up to how bizarre the problem of free will in the presence of closed timelike curves really is.

89 This is a bit more definitive-sounding than what physicists are able to actually prove. Indeed, in some extremely simplified cases we can show that the future can be predicted from the past, even in the presence of closed timelike curves; see Friedman and Higuchi (2006). It seems very likely (to me, anyway) that in more realistically complicated models this will no longer be the case; but a definitive set of answers has not yet been obtained.

90 We might be able to slice spacetime into moments of constant time, even in the presence of closed timelike curves—for example, we can do that in the simple circular-time universe. But that’s a very special case, and in a more typical spacetime with closed timelike curves it will be impossible to find any slicing that consistently covers the entire universe.

91 The exception, obviously, is the rotating black hole. We can certainly imagine creating such a hole by the collapse of a rotating star, but there is a different problem: The closed timelike curves are hidden behind an event horizon, so we can’t actually get there without leaving the external world behind once and for all. We’ll discuss later in the chapter whether that should count as an escape hatch. Perhaps more important, the solution found by Kerr that describes a rotating black hole is valid only in the ideal case where there is absolutely no matter in spacetime; it is a black hole all by itself, not one that is created by the collapse of a star. Most experts in general relativity believe that a real-world collapsing star would never give rise to closed timelike curves, even behind an event horizon.

92 Abbott (1899); see also Randall (2005).

93 The original paper was Gott (1991); he also wrote a popular-level book on the subject (2001). Almost every account you will read of this work will not talk about “massive bodies moving in Flatland,” but rather “perfectly straight, parallel cosmic strings moving in four-dimensional spacetime.” That’s because the two situations are precisely equivalent. A cosmic string is a hypothetical relic from the early universe that can be microscopically thin but stretch for cosmological distances; an idealized version would be perfectly straight and stretch forever, but in the real world cosmic strings would wiggle and curve in complicated ways. But if such a string were perfectly straight, nothing at all would depend on the direction of spacetime along that string; in technical terms, the entire spacetime would be invariant with respect to both translations and boosts along the string. Which means, in effect, that the direction along the string is completely irrelevant, and we are free to ignore it. If we simply forget that dimension, an infinitely long string in three-dimensional space becomes equivalent to a point particle in two-dimensional space. The same goes for a collection of several strings, as long as they are all perfectly straight and remain absolutely parallel to one another. Of course, the idea of pushing around infinitely long and perfectly straight strings is almost as bizarre as imagining that we live in a three-dimensional spacetime. That’s okay; we’re just making unrealistic assumptions because we want to push our theories to the edge of what is conceivable, to distinguish what is impossible in principle from what is merely a daunting technical challenge.

94 Soon after Gott’s paper appeared, Curt Cutler (1992) showed that the closed timelike curves extended to infinity, another signal that this solution didn’t really count as building a time machine (as we think of “building” as something that can be accomplished in a local region). Deser, Jackiw, and ’t Hooft (1992) examined Gott’s solution and found that the total momentum corresponded to that of a tachyon. I worked with Farhi, Guth, and Olum (1992, 1994) to show that an open Flatland universe could never contain enough energy to create a Gott time machine starting from scratch. ’t Hooft (1992) showed that a closed Flatland universe would collapse to a singularity before a closed timelike curve would have a chance to form.

95 Farhi, Guth, and Guven (1990).

96 Think of a plane, seen from the perspective of some particular point, as stretching around for 360 degrees of angle. What happens in Flatland is that every bit of energy decreases the total angle around you; we say that every mass is associated with a “deficit angle,” which is removed from the space by its presence. The more mass, the more angle is removed. The resulting geometry looks like a cone at large distances, rather than like a flat piece of paper. But there are only 360 degrees available to be removed, so there is an upper limit on the total amount of energy we can have in an open universe.

97 “Something like” because we are speaking of the topology of space, not its geometry. That is, we’re not saying that the curvature of space is everywhere perfectly spherical, just that you could smoothly deform it into a sphere. A spherical topology accommodates a deficit angle of exactly 720 degrees, twice the upper limit available in an open universe. Think of a cube (which is topologically equivalent to a sphere). It has eight vertices, each of which corresponds to a deficit angle of 90 degrees, for a total of 720.
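The 720-degree figure quoted here is an instance of Descartes’ theorem on polyhedra: the angular deficits at the vertices of any convex polyhedron sum to exactly 720 degrees, whatever the shape. A quick check with a few Platonic solids (the cube is the note’s own example; the others are added for illustration):

```python
# Descartes' theorem: the vertex deficits of any convex polyhedron total
# 720 degrees -- the "spherical" deficit cited in the note. For each solid
# we record (face angle meeting at a vertex, number of vertices).
polyhedra = {
    "cube":        (3 * 90, 8),   # three squares meet at each corner
    "tetrahedron": (3 * 60, 4),   # three equilateral triangles per corner
    "octahedron":  (4 * 60, 6),   # four equilateral triangles per corner
}
for name, (angle_at_vertex, n_vertices) in polyhedra.items():
    deficit = 360 - angle_at_vertex
    print(f"{name}: {deficit} deg x {n_vertices} vertices = {deficit * n_vertices}")
```

Each line comes out to 720 degrees, twice the 360 degrees available in the open, conical case.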

98 Sagan (1985). The story of how Sagan’s questions inspired Kip Thorne’s work on wormholes and time travel is related in Thorne (1994).

99 As should be obvious from the dates, the work on wormhole time machines actually predates the Flatland explorations. But it involves a bit more exotic physics than Gott’s idea, so it’s logical to discuss the proposals in this order. The original wormhole-as-time-machine paper was Morris, Thorne, and Yurtsever (1988). A detailed investigation into the possible consistency of time travel in wormhole spacetimes was Friedman et al. (1990), and the story is related at a popular level in Thorne (1994).

100 I once introduced Bob Geroch for a talk he was giving. It’s useful in these situations to find an interesting anecdote to relate about the speaker, so I Googled around and stumbled on something perfect: a Star Trek site featuring a map of our galaxy, prominently displaying something called the “Geroch Wormhole.” (Apparently it connects the Beta Quadrant to the Delta Quadrant, and was the source of a nasty spat with the Romulans.) So I printed a copy of the map on a transparency and showed it during my introduction, to great amusement all around. Later Bob told me he assumed I had made it up myself, and was pleased to hear that his work on wormholes had produced a beneficial practical effect on the outside world. The paper that showed you would have to make a closed timelike curve in order to build a wormhole is Geroch (1967).

101 Hawking (1991). In his conclusion, Hawking also claimed that there was observational evidence that travel backward in time was impossible, based on the fact that we had not been invaded by historians from the future. He was joking (I’m pretty sure). Even if it were possible to construct closed timelike curves from scratch, they could never be used to travel backward to a time before the closed curves had been constructed. So there is no observational evidence against the possibility of building a time machine, just evidence that no one has built one yet.


102 See O’Connor and Robertson (1999), Rouse Ball (1908). You’ll remember Laplace as one of the people who were speculating about black holes long before general relativity came along.

103 Apparently Napoleon found this quite amusing. He related Laplace’s quip to Joseph Lagrange, another distinguished physicist and mathematician of the time. Lagrange responded with, “Ah, but it is a fine hypothesis; it explains so many things.” Rouse Ball (1908), 427.

104 Laplace (2007).

105 There is no worry that Laplace’s Demon exists out there in the universe, smugly predicting our every move. For one thing, it would have to be as big as the universe, and have a computational power equal to that of the universe itself.

106 Stoppard (1999), 103-4. Valentine, one presumes, is referring to the idea that the phenomenon of chaos undermines the idea of determinism. Chaotic dynamics, which is very real, happens when small changes in initial conditions lead to large differences in later evolution. As a practical matter, this makes the future extremely difficult to predict for systems that are chaotic (not everything is)—there will always be some tiny error in our understanding of the present state of a system. I’m not sure that this argument carries much force with respect to Laplace’s Demon. As a practical matter, there was no danger that we were ever going to know the entire state of the universe, much less use it to predict the future; this conception was always a matter of principle. And the prospect of chaos doesn’t change that at all.

107 Granted, physicists couldn’t actually live on any of our checkerboards, for essentially anthropic reasons: The setups are too simplistic to allow for the formation and evolution of complex structures that we might identify with intelligent observers. This stifling simplicity can be traced to an absence of interesting “interactions” between the different elements. In the checkerboard worlds we will look at, the entire description consists of just a single kind of thing (such as a vertical or diagonal line) stretching on without alteration. An interesting world is one in which things can persist more or less for an extended period of time, but gradually change via the influence of interactions with other things in the world.

108 This “one moment at a time” business isn’t perfectly precise, as the real world is not (as far as we know) divided up into discrete steps of time. Time is continuous, flowing smoothly from one time to another while going through every possible moment in between. But that’s okay; calculus provides exactly the right set of mathematical tools to make sense of “chugging forward one moment at a time” when time itself is continuous.

109 Note that translations in space and spatial inversions (reflections between left and right) are also perfectly good symmetries. That doesn’t seem as obvious, just from looking at the picture, but that’s only because the states themselves (the patterns of 0’s and 1’s) are not invariant under spatial shifts or reflections.

Lest you think these statements are completely vacuous, there are some symmetries that might have existed, but don’t. We cannot, for example, exchange the roles of time and space. As a general rule, the more symmetries you have, the simpler things become.

110 This whole checkerboard-worlds idea sometimes goes by the name of cellular automata. A cellular automaton is just some discrete grid that follows a rule for determining the next row from the state of the previous row. They were first investigated in the 1940s and 1950s by John von Neumann, who is also the guy who figured out how entropy works in quantum mechanics. Cellular automata are fascinating for many reasons having little to do with the arrow of time; they can exhibit great complexity and can function as universal computers. See Poundstone (1984) or Shalizi (2009).

Not only are we disrespecting cellular automata by pulling them out only to illustrate a few simple features of time reversal and information conservation, but we are also not speaking the usual language of cellular-automaton cognoscenti. For one thing, computer scientists typically imagine that time runs from top to bottom. That’s crazy; everyone knows that time runs from bottom to top on a diagram. More notably, even though we are speaking as if each square is either in the state “white” or the state “gray,” we just admitted that you have to keep track of more information than that to reliably evolve into the future in what we are calling example B. That’s no problem; it just means that we’re dealing with an automaton where the “cells” can take on more than two different states. One could imagine going beyond white and gray to allow squares to have any of four different colors. But for our current purposes that’s a level of complexity we needn’t explicitly introduce.
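A minimal version of a checkerboard world can be written down in a few lines. This is a sketch of my own, not one of the book’s lettered examples: the rule shifts every cell one square to the right (with wraparound), so isolated marks trace out diagonal lines, and the rule is invertible, which is exactly the information-conservation property at issue.

```python
# A toy one-dimensional cellular automaton: each cell takes the value of
# its left neighbor, so patterns propagate diagonally across the grid.
# The rule is invertible -- shifting left undoes it -- so no information
# about the initial state is ever lost.

def step(row):
    """Advance one time step: cyclic shift of the whole row to the right."""
    return row[-1:] + row[:-1]

def step_back(row):
    """The inverse rule: cyclic shift of the whole row to the left."""
    return row[1:] + row[:1]

row = [0, 0, 1, 0, 0, 0, 1, 0]
future = step(step(row))
assert step_back(step_back(future)) == row  # running backward recovers the past

# Time runs from bottom to top, naturally:
for t, r in reversed(list(enumerate([row, step(row), future]))):
    print(t, "".join(".X"[c] for c in r))
```

Because every state has exactly one predecessor, this automaton conserves information; an irreversible rule (say, one that mapped everything to all-white) would not.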

111 If the laws of physics are not completely deterministic—if they involve some random, stochastic element—then the “specification” of the future evolution will involve probabilities, rather than certainties. The point is that the state includes all of the information that is required to do as well as we can possibly do, given the laws of physics that we are working with.

112 Sometimes people count relativity as a distinct theory, distinguishing between “classical mechanics” and “relativistic mechanics.” But more often they don’t. It makes sense, for most purposes, to think of relativity as introducing a particular kind of classical mechanics, rather than a completely new way of thinking. The way we specify the state of a system, for example, is pretty much the same in relativity as it would be in Newtonian mechanics. Quantum mechanics, on the other hand, really is quite different. So when we deploy the adjective classical, it will usually denote a contrast with quantum, unless otherwise specified.

113 It is not known, at least to me, whether Newton himself actually played billiards, although the game certainly existed in Britain at the time. Immanuel Kant, on the other hand, is known to have made pocket money as a student playing billiards (as well as cards).

114 So the momentum is not just a number; it’s a vector, typically denoted by a little arrow. A vector can be defined as a magnitude (length) and a direction, or as a combination of sub-vectors (components) pointing along each direction of space. You will hear people speak, for example, of “the momentum along the x-direction.”

115 This is a really good question, one that bugged me for years. At various points in one’s study of classical mechanics, one hears teachers talk blithely about momenta that are completely inconsistent with the actual trajectory of the system. What is going on?

The problem is that, when we are first introduced to the concept of “momentum,” it is typically defined as the mass times the velocity. But somewhere along the line, as you move into more esoteric realms of classical mechanics, that idea ceases to be a definition and becomes something that you can derive from the underlying theory. In other words, we start conceiving of the essence of momentum as “some vector (magnitude and direction) defined at each point along the path of the particle,” and then derive equations of motion that insist the momentum will be equal to the mass times the velocity. (This is known as the Hamiltonian approach to dynamics.) That’s the way we are thinking in our discussion of time reversal. The momentum is an independent quantity, part of the state of the system; it is equal to the mass times the velocity only when the laws of physics are being obeyed.
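The Hamiltonian derivation alluded to here can be sketched in one line. For a single particle of mass m in a potential V(q), the standard construction runs:

```latex
% Hamiltonian for one particle of mass m in a potential V(q):
H(q, p) = \frac{p^2}{2m} + V(q)
% Hamilton's equations then determine the evolution of q and p:
\dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m},
\qquad
\dot{p} = -\frac{\partial H}{\partial q} = -\frac{dV}{dq}.
```

The first equation says p = m dq/dt, but only along trajectories that actually obey the equations of motion; off such trajectories, q and p are independent pieces of the state, which is exactly the freedom the time-reversal discussion exploits.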

116 David Albert (2000) has put forward a radically different take on all this. He suggests that we should define a “state” to be just the positions of particles, not the positions and momenta (which he would call the “dynamical condition”). He justifies this by arguing that states should be logically independent at each moment of time—the states in the future should not depend on the present state, which clearly they do in the way we defined them, as that was the entire point. But by redefining things in this way, Albert is able to live with the most straightforward definition of time-reversal invariance: “A sequence of states played backward in time still obeys the same laws of physics,” without resorting to any arbitrary-sounding transformations along the way. The price he pays is that, although Newtonian mechanics is time-reversal invariant under this definition, almost no other theory is, including classical electromagnetism. Which Albert admits; he claims that the conventional understanding that electromagnetism is invariant under time reversal, handed down from Maxwell to modern textbooks, is simply wrong. As one might expect, this stance invited a fusillade of denunciations; see, for example, Earman (2002), Arntzenius (2004), or Malament (2004).

Most physicists would say that it just doesn’t matter. There’s no such thing as the one true meaning of time-reversal invariance, which is out there in the world waiting for us to capture its essence. There are only various concepts, which we may or may not find useful in thinking about how the world works. Nobody disagrees on how electrons move in the presence of a magnetic field; they just disagree on the words to use when describing that situation. Physicists tend to express bafflement that philosophers care so much about the words. Philosophers, for their part, tend to express exasperation that physicists can use words all the time without knowing what they actually mean.

117 Elementary particles come in the form of “matter particles,” called “fermions,” and “force particles,” called “bosons.” The known bosons include the photon carrying electromagnetism, the gluons carrying the strong nuclear force, and the W and Z bosons carrying the weak nuclear force. The known fermions fall neatly into two types: six different kinds of “quarks,” which feel the strong force and get bound into composite particles like protons and neutrons, and six different kinds of “leptons,” which do not feel the strong force and fly around freely. These two groups of six are further divided into collections of three particles each; there are three quarks with electric charge +2/3 (the up, charm, and top quarks), three quarks with electric charge -1/3 (the down, strange, and bottom quarks), three leptons with electric charge -1 (the electron, the muon, and the tau), and three leptons with zero charge (the electron neutrino, the muon neutrino, and the tau neutrino). To add to the confusion, every type of quark and lepton has a corresponding antiparticle with the opposite electric charge; there is an anti-up-quark with charge -2/3, and so on.

All of which allows us to be a little more specific about the decay of the neutron (two down quarks and one up): it actually creates a proton (two up quarks and one down), an electron, and an electron antineutrino. It’s important that it’s an antineutrino, because that way the net number of leptons doesn’t change; the electron counts as one lepton, but the antineutrino counts as minus one lepton, so they cancel each other out. Physicists have never observed a process in which the net number of leptons or the net number of quarks changes, although they suspect that such processes must exist. After all, there seem to be a lot more quarks than antiquarks in the real world. (We don’t know the net number of leptons very well, since it’s very hard to detect most neutrinos in the universe, and there could be a lot of antineutrinos out there.)
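The bookkeeping in the last two paragraphs is easy to check by hand, or mechanically; here is a small sketch using only the charges and lepton numbers listed in the note, with exact fractions to avoid rounding.

```python
# Conservation check for neutron decay: n -> p + e- + electron antineutrino.
# Charges and lepton numbers are those listed in the note.
from fractions import Fraction as F

charge = {"up": F(2, 3), "down": F(-1, 3), "electron": F(-1), "antineutrino": F(0)}
lepton_number = {"up": 0, "down": 0, "electron": 1, "antineutrino": -1}

neutron = ["up", "down", "down"]
products = ["up", "up", "down", "electron", "antineutrino"]  # proton + e- + anti-nu

def total(table, particles):
    return sum((table[p] for p in particles), F(0))

assert total(charge, neutron) == 0          # the neutron is neutral
assert total(charge, products) == 0         # electric charge is conserved
assert total(lepton_number, products) == 0  # electron and antineutrino cancel
print("charge and net lepton number are both conserved")
```

The antineutrino’s lepton number of -1 is what makes the ledger balance; with an ordinary neutrino, the net lepton number would jump from 0 to 2.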

118 “Easiest” means “lowest in mass,” because it takes more energy to make higher-mass particles, and when you do make them they tend to decay more quickly. The lightest two kinds of quarks are the up (charge +2/3) and the down (charge -1/3), but combining an up with an anti-down does not give a neutral particle, so we have to look at higher-mass quarks. The next heaviest is the strange quark, with charge -1/3, so it can be combined with an anti-down to make a neutral kaon.

119 Angelopoulos et al. (1998). A related experiment, measuring time-reversal violation by neutral kaons in a slightly different way, was carried out by the KTeV collaboration at Fermilab, outside Chicago (Alavi-Harati et al. 2000).

120 Quoted in Maglich (1973). The original papers were Lee and Yang (1956) and Wu et al. (1957). As Wu had suspected, other physicists were able to reproduce the result very rapidly; in fact, another group at Columbia performed a quick confirmation experiment, the results of which were published back-to-back with the Wu et al. paper (Garwin, Lederman, and Weinrich, 1957).

121 Christenson et al. (1964). Within the Standard Model of particle physics, there is an established method to account for CP violation, developed by Makoto Kobayashi and Toshihide Maskawa (1973), who generalized an idea due to Nicola Cabibbo. Kobayashi and Maskawa were awarded the Nobel Prize in 2008.

122 We’re making a couple of assumptions here: namely, that the laws are time-translation invariant (not changing from moment to moment), and that they are deterministic (the future can be predicted with absolute confidence, rather than simply with some probability). If either of these fails to be true, the definition of whether a particular set of laws is time-reversal invariant becomes a bit more subtle.


123 Almost the same example is discussed by Wheeler (1994), who attributes it to Paul Ehrenfest. In what Wheeler calls “Ehrenfest’s Urn,” exactly one particle switches sides at every step, rather than every particle having a small chance of switching sides.

124 When we have 2 molecules on the right, the first one could be any of the 2,000, and the second could be any of the remaining 1,999. So you might guess there are 1,999 × 2,000 = 3,998,000 different ways this could happen. But that’s overcounting a bit, because the two molecules on the right certainly don’t come in any particular order. (Saying “molecules 723 and 1,198 are on the right” is exactly the same statement as “molecules 1,198 and 723 are on the right.”) So we divide by two to get the right answer: There are 1,999,000 different ways we can have 2 molecules on the right and 1,998 on the left. When we have 3 molecules on the right, we take 1,998 × 1,999 × 2,000 and divide by 3 × 2 different orderings. You can see the pattern; for 4 particles, we would divide 1,997 × 1,998 × 1,999 × 2,000 by 4 × 3 × 2, and so on. These numbers have a name—“binomial coefficients”—and they represent the number of ways we can choose a certain set of objects out of a larger set.
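The counting described in this note is exactly what Python’s math.comb computes; a quick check, using the note’s own numbers:

```python
# Binomial coefficients: ways to choose k molecules (out of 2,000) for the
# right side of the box, reproducing the note's hand calculation.
import math

n = 2000
# Two molecules on the right: 1,999 x 2,000 ordered pairs, divided by 2 orderings.
assert 1999 * 2000 // 2 == 1_999_000
assert math.comb(n, 2) == 1_999_000
# Three molecules: divide the ordered count by 3 x 2 x 1 orderings.
assert math.comb(n, 3) == 1998 * 1999 * 2000 // 6
# The count is overwhelmingly largest near an even 1,000-1,000 split:
assert math.comb(n, 1000) > math.comb(n, 999) > math.comb(n, 2)
print(math.comb(n, 2), math.comb(n, 3))
```

That last assertion is the statistical heart of the Second Law: evenly mixed macrostates correspond to vastly more microstates than lopsided ones.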

125 We are assuming the logarithm is “base 10,” although any other base can be used. The “logarithm base 2” of 8 = 2³ is 3; the logarithm base 2 of 2,048 = 2¹¹ is 11. See Appendix for fascinating details.

126 The numerical value of k is about 3.2 × 10⁻¹⁶ ergs per Kelvin; an erg is a measure of energy, while Kelvin of course measures temperature. (That’s not the value you will find in most references; this is because we are using base-10 logarithms, while the formula is more often written using natural logarithms.) When we say “temperature measures the average energy of moving molecules in a substance,” what we mean is “the average energy per degree of freedom is one-half times the temperature times Boltzmann’s constant.”
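The parenthetical conversion is just a change of logarithm base: S = k ln W and S = k′ log₁₀ W agree when k′ = k ln 10. A quick check, where the erg value of k is the standard reference figure rather than something quoted in the note:

```python
# Converting Boltzmann's constant from natural to base-10 logarithms:
# S = k ln W = (k ln 10) log10 W, so k' = k * ln(10).
import math

k_natural = 1.381e-16  # erg per Kelvin, the usual Boltzmann constant
k_base10 = k_natural * math.log(10)
print(f"k for base-10 logarithms: {k_base10:.2e} erg/K")  # about 3.2e-16
```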

127 The actual history of physics is so much messier than the beauty of the underlying concepts. Boltzmann came up with the idea of “S = k log W,” but those are not the symbols he would have used. His equation was put into that form by Max Planck, who suggested that it be engraved on Boltzmann’s tomb; it was Planck who first introduced what we now call “Boltzmann’s constant.” To make things worse, the equation on the tomb is not what is usually called “Boltzmann’s equation”—that’s a different equation discovered by Boltzmann, governing the evolution of a distribution of a large number of particles through the space of states.

128 One requirement of making sense of this definition is that we actually know how to count the different kinds of microstates, so we can quantify how many of them belong to various macrostates. That sounds easy enough when the microstates form a discrete set (like distributions of particles in one half of a box or the other half) but becomes trickier when the space of states is continuous (like real molecules with specific positions and momenta, or almost any other realistic situation). Fortunately, within the two major frameworks for dynamics—classical mechanics and quantum mechanics—there is a perfectly well-defined “measure” on the space of states, which allows us to calculate the quantity W, at least in principle. In some particular examples, our understanding of the space of states might get a little murky, in which case we need to be careful.

129 Feynman (1964), 119-20.

130 I know what you’re thinking. “I don’t know about you, but when I dry myself off, most of the water goes onto the towel; it’s not fifty-fifty.” That’s true, but only because the fiber structure of a nice fluffy towel provides many more places for the water to be than your smooth skin does. That’s also why your hair doesn’t dry as efficiently, and why you can’t dry yourself very well with pieces of paper.

131 At least in certain circumstances, but not always. Imagine we had a box of gas, where every molecule on the left side was “yellow” and every molecule on the right was “green,” although they were otherwise identical. The entropy of that arrangement would be pretty low and would tend to go up dramatically if we allowed the two colors to mix. But we couldn’t get any useful work out of it.

132 The ubiquity of friction and noise in the real world is, of course, due to the Second Law. When two billiard balls smack into each other, there are only a very small number of ways that all the molecules in each ball could respond precisely so as to bounce off each other without disturbing the outside world in any way; there are a much larger number of ways that those molecules can interact gently with the air around them to create the noise of the two balls colliding. All of the guises of dissipation in our everyday lives—friction, air resistance, noise, and so on—are manifestations of the tendency of entropy to increase.

133 Thought of yet another way: The next time you are tempted to play the Powerball lottery, where you pick five numbers between 1 and 59 and hope that they come up in a random drawing, pick the numbers “1, 2, 3, 4, 5.” That sequence is precisely as likely as any other “random-looking” sequence. (Of course, a nationwide outcry would ensue if you won, as people would suspect that someone had rigged the drawing. So you’d probably never collect, even if you got lucky.)

134 Strictly speaking, since there are an infinite number of possible positions and an infinite number of possible momenta for each particle, the number of microstates per macrostate is also infinite. But the possible positions and momenta for a particle on the left side of the box can be put into one-to-one correspondence with the possible positions and momenta on the right side; even though both are infinite, they’re “the same infinity.” So it’s perfectly legitimate to say that there are an equal number of possible states per particle on each side of the box. What we’re really doing is counting “the volume of the space of states” corresponding to a particular macrostate.

135 To expand on that a little bit, at the risk of getting hopelessly abstract: As an alternative to averaging within a small region of space, we could imagine averaging over a small region in momentum space. That is, we could talk about the average position of particles with a certain value of momentum, rather than vice versa. But that’s kind of crazy; that information simply isn’t accessible via macroscopic observation. That’s because, in the real world, particles tend to interact (bump into one another) when they are nearby in space, but nothing special happens when two distant particles have the same momentum. Two particles that are close to each other in position can interact, no matter what their relative velocities are, but the converse is not true. (Two particles that are separated by a few light years aren’t going to interact noticeably, no matter what their momentum is.) So the laws of physics pick out “measuring average properties within a small region of space” as a sensible thing to do.

136 A related argument has been given by mathematician Norbert Wiener in Cybernetics (1961), 34.

137 There is a loophole. Instead of starting with a system that had delicately tuned initial conditions for which the entropy would decrease, and then letting it interact with the outside world, we could just ask the following question: “Given that this system will go about interacting with the outside world, what state do I need to put it in right now so that its entropy will decrease in the future?” That kind of future boundary condition is not inconceivable, but it’s a little different than what we have in mind here. In that case, what we have is not some autonomous system with a naturally reversed arrow of time, but a conspiracy among every particle in the universe to permit some subsystem to decrease in entropy. That subsystem would not look like the time-reverse of an ordinary object in the universe; it would look like the rest of the world was conspiring to nudge it into a low-entropy state.

138 Note the caveat “at room temperature.” At a sufficiently high temperature, the velocity of the individual molecules is so high that the water doesn’t stick to the oil, and once again a fully mixed configuration has the highest entropy. (At that temperature the mixture will be vapor.) In the messy real world, statistical mechanics is complicated and should be left to professionals.

139 Here is the formula: For each possible microstate x, let p(x) be the probability that the system is in that microstate. The entropy is then the sum over all possible microstates x of the quantity −k p(x) log p(x), where k is Boltzmann’s constant.
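As a sketch in Python (writing the probability of microstate x as p(x)): for W equally likely microstates, this formula reduces to Boltzmann’s S = k log W.

```python
from math import log

def entropy(probs, k=1.0):
    # Sum of -k * p * log(p) over microstates with nonzero probability.
    return -k * sum(p * log(p) for p in probs if p > 0)

# Check: W equally likely states gives back S = k log W.
W = 1000
print(entropy([1.0 / W] * W))  # log(1000), about 6.91
```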

140 Boltzmann actually calculated a quantity H, which is essentially the difference between the maximum entropy and the actual entropy, thus the name of the theorem. But that name was attached to the theorem only later on, and in fact Boltzmann himself didn’t even use the letter H; he called it E, which is even more confusing. Boltzmann’s original paper on the H-Theorem was 1872; an updated version, taking into account some of the criticisms by Loschmidt and others, was 1877. We aren’t coming close to doing justice to the fascinating historical development of these ideas; for various different points of view, see von Baeyer (1998), Lindley (2001), and Cercignani (1998); at a more technical level, see Uffink (2004) and Brush (2003). Any Yale graduates, in particular, will lament the short shrift given to the contributions of Gibbs; see Rukeyser (1942) to redress the balance.

141 Note that Loschmidt is not saying that there are equal numbers of increasing-entropy and decreasing-entropy evolutions that start with the same initial conditions. When we consider time reversal, we switch the initial conditions with the final conditions; all Loschmidt is pointing out is that there are equal numbers of increasing-entropy and decreasing-entropy evolutions overall, when we consider every possible initial condition. If we confine our attention to the set of low-entropy initial conditions, we can successfully argue that entropy will usually increase; but note that we have sneaked in time asymmetry by starting with low-entropy initial conditions rather than final ones.

142 Albert (2000); see also (among many examples) Price (2004). Although I have presented the need for a Past Hypothesis as (hopefully) perfectly obvious, its status is not uncontroversial. For a dash of skepticism, see Callender (2004) or Earman (2006).

143 Readers who have studied some statistical mechanics may wonder why they don’t recall actually doing this. The answer is simply that it doesn’t matter, as long as we are trying to make predictions about the future. If we use statistical mechanics to predict the future behavior of a system, the predictions we get based on the Principle of Indifference plus the Past Hypothesis are indistinguishable from those we would get from the Principle of Indifference alone. As long as there is no assumption of any special future boundary condition, all is well.


144 Quoted in Tribus and McIrvine (1971).

145 Proust (2004), 47.

146 We are, however, learning more and more all the time. See Schacter, Addis, and Buckner (2007) for a recent review of advances in neuroscience that have revealed how the way actual brains reconstruct memories is surprisingly similar to the way they go about imagining the future.

147 Albert (2000).

148 Rowling (2005).

149 Callender (2004). In Callender’s version, it’s not that you die; it’s that the universe ends, but I didn’t want to get confused with Big Crunch scenarios. But really, it would be nice to see more thought experiments in which the future boundary condition was “you fall in love” or “you win the lottery.”

150 Davis (1985, 11) writes: “I will lay out four rules, but each is really only a special application of the great principle of causal order: after cannot cause before . . . there is no way to change the past . . . one-way arrows flow with time.”

151 There are a number of references that go into the story of Maxwell’s Demon in greater detail than we will here. Leff and Rex (2003) collect a number of the original papers. Von Baeyer (1998) uses the Demon as a theme to trace the history of thermodynamics; Seife (2006) gives an excellent introduction to information theory and its role in unraveling this puzzle. Bennett and Landauer themselves wrote about their work in Scientific American (Bennett and Landauer, 1985; Bennett, 1987).

152 This scenario can be elaborated on further. Imagine that the box was embedded in a bath of thermal gas at some temperature T, and that the walls of the box conducted heat, so that the molecule inside was kept in thermal equilibrium with the gas outside. If we could continually renew our information about which side of the box the molecule was on, we could keep extracting energy from it, by cleverly inserting the piston on the appropriate side; after the molecule lost energy to the piston, it would gain the energy back from the thermal bath. What we’ve done is to construct a perpetual motion machine, powered only by our hypothetical limitless supply of information. (Which drives home the fact that information never just comes for free.) Szilárd could even quantify precisely how much energy could be extracted from a single bit of information: kT log 2, where k is Boltzmann’s constant.
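Szilárd’s bound is easy to evaluate; a minimal sketch, assuming room temperature (about 300 Kelvin) and SI units:

```python
from math import log

k = 1.380649e-23  # Boltzmann's constant, joules per Kelvin
T = 300.0         # an assumed room temperature, in Kelvin

energy_per_bit = k * T * log(2)  # Szilard's kT log 2
print(energy_per_bit)  # roughly 2.9e-21 joules per bit of information
```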

153 It’s interesting how, just as much of the pioneering work on thermodynamics in the early nineteenth century was carried out by practical-minded folks who were interested in building better steam engines, much of the pioneering work on information theory in the twentieth century has been carried out by practical-minded folks who were interested in building better communications systems and computers.

154 We can go further than this. Just as Gibbs came up with a definition of entropy that referred to the probability that a system was in various different states, we can define the “information entropy” of a space of possible messages in terms of the probability that the message takes various forms. The formulas for the Gibbs entropy and the information entropy turn out to be identical, although the symbols in them have slightly different meanings.

155 For recent overviews, see Morange (2008) or Regis (2009).

156 The argument that follows comes from Bunn (2009), which was inspired by Styer (2008). See also Lineweaver and Egan (2008) for details and additional arguments.

157 Crick (1990).

158 Schrödinger (1944), 69.

159 From Being to Becoming is the title of a popular book (1980) by Belgian Nobel Laureate Ilya Prigogine, who helped pioneer the study of “dissipative structures” and self-organizing systems in statistical mechanics. See also Prigogine (1955), Kauffman (1993), and Avery (2003).

160 A good recent book is Nelson (2007).

161 He would have been even more wary in modern times; a Google search on “free energy” returns a lot of links to perpetual-motion schemes, along with some resources on clean energy.

162 Informally speaking, the concepts of “useful” and “useless” energy certainly predate Gibbs; his contribution was to attach specific formulas to the ideas, which were later elaborated on by German physicist Hermann von Helmholtz. In particular, what we are calling the “useless” energy is (in Helmholtz’s formulation) simply the temperature of the body times its entropy. The free energy is then the total internal energy of the body minus that quantity.

163 In the 1950s, Claude Shannon built “The Ultimate Machine,” based on an idea by Marvin Minsky. In its resting state, the machine looked like a box with a single switch on one face. If you were to flip the switch, the box would buzz loudly. Then the lid would open and a hand would reach out, flipping the switch back to its original position, and retreat back into the box, which became quiet once more. One possible moral: Persistence can be a good in its own right.

164 Specifically, more massive organisms—which typically have more moving parts and are correspondingly more complex—consume free energy at a higher rate per unit mass than less massive organisms. See, for example, Chaisson (2001).

165 This and other quantitative measures of complexity are associated with the work of Andrey Kolmogorov, Ray Solomonoff, and Gregory Chaitin. For a discussion, see, for example, Gell-Mann (1994).

166 For some thoughts on this particular question, see Dyson (1979) or Adams and Laughlin (1999).


167 Nietzsche (2001), 194. What is it with all the demons, anyway? Between Pascal’s Demon, Maxwell’s Demon, and Nietzsche’s Demon, it’s beginning to look more like Dante’s Inferno than a science book around here. Earlier in The Gay Science (189), Nietzsche touches on physics explicitly, although in a somewhat different context: “We, however, want to become who we are—human beings who are new, unique, incomparable, who give themselves laws, who create themselves! To that end we must become the best students and discoverers of everything lawful and necessary in the world: we must become physicists in order to be creators in this sense—while hitherto all valuations and ideals have been built on ignorance of physics or in contradiction to it. So, long live physics! And even more, long live what compels us to it—our honesty!”

168 Note that, if each cycle were truly a perfect copy of the previous cycles, you would have no memory of having experienced any of the earlier versions (since you didn’t have such a memory before, and it’s a perfect copy). It’s not clear how different such a scenario would be from one in which the cycle occurred only once.

169 For more of the story, see Galison (2003). Poincaré’s paper is (1890).

170 Another subtlety is that, while the system is guaranteed to return to its starting configuration, it is not guaranteed to attain every possible configuration. The idea that a sufficiently complicated system does visit every possible state is equivalent to the idea that the system is ergodic, which we discussed in Chapter Eight in the context of justifying Boltzmann’s approach to statistical mechanics. It’s true for some systems, but not for all systems, and not even for all interesting ones.

171 It’s my book, so Pluto still counts.

172 Roughly speaking, the recurrence time is given by the exponential of the maximum entropy of the system, in units of the typical time it takes for the system to evolve from one state to the next. (We are assuming some fixed definition of when two states are sufficiently different as to count as distinguishable.) Remember that the entropy is the logarithm of the number of states, and an exponential undoes a logarithm; in other words, the recurrence time is simply proportional to the total number of possible states the system can be in, which makes perfect sense if the system spends roughly equal amounts of time in each allowed state.
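To make the scaling concrete, consider the 2,000-molecule box from earlier: each molecule can be on either side, so there are 2²⁰⁰⁰ coarse-grained states, and the recurrence time, in units of the (assumed) hop time between states, is of that same order.

```python
import math

# Number of left/right arrangements for 2,000 molecules is 2**2000.
# Work with the base-10 logarithm to keep the number printable.
log10_states = 2000 * math.log10(2)
print(round(log10_states))  # about 602: a recurrence time of ~10^602 hop times
```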

173 Poincaré (1893).

174 Zermelo (1896a).

175 Boltzmann (1896).

176 Zermelo (1896b); Boltzmann (1897).

177 Boltzmann (1897).

178 “At least” three ways, because the human imagination is pretty clever. But there aren’t that many choices. Another one would be that the underlying laws of physics are intrinsically irreversible.

179 Boltzmann (1896).

180 We’re imagining that the spirit of the recurrence theorem is valid, not the letter of it. The proof of the recurrence theorem requires that the motions of particles be bounded—perhaps because they are planets moving in closed orbits around the Sun, or because they are molecules confined to a box of gas. Neither case really applies to the universe, nor is anyone suggesting that it might. If the universe consisted of a finite number of particles moving in an infinite space, we would expect some of them to simply move away forever, and recurrences would not happen. However, if there are an infinite number of particles in an infinite space, we can have a fixed finite average density—the number of particles per (for example) cubic light-year. In that case, fluctuations of the form illustrated here are sure to occur, which look for all the world like Poincaré’s recurrences.

181 Boltzmann (1897). He made a very similar suggestion in a slightly earlier paper (1895), where he attributed it to his “old assistant, Dr. Schuetz.” It is unclear whether this attribution should be interpreted as a generous sharing of credit, or a precautionary laying of blame.

182 Note that Boltzmann’s reasoning actually goes past the straightforward implications of the recurrence theorem. The crucial point now is not that any particular low-entropy starting state will be repeated infinitely often in the future—although that’s true—but that anomalously low-entropy states of all sorts will eventually appear as random fluctuations.

183 Epicurus is associated with Epicureanism, a philosophical precursor to utilitarianism. In the popular imagination, “epicurean” conjures up visions of hedonism and sensual pleasure, especially where food and drink are concerned; while Epicurus himself took pleasure as the ultimate good, his notion of “pleasure” was closer to “curling up with a good book” than “partying late into the night” or “gorging yourself to excess.”

Much of the original writing by the Atomists has been lost; Epicurus, in particular, wrote a thirty-seven-volume treatise on nature, but his only surviving writings are three letters reproduced in Diogenes Laertius’s Lives of the Philosophers. The atheistic implications of their materialist approach to philosophy were not always popular with later generations.

184 Lucretius (1995), 53.

185 A careful quantitative understanding of the likelihood of different kinds of fluctuations was achieved only relatively recently, in the form of something called the “fluctuation theorem” (Evans and Searles, 2002). But the basic idea has been understood for a long time. The probability that the entropy of a system will take a random jump downward is proportional to the exponential of minus the change in entropy. That’s a fancy way of saying that small fluctuations are common, and large fluctuations are extremely rare.
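In units where Boltzmann’s constant is 1, the rule quoted here is easy to evaluate; a sketch:

```python
from math import exp

def dip_probability(delta_S):
    # Relative probability of a downward entropy fluctuation of size
    # delta_S, with Boltzmann's constant set to 1.
    return exp(-delta_S)

print(dip_probability(1))    # ~0.37: tiny dips happen all the time
print(dip_probability(100))  # ~3.7e-44: modest dips are already fantastically rare
```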

186 It’s tempting to think, But it’s incredibly unlikely for a featureless collection of gas molecules in equilibrium to fluctuate into a pumpkin pie, while it’s not that hard to imagine a pie being created in a world with a baker and so forth. True enough. But as hard as it is to fluctuate a pie all by itself, it’s much more difficult to fluctuate a baker and a pumpkin patch. Most pies that come to being under these assumptions—an eternal universe, fluctuating around equilibrium—will be all by themselves in the universe. The fact that the world with which we are familiar doesn’t seem to work that way is evidence that something about these assumptions is not right.

187 Eddington (1931). Note that what really matters here is not so much the likelihood of significant dips in the entropy of the entire universe, but the conditional question: “Given that one subset of the universe has experienced a dip in entropy, what should we expect of the rest of the universe?” As long as the subset in question is coupled weakly to everything else, the answer is what you would expect, and what Eddington indicated: The entropy of the rest of the universe is likely to be as high as ever. For discussions (at a highly mathematical level) in the context of classical statistical mechanics, see the books by Dembo and Zeitouni (1998) or Ellis (2005). For related issues in the context of quantum mechanics, see Linden et al. (2008).

188 Albrecht and Sorbo (2004).

189 Feynman, Leighton, and Sands (1970).

190 This discussion draws from Hartle and Srednicki (2007). See also Olum (2002), Neal (2006), Page (2008), Garriga and Vilenkin (2008), and Bousso, Freivogel, and Yang (2008).

191 There are a couple of closely related questions that arise when we start comparing different kinds of observers in a very large universe. One is the “simulation argument” (Bostrom 2003), which says that it should be very easy for an advanced civilization to make a powerful computer that simulates a huge number of intelligent beings, and therefore we are most likely to be living in a computer simulation. Another is the “doomsday argument” (Leslie, 1990; Gott, 1993), which says that the human race is unlikely to last for a very long time, because if it did, those of us (now) who live in the early days of human civilization would be very atypical observers. These are very provocative arguments; their persuasive value is left up to the judgment of the reader.

192 See Neal (2006), who calls this approach “Full Non-indexical Conditioning.” “Conditioning” means that we make predictions by asking what the rest of the universe looks like when certain conditions hold (e.g., that we are an observer with certain properties); “full” means that we condition over every single piece of data we have, not only coarse features like “we are an observer”; and “non-indexical” means that we consider absolutely every instance in which the conditions are met, not just one particular instance that we label as “us.”

193 Boltzmann’s travelogue is reprinted in Cercignani (1998), 231. For more details of his life and death, see that book as well as Lindley (2001).


194 Quoted in von Baeyer (2003), 12-13.

195 This is not to say that the ancient Buddhists weren’t wise, but their wisdom was not based on the failure of classical determinism at atomic scales, nor did they anticipate modern physics in any meaningful way, other than the inevitable random similarities of word choice when talking about grand cosmic concepts. (I once heard a lecture claiming that the basic ideas of primordial nucleosynthesis were prefigured in the Torah; if you stretch your definitions enough, eerie similarities are everywhere.) It is disrespectful to both ancient philosophers and modern physicists to ignore the real differences in their goals and methods in an attempt to create tangible connections out of superficial resemblances.

196 More recently, dogs have also been recruited for the cause. See Orzel (2009).

197 We’re still glossing over one technicality—the truth is actually one step more complex (as it were) than this description would have you believe, but it’s not a complication that is necessary for our present purposes. Quantum amplitudes are really complex numbers, which means they are combinations of two numbers: a real number, plus an imaginary number. (Imaginary numbers are what you get when you take the square root of a negative real number; so “imaginary two” is the square root of minus four, and so on.) A complex number looks like a + bi, where a and b are real numbers and “i” is the square root of minus one. If the amplitude associated with a certain option is a + bi, the probability it corresponds to is simply a² + b², which is guaranteed to be greater than or equal to zero. You will have to trust me that this extra apparatus is extremely important to the workings of quantum mechanics—either that, or start learning some of the mathematical details of the theory. (I can think of less rewarding ways of spending your time, actually.)
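Python’s built-in complex numbers make this concrete (the amplitude here is just an illustrative value):

```python
# An amplitude a + b*i corresponds to probability a**2 + b**2,
# which is the squared magnitude of the complex number.
amplitude = complex(0.6, 0.8)  # a = 0.6, b = 0.8, illustrative values
probability = abs(amplitude) ** 2

print(probability)  # 0.6**2 + 0.8**2 = 1.0, and never negative
```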

198 The fact that any particular sequence of events assigns positive or negative amplitudes to the two final possibilities is an assumption we are making for the purposes of our thought experiment, not a deep feature of the rules of quantum mechanics. In any real-world problem, details of the system being considered will determine what precisely the amplitudes are, but we’re not getting our hands quite that dirty at the moment. Note also that the particular amplitudes in these examples take on the numerical values of plus or minus 0.7071—that’s the number which, when squared, gives you 0.5.
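The cancellation at the heart of the thought experiment can be sketched in a few lines (the sign assignments are the assumed ones from the thought experiment, not derived from any particular physical system):

```python
from math import sqrt

amp = 1 / sqrt(2)  # the 0.7071... of the text; it squares to 0.5

# Each two-step history multiplies one amplitude per step; the final
# amplitude is the sum over histories.
constructive = amp * amp + amp * amp    # 0.5 + 0.5 = 1.0
destructive = amp * amp + (-amp) * amp  # 0.5 - 0.5 = 0.0

print(constructive ** 2)  # probability ~1.0
print(destructive ** 2)   # probability 0.0: the histories interfere away
```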

199 At a workshop attended by expert researchers in quantum mechanics in 1997, Max Tegmark took an admittedly highly unscientific poll of the participants’ favored interpretation of quantum mechanics (Tegmark, 1998). The Copenhagen interpretation came in first with thirteen votes, while the many-worlds interpretation came in second with eight. Another nine votes were scattered among other alternatives. Most interesting, eighteen votes were cast for “none of the above/undecided.” And these are the experts.

200 So what does happen if we hook up a surveillance camera but then don’t examine the tapes? It doesn’t matter whether we look at the tapes or not; the camera still counts as an observation, so there will be a chance to observe Miss Kitty under the table. In the Copenhagen interpretation, we would say, “The camera is a classical measuring device whose influence collapses the wave function.” In the many-worlds interpretation, as we’ll see, the explanation is “the wave function of the camera becomes entangled with the wave function of the cat, so the alternative histories decohere.”

201 Many people have thought about changing the rules of quantum mechanics so that this is no longer the case; they have proposed what are called “hidden variable theories” that go beyond the standard quantum mechanical framework. In 1964, theoretical physicist John Bell proved a remarkable theorem: No local theory of hidden variables can possibly reproduce the predictions of quantum mechanics. This hasn’t stopped people from investigating nonlocal theories—ones where distant events can affect each other instantaneously. But they haven’t really caught on; the vast majority of modern physicists believe that quantum mechanics is simply correct, even if we don’t yet know how to interpret it.

202 There is a slightly more powerful statement we can actually make. In classical mechanics, the state is specified by both position and velocity, so you might guess that the quantum wave function assigns probabilities to every possible combination of position and velocity. But that’s not how it works. If you specify the amplitude for every possible position, you are done—you’ve completely determined the entire quantum state. So what happened to the velocity? It turns out that you can write the same wave function in terms of an amplitude for every possible velocity, completely leaving position out of the description. These are not two different states; they are just two different ways of writing exactly the same state. Indeed, there is a cookbook recipe for translating between the two choices, known in the trade as a “Fourier transform.” Given the amplitude for every possible position, you can do a Fourier transform to determine the amplitude for any possible velocity, and vice versa. In particular, if the wave function is an eigenstate, concentrated on one precise value of position (or velocity), its Fourier transform will be completely spread out over all possible velocities (or positions).

203 Einstein, Podolsky, and Rosen (1935).

204 Everett (1957). For discussion from various viewpoints, see Deutsch (1997), Albert (1992), or Ouellette (2007).

205 Note how crucial entanglement is to this story. If there were no entanglement, the outside world would still exist, but the alternatives available to Miss Kitty would be completely independent of what was going on out there. In that case, it would be perfectly okay to attribute a wave function to Miss Kitty all by herself. And thank goodness; that’s the only reason we are able to apply the formalism of quantum mechanics to individual atoms and other simple isolated systems. Not everything is entangled with everything else, or it would be impossible to say much about any particular subsystem of the world.


206 Bekenstein (1973).

207 Hawking (1988), 104. Or, as Dennis Overbye (1991, 107) puts it: “In Cambridge Bekenstein’s breakthrough was greeted with derision. Hawking was outraged. He knew this was nonsense.”

208 For discussion of observations of stellar-mass black holes, see Casares (2007); for supermassive black holes in other galaxies, see Kormendy and Richstone (1995). The black hole at the center of our galaxy is associated with a radio source known as “Sagittarius A*”; see Reid (2008).

209 Okay, for some people the looking is even more fun.

210 Way more than that, actually. As of January 2009, Hawking’s original paper (1975) had been cited by more than 3,000 other scientific papers.

211 As of this moment, we have never detected gravitational waves directly, although indirect evidence for their existence (as inferred from the energy lost by a system of two neutron stars known as the “binary pulsar”) was enough to win the Nobel Prize for Joseph Taylor and Russell Hulse in 1993. Right now, several gravitational-wave observatories are working to discover such waves directly, perhaps from the coalescence of two black holes.

212 The area of the event horizon is proportional to the square of the mass of the black hole; in fact, if the area is A and the mass is M, we have A = 16πG²M²/c⁴, where G is Newton’s constant of gravitation and c is the speed of light.
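Plugging in numbers for a black hole of one solar mass (constants in SI units):

```python
from math import pi

G = 6.674e-11  # Newton's constant, m^3 kg^-1 s^-2
c = 2.998e8    # speed of light, m/s
M = 1.989e30   # one solar mass, kg

A = 16 * pi * G**2 * M**2 / c**4  # horizon area in square meters
print(A)  # about 1.1e8 m^2, roughly 110 square kilometers
```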

213 The analogy between black hole mechanics and thermodynamics was spelled out in Bardeen, Carter, and Hawking (1973).

214 One way to think about why the surface gravity is not infinite is to take seriously the caveat “as measured by an observer very far away.” The force right near the black hole is large, but when you measure it from infinity it undergoes a gravitational redshift, just as an escaping photon would. The force is infinitely strong, but there is an infinite redshift from the point of view of a distant observer, and the effects combine to give a finite answer for the surface gravity.

215 More carefully, Bekenstein suggested that the entropy was proportional to the area of the event horizon. Hawking eventually worked out the constant of proportionality.

216 Hawking (1988), 104-5.

217 You may wonder why it seems natural to think of the electromagnetic and gravitational fields, but not the electron field or the quark field. That’s because of the difference between fermions and bosons. Fermions, like electrons and quarks, are matter particles, distinguished by the fact that they can’t pile on top of one another; bosons, like photons and gravitons, are force particles that pile on with abandon. When we observe a macroscopic, classical-looking field, that’s a combination of a huge number of boson particles. Fermions like electrons and quarks simply can’t pile up that way, so their field vibrations only ever show up as individual particles.

218 Overbye (1991), 109.

219 For reference purposes, the Planck length is equal to (ħG/c³)½, where G is Newton’s constant of gravitation, ħ is Planck’s constant from quantum mechanics, and c is the speed of light. (We’ve set Boltzmann’s constant equal to 1.) So the entropy can be expressed as S = (c³/4ħG)A. The area of the event horizon is related to the mass M of the black hole by A = 16πG²M²/c⁴. Putting it all together, the entropy is related to the mass by S = (4πG/ħc)M².
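A numerical cross-check (an editorial sketch with standard approximate SI constants, and Boltzmann’s constant set to 1 so the entropy is a pure number): the entropy computed as S = (c³/4ħG)A, with horizon area A = 16πG²M²/c⁴, should equal S = (4πG/ħc)M² directly.

```python
import math

# Standard SI values (approximate); Boltzmann's constant set to 1
G = 6.674e-11      # Newton's constant
c = 2.998e8        # speed of light
hbar = 1.0546e-34  # reduced Planck constant
M_sun = 1.989e30   # one solar mass, kg

l_planck_sq = hbar * G / c**3                  # square of the Planck length
A = 16 * math.pi * G**2 * M_sun**2 / c**4      # horizon area

S_from_area = A / (4 * l_planck_sq)            # S = (c^3 / 4*hbar*G) * A
S_from_mass = 4 * math.pi * G * M_sun**2 / (hbar * c)

# The two expressions agree, and give ~10^77 for a solar-mass black hole
assert abs(S_from_area - S_from_mass) / S_from_mass < 1e-12
```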

220 Particles and antiparticles are all “particles,” if that makes sense. Sometimes the word particle is used specifically to contrast with antiparticle, but more often it just refers to any pointlike elementary object. Nobody would object to the sentence “the positron is a particle, and the electron is its antiparticle.”

221 “Known” is an important caveat. Cosmologists have contemplated the possibility that some unknown process, perhaps in the very early universe, might have created copious amounts of very small black holes, perhaps even related to the dark matter. If these black holes were small enough, they wouldn’t be all that dark; they’d be emitting increasing amounts of Hawking radiation, and the final explosions might even be detectable.

222 One speculative but intriguing idea is that we could make a black hole in a particle accelerator, and then observe it decaying through Hawking radiation. Under ordinary circumstances, that’s hopelessly unrealistic; gravity is such an incredibly weak force that we’ll never be able to build a particle accelerator powerful enough to make even a microscopic black hole. But some modern scenarios, featuring hidden dimensions of spacetime, suggest that gravity becomes much stronger than usual at short distances (see Randall, 2005). In that case, the prospect of making and observing small black holes gets upgraded from “crazy” to “speculative, but not completely crazy.” I’m sure Hawking is rooting for it to happen.

Unfortunately, the prospect of microscopic black holes has been seized on by a group of fearmongers to spin scenarios under which the Large Hadron Collider, a new particle accelerator at the CERN laboratory in Geneva, is going to destroy the world. Even if the chances are small, destroying the world is pretty bad, so we should be careful, right? But careful reviews of the possibilities (Ellis et al., 2008) have concluded that there’s nothing the LHC will do that hasn’t occurred many times already elsewhere in the universe; if something disastrous were going to happen, we should have seen signs of it in other astrophysical objects. Of course, it’s always possible that everyone involved in these reviews is making some sort of unfortunate math mistake. But lots of things are possible. The next time you open a jar of tomato sauce, it’s possible that you will unleash a mutated pathogen that will wipe out all life on Earth. It’s possible that we are being watched and judged by a race of super-intelligent aliens, who will think badly of us and destroy the Earth if we allow ourselves to be cowed by frivolous lawsuits and don’t turn on the LHC. When possibilities become as remote as what we’re speaking about here, it’s time to take the risks and get on with our lives.

223 You might be tempted to pursue ideas along exactly those lines—perhaps the information is copied, and is contained simultaneously in the book falling into the singularity and in the radiation leaving the black hole. A result in quantum mechanics—the “No-Cloning Theorem”—says that can’t happen. Not only can information not be destroyed, but it can’t be duplicated.

224 Preskill’s take on the black hole bets can be found at his Web page. For an in-depth explanation of the black hole information loss paradox, see Susskind (2008).

225 You might think we could sidestep this conclusion by appealing to photons once again, because photons are particles that have zero mass. But they do have energy; the energy of a photon is larger when its wavelength is smaller. Because we’re dealing with a box of a certain fixed size, each photon inside will have a minimum allowed energy; otherwise, it simply wouldn’t fit. And the energy of all those photons, through the miracle of E = mc², contributes to the mass of the box. (Each photon is massless, but a box of photons has a mass, given by the sum of the photon energies divided by the speed of light squared.)
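To make the minimum-energy argument concrete, here is a rough estimate (the one-meter box, and taking twice the box size as the longest wavelength that fits, are illustrative assumptions, not from the text):

```python
h = 6.626e-34   # Planck's constant, J*s
c = 2.998e8     # speed of light, m/s

L = 1.0                  # box size in meters (hypothetical)
lam_max = 2 * L          # roughly the longest standing wave that fits
E_min = h * c / lam_max  # minimum photon energy, E = h*c/lambda
m_equiv = E_min / c**2   # its contribution to the box's mass, via E = m*c^2

# Each trapped photon adds at least ~1e-42 kg to the mass of a one-meter box
```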

226 The area of a sphere is equal to 4π times its radius squared. The area of a black hole event horizon, logically enough, is 4π times the Schwarzschild radius squared. This is actually the definition of the Schwarzschild radius, since the highly curved spacetime inside the black hole makes it difficult to sensibly define the distance from the singularity to the horizon. (Remember—that distance is timelike!) So the area of the event horizon is proportional to the square of the mass of the black hole. This is all for black holes with zero rotation and no net electric charge; if the hole is spinning or charged, the formulas are slightly more complicated.

227 The holographic principle is discussed in Susskind (2008); for technical details, see Bousso (2002).

228 Maldacena (1998). The title of Maldacena’s paper, “The Large N Limit of Superconformal Field Theories and Supergravity,” doesn’t immediately convey the excitement of his result. When Juan came to Santa Barbara in 1997 to give a seminar, I stayed in my office to work, having not been especially intrigued by his title. Had the talk been advertised as “An Equivalence Between a Five-Dimensional Theory with Gravity and a Four-Dimensional Theory Without Gravity,” I probably would have attended the seminar. Afterward, it was easy to tell from the conversations going on in the hallway—excited, almost frantic, scribbling on blackboards to work out implications of these new ideas—that I had missed something big.

229 The good thing about string theory is that it seems to be a unique theory; the bad thing is that this theory seems to have many different phases, which look more or less like completely different theories. Just as water can take the form of ice, liquid, or water vapor, depending on the circumstances, in string theory spacetime itself can come in many different phases, with different kinds of particles and even different numbers of observable dimensions of space. And when we say “many,” we’re not kidding—people throw around numbers like 10⁵⁰⁰ different phases, and it could very well be an infinite number. So the theoretical uniqueness of string theory seems to be of little practical help in understanding the particles and interactions of our particular world. See Greene (2000) or Musser (2008) for overviews of string theory, and Susskind (2006) for a discussion (an optimistic one) of the problem of many different phases.

230 Strominger and Vafa (1996). For a popular-level account, see Susskind (2008).

231 While the Strominger-Vafa work implies that the space of states for a black hole in string theory has the right size to account for the entropy, it doesn’t quite tell us what those states look like when gravity is turned on. Samir Mathur and collaborators have suggested that they are “fuzzballs”—configurations of oscillating strings that fill up the volume of the black hole inside the event horizon (Mathur, 2005).


232 In the eighteenth century, Gottfried Wilhelm Leibniz posed the Primordial Existential Question: “Why is there something rather than nothing?” (One might answer, “Why not?”) Subsequently, some philosophers have tried to argue that the very existence of the universe should be surprising to us, on the grounds that “nothing” is simpler than “something” (e.g., Swinburne, 2004). But that presupposes a somewhat dubious definition of “simplicity,” as well as the idea that this particular brand of simplicity is something a universe ought to have—neither of which is warranted by either experience or logic. See Grünbaum (2004) for a discussion.

233 Some would argue that God plays the role of the Universal Chicken, creating the universe in a certain state that accounts for the low-entropy beginning. This doesn’t seem like a very parsimonious explanatory framework, as it’s unclear why the entropy would be quite so low, and why (for one thing among many) there should be a hundred billion galaxies in the universe. More important, as scientists we want to explain the most with the least, so if we can come up with naturalistic theories that account for the low entropy of our observed universe without recourse to anything other than the laws of physics, that would be a triumph. Historically, this has been a very successful strategy; pointing at “gaps” in naturalistic explanations of the world and insisting that only God can fill them has, by contrast, had a dismal track record.

234 This isn’t exactly true, although it’s a pretty good approximation. If a certain kind of particle couples very weakly to the rest of the matter and radiation in the universe, it can essentially stop interacting, and drop out of contact with the surrounding equilibrium configuration. This is a process known as “freeze-out,” and it is crucially important to cosmologists—for example, when they would like to calculate the abundance of dark matter particles, which plausibly froze out at a very early time. In fact, the matter and radiation in the late universe (today) froze out long ago, and we are no longer in equilibrium even when we ignore gravity. (The temperature of the cosmic microwave background is about 3 Kelvin, so if we were in equilibrium, everything around you would be at a temperature of 3 Kelvin.)

235 The speed of light divided by the Hubble constant defines the “Hubble length,” which works out to about 14 billion light-years in the current universe. For not-too-crazy cosmologies, this quantity is almost the same as the age of the universe times the speed of light, so they can be used interchangeably. Because the universe expands at different rates at different times, the current size of our comoving patch can actually be somewhat larger than the Hubble length.
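The arithmetic behind the quoted figure, assuming a Hubble constant of roughly 70 kilometers per second per megaparsec (a standard approximate value, not stated in the note):

```python
c = 2.998e8             # speed of light, m/s
Mpc = 3.086e22          # meters per megaparsec
light_year = 9.461e15   # meters per light-year

H0 = 70e3 / Mpc         # ~70 km/s/Mpc expressed in 1/s
hubble_length_m = c / H0
hubble_length_Gly = hubble_length_m / light_year / 1e9

# Comes out to roughly 14 billion light-years, as quoted in the note
```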

236 See, for example, Kofman, Linde, and Mukhanov (2002). That paper was written in response to a paper by Hollands and Wald (2002) that raised some similar issues to those we’re exploring in this chapter, in the specific context of inflationary cosmology. For a popular-level discussion that takes a similar view, see Chaisson (2001).

237 Indeed, Eric Schneider and Dorion Sagan (2005) have argued that the “purpose of life” is to accelerate the rate of entropy production by smoothing out gradients in the universe. It’s hard to make a proposal like that rigorous, for various reasons; one is that, while the Second Law says that entropy tends to increase, there’s no law of nature that says entropy tends to increase as fast as it can.

238 Also in contrast with the gravitational effects of sources of energy density other than “particles.” This loophole is relevant to the real world because of dark energy. The dark energy isn’t a collection of particles; it’s a smooth field that pervades the universe, and its gravitational impact is to push things apart. Nobody ever said things would be simple.

239 Other details are also important. In the early universe, ordinary matter is ionized—electrons are moving freely, rather than being attached to atomic nuclei. The pressure in an ionized plasma is generally larger than in a collection of atoms.

240 Penrose (2005), 706. An earlier version of this argument can be found in Penrose (1979).

241 Most of the matter in the universe—between 80 percent and 90 percent by mass—is in the form of dark matter, not the ordinary matter of atoms and molecules. We don’t know what the dark matter is, and it’s conceivable that it takes the form of small black holes. But there are problems with that idea, including the difficulty of making so many black holes in the first place. So most cosmologists tend to believe that the dark matter is very likely to be some sort of new elementary particle (or particles) that hasn’t yet been discovered.

242 Black-hole entropy increases rapidly as the black hole gains mass—it’s proportional to the mass squared. (Entropy goes like area, which goes like radius squared, and the Schwarzschild radius is proportional to the mass.) So a black hole of 10 million solar masses would have 100 times the entropy of one coming in at 1 million solar masses.
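The scaling in this note is easy to check, and it also shows why black-hole mergers produce entropy (a toy calculation that ignores any energy radiated away during the merger):

```python
def bh_entropy_ratio(m1, m2):
    # S is proportional to M^2, so the ratio depends only on the mass ratio
    return (m1 / m2) ** 2

# 10 million vs. 1 million solar masses: 10x the mass, 100x the entropy
assert bh_entropy_ratio(1e7, 1e6) == 100.0

# Toy merger: two holes of mass M have combined entropy proportional to 2*M^2,
# while a single hole of mass 2M has entropy proportional to (2*M)^2 = 4*M^2
M = 1.0
S_before = 2 * M**2
S_after = (2 * M)**2
```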

243 Penrose (2005), 707.

244 The argument here closely follows a paper I wrote in collaboration with Jennifer Chen (Carroll and Chen, 2004).

245 See, for example, Zurek (1982).

246 It’s also very far from being accepted wisdom among physicists. Not that there is any accepted answer to the question “What do the highest-entropy states look like when gravity is taken into account?” other than “We don’t know.” But hopefully you’ll become convinced that “empty space” is the best answer we have at the moment.

247 This is peeking ahead a bit, but note that we could also play this game backward in time. That is: start from some configuration of matter in the universe, a slice of spacetime at one moment in time. In some places we’ll see expansion and dilution, in others contraction and collapse and ultimately evaporation. But we can also ask what would happen if we evolved that “initial” state backward in time, using the same reversible laws of physics. The answer, of course, is that we would find the same kind of behavior. The regions that are expanding toward the future are contracting toward the past, and vice versa. But ultimately space would empty out as the “expanding” regions took over. The very far past looks just like the very far future: empty space.

248 Here in our own neighborhood, NASA frequently uses a similar effect—the “gravitational slingshot”—to help accelerate probes to the far reaches of the Solar System. If a spacecraft passes by a massive planet in just the right way, it can pick up some of the planet’s energy of motion. The planet is so heavy that it hardly notices, but the spacecraft gets flung away at a much higher velocity.

249 Wald (1983).

250 In particular, we can define a “horizon” around every observable patch of de Sitter space, just as we can with black holes. Then the entropy formula for that patch is precisely the same formula as the entropy of a black hole—it’s the area of that horizon, measured in Planck units, divided by four.
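Plugging in numbers for a de Sitter phase with a Hubble parameter comparable to today’s (an assumed value, roughly 70 km/s/Mpc) yields the oft-quoted entropy of about 10¹²²:

```python
import math

G = 6.674e-11      # Newton's constant
c = 2.998e8        # speed of light
hbar = 1.0546e-34  # reduced Planck constant
H = 2.27e-18       # assumed Hubble parameter, 1/s (roughly 70 km/s/Mpc)

r_horizon = c / H                        # de Sitter horizon radius
A = 4 * math.pi * r_horizon**2           # horizon area
l_planck_sq = hbar * G / c**3            # square of the Planck length
S_deSitter = A / (4 * l_planck_sq)       # area in Planck units, divided by four
```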

251 If H is the Hubble parameter in de Sitter space, the temperature is T = (ħ/2πk)H, where ħ is Planck’s constant and k is Boltzmann’s constant. This was first worked out by Gary Gibbons and Stephen Hawking (1977).
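For a sense of scale, here is the Gibbons-Hawking temperature T = (ħ/2πk)H evaluated for a Hubble parameter comparable to today’s (an assumed value); the result is absurdly cold:

```python
import math

hbar = 1.0546e-34   # reduced Planck constant, J*s
k_B = 1.381e-23     # Boltzmann's constant, J/K
H = 2.27e-18        # assumed Hubble parameter, 1/s

T = hbar * H / (2 * math.pi * k_B)   # T = (hbar / 2*pi*k) * H

# Roughly 3e-30 Kelvin, some thirty orders of magnitude colder than the CMB
```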

252 You might think this prediction is a bit too bold, relying on uncertain extrapolations into regimes of physics that we don’t really understand. It’s undeniably true that we don’t have direct experimental access to an eternal de Sitter universe, but the scenario we have sketched out relies only on a few fairly robust principles: the existence of thermal radiation in de Sitter space, and the relative frequency of different kinds of random fluctuations. In particular, it’s tempting to wonder whether there is some special kind of fluctuation that makes a Big Bang, and that kind of fluctuation is more likely than a fluctuation that makes a Boltzmann brain. That might be what actually happens, according to the ultimately correct laws of physics—indeed, we’ll propose something much like that later in the book—but it’s absolutely not what happens under the assumptions we are making here. The nice thing about thermal fluctuations in eternal de Sitter space is that we understand thermal fluctuations very well, and we can calculate with confidence how frequently different fluctuations occur. Specifically, fluctuations involving large changes in entropy are enormously less likely than fluctuations involving small changes in entropy. It will always be easier to fluctuate into a brain than into a universe, unless we depart from this scenario in some profound way.

253 Dyson, Kleban, and Susskind (2002); Albrecht and Sorbo (2004).


254 Toulmin (1988), 393.

255 See Guth (1997), also Overbye (1991).

256 Space can be curved even if spacetime is flat. A space with negative curvature, expanding with a size proportional to time, corresponds to a spacetime that is completely flat. Likewise, space can be flat even if spacetime is curved; if a spatially flat universe is expanding (or contracting) in time, the spacetime will certainly be curved. (The point is that the expansion contributes to the total curvature of spacetime, and the curvature of space also contributes. That’s why an expanding negatively curved space can correspond to a spacetime with zero curvature; the contribution from spatial curvature is negative and can precisely cancel the positive contribution from the expansion.) When cosmologists refer to “a flat universe” they mean a spatially flat universe, and likewise for positive or negative curvature.
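The negatively curved, linearly expanding case described here is the Milne model; a small numeric check using the standard Friedmann equation, H² + kc²/a² = (8πG/3)ρ (the equation itself is assumed, not quoted from the note):

```python
# Milne model: negative curvature (k = -1) with scale factor a(t) = c*t.
# In the Friedmann equation H^2 + k*c^2/a^2 = (8*pi*G/3)*rho, the expansion
# term and the curvature term cancel exactly, so rho = 0: empty space, which
# is just flat Minkowski spacetime in disguise.
c = 1.0  # work in units with c = 1

def friedmann_lhs(t, k=-1.0):
    a = c * t        # scale factor proportional to time
    H = 1.0 / t      # Hubble parameter, a_dot/a
    return H**2 + k * c**2 / a**2

for t in (0.5, 1.0, 7.3):
    assert abs(friedmann_lhs(t)) < 1e-12
```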

257 They add up to less than 180 degrees.

258 One way of measuring the curvature of the universe is indirectly, using Einstein’s equation. General relativity implies a relationship between the curvature, the expansion rate, and the amount of energy in the universe. For a long time, astronomers measured the expansion rate and the amount of matter in the universe (which they assumed was the most important part of the energy), and kept finding that the universe was pretty close to flat, but it should have a tiny amount of negative curvature. The discovery of dark energy changed all that; it provided exactly the right amount of energy to make the universe flat. Subsequently, astronomers have been able to measure the curvature directly, by using the pattern of temperature fluctuations in the cosmic microwave background as a kind of giant triangle (Miller et al., 1999; de Bernardis et al., 2000; Spergel et al., 2003). This method indicates strongly that the universe really is spatially flat, which is a nice consistency check with the indirect reasoning.

259 Nobody else calls it that. Because this form of dark energy serves the purpose of driving inflation, it is usually postulated to arise from a hypothetical field dubbed the “inflaton.” It would be nice if the inflaton field served some other purpose, or fit snugly into some more complete theory of particle physics, but as yet we don’t know enough to say.

260 You might think that, because the Big Bang itself is a point, the past light cones of any event in the universe must necessarily meet at the Big Bang. But that’s misleading. For one thing, the Big Bang is not a point in space—it’s a moment in time. More important, the Big Bang in classical general relativity is a singularity and shouldn’t even be included in the spacetime; we should talk only about what happens after the Big Bang. And even if we included moments immediately after the Big Bang, the past light cones would not overlap.

261 The original papers are by Andrei Linde (1981) and Andreas Albrecht and Paul Steinhardt (1982). See Guth (1997) for an accessible discussion.

262 See, for example, Spergel et al. (2003).

263 See Vilenkin (1983), Linde (1986), Guth (2007).

264 This scenario was invented under the slightly misleading name of “open inflation” (Bucher, Goldhaber, and Turok, 1995). At the time, before the discovery of dark energy, cosmologists had begun to get a bit nervous—inflation seemed to robustly predict that the universe should be spatially flat, but observations of the density of matter kept implying that there wasn’t enough energy to make it work out. Some people panicked, and tried to invent models of inflation that didn’t necessarily predict a flat universe. That turned out not to be necessary—the dark energy has exactly the right amount of energy density to make the universe flat, and observations of the cosmic microwave background strongly indicate that it really is flat (Spergel et al., 2003). But that’s okay, because out of the panic came a clever idea—how to make a realistic universe inside a bubble embedded in a false-vacuum background.

265 In fact, the early papers on eternal inflation were set in the context of new inflation, not old-inflation-with-new-inflation-inside-the-bubbles. In new inflation it is actually more surprising that inflation is eternal, as you would think the field would just roll down the hill defined by its potential energy. But we should remember that the rolling field has quantum fluctuations; if conditions are right, those fluctuations can be quite large. In fact, they can be large enough that in some regions of space the field actually moves up the hill, even though on average it is rolling down. Regions where it rolls up are rare, but they expand faster because the energy density is larger. We end up with a picture similar to the old-inflation story; lots of the universe sees an inflaton roll down and convert to matter and radiation, but an increasing volume stays stuck in the inflating stage, and inflation never ends.

266 See Susskind (2006), or Vilenkin (2006). An earlier, related version of a landscape of different vacuum states was explored by Smolin (1993).

267 In the original papers about inflation, it was implicitly assumed that the particles in the early universe were close to thermal equilibrium. The scenario described here, which seems a bit more robust, goes under the name of “chaotic inflation,” and was originally elucidated by Andrei Linde (1983, 1986).

268 See for example Penrose (2005), Hollands and Wald (2002).

269 This is not to imply that choosing a configuration of the universe randomly from among all possible allowed states is something we are ordered to do, or that there is some reason to believe that it’s actually what happens. Rather, that if the state of the universe is clearly not chosen randomly, then there must be something that determines how it is chosen; that’s a clue we would like to use to help understand how the universe works.

270 You may object that there is another candidate for a “high-entropy state”—the chaotic mess into which our universe would evolve if we let it collapse. (Or equivalently, if we started with a typical microstate consistent with the current macrostate of the universe, and ran the clock backward.) It’s true that such a state is much lumpier than the current universe, as singularities and black holes would form in the process of collapse. But that’s exactly the point; even among states that pack the entire current universe into a very small region, an incredibly small fraction take the form of a smooth patch dominated by dark super-energy, as required by inflation. Most such states, on the contrary, are in a regime where quantum field theory doesn’t apply, because quantum gravity is absolutely necessary to describe them. But “we don’t know how to describe such states” is a very different statement than “such states don’t exist” or even “we can ignore such states when we enumerate the possible initial states of the universe.” If the dynamics are reversible, we have no choice but to take those states very seriously.

271 For example, Guth (1997).


272 Pascal (1995), 66.

273 What would be even better is if some young person read this book, became convinced that this was a serious problem worthy of our attention, and went on to solve it. Or an older person, it doesn’t really matter. In either case, if you end up finding an explanation for the arrow of time that becomes widely accepted within the physics community, please let me know if this book had anything to do with it.

274 Perhaps the closest to something along these lines is the “Holographic Cosmology” scenario advocated by Tom Banks and Willy Fischler (2005; also Banks, 2007). They suggest that the effective dynamical laws of quantum gravity could be very different in different spacetimes. In other words, the laws of physics themselves could be time-dependent. This is a speculative scenario, but worth paying attention to.

275 A related strategy is to posit a particular form for the wave function of the universe, as advocated by James Hartle and Stephen Hawking (1983). They rely on a technique known as “Euclidean quantum gravity”; attempting to do justice to the pros and cons of this approach would take us too far afield from our present concerns. It has been suggested that the Hartle-Hawking wave function implies that the universe must be smooth near the Big Bang, which would help explain the arrow of time (Halliwell and Hawking, 1985), but the domain of validity of the approximations used to derive this result is a bit unclear. My own suspicion is that the Hartle-Hawking wave function predicts that we should live in empty de Sitter space, just as a straightforward contemplation of entropy would lead us to expect.

276 Penrose (1979). When you dig deeply into the mathematics of spacetime curvature, you find that it comes in two different forms: “Ricci curvature,” named after Italian mathematician Gregorio Ricci-Curbastro, and “Weyl curvature,” named after German mathematician Hermann Weyl. Ricci curvature is tied directly to the matter and energy in spacetime—where there’s stuff, the Ricci curvature is nonzero, and where there’s not, the Ricci curvature vanishes. Weyl curvature, on the other hand, can exist all by itself; a gravitational wave, for example, propagates freely through space, and leads to Weyl curvature but no Ricci curvature. The Weyl curvature hypothesis states that singularities in one direction of time always have vanishing Weyl curvature, while those at the other are unconstrained. We would assign the descriptive adjectives initial and final after the fact, since the low-Weyl-curvature direction would have a lower entropy.

277 Another problem is the apparent danger of Boltzmann brains if the universe enters an eternal de Sitter phase in the future. Also, the concept of a “singularity” from classical general relativity is unlikely to survive intact in a theory of quantum gravity. A more realistic version of the Weyl curvature hypothesis would have to be phrased in quantum-gravity language.

278 Gold (1962).

279 For a brief while, Stephen Hawking believed that his approach to quantum cosmology predicted that the arrow of time would actually reverse if the universe re-collapsed (Hawking, 1985). Don Page convinced him that this was not the case—the right interpretation was that the wave function had two branches, oriented oppositely in time (Page, 1985). Hawking later called this his “greatest blunder,” in a reference to Einstein’s great blunder of suggesting the cosmological constant rather than predicting the expansion of the universe (Hawking, 1988).

280 Price (1996).

281 See, for example, Davies and Twamley (1993), Gell-Mann and Hartle (1996). A different form of future boundary condition, which does not lead to a reversal of the arrow of time, has been investigated in particle physics; see Lee and Wick (1970), Grinstein, O’Connell, and Wise (2009).

282 Once again, the English language lacks the vocabulary for nonstandard arrows of time. We will choose the convention that the “direction of time” is defined by us, here in the “ordinary” post-Big-Bang phase of the universe; with respect to this choice, entropy decreases “toward the future” in the collapsing phase. Of course, organisms that actually live in that phase will naturally define things in the opposite sense; but it’s our book, and the choice is simply a matter of convention, so we can make the rules.

283 Greg Egan worked through the dramatic possibilities of this scenario, in his short story “The Hundred Light-Year Diary” (reprinted in Egan, 1997).

284 Cf. Callender’s Fabergé eggs, discussed in Chapter Nine.

285 See also Carroll (2008).

286 One of the first bouncing scenarios was simply called the “Pre-Big-Bang scenario.” It makes use of a new field called the “dilaton” from string theory, which affects the strength of gravity as it changes (Gasperini and Veneziano, 1993). A related example is the “ekpyrotic universe” scenario, which was later adapted into the “cyclic universe.” In this picture, the energy that powers what we see as the “Bang” comes when a hidden, compact dimension squeezes down to zero size. The cyclic universe idea is discussed in depth in a popular book by Paul Steinhardt and Neil Turok (2007); its predecessor, the ekpyrotic universe, was proposed by Khoury et al. (2001). There are also bouncing cosmologies that don’t rely on strings or extra dimensions, but on the quantum properties of spacetime itself, under the rubric of “loop quantum cosmology” (Bojowald, 2006).

287 Hopefully, after the appearance of this book, that will all change.

288 The same argument holds for Steinhardt and Turok’s cyclic universe. Despite the label, their model is not recurrent in the way that the Boltzmann-Lucretius model would be. In an eternal universe with a finite-sized state space, allowed sequences of events happen both forward and backward in time, equally often. But in the Steinhardt-Turok model, the arrow of time always points in the same direction; entropy grows forever, requiring an infinite amount of fine-tuning at any one moment. Interestingly, Richard Tolman (1931) long ago discussed problems of entropy in a cyclic universe, although he talked about only the entropy of matter, not including gravity. See also Bojowald and Tavakol (2008).

289 This discussion assumes that the assumptions we previously made in discussing the entropy of our comoving patch remain valid—in particular, that it makes sense to think of the patch as an autonomous system. That is certainly not necessarily correct, but it is usually implicitly assumed by people who study these scenarios.

290 Aguirre and Gratton (2003). Hartle, Hawking, and Hertog (2008) also investigated universes with high entropy in the past and future and low entropy in the middle, in the context of Euclidean quantum gravity.

291 This is true even in ordinary nongravitational situations, where the total energy is strictly conserved. When a high-energy state decays into a lower-energy one, like a ball rolling down a hill, energy isn’t created or destroyed; it’s just transformed from a useful low-entropy form into a useless high-entropy form.

292 Farhi, Guth, and Guven (1990). See also Farhi and Guth (1987), and Fischler, Morgan, and Polchinski (1990a, 1990b). Guth writes about this work in his popular-level book (1997).

293 The most comprehensive recent work on this question was carried out by Anthony Aguirre and Matthew Johnson (2006). They catalogued all the different ways that baby universes might be created by quantum tunneling, but in the end were unable to make a definitive statement about what actually happens. (“The unfortunate bottom line, then, is that while the relation between the various nucleation processes is much clearer, the question of which ones actually occur remains open.”) From a completely different perspective, Freivogel et al. (2006) considered inflation in an anti-de Sitter background, using Maldacena’s correspondence. They concluded that baby universes were not created. But our interest is de Sitter backgrounds, not anti-de Sitter backgrounds; it’s unclear whether the results can be extended from one context to the other. For one more take on the evolution of de Sitter space, see Bousso (1998).

294 Carroll and Chen (2004).

295 One assumption here is that the de Sitter space is in a true vacuum state; in particular, that there is no other state of the theory where the vacuum energy vanishes, and spacetime could look like Minkowski space. To be honest, that is not necessarily a realistic assumption. In string theory, for example, we are pretty sure that 10-dimensional Minkowski space is a good solution of the theory. Unlike de Sitter, Minkowski space has zero temperature, so can plausibly avoid the creation of baby universes. To make the scenario described here work, we have to imagine either that there are no states with zero vacuum energy, or that the amount of spacetime that is actually in such a state is sufficiently small compared to the de Sitter regions.


296 And that’s despite the fact that, just as the manuscript was being completed, another book with exactly the same title appeared on the market! (Viola, 2009). His subtitle is quite different, however: “Rediscovering the Ageless Purpose of God.” I do hope nobody orders the wrong book by accident.

297 Feynman, Leighton, and Sands (1970), 46-8.

298 Popper (1959). Note that Popper went a bit further than the demarcation problem; he wanted to understand all of scientific progress as a series of falsified conjectures. Compared to how science is actually done, this is a fairly impoverished way of understanding the process; ruling out conjectures is important, but there’s a lot more that goes into the real workings of science.

299 See Deutsch (1997) for more on this point.

300 For one example among many, see Swinburne (2004).

301 Lemaître (1958).

302 Steven Weinberg put it more directly: “The more the universe seems comprehensible, the more it also seems pointless” (Weinberg 1977, 154).

303 I regret that this book has paid scant attention to current and upcoming new experiments in fundamental physics. The problem is that, as fascinating and important as those experiments are, it’s very hard to tell ahead of time what we are going to learn from them, especially about a subject as deep and all-encompassing as the arrow of time. We’re not going to build a telescope that will use tachyons to peer into other universes, unfortunately. What we might do is build particle accelerators that reveal something about supersymmetry, which in turn teaches us something about string theory, which we can use to understand more about quantum gravity. Or we might gather data from giant telescopes—collecting not only photons of light, but also cosmic rays, neutrinos, gravitational waves, or even particles of dark matter—that reveal something surprising about the evolution of the universe. The real world surprises us all the time: dark matter and dark energy are obvious examples. As a theoretical physicist, I’ve written this book from a rather theoretical perspective, but as a matter of history it’s often new experiments that end up awakening us from our dogmatic slumbers.


304 These properties are behind the “magic of mathematics” appealed to above. For example, suppose we wanted to figure out what was meant by 10 to the power 0.5. I know that, whatever that number is, it has to have the property that 10^0.5 · 10^0.5 = 10^(0.5 + 0.5) = 10^1 = 10.

In other words, the number 10^0.5 times itself gives us 10; that means that 10^0.5 must simply be the square root of 10. (And likewise for any other base raised to the power 0.5.) By similar tricks, we can figure out the exponential of any number we like.
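The exponent rule in this note is easy to check numerically. The following is a minimal illustrative sketch (not from the book) confirming that raising a base to the power 0.5 gives its square root, for 10 and for a few other bases:

```python
import math

# The rule 10^0.5 * 10^0.5 = 10^(0.5+0.5) = 10^1 = 10 implies
# that 10^0.5 is the square root of 10.
x = 10 ** 0.5
print(x * x)          # recovers 10 (up to floating-point rounding)
print(math.sqrt(10))  # agrees with x

# And likewise for any other base raised to the power 0.5.
for base in (2.0, 10.0, 100.0):
    assert math.isclose((base ** 0.5) ** 2, base)
```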