Superintelligence: Paths, Dangers, Strategies - Nick Bostrom (2014)
Chapter 11. Multipolar scenarios
We have seen (particularly in Chapter 8) how menacing a unipolar outcome could be, one in which a single superintelligence obtains a decisive strategic advantage and uses it to establish a singleton. In this chapter, we examine what would happen in a multipolar outcome, a post-transition society with multiple competing superintelligent agencies. Our interest in this class of scenarios is twofold. First, as alluded to in Chapter 9, social integration might be thought to offer a solution to the control problem. We already noted some limitations with that approach, and this chapter paints a fuller picture. Second, even without anybody setting out to create a multipolar condition as a way of handling the control problem, such an outcome might occur anyway. So what might such an outcome look like? The resulting competitive society is not necessarily attractive, nor long-lasting.
In singleton scenarios, what happens post-transition depends almost entirely on the values of the singleton. The outcome could thus be very good or very bad, depending on what those values are. What the values are depends, in turn, on whether the control problem was solved, and—to the degree to which it was solved—on the goals of the project that created the singleton.
If one is interested in the outcome of singleton scenarios, therefore, one really only has three sources of information: information about matters that cannot be affected by the actions of the singleton (such as the laws of physics); information about convergent instrumental values; and information that enables one to predict or speculate about what final values the singleton will have.
In multipolar scenarios, an additional set of constraints comes into play, constraints having to do with how agents interact. The social dynamics emerging from such interactions can be studied using techniques from game theory, economics, and evolution theory. Elements of political science and sociology are also relevant insofar as they can be distilled and abstracted from some of the more contingent features of human experience. Although it would be unrealistic to expect these constraints to give us a precise picture of the post-transition world, they can help us identify some salient possibilities and challenge some unfounded assumptions.
We will begin by exploring an economic scenario characterized by a low level of regulation, strong protection of property rights, and a moderately rapid introduction of inexpensive digital minds.1 This type of model is most closely associated with the American economist Robin Hanson, who has done pioneering work on the subject. Later in this chapter, we will look at some evolutionary considerations and examine the prospects of an initially multipolar post-transition world subsequently coalescing into a singleton.
Of horses and men
General machine intelligence could serve as a substitute for human intelligence. Not only could digital minds perform the intellectual work now done by humans, but, once equipped with good actuators or robotic bodies, machines could also substitute for human physical labor. Suppose that machine workers—which can be quickly reproduced—become both cheaper and more capable than human workers in virtually all jobs. What happens then?
Wages and unemployment
With cheaply copyable labor, market wages fall. The only place where humans would remain competitive may be where customers have a basic preference for work done by humans. Today, goods that have been handcrafted or produced by indigenous people sometimes command a price premium. Future consumers might similarly prefer human-made goods and human athletes, human artists, human lovers, and human leaders to functionally indistinguishable or superior artificial counterparts. It is unclear, however, just how widespread such preferences would be. If machine-made alternatives were sufficiently superior, perhaps they would be more highly prized.
One parameter that might be relevant to consumer choice is the inner life of the worker providing a service or product. A concert audience, for instance, might like to know that the performer is consciously experiencing the music and the venue. Absent phenomenal experience, the musician could be regarded as merely a high-powered jukebox, albeit one capable of creating the three-dimensional appearance of a performer interacting naturally with the crowd. Machines might then be designed to instantiate the same kinds of mental states that would be present in a human performing the same task. Even with perfect replication of subjective experiences, however, some people might simply prefer organic work. Such preferences could also have ideological or religious roots. Just as many Muslims and Jews shun food prepared in ways they classify as haram or treif, so there might be groups in the future that eschew products whose manufacture involved unsanctioned use of machine intelligence.
What hinges on this? To the extent that cheap machine labor can substitute for human labor, human jobs may disappear. Fears about automation and job loss are of course not new. Concerns about technological unemployment have surfaced periodically, at least since the Industrial Revolution; and quite a few professions have in fact gone the way of the English weavers and textile artisans who in the early nineteenth century united under the banner of the folkloric “General Ludd” to fight against the introduction of mechanized looms. Nevertheless, although machinery and technology have been substitutes for many particular types of human labor, physical technology has on the whole been a complement to labor. Average human wages around the world have been on a long-term upward trend, in large part because of such complementarities. Yet what starts out as a complement to labor can at a later stage become a substitute for labor. Horses were initially complemented by carriages and ploughs, which greatly increased the horse’s productivity. Later, horses were substituted for by automobiles and tractors. These later innovations reduced the demand for equine labor and led to a population collapse. Could a similar fate befall the human species?
The parallel to the story of the horse can be drawn out further if we ask why it is that there are still horses around. One reason is that there are still a few niches in which horses have functional advantages; for example, police work. But the main reason is that humans happen to have peculiar preferences for the services that horses can provide, including recreational horseback riding and racing. These preferences can be compared to the preferences we hypothesized some humans might have in the future, that certain goods and services be made by human hand. Although suggestive, this analogy is inexact, since there is still no complete functional substitute for horses. If there were inexpensive mechanical devices that ran on hay and had exactly the same shape, feel, smell, and behavior as biological horses (perhaps even the same conscious experiences), then demand for biological horses would probably decline further.
With a sufficient reduction in the demand for human labor, wages would fall below the human subsistence level. The potential downside for human workers is therefore extreme: not merely wage cuts, demotions, or the need for retraining, but starvation and death. When horses became obsolete as a source of moveable power, many were sold off to meatpackers to be processed into dog food, bone meal, leather, and glue. These animals had no alternative employment through which to earn their keep. In the United States, there were about 26 million horses in 1915. By the early 1950s, 2 million remained.2
Capital and welfare
One difference between humans and horses is that humans own capital. A stylized empirical fact is that the total factor share of capital has for a long time remained steady at approximately 30% (though with significant short-term fluctuations).3 This means that 30% of total global income is received as rent by owners of capital, the remaining 70% being received as wages by workers. If we classify AI as capital, then with the invention of machine intelligence that can fully substitute for human work, wages would fall to the marginal cost of such machine-substitutes, which—under the assumption that the machines are very efficient—would be very low, far below human subsistence-level income. The income share received by labor would then dwindle to practically nil. But this implies that the factor share of capital would become nearly 100% of total world product. Since world GDP would soar following an intelligence explosion (because of massive amounts of new labor-substituting machines but also because of technological advances achieved by superintelligence, and, later, acquisition of vast amounts of new land through space colonization), it follows that the total income from capital would increase enormously. If humans remain the owners of this capital, the total income received by the human population would grow astronomically, despite the fact that in this scenario humans would no longer receive any wage income.
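The arithmetic behind this argument can be made concrete with a small sketch. The 30% capital share comes from the text; the hundredfold growth in world product and the post-transition capital share of 99.9% are purely illustrative assumptions, and the function name `income_split` is hypothetical.

```python
def income_split(world_product: float, capital_share: float) -> tuple[float, float]:
    """Return (capital_income, labor_income) for a given capital factor share."""
    capital_income = world_product * capital_share
    labor_income = world_product - capital_income
    return capital_income, labor_income

# Pre-transition: world product normalized to 1.0, capital share of about 30%.
before = income_split(1.0, 0.30)

# Post-transition (assumed figures): world product grows 100-fold and
# the capital share rises to 99.9% as wages collapse.
after = income_split(100.0, 0.999)

capital_income_growth = after[0] / before[0]
print(f"capital income: {before[0]:.2f} -> {after[0]:.2f}")
print(f"labor income:   {before[1]:.2f} -> {after[1]:.2f}")
print(f"capital income grew by a factor of about {capital_income_growth:.0f}")
```

Under these assumed numbers, income from capital grows by more than the economy itself, because capital captures both the growth and the share formerly paid out as wages.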
The human species as a whole could thus become rich beyond the dreams of Avarice. How would this income be distributed? To a first approximation, capital income would be proportional to the amount of capital owned. Given the astronomical amplification effect, even a tiny bit of pre-transition wealth would balloon into a vast post-transition fortune. However, in the contemporary world, many people have no wealth. This includes not only individuals who live in poverty but also some people who earn a good income or who have high human capital but have negative net worth. For example, in affluent Denmark and Sweden 30% of the population report negative wealth—often young, middle-class people with few tangible assets and credit card debt or student loans.4 Even if savings could earn extremely high interest, there would need to be some seed grain, some starting capital, in order for the compounding to begin.5
Nevertheless, even individuals who have no private wealth at the start of the transition could become extremely rich. Those who participate in a pension scheme, for instance, whether public or private, should be in a good position, provided the scheme is at least partially funded.6 Have-nots could also become rich through the philanthropy of those who see their net worth skyrocket: because of the astronomical size of the bonanza, even a very small fraction donated as alms would be a very large sum in absolute terms.
It is also possible that riches could still be made through work, even at a post-transition stage when machines are functionally superior to humans in all domains (as well as cheaper than even subsistence-level human labor). As noted earlier, this could happen if there are niches in which human labor is preferred for aesthetic, ideological, ethical, religious, or other non-pragmatic reasons. In a scenario in which the wealth of human capital-holders increases dramatically, demand for such labor could increase correspondingly. Newly minted trillionaires or quadrillionaires could afford to pay a hefty premium for having some of their goods and services supplied by an organic “fair-trade” labor force. The history of horses again offers a parallel. After falling to 2 million in the early 1950s, the US horse population has undergone a robust recovery: a recent census puts the number at just under 10 million head.7 The rise is not due to new functional needs for horses in agriculture or transportation; rather, economic growth has enabled more Americans to indulge a fancy for equestrian recreation.
Another relevant difference between humans and horses, beside capital-ownership, is that humans are capable of political mobilization. A human-run government could use the taxation power of the state to redistribute private profits, or raise revenue by selling appreciated state-owned assets, such as public land, and use the proceeds to pension off its constituents. Again, because of the explosive economic growth during and immediately after the transition, there would be vastly more wealth sloshing around, making it relatively easy to fill the cups of all unemployed citizens. It should be feasible even for a single country to provide every human worldwide with a generous living wage at no greater proportional cost than what many countries currently spend on foreign aid.8
The Malthusian principle in a historical perspective
So far we have assumed a constant human population. This may be a reasonable assumption for short timescales, since biology limits the rate of human reproduction. Over longer timescales, however, the assumption is not necessarily reasonable.
The human population has increased a thousandfold over the past 9,000 years.9 The increase would have been much faster except for the fact that throughout most of history and prehistory, the human population was bumping up against the limits of the world economy. An approximately Malthusian condition prevailed, in which most people received subsistence-level incomes that just barely allowed them to survive and raise an average of two children to maturity.10 There were temporary and local reprieves: plagues, climate fluctuations, or warfare intermittently culled the population and freed up land, enabling survivors to improve their nutritional intake—and to bring up more children, until the ranks were replenished and the Malthusian condition reinstituted. Also, thanks to social inequality, a thin elite stratum could enjoy consistently above-subsistence income (at the expense of somewhat lowering the total size of the population that could be sustained). A sad and dissonant thought: that in this Malthusian condition, the normal state of affairs during most of our tenure on this planet, it was droughts, pestilence, massacres, and inequality—in common estimation the worst foes of human welfare—that may have been the greatest humanitarians: they alone enabling the average level of well-being to occasionally bop up slightly above that of life at the very margin of subsistence.
Superimposed on local fluctuations, history shows a macro-pattern of initially slow but accelerating economic growth, fueled by the accumulation of technological innovations. The growing world economy brought with it a commensurate increase in global population. (More precisely, a larger population itself appears to have strongly accelerated the rate of growth, perhaps mainly by increasing humanity’s collective intelligence.11) Only since the Industrial Revolution, however, did economic growth become so rapid that population growth failed to keep pace. Average income thus started to rise, first in the early-industrializing countries of Western Europe, subsequently in most of the world. Even in the poorest countries today, average income substantially exceeds subsistence level, as reflected in the fact that the populations of these countries are growing.
The poorest countries now have the fastest population growth, as they have yet to complete the “demographic transition” to the low-fertility regime that has taken hold in more developed societies. Demographers project that the world population will rise to about 9 billion by mid-century, and that it might thereafter plateau or decline as the poorer countries join the developed world in this low-fertility regime.12 Many rich countries already have fertility rates that are below replacement level; in some cases, far below.13
Yet there are reasons, if we take a longer view and assume a state of unchanging technology and continued prosperity, to expect a return to the historically and ecologically normal condition of a world population that butts up against the limits of what our niche can support. If this seems counterintuitive in light of the negative relationship between wealth and fertility that we are currently observing on the global scale, we must remind ourselves that this modern age is a brief slice of history and very much an aberration. Human behavior has not yet adapted to contemporary conditions. Not only do we fail to take advantage of obvious ways to increase our inclusive fitness (such as by becoming sperm or egg donors) but we actively sabotage our fertility by using birth control. In the environment of evolutionary adaptedness, a healthy sex drive may have been enough to make an individual act in ways that maximized her reproductive potential; in the modern environment, however, there would be a huge selective advantage to having a more direct desire for being the biological parent to the largest possible number of children. Such a desire is currently being selected for, as are other traits that increase our propensity to reproduce. Cultural adaptation, however, might steal a march on biological evolution. Some communities, such as those of the Hutterites or the adherents of the Quiverfull evangelical movement, have natalist cultures that encourage large families, and they are consequently undergoing rapid expansion.
Population growth and investment
If we imagine current socioeconomic conditions magically frozen in their current shape, the future would be dominated by cultural or ethnic groups that sustain high levels of fertility. If most people had preferences that were fitness-maximizing in the contemporary environment, the population could easily double in each generation. Absent population control policies—which would have to become steadily more rigorous and effective to counteract the evolution of stronger preferences to circumvent them—the world population would then continue to grow exponentially until some constraint, such as land scarcity or depletion of easy opportunities for important innovation, made it impossible for the economy to keep pace: at which point, average income would start to decline until it reached the level where crushing poverty prevents most people from raising much more than two children to maturity. Thus the Malthusian principle would reassert itself, like a dread slave master, bringing our escapade into the dreamland of abundance to an end, and leading us back to the quarry in chains, there to resume the weary struggle for subsistence.
This longer-term outlook could be telescoped into a more imminent prospect by the intelligence explosion. Since software is copyable, a population of emulations or AIs could double rapidly—over the course of minutes rather than decades or centuries—soon exhausting all available hardware.
Private property might offer partial protection against the emergence of a universal Malthusian condition. Consider a simple model in which clans (or closed communities, or states) start out with varying amounts of property and independently adopt different policies about reproduction and investment. Some clans discount the future steeply and spend down their endowment, whereafter their impoverished members join the global proletariat (or die, if they cannot support themselves through their labor). Other clans invest some of their resources but adopt a policy of unlimited reproduction: such clans grow more populous until they reach an internal Malthusian condition in which their members are so poor that they die at almost the same rate as they reproduce, at which point the clan’s population growth slows to equal the growth of its resources. Yet other clans might restrict their fertility to below the rate of growth of their capital: such clans could slowly increment their numbers while their members also grow richer per capita.
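The three clan policies can be sketched as a toy simulation. This is a minimal illustration under stated assumptions, not a calibrated model: the rate of return, the subsistence level, the starting endowments, and every name below (`Clan`, `step`, `simulate`) are hypothetical, and population is treated as a continuous quantity.

```python
from dataclasses import dataclass

SUBSISTENCE = 1.0      # assumed consumption one member needs per period
RATE_OF_RETURN = 0.05  # assumed return on capital per period


@dataclass
class Clan:
    name: str
    capital: float
    population: float
    consume_rate: float  # fraction of capital consumed each period
    fertility: float     # desired population growth per period

    def per_capita_consumption(self) -> float:
        return self.consume_rate * self.capital / self.population


def step(clan: Clan) -> None:
    """Advance one period: capital compounds, then population adjusts."""
    clan.capital *= 1.0 + RATE_OF_RETURN - clan.consume_rate
    desired = clan.population * (1.0 + clan.fertility)
    # Population cannot exceed what the clan's consumption budget can feed.
    supportable = clan.consume_rate * clan.capital / SUBSISTENCE
    clan.population = min(desired, supportable)


def simulate(clan: Clan, periods: int) -> Clan:
    for _ in range(periods):
        step(clan)
    return clan


# Same starting endowment for each clan; only the policies differ.
spenders = simulate(Clan("steep discounters", 1000.0, 10.0, 0.25, 0.0), 50)
breeders = simulate(Clan("unlimited reproducers", 1000.0, 10.0, 0.025, 1.0), 50)
savers = simulate(Clan("restrained savers", 1000.0, 10.0, 0.025, 0.01), 50)

for clan in (spenders, breeders, savers):
    print(f"{clan.name}: capital={clan.capital:.2f}, "
          f"population={clan.population:.2f}, "
          f"per capita={clan.per_capita_consumption():.2f}")
```

With these assumed parameters, the steep discounters run their capital down, the unlimited reproducers settle at subsistence with population growing only as fast as their resources, and the restrained savers grow richer per capita.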
If wealth is redistributed from the wealthy clans to the members of the rapidly reproducing or rapidly discounting clans (whose children, copies, or offshoots, through no fault of their own, were launched into the world with insufficient capital to survive and thrive) then a universal Malthusian condition would be more closely approximated. In the limiting case, all members of all clans would receive subsistence-level income and everybody would be equal in their poverty.
If property is not redistributed, prudent clans might hold on to a certain amount of capital, and it is possible that their wealth could grow in absolute terms. It is, however, unclear whether humans could earn as high rates of return on their capital as machine intelligences could earn on theirs, because there may be synergies between labor and capital such that a single agent who can supply both (e.g. an entrepreneur or investor who is both skilled and wealthy) can attain a private rate of return on her capital exceeding the market rate obtainable by agents who possess financial but not cognitive resources. Humans, being less skilled than machine intelligences, may therefore grow their capital more slowly. The exception would be if the control problem had been completely solved, in which case the human rate of return would equal the machine rate of return, since a human principal could task a machine agent to manage her savings, and could do so costlessly and without conflicts of interest. Otherwise, in this scenario, the fraction of the economy owned by machines would asymptotically approach one hundred percent.
A scenario in which the fraction of the economy that is owned by machines asymptotically approaches one hundred percent is not necessarily one in which the size of the human slice declines. If the economy grows at a sufficient clip, then even a relatively diminishing fraction of it may still be increasing in its absolute size. This may sound like modestly good news for humankind: in a multipolar scenario in which property rights are protected—even if we completely fail to solve the control problem—the total amount of wealth owned by human beings could increase. Of course, this effect would not take care of the problem of population growth in the human population pulling down per capita income to subsistence level, nor the problem of humans who ruin themselves because they discount the future.
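A short numerical sketch shows how a shrinking share can coexist with growing absolute wealth. The rates of return (3% for humans, 10% for machines) and the starting split are assumptions chosen only for illustration, and `wealth_paths` is a hypothetical helper.

```python
def wealth_paths(human0: float, machine0: float,
                 human_return: float, machine_return: float,
                 periods: int) -> list[tuple[int, float, float]]:
    """Return rows of (period, human_wealth, human_share_of_total)."""
    rows = []
    human, machine = human0, machine0
    for t in range(periods + 1):
        rows.append((t, human, human / (human + machine)))
        # Each group compounds its own capital at its own rate.
        human *= 1.0 + human_return
        machine *= 1.0 + machine_return
    return rows


# Assumed start: humans own 99 units, machines own 1 unit.
rows = wealth_paths(99.0, 1.0, human_return=0.03, machine_return=0.10, periods=200)

for t, human_wealth, human_share in rows[::50]:
    print(f"period {t:3d}: human wealth={human_wealth:12.1f}, "
          f"human share={human_share:.6f}")
```

In this sketch human wealth rises in every period while the human share of the total falls toward zero, which is the pattern described above. The sketch holds the human population fixed, so it says nothing about per capita income.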
In the long run, the economy would become increasingly dominated by those clans that have the highest savings rates—misers who own half the city and live under a bridge. Only in the fullness of time, when there are no more opportunities for investment, would the maximally prosperous misers start drawing down their savings.14 However, if there is less than perfect protection for property rights—for example if the more efficient machines on net succeed, by hook or by crook, in transferring wealth from humans to themselves—then human capitalists may need to spend down their capital much sooner, before it gets depleted by such transfers (or the ongoing costs incurred in securing their wealth against such transfers). If these developments take place on digital rather than biological timescales, then the glacial humans might find themselves expropriated before they could say Jack Robinson.15
Life in an algorithmic economy
Life for biological humans in a post-transition Malthusian state need not resemble any of the historical states of man (as hunter-gatherer, farmer, or office worker). Instead, the majority of humans in this scenario might be idle rentiers who eke out a marginal living on their savings.16 They would be very poor, yet derive what little income they have from savings or state subsidies. They would live in a world with extremely advanced technology, including not only superintelligent machines but also anti-aging medicine, virtual reality, and various enhancement technologies and pleasure drugs: yet these might be generally unaffordable. Perhaps instead of using enhancement medicine, they would take drugs to stunt their growth and slow their metabolism in order to reduce their cost of living (fast-burners being unable to survive at the gradually declining subsistence income). As our numbers increase and our average income declines further, we might degenerate into whatever minimal structure still qualifies to receive a pension—perhaps minimally conscious brains in vats, oxygenized and nourished by machines, slowly saving up enough money to reproduce by having a robot technician develop a clone of them.17
Further frugality could be achieved by means of uploading, since a physically optimized computing substrate, devised by advanced superintelligence, would be more efficient than a biological brain. The migration into the digital realm might be stemmed, however, if emulations were regarded as non-humans or non-citizens ineligible to receive pensions or to hold tax-exempt savings accounts. In that case, a niche for biological humans might remain open, alongside a perhaps vastly larger population of emulations or artificial intelligences.
So far we have focused on the fate of the humans, who may be supported by savings, subsidies, or wage income deriving from other humans who prefer to hire humans. Let us now turn our attention to some of the entities that we have so far classified as “capital”: machines that may be owned by human beings, that are constructed and operated for the sake of the functional tasks they perform, and that are capable of substituting for human labor in a very wide range of jobs. What may the situation be like for these workhorses of the new economy?
If these machines were mere automata, simple devices like a steam engine or the mechanism in a clock, then no further comment would be needed: there would be a large amount of such capital in a post-transition economy, but it would seem not to matter to anybody how things turn out for pieces of insentient equipment. However, if the machines have conscious minds—if they are constructed in such a way that their operation is associated with phenomenal awareness (or if they for some other reason are ascribed moral status)—then it becomes important to consider the overall outcome in terms of how it would affect these machine minds. The welfare of the working machine minds could even appear to be the most important aspect of the outcome, since they may be numerically dominant.
Voluntary slavery, casual death
A salient initial question is whether these working machine minds are owned as capital (slaves) or are hired as free wage laborers. On closer inspection, however, it becomes doubtful that anything really hinges on the issue. There are two reasons for this. First, if a free worker in a Malthusian state gets paid a subsistence-level wage, he will have no disposable income left after he has paid for food and other necessities. If the worker is instead a slave, his owner will pay for his maintenance and again he will have no disposable income. In either case, the worker gets the necessities and nothing more. Second, suppose that the free laborer were somehow in a position to command an above-subsistence-level income (perhaps because of favorable regulation). How will he spend the surplus? Investors would find it most profitable to create workers who would be “voluntary slaves” who would willingly work for subsistence-level wages. Investors may create such workers by copying those workers who are compliant. With appropriate selection (and perhaps some modification to the code) investors might be able to create workers who not only prefer to volunteer their labor but who would also choose to donate back to their owners any surplus income they might happen to receive. Giving money to the worker would then be but a roundabout way of giving money to the owner or employer, even if the worker were a free agent with full legal rights.
Perhaps it will be objected that it would be difficult to design a machine so that it wants to volunteer for any job assigned to it or so that it wants to donate its wages to its owner. Emulations, in particular, might be imagined to have more typically human desires. But note that even if the original control problem is difficult, we are here considering a condition after the transition, a time when methods for motivation selection have presumably been perfected. In the case of emulations, one might get quite far simply by selecting from the pre-existing range of human characters; and we have described several other motivation selection methods. The control problem may also in some ways be simplified by the current assumption that the new machine intelligence enters into a stable socioeconomic matrix that is already populated with other law-abiding superintelligent agents.
Let us, then, consider the plight of the working-class machine, whether it be operating as a slave or a free agent. We focus first on emulations, the easiest case to imagine.
Bringing a new biological human worker into the world takes anywhere between fifteen and thirty years, depending on how much expertise and experience is required. During this time the new person must be fed, housed, nurtured, and educated—at great expense. By contrast, spawning a new copy of a digital worker is as easy as loading a new program into working memory. Life thus becomes cheap. A business could continuously adapt its workforce to fit demands by spawning new copies—and terminating copies that are no longer needed, to free up computer resources. This could lead to an extremely high death rate among digital workers. Many might live for only one subjective day.
There are reasons other than fluctuations in demand why employers or owners of emulations might want to “kill” or “end” their workers frequently.18 If an emulation mind, like a biological mind, requires periods of rest and sleep in order to function, it might be cheaper to erase a fatigued emulation at the end of a day and replace it with a stored state of a fresh and rested emulation. As this procedure would cause retrograde amnesia for everything that had been learned during that day, emulations performing tasks requiring long cognitive threads would be spared such frequent erasure. It would be difficult, for example, to write a book if each morning when one sat down at one’s desk, one had no memory of what one had done before. But other jobs could be performed adequately by agents that are frequently recycled: a shop assistant or a customer service agent, once trained, may only need to remember new information for twenty minutes.
Since recycling emulations would prevent memory and skill formation, some emulations may be placed on a special learning track where they would run continuously, including for rest and sleep, even in jobs that do not strictly require long cognitive threads. For example, some customer service agents might run for many years in optimized learning environments, assisted by coaches and performance evaluators. The best of these trainees would then be used like studs, serving as templates from which millions of fresh copies are stamped out each day. Great effort would be poured into improving the performance of such worker templates, because even a small increment in productivity would yield great economic value when applied in millions of copies.
In parallel with efforts to train worker-templates for particular jobs, intense efforts would also be made to improve the underlying emulation technology. Advances here would be even more valuable than advances in individual worker-templates, since general technology improvements could be applied to all emulation workers (and potentially to non-worker emulations also) rather than only to those in a particular occupation. Enormous resources would be devoted to finding computational shortcuts allowing for more efficient implementations of existing emulations, and also to developing neuromorphic and entirely synthetic AI architectures. This research would probably mostly be done by emulations running on very fast hardware. Depending on the price of computer power, millions, billions, or trillions of emulations of the sharpest human research minds (or enhanced versions thereof) may be working around the clock on advancing the frontier of machine intelligence; and some of these may be operating orders of magnitude faster than biological brains.19 This is a good reason for thinking that the era of human-like emulations would be brief (a very brief interlude in sidereal time) and that it would soon give way to an era of greatly superior artificial intelligence.
We have already encountered several reasons why employers of emulation workers may periodically cull their herds: fluctuations in demand for different kinds of laborers, cost savings of not having to emulate rest and sleep time, and the introduction of new and improved templates. Security concerns might furnish another reason. To prevent workers from developing subversive plans and conspiracies, emulations in some sensitive positions might be run only for limited periods, with frequent resets to an earlier stored ready-state.20
These ready-states to which emulations would be reset would be carefully prepared and vetted. A typical short-lived emulation might wake up in a well-rested mental state that is optimized for loyalty and productivity. He remembers having graduated top of his class after many (subjective) years of intense training and selection, then having enjoyed a restorative holiday and a good night’s sleep, then having listened to a rousing motivational speech and stirring music, and now he is champing at the bit to finally get to work and to do his utmost for his employer. He is not overly troubled by thoughts of his imminent death at the end of the working day. Emulations with death neuroses or other hang-ups are less productive and would not have been selected.21
Would maximally efficient work be fun?
One important variable in assessing the desirability of a hypothetical condition like this is the hedonic state of the average emulation.22 Would a typical emulation worker be suffering or would he be enjoying the experience of working hard on the task at hand?
We must resist the temptation to project our own sentiments onto the imaginary emulation worker. The question is not whether you would feel happy if you had to work constantly and never again spend time with your loved ones—a terrible fate, most would agree.
It is moderately more relevant to consider the current human average hedonic experience during working hours. Worldwide studies asking respondents how happy they are find that most rate themselves as “quite happy” or “very happy” (averaging 3.1 on a scale from 1 to 4).23 Studies on average affect, asking respondents how frequently they have recently experienced various positive or negative affective states, tend to get a similar result (producing a net affect of about 0.52 on a scale from -1 to 1). There is a modest positive effect of a country’s per capita income on average subjective well-being.24 However, it is hazardous to extrapolate from these findings to the hedonic state of future emulation workers. One reason that could be given for this is that their condition would be so different: on the one hand, they might be working much harder; on the other hand, they might be free from diseases, aches, hunger, noxious odors, and so forth. Yet such considerations largely miss the mark. The much more important consideration here is that hedonic tone would be easy to adjust through the digital equivalent of drugs or neurosurgery. This means that it would be a mistake to infer the hedonic state of future emulations from the external conditions of their lives by imagining how we ourselves and other people like us would feel in those circumstances. Hedonic state would be a matter of choice. In the model we are currently considering, the choice would be made by capital-owners seeking to maximize returns on their investment in emulation-workers. Consequently, the question of how happy emulations would feel boils down to the question of which hedonic states would be most productive (in the various jobs that emulations would be employed to do).
Here, again, one might seek to draw an inference from observations about human happiness. If it is the case, across most times, places, and occupations, that people are typically at least moderately happy, this would create some presumption in favor of the same holding in a post-transition scenario like the one we are considering. To be clear, the argument in this case would not be that human minds have a predisposition towards happiness so they would probably find satisfaction under these novel conditions; but rather that a certain average level of happiness has proved adaptive for human minds in the past so maybe a similar level of happiness will prove adaptive for human-like minds in the future. Yet this formulation also reveals the weakness of the inference: to wit, that the mental dispositions that were adaptive for hunter-gatherer hominids roaming the African savanna may not necessarily be adaptive for modified emulations living in post-transition virtual realities. We can certainly hope that the future emulation-workers would be as happy as, or happier than, typical workers were in human history; but we have yet to see any compelling reason for supposing it would be so (in the laissez-faire multipolar scenario currently under examination).
Consider the possibility that the reason happiness is prevalent among humans (to whatever limited extent it is prevalent) is that cheerful mood served a signaling function in the environment of evolutionary adaptedness. Conveying the impression to other members of the social group of being in flourishing condition—in good health, in good standing with one’s peers, and in confident expectation of continued good fortune—may have boosted an individual’s popularity. A bias toward cheerfulness could thus have been selected for, with the result that human neurochemistry is now biased toward positive affect compared to what would have been maximally efficient according to simpler materialistic criteria. If this were the case, then the future of joie de vivre might depend on cheer retaining its social signaling function unaltered in the post-transition world: an issue to which we will return shortly.
What if glad souls dissipate more energy than glum ones? Perhaps the joyful are more prone to creative leaps and flights of fancy—behaviors that future employers might disprize in most of their workers. Perhaps a sullen or anxious fixation on simply getting on with the job without making mistakes will be the productivity-maximizing attitude in most lines of work. The claim here is not that this is so, but that we do not know that it is not so. Yet we should consider just how bad it could be if some such pessimistic hypothesis about a future Malthusian state turned out to be true: not only because of the opportunity cost of having failed to create something better—which would be enormous—but also because the state could be bad in itself, possibly far worse than the original Malthusian state.
We seldom put forth full effort. When we do, it is sometimes painful. Imagine running on a treadmill at a steep incline—heart pounding, muscles aching, lungs gasping for air. A glance at the timer: your next break, which will also be your death, is due in 49 years, 3 months, 20 days, 4 hours, 56 minutes, and 12 seconds. You wish you had not been born.
Again the claim is not that this is how it would be, but that we do not know that it is not. One could certainly make a more optimistic case. For example, there is no obvious reason that emulations would need to suffer bodily injury and sickness: the elimination of physical wretchedness would be a great improvement over the present state of affairs. Furthermore, since such stuff as virtual reality is made of can be fairly cheap, emulations may work in sumptuous surroundings—in splendid mountaintop palaces, on terraces set in a budding spring forest, or on the beaches of an azure lagoon—with just the right illumination, temperature, scenery and décor; free from annoying fumes, noises, drafts, and buzzing insects; dressed in comfortable clothing, feeling clean and focused, and well nourished. More significantly, if—as seems perfectly possible—the optimum human mental state for productivity in most jobs is one of joyful eagerness, then the era of the emulation economy could be quite paradisiacal.
There would, in any case, be a great option value in arranging matters in such a manner that somebody or something could intervene to set things right if the default trajectory should happen to veer toward dystopia. It could also be desirable to have some sort of escape hatch that would permit bailout into death and oblivion if the quality of life were to sink permanently below the level at which annihilation becomes preferable to continued existence.
In the longer run, as the emulation era gives way to an artificial intelligence era (or if machine intelligence is attained directly via AI without a preceding whole brain emulation stage), pain and pleasure might possibly disappear entirely in a multipolar outcome, since a hedonic reward mechanism may not be the most effective motivation system for a complex artificial agent (one that, unlike the human mind, is not burdened with the legacy of animal wetware). Perhaps a more advanced motivation system would be based on an explicit representation of a utility function or some other architecture that has no exact functional analogs to pleasure and pain.
A related but slightly more radical multipolar outcome—one that could involve the elimination of almost all value from the future—is that the universal proletariat would not even be conscious. This possibility is most salient with respect to AI, which might be structured very differently than human intelligence. But even if machine intelligence were initially achieved through whole brain emulation, resulting in conscious digital minds, the competitive forces unleashed in a post-transition economy could easily lead to the emergence of progressively less neuromorphic forms of machine intelligence, either because synthetic AI is created de novo or because the emulations would, through successive modifications and enhancements, increasingly depart their original human form.
Consider a scenario in which after emulation technology has been developed, continued progress in neuroscience and computer science (expedited by the presence of digital minds to serve as both researchers and test subjects) makes it possible to isolate individual cognitive modules in an emulation, and to hook them up to modules isolated from other emulations. A period of training and adjustment may be required before different modules can collaborate effectively; but modules that conform to common standards could more quickly interface with other standard modules. This would make standardized modules more productive, and create pressure for more standardization.
Emulations can now begin to outsource increasing portions of their functionality. Why learn arithmetic when you can send your numerical reasoning task to Gauss-Modules, Inc.? Why be articulate when you can hire Coleridge Conversations to put your thoughts into words? Why make decisions about your personal life when there are certified executive modules that can scan your goal system and manage your resources to achieve your goals better than if you tried to do it yourself? Some emulations may prefer to retain most of their functionality and handle tasks themselves that could be done more efficiently by others. Those emulations would be like hobbyists who enjoy growing their own vegetables or knitting their own cardigans. Such hobbyist emulations would be less efficient; and if there is a net flow of resources from less to more efficient participants of the economy, the hobbyists would eventually lose out.
The bouillon cubes of discrete human-like intellects thus melt into an algorithmic soup.
It is conceivable that optimal efficiency would be attained by grouping capabilities in aggregates that roughly match the cognitive architecture of a human mind. It might be the case, for example, that a mathematics module must be tailored to a language module, and that both must be tailored to the executive module, in order for the three to work together. Cognitive outsourcing would then be almost entirely unworkable. But in the absence of any compelling reason for being confident that this is so, we must countenance the possibility that human-like cognitive architectures are optimal only within the constraints of human neurology (or not at all). When it becomes possible to build architectures that could not be implemented well on biological neural networks, new design space opens up; and the global optima in this extended space need not resemble familiar types of mentality. Human-like cognitive organizations would then lack a niche in a competitive post-transition economy or ecosystem.25
There might be niches for complexes that are either less complex (such as individual modules), more complex (such as vast clusters of modules), or of similar complexity to human minds but with radically different architectures. Would these complexes have any intrinsic value? Should we welcome a world in which such alien complexes have replaced human complexes?
The answer may depend on the specific nature of those alien complexes. The present world has many levels of organization. Some highly complex entities, such as multinational corporations and nation states, contain human beings as constituents; yet we usually assign these high-level complexes only instrumental value. Corporations and states do not (it is generally assumed) have consciousness, over and above the consciousness of the people who constitute them: they cannot feel phenomenal pain or pleasure or experience any qualia. We value them to the extent that they serve human needs, and when they cease to do so we “kill” them without compunction. There are also lower-level entities, and those, too, are usually denied moral status. We see no harm in erasing an app from a smartphone, and we do not think that a neurosurgeon is wronging anyone when she extirpates a malfunctioning module from an epileptic brain. As for exotically organized complexes of a level similar to that of the human brain, most of us would perhaps judge them to have moral significance only if we thought they had a capacity or potential for conscious experience.26
We could thus imagine, as an extreme case, a technologically highly advanced society, containing many complex structures, some of them far more intricate and intelligent than anything that exists on the planet today—a society which nevertheless lacks any type of being that is conscious or whose welfare has moral significance. In a sense, this would be an uninhabited society. It would be a society of economic miracles and technological awesomeness, with nobody there to benefit. A Disneyland without children.
Evolution is not necessarily up
The word “evolution” is often used as a synonym of “progress,” perhaps reflecting a common uncritical image of evolution as a force for good. A misplaced faith in the inherent beneficence of the evolutionary process can get in the way of a fair evaluation of the desirability of a multipolar outcome in which the future of intelligent life is determined by competitive dynamics. Any such evaluation must rest on some (at least implicit) opinion about the probability distribution of different phenotypes turning out to be adaptive in a post-transition digital life soup. It would be difficult in the best of circumstances to extract a clear and correct answer from the unavoidable goo of uncertainty that pervades these matters: more so, if we superadd a layer of Panglossian muck.
A possible source for faith in freewheeling evolution is the apparent upward directionality exhibited by the evolutionary process in the past. Starting from rudimentary replicators, evolution produced increasingly “advanced” organisms, including creatures with minds, consciousness, language, and reason. More recently, cultural and technological processes, which bear some loose similarities to biological evolution, have enabled humans to develop at an accelerated pace. On a geological as well as a historical timescale, the big picture seems to show an overarching trend toward increasing levels of complexity, knowledge, consciousness, and coordinated goal-directed organization: a trend which, not to put too fine a point on it, one might label “progress.”27
The image of evolution as a process that reliably produces benign effects is difficult to reconcile with the enormous suffering that we see in both the human and the natural world. Those who cherish evolution’s achievements may do so more from an aesthetic than an ethical perspective. Yet the pertinent question is not what kind of future it would be fascinating to read about in a science fiction novel or to see depicted in a nature documentary, but what kind of future it would be good to live in: two very different matters.
Furthermore, we have no reason to think that whatever progress there has been was in any way inevitable. Much might have been luck. This objection derives support from the fact that an observation selection effect filters the evidence we can have about the success of our own evolutionary development.28 Suppose that on 99.9999% of all planets where life emerged it went extinct before developing to the point where intelligent observers could begin to ponder their origin. What should we expect to observe if that were the case? Arguably, we should expect to observe something like what we do in fact observe. The hypothesis that the odds of intelligent life evolving on a given planet are low does not predict that we should find ourselves on a planet where life went extinct at an early stage; rather, it may predict that we should find ourselves on a planet where intelligent life evolved, even if such planets constitute a very small fraction of all planets where primitive life evolved. Life’s long track record on Earth may therefore offer scant support to the claim that there was a high chance—let alone anything approaching inevitability—involved in the rise of higher organisms on our planet.29
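The selection-effect argument above can be illustrated with a small Bayesian calculation. This is only a sketch under stated assumptions: the 50/50 prior, the rival "high odds" figure of 0.9, and the function name `posterior_high` are all hypothetical choices made for illustration, and the treatment of observer selection used here (conditioning on the fact that observers exist only where intelligence evolved) is one simple approach among several contested ones.

```python
def posterior_high(prior_high: float, like_high: float, like_low: float) -> float:
    """Bayes' rule for two rival hypotheses: returns P(high-odds hypothesis | evidence)."""
    numerator = prior_high * like_high
    return numerator / (numerator + (1.0 - prior_high) * like_low)


# Chance that life on a given planet goes on to produce intelligent observers.
P_LOW = 1e-6   # the text's case: extinct on 99.9999% of life-bearing planets
P_HIGH = 0.9   # hypothetical rival: intelligence arises almost inevitably
PRIOR = 0.5    # assumed even prior over the two hypotheses

# Naive update: treat our planet as a randomly sampled life-bearing planet.
naive = posterior_high(PRIOR, P_HIGH, P_LOW)

# Selection-aware update: every observer finds itself on a planet where
# intelligence evolved, so the evidence has likelihood 1 under both hypotheses.
selected = posterior_high(PRIOR, 1.0, 1.0)

print(f"naive posterior for high odds:           {naive:.6f}")
print(f"selection-aware posterior for high odds: {selected:.6f}")
```

On the naive reading, our existence looks like overwhelming evidence that intelligent life was likely. Once the selection effect is taken into account, the same observation leaves the prior unchanged, which is the sense in which life's track record on Earth may offer scant support for the high-odds hypothesis.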
Thirdly, even if present conditions had been idyllic, and even if they could have been shown to have arisen ineluctably from some generic primordial state, there would still be no guarantee that the melioristic trend is set to continue into the indefinite future. This holds even if we disregard the possibility of a cataclysmic extinction event and indeed even if we assume that evolutionary developments will continue to produce systems of increasing complexity.
We suggested earlier that machine intelligence workers selected for maximum productivity would be working extremely hard and that it is unknown how happy such workers would be. We also raised the possibility that the fittest life forms within a competitive future digital life soup might not even be conscious. Short of a complete loss of pleasure, or of consciousness, there could be a wasting away of other qualities that many would regard as indispensable for a good life. Humans value music, humor, romance, art, play, dance, conversation, philosophy, literature, adventure, discovery, food and drink, friendship, parenting, sport, nature, tradition, and spirituality, among many other things. There is no guarantee that any of these would remain adaptive. Perhaps what will maximize fitness will be nothing but nonstop high-intensity drudgery, work of a drab and repetitive nature, destitute of ludic frisson, aimed only at improving the eighth decimal place of some economic output measure. The phenotypes selected would then have lives lacking in the aforesaid qualities, and depending on one’s axiology the result might strike one as either abhorrent, worthless, or merely impoverished, but at any rate a far cry from a utopia one would feel worthy of one’s commendation.
It might be wondered how such a bleak picture could be consistent with the fact that we do now indulge in music, humor, romance, art, etc. If these behaviors are really so “wasteful,” then how come they have been tolerated and indeed promoted by the evolutionary processes that shaped our species? That modern man is in an evolutionary disequilibrium does not account for this; for our Pleistocene forebears, too, engaged in most of these dissipations. Many of the behaviors in question are not even unique to Homo sapiens. Flamboyant display is found in a wide variety of contexts, from sexual selection in the animal kingdom to prestige contests among nation states.30
Although a full evolutionary explanation for each of these behaviors is beyond the scope of the present inquiry, we can note that some of them serve functions that may not be as relevant in a machine intelligence context. Play, for example, which occurs only in some species and predominantly among juveniles, is mainly a way for the young animal to learn skills that it will need later in life. When emulations can be created as adults, already in possession of a mature repertoire of skills, or when knowledge and techniques acquired by one AI can be directly ported into another AI, the need for playful behavior might become less widespread.
Many of the other examples of humanistic behaviors may have evolved as hard-to-fake signals of qualities that are difficult to observe directly, such as bodily or mental resilience, social status, quality of allies, ability and willingness to prevail in a fight, or possession of resources. The peacock’s tail is the classic instance: only fit peacocks can afford to sprout truly extravagant plumage, and peahens have evolved to find it attractive. No less than morphological traits, behavioral traits too can signal genetic fitness or other socially relevant attributes.31
Given that flamboyant display is so common among both humans and other species, one might consider whether it would not also be part of the repertoire of technologically more advanced life forms. Even if there were to be no narrowly instrumental use for playfulness or musicality or even for consciousness in the future ecology of intelligent information processing, might not these traits nonetheless confer some evolutionary advantage to their possessors by virtue of being reliable signals of other adaptive qualities?
While the possibility of a pre-established harmony between what is valuable to us and what would be adaptive in a future digital ecology is hard to rule out, there are reasons for skepticism. Consider, first, that many of the costly displays we find in nature are linked to sexual selection.32 Reproduction among technologically mature life forms, in contrast, may be predominantly or exclusively asexual.
Second, technologically advanced agents might have available new means of reliably communicating information about themselves, means that do not rely on costly display. Even today, when professional lenders assess creditworthiness they tend to rely more on documentary evidence, such as ownership certificates and bank statements, than on costly displays, such as designer suits and Rolex watches. In the future, it might be possible to employ auditing firms that verify through detailed examination of behavioral track records, testing in simulated environments, or direct inspection of source code, that a client agent possesses a claimed attribute. Signaling one’s qualities by agreeing to such auditing might be more efficient than signaling via flamboyant display. Such a professionally mediated signal would still be costly to fake—this being the essential feature that makes the signal reliable—but it could be much cheaper to transmit when truthful than it would be to communicate an equivalent signal flamboyantly.
Third, not all possible costly displays are intrinsically valuable or socially desirable. Many are simply wasteful. The Kwakiutl potlatch ceremonies, a form of status competition between rival chiefs, involved the public destruction of vast amounts of accumulated wealth.33 Record-breaking skyscrapers, megayachts, and moon rockets may be viewed as contemporary analogs. While activities like music and humor could plausibly be claimed to enhance the intrinsic quality of human life, it is doubtful that a similar claim could be sustained with regard to the costly pursuit of fashion accessories and other consumerist status symbols. Worse, costly display can be outright harmful, as in macho posturing leading to gang violence or military bravado. Even if future intelligent life forms would use costly signaling, therefore, it is an open question whether the signal would be of a valuable sort—whether it would be like the rapturous melody of a nightingale or instead like the toad’s monosyllabic croak (or the incessant barking of a rabid dog).
Post-transition formation of a singleton?
Even if the immediate outcome of the transition to machine intelligence were multipolar, the possibility would remain of a singleton developing later. Such a development would continue an apparent long-term trend toward larger scales of political integration, taking it to its natural conclusion.34 How might this occur?
A second transition
One way in which an initially multipolar outcome could converge into a singleton post-transition is if there is, after the initial transition, a second technological transition big enough and steep enough to give a decisive strategic advantage to one of the remaining powers: a power which might then seize the opportunity to establish a singleton. Such a hypothetical second transition might be occasioned by a breakthrough to a higher level of superintelligence. For instance, if the first wave of machine superintelligence is emulation-based, then a second surge might result when the emulations now doing the research succeed in developing effective self-improving artificial intelligence.35 (Alternatively, a second transition might be triggered by a breakthrough in nanotechnology or some other military or general-purpose technology as yet unenvisaged.)
The pace of development after the initial transition would be extremely rapid. Even a short gap between the leading power and its closest competitor could therefore plausibly result in a decisive strategic advantage for the leading power during a second transition. Suppose, for example, that two projects enter the first transition only a few days apart, and that the takeoff is slow enough that this gap does not give the leading project a decisive strategic advantage at any point during the takeoff. The two projects both emerge as superintelligent powers, though one of them remains a few days ahead of the other. But developments are now occurring on the research timescales characteristic of machine superintelligence—perhaps thousands or millions of times faster than research conducted on a biological human timescale. Development of the second-transition technology might therefore be completed in days, hours, or minutes. Even though the frontrunner’s lead is a mere few days, a breakthrough could thus catapult it into a decisive strategic advantage. Note, however, that if technological diffusion (via espionage or other channels) speeds up as much as technological development, then this effect would be negated. What would remain relevant would be the steepness of the second transition, that is, the speed at which it would unfold relative to the general speed of events in the period after the first transition. (In this sense, the faster things are happening after the first transition, the less steep the second transition would tend to be.)
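The arithmetic behind this paragraph can be made concrete with a rough sketch. The function names, the three-day lead, and the speedup factors below are illustrative assumptions rather than figures from the text, and dividing the development speedup by the diffusion speedup is only one crude way to model the remark about diffusion.

```python
DAYS_PER_YEAR = 365.25


def subjective_lead_years(lead_days: float, speedup: float) -> float:
    """Convert a wall-clock lead into equivalent years of human-timescale research."""
    return lead_days * speedup / DAYS_PER_YEAR


def diffusion_adjusted_lead_days(lead_days: float,
                                 dev_speedup: float,
                                 diffusion_speedup: float) -> float:
    """Assumed toy model: the lead, in human-timescale days, that remains once
    faster diffusion (espionage or other channels) is set against faster development."""
    return lead_days * (dev_speedup / diffusion_speedup)


lead = 3.0  # hypothetical frontrunner lead, in wall-clock days

for speedup in (1_000, 1_000_000):
    years = subjective_lead_years(lead, speedup)
    print(f"speedup {speedup:>9,}x -> lead worth {years:,.1f} research-years")

# If diffusion accelerates exactly as much as development, the lead is worth
# no more than it would have been on a biological timescale.
print(diffusion_adjusted_lead_days(lead, 1_000_000, 1_000_000))
```

Under these assumed numbers, a three-day lead corresponds to several research-years at a thousandfold speedup and to several thousand research-years at a millionfold speedup, which is why even a small gap could matter. When the two speedups are equal, the adjusted lead collapses back to three days, matching the observation that the effect would then be negated.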
One might also speculate that a decisive strategic advantage would be more likely to be actually used to establish a singleton if it arises during a second (or subsequent) transition. After the first transition, decision makers would either be superintelligent or have access to advice from a superintelligence, which would clarify the implications of available strategic options. Furthermore, the situation after the first transition might be one in which a preemptive move against potential competitors would be less dangerous for the aggressor. If the decision-making minds after the first transition are digital, they could be copied and thereby rendered less vulnerable to a counterattack. Even if a defender had the ability to kill nine-tenths of the aggressor’s population in a retaliatory strike, this would scarcely offer much deterrence if the deceased could be immediately resurrected from redundant backups. Devastation of infrastructure (which can be rebuilt) might also be tolerable to digital minds with effectively unlimited lifespans, who might be planning to maximize their resources and influence on a cosmological timescale.
Superorganisms and scale economies
The size of coordinated human aggregates, such as firms or nations, is influenced by various parameters—technological, military, financial, and cultural—that can vary from one historical epoch to another. A machine intelligence revolution would entail profound changes in many of these parameters. Perhaps these changes would facilitate the rise of a singleton. Although we cannot, without looking in detail at what these prospective changes are, exclude the opposite possibility—that the changes would facilitate fragmentation rather than unification—we can nevertheless note that the increased variance or uncertainty that we confront here may itself be a ground for giving greater credence to the potential emergence of a singleton than we would otherwise do. A machine intelligence revolution might, so to speak, stir things up—might reshuffle the deck to make possible geopolitical realignments that seemed perhaps otherwise not to have been in the cards.
A comprehensive analysis of all the factors that may influence the scale of political integration would take us far beyond the scope of this book: a review of the relevant political science and economics literature could itself easily fill an entire volume. We must confine ourselves to making brief allusion to a couple of factors, aspects of the digitization of agents that may make it easier to centralize control.
Carl Shulman has argued that in a population of emulations, selection pressures would favor the emergence of “superorganisms,” groups of emulations ready to sacrifice themselves for the good of their clan.36 Superorganisms would be spared the agency problems that beset organizations whose members pursue their own self-interest. Like the cells in our bodies, or the individual animals in a colony of eusocial insects, emulations that were wholly altruistic toward their copy-siblings would cooperate with one another even in the absence of elaborate incentive schemes.
Superorganisms would have a particularly strong advantage if nonconsensual deletion (or indefinite suspension) of individual emulations is disallowed. Firms or countries that employ emulations insisting on self-preservation would be saddled with an unending commitment to pay upkeep for obsolete or redundant workers. In contrast, organizations whose emulations willingly deleted themselves when their services were no longer required could more easily adapt to fluctuations in demand; and they could experiment freely, proliferating variations of their workers and retaining only the most productive.
If involuntary deletion is not disallowed, then the comparative advantage of eusocial emulations is reduced, though perhaps not eliminated. Employers of cooperative self-sacrificers might still reap efficiency gains from reduced agency problems throughout the organization, including being spared the trouble of having to defeat whatever resistance emulations could put up against their own deletion. In general, the productivity gains of having workers willing to sacrifice their individual lives for the common weal are a special case of the benefits an organization can derive from having members who are fanatically devoted to it. Such members would not only leap into the grave for the organization, and work long hours for little pay: they would also shun office politics and try consistently to act in what they took to be the organization’s best interest, reducing the need for supervision and bureaucratic constraints.
If the only way to achieve such dedication were by restricting membership to copy-siblings (so that all emulations in a particular superorganism were stamped out from the same template), then superorganisms would suffer some disadvantage in being able to draw only from a range of skills narrower than that of rival organizations, a disadvantage which might or might not be large enough to outweigh the advantages of avoiding internal agency problems.37 This disadvantage would be greatly alleviated if a superorganism could at least contain members with different training. Even if all its members were derived from a single ur-template, its workforce could then still contribute a diversity of skills. Starting with a polymathically talented emulation ur-template, lineages could be branched off into different training programs, one copy learning accounting, another electrical engineering, and so forth. This would produce a membership with diverse skills though not of diverse talents. (Maximum diversity might require that more than one ur-template be used.)
The essential property of a superorganism is not that it consists of copies of a single progenitor but that all the individual agents within it are fully committed to a common goal. The ability to create a superorganism can thus be viewed as requiring a partial solution to the control problem. Whereas a completely general solution to the control problem would enable somebody to create an agent with any arbitrary final goal, the partial solution needed for the creation of a superorganism requires merely the ability to fashion multiple agents with the same final goal (for some nontrivial but not necessarily arbitrary final goal).38
The main consideration put forward in this subsection is thus not really limited to monoclonal emulation groups, but can be stated more generally in a way that makes clear that it applies to a wide range of multipolar machine intelligence scenarios. It is that certain types of advances in motivation selection techniques, which may become feasible when the actors are digital, may help overcome some of the inefficiencies that currently hamper large human organizations and that counterbalance economies of scale. With these limits lifted, organizations—be they firms, nations, or other economic or political entities—could increase in size. This is one factor that could facilitate the emergence of a post-transition singleton.
One area in which superorganisms (or other digital agents with partially selected motivations) might excel is coercion. A state might use motivation selection methods to ensure that its police, military, intelligence service, and civil administration are uniformly loyal. As Shulman notes,
Saved states [of some loyal emulation that has been carefully prepared and verified] could be copied billions of times to staff an ideologically uniform military, bureaucracy, and police force. After a short period of work, each copy would be replaced by a fresh copy of the same saved state, preventing ideological drift. Within a given jurisdiction, this capability could allow incredibly detailed observation and regulation: there might be one such copy for every other resident. This could be used to prohibit the development of weapons of mass destruction, to enforce regulations on brain emulation experimentation or reproduction, to enforce a liberal democratic constitution, or to create an appalling and permanent totalitarianism.39
The first-order effect of such a capability would seem to be to consolidate power, and possibly to concentrate it in fewer hands.
Unification by treaty
There may be large potential gains to be had from international collaboration in a post-transition multipolar world. Wars and arms races could be avoided. Astrophysical resources could be colonized and harvested at a globally optimum pace. The development of more advanced forms of machine intelligence could be coordinated to avoid a rush and to allow new designs to be thoroughly vetted. Other developments that might pose existential risks could be postponed. And uniform regulations could be enforced globally, including provisions for a guaranteed standard of living (which would require some form of population control) and for preventing exploitation and abuse of emulations and other digital and biological minds. Furthermore, agents with resource-satiable preferences (more on this in Chapter 13) would prefer a sharing agreement that would guarantee them a certain slice of the future to a winner-takes-all struggle in which they would risk getting nothing.
The presence of big potential gains from collaboration, however, does not imply that collaboration will actually be achieved. In the world today, many great boons could be obtained via better global coordination—reductions of military expenditures, wars, overfishing, trade barriers, and atmospheric pollution, among others. Yet these plump fruits are left to spoil on the branch. Why is that? What stops a fully cooperative outcome that would maximize the common good?
One obstacle is the difficulty of ensuring compliance with any treaty that might be agreed, including monitoring and enforcement costs. Two nuclear rivals might each be better off if they both relinquished their atom bombs; yet even if they could reach an in-principle agreement to do so, disarmament could nevertheless prove elusive because of their mutual fear that the other party might cheat. Allaying this fear would require setting up a verification mechanism. There may have to be inspectors to oversee the destruction of existing stockpiles, and then to monitor nuclear reactors and other facilities, and to gather technical and human intelligence, in order to ensure that the weapons program is not reconstituted. One cost is paying for these inspectors. Another cost is the risk that the inspectors will spy and make off with commercial or military secrets. Perhaps most significantly, each party might fear that the other will preserve a clandestine nuclear capability. Many a potentially beneficial deal never comes off because compliance would be too difficult to verify.
If new inspection technologies that reduced monitoring costs became available, one would expect this to result in increased cooperation. Whether monitoring costs would on net be reduced in the post-transition era, however, is not entirely clear. While there would certainly be many powerful new inspection techniques, there would also be new means of concealment. In particular, an increasing portion of the activities one might want to regulate would be taking place in cyberspace, out of reach of physical surveillance. For example, digital minds working on designing a new nanotech weapons system or a new generation of artificial intelligence may do so without leaving much of a physical footprint. Digital forensics may fail to penetrate all the layers of concealment and encryption in which a treaty-violator may cloak its illicit activities.
Reliable lie detection, if it could be developed, would be an extremely useful tool for monitoring compliance.40 An inspection protocol could include provisions for interviewing key officials, to verify that they are intent on implementing all the provisions of the treaty and that they know of no violations despite making strong efforts to find out.
A decision maker planning to cheat might defeat such a lie-detection-based verification scheme by first issuing orders to subordinates to undertake the illicit activity and to conceal the activity even from the decision maker herself, and then subjecting herself to some procedure that erases her memory of having engaged in these machinations. Suitably targeted memory-erasure operations might well be feasible in biological brains with more advanced neurotechnology. It might be even easier in machine intelligences (depending on their architecture).
States could seek to overcome this problem by committing themselves to an ongoing monitoring scheme that regularly tests key officials with a lie detector to check whether they harbor any intent to subvert or circumvent any treaty into which the state has entered or may enter in the future. Such a commitment could be viewed as a kind of meta-treaty, which would facilitate the verification of other treaties; but states might commit themselves to it unilaterally to gain the benefit of being regarded as a trustworthy negotiation partner. However, this commitment or meta-treaty would face the same problem of subversion through a delegate-and-forget ploy. Ideally, the meta-treaty would be put into effect before any party had an opportunity to make the internal arrangements necessary to subvert its implementation. Once villainy has had an unguarded moment to sow its mines of deception, trust can never set foot there again.
In some cases, the mere ability to detect treaty violations is sufficient to establish the confidence needed for a deal. In other cases, however, there is a need for some mechanism to enforce compliance or mete out punishment if a violation should occur. The need for an enforcement mechanism may arise if the threat of the wronged party withdrawing from the treaty is not enough to deter violations, for instance if the treaty-violator would gain such an advantage that he would not subsequently care how the other party responds.
If highly effective motivation selection methods are available, this enforcement problem could be solved by empowering an independent agency with sufficient police or military strength to enforce the treaty even against the opposition of one or several of its signatories. This solution requires that the enforcement agency can be trusted. But with sufficiently good motivation selection techniques, the requisite confidence might be achieved by having all the parties to the treaty jointly oversee the design of the enforcement agency.
Handing over power to an external enforcement agency raises many of the same issues that we confronted earlier in our discussions of a unipolar outcome (one in which a singleton arises prior to or during the initial machine intelligence revolution). In order to be able to enforce treaties concerning the vital security interests of rival states, the external enforcement agency would in effect need to constitute a singleton: a global superintelligent Leviathan. One difference, however, is that we are now considering a post-transition situation, in which the agents that would have to create this Leviathan would have greater competence than we humans currently do. These Leviathan-creators may themselves already be superintelligent. This would greatly improve the odds that they could solve the control problem and design an enforcement agency that would serve the interests of all the parties that have a say in its construction.
Aside from the costs of monitoring and enforcing compliance, are there any other obstacles to global coordination? Perhaps the major remaining issue is what we can refer to as bargaining costs.41 Even when there is a possible bargain that would benefit everybody involved, it sometimes does not get off the ground because the parties fail to agree on how to divide the spoils. For example, if two persons could make a deal that would net them a dollar in profit, but each party feels she deserves sixty cents and refuses to settle for less, the deal will not happen and the potential gain will be forfeited. In general, negotiations can be difficult or protracted, or remain altogether barren, because of strategic bargaining choices made by some of the parties.
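The dollar example can be made concrete with a minimal sketch of a one-shot demand game. The function name `demand_game` and the rule that incompatible demands leave both parties with nothing are illustrative assumptions, not something the text specifies; amounts are in cents so that the arithmetic stays exact.

```python
def demand_game(surplus, demand_a, demand_b):
    """Return (payoff_a, payoff_b) for a one-shot demand game.

    Assumed rule (hypothetical, for illustration): if the two demands
    fit within the available surplus, each party receives what it
    demanded; otherwise the deal collapses and both receive nothing.
    """
    if demand_a + demand_b <= surplus:
        return demand_a, demand_b
    return 0, 0


SURPLUS_CENTS = 100  # the one dollar of joint profit, in cents

# Each party insists on sixty cents: the demands sum to 120 > 100,
# so no deal is struck and the potential gain is forfeited.
print(demand_game(SURPLUS_CENTS, 60, 60))

# Compatible demands exhaust the surplus and the deal goes through.
print(demand_game(SURPLUS_CENTS, 50, 50))
```

Under these assumptions the gain is lost not because no mutually beneficial split exists, but because each party's chosen demand rules out every split the other would accept.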
In real life, human beings frequently succeed in reaching agreements despite the possibility for strategic bargaining (though often not without considerable expenditure of time and patience). It is conceivable, however, that strategic bargaining problems would have a different dynamic in the post-transition era. An AI negotiator might more consistently adhere to some particular formal conception of rationality, possibly with novel or unanticipated consequences when matched with other AI negotiators. An AI might also have available to it moves in the bargaining game that are either unavailable to humans or very much more difficult for humans to execute, including the ability to precommit to a policy or a course of action. While humans (and human-run institutions) are occasionally able to precommit—with imperfect degrees of credibility and specificity—some types of machine intelligence might be able to make arbitrary unbreakable precommitments and to allow negotiating partners to confirm that such a precommitment has been made.42
The availability of powerful precommitment techniques could profoundly alter the nature of negotiations, potentially giving an immense edge to an agent that has a first-mover advantage. If a particular agent’s participation is necessary for the realization of some prospective gains from cooperation, and if that agent is able to make the first move, it would be in a position to dictate the division of the spoils by precommitting not to accept any deal that gives it less than, say, 99% of the surplus value. Other agents would then be faced with the choice of either getting nothing (by rejecting the unfair proposal) or getting 1% of the value (by caving in). If the first-moving agent’s precommitment is publicly verifiable, its negotiating partners could be sure that these are their only two options.
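A rough sketch of the responder's position after such a precommitment follows. It assumes a surplus of 100 units, a first mover who has verifiably bound itself to demand 99 of them, and a responder who simply maximizes its own payoff and has made no precommitment of its own; the function names are hypothetical.

```python
def responder_options(surplus, first_mover_demand):
    """Payoffs open to the responder once the first mover's demand is
    locked in by a publicly verifiable precommitment (assumed setup)."""
    return {
        "accept": surplus - first_mover_demand,  # cave in, take the remainder
        "reject": 0,                             # walk away, no deal at all
    }


def payoff_maximizing_reply(options):
    """Pick the option with the highest payoff for the responder."""
    return max(options, key=options.get)


options = responder_options(surplus=100, first_mover_demand=99)
print(options)                           # the only two outcomes on the table
print(payoff_maximizing_reply(options))  # which one a pure maximizer picks
```

Because the precommitment removes every other split from the table, a responder of this kind prefers the 1 unit to nothing, which is exactly what gives the first mover its edge.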
To avoid being exploited in this manner, agents might precommit to refuse blackmail and to decline all unfair offers. Once such a precommitment has been made (and successfully publicized), other agents would not find it in their interest to make threats or to precommit themselves to only accepting deals tilted in their own favor, because they would know that threats would fail and that unfair proposals would be rejected. But this just demonstrates again that the advantage is with the first-mover. The agent who moves first can choose whether to parlay its position of strength only to deter others from taking unfair advantage, or to make a grab for the lion’s share of future spoils.
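The reverse case can be sketched in the same style. Here the responder moves first, publicly precommitting to reject any offer below some minimum share, and the proposer then searches for its best demand. The threshold values, the whole-unit grid of demands, and the name `best_demand` are assumptions made for illustration only.

```python
def best_demand(surplus, responder_min_share):
    """Largest payoff-maximizing demand for a proposer facing a responder
    that has precommitted to reject any offer below `responder_min_share`.

    Assumes demands are whole units and a rejected demand pays nothing.
    """
    best, best_payoff = 0, 0
    for demand in range(surplus + 1):
        offer = surplus - demand
        # A demand only pays off if the leftover offer clears the threshold.
        payoff = demand if offer >= responder_min_share else 0
        if payoff > best_payoff:
            best, best_payoff = demand, payoff
    return best


# Responder precommitted to an even split: demanding more than 50 is futile.
print(best_demand(100, responder_min_share=50))

# Responder will accept anything positive: the proposer can demand 99.
print(best_demand(100, responder_min_share=1))
```

In this toy setting, whichever side commits first fixes the terms: the proposer's best demand is simply whatever the responder's prior commitment leaves over.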
Best situated of all, it might seem, would be the agent who starts out with a temperament or a value system that makes him impervious to extortion or indeed to any offer of a deal in which his participation is indispensable but he is not getting almost all of the gains. Some humans seem already to possess personality traits corresponding to various aspects of an uncompromising spirit.43 A high-strung disposition, however, could backfire should it turn out that there are other agents around who feel entitled to more than their fair share and are committed to not backing down. The unstoppable force would then encounter the unmovable object, resulting in a failure to reach agreement (or worse: total war). The meek and the akratic would at least get something, albeit less than their fair share.
What kind of game-theoretic equilibrium would be reached in such a post-transition bargaining game is not immediately obvious. Agents might choose more complicated strategies than the ones considered here. One hopes that an equilibrium would be reached centered on some fairness norm that would serve as a Schelling point—a salient feature in a big outcome space which, because of shared expectations, becomes a likely coordination point in an otherwise underdetermined coordination game. Such an equilibrium might be bolstered by some of our evolved dispositions and cultural programming: a common preference for fairness could, assuming we succeed in transferring our values into the post-transition era, bias expectations and strategies in ways that lead to an attractive equilibrium.44
In any case, the upshot is that with the possibility of strong and flexible forms of precommitment, outcomes of negotiations might take on an unfamiliar guise. Even if the post-transition era started out multipolar, it might be that a singleton would arise almost immediately as a consequence of a negotiated treaty that resolves all important global coordination problems. Some transaction costs, perhaps including monitoring and enforcement costs, might plummet with the new technological capabilities available to advanced machine intelligences. Other costs, in particular costs related to strategic bargaining, might remain significant. But however strategic bargaining affects the nature of the agreement that is reached, there is no clear reason why it would long delay the reaching of some agreement if an agreement were ever to be reached. If no agreement is reached, then some form of fighting might take place; and either one faction might win, and form a singleton around the winning coalition, or the result might be an interminable conflict, in which case a singleton may never form and the overall outcome may fall terribly short of what could and should have been achieved if humanity and its descendants had acted in a more coordinated and cooperative fashion.
We have seen that multipolarity, even if it could be achieved in a stable form, would not guarantee an attractive outcome. The original principal-agent problem remains unsolved, and burying it under a new set of problems related to post-transition global coordination failures may only make the situation worse. Let us therefore return to the question of how we could safely keep a single superintelligent AI.