Superintelligence: Paths, Dangers, Strategies - Nick Bostrom (2014)
Chapter 15. Crunch time
We find ourselves in a thicket of strategic complexity, surrounded by a dense mist of uncertainty. Though many considerations have been discerned, their details and interrelationships remain unclear and iffy—and there might be other factors we have not even thought of yet. What are we to do in this predicament?
Philosophy with a deadline
A colleague of mine likes to point out that a Fields Medal (the highest honor in mathematics) indicates two things about the recipient: that he was capable of accomplishing something important, and that he didn’t. Though harsh, the remark hints at a truth.
Think of a “discovery” as an act that moves the arrival of information from a later point in time to an earlier time. The discovery’s value does not equal the value of the information discovered but rather the value of having the information available earlier than it otherwise would have been. A scientist or a mathematician may show great skill by being the first to find a solution that has eluded many others; yet if the problem would soon have been solved anyway, then the work probably has not much benefited the world. There are cases in which having a solution even slightly sooner is immensely valuable, but this is most plausible when the solution is immediately put to use, either being deployed for some practical end or serving as a foundation to further theoretical work. And in the latter case, where a solution is immediately used only in the sense of serving as a building block for further theorizing, there is great value in obtaining a solution slightly sooner only if the further work it enables is itself both important and urgent.1
The question, then, is not whether the result discovered by the Fields Medalist is in itself “important” (whether instrumentally or for knowledge’s own sake). Rather, the question is whether it was important that the medalist enabled the publication of the result to occur at an earlier date. The value of this temporal transport should be compared to the value that a world-class mathematical mind could have generated by working on something else. At least in some cases, the Fields Medal might indicate a life spent solving the wrong problem—for instance, a problem whose allure consisted primarily in being famously difficult to solve.
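The notion can be made a little more precise with a schematic formula (the notation here is introduced purely for illustration). Let $v(t)$ be the value of having a given piece of information from time $t$ onward, let $t_d$ be the date of the discovery, and let $t_c$ be the counterfactual date at which someone else would have made it anyway. Then the value of the discovery is approximately

\[
V_{\text{discovery}} \approx v(t_d) - v(t_c),
\]

and for the effort to have been well spent, this difference must also exceed the opportunity cost: the value the discoverer could have generated by applying the same talent elsewhere during the same period.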
Similar barbs could be directed at other fields, such as academic philosophy. Philosophy covers some problems that are relevant to existential risk mitigation—we encountered several in this book. Yet there are also subfields within philosophy that have no apparent link to existential risk or indeed any practical concern. As with pure mathematics, some of the problems that philosophy studies might be regarded as intrinsically important, in the sense that humans have reason to care about them independently of any practical application. The fundamental nature of reality, for instance, might be worth knowing about, for its own sake. The world would arguably be less glorious if nobody studied metaphysics, cosmology, or string theory. However, the dawning prospect of an intelligence explosion shines a new light on this ancient quest for wisdom.
The outlook now suggests that philosophic progress can be maximized via an indirect path rather than by immediate philosophizing. One of the many tasks on which superintelligence (or even just moderately enhanced human intelligence) would outperform the current cast of thinkers is answering fundamental questions in science and philosophy. This reflection suggests a strategy of deferred gratification. We could postpone work on some of the eternal questions for a little while, delegating that task to our hopefully more competent successors—in order to focus our own attention on a more pressing challenge: increasing the chance that we will actually have competent successors. This would be high-impact philosophy and high-impact mathematics.2
What is to be done?
We thus want to focus on problems that are not only important but urgent in the sense that their solutions are needed prior to the intelligence explosion. We should also take heed not to work on problems that are negative-value (such that solving them is harmful). Some technical problems in the field of artificial intelligence, for instance, might be negative-value inasmuch as their solution would speed the development of machine intelligence without doing as much to expedite the development of control methods that could render the machine intelligence revolution survivable and beneficial.
It can be hard to identify problems that are both urgent and important and are such that we can confidently take them to be positive-value. The strategic uncertainty surrounding existential risk mitigation means that we must worry that even well-intentioned interventions may turn out to be not only unproductive but counterproductive. To limit the risk of doing something actively harmful or morally wrong, we should prefer to work on problems that seem robustly positive-value (i.e., whose solution would make a positive contribution across a wide range of scenarios) and to employ means that are robustly justifiable (i.e., acceptable from a wide range of moral views).
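To sketch the idea in symbols (again, illustrative notation only): if $p(s)$ is the probability of scenario $s$ and $v(s)$ the value an intervention contributes in that scenario, a merely positive-expected-value intervention satisfies $\sum_s p(s)\,v(s) > 0$, whereas a robustly positive-value one satisfies $v(s) \geq 0$ across (nearly) all scenarios under consideration. The stronger condition matters because the expectation can be dominated by scenarios whose probabilities or values we have badly misjudged.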
There is a further desideratum to consider in selecting which problems to prioritize. We want to work on problems that are elastic to our efforts at solving them. Highly elastic problems are those that can be solved much faster, or solved to a much greater extent, given one extra unit of effort. Encouraging more kindness in the world is an important and urgent problem—one, moreover, that seems quite robustly positive-value; yet, absent a breakthrough idea for how to go about it, it is probably a problem of quite low elasticity. Achieving world peace, similarly, would be highly desirable; but considering the numerous efforts already targeting that problem, and the formidable obstacles arrayed against a quick solution, it seems unlikely that the contributions of a few extra individuals would make a large difference.
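Elasticity can be given a rough formal reading borrowed from economics (the notation is again merely illustrative). If $P(e)$ measures progress on a problem as a function of the total effort $e$ devoted to it, the problem's elasticity at the current margin is

\[
\varepsilon = \frac{dP/P}{de/e} = \frac{e}{P}\,\frac{dP}{de},
\]

the percentage gain in progress per percentage increase in effort. Crowded, heavily resourced problems such as world peace plausibly have a small $\varepsilon$ at the margin; neglected problems may have a large one.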
To reduce the risks of the machine intelligence revolution, we will propose two objectives that appear to best meet all those desiderata: strategic analysis and capacity-building. We can be relatively confident about the sign of these parameters—more strategic insight and more capacity being better. Furthermore, the parameters are elastic: a small extra investment can make a relatively large difference. Gaining insight and capacity is also urgent because early boosts to these parameters may compound, making subsequent efforts more effective. In addition to these two broad objectives, we will point to a few other potentially worthwhile aims for initiatives.
Seeking the strategic light
Against a backdrop of perplexity and uncertainty, analysis stands out as being of particularly high expected value.3 Illumination of our strategic situation would help us target subsequent interventions more effectively. Strategic analysis is especially needful when we are radically uncertain not just about some detail of some peripheral matter but about the cardinal qualities of the central things. For many key parameters, we are radically uncertain even about their sign—that is, we know not which direction of change would be desirable and which undesirable. Our ignorance might not be irremediable. The field has been little prospected, and glimmering strategic insights could still be awaiting their unearthing just a few feet beneath the surface.
What we mean by “strategic analysis” here is a search for crucial considerations: ideas or arguments with the potential to change our views not merely about the fine-structure of implementation but about the general topology of desirability.4 Even a single missed crucial consideration could vitiate our most valiant efforts or render them as actively harmful as those of a soldier who is fighting on the wrong side. The search for crucial considerations (which must explore normative as well as descriptive issues) will often require crisscrossing the boundaries between different academic disciplines and other fields of knowledge. As there is no established methodology for how to go about this kind of research, difficult original thinking is necessary.
Building good capacity
Another high-value activity, one that shares with strategic analysis the robustness property of being beneficial across a wide range of scenarios, is the development of a well-constituted support base that takes the future seriously. Such a base can immediately provide resources for research and analysis. If and when other priorities become visible, resources can be redirected accordingly. A support base is thus a general-purpose capability whose use can be guided by new insights as they emerge.
One valuable asset would be a donor network comprising individuals devoted to rational philanthropy, informed about existential risk, and discerning about the means of mitigation. It is especially desirable that the early-day funders be astute and altruistic, because they may have opportunities to shape the field’s culture before the usual venal interests take up position and entrench. The focus during these opening gambits should thus be to recruit the right kinds of people into the field. It could be worth forgoing some technical advances in the short term in order to fill the ranks with individuals who genuinely care about safety and who have a truth-seeking orientation (and who are likely to attract more of their own kind).
One important variable is the quality of the “social epistemology” of the AI field and its leading projects. Discovering crucial considerations is valuable, but only if it affects action. This cannot always be taken for granted. Imagine a project that invests millions of dollars and years of toil to develop a prototype AI, and that after surmounting many technical challenges the system is finally beginning to show real progress. There is a chance that with just a bit more work it could turn into something useful and profitable. Now a crucial consideration is discovered, indicating that a completely different approach would be a bit safer. Does the project kill itself off like a dishonored samurai, relinquishing its unsafe design and all the progress that had been made? Or does it react like a worried octopus, puffing out a cloud of motivated skepticism in the hope of eluding the attack? A project that would reliably choose the samurai option in such a dilemma would be a far preferable developer.5 Yet building processes and institutions that are willing to commit seppuku based on uncertain allegations and speculative reasoning is not easy. Another dimension of social epistemology is the management of sensitive information, in particular the ability to avoid leaking information that ought to be kept secret. (Information continence may be especially challenging for academic researchers, accustomed as they are to constantly disseminating their results on every available lamppost and tree.)
Particular measures
In addition to the general objectives of strategic light and good capacity, some more specific objectives could also present cost-effective opportunities for action.
One such is progress on the technical challenges of machine intelligence safety. In pursuing this objective, care should be taken to manage information hazards. Some work that would be useful for solving the control problem would also be useful for solving the competence problem. Work that burns down the AI fuse could easily be a net negative.
Another specific objective is to promote “best practices” among AI researchers. Whatever progress has been made on the control problem needs to be disseminated. Some forms of computational experimentation, particularly if involving strong recursive self-improvement, may also require the use of capability control to mitigate the risk of an accidental takeoff. While the actual implementation of safety methods is not so relevant today, it will increasingly become so as the state of the art advances. And it is not too soon to call for practitioners to express a commitment to safety, including endorsing the common good principle and promising to ramp up safety if and when the prospect of machine superintelligence begins to look more imminent. Pious words are not sufficient and will not by themselves make a dangerous technology safe: but where the mouth goeth, the mind might gradually follow.
Other opportunities may also occasionally arise to push on some pivotal parameter, for example to mitigate some other existential risk, or to promote biological cognitive enhancement and improvements of our collective wisdom, or even to shift world politics into a more harmonious register.
Will the best in human nature please stand up
Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct. Superintelligence is a challenge for which we are not ready now and will not be ready for a long time. We have little idea when the detonation will occur, though if we hold the device to our ear we can hear a faint ticking sound.
For a child with an undetonated bomb in its hands, a sensible thing to do would be to put it down gently, quickly back out of the room, and contact the nearest adult. Yet what we have here is not one child but many, each with access to an independent trigger mechanism. The chances that we will all find the sense to put down the dangerous stuff seem almost negligible. Some little idiot is bound to press the ignite button just to see what happens.
Nor can we attain safety by running away, for the blast of an intelligence explosion would bring down the entire firmament. Nor is there a grown-up in sight.
In this situation, any feeling of gee-whiz exhilaration would be out of place. Consternation and fear would be closer to the mark; but the most appropriate attitude may be a bitter determination to be as competent as we can, much as if we were preparing for a difficult exam that will either realize our dreams or obliterate them.
This is not a prescription of fanaticism. The intelligence explosion might still be many decades off in the future. Moreover, the challenge we face is, in part, to hold on to our humanity: to maintain our groundedness, common sense, and good-humored decency even in the teeth of this most unnatural and inhuman problem. We need to bring all our human resourcefulness to bear on its solution.
Yet let us not lose track of what is globally significant. Through the fog of everyday trivialities, we can perceive—if but dimly—the essential task of our age. In this book, we have attempted to discern a little more feature in what is otherwise still a relatively amorphous and negatively defined vision—one that presents as our principal moral priority (at least from an impersonal and secular perspective) the reduction of existential risk and the attainment of a civilizational trajectory that leads to a compassionate and jubilant use of humanity’s cosmic endowment.