Machines of Loving Grace: The Quest for Common Ground Between Humans and Robots - John Markoff (2015)
Chapter 5. WALKING AWAY
As a young navy technician in the 1950s, Robert Taylor had a lot of experience flying, even without a pilot’s license. He had become a favorite copilot of the real pilots, who needed both lots of flight hours and time to study for exams. So they would take Taylor along in their training jets, and after they took off he would fly the plane—gently—while the real pilots studied in the backseat. He even practiced instrument landing approaches, in which the plane is guided to a landing by radio communications, while the pilot wears a hood blocking the view of the outside terrain.
As a young NASA program administrator in the early 1960s, Taylor was confident when he received an invitation to take part in a test flight at a Cornell University aerospace laboratory. On arrival they put him in an uncomfortable anti-g suit and plunked him down in the front seat of a Lockheed T-33 jet trainer while the real pilot sat behind him. They took off and the pilot flew up to the jet’s maximum altitude, almost fifty thousand feet, then Taylor was offered the controls to fly around a bit. After a while the pilot said, “Let’s try something a little more interesting. Why don’t you put the plane in a dive?” So Taylor pushed the joystick forward until he thought he was descending steeply enough, then he began to ease the stick backward. Suddenly he froze in panic. As he pulled back, the plane entered a steeper dive. It felt like going over the top of a roller-coaster ride. He pulled the stick back farther but the plane was still descending almost vertically.
Finally he said to the pilot behind him, “Okay, you’ve got it, you better take over!” The pilot laughed, leveled the plane out, and said, “Let’s try this again.” They tried again and this time when he pushed the stick forward the plane went unexpectedly upward. As he pushed a bit harder, the plane tilted up farther. This time he panicked, about to stall the plane, and again the pilot leveled the plane out.
Taylor should have guessed. His piloting experience had been so odd because he was unwittingly flying a laboratory plane that air force researchers were using to experiment with flight control systems. The air force had invited Taylor to Cornell because, as a NASA program manager, he had made an unsolicited grant of $100,000 to a flight research group at Wright-Patterson Air Force Base.
Taylor, first at NASA and then at DARPA, would pave the way for systems used both to augment humans and to replace them. NASA was three years old when, in 1961, President Kennedy announced the goal of getting an American to the moon—and back—safely within the decade. Taylor found himself at an agency with a unique charter: to fundamentally shape how humans and machines interact, not just in flight but ultimately in all computer-based systems, from the desktop PC to today’s mobile robots.
The term “cyborg,” for “cybernetic organism,” had been coined originally in 1960 by medical researchers thinking about intentionally enhancing humans to prepare them for the exploration of space.1 They foresaw a new kind of creature—half human, half mechanism—capable of surviving in harsh environments.
In contrast, Taylor’s organization was funding the design of electronic systems that closely collaborated with humans while retaining a bright line distinguishing what was human and what was machine.
In the early 1960s NASA was a brand-new government bureaucracy deeply divided over the role of humans in spaceflight. For the first time it was possible to conceive of entirely automated flight in space. The idea was deeply unsettling yet an obvious future direction—one in which machines would steer and humans would merely be passengers—and it was already the default approach of the Soviet space program. In the U.S. program, the division was thrown into relief by a series of incidents in which American astronauts intervened, proving the survival value of what in NASA parlance came to be called the “human in the loop.” On Gemini VI, for example, Wally Schirra was hailed as a hero after he held off pushing the abort button during a launch sequence, even though he was violating a NASA mission rule.2
The human-in-the-loop debates became a series of intensely fought battles inside NASA during the 1950s and 1960s. When Taylor arrived at the agency in 1961 he found an engineering culture in love with a body of mathematics known as control theory, Norbert Wiener’s cybernetic legacy. These NASA engineers were designing the nation’s aeronautic as well as astronautic flight systems—systems of such complexity that the engineers found them abstractly, some might say inherently, beautiful. Taylor could see early on that the aerospace designers were wedded to the aesthetics of control as much as to the practical fact that the systems had to be increasingly automated because humans weren’t fast or reliable enough to control them.
He had stumbled into an almost intractable challenge, and hence a deeply divided technical culture. NASA was split on the question of the role of humans in spaceflight. Taylor saw that the dispute pervaded even the highest echelons of the agency, and that it was easy to predict which side of the debate each particular manager would take. Former jet pilots would be in favor of keeping a human in the system, while experts in control theory would choose full automation.
As a program manager in 1961, Taylor was responsible for several areas of research funding, one of them called “manned flight control systems.” Another colleague in the same funding office was responsible for “automatic control systems.” The two got along well enough, but they were locked in a bitter budgetary zero-sum game. Taylor began to understand the arguments his colleagues made in support of automated control, though he was responsible for mastering arguments for manned control. His best card in the debate was that he had the astronauts on his side, and they had tremendous clout. NASA’s corps of astronauts had mostly been test pilots. They were the pride of the space agency, and they proved to be invaluable allies. Taylor had funded the design and construction of simulator technology used extensively in astronaut training—systems for practicing a series of spacecraft maneuvers, like docking—since the early days of the Mercury program, and had spent hours talking with astronauts about the strengths and weaknesses of the different virtual training environments. He found that the astronauts were keenly aware of the debate over the proper role of humans in the space programs. They had a huge stake in whether they would have a role in future space systems or be little more than another batch of dogs and smart monkeys coming along for the ride.
The political battle over the human in the loop was waged over two divergent narratives: that of the heroic astronauts landing on the surface of the moon and that of the specter of a catastrophic accident culminating in the deaths of the astronauts—and potentially, as a consequence, the death of the agency. The issue, however, was at least temporarily settled when during the first human moon landing Neil Armstrong heroically took command after a computer malfunction and piloted the Apollo 11 spacecraft safely to the lunar surface. The moon landing and other similar feats of courage, such as Wally Schirra’s decision not to abort the earlier Gemini flight, have firmly established a view of human-machine interaction that elevates human decision-making beyond the fallible machines of our mythology. Indeed, the macho view of astronauts as modern-day Lewises and Clarks was from the beginning deeply woven into the NASA ethos, as well as being a striking contrast with the early Soviet decision to train women cosmonauts.3 The American view of human-controlled systems was long partially governed by perceived distinctions between U.S. and Soviet approaches to aeronautics as well as astronautics. The Vostok spacecraft were more automated, and so Soviet cosmonauts were basically passengers rather than pilots. Yet the original American commitment to human-controlled spaceflight was made when aeronautical technology was in its infancy. In the ensuing half century, computers and automated systems have become vastly more reliable.
For Taylor, the NASA human-in-the-loop wars were a formative experience that governed his judgment at both NASA and DARPA, where he envisioned and sponsored technological breakthroughs in computing, robotics, and artificial intelligence. While at NASA, Taylor fell into the orbit of J. C. R. Licklider, whose interests in psychology and information technology led him to anticipate the full potential of interactive computing. In his seminal 1960 paper “Man-Computer Symbiosis,” Licklider foresaw an era when computerized systems would entirely displace humans. However, he also predicted an interim period that might span from fifteen to five hundred years in which humans and computers would cooperate. He believed that that period would be “intellectually the most creative and exciting [time] in the history of mankind.”
Taylor moved to ARPA in 1965 as Licklider’s protégé. He set about funding the ARPAnet, the first nationwide research-oriented computer network. In 1968 the two men coauthored a follow-up to Licklider’s symbiosis paper titled “The Computer as a Communication Device.” In it, Licklider and Taylor were possibly the first to delineate the coming impact of computer networks on society.
Today, even after decades of research in human-machine and human-computer interaction in the airplane cockpit, the argument remains unsettled—and has emerged again with the rise of autonomous navigation in trains and automobiles. While Google leads research in driverless cars, the legacy automobile industry has started to deploy intelligent systems that can offer autonomous driving in some well-defined cases, such as stop-and-go traffic jams, but then return the car to human control in situations recognized as too complex or risky for the autopilot. It may take seconds for a human sitting in the driver’s seat, possibly distracted by an email or worse, to regain “situational awareness” and safely resume control of the car. Indeed, Google researchers may have already come up against the limits of autonomous driving. There is a growing consensus that the “handoff” problem—returning manual control of an autonomous car to a human in the event of an emergency—may not actually be solvable. If that proves true, the development of the safer cars of the future will tend toward augmentation technology rather than automation technology. Completely autonomous driving might ultimately be limited to special cases like low-speed urban services and freeway driving.
Nevertheless, the NASA disputes were a harbinger of the emerging world of autonomous machines. During the first fifty years of interactive computing, beginning in the mid-sixties, computers largely augmented humans instead of replacing them. The technologies that became the hallmark of Silicon Valley—personal computing and the Internet—largely amplified human intellect, although it was undeniably the case that an “augmented” human could do the work of several (former) coworkers. Today, in contrast, system designers have a choice. As AI technologies including vision, speech, and reasoning have begun to mature, it is increasingly possible to design humans either in or out of “the loop.”
Funded first by J. C. R. Licklider and then, beginning in 1965, by Bob Taylor, John McCarthy and Doug Engelbart worked in laboratories just miles apart from each other at the outset of the modern computing era. They might as well have been in different universes. Both were funded by ARPA, but they had little if any contact. McCarthy was a brilliant, if somewhat cranky, mathematician and Engelbart was an Oregon farm boy and a dreamer.
The outcome of their competing pioneering research was unexpected. When McCarthy came to Stanford to create the Stanford Artificial Intelligence Laboratory in the mid-1960s, his work was at the very heart of computer science, focusing on big concepts like artificial intelligence and proof of software program correctness using formal logic. Engelbart, on the other hand, set out to build a “framework” for augmenting the human intellect. It was initially a more nebulous concept viewed as far outside the mainstream of academic computer science, and yet for the first three decades of the interactive computing era Engelbart’s ideas had more worldly impact. Within a decade the first modern personal computers emerged, followed later by information-sharing technologies like the World Wide Web—both of which can be traced in part to Engelbart’s research.
Since then Engelbart’s adherents have transformed the world. They have extended human capabilities everywhere in modern life. Shrunk into smartphones, personal computers will soon be carried by all but the allergic or iconoclastic adult and teenager. Smartphones are almost by definition assembled into a vast distributed computing fabric woven together by the wireless Internet. They are also relied on as artificial memories. Today many people are literally unable to hold a conversation or find their way around town without querying them.
While Engelbart’s original research led directly to the PC and the Internet, McCarthy’s lab was most closely associated with two other technologies—robotics and artificial intelligence. There had been no single dramatic breakthrough. Rather, the falling cost of computing (both in processing and storage), the gradual shift from the symbolic logic-based approach of the first generation of AI research to more pragmatic statistics and machine-learning algorithms of the second generation of AI, and the declining price of sensors now offer engineers and programmers the canvas to create computerized systems that see, speak, listen, and move around in the world.
The balance has shifted. Computing technologies are emerging that can be used to replace and even outpace humans. At the same time, in the ensuing half century there has been little movement toward unification in the two fields, IA and AI, the offshoots of Engelbart’s and McCarthy’s original work. Rather, as computing and robotics systems have grown from laboratory curiosities into the fabric that weaves together modern life, the opposing viewpoints of those in each community have for the most part continued to speak past each other.
The human-computer interaction community keeps debating metaphors ranging from windows and mice to autonomous agents, but has largely operated within the philosophical framework originally set down by Engelbart—that computers should be used to augment humans. In contrast, the artificial intelligence community has for the most part pursued performance and economic goals elaborated in equations and algorithms, largely unconcerned with defining or in any way preserving a role for individual humans. In some cases the impact is easily visible, such as manufacturing robots that directly replace human labor. In other cases it is more difficult to discern the direct effect on employment caused by deployment of new technologies. Winston Churchill said: “We shape our buildings, and afterwards our buildings shape us.” Today our systems have become immense computational edifices that define the way we interact with our society, from how our physical buildings function to the very structure of our organizations, whether they are governments, corporations, or churches.
As the technologies marshaled by the AI and IA communities continue to reshape the world, alternative visions of the future play out: In one world humans coexist and prosper with the machines they’ve created—robots care for the elderly, cars drive themselves, and repetitive labor and drudgery vanish, creating a new Athens where people do science, make art, and enjoy life. It will be wonderful if the Information Age unfolds in that fashion, but how can it be a foregone conclusion? It is equally possible to make the case that these powerful and productive technologies, rather than freeing humanity, will instead facilitate a further concentration of wealth, fomenting vast new waves of technological unemployment, casting an inescapable surveillance net around the globe, while unleashing a new generation of autonomous superweapons.
When Ed Feigenbaum finished speaking the room was silent. No polite applause, no chorus of boos. Just a hush. Then the conference attendees filed out of the room and left the artificial intelligence pioneer alone at the podium.
Shortly after Barack Obama was elected president in 2008, it seemed possible that the Bush administration plan for space exploration, which focused on placing a manned base on the moon, might be replaced with an even more audacious program that would involve missions to asteroids and possibly even manned flights to Mars with human landings on the Martian moons Phobos and Deimos.4 Shorter-term goals included the possibility of sending astronauts to Lagrangian points roughly one million miles from Earth, where the gravitational pulls of the Earth and the sun balance in a way that lets a spacecraft hold its position, creating convenient long-term parking for ambitious devices like a next-generation Hubble Space Telescope.
Human exploration of the solar system was the pet project of G. Scott Hubbard, a former head of NASA’s Ames Research Center in Mountain View, California, who was heavily backed by the Planetary Society, a nonprofit that advocates for space exploration and science. As a result, NASA organized a conference to discuss the possible resurrection of human exploration of the solar system. A star-studded cast of space luminaries, including astronaut Buzz Aldrin, the second human to set foot on the moon, and celebrity astrophysicist Neil deGrasse Tyson, showed up for the day. One of the panels focused on the role of robots, which the conference organizers envisioned as intelligent systems that would assist humans on long flights to other worlds.
Feigenbaum had been a student of one of the founders of the field of AI, Herbert Simon, and he had led the development of the first expert systems as a young professor at Stanford. A believer in the potential of artificial intelligence and robotics, he had been irritated by a past run-in with a Mars geologist who had insisted that sending a human to Mars would provide more scientific information in just a few minutes than a complete robot mission might return. Feigenbaum also had a deep familiarity with the design of space systems. Moreover, having once served as chief scientist of the air force, he was a veteran of the human-in-the-loop debates stretching back to the space program.
He showed up to speak at the panel with a chip on his shoulder. Speaking from a simple set of slides, he sketched out an alternative to the manned flight to Mars vision. He rarely used capital letters in his slides, but he did this time:
ALMOST EVERYTHING THAT HAS BEEN LEARNED ABOUT THE SOLAR SYSTEM AND SPACE BEYOND HAS BEEN LEARNED BY PEOPLE ON EARTH ASSISTED BY THEIR NHA (NON-HUMAN AGENTS) IN SPACE OR IN ORBIT5
The whole notion of sending humans to another planet when robots could perform just as well—and maybe even better—for a fraction of the cost and with no risk of human life seemed like a fool’s errand to Feigenbaum. His point was that AI systems and robots in the broader sense of the term were becoming so capable so quickly that the old human-in-the-loop idea had lost its mystique as well as its value. All the coefficients on the nonhuman side of the equation had changed. He wanted to persuade the audience to start thinking in terms of agents, to shift gears and think about humans exploring the solar system with augmented senses. It was not a message that the audience wanted to hear. As the room emptied, a scientist who worked at NASA’s Goddard Space Flight Center came to the table and quietly said that she was glad that Feigenbaum had said what he did. In her job, she whispered, she could not say that.
Feigenbaum’s encounter underscores the reality that there isn’t a single “right” answer in the dichotomy between AI and IA. Sending humans into space is a passionate ideal for some. For others like Feigenbaum, however, the vast resources the goal entails are wasted. Intelligent machines are perfectly suited for the hostile environment beyond Earth, and in designing them we can perfect technologies that can be used to good effect on Earth. His quarrel also suggests that there will be no easy synthesis of the two camps.
While the separate fields of artificial intelligence and human-computer interaction have largely remained isolated domains, there are people who have lived in both worlds and researchers who have famously crossed from one camp to the other. Microsoft cognitive psychologist Jonathan Grudin first noted that the two fields have risen and fallen in popularity, largely in opposition to each other. When the field of artificial intelligence was more prominent, human-computer interaction generally took a backseat, and vice versa.
Grudin thinks of himself as an optimist. He has written that he believes it is possible that in the future there will be a grand convergence of the fields. Yet the relationship between the two fields remains contentious and the human-computer interaction perspective as pioneered by Engelbart and championed by people like Grudin and his mentor Donald Norman is perhaps the most significant counterweight to artificial intelligence-oriented technologies that have the twin potential for either liberating or enslaving humanity.
While Grudin has oscillated back and forth between the AI and IA worlds throughout his career, Terry Winograd became the first high-profile deserter from the world of AI. He chose to walk away from the field after having created one of the defining software programs of the early artificial intelligence era and has devoted the rest of his career to human-centered computing, or IA. He crossed over.
Winograd’s interest in computing was sparked while he was a junior studying math at Colorado College, when a professor of medicine asked his department for help doing radiation therapy calculations.6 The computer available at the medical center was a piano-sized Control Data minicomputer, the CDC 160A, one of Seymour Cray’s first designs. One person at a time used it, feeding in programs written in Fortran by way of a telex-like punched paper tape. On one of Winograd’s first days using the machine, it was rather hot so there was a fan sitting behind the desk that housed the computer terminal. He managed to feed his paper tape into the computer and then, by mistake, right into the fan.7
Terry Winograd was a brilliant young graduate student at MIT who developed an early program capable of processing natural language. Years later he rejected artificial intelligence research in favor of human-centered software design. (Photo courtesy of Terry Winograd)
In addition to his fascination with computing, Winograd had become intrigued by some of the early papers about artificial intelligence. As a math whiz with an interest in linguistics, the obvious place for graduate studies was MIT. When he arrived, at the height of the Vietnam War, Winograd discovered there was a deep gulf between the rival fiefdoms of Marvin Minsky and Noam Chomsky, leaders in the respective fields of artificial intelligence and linguistics. The schism was so deep that when Winograd would bump into Chomsky’s students at parties and mention that he was in the AI Lab, they would turn and walk away.
Winograd tried to bridge the gap by taking a course from Chomsky, but he received a C on a paper in which he argued for the AI perspective. Despite the conflict, it was a heady time for AI research. The Vietnam War had opened the Pentagon’s research coffers and ARPA was essentially writing blank checks to researchers at the major research laboratories. As at Stanford, at MIT there was a clear sense of what “serious” research in computer science was about. Doug Engelbart came around on a tour and showed a film demonstration of his NLS system. The researchers at the MIT AI Lab belittled his accomplishments. After all, they were building systems that would soon have capabilities matching those of humans, and Engelbart was showing off a computer editing system that seemed to do little more than sort grocery lists.
At the time Winograd was very much within the mainstream of computing, and as the zeitgeist pointed toward artificial intelligence, he followed. Most believed that it wouldn’t be long before machines would see, hear, speak, move, and otherwise perform humanlike tasks. Winograd was soon encouraged to pursue linguistic research by Minsky, who was eager to prove that his students could do as well or better at “language” than Chomsky’s. That challenge was fine with Winograd, who was interested in studying how language worked by using computing as a simulation tool.
As a teenager growing up in Colorado, Winograd, like many of his generation, had discovered Mad magazine. The irreverent—and frequently immature—satire journal would play a small role in naming SHRDLU, a program he wrote as a graduate student at MIT in the late 1960s that “understood” natural language and responded to commands. It has remained one of the most influential artificial intelligence programs.
Winograd had set out to build a system that could respond to typed commands in natural language and perform useful tasks in response. By this time there had already been a wave of initial experiments in building conversational programs. Eliza, written by MIT computer scientist Joseph Weizenbaum in 1964 and 1965, was named after Eliza Doolittle, who learned proper English in Shaw’s Pygmalion and the musical My Fair Lady. Eliza had been a groundbreaking experiment in the study of human interaction with machines: it was one of the first programs to provide users the opportunity to have a humanlike conversation with a computer. In order to skirt the need for real-world knowledge, Eliza parroted a Rogerian therapist and frequently reframed users’ statements as questions. The conversation was mostly one-sided because Eliza was programmed simply to respond to certain key words and phrases. This approach led to wild non sequiturs and bizarre detours. For example, Eliza would respond to a user’s statement about their mother with: “You say your mother?” Weizenbaum later said that he was stunned to discover Eliza users became deeply engrossed in conversations with the program, and even revealed intimate personal details. It was a remarkable insight not into the nature of machines but rather into human nature. Humans, it turns out, have a propensity to find humanity in almost everything they interact with, ranging from inanimate objects to software programs that offer the illusion of human intelligence.
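Eliza’s machinery was remarkably simple: scan the input for a keyword, then fill in a canned template, usually turning the user’s own words back into a question. The short Python sketch below is only an illustration of that keyword-and-template approach; the handful of rules and canned replies are invented for the example and are not Weizenbaum’s actual DOCTOR script.

import re

# A minimal keyword-and-template responder in the spirit of Eliza.
# The rules below are invented for illustration; they are not Weizenbaum's
# original script.
RULES = [
    (r"\bmy mother\b", "You say your mother?"),
    (r"\bi am (.+)", "How long have you been {0}?"),
    (r"\bi feel (.+)", "Why do you feel {0}?"),
    (r"\bbecause\b", "Is that the real reason?"),
]
DEFAULT = "Please tell me more."

def respond(statement):
    """Return a canned reframing of the user's statement, Rogerian style."""
    text = statement.lower()
    for pattern, template in RULES:
        match = re.search(pattern, text)
        if match:
            return template.format(*match.groups())
    return DEFAULT

print(respond("Well, my mother made me come here"))  # -> You say your mother?
print(respond("I am unhappy"))                       # -> How long have you been unhappy?

Because the program never models what the words mean, any input that misses a keyword falls back to a stock reply, which is exactly the behavior that produced Eliza’s non sequiturs and bizarre detours.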
Was it possible that in the cyber-future, humans, increasingly isolated from each other, would remain in contact with some surrogate computer intelligence? What kind of world did that foretell? Perhaps it was the one described in the movie Her, released in 2013, in which a shy guy connects with a female AI. Today, however, it is still unclear whether the emergence of cyberspace is a huge step forward for humanity as described by cyber-utopians such as Grateful Dead lyricist John Perry Barlow in his 1996 Wired manifesto, “A Declaration of the Independence of Cyberspace,” or the much bleaker world described by Sherry Turkle in her book Alone Together: Why We Expect More from Technology and Less from Each Other. For Barlow, cyberspace would become a utopian world free from crime and degradation of “meatspace.” In contrast, Turkle describes a world in which computer networks increasingly drive a wedge between humans, leaving them lonely and isolated. For Weizenbaum, computing systems risked fundamentally diminishing the human experience. In very much the same vein that Marxist philosopher Herbert Marcuse attacked advanced industrial society, he was concerned that the approaching Information Age might bring about a “One-Dimensional Man.”
In the wake of the creation of Eliza, a group of MIT scientists, including information theory pioneer Claude Shannon, met in Concord, Massachusetts, to discuss the social implications of the phenomenon.8 The seductive quality of the interactions with Eliza concerned Weizenbaum, who believed that an obsessive reliance on technology was indicative of a moral failing in society, an observation rooted in his experiences as a child growing up in Nazi Germany. In 1976, he sketched out a humanist critique of computer technology in his book Computer Power and Human Reason: From Judgment to Calculation. The book did not argue against the possibility of artificial intelligence but rather was a passionate indictment of computerized systems that substituted automated decision-making for the human mind. In the book, he argued that computing served as a conservative force in society, propping up bureaucracies and reductively redefining the world as a narrower, more sterile place by constricting the potential of human relationships.
Weizenbaum’s criticism largely fell on deaf ears in the United States. Years later his ideas would receive a more positive reception in Europe, where he moved at the end of his life. At the time, however, in the United States, where the new computing technologies were taking root, there was more optimism about artificial intelligence.
In the late 1960s as a graduate student, Winograd was immersed in the hothouse world of the MIT AI Lab, the birthplace of the computing hacker culture, which would lead both to personal computing and the “information wants to be free” ideology that would later become the foundation of the open-source computing movement of the 1990s. Many at the lab staked their careers on the faith that cooperative and autonomous intelligent machines would soon be a reality. Eliza, and then several years later Winograd’s SHRDLU, were the direct predecessors of the more sophisticated computerized personal assistants that would follow in the coming decades. There had been earlier efforts at MIT to build microworlds or “block worlds,” which were restricted, simulated environments in which AI researchers would create programs capable of reasoning about their surroundings and planning. Some of those environments had used real robot arms and blocks. When Winograd began working on his project, another student was already building a system that could book airline reservations, but that was less interesting to Winograd. Instead, he set out to build a constrained world to explore and rationalize and chose to create his system in the form of a virtual computer world.
He built a computer simulation of a world populated by colored blocks that could be explored and manipulated, writing the program in an artificial intelligence programming language named MicroPlanner, which was based on the work of Carl Hewitt, another MIT graduate student. Given the relatively primitive state of computing at the time, much was left to the imagination. There was no fancy graphical animation of the blocks world. The user simply sat at a Teletype terminal, entered questions at the keyboard, and the computer responded in natural language. Winograd chose the name SHRDLU because no better one came to mind. He later said the sequence SHRDLU was probably hidden in his subconscious. The sequence “etaoin shrdlu,” representing the most frequently used letters in English, runs down the left-hand side of the Linotype keyboard, much as “1qaz 2wsx” runs down the left side of a typewriter keyboard—and as a consequence the string of characters showed up occasionally in Mad magazine, as an inside joke on its accidental appearance in print. The Linotype, which cast an entire line of type from molten metal, had no backspace. Because it was faster to cast an erroneous line than to correct it, typesetters would “run down” the rest of the line with easy-to-type nonsense, later removing the entire line after it had cooled, or, if they forgot, hoping a proofreader would catch it.9
He wasn’t concerned at the time about any ethical implications involved in building a natural language processing system that could “understand” and respond in a virtual world. In SHRDLU “understanding” meant that the program analyzed the structure of the typed questions and attempted to answer them and respond to the commands. It was an early effort at disambiguation, a thorny problem for natural language processing even today. For example, in the sentence “he put the glass on the table and it broke,” does “it” refer to the glass or the table? Without more context, neither a human nor an AI program could decide.
Winograd’s system used its general knowledge of the microworld to answer and respond to various questions. He illustrated the conversation as follows in his dissertation:
Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I don’t understand which pyramid you mean.
Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.
Computer: By “it,” I assume you mean the block which is taller than the one I am holding.
Computer: OK.
Person: What does the box contain?
Computer: The blue pyramid and the blue block.
Person: What is the pyramid supported by?
Computer: The box.
Person: How many blocks are not in the box?
Computer: Four of them.
Person: Is at least one of them narrower than the one which I told you to pick up?
Computer: Yes, the red cube.10
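The flavor of that exchange can be suggested with a toy microworld: a few objects with colors, shapes, and locations, plus crude handling of two sentence forms. The Python sketch below is purely illustrative, with invented objects and only a sliver of the behavior shown above; Winograd’s actual program was written in Lisp and MicroPlanner and parsed far richer English.

from dataclasses import dataclass

# A toy blocks-world interpreter in the spirit of SHRDLU. The objects and the
# two recognized sentence forms are invented for illustration only.
@dataclass
class Thing:
    color: str
    shape: str
    location: str  # "table" or "box"

world = [
    Thing("red", "block", "table"),
    Thing("blue", "pyramid", "box"),
    Thing("blue", "block", "box"),
    Thing("green", "block", "table"),
]
holding = None

def interpret(sentence):
    """Answer a command or question about the toy world."""
    global holding
    words = sentence.lower().rstrip("?.").split()
    if words[:2] == ["pick", "up"]:
        color, shape = words[-2], words[-1]
        matches = [t for t in world if t.color == color and t.shape == shape]
        if not matches:
            return "I don't see one."
        holding = matches[0]
        return "OK."
    if words[:4] == ["what", "does", "the", "box"]:
        contents = [f"the {t.color} {t.shape}" for t in world if t.location == "box"]
        return (" and ".join(contents) + ".").capitalize()
    return "I don't understand."

print(interpret("Pick up a big red block"))     # OK.
print(interpret("What does the box contain?"))  # The blue pyramid and the blue block.

Even this trivial sketch makes the larger point: the program’s “understanding” is bounded entirely by the small world it has been given to reason about.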
In 1970, when it was completed, SHRDLU proved to be one of the most successful demonstrations of AI’s potential—perhaps too successful. Years later, Winograd’s tour de force would be blamed for helping generate the optimistic view that it would be possible to “scale up” similar programs to deal with real-world complexity. For example, during the 1980s and 1990s the AI research community widely accepted that it would be possible to build a machine with the reasoning power at least of a kindergartener—by simply accumulating a vast number of common-sense rules.
The attack on the AI optimists, however, had begun even before Winograd built SHRDLU. Although Weizenbaum’s critique was about the morality of building intelligent machines, the more heated debate was over whether such machines were even possible. Seymour Papert, Winograd’s thesis advisor, had become engaged in a bitter debate with Hubert Dreyfus, a philosopher and Heidegger acolyte, who, just one decade after McCarthy had coined the term, would ridicule the field in a scathing paper entitled “Alchemy and Artificial Intelligence,” published in 1965 by the RAND Corporation.11 (Years later, in the 2014 movie remake of RoboCop, the fictional U.S. senator who sponsors legislation banning police robots is named Hubert Dreyfus in homage.)
Dreyfus ran afoul of AI researchers in the early sixties when they showed up in his Heidegger course and belittled philosophers for failing to understand human intelligence after studying it for centuries.12 It was a slight he would not forget. For the next four decades, Dreyfus would become the most pessimistic critic of the possibility of work-as-promised artificial intelligence, summing up his argument in an attack on two Stanford AI researchers: “Feigenbaum and Feldman claim that tangible progress is indeed being made, and they define progress very carefully as ‘displacement toward the ultimate goal.’ According to this definition, the first man to climb a tree could claim tangible progress toward flight to the moon.”13 Three years later, Papert fired back in “The Artificial Intelligence of Hubert L. Dreyfus, A Budget of Fallacies”: “The perturbing observation is not that Dreyfus imports metaphysics into engineering but that his discussion is irresponsible,” he wrote. “His facts are almost always wrong; his insight into programming is so poor that he classifies as impossible programs a beginner could write; and his logical insensitivity allows him to take his inability to imagine how a particular algorithm can be carried out, as reason to believe no algorithm can achieve the desired purpose.”14
Winograd would eventually break completely with Papert, but this would not happen for many years. He came to Stanford as a professor in 1973, when his wife, a physician, accepted an offer as a medical resident in the Bay Area. It was just two years after Intel had introduced the 4004, the first commercial microprocessor chip, and trade journalist Don Hoefler settled on “Silicon Valley U.S.A.” as shorthand for the region in his newsletter Microelectronics News. Winograd continued to work for several years on the problem of machine understanding of natural language, very much in the original tradition of SHRDLU. Initially he spent almost half his time at Xerox Palo Alto Research Center working with Danny Bobrow, another AI researcher interested in natural language understanding. Xerox had opened a beautiful new building in March 1975 in a location next to Stanford that gave the “document company” easy access to the best computer scientists. Later Winograd would tell friends, “You know all the famous personal computing technology that was invented at PARC? Well, that’s not what I worked on.”
Instead he spent his time trying to elaborate and expand on the research he had pursued at MIT, research that would bear fruit almost four decades later. During the 1970s, however, it seemed to present an impossible challenge, and many started to wonder how, or even if, science could come to understand how humans process language. After spending half a decade on language-related computing, Winograd found himself growing more and more skeptical that real progress in AI would be possible. Beyond his lack of headway, he came to reject artificial intelligence partly through the influence of a new friendship with a Chilean political refugee named Fernando Flores, and partly through his recent engagement with a group of Berkeley philosophers, led by Dreyfus, who were intent on stripping away the hype around the emerging AI industry. Flores, a bona fide technocrat who had been finance minister during the Allende government, barely escaped his office in the palace when it was bombed during the coup. He spent three years in prison before arriving in the United States, his release coming in response to political pressure by Amnesty International. Stanford had appointed Flores as a visiting scholar in computer science, but he left Palo Alto instead to pursue a Ph.D. at Berkeley under the guidance of a quartet of anti-AI scholars: Hubert and Stuart Dreyfus, John Searle, and Ann Markussen.
Winograd thought Flores was one of the most impressive intellectuals he had ever met. “We started talking in a casual way, then he handed me a book on philosophy of science and said, ‘You should read this.’ I read it, and we started talking about it, and we decided to write a paper about it, that turned into a monograph, and that turned into a book. It was a gradual process of finding him interesting, and finding the stuff we were talking about intellectually stimulating,” Winograd recalled.15 The conversations with Flores put the young computer scientist “in touch” with the ways in which he was unhappy with what he thought of as the “ideology” of AI. Flores aligned himself with the charismatic Werner Erhard, whose cultlike organization EST (Erhard Seminars Training) had a large following in the Bay Area during the 1970s. (At Stanford Research Institute, Engelbart sent the entire staff of his lab through EST training and joined the board of the organization.)
Although the computing world was tiny at the time, the tensions between McCarthy and Minsky’s AI design approach and Engelbart’s IA approach were palpable around Stanford. PARC was inventing the personal computer; the Stanford AI Lab was doing research on everything from robot arms to mobile robots to chess-playing AI systems. At the recently renamed SRI (which changed its name from Stanford Research Institute due to student antiwar protests) researchers were working on projects that ranged from Engelbart’s NLS system to Shakey the robot, as well as early speech recognition research and “smart” weapons. Winograd would visit Berkeley for informal lunchtime discussions with Searle and Dreyfus, the Berkeley philosophers, their grad students, and Fernando Flores. While Hubert Dreyfus objected to the early optimistic predictions by AI researchers, it was John Searle who raised the stakes and asked one of the defining philosophical questions of the twentieth century: Is it possible to build an intelligent machine?
Searle, a dramatic lecturer with a flair for showmanship, was never one to avoid an argument. Before teaching philosophy he had been a political activist. While at the University of Wisconsin in the 1950s he had been a member of Students Against Joseph McCarthy, and in 1964 he would become the first tenured Berkeley faculty member to join the Free Speech Movement. As a young philosopher Searle had been drawn to the interdisciplinary field of cognitive science. At the time, the core assumption of the field was that the biological mind was analogous to the software that animated machines. If this was the case, then understanding the processes of human thought would merely be a matter of teasing out the program inside the intertwined billions of neurons making up the human brain.
The Sloan Foundation had sent Searle to Yale to discuss the subject of artificial intelligence. While on the plane to the meeting he began reading a book about artificial intelligence written by Roger Schank and Robert Abelson, the leading Yale AI researchers during the second half of the 1970s. Scripts, Plans, Goals, and Understanding16 made the assertion that artificial intelligence programs could “understand” stories that had been designed by their developers. For example, developers could present the computer with a simple story, such as a description of a man going into a restaurant, ordering a hamburger, and then storming out without paying for it. In response to a query, the program was able to infer that the man had not eaten the hamburger. “That can’t be right,” Searle thought to himself, “because you could give me a story in Chinese with a whole lot of rules for shuffling the Chinese symbols, and I don’t understand a word of Chinese but all the same I could give the right answer.”17 He decided that it just didn’t follow that the computer had the ability to understand anything just because it could interpret a set of rules.
While flying to his lecture, he came up with what has been called the “Chinese Room” argument against sentient machines. Searle’s critique was that there could be no simulated “brains in a box.” His argument was different from the original Dreyfus critique, which asserted that obtaining human-level performance from AI software was impossible. Searle simply argued that a computing machine is little more than a very fast symbol shuffler that follows a set of syntactical rules. What it lacks is what the biological mind has: semantics, the capacity for meaning. How meaning arises from biology remains a great mystery. Searle’s position infuriated the AI community in part because he implied that their view implicitly committed them to a theological claim that the mind exists outside the physical, biological world. Searle held that mental processes are caused entirely by biological processes in the brain and are realized there, and that if you want to make a machine that can think, you must duplicate, rather than simulate, those processes. At the time Searle assumed the AI researchers had probably already considered his objection and that the discussion wouldn’t last a week, let alone decades. But it has. Searle’s original article generated thirty published refutations. Three decades later, the debate is anything but settled. To date, there are several hundred published attacks on his idea. And Searle is still alive and busy defending his position.
It is also notable that the lunchtime discussions about the possibility of intelligent and conceivably self-aware machines took place against a backdrop of the Reagan military buildup. The Vietnam War had ended, but there were still active pockets of political dissent around the country. The philosophers would meet at the Y across the street from the Berkeley campus. Winograd and Danny Bobrow from Xerox PARC had become regular visitors at these lunches, and Winograd found that they challenged his intellectual biases about the philosophical underpinnings of AI.
He would eventually give up the AI “faith.” Winograd concluded that there was nothing mystical about human intelligence. In principle, if you could discover the way the brain worked, you could build a functional artificially intelligent machine, but you couldn’t build that same machine with symbolic logic and computing, which was the dominant approach in the 1970s and 1980s. Winograd’s interest in artificial intelligence had been twofold: AI served both as a model for understanding language and the human brain and as a system that could perform useful tasks. At that point, however, he took an “Engelbartian” turn. Philosophically and politically, human-centered computing was a better fit with his view of the world. His intellectual engagement with Flores led to a book, Understanding Computers and Cognition: A New Foundation for Design, a critique of artificial intelligence. Understanding Computers, though, was philosophy, not science, and Winograd still had to figure out what to do with his career. Eventually, he set aside his effort to build smarter machines and focused instead on the question of how to use computers to make people smarter. Winograd crossed the chasm. From designing systems that were intended to supplant humans, he turned his focus to technologies that enhanced the way people interact with computers.
Though Winograd would argue years later that politics had not directly played a role in his turn away from artificial intelligence, the political climate of the time certainly influenced many other scientists’ decisions to abandon the artificial intelligence camp. During a crucial period from 1975 to 1985, artificial intelligence research was overwhelmingly funded by the Defense Department. Some of the nation’s most notable computer scientists—including Winograd—had started to worry about the increasing involvement of the military in computing technology R & D. For a generation who had grown up watching the movie Dr. Strangelove, the Reagan administration’s Star Wars antimissile program seemed like dangerous brinkmanship. That sentiment was at least a part of Winograd’s moral background and clearly part of the intellectual backdrop when he decided to leave the field he had helped to create. Winograd was a self-described “child of the ’60s,”18 and during the crucial years when he turned away from AI, he simultaneously played a key role in building a national organization of computer scientists, led by researchers at Xerox PARC and Stanford, who had become alarmed at the Star Wars weapons buildup. The group shared a deep fear that the U.S. military command would push the country into a nuclear confrontation with the Soviet Union. As a graduate student in Boston, Winograd had been active against the war in Vietnam as part of a group called “Computer People for Peace.” In 1981 he became active again, this time as a leader in creating a national organization of computer scientists opposed to nuclear weapons.
In response to the highly technical Strategic Defense Initiative, the disaffected computer scientists believed they could use the weight of their expertise to create a more effective anti-nuclear weapons group. They evolved from being “people” and became “professionals.” In 1981, they founded a new organization called Computer Professionals for Social Responsibility. Winograd ran the first planning meeting, held in a large classroom at Stanford. Those who attended recalled that unlike many political meetings from the antiwar era that were marked by acrimony and debate, the evening was characterized by an unusual sense of unity and common purpose. Winograd proved an effective political organizer.
In a 1984 essay on the question of whether computer scientists should accept military funding, Winograd pointed out that he had avoided applying for military funding in the past, but by keeping his decision private, he had ducked what he would come to view as a broader responsibility. He had, of course, received his training in a military-funded laboratory at MIT. Helping establish Computer Professionals for Social Responsibility was the first of a set of events that would eventually lead Winograd to “desert” the AI community and turn his attention from building intelligent machines to augmenting humans.
Indirectly it was a move that would have a vast impact on the world. Winograd was recognized enough in the artificial intelligence community that, if he had decided to pursue a more typical academic career, he could have built an academic empire based on his research interests. Personally, however, he had no interest in building a large research lab or even supporting postdoctoral researchers. He was passionate about one-to-one interaction with his students.
One of these was Larry Page, a brash young man with a wide range of ideas for possible dissertation topics. Under Winograd’s guidance Page settled on the idea of downloading the entire Web and improving the way information was organized and discovered. He set about doing this by mining human knowledge, which was embodied in existing Web hyperlinks. In 1998, Winograd and Page joined with Sergey Brin, another Stanford graduate student and a close friend of Page’s, and Brin’s faculty advisor, Rajeev Motwani, an expert in data mining, to coauthor a journal article titled “What Can You Do with a Web in Your Pocket?”19 In the paper, they described the prototype version of the Google search engine.
Page had been thinking about other, more conventional AI research ideas, like self-driving cars. Instead, with Winograd’s encouragement, he found an ingenious way of mining human behavior and intelligence by exploiting the links created by millions of Web users, and he used that information to significantly improve the quality of the results returned by a search engine. This work would be responsible for the most significant “augmentation” tool in human history. In September of that year, Page and Brin left Stanford and founded Google, Inc., with the modest goal of “organizing the world’s information and making it universally accessible and useful.”
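The core intuition behind that link mining, later popularized under the name PageRank, can be sketched in a few lines: treat every hyperlink as a vote, and let a page’s importance depend on the importance of the pages that vote for it, computed by repeated propagation. The toy link graph, damping value, and iteration count below are illustrative assumptions, not Google’s production system.

# A minimal power-iteration sketch of link-based ranking in the spirit of
# PageRank: each page's score is spread across its outgoing links and
# accumulated by the pages it points to. The tiny graph and the damping
# factor are illustrative assumptions only.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["a", "c"],
}

def rank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue  # dangling pages are simply skipped in this sketch
            share = damping * scores[page] / len(outlinks)
            for target in outlinks:
                new[target] += share
        scores = new
    return scores

for page, score in sorted(rank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))

In this toy graph the page with the most and best-connected inbound links ends up with the highest score, which is the essence of using the Web’s existing link structure as a record of collective human judgment.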
By the end of the 1990s Winograd believed that the artificial intelligence and human-computer interaction research communities represented fundamentally different philosophies about how computers and humans should interact. The easy solution, he argued, would be to agree that both camps were equally “right” and to stipulate that there will obviously be problems in the world that could be solved by either approach. This answer, however, would obscure the fact that inherent in these differing approaches are design consequences that play out in the nature of the systems. Adherents of the different philosophies, of course, construct these systems. Winograd had come to believe that the way computerized systems are designed has consequences both in how we understand humans and how technologies are designed for their benefit.
The AI approach, which Winograd describes as “rationalistic,” views people as machines. Humans are modeled with internal mechanisms very much like digital computers. “The key assumptions of the rationalistic approach are that the essential aspects of thought can be captured in a formal symbolic representation,” he wrote. “Armed with this logic, we can create intelligent programs and we can design systems that optimize human interaction.”20 In opposition to the rational AI approach was the augmentation method that Winograd describes as “design.” That approach is more common in the human-computer interaction community, in which developers focus not on modeling a single human intelligence, but rather on using the relationship between the human and the environment as the starting point for their investigations, be it with humans or an ensemble of machines. Described as “human-centered” design, this school of thought eschews formal planning in favor of an iterative approach to design, encapsulated well in the words of industrial designer and IDEO founder David Kelley: “Enlightened trial and error outperforms the planning of flawless intellect.”21 Pioneered by psychologists and computer scientists like Donald Norman at the University of California at San Diego and Ben Shneiderman at the University of Maryland, human-centered design would become an increasingly popular approach that veered away from the rationalist AI model that was popularized in the 1980s.
In the wake of the AI Winter of the 1980s, the artificial intelligence community also changed dramatically during the 1990s. It largely abandoned the formal, rationalist, top-down straitjacket that had been described as GOFAI, or “Good Old-Fashioned Artificial Intelligence,” in favor of statistical and “bottom-up,” or “constructivist,” approaches, such as those pursued by roboticists led by Rod Brooks. Nevertheless, the two communities have remained distant, preoccupied with their contradictory challenges of either replacing or augmenting human skills.
In breaking with the AI community, Winograd became a member of a group of scientists and engineers who took a step back and rethought the relationship between humans and the smart tools they were building. In doing so, he also reframed the concept of “machine” intelligence. By posing the question of whether humans were actually “thinking machines” in the same manner as the computing machines the AI researchers were trying to create, he argued that the very question makes us engage—wittingly or not—in an act of projection that tells us more about our concept of human intelligence than it does about the machines we are trying to understand. Winograd came to believe that intelligence is an artifact of our social nature, and that we flatten our humanness when we simplify and distort what it is to be human in order to simulate it with a machine.
While artificial intelligence researchers rarely spoke to the human-centered design researchers, the two groups would occasionally organize confrontational sessions at technical conferences. In the 1990s, Ben Shneiderman was a University of Maryland computer scientist who had become a passionate advocate of the idea of human-centered design through what became known as “direct manipulation.” During the 1980s, with the advent of Apple’s Macintosh and Microsoft’s Windows software systems, direct manipulation had become the dominant style in computer user interfaces. For example, rather than entering commands on a keyboard, users could change the shape of an image displayed on a computer screen by grabbing its edges or corners with a mouse and dragging them.
Shneiderman was at the top of his game and, during the 1990s, he was a regular consultant at companies like Apple, where he dispensed advice on how to efficiently design computer interfaces. Shneiderman, who considered himself to be an opponent of AI, counted among his influences Marshall McLuhan. During college, after attending a McLuhan lecture at the Ninety-Second Street Y in New York City, he had felt emboldened to pursue his own various interests, which crossed the boundaries between science and the humanities. He went home and printed a business card describing his job title as “General Eclectic” and subtitled it “Progress is not our most important product.”22
He would come to take pride in the fact that Terry Winograd had moved from the AI camp to the HCI world. Shneiderman sharply disagreed with Winograd’s thesis when he read it in the 1970s and had written a critical chapter about SHRDLU in his 1980 book Software Psychology. Some years later, when Winograd and Flores published Understanding Computers and Cognition, which made the point that computers were unable to “understand” human language, he called Winograd up and told him, “You were my enemy, but I see you’ve changed.” Winograd laughed and told Shneiderman that Software Psychology was required reading in his classes. The two men became good friends.
In his lectures and writing, Shneiderman didn’t mince words in his attacks on the AI world. He argued not only that the AI technologies would fail, but also that they were poorly conceived and ethically compromised because they were not designed to help humans. With great enthusiasm, he argued that autonomous systems raised profound moral questions about who was responsible for their actions, questions that computer researchers were not addressing. This fervor wasn’t new for Shneiderman, who had previously been involved in legendary shouting matches at technical meetings over the wisdom of designing animated human agents like Clippy, the Microsoft Office assistant, and Microsoft Bob, two ill-received attempts to build “friendlier” user interfaces.
In the early 1990s anthropomorphic interfaces had become something of a fad in computer design circles. Inspired in part by Apple’s widely viewed Knowledge Navigator video, computer interface designers were adding helpful and chatty animated cartoon figures to systems. Banks were experimenting with animated characters that would interact with customers from the displays of automated teller machines, and car manufacturers started to design cars with speech synthesis that would, for example, warn drivers when their door was ajar. The initial infatuation would come to an abrupt halt, however, with the embarrassing failure of Microsoft Bob. Although it had been designed with the aid of Stanford University user interface specialists, the program was widely derided as a goofy idea.
Did the problem with Microsoft Bob lie with the idea of a “social” interface itself, or with the way it was implemented? Microsoft’s bumbling efforts were rooted in the work of Stanford researchers Clifford Nass and Byron Reeves, who had discovered that humans responded well to computer interfaces that offered the illusion of human interaction. The two researchers both arrived at the Stanford Communications Department in 1986. Reeves had been a professor of communications at the University of Wisconsin, and Nass had studied mathematics at Princeton and worked at IBM and Intel before turning his interests toward sociology.
As a social scientist, Nass worked with Reeves to conduct a series of experiments that led to a theory of communications they described as “the Media Equation.” In their book The Media Equation: How People Treat Computers, Television, and New Media Like Real People and Places, they explored what they saw as the human tendency to interact with technological devices, whether computers, televisions, or other electronic media, in the same “social” fashion in which they interact with other humans. On the strength of this research, Reeves and Nass were hired as consultants by Microsoft in 1992 and encouraged the design of familiar, social, and natural interfaces. That thinking extended the logic of Apple’s graphical interface for the Macintosh, which, like Windows, had been inspired by the original work on the Alto at Xerox PARC; both designs eased the task of using a computer by creating a graphical environment evocative of a desk and an office in the physical world. Microsoft Bob attempted to extend the “desktop” metaphor by creating a graphical environment that evoked the family home, but it adopted a cartoonish, dumbed-down approach that the computer digerati found insulting to users, and customers overwhelmingly rejected it.
Decades later, the success of Apple’s Siri would vindicate Nass and Reeves’s early research, suggesting that the failure of Microsoft Bob lay in how Microsoft built and applied the system rather than in the approach itself. Siri speeds people up in contexts where keyboard input is difficult or unsafe, such as while walking or driving. Microsoft Bob and Clippy, by contrast, slowed users down and came across as simplistic and condescending: “as if they were being asked to learn to ride a bicycle by starting with a tricycle,” according to Tandy Trower, a veteran Microsoft executive.23 That said, Trower pointed out that Microsoft may have fundamentally misread the insights offered by the Stanford social scientists: “Nass and Reeves’ research suggests that user expectations of human-like behavior are raised as characters become more human,” he wrote. “This Einstein character sneezed when you asked it to exit. While no users were ever sprayed upon by the character’s departure, if you study Nass and Reeves, this is considered to be socially inappropriate and rude behavior. It doesn’t matter that they are just silly little animations on the screen; most people still respond negatively to such behavior.”24
Software agents had originally emerged during the first years of the artificial intelligence era, when Oliver Selfridge, a participant in the original Dartmouth AI conference, proposed an approach to machine perception called “Pandemonium,” in which collaborating programs called “demons,” described as “intelligent agents,” would work in parallel on a computer vision problem. The original software agents were merely programs that ran inside a computer. Over the following decades computer scientists, science-fiction authors, and filmmakers embellished the idea until it became a powerful vision of an interconnected, computerized world in which software programs cooperate in pursuit of a common goal, collecting information, performing tasks, and interacting with users as animated servants. But was there not a Faustian side to this? Shneiderman worried that leaving computers to complete human tasks would create more problems than it solved, and that concern was at the core of his attack on the AI designers.
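The flavor of the Pandemonium architecture behind those original agents is easy to convey in code. The TypeScript sketch below is a toy illustration rather than a reconstruction of Selfridge’s system: the bitmaps, features, and weights are invented, and the demons take turns instead of running in parallel as Selfridge envisioned. Feature “demons” each shout a score about a tiny image, letter demons weigh those shouts, and a decision demon picks the loudest voice.

```typescript
// Toy Pandemonium-style recognizer (illustrative only, not Selfridge's code).
type Bitmap = number[][]; // a 3x3 grid of 0s and 1s

// Feature demons: each inspects the image and "shouts" a score from 0 to 1.
const featureDemons: Record<string, (img: Bitmap) => number> = {
  topBar: (img) => (img[0][0] + img[0][1] + img[0][2]) / 3,
  middleBar: (img) => (img[1][0] + img[1][1] + img[1][2]) / 3,
  bottomBar: (img) => (img[2][0] + img[2][1] + img[2][2]) / 3,
  leftEdge: (img) => (img[0][0] + img[1][0] + img[2][0]) / 3,
};

// Cognitive (letter) demons: weights say how loudly each feature counts toward a letter.
const letterDemons: Record<string, Record<string, number>> = {
  T: { topBar: 1.0, middleBar: 0.2, bottomBar: 0.0, leftEdge: 0.2 },
  L: { topBar: 0.0, middleBar: 0.2, bottomBar: 1.0, leftEdge: 1.0 },
  E: { topBar: 1.0, middleBar: 1.0, bottomBar: 1.0, leftEdge: 1.0 },
};

// Decision demon: listen to every letter demon and pick the loudest voice.
function decide(img: Bitmap): string {
  const shouts: Record<string, number> = {};
  for (const [name, demon] of Object.entries(featureDemons)) {
    shouts[name] = demon(img); // in Selfridge's design these would run in parallel
  }
  let best = "?";
  let loudest = -Infinity;
  for (const [letter, weights] of Object.entries(letterDemons)) {
    const entries = Object.entries(weights);
    const weightSum = entries.reduce((sum, [, w]) => sum + w, 0);
    // Normalize so letters with many features don't win by default.
    const volume =
      entries.reduce((sum, [feature, w]) => sum + w * shouts[feature], 0) / weightSum;
    if (volume > loudest) {
      loudest = volume;
      best = letter;
    }
  }
  return best;
}

// A crude capital "L": a left edge plus a bottom bar.
console.log(decide([[1, 0, 0], [1, 0, 0], [1, 1, 1]])); // prints "L"
```

Even in this caricature the agent idea is visible: each demon is an independent specialist, and the overall behavior emerges from many small programs contributing their piece of the answer. It was exactly this vision, scaled up from demons inside a program to agents acting on a user’s behalf, that Shneiderman distrusted.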
At two technical meetings in 1997, Shneiderman squared off against Pattie Maes over AI and software agents. Before their first debate began, he tried to defuse the tension by handing Maes, who had recently become a mother, a teddy bear. Maes was a computer scientist at the MIT Media Lab who, under the guidance of laboratory founder Nicholas Negroponte, had begun developing software agents to perform useful tasks on behalf of a computer user. Agents were just one of many future-of-computing ideas pursued at Negroponte’s laboratory, which had started out as ArcMac, the Architecture Machine Group, and trained generations of researchers who took its “demo or die” ethos to heart. ArcMac and its successor, the MIT Media Laboratory, generated many of the ideas that would later filter into computing products at both Apple and Microsoft.
In the 1960s and 1970s, Negroponte, who had trained as an architect, traced a path from the concept of a human-machine design partnership to a then far-out vision of “architecture without architects” in his books The Architecture Machine and Soft Architecture Machines.
In his 1995 book Being Digital, Negroponte, a close friend of AI researchers like Marvin Minsky and Seymour Papert, described his view of the future of human-computer interaction: “What we today call ‘agent-based interfaces’ will emerge as the dominant means by which computers and people talk with one another.”25 That same year, Maes founded Agents, Inc., a music recommendation service, with a small group of Media Lab partners. The company would eventually be sold to Microsoft, which used the privacy technologies her company had developed but did not commercialize its original software agent ideas.
At first the conference organizers had wanted Shneiderman and Maes to debate the possibility of artificial intelligence. Shneiderman declined and the topic was changed. The two researchers agreed to debate the contrasting virtues of software agents that acted on a user’s behalf, on the one hand, and software technologies that directly empowered a computer user, on the other.
The high-profile debate took place in March of 1997 at the Association for Computing Machinery’s Computer-Human Interaction (CHI) conference in Atlanta. The event was given top billing along with other questions of pressing concern like “Why Aren’t We All Flying Personal Helicopters?” and “The Only Good Computer Is an Invisible Computer?” In front of an audience of the world’s best computer interface designers, the two computer scientists spent an hour laying out the pros and cons of designs that directly augment humans and those that act more or less independently of them.
“I believe the language of ‘intelligent autonomous agents’ undermines human responsibility,” Shneiderman said. “I can show you numerous articles in the popular press which suggest the computer is the active and responsible party. We need to clarify that either programmers or operators are the cause of computer failures.”26
Maes responded pragmatically. Shneiderman’s research was in the Engelbart tradition of building complex systems that gave users immense power but, as a result, required significant training. “I believe that there are real limits to what we can do with visualization and direct manipulation because our computer environments are becoming more and more complex,” she responded. “We cannot just add more and more sliders and buttons. Also, there are limitations because the users are not computer-trained. So, I believe that we will have to, to some extent, delegate certain tasks or certain parts of tasks to agents that can act on our behalf or that can at least make suggestions to us.”27
Perhaps Maes’s most effective retort was that it might be wrong to believe that humans always want to be in control and to be responsible. “I believe that users sometimes want to be couch-potatoes and wait for an agent to suggest a movie for them to look at, rather than using 4,000 sliders, or however many it is, to come up with a movie that they may want to see,” she argued. The session concluded politely with no obvious winner, but it was clear to Jonathan Grudin, watching from the audience, that Pattie Maes had been brave to take up the argument at a CHI conference, on Shneiderman’s home turf. The debate took place a decade and a half before Apple unveiled Siri, which successfully added an entirely artificial human element to human-computer interaction. Years later Shneiderman would acknowledge that there were some cases in which speech and voice recognition might be appropriate. He remained, however, a staunch critic of the basic idea of software agents, and pointed out that aircraft cockpit designers had for decades tried and failed to use speech recognition to control airplanes.
When Siri was introduced in 2010, the “Internet of Things” was approaching the peak of its hype cycle. The concept had originally been Xerox PARC’s next big idea after personal computing. In the late 1980s PARC computer scientist Mark Weiser had predicted that as microprocessor cost, size, and power consumption collapsed, it would become possible to weave computer intelligence discreetly into everyday objects. He called this “UbiComp,” or ubiquitous computing. Computing would disappear into the woodwork, he argued, just as electric motors, pulleys, and belts have become “invisible.” Outside Weiser’s office was a small sign: UBICOMP IS UPWARDLY COMPATIBLE WITH REALITY. (A popular definition of “ubiquitous” is “notable only for its absence.”)
It would be Steve Jobs who once again most successfully took advantage of PARC’s research. In the 1980s he had borrowed the original desktop computing metaphor from PARC to design the Lisa and then the Macintosh. A little more than a decade later, he would be the first to successfully translate Xerox’s ubiquitous computing concept for a broad consumer audience: the iPod, first released in October of 2001, was a music player reconceptualized for the ubiquitous computing world, and the iPhone was a digital transformation of the telephone. Jobs also understood that while Clippy and Bob had been tone-deaf on the desktop, a simulated human assistant made complete sense on a mobile phone.
Shneiderman, however, continued to believe that he had won the debate handily and that the issue of software agents had been put to bed.