Machines of Loving Grace: The Quest for Common Ground Between Humans and Robots - John Markoff (2015)
Chapter 6. COLLABORATION
In a restored brick building in Boston, a humanoid figure turned its head. The robot was no more than an assemblage of plastic, motors, and wires, all topped by a movable flat LCD screen with a cartoon visage of eyes and eyebrows. Yet the distinctive motion elicited a small shock of recognition and empathy from an approaching human. Even in a stack of electronic boxes, sensors, and wires, the human mind has an uncanny ability to recognize the human form.
Meet Baxter, a robot designed to work alongside human workers, unveiled with some fanfare in the fall of 2012. Baxter is relatively ponderous and not particularly dexterous. Instead of moving around on wheels or legs, it sits in one place on a fixed stand. Its hands are pincers capable of delicately picking up objects and putting them down. It is capable of little else. Despite its limitations, however, Baxter represents a new chapter in robotics. It is one of the first examples of Andy Rubin’s credo that personal computers are sprouting legs and beginning to move around in the environment. Baxter is the progeny of Rodney Brooks, whose path to building helper robots traces directly from the founders of artificial intelligence.
McCarthy and Minsky went their separate ways in 1962, but the Stanford AI Laboratory where McCarthy settled attracted a ragtag crowd of hackers, a mirror image of the original MIT AI Lab remaining under Minsky’s guidance. In 1969 the two labs were electronically linked via the ARPAnet, a precursor of the modern Internet, thus making it simple for researchers to share information. It was the height of the Vietnam War and artificial intelligence and robotics were heavily funded by the military, but the SAIL ethos was closer to the countercultural style of San Francisco’s Fillmore Auditorium than it was to the Pentagon on the Potomac.
Hans Moravec, an eccentric young graduate student, was camping in the attic of SAIL while working on the Stanford Cart, an early four-wheeled mobile robot. A sauna had been installed in the basement, and psychodrama groups shared the lab space in the evenings. Available computer terminals displayed the message “Take me, I’m yours.” “The Prancing Pony”—a fictional wayfarer’s inn in Tolkien’s Lord of the Rings—was a mainframe-connected vending machine selling food suitable for discerning hackers. Visitors were greeted in a small lobby decorated with an ungainly “You Are Here” mural echoing the famous Saul Steinberg New Yorker cover depicting a relativistic view of the most important place in the United States. The SAIL map was based on a simple view of the laboratory and the Stanford campus, but lots of people had added their own perspectives to the map, ranging from placing the visitor at the center of the human brain to placing the laboratory near an obscure star somewhere out on the arm of an average-sized spiral galaxy.
It provided a captivating welcome for Rodney Brooks, another new Stanford graduate student. A math prodigy from Adelaide, Australia, raised by working-class parents, Brooks had grown up far from the can-do hacker culture in the United States. However, in 1969—along with millions of others around the world—he saw Kubrick’s 2001: A Space Odyssey. Like Jerry Kaplan, Brooks was not inspired to train to be an astronaut. He was instead seduced by HAL, the paranoid (or perhaps justifiably suspicious) AI.
Brooks puzzled over how he might create his own AI, and on arriving at college he had his first opportunity. On Sundays he had solo access to the school’s mainframe for the entire day. There, he created his own AI-oriented programming language and designed an interactive interface on the mainframe display.1 Brooks went on to write theorem proofs, thus unwittingly working in the formal, McCarthy-inspired artificial intelligence tradition. Building an artificial intelligence was what he wanted to do with his life.
Looking at a map of the United States, he concluded that Stanford was the closest university to Australia with an artificial intelligence graduate program and promptly applied. To his surprise, he was admitted. By the time of his arrival in the fall of 1977, the pulsating world of antiwar politics and the counterculture was beginning to wane in the Bay Area. Engelbart’s group at SRI had been spun off, with his NLS system augmentation technology going to a corporate time-sharing company. Personal computing, however, was just beginning to turn heads—and souls—on the Midpeninsula. This was the heyday of the Homebrew Computer Club, which held its first meeting in March of 1975, the very same week the new Xerox PARC building opened. In his usual inclusive spirit, McCarthy had invited the club to meet at his Stanford laboratory, but he remained skeptical about the idea of “personal computing.” McCarthy had been instrumental in pioneering the use of mainframe computers as shared resources, and in his mental calculus it was wasteful to own an underpowered computer that would sit idle most of the time. Indeed, McCarthy’s time-sharing ideas had developed from this desire to use computing systems more efficiently while conducting AI research. Perhaps in a display of wry humor, he placed a small note in the second Homebrew newsletter suggesting the formation of the “Bay Area Home Terminal Club,” chartered to provide shared access to a Digital Equipment Corp. VAX mainframe computer. He thought that seventy-five dollars a month, not including terminal hardware and communications connectivity costs, might be a reasonable fee. He later described PARC’s Alto/Dynabook design prototype—the template for all future personal computers—as “Xerox Heresies.”
Alan Kay, who would become one of the main heretics, passed through SAIL briefly during his time teaching at Stanford. He was already carrying his “interim” Dynabook around and happily showing it off: a wooden facsimile preceding laptops by more than a decade. Kay hated his time in McCarthy’s lab. He had a very different view of the role of computing, and his tenure at SAIL felt like working in the enemy’s camp.
Alan Kay had first envisioned the idea of personal computing while he was a graduate student under Ivan Sutherland at the University of Utah. Kay had seen Engelbart speak when the SRI researcher toured the country giving demonstrations of NLS, the software system that presaged the modern desktop PC’s windows-and-mouse environment. Kay was deeply influenced by Engelbart and NLS, and the latter’s emphasis on boosting the productivity of small groups of collaborators—be they scientists, researchers, engineers, or hackers. He would take Engelbart’s ideas a step further. Kay would reinvent the book for an interactive age. He wrote about the possibilities of “Personal Dynamic Media,” inspiring the look and feel of the portable computers and tablets we use today. Kay believed personal computers would become a new universal medium, as ubiquitous as the printed page was in the 1960s and 1970s.
Like Engelbart’s, Kay’s views were radically different from those held by McCarthy’s researchers at SAIL. The labs were not antithetical to each other, but there was a significant difference in emphasis. Kay, like Engelbart, put the human user at the center of his design. He wanted to build technologies to extend the intellectual reach of humans. He did, however, differ from Engelbart in his conception of cyberspace. Engelbart thought the intellectual relation between humans and information could be compared to driving a car; computer users would sail along an information highway. In contrast, Kay had internalized McLuhan’s insight that “the medium is the message.” Computing, he foresaw, would become a universal, overarching medium that would subsume speech, music, text, video, and communications.
Neither of those visions found traction at SAIL. Les Earnest, brought to SAIL by ARPA officials in 1965 to provide management skills that McCarthy lacked, has written that many of the computing technologies celebrated as coming out of SRI and PARC were simultaneously designed at SAIL. The difference was one of philosophy. SAIL’s mission statement had originally been to build a working artificial intelligence in the span of a decade—perhaps a robot that could match wits with a human while physically exceeding their strength, speed, and dexterity. Generations of SAIL researchers would work toward systems supplanting rather than supplementing humans.
When Rod Brooks arrived at Stanford in the fall of 1977, McCarthy was already three years overdue on his ten-year goal for creating a working AI. It had also been two years since Hans Moravec fired his first broadside at McCarthy, arguing that exponentially growing computing power was the baseline ingredient to consider in artificial intelligence systems development. Brooks, whose Australian outsider’s sensibility offered him a different perspective into the goings-on at Stanford, would become Moravec’s night-shift assistant. Both had their quirks. Moravec was living at SAIL around the clock and counted on friends to bring him groceries. Brooks, too, quickly adopted the countercultural style of the era. He had shoulder-length hair and experimented with a hacker lifestyle: he worked a “28-hour day,” which meant that he kept a 20-hour work cycle, followed by 8 hours of sleep. The core thrust of Brooks’s Ph.D. thesis, on symbolic reasoning about visual objects, followed in the footsteps of McCarthy. Beyond that, however, the Australian was able to pioneer the use of geometric reasoning in extracting a third dimension using only a single-lens camera. In the end, Brooks’s long nights with Moravec seeded his disaffection and break with the tradition of “good old-fashioned AI,” or GOFAI.
As Moravec’s sidekick, Brooks would also spend a good deal of time working on the Stanford Cart. In the mid-1970s, the mobile robot’s image recognition system took far too long to process its surroundings for anything deserving the name “real-time.” The Cart took anywhere from a quarter of an hour to four hours to compute the next stage of its assigned journey, depending on the mainframe computer load. After it processed one image, it would lurch forward a short distance and resume scanning.2 When the robot operated outdoors, it had even greater difficulty moving by itself. It turned out that moving shadows confused the vision recognition software of the robot. The complexity in moving shadows was an entrancing discovery for Brooks. He was aware of early experiments by W. Grey Walter, an American-born British neurophysiologist credited with the design of the first simple electronic autonomous robots in 1948 and 1949, intended to demonstrate how the interconnections in small collections of brain cells might cause autonomous behavior. Grey Walter had built several robotic “tortoises” that used a scanning phototube “eye” and a simple circuit controlling motors and wheels to exhibit “lifelike” movement.
While Moravec considered simple robots the baseline for his model of the evolution of artificial intelligence, Brooks wasn’t convinced. In Britain in the early fifties, Grey Walter had built surprisingly intelligent robots—a species zoologically named Machina speculatrix—costing a mere handful of British pounds. Now more than two decades later, “A robot relying on millions of dollars of equipment did not operate nearly as well,” Brooks observed. He noticed that many U.S. developers used Moravec’s sophisticated algorithms, but he wondered what they were using them for. “Were the internal models truly useless, or were they a down payment on better performance in future generations of the Cart?”3
After receiving his Ph.D. in 1981, Brooks left McCarthy’s “logic palace” for MIT. Here, in effect, he would turn the telescope around and peer through it from the other end. Brooks fleshed out his “bottom-up” approach to robotics in 1986. If the computing requirements for modeling human intelligence dwarfed the limits of human-engineered computers, he reasoned, why not build intelligent behavior as ensembles of simple behaviors that would eventually scale into more powerful symphonies of computing in robots as well as other AI applications? He argued that if AI researchers ever wanted to realize their goal of mimicking biological intelligence, they should start at the lowest level by building artificial insects. The approach precipitated a break with McCarthy and fomented a new wave in AI: Brooks argued in favor of a design that mimicked the simplest biological systems, rather than attempting to match the capability of humans. Since that time the bottom-up view has gradually come to dominate the world of artificial intelligence, ranging from Minsky’s The Society of Mind to the more recent work of electrical engineers such as Jeff Hawkins and Ray Kurzweil, who both have declared that the path to human-level AI is to be found by aggregating the simple algorithms they see underlying cognition in the human brain.
Brooks circulated his critique in a 1990 paper titled “Elephants Don’t Play Chess,”4 arguing that mainstream symbolic AI had failed during the previous thirty years and a new approach was necessary. “Nouvelle AI relies on the emergence of more global behavior from the interaction of smaller behavioral units. As with heuristics there is no a priori guarantee that this will always work,” he wrote. “However, careful design of the simple behaviors and their interactions can often produce systems with useful and interesting emergent properties.”5
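To make the idea concrete, here is a minimal sketch, in Python, of the kind of layered, behavior-based control Brooks was advocating. It is not Brooks’s code; the sensor names, behaviors, and priorities are invented for illustration. Each simple unit proposes an action, a fixed priority ordering lets higher layers override lower ones, and the robot’s overall behavior emerges from their interaction.

# A hypothetical sketch of behavior-based ("subsumption"-style) control.
# Each behavior maps sensor readings to a proposed action, or None when idle;
# the highest-priority behavior that speaks wins.

def avoid_obstacle(sensors):
    # Highest priority: turn away if something is too close.
    if sensors["range_cm"] < 20:
        return "turn_left"
    return None

def follow_light(sensors):
    # Middle priority: steer toward the brighter photocell, like Grey Walter's tortoises.
    if sensors["light_left"] > sensors["light_right"]:
        return "turn_left"
    if sensors["light_right"] > sensors["light_left"]:
        return "turn_right"
    return None

def wander(sensors):
    # Lowest priority: the default when nothing else fires.
    return "forward"

BEHAVIORS = [avoid_obstacle, follow_light, wander]  # ordered from highest to lowest priority

def control_step(sensors):
    for behavior in BEHAVIORS:
        action = behavior(sensors)
        if action is not None:
            return action

print(control_step({"range_cm": 100, "light_left": 3, "light_right": 7}))  # -> turn_right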
Brooks did not win over the AI establishment overnight. At roughly the same time that he started designing his robotic insects, Red Whittaker at Carnegie Mellon was envisioning walking on the surface of Mars with Ambler, a sixteen-foot-tall six-legged robot weighing 5,500 pounds. In contrast, Brooks’s Genghis robot was a hexapod weighing just over two pounds. Genghis became a poster child for the new style of AI: “fast, cheap, and out of control”—the title of a 1989 article that Brooks cowrote with his grad student Anita M. Flynn. Brooks and Flynn proposed that the most practical way to explore space was by sending out swarms of low-cost insect-like robots rather than deploying a monolithic, overengineered, and expensive system.
Predictably, NASA was initially dismissive of the idea of building robotic explorers that were “fast, cheap, and out of control.” When Brooks presented his ideas at the Jet Propulsion Laboratory, engineers who had been working on costly scientific instruments rejected the idea of a tiny inexpensive robot with limited capabilities. He was undeterred. In the late 1980s and early 1990s, Brooks’s ideas resonated with the design principles underpinning the Internet. A bottom-up ideology, with components assembling themselves into more powerful and complex systems, had captured the popular imagination. With two of his students, Brooks started a company and set out to sell investors on the idea of privately sending small robots into space, first to the moon and later to Mars.6 For $22 million, Brooks proposed, you would not only get your logo on a rover; you could also promote your company with media coverage of the launch. Movies, cartoons, toys, advertising in moondust, a theme park, and remote teleoperation—these were all part of one of the more extravagant marketing campaigns ever conceived. Brooks was aiming for a moon launch in 1990, the first one since 1978, and then planned to send another rocket to Mars just three years later. By 2010, the scheme called for sending micro-robots to Mars, Neptune, its moon Triton, and the asteroids.
What the plan lacked was a private rocket to carry the robots. The trio spoke with six private rocket launch companies, none of which at the time had made a successful launch. All Brooks needed was funding. He didn’t find any investors in the private sector, so the company pitched another space organization called the Ballistic Missile Defense Organization, which was the Pentagon agency previously tasked with building the Strategic Defense Initiative, a feared and ridiculed Star Wars-style missile defense shield. The project, however, had stalled after the fall of the Soviet Union. For a while, though, the BMDO considered competing with NASA by organizing its own moon launch. The MIT trio built a convincing moon launch rover prototype christened Grendel, intended to hitchhike to the moon aboard a converted “Brilliant Pebble,” the Star Wars launch vehicle originally created to destroy ICBMs by colliding with them in space. Grendel was built in accordance with Brooks’s bottom-up behavior approach, and it had a successful trial, but that was as far as it got.
The Pentagon’s missile division lost its turf war with NASA. The nation was unwilling to pay for two spacefaring organizations. Ironically enough, years later the developers of the NASA Sojourner, which landed on Mars in 1997, borrowed heavily from the ideas that Brooks had been proposing. Although Brooks never made it into space, a little more than a decade later his bottom-up approach found a commercial niche. iRobot, the successor to Brooks’s spacefaring company, found success selling an autonomous vacuum cleaner to the civilian market, while its ruggedized military robots toured the terrain of Afghanistan and Iraq sniffing out improvised explosive devices.
Eventually, Brooks would win the battle with the old guard. He found an audience for his ideas about robotics at MIT, and won accolades for what he liked to call nouvelle AI. A new generation of MIT grad students started following him and not Minsky. Nouvelle AI had a widespread impact beyond the United States and especially in Europe, where attention had shifted from the construction of human-level AIs to systems that would exhibit emergent behaviors in which more powerful or intelligent capabilities would be formed from the combination of many simpler ones.
Brooks’s own interests shifted away from autonomous insects and toward social interactions with humans. With graduate students, he began designing sociable robots. Robots like Cog and Kismet, designed with graduate student Cynthia Breazeal, were used to explore human-robot interaction as well as the capabilities of the robots themselves. In 2014 Breazeal announced that she planned to commercialize a home robot growing out of that original research. She has created a plucky Siri-style family companion that remains stationary on a kitchen counter, where it is intended to assist with a variety of household tasks.
In 2008, Brooks retired from the MIT AI Lab and started a low-profile company with a high-profile name, Heartland Robotics. The name evoked the problem Brooks was trying to solve: the disappearance of manufacturing from the United States as a consequence of lower overseas wages and production costs. As energy and transportation costs skyrocket, however, manufacturing robots offer a potential way to level the playing field between the United States and low-wage nations. For several years there were tantalizing rumors about what Brooks had in mind. He had been working on humanoid robots for almost a decade, but at that point the robotics industry hadn’t even managed to successfully commercialize toy humanoid robots, let alone robots capable of practical applications.
When Baxter was finally unveiled in 2012, Heartland had changed its name to Rethink Robotics, and the humanoid robot received mixed reviews. Not everyone understood or agreed with Brooks’s deliberate choice of approximating the human anatomy. Today many of his competitors sell robot arms that make no effort to mimic a human counterpart, opting for simplicity and function. Brooks, however, is undeterred. His intent is to build a robot that is ready to collaborate with rather than replace human workers. Baxter is one of a generation of robots intended to work in proximity to flesh-and-blood coworkers. The technical term for this capability is “compliance,” and there is widespread belief among roboticists that over the next half decade these machines will be widely used in manufacturing, distribution, and even retail positions. Baxter is designed to be programmed easily by nontechnical workers. To teach the robot a new repetitive task, humans only have to guide the robot’s arms through the requisite motions and Baxter will automatically memorize the routine. When the robot was introduced, Rethink Robotics demonstrated Baxter’s capability to slowly pick up items on a conveyor belt and place them in new locations. This seemed like a relatively limited contribution to the workplace, but Brooks argues that the system will develop a library of capabilities over time and will increase its speed as new versions of its software become available.
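The teaching workflow is simple enough to sketch. What follows is a rough, hypothetical Python illustration of the record-and-replay idea, not Rethink’s actual software: read_joint_angles and move_to_joint_angles are invented placeholders standing in for the robot’s real interfaces.

# Hypothetical sketch of "teach by guiding the arm": while a person moves the arm,
# the controller samples joint angles; the saved waypoints can later be replayed.
import time

def record_demonstration(read_joint_angles, duration_s=10.0, sample_hz=5.0):
    # Sample joint angles while a person physically guides the arm through a task.
    waypoints = []
    for _ in range(int(duration_s * sample_hz)):
        waypoints.append(read_joint_angles())  # e.g. {"shoulder": 0.4, "elbow": 1.1}
        time.sleep(1.0 / sample_hz)
    return waypoints

def replay(waypoints, move_to_joint_angles, sample_hz=5.0):
    # Drive the arm back through the recorded waypoints to repeat the routine.
    for target in waypoints:
        move_to_joint_angles(target)
        time.sleep(1.0 / sample_hz)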
Rodney Brooks rejected early artificial intelligence in favor of a new approach he described as “fast, cheap, and out of control.” Later he designed Baxter, an inexpensive manufacturing robot intended to work with, rather than replace, human workers. (Photo courtesy of Evan McGlinn/New York Times/Redux)
It is perhaps telling that one of Rethink’s early venture investors was Jeff Bezos, the chief executive of Amazon. Amazon has increasingly had problems with its nonunionized warehouse workers, who frequently complain about poor working conditions and low wages. When Amazon acquired Kiva Systems, Bezos signaled that he was intent on displacing as much human labor from his warehouses as possible. In modern consumer goods logistics there are two levels of distribution: storing and moving whole cases of goods, and retrieving individual products from those cases. The Kiva system consists of a fleet of mobile robots that are intended to save human workers the time of walking through the warehouse to gather individual products to be shipped together in a composite order. While the humans work in one location, the Kiva robots maneuver bins of individual items at just the right time in the shipping process, and the humans pick out and assemble the products to be shipped. Yet Kiva is clearly an interim solution toward the ultimate goal of building completely automated warehouses. Today’s automation systems cannot yet replace human hands and eyes. The ability to quickly recognize objects among dozens of possibilities and pick them up from different positions remains a uniquely human skill. But for how long? It doesn’t take much imagination to see Baxter, or a competitor from similar companies like Universal Robots or Kuka, working in an Amazon warehouse alongside teams of mobile Kiva robots. Such lights-out warehouses are clearly on the horizon, whether they are made possible by “friendly” robots like Baxter or by more impersonal systems like the ones Google’s new robotics division is allegedly designing.
As the debates over technology and jobs reemerged in 2012 in the United States, many people were eager to criticize Brooks’s Baxter and its humanoid design. Automation fears have ebbed and flowed in the United States for decades, but because Rethink built a robot in the human form, many assumed that Rethink was building machines that were meant to be—and now were—capable of replacing human labor. Today, Brooks argues vociferously that robots don’t simply kill jobs. Instead, by lowering the cost of manufacturing in the United States, robots will contribute to rebuilding the nation’s manufacturing base in new factories with jobs for more skilled, albeit perhaps fewer, workers.
The debate over humans and machines continues dogging Brooks wherever he travels. At the end of the school year in 2013, he spoke to the parents of graduating students at Brown University. The ideas in his Baxter pitch were straightforward and ones that he assumed would be palatable to an Ivy League audience. He was creating a generation of more intelligent tools for workers, he argued, and Baxter in particular was an example of the future of the factory floor, designed to be used and programmed by average workers. But the mother of one of the students would have none of it. She put up her hand and indignantly asked, “But what about jobs? Aren’t these robots going to take away jobs?” Patiently, Brooks explained himself again. This was about collaborating with workers, not replacing them outright. As of 2006 the United States was sending vast sums annually to China to pay for manufacturing there. That money could provide more jobs for U.S. workers, he pointed out. You’re just speaking locally, she retorted. What about globally? Brooks threw up his hands. China, he contended, needs robots even more than the United States does. Because of their demographics and particularly the one-child policy, the Chinese will soon face a shortage of manufacturing workers. The deeper point that Brooks felt he couldn’t get across to a group of highly educated upper-middle-class parents was that the repetitive jobs Baxter will destroy are not high-quality ones that should be preserved. When the Rethink engineers went into factories, they asked the workers whether they wanted their children to have similar jobs. “Not one of them said yes,” he noted.
In his office, Brooks keeps photos of the factory manufacturing line of Foxconn, the world’s largest contract maker of consumer electronics products. They are haunting evidence of the kinds of drudgery he wants Baxter to replace. Yet despite his obvious passion for automation and robotics, Brooks has remained more of a realist than many of his robotics and AI brethren in Silicon Valley. Although robots are indeed sprouting legs and moving around in the world among us, they are still very much machines, in Brooks’s view. Despite the deeply ingrained tendency of humans to interact with robots as if they have human qualities, Brooks believes that we have a long way to go before intelligent machines can realistically match humans. “I’ll know they have gotten there,” he said, “when my graduate students feel bad about switching off the robot.”
He likes to torment his longtime friend and MIT colleague Ray Kurzweil, who is now chartered to build a Google-scale artificially intelligent mega-machine, after having previously gained notoriety for an impassioned and detailed argument that immortality is within the reach of the current human generation through computing, AI, and extraordinary dietary supplements. “Ray, we are both going to die,” he has told Kurzweil. Brooks merely hopes that a future iteration of Baxter will be refined enough to provide his elder care when the day comes.
The idea that we may be on the verge of an economy running largely without human intervention or participation (or accidents) isn’t new. Almost all of the arguments in play today harken back to earlier disputes, which unfolded amid an eclectic mix of politics and technology. Lee Felsenstein is a product of that mix. He grew up in Philadelphia, the son of a mother who was an engineer and a father who was a commercial artist employed in a locomotive factory. Growing up in a tech-centric home was complicated by the fact that he was a “red diaper baby.” His father was a member of the U.S. Communist Party, committed enough to the cause that he named Lee’s brother Joe after Stalin.7 However, like many children of Party members, Lee wouldn’t learn that his parents were Communists until he was a young adult—he abruptly lost his summer college work-study position at Edwards Air Force Base, having failed a background investigation.
Lee’s family was secularly Jewish, and books and learning were an essential part of his childhood. This would mean that bits and pieces of Jewish culture found their way into Lee’s worldview. He grew up aware of the golem legend, Jewish lore that would come to influence his approach to the personal computing world that he in turn would help create. The idea of the golem can be dated back to the earliest days of Judaism. In the Torah, it connotes an unfinished human before God’s eyes. Later it came to represent an animated humanoid creature made from inanimate matter, usually dust, clay, or mud. The golem, animated using kabbalistic methods, would become a fully living, obedient, but only partially human creation of the holy or particularly blessed. In some versions of the tale, the golem is animated by putting a parchment in its mouth, not unlike programming using paper tape. The first modern robot in literature was conceived by Czech writer Karel Čapek in his play R. U. R. (Rossum’s Universal Robots) in 1921, and so the golem precedes it by several thousand years.
“Is there a warning for us today in this ancient fable?” wonders R. H. MacMillan, the author of Automation: Friend or Foe?, a 1956 caution about the dangers of computerization of the workplace. “The perils of unrestricted ‘push-button warfare’ are apparent enough, but I also believe that the rapidly increasing part that automatic devices are playing in the peace-time industrial life of all civilized countries will in time influence their economic life in a way that is equally profound.”8
Felsenstein’s interpretation of the golem fable was perhaps more optimistic than most. Influenced by Jewish folklore and the premonitions of Norbert Wiener, he was inspired to sketch his own vision for robotics. In Felsenstein’s worldview, when robots were sufficiently sophisticated, they would be neither servants nor masters, but human partners. It was a perspective in harmony with Engelbart’s augmentation ideas.
Felsenstein arrived in Berkeley roughly a decade after Engelbart had studied there as a graduate student in the fifties. Felsenstein became a student during the frenetic days of the Free Speech Movement. In 1973, as the Vietnam War wound down, he set out alongside a small collective of radicals to create a computing utility that offered the power of mainframe computers to the community. They found warehouse space in San Francisco and assembled a clever computing system from a cast-off SDS 940 mainframe discarded by Engelbart’s laboratory at Stanford Research Institute. To offer “computing power to the people,” they set up free terminals in public places in Berkeley and San Francisco, allowing for anonymous access.
Governed by a decidedly socialist outlook, the group eschewed the idea of personal computing. Computing should be a social and shared experience, Community Memory decreed. It was an idea before its time. Twelve years before AOL and The WELL were founded, and seven years before dial-up BBSes became popular, Community Memory’s innovators built and operated bulletin boards, social media, and electronic communities from other people’s cast-offs. The first version of the project lasted only until 1975.
Felsenstein had none of the anti-PC bias shared by both his radical friends and John McCarthy. Thus unencumbered, he became one of the pioneers of personal computing. Not only was Felsenstein one of the founding members of the Homebrew Computer Club, he also designed the Sol-20, an early hobbyist computer released in 1976, and followed it in 1981 with the Osborne 1, the first mass-produced portable computer. Indeed, Felsenstein had a broad view of the impact of computing on society. He had grown up in a household where Norbert Wiener’s The Human Use of Human Beings held a prominent place on the family bookshelf. His father had considered himself not merely a political radical but a modernist as well. Felsenstein would later write that his father, Jake, “was a modernist who believed in the perfectibility of man and the machine as the model for human society. In play with his children he would often imitate a steam locomotive in the same fashion other fathers would imitate animals.”9
The impact of technology had been a recurring topic of discussion in the Felsenstein household in the late fifties and the early sixties before Felsenstein left for college. The family talked over automation and the possibility of technological unemployment with great concern. Lee had even found and read a copy of Wiener’s God and Golem, Inc., published the year that Wiener unexpectedly died while visiting Stockholm and consisting mostly of his premonitions, both dire and enthusiastic, about the consequences of machines and automation on man and society. To Felsenstein, Wiener was a personal hero.
Despite his early interest in robots and computing, Felsenstein had never been enthralled with rule-based artificial intelligence. Learning about Engelbart’s intelligence amplification ideas would change the way he thought about computing. In the mid-1970s Engelbart’s ideas were in the air among computer hobbyists in Silicon Valley. Felsenstein, and a host of others like him, were dreaming about what they could do with their own computers. In 1977, the second year of the personal computer era, he listened to a friend and fellow computer hobbyist, Steve Dompier, talk about the way he wanted to use a computer. Dompier described a future user interface that would be designed like a flight simulator. The user would “fly” through a computer file structure, much in the way 3-D computer programs now simulate flying over virtual terrain.
Felsenstein’s thinking would follow in Dompier’s footsteps. He developed the idea of “play-based” interaction. Ultimately he extended the idea to both user interface design and robotics. Traditional robotics, Felsenstein decided, would lead to machines that would displace humans, but “golemics,” as he described it, using a term first introduced by Norbert Wiener, was the right relationship between human and machine. Wiener had used “golemic” to describe the pretechnological world. In his “The Golemic Approach,”10 Felsenstein presented a design philosophy for building automated machines in which the human user was incorporated into a system with a tight feedback loop between the machine and the human. In Felsenstein’s design, the human should retain a high level of skill to operate the system. It was a radically different approach compared to conventional robotics, in which human expertise was “canned” in the robot while the human remained passive.
For Felsenstein, the automobile was a good analogy for the ideal golemic device. Automobiles autonomously manage a good deal of their own functions—automatic transmission, braking, and these days advanced cruise control and lane keeping—but in the end people maintain control of their cars. The human, in NASA parlance, remains very much in the loop.
Felsenstein first published his ideas as a manifesto in the 1979 Proceedings of the West Coast Computer Faire. The computer hobbyist movement in the mid-1970s had found its home at this annual computer event, which was created and curated by a former math teacher and would-be hippie named Jim Warren. When Felsenstein articulated his ideas, the sixties had already ended, but he remained very much a utopian: “Given the application of the golemic outlook, we can look forward, I believe, to a society in which rather than bringing about the displacement of people from useful and rewarding work, machines will effect a blurring of the distinction between work and play.”11 Still, when Felsenstein wrote his essay in the late 1970s it was possible that the golem could evolve either as a collaborator or as a Frankenstein-like monster.
Although he was instrumental in elevating personal computing from its hobbyist roots into a huge industry, Felsenstein was largely forgotten until very recently. During the 1990s he had worked as a design engineer at Interval Research Corporation, and then set up a small consulting business just off California Avenue in Palo Alto, down the street from where Google’s robotics division is located today. Felsenstein held on to his political ideals and worked on a variety of engineering projects ranging from hearing aids to parapsychology research tools. He was hurled back onto the national stage in 2014 when he became a target for Evgeny Morozov, the sharp-penned intellectual from Belarus who specializes in quasi-academic takedowns of Internet highfliers and in exposing post-dot-com-era foibles. In a New Yorker essay12 aimed at what he found questionable about the generally benign and inclusive Maker Movement, Morozov zeroed in on Felsenstein’s Homebrew roots and utopian ideals as expressed in a 1995 oral history. In this interview, Felsenstein described how his father had introduced him to Tools for Conviviality by Ivan Illich, a radical ex-priest who had been an influential voice for the political Left in the 1960s and 1970s counterculture. Felsenstein had been attracted to Illich’s nondogmatic attitude toward technology, which contrasted “convivial,” human-centered technologies with “industrial” ones. Illich had written largely before the microprocessor decentralized computing, and he saw computers as tools for instituting and maintaining centralized, bureaucratic control. In contrast, he had seen how radio had been introduced into Central America and rapidly became a bottom-up technology that empowered, instead of oppressed, people. Felsenstein believed the same was potentially true for computing.13
Morozov wanted to prove that Felsenstein, and by extension the Maker Movement that carries on his legacy, are naive to believe that society could be transformed through tools alone. He wrote that “society is always in flux” and further that “the designer can’t predict how various political, social, and economic systems will come to blunt, augment, or redirect the power of the tool that is being designed.” The political answer, Morozov argued, should have been to transform the hacker movements into traditional political campaigns to capture transparency and democracy.
It is an impressive rant, but Morozov’s proposed solution was as ineffective as the straw man he set up and sought applause for tearing down. He focused on Steve Jobs’s genius in purportedly not caring whether the personal computing technology he was helping pioneer in the mid-1970s was open or not. He gave Jobs credit for seeing the computer as a powerful augmentation tool. However, Morozov entirely missed the codependency between Jobs the entrepreneur and Wozniak the designer and hacker. It might well be possible to have one without the other, but that wasn’t how Apple became so successful. By focusing on the idea that Illich was only interested in simple technologies that were within the reach of nontechnical users, Morozov rigged an argument so he would win.
However, the power of “convivial” technologies, which was Illich’s name for tools that are under individual control, remains a vitally important design point that is possibly even more relevant today. Evidence of this was apparent in an interaction between Felsenstein and Illich, when the radical scholar visited Berkeley in 1986. Upon meeting him, Illich mocked Felsenstein for trying to substitute communication using computers for direct communication. “Why do you want to go deet-deet-deet to talk to Pearl over there? Why don’t you just go talk to Pearl?” Illich asked.
Felsenstein responded: “What if I didn’t know that it was Pearl that I wanted to talk to?”
Illich stopped, thought, and said, “I see what you mean.”
To which Felsenstein replied: “So you see, maybe a bicycle society needs a computer.”
Felsenstein had convinced Illich that computer-mediated communication could create community even if it was not face-to-face. Given the rapid progress in robotics, Felsenstein and Illich’s insight about design and control is even more important today. In Felsenstein’s world, drudgery would be the province of machines and work would be transformed into play. As he described it in the context of his proposed “Tom Swift Terminal,”14 which was a hobbyist system that foreshadowed the first PCs, “if work is to become play, then tools must become toys.”
Today, Microsoft’s corporate campus is a sprawling set of interlocking walkways, buildings, sports fields, cafeterias, and parking garages dotted with fir trees. In some distinct ways it feels different from the Googleplex in Silicon Valley. There are no brightly colored bicycles, but the same cadres of young tech workers who could easily pass for college or even high school students amble around the campus.
When you approach the elevator in the lobby of Building 99, where the firm’s corporate research laboratories are housed, the door senses your presence and opens automatically. It feels like Star Trek: Captain Kirk never pushed a button either. The intelligent elevator is the brainchild of Eric Horvitz, a senior Microsoft research scientist and director of Microsoft’s Redmond Research Center. Horvitz is well known among AI researchers as one of the first generation of computer scientists to use statistical techniques to improve the performance of AI applications.
He, like many others, began with an intense interest in understanding how human minds work. He obtained a medical degree at Stanford during the 1980s, and soon immersed himself further in graduate-level neurobiology research. One night in the laboratory he was using a probe to record the activity of a single neuron in the brain of a rat. Horvitz was thrilled. It was a dark room and he had an oscilloscope and an audio speaker. As he listened to the neuron fire, he thought to himself, “I’m finally inside. I am somewhere in the midst of vertebrate thought.” At the same moment he realized that he had no idea what the firing actually suggested about the animal’s thought process. Glancing over toward his laboratory bench he noticed a recently introduced Apple IIe computer with its cover slid off to the side. His heart sank. He realized that he was taking a fundamentally wrong approach. What he was doing was no different from taking the same probe and randomly sticking it inside the computer in search of an understanding of the computer’s software.
He left medicine, shifted his course of study, and started taking cognitive psychology and computer science courses. He adopted Herbert Simon, the Carnegie Mellon cognitive scientist and AI pioneer, as an across-the-country mentor. He also became close to Judea Pearl, the UCLA computer science professor who had pioneered an approach to artificial intelligence breaking with the early logic- and rule-based approach, instead focusing on recognizing patterns by building nesting webs of probabilities. This approach is not conceptually far from the neural network ideas so harshly criticized by Minsky and Papert in the 1960s. As a result, during the 1980s at Stanford, Horvitz was outside the mainstream in computer science research. Many mainstream AI researchers thought his interest in probability theory was dated, a throwback to an earlier generation of “control theory” methods.
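Pearl’s “nesting webs of probabilities” became known as Bayesian networks. A toy example, worked in plain Python with invented numbers, gives the flavor: a small web of conditional probabilities ties rain, a lawn sprinkler, and wet grass together, and the network can then be asked how likely rain is once the grass is observed to be wet.

# Toy Bayesian-network query in the spirit of Pearl's approach; the numbers are invented.
P_RAIN = 0.2
P_SPRINKLER = 0.1
# P(grass is wet | sprinkler, rain), keyed by (sprinkler, rain)
P_WET = {(True, True): 0.99, (True, False): 0.9, (False, True): 0.8, (False, False): 0.0}

def p_rain_given_wet():
    wet_and_rain = wet = 0.0
    for rain in (True, False):
        for sprinkler in (True, False):
            p = ((P_RAIN if rain else 1 - P_RAIN)
                 * (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
                 * P_WET[(sprinkler, rain)])
            wet += p
            if rain:
                wet_and_rain += p
    return wet_and_rain / wet

print(round(p_rain_given_wet(), 3))  # about 0.695: wet grass lifts the chance of rain well above the 0.2 prior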
After he arrived at Microsoft Research in 1993, Horvitz was given a mandate to build a group to develop AI techniques to improve the company’s commercial products. Microsoft’s Office Assistant, a.k.a. Clippy, was first introduced in 1997 to help users master hard-to-use software, and it was largely a product of the work of Horvitz’s group at Microsoft Research. Unfortunately, it became known as a laughingstock failure in human-computer interaction design. It was so widely reviled that Microsoft’s thriller-style promotional video for Office 2010 featured Clippy’s gravestone, dead in 2004 at the age of seven.15
The failure of Clippy offered a unique window into the internal politics at Microsoft. Horvitz’s research group had pioneered the idea of an intelligent assistant, but Microsoft Research—and hence Horvitz’s group—was at that point almost entirely separate from Microsoft’s product development department. In 2005, after Microsoft had killed the Office Assistant technology, Steven Sinofsky, the veteran head of Office engineering, described the attitude toward the technology during program development: “The actual feature name used in the product is never what we named it during development—the Office Assistant was famously named TFC during development. The ‘C’ stood for clown. I will let your active imagination figure out what the TF stood for.”16 It was clear that the company’s software engineers had no respect for the idea of an intelligent assistant from the outset. Because Horvitz and his group couldn’t secure enough commitment from the product development group, Clippy fell by the wayside.
The original, more general concept of the intelligent office assistant, which Horvitz’s research group had described in a 1998 paper, was very different from what Microsoft later commercialized. The final shipping version of the assistant omitted software intelligence that would have prevented the assistant from constantly popping up on the screen with friendly advice. The constant intrusions drove many users to distraction and the feature was irreversibly—perhaps prematurely—rejected by Microsoft’s customers. However, the company chose not to publicly explain why the features required to make Clippy work well were left out. A graduate student once asked Horvitz about this after a public lecture, and his response was that the features had bloated Office 97 to such an extent that it would no longer fit on its intended distribution disk.17 (Before the Internet offered feature updates, leaving something out was the only practical option.)
Such are the politics of large corporations, but Horvitz would persist. Today, a helpful personal assistant—who resides inside a computer monitor—greets visitors to his fourth-floor glass-walled corner cubicle. The monitor is perched on a cart outside his office, and the display shows the cartoon head of someone who looks just like Max Headroom, the star of the British television series about a stuttering artificial intelligence that incorporated the dying memories of Edison Carter, an earnest investigative reporter. Today Horvitz’s computerized greeter can inform visitors of where he is, set up appointments, or suggest when he’ll next be available. It tracks almost a dozen aspects of Horvitz’s work life, including his location and how busy he is likely to be at any moment during the day.
Horvitz has remained focused on systems that augment humans. His researchers design applications that can monitor a conversation between a doctor and a patient, or other critical exchanges, offering support to eliminate potentially deadly misunderstandings. In another application, his research team maintains a book of morbid transcripts from plane crashes to map what can go wrong between pilots and air traffic control towers. The classic and tragic example of miscommunication between pilots and air traffic control is the Tenerife Airport disaster of 1977, during which two 747 jetliners were navigating a dense fog without ground radar and collided while one was taxiing and the other was taking off, killing 583 people.18 There is a moment in the transcript where two people attempt to speak at the same time, causing interference that renders a portion of the conversation unintelligible. One goal in the Horvitz lab is to develop ways to avoid these kinds of tragedies. When developers integrate machine learning and decision-making capabilities into AI systems, Horvitz believes that those systems will be able to reason about human conversations and then make judgments about what part of a problem people are best able to solve and what part should be filtered through machines. The ubiquitous availability of cheap computing and the Internet has made it easier for these systems to show results and gain traction, and there are already several examples of this kind of augmentation on the market today. As early as 2005, for example, two chess amateurs used a chess-playing software program to win a match against chess experts and individual chess-playing programs.
Horvitz is continuing to deepen the human-machine interaction by researching ways to couple machine learning and computerized decision-making with human intelligence. For example, his researchers have worked closely with the designers of the crowd-sourced citizen science tool called Galaxy Zoo, harnessing armies of human Web surfers to categorize images of galaxies. Crowd-sourced labor is becoming a significant resource in scientific research: professional scientists can enlist amateurs, who often need to do little more than play elaborate games that exploit human perception, to help map tricky problems like protein folding.19 In a number of documented cases teams of human players have exceeded the capability of some of the most powerful supercomputers.
By assembling ensembles of humans and machines and designating a specific research task for each group, scientists can create a powerful hybrid research team. The computers possess staggering image recognition capabilities and they can create tables of the hundreds of visual and analytic features for every galaxy currently observable by the world’s telescopes. That approach was inexpensive but did not yield perfect results. In the next version of the program, dubbed Galaxy Zoo 2, computers with machine-learning models would interpret the images of the galaxies in order to present accurate specimens to human classifiers, who could then catalog galaxies with much less effort than they had in the past. In yet another refinement, the system would add the ability to recognize the particular skills of different human participants and leverage them appropriately. Galaxy Zoo 2 was able to automatically categorize the problems it faced and knew which people could contribute to solving which problem most effectively.
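A schematic sketch of that kind of confidence-based routing, with invented names and thresholds rather than the real Galaxy Zoo pipeline: the machine keeps the classifications it is confident about and hands each ambiguous image to the volunteer whose track record best matches the category in question.

# Hypothetical machine/human task routing in the spirit of Galaxy Zoo 2.
# classify(image) stands in for a trained model returning (label, confidence);
# each volunteer carries per-category skill scores learned from past answers.
CONFIDENCE_THRESHOLD = 0.9

def route(images, classify, volunteers):
    machine_labels, human_queue = {}, []
    for image_id, image in images.items():
        label, confidence = classify(image)
        if confidence >= CONFIDENCE_THRESHOLD:
            machine_labels[image_id] = label  # the machine keeps the easy cases
        else:
            # send the ambiguous case to the volunteer most skilled at the model's best guess
            best = max(volunteers, key=lambda v: v["skill"].get(label, 0.0))
            human_queue.append((image_id, best["name"]))
    return machine_labels, human_queue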
At a TED talk in 2013, Horvitz showed the reaction of a Microsoft intern to her first encounter with his robotic greeter. He played a clip of the interaction from the point of view of the system, which tracked her face. The young woman approached the system and, when it told her that Eric was speaking with someone in his office and offered to put her on his calendar, she balked and declined the computer’s offer. “Wow, this is amazing,” she said under her breath, and then, anxious to end the conversation added, “Nice meeting you!” This was a good sign, Horvitz concluded, and he suggested that this type of interaction presages a world in which humans and machines are partners.
Conversational systems are gradually slipping into our daily interactions. Inevitably, the partnerships won’t always develop in the way we have anticipated. In December of 2013, the movie Her, a love story starring Joaquin Phoenix and the voice of Scarlett Johansson, became a sensation. Her was a science-fiction film set in some unspecified not-far-in-the-future Southern California, and it told the story of a lonely man falling in love with his operating system. This premise seemed entirely plausible to many people who saw it. By the end of 2013 millions of people around the globe already had several years of experience with Apple’s Siri, and there was a growing sense that “virtual agents” were making the transition from novelties to the mainstream.
Part of Her is also about the singularity, the idea that machine intelligence is accelerating at such a pace that it will eventually surpass human intelligence, become independent, and leave humans behind. Both Her and Transcendence, another singularity-obsessed science-fiction movie released the following spring, are most intriguing for the way they portray human-machine relationships. In Transcendence the human-computer interaction moves from pleasant to dark, and eventually a superintelligent machine destroys human civilization. In Her, ironically, the relationship between the man and his operating system disintegrates as the computer’s intelligence develops so quickly that, not satisfied even with thousands of simultaneous relationships, it transcends humanity and … departs.
This may be science fiction, but in the real world, this territory had become familiar to Liesl Capper almost a decade earlier. Capper, then the CEO of the Australian chatbot company My Cybertwin, was reviewing, with growing horror, logs from a service she had created called My Perfect Girlfriend. My Perfect Girlfriend was intended to be a familiar chatbot conversationalist that would show off the natural language technologies offered by Capper’s company. However, the experiment ran amok. As she read the transcripts from the website, Capper discovered that she had, in effect, become an operator of a digital brothel.
Chatbot technology, of course, dates back to Weizenbaum’s early experiments with his Eliza program. The rapid growth of computing technology threw into relief the question of the relationship between humans and machines. In Alone Together: Why We Expect More from Technology and Less from Each Other, MIT social scientist Sherry Turkle expresses discomfort with technologies that increase human interactions with machines at the expense of human-to-human contact. “I believe that sociable technology will always disappoint because it promises what it can’t deliver,” Turkle writes. “It promises friendship but can only deliver ‘performances.’ Do we really want to be in the business of manufacturing friends that will never be friends?”20 Social scientists have long described this phenomenon as the false sense of community—“pseudo-gemeinschaft”—and it is not limited to human-machine interactions. For example, a banking customer might value a relationship with a bank teller, even though it exists only in the context of a commercial transaction and such a relationship might be only a courteous, shallow acquaintanceship. Turkle also felt that the relationships she saw emerging between humans and robots in MIT research laboratories were not genuine. The machines were designed to express synthetic emotions only to provoke or elucidate specific human emotional responses.
Capper would eventually see these kinds of emotional—if not overtly sexual—exchanges in the interactions customers were having with her Perfect Girlfriend chatbots. A young businesswoman who had grown up in Zimbabwe, she had previously obtained a psychology degree and launched a business franchising early childhood development centers. Capper moved to Australia just in time to face the collapse of the dot-com bubble. There, she first tried her hand at search engines and developed Mooter, which personalized search results. Mooter, however, couldn’t hold its own against Google’s global dominance. Although her company would later go public in Australia, she left in 2005, along with her business partner, John Zakos, a bright Australian AI researcher enamored since his teenage years with the idea of building chatbots. Together they built My Cybertwin into a business selling FAQbot technology to banks, insurance companies, and other large firms. These bots would give website users relevant answers to their frequently asked questions about products and services. It proved to be a great way for companies to inexpensively offer personalized information to their customers, saving money by avoiding customer call center staffing and telephony costs. At the time, however, the technology was not yet mature. Though the company had some initial business success, My Cybertwin also had competitors, so Capper looked for ways to expand into new markets. They tried to turn My Cybertwin into a program that created a software avatar that would interact with other people over the Internet, even while its owner was offline. It was a powerful science-fiction-laced idea that yielded only moderately positive results.
Capper has remained equivocal about whether virtual assistants will take away human jobs. In interviews, she would note that virtual assistants don’t directly displace workers, focusing instead on the mundane work her Cybertwins do for many companies, which she argued freed up humans to do more complex and ultimately more satisfying work. At the same time, Zakos attended conferences, asserting that when companies ran A-B tests comparing the Cybertwins’ responses to text-based questions with those of humans in call centers, the Cybertwins outperformed the humans in customer satisfaction. They boasted that when they deployed a commercial system on the website of National Australia Bank, the country’s largest bank, more than 90 percent of visitors to the site believed that they were interacting with a human rather than a software program. In order to be convincing, conversational software on a bank website might need to answer about 150,000 different questions—a capability that is now easily within the range of computing and storage systems.
Despite their unwillingness to confront the human job-displacement question, the consequences of Capper and Zakos’s work are likely to be dramatic. Much of the growth of the U.S. white-collar workforce after World War II was driven by the rapid spread of communications networks: jobs for telemarketers, telephone operators, and technical and sales support staff all grew out of the infrastructure that companies built to connect customers with employees. Computerization transformed these occupations: call centers moved overseas and the first generation of automated switchboards replaced a good number of switchboard and telephone operators. Software companies like Nuance, the SRI spin-off that offers speaker-independent voice recognition, have begun to radically transform customer call centers and airline reservation systems. Despite consumers’ rejection of “voicemail hell,” systems like My Cybertwin and Nuance’s will soon put at risk jobs that involve interacting with customers via the telephone. The My Cybertwin conversational technology might not have been good enough to pass a full-on Turing test, but it was a step ahead of most of the chatbots available via the Internet at the time.
Capper believes deeply that we will soon live in a world in which virtual robots are routine human companions. She holds none of the philosophical reservations that plagued researchers like Weizenbaum and Turkle. She also had no problem conceptualizing the relationship between a human and a Cybertwin as a master-slave relationship.21 In 2007 she began to experiment with programs called My Perfect Boyfriend and My Perfect Girlfriend. Not surprisingly, there was substantially more traffic on the Girlfriend site, so she set up a paywall for premium parts of the service. Sure enough, 4 percent of the people—presumably mostly men—who had previously visited the site were willing to pay for the privilege of creating an online relationship. These people were told that there was nothing remotely human on the other end of the connection and that they were interacting with an algorithm that could only mimic a human partner. They were willing to pay for the service anyway, even though there was, even then, no shortage of “sex chat” websites with actual humans on the other end of the conversation.
Maybe that was the explanation. Early in the personal computer era, there was a successful text-adventure game publisher called Infocom whose marketing slogan was: “The best graphics are in your head.” Perhaps the freedom of interacting with a robot relaxed the mind precisely because there was no messy human at the other end of the line. Maybe it wasn’t about a human relationship at all, but more about having control and being the master. Or, perhaps, the slave.
Whatever the psychology underpinning the interactions, it freaked Capper out. She was seeing more of the human psyche than she had bargained for. And so, despite the fact that she had stumbled onto a nascent business, she backed away and shut down My Perfect Girlfriend in 2014. There must be a better way of building a business, she decided. It would turn out that Capper’s business sense was well timed. Apple’s embrace of Siri had transformed the market for virtual agents. The computing world no longer understood conversational systems as quirky novelties, but rather as a legitimate mainstream form of computer interaction. Even before shutting down My Perfect Girlfriend, Capper had realized that her business must expand to the United States if it was to succeed. She raised enough money, changed the company’s name from My Cybertwin to Cognea, and set up shop in both Silicon Valley and New York. In the spring of 2014, she sold her company to IBM. The giant computer firm followed its 1997 victory in chess over Garry Kasparov with a comparable publicity stunt in which one of its computers competed against two of the best human players of the TV quiz show Jeopardy! In 2011, the IBM Watson system triumphed over Brad Rutter and Ken Jennings. Many thought the win was evidence that AI technologies had exceeded human capabilities. The reality, however, was more nuanced. The human contestants could occasionally anticipate the brief window of time in which they could press the button and buzz in before Watson. In practice, Watson had an overwhelming mechanical advantage that had little to do with artificial intelligence. When it had a certain statistical confidence that it had the correct answer, Watson was able to press the button with unerring precision, timing its button press with much greater accuracy than its human competitors, literally giving the machine a winning hand.
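A back-of-the-envelope simulation, with invented latencies rather than measurements of Watson, makes the mechanical edge plain: a machine that buzzes with a fixed, tiny delay whenever its statistical confidence clears a threshold will almost always beat a human whose reaction time varies by tens of milliseconds.

# Toy buzzer-race simulation; the latencies and confidence figures are invented.
import random

def race(trials=10000, machine_delay_ms=8.0, human_mean_ms=150.0, human_sd_ms=40.0,
         machine_confidence=0.85, threshold=0.8):
    machine_wins = 0
    for _ in range(trials):
        human_delay = random.gauss(human_mean_ms, human_sd_ms)
        # the machine buzzes only when its confidence in the answer clears the threshold
        if machine_confidence >= threshold and machine_delay_ms < human_delay:
            machine_wins += 1
    return machine_wins / trials

print(race())  # close to 1.0: precise timing, not deeper "intelligence," wins the buzz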
The irony with regards to Watson’s ascendance is that IBM has historically portrayed itself as an augmentation company rather than a company that sought to replace humans. Going all the way back to the 1950s, when it terminated its first formal foray into AI research, IBM has been unwilling to advertise that the computers it sells often displace human workers.22 In the wake of its Watson victory, the company portrayed its achievement as a step toward augmenting human workers and stated that it planned to integrate Watson’s technology into the health-care field as an intellectual aid to doctors and nurses.
However, Watson was slow to take off as a physicians’ advisor, and the company has broadened its goal for the system. Today the Watson business group is developing applications that will inevitably displace human workers. Watson had originally been designed as a “question-answering” system, a step toward one of the fundamental goals of artificial intelligence. With Cognea, Watson gained the ability to carry on a conversation. How will Watson be used? The choice faced by IBM and its engineers is remarkable. Watson can serve as an intelligent assistant to any number of professionals, or it can replace them. At the dawn of artificial intelligence, IBM backed away from the field. What will the company do in the future?
Ken Jennings, the human Jeopardy! champion, saw the writing on the wall: “Just as factory jobs were eliminated in the 20th century by new assembly-line robots, Brad and I were the first knowledge-industry workers put out of work by the new generation of ‘thinking’ machines. ‘Quiz show contestant’ may be the first job made redundant by Watson, but I’m sure it won’t be the last.”23