J. Craig Venter, James Watson, and Michael Hunkapiller Race for the Human Genome - The Human Side of Science: Edison and Tesla, Watson and Crick, and Other Personal Stories behind Science's Big Ideas (2016)

Used with permission from Sidney Harris.

What in the world is a genome? In 1920, Hans Winkler, botany professor at the University of Hamburg, Germany, combined the terms gene and chromosome into one, calling them jointly the genome. The study of genetics in biology percolated right along. Gregor Mendel's work from the 1800s was rediscovered in 1900. It was clear that an organism's traits depended on the prior generation's characteristics through what were called genes and chromosomes.

But what are genes and chromosomes, and by what mechanisms are traits passed to succeeding generations? The answers to these questions came into sharp focus just after World War II, when biology experienced a series of visitations from other fields. They weren't quite strong enough to be called invasions, but their effect on biology was stronger than mere interactions. A related example of this phenomenon would be the reaction of the field of geology to meteorologist/climatologist Alfred Wegener's theory of continental drift, which we saw in chapter 8.

Visitation 1 was accomplished by physicists. That was what chapter 13 was all about. Physics-educated Crick, Wilkins, Franklin, and Randall identified the structure of DNA using X-ray crystallography techniques borrowed from physics. (James Watson was the only participant with degrees in biology.)

Another set of visitors were the biochemists. (Actually, both biophysics and biochemistry had existed prior to the DNA discovery, but their popularity and credibility soared enormously after DNA became a hot topic.) Even before the structure of DNA was worked out, biochemists had learned that most of the workings of living organisms are accomplished by proteins. With the discovery of DNA, it became clear that the organism's plan was contained in its DNA, then transferred to its RNA (ribonucleic acid), which, unlike DNA, ventures out of the cell's nucleus and helps build the necessary proteins. So, a chromosome is a collection of genes, and each gene is a segment of DNA.

DECIPHERING THE GENOME

And now we can finally see just what constitutes a genome. An organism's genome is its complete set of DNA, including all its chromosomes, which in turn contain all its genes. That seems like a large package of information, and it is. The genome contains all the information necessary to build that organism and allow it to function, and it includes input for the organism's next generation. To compare the genomes of different organisms, we need to delve into the innards of DNA.

Model of DNA molecule. From Wikimedia Commons, user Spiffistan.

DNA's two helical strands are joined by what are called nucleotide bases. The strands are linked by only four bases: cytosine (C), guanine (G), adenine (A), and thymine (T). These bases join together only in certain pairings: C with G, and A with T. These base pairs form the rungs of the DNA ladder, connecting the two sugar-phosphate backbones that run along the outside of the molecule. The details of how the DNA molecule functions are truly fascinating. You are encouraged to consult the To Dig Deeper section in the back matter if you want more information.
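The pairing rule (C with G, A with T) means that one strand fully determines its partner. A minimal sketch of that idea follows; the function name and the example sequence are made up for illustration and do not come from any real genome.

```python
# Pairing rules as stated in the text: C pairs with G, and A pairs with T.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return, position by position, the base that pairs with each base of `strand`."""
    return "".join(PAIRING[base] for base in strand.upper())


example = "GATTACA"  # hypothetical sequence, for illustration only
partner = complementary_strand(example)

# Print the two strands as a ladder, one base pair (rung) per line.
for top, bottom in zip(example, partner):
    print(f"{top}-{bottom}")

print(len(example), "base pairs")
```

Applying the function twice returns the original strand, which is one way to see that each strand carries the full information of the pair.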

Now we are in a position to compare the genomes of different organisms, in terms of the number of base pairs in the genome.

Organism                 Number of base pairs in the genome   Comment
Mycoplasma genitalium    580,000                              Smallest true organism
E. coli K-12             4,639,000                            Bacterium often studied in labs
S. cerevisiae            12,500,000                           Brewer's yeast
D. melanogaster          123,000,000                          Fruit fly
Mus musculus             2.8 billion                          Mouse
Homo sapiens             3.3 billion                          Human
Picea abies              19.2 billion                         Norway spruce

To put these numbers of base pairs into perspective, if you counted one base pair per second, it would take you over one hundred years to enumerate the human genome.
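The counting estimate can be checked with a few lines of arithmetic. This is only a sketch: the genome sizes are copied from the table above, and the 365.25-day year and the helper name years_to_count are assumptions made for the illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # about 31.6 million seconds (assumed year length)

# Genome sizes in base pairs, copied from the table in the text.
GENOME_SIZES = {
    "Mycoplasma genitalium": 580_000,
    "E. coli K-12": 4_639_000,
    "S. cerevisiae": 12_500_000,
    "D. melanogaster": 123_000_000,
    "Mus musculus": 2_800_000_000,
    "Homo sapiens": 3_300_000_000,
    "Picea abies": 19_200_000_000,
}


def years_to_count(base_pairs: float, pairs_per_second: float = 1.0) -> float:
    """Years needed to count `base_pairs` at a steady rate, with no breaks."""
    return base_pairs / pairs_per_second / SECONDS_PER_YEAR


for organism, size in GENOME_SIZES.items():
    print(f"{organism:24s} {years_to_count(size):10.2f} years")
```

At one base pair per second, the human figure comes out between 104 and 105 years, consistent with "over one hundred years."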

In the 1970s, clever methods were developed to read the DNA sequence, but they were not very fast and were quite expensive. A biology graduate student might spend several years to sequence a several-thousand-base-pair portion of a genome. Biology's instruments improved, however, and by the mid-1980s, the lure of sequencing the entire human genome was enormous.

The total human sequence is the grail of human genetics…an incomparable tool for the investigation of every aspect of human function.

—Walter Gilbert (1980 Nobel Laureate)1

Seeking the grail conjures up the image of many knights, each following their own arduous path, seeking an extremely significant object. True. The human genome took all that and more. It required a whole new set of visitors to biology—instrument makers, venture capitalists, information technologists, politicians, Big Science executives, and one talented but extremely impatient fellow: J. Craig Venter. Let's start with him.

Used with permission from Sidney Harris.

John Craig Venter was born in 1946 in Salt Lake City, Utah. Shortly thereafter, the family moved to California, where Craig loved boats and surfing but not school; he earned lots of Cs and Ds. Although he opposed the Vietnam War, he wound up as a corpsman in a field hospital's intensive care unit. Perhaps motivated by the wounded, maimed, and dying marines, he decided to study medicine when he got out. He began his studies at the College of San Mateo community college in California, then transferred to the University of California, San Diego, where he earned a BS in biochemistry in 1972 and a PhD in physiology and pharmacology in 1975.

J. Craig Venter (1946-). From Wikimedia Commons, user Calliopejen.

Venter's first academic position was at the State University of New York at Buffalo, where his major research interest was in the protein that acted as the brain's receptor for adrenaline. Eventually, it became clear to him that he didn't have access to enough of the tools of molecular biology to make the progress he wanted, progress that included finding the DNA sequence of the particular protein he was researching. Even with the collaboration of Caltech's biology professor Leroy (Lee) Hood and his postdoctoral student Michael Hunkapiller, progress was too slow for Venter.

OUT OF THE FRYING PAN…

In late 1983, the answer sought Venter. He was recruited by the National Institutes of Health (NIH) in Bethesda, Maryland, and became chief of the Receptor Biochemistry and Molecular Biology Section, National Institute of Neurological Disorders and Stroke (NINDS). After about a year of using traditional methods, Venter's team had sequenced the genetic code for the first human brain receptor protein. Its results were written up in a paper published in a series of letters sponsored by the Federation of the Societies of Biochemistry and Molecular Biology (FEBS), but Venter was frustrated by the slow pace of the sequencing.2 To his surprise, there was an article in a recent issue of Nature by Lee Hood's group at Caltech about a new machine that performed automated DNA sequencing. Michael Hunkapiller had joined a biotech company named Applied Biosystems (ABI), which was about to market their brand-new sequencer. Venter ordered one of their first machines, the ABI 373A. It was delivered in February 1987 and installed in his own office where he helped work out the bugs. A sixteen-hour run of the new machine analyzed a sample that would have taken a week the old way. Venter was undoubtedly impressed by the speed, but there was still a long way to go to make a complete analysis of adrenaline.

THE LURE OF THE HUMAN GENOME

Venter paid attention to the discussion that swirled through the biotech world about human genome mapping. He read the arguments about whether a human genome mapping project should be government-funded, pursued at research institutions, or investigated by private industry, and whether it should be researched by individual countries or coordinated by an international consortium. Although some weighed in with the notion that it would require all biology graduate students’ efforts for the next fifteen years and cost three billion dollars, Venter thought it would be a worthwhile effort. Venter convinced his boss, Ernst Freese, and Freese's boss, Irwin Kopin, that he had something to contribute. Part of Venter's lab was converted to the NINDS DNA-sequencing facility. With the rapid purchase of three additional sequencers, Venter's lab became the largest DNA-sequencing facility in the world.

THE BIG TIME

In early 1988, the Ad Hoc Advisory Committee on Complex Genomes met and announced the appointment of James Watson (yes, the James Watson) as the associate director of the newly formed Office of Human Genome Research.3 Watson had his plate full with guiding the setup of such a huge operation, but he must have been delighted to meet Venter, who provided a genome-sequencing capability in NIH's own backyard. At their first meeting, Venter told Watson he needed a few million dollars to sequence a single chromosome, the X. Watson told Venter that amount was insufficient, and to write up a proposal asking for five million dollars.

BUREAUCRACY TANGLES

Intense political pressure from several directions snarled Watson's resolve, however well intentioned. Venter's proposal eventually underwent four revisions over a period of two years and was never funded. The overall project became the Center for Human Genome Research4 and officially began its operation in October 1990. It was set up to be a worldwide effort, with the majority of the work being done by various governmental facilities and universities in the United States, and about a third in the United Kingdom, France, Germany, and Japan.

IMPATIENCE, AGAIN

Venter was too impatient to wait for the federal bureaucracy to grind its way to funding his project. He got an idea about how to shortcut the process. To find the active genes in a particular cell type, he first extracted the RNA from the cell. Since RNA is built according to the plan contained in the DNA, it contains the nucleotide base sequence from active portions (genes) of the original DNA. The RNA could then be converted to more stable DNA (called complementary DNA, cDNA) and attached to a bacterial chromosome for storage, using cut-and-paste techniques that rely on restriction enzymes, special molecules that sever DNA at known locations. Complementary DNA was a standard resource in molecular biology labs all around the world, so its availability was ensured. Next, the cDNA would be sequenced and compared to other sequenced genes. This idea, called expressed sequence tags (EST), was not original to Venter. It was first published by Paul Schimmel, professor of chemical biology at Scripps Institute, in 1983, and was used extensively by the renowned geneticist Sydney Brenner of the Laboratory of Molecular Biology at Cambridge and others in the late 1980s. But, thanks to Venter's ABI sequencer and workstations, no one else could match the sequencing capability of his lab.

In June 1991, Venter reported in Science that by sequencing ESTs he had identified about 330 genes active in the human brain. In one stroke, Venter had identified and sequenced more than 10 percent of the existing world total of known human genes—all in a matter of months. In his customary direct fashion, Venter pointed out that “improvements in DNA sequencing technologies have now made feasible essentially complete screening of the expressed gene complement of an organism.”5

The immediate negative reaction of some biologists was further fueled by Venter's next paper, published in Nature. There, he reported another 2,375 human genes expressed in the brain—double the number of genes sequenced by the rest of the scientific community at the time. A major worry was that the cDNAs sequenced by the EST procedure of Venter would be funded as a cheaper alternative to sequencing the entire human genome. This approach would miss the subtleties of gene regulation, switching, and control because binding sites for activators and repressors would not be sequenced.

PERILS OF PATENTS

A cause of additional trouble was the patenting of ESTs. The NIH Office of Technology Transfer approached Venter, asking him his intention about patents on the large amount of gene-sequencing data his lab was generating. Venter's limited experience with patents when he researched at the University of Buffalo was that they delayed the publication of scientific information, and so he was definitely not thrilled. Intellectual property had become a hot-button issue in biotechnology, and the law was unclear on ESTs. The Office of Technology Transfer's idea was to apply for a patent, then sort out the details. Venter agreed, but on two conditions: first, the research results would be published anyway, and second, the decision to file would be brought to the attention of top management, especially Watson, to make sure this was the right approach. So, a patent application was filed for the first 330-plus genes prior to Venter's first publication in Science, and 2,421 more genes were added to the application before the Nature article was published—all in Venter's name. The furor arose quickly and never abated. French research minister, Hubert Curien, said that “a patent should not be granted for something that is part of our universal heritage.”6

BLOWUP

Less than a month after the Nature article, there was a Senate hearing of the Committee on Energy and Natural Resources. Venter described his EST research and raised concerns about the patent efforts, which had not been publicly disclosed. The room was suddenly filled with shouts from Watson, who said it was “sheer lunacy” to file such patents and that he was “horrified” because “virtually any monkey” could use the EST method. Venter was shocked that Watson had held him responsible for what Watson later referred to as “Venter patents,” when the idea had originated in the Technology Transfer Office.7

But a Washington Post reporter, Larry Thompson, identified his impression of the real combatants: Watson and NIH director Bernadine Healy.

Bernadine Healy (1944-2011), Senator Barbara Mikulski, and Dr. Vivian Pinn. From Wikimedia Commons, user Hildabast.

Bernadine Healy (1944-2011) had distinguished educational credentials from Vassar College and Harvard Medical School and had completed her internship and residency in cardiology at Johns Hopkins University. Eventually, she was appointed by President Ronald Reagan to the position of deputy director of the White House Office of Science and Technology Policy. During this tenure, she was involved in a 1985 spat with Watson, in which he complained about genetic technologies regulations by claiming that in the White House, “the person in charge of biology is either a woman or unimportant. They had to put a woman someplace.”8

Subsequently, Healy was director of the Research Institute at the Cleveland Clinic Foundation when President George H. W. Bush tapped her in 1991 to become director of the NIH, its first woman head. As NIH director, Healy was now Watson's boss. She believed the patent application to be appropriate and dismissed Watson's objections as “a tempest in a teapot.” She instructed Watson not to criticize Venter in public and asked Venter to consult with her on human genome research. Watson resigned in April 1992 (by fax), calling his position “untenable.” Meanwhile, Venter had applied for ten million dollars to expand his sequencing operation. His proposal was vigorously rejected by NIH peer review.9

NEXT VISITORS: VENTURE CAPITALISTS

Wallace Steinberg, head of the HealthCare Investment Corporation and inventor of the Reach toothbrush, put up seventy million dollars to lure Venter into the private sector. This venture was called The Institute for Genomic Research (TIGR). Venter bit. He resigned from NIH in July 1992. Starting with thirty ABI 373A Sequencers, seventeen ABI Catalyst workstations, and a Sun SPARCenter 2000 computer with relational database software, Venter set out to increase EST-sequencing production and to sequence genes from model organisms as well. At a cost of one hundred thousand dollars per machine, deep pockets were needed to fund the venture, but Venter, once funded, would have free rein to pursue his sequencing ideas. Human Genome Sciences (HGS) was set up as a sister company to explore commercial aspects of genomic research. Venter was delighted. He proclaimed, “It's every scientist's dream to have a benefactor invest in their ideas, dreams and capabilities.”10 The only catch was that HGS would have six to twelve months to review Venter's data before publication. His scientific colleagues were decidedly less enthusiastic. Some even referred to him as “Darth Venter.”11

Meanwhile, back at the public consortium, a new director for the National Center for Human Genome Research was announced. The well-respected University of Michigan medical geneticist Francis Collins became the center's second director. As work continued, the consortium posted some impressive results. In 1996, the sequence of the complete genome of brewer's yeast (Saccharomyces cerevisiae) was finished. This single-celled organism contains six thousand genes constructed out of twelve million nucleotide base pairs in its DNA. Over a hundred laboratories, located in Europe, the United States, Canada, and Japan, completed the sequence. Yet, as the halfway point of the scheduled time for the Human Genome Project approached, less than 3 percent of the genome had been sequenced and the public consortium's costs were running way over budget. Francis Collins appealed for more speed and more novel and productive ideas, but only slight progress resulted.

SHOTGUN SEQUENCING

While the public consortium tried to speed up its progress, Venter's lab, TIGR, tried a totally new tactic: shotgun sequencing. Johns Hopkins University researcher Hamilton Smith, who had discovered restriction enzymes almost twenty years earlier, had a radical idea. First, shear the DNA into thousands of random-sized pieces using sound waves, then sequence the pieces individually using the ABI machines. Store all the sequence data in a computer and let specially written software find overlaps so the pieces could be stitched together on the mathematical basis of pattern recognition to form one contiguous DNA sequence. The technique seemed to work in simulations, and J. Craig Venter didn't shy away from the gamble. TIGR sequenced the entire genome of the bacterium Haemophilus influenzae within thirteen months, for less than half the Human Genome Project sequencing cost. In short order, TIGR then completed the sequence of Mycoplasma genitalium, the smallest free-living organism known, as well as several simpler genomes. Venter's reputation soared among his fellow researchers as more valuable sequencing information was made available for study.12
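The stitching step can be illustrated with a toy program. What follows is a minimal sketch of one simple strategy, greedy overlap merging, and not the software TIGR actually wrote. The function names, the minimum overlap of three bases, and the fragments are all assumptions made for the example. Real assembly software must also cope with sequencing errors, repeated stretches of DNA, and fragments read from either strand, none of which is handled here.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the two fragments that share the largest overlap."""
    unique = list(dict.fromkeys(fragments))  # drop duplicates, keep order
    # Drop fragments that lie wholly inside another fragment.
    frags = [f for f in unique if not any(f != g and f in g for g in unique)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more: several separate pieces remain
        merged = frags[best_i] + frags[best_j][best_k:]
        # Remove the two merged fragments and anything now contained in the result.
        frags = [f for f in frags if f not in merged]
        frags.append(merged)
    return frags


# Hypothetical "reads" cut from a made-up 14-base sequence, in scrambled order.
reads = ["GCATCGA", "AGCTTAGG", "TAGGCATC"]
print(greedy_assemble(reads))
```

With these three reads the sketch rebuilds the single sequence AGCTTAGGCATCGA. The greedy choice can go wrong when a genome contains repeats, which is one reason scaling the idea from bacteria to the human genome was a gamble.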

The shotgun sequencing technique worked—for bacteria—but still wasn't fast enough for the Human Genome Project to finish in time. That was soon to change. Late in 1997, the relationship between Venter's TIGR and its sister company Human Genome Sciences unraveled completely. Although HGS still owed TIGR thirty-eight million dollars, Venter released them from the obligation. This action bought Venter the freedom to release sequencing information faster, without a delay for HGS review.13

But Venter had even bigger plans, which revolved around the talents of Mike Hunkapiller. Since developing the original sequencing machine, the ABI 373A, with Leroy Hood in the late 1980s, Hunkapiller had not only made a number of improvements; he had instituted a substantially changed process. The earlier technique involved running DNA fragments down lanes through a gel for a portion of the separation. Now, Hunkapiller had developed a method in which the DNA was sent down thin, liquid-filled capillary tubes. With many more lanes available in a single run as well as other speed-enhancing improvements, the new machine, the ABI PRISM 3700, was about eight times faster than existing machines. After showing Venter the prototype, Hunkapiller popped the question: Would Venter team up with him to sequence the entire human genome? After some initial skepticism, Venter agreed. Something new had to be done, because the techniques that worked so well on bacteria couldn't be applied directly to the thousandfold-larger human genome.

Venter relished the challenge. After some initial consultation (more like a warning) with Human Genome Project director Francis Collins, Venter announced the formation of his new company. Its main goal: sequence the entire human genome and accomplish it within three years, substantially sooner than the HGP schedule. His new company's name: Celera, from the Latin celeris, meaning “swift.” The company's motto: “Speed matters. Discovery can't wait.”14

Venter had done it again. The scientific world was sent spinning, but this time Venter's solid record of accomplishment made the critics much more circumspect. Maybe he could do it. After all, from Venter's standpoint, it was quite a risk. He had a barely tested prototype sequencing machine and no computer software because the old methods wouldn't work on the new genome. For his next move, Venter opened the door to biology's newest visitors: computer programmers, whom Venter called algorithm scientists. Like the other visitors discussed earlier, some computer programmers became biology's permanent house guests and were rewarded with a new name for their field: bioinformatics. Stitching together overlapping sequences of nucleotide base pairs to create a whole genome was a significant computing problem, but Venter's massive investment in high-end computing equipment and expertise paid off. His team wrote a program that seemed to work.

As a test, Venter sequenced biology's favorite model organism, Drosophila melanogaster, the fruit fly. Both the machines and algorithms functioned well. The fly's DNA, with 165 million nucleotide base pairs and 13,600 genes, was sequenced in less than four months, just in time to burn it onto CD-ROMs that graced every seat at a scientific meeting held the day before the release of the genome paper in Science.15

The Human Genome Project wasn't sitting idle through all of Venter's maneuvering. With increased funding from many sources, especially the Wellcome Trust in the United Kingdom, the consortium bought new sequencing machines (some from ABI, and some from Michael Hunkapiller's competitors) and stepped up their efforts, revising the timetables accordingly. If a race was on, so be it.

Although the competing parties spoke periodically, tensions were inflamed mercilessly by the communications media, especially given Venter's direct manner and Collins's personable but firm style. As the end of the “race” neared, news of the rough edges of the two groups’ relationship reached all the way to the White House. President Bill Clinton told his science adviser, Neal Lane, to “fix it…make those guys work together.”16

J. Craig Venter, President Clinton (1946-), and Francis Collins (1950-). Courtesy AP/Worldwide Photos.

The job fell to Ari Patrinos, the Department of Energy's genome director. Patrinos had known both Venter and Collins socially, so in May 2000, he invited both of them to his Rockville, Maryland, townhouse for pizza and beer. They came, perhaps grudgingly, but reached an agreement to work together. The completion of the genome sequence was announced on June 26, 2000, the only day available on President Clinton's calendar. In a satellite linkup with UK prime minister Tony Blair, President Clinton said, “Modern science has confirmed what we first learned from ancient faiths. The most important fact of life on this earth is our common humanity.”17

In the press conference that followed, Venter apologized for the absence of Mike Hunkapiller, explaining that he had contracted chicken pox. Venter said that if Hunkapiller had attended, he would have had to sit on the public consortium side.

THE RACE CONTINUES

Despite the media hype, the race to sequence the human genome was actually a race to a new starting line. To restate the problem: DNA has the plan for an organism's complete function. But before the function can be carried out, the plan must be transcribed into RNA, which in turn is translated into proteins, which then go on to build the cell's structure and help carry out its functions.
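The flow from DNA to RNA to protein can be sketched in a few lines. This is an illustration under stated assumptions, not a complete model: the input is taken to be the coding strand of a gene (so transcription amounts to replacing T with U), the codon table holds only the handful of entries the example needs out of the full set of 64, and the gene itself is made up.

```python
# Partial codon table: only the codons used in the example below (assumption).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "UGG": "Trp", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(coding_dna: str) -> str:
    """DNA to messenger RNA, treating the input as the coding strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the RNA three bases at a time and look up each amino acid."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError for codons outside the partial table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


gene = "ATGTTTGGCGCTTGGAAATAA"  # hypothetical gene, for illustration only
mrna = transcribe(gene)
print(mrna)
print("-".join(translate(mrna)))
```

Everything that happens to a protein after translation (folding, interactions, attached sugars or methyl groups) lies outside this sketch, which is part of why the genome sequence was only a starting line.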

Proteins are the molecules that actually do the work of sustaining life. The genome tells RNA what proteins to build, but variations occur (proteins fold, interact, get sugars or methyl groups attached, etc.) before they actually carry out their multiple missions, eventually producing traits. Now you can see why referring to the human genome as the grail may be too simplistic. Cataloging the complete collection of proteins, called the proteome, would be a logical but extremely difficult next step. Roy Whitfield, CEO of Incyte, one of the leading biotech companies, said, “I would describe it as the beginning of thousands of races. If you have colon cancer, the race is about curing colon cancer. If you have arthritis, it's a race to cure arthritis. It's the start of a really long race to have a tremendous impact on human health.”18