Here Is a Human Being

Authors: Misha Angrist


As far as I could tell, 454 Life Sciences came into being as a result of genomic determinism. Its founder, scientist Jonathan Rothberg, spent a long dark night in 1999 when his newborn son was rushed to intensive care for an unknown ailment (the boy is healthy today). “Why can’t we just sequence his genome and know if everything is fine or not fine?” wondered Rothberg, presumably sure that a genome sequence would indeed be the definitive test. His son’s illness notwithstanding, that night spawned an idea. Rothberg had read about Intel’s new chip, which incorporated 44 million transistors. Transistors had long since displaced vacuum tubes.* Sanger sequencing technology, the molecular version of the vacuum tube, had hit a wall. The time had come, thought Rothberg, to develop DNA sequencing’s answer to the transistor.¹⁶

The 454 approach was based on three innovations that made it distinct from Sanger sequencing. One was to miniaturize everything: the smaller each DNA reaction is, the fewer chemical reagents it requires, the less space it takes up, and the cheaper it is to run. Rothberg was able to shrink the wells that held each DNA reaction such that four would fit on the end of a single hair.

The second novelty to be commercialized by 454 was to sequence the DNA in “massively parallel” fashion. Typical Sanger sequencing reactions can produce read lengths of eight hundred DNA letters or base pairs. Those are more than long enough to align and map back to the twenty-three pairs of human chromosomes, each of which is itself composed of tens of millions of base pairs. But the 454 method and several other post-Sanger DNA sequencing methods don’t use gels or capillaries to resolve DNA fragments of different sizes and therefore produce shorter reads.¹⁷ Imagine two jigsaw puzzles of the same dimensions and depicting the same image, one with a thousand pieces, the other with eight thousand. Obviously the second one is going to be more challenging to assemble. For this reason, other investigators had given up on so-called sequencing by synthesis, in which the DNA sequence is read as the A’s, G’s, T’s, and C’s are incorporated into the growing molecule instead of being read on a gel or a “trace” of a capillary reaction after the fact (more on that in a minute). In sequencing by synthesis, the sequencing reactions would tend to poop out after only a few dozen bases, yielding a jigsaw puzzle with too many pieces to make sense of. To overcome these short read lengths, 454 went the massively parallel route. As the company showed in the pages of Nature, each reaction might produce only one hundred bases, but if you sequenced four hundred thousand wells of DNA at a time, you could still get lots of overlap and lots of sequence; a computer could then do the heavy lifting to assemble the pieces in the correct order and align them to a reference genome.¹⁸ Within a couple of years, the company’s sequencer averaged 330 bases.¹⁹
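The arithmetic behind the massively parallel bet can be sketched in a few lines. The well count and read length below come from the passage above; the 4-million-base bacterial genome is an assumed stand-in for a small genome, the 3.2-billion-base figure is the usual size given for the human genome, and the function names are hypothetical. The sketch also assumes, generously, that every well yields one full-length read.

```python
def bases_per_run(wells: int, read_length: int) -> int:
    """Total bases from one run, assuming every well yields one full read."""
    return wells * read_length


def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average number of times each position in the genome is read."""
    return total_bases / genome_size


# Figures from the text: 400,000 wells, about 100 bases per reaction.
run = bases_per_run(wells=400_000, read_length=100)

bacterium = 4_000_000      # assumption: an illustrative small genome
human = 3_200_000_000      # commonly cited human genome size

print(run)                             # 40000000
print(fold_coverage(run, bacterium))   # 10.0
print(fold_coverage(run, human))       # 0.0125
```

Under these assumptions a single run reads a small genome about ten times over, which is where the overlap needed for assembly comes from, while the same run would touch only about 1 percent of a human genome.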

The other breakthrough of 454 was to use pyrophosphate-based sequencing, or pyrosequencing. Each time a nucleotide (an A, G, T, or C) was incorporated into the growing DNA strand, it would be accompanied by the release of a certain amount of pyrophosphate, a chemical that would provide the energy to stimulate a light-producing reaction. This reaction could be detected and quantified by a camera, making it easy to distinguish among the four bases.²⁰
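A toy model may make the light-reading step more concrete. It assumes what is generally true of pyrosequencing but is not spelled out above: the four nucleotides are washed over the wells one at a time in a fixed cycle, and the light from each wash scales with the number of bases incorporated. The flow order, the function names, and the 5 percent signal loss are all illustrative assumptions, not 454's actual parameters.

```python
FLOW_ORDER = "TACG"  # assumption: an illustrative cyclic nucleotide order


def flowgram(sequence: str, flow_order: str = FLOW_ORDER) -> list[tuple[str, int]]:
    """Ideal light signal per flow: the count of bases incorporated in that flow."""
    if set(sequence) - set(flow_order):
        raise ValueError("sequence contains a base that is never flowed")
    flows = []
    pos = 0
    flow_index = 0
    while pos < len(sequence):
        base = flow_order[flow_index % len(flow_order)]
        incorporated = 0
        # A run of identical bases is incorporated within a single flow.
        while pos < len(sequence) and sequence[pos] == base:
            incorporated += 1
            pos += 1
        flows.append((base, incorporated))
        flow_index += 1
    return flows


def call_bases(signals: list[tuple[str, float]]) -> str:
    """Turn each flow's signal into bases by rounding to the nearest integer."""
    return "".join(base * round(signal) for base, signal in signals)


template = "GC" + "A" * 11 + "T"
ideal = flowgram(template)
print(call_bases(ideal) == template)   # True

# Hypothetical 5 percent loss of light: single bases survive, the long run does not.
dimmed = [(base, signal * 0.95) for base, signal in ideal]
print(call_bases(dimmed).count("A"))   # 10
```

In this sketch a small proportional error is harmless for a single base but is enough to shave one base off a run of eleven, since the whole run has to be inferred from the brightness of one flash.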

In the early days Rothberg often spoke about 454 technology as a revolutionary phenomenon in a way that presaged the arrival of personal genomics and individual genome scans in 2007. “We democratize sequencing and enable everyone to have a sequencing machine on [his or her] bench,” he told Genetic Engineering News.²¹ “It’s completely analogous to personal computers displacing mainframes,” he said in Bio-IT World. “Now anyone can have their own genome center.”²²

Roche, which had subsidized 454 to the tune of $60 million in the early going, apparently agreed with Rothberg’s assessment. In March 2007, Roche bought 454 outright for $140 million.²³

But as the Sanger Institute’s Ian Goodhead suggested when he told me that 2007 was the year to show up at Marco Island, 454 was not destined to be the eight-hundred-pound, eight-hundred-base gorilla that Applied Biosystems had been in the sequencing world of the 1990s. Yes, 454 had had a two-year head start on other next-gen sequencing companies and its method was up to one hundred times faster than Sanger sequencing: decoding the complete genome of a bacterium took days on 454's Genome Sequencer 20 as opposed to a month or more with the Sanger-based ABI model 3730. But 454's GS 20 machine had its own limitations. Bacteria were small and they had small genomes: typically a few million bases on a single chromosome. The human genome, on the other hand, was 3.2 billion base pairs. Worse, the human genome was a mess, a largely uncharted galaxy where many of the planets couldn’t be distinguished from most of the others: some 50 percent of our DNA is known to be repetitive and thus difficult to sequence. In the early going, the 454 machine earned a reputation for struggling with homopolymers, that is, repetitive stretches of the same base. If a piece of human genomic DNA or a tumor sample had, say, a string of eleven A’s in a row, the GS 20 might mistakenly call it ten or twelve. That kind of error could really mess up an experiment. The always-tactful Broad Institute sequencing maven Chad Nusbaum initially wondered whether the first iterations of 454 were up to the task of a mammalian genome.²⁴ And 454 was also expensive vis-à-vis other next-generation sequencing technologies. At Marco Island I spoke to several people from university sequencing facilities who were willing to pay the up-front instrument costs of $500,000 for a GS 20, but worried that they couldn’t afford the $100,000 they’d have to lay out for every billion bases they sequenced. Of course the thought of a small university core lab ever sequencing a billion bases was itself brand-new; when the public Human Genome Project sequenced its billionth base in the late 1990s, it was cause for a champagne celebration and a PBS camera crew to film the event.²⁵

Eventually Rothberg turned his attention to making an affordable bench-top next-gen sequencer a reality. In 2010 he introduced the Ion Torrent machine at Marco Island, a cheap ($50,000) box that could still crank out 150 million base pairs per run.²⁶

One of the companies hoping to breathe down 454's neck in the race to supplant Sanger was Helicos BioSciences, a Cambridge, Massachusetts-based start-up tucked away in a Blade Runnerish industrial building near Kendall Square, within a few blocks of the Broad Institute. On the wall in the modest reception area was a slogan that was both mission statement and words to live by: “Focus. Teamwork. Safety First. Time Is Short.” This summed up the impossible cognitive dissonance that was biotech corporate culture in the early twenty-first century: concentrate on your job and excel at what you do, but be a team player and don’t rock the boat, be selfless, be careful, don’t rush, don’t cut corners. But most of all, hurry the hell up.

Helicos’s founder and CEO was Stan Lapidus, a bald, bespectacled engineer and self-described “gizmo guy” with a long history in biotech and molecular diagnostics and, perhaps for that reason, an unflappable air about him (he founded two publicly traded molecular diagnostics companies, Cytyc and Exact Sciences, and endured some bumpy times with both). “I know we will have start-up issues, I know we will have emotion. But I know we will overcome it and things will be fine,” he told me after a quarterly call with somewhat skeptical investment analysts.²⁷

For Lapidus, the idea to start a next-gen sequencing company came the old-fashioned way: by reading the literature. In a paper in Proceedings of the National Academy of Sciences,²⁸ Stanford’s Steve Quake described a method to sequence DNA not from an amalgam of enzymatically amplified DNA, but from a single molecule, a then-unprecedented approach. Being able to interrogate DNA directly without having to amplify it or otherwise manipulate it was a real milestone, even if the Quake team was able to read only a measly five bases. Lapidus devoured the Quake paper and saw the future. “It was an ‘aha’ moment,” he said. “If luck favors the prepared mind, then I had been loitering in this area for many years.”²⁹

Most DNA sequencing methods, including Sanger and more recent methods developed by 454 (bought by Roche), Solexa (swallowed by Illumina), ABI (Life Technologies), and George Church’s lab (polonies), required an amplification step. To generate enough DNA to detect and read, DNA fragments were amplified enzymatically in a process known as the polymerase chain reaction. PCR takes advantage of the DNA replication process first postulated by Watson and Crick in 1953, which was really their keenest bit of insight: the structure of DNA had to be such that copies of it could be made easily. And so it is: the double helix unzips, polymerase enzymes attach themselves to the single strands, and the copying process begins.³⁰ Today we use PCR to generate millions or billions of copies of any bit of DNA we want. To PCR-amplify a piece of DNA, one puts it in a tube with DNA polymerase enzyme (similar to the one that our own cells use to synthesize new DNA), bits of known sequence called primers to get the reaction going, and plenty of A’s, G’s, T’s, and C’s to serve as building blocks for the amplification process. The whole reaction is heated so that the two strands of the target DNA come apart, allowing the enzyme to attach and do its thing. An hour later, you can have more than a billion copies of what you started with, which makes it much easier to manipulate and analyze.³¹
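The “more than a billion copies” figure follows from simple doubling. As a rough sketch, assume each heating-and-copying cycle doubles the amount of target DNA; the efficiency parameter and function names here are hypothetical, and real reactions fall short of perfect doubling.

```python
import math


def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Copies after a number of cycles; efficiency 1.0 means perfect doubling."""
    return starting_copies * (1 + efficiency) ** cycles


def cycles_needed(target: float, starting_copies: int = 1, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches at least the target copy count."""
    return math.ceil(math.log(target / starting_copies) / math.log(1 + efficiency))


print(cycles_needed(1_000_000_000))   # 30
print(int(copies_after(30)))          # 1073741824
```

Thirty rounds of ideal doubling turn one molecule into a little over a billion, so a billion copies in about an hour implies cycles of roughly two minutes each under this simplification.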

PCR was an extraordinary innovation in the 1980s; all of a sudden one could do tons of experiments with tiny amounts of starting DNA. The tedium of cloning DNA into bacteria in order to amplify it could often be avoided. The FBI began using PCR to analyze DNA from extremely minute samples of blood, hair, skin, and semen obtained at crime scenes (remember the O.J. murder trial?). PCR’s inventor, the notoriously wacky surfer dude and HIV-denialist Kary Mullis, received the Nobel Prize in fairly short order.³²

But for all of its advantages, PCR was not necessarily ideal for large-scale sequencing, that is, for sequencing many samples and whole genomes. Rather, PCR was sometimes viewed as just another opportunity to introduce additional expense, manual labor, and human error. And even the process itself could be unpredictable. “Some DNA fragments just don’t PCR well, just as some pieces of DNA simply don’t clone well,” said former Church lab postdoc Jay Shendure, now at the University of Washington in Seattle. Although he was one of the architects behind polony sequencing, the Church lab’s PCR-based sequencing method, Shendure readily conceded that as far as large-scale sequencing goes, PCR could be “a pain in the ass.”³³

Stan Lapidus had arrived at the same conclusion independently. In the 1990s he had developed an early genetic screening test for colorectal cancer and had gotten swept up in the work of Johns Hopkins cancer genetics pioneer Bert Vogelstein. After reading through hundreds of Vogelstein’s papers, he began to wonder why cancer researchers didn’t do what seemed to be the obvious experiment. Cancer arises, we know, because cellular genomes become unstable and this leads to uncontrolled cell division. “If we believe cancer is a disease of altered DNA, then the real experiment is to sequence one thousand, ten thousand, or one hundred thousand tumors,” Lapidus said. “So why would a smart guy like Vogelstein not do this simple experiment? The answer was because he couldn’t.”³⁴ It was too expensive and too time-consuming, thanks in part to the necessity of having to set up and run so many PCR reactions. Lapidus had long wondered whether PCR could be circumvented altogether, thereby simplifying the laborious workflow demanded by large-scale sequencing.

“I knew it was a tractable problem,” he told me. “I couldn’t tract it myself, but I kept my eyes open.”³⁵ After reading Quake’s paper, he and the Broad Institute’s Eric Lander flew out to Stanford to visit Quake and license his technology. Quake grilled Lapidus, and Lapidus followed up by sending him a mathematical model of a problem Quake had been working on. “We spent about six weeks falling in love,” said Lapidus.³⁶ In another six weeks he had raised $27 million.³⁷ Lapidus then hired a raft of first-call chemists, engineers, and executives.

Among them was Tim Harris, an analytical chemist by training who cut his teeth at Bell Labs, where, he told me, “AT&T spent a hundred years alienating the engineers and scientists.” Wanting a new challenge, Harris moved from a field where people were trying to make new materials to one where that step was already done. “In biology, God’s already made all the stuff.” He joined a start-up called SEQ; his team developed an automated imager (“a nice scope,” he called it) that allowed drug developers to examine the effects on cells of thousands of compounds at once—a real time-saver. The success of the imager led to SEQ being sold to British life-sciences giant Amersham (now part of GE Healthcare), but Harris felt his new bosses never really “got” what it was they had bought. “We tried to teach Amersham how to make this thing we’d invented, but were met with limited success. I’ve found that technology doesn’t transfer, only people do. What was broken [at Amersham] I couldn’t fix.”

He emailed Stan Lapidus, whom he’d met years earlier and who shared his interest in biological measurements, and said he was leaving Amersham. Lapidus quickly wrote back and asked when he could come to Boston. Harris was one of Helicos’s first two hands-on, tech-oriented employees. “Single-molecule sequencing was hard. I thought there was a fifty percent chance we would fall on our faces,” he said of the company circa 2004. “Maybe Stan didn’t.”³⁸
