Here Is a Human Being
Authors: Misha Angrist
The investors behind 23andMe saw the Journal article. “This is why your company has to be here,” they told Linda Avey and Anne Wojcicki. “To give people access.”
68
(And presumably to fill a market niche.)
Dietrich Stephan conceded that there was another issue, the same one he had confronted when trying to move complex genetic tests into clinical practice. “At the time [2006] we had absolutely no understanding of how to communicate probabilistic risk based on SNPs of very low effect size and what SNPs met the threshold for being ‘real.’” In other words, while a particular genetic marker might appear in ALS patients more than it does in controls, perhaps even a lot more, that was just the beginning; there might be ten, twenty, or fifty such markers. And even if there weren’t, an association between marker and disease was still a long way from understanding the significance of that marker, what it meant in the context of other DNA markers and/or the environment, and how it might be used to develop therapies. “We still don’t know [what’s ‘real’] for ALS. It’s actually one of the difficult problems that prompted the formation of Navigenics.”
69
One could argue that Knome, too, arose from similar circumstances: demand for a service that was not yet available. As soon as news of the PGP became public, George Church began to get requests from wealthy people who were prepared to have him sequence their entire genomes on a fee-for-service basis. While he was encouraged by the fact that people were interested, it got to be a headache. “I felt this would be distracting from our academic mission both for my lab and for the PGP, both of which are active in nonprofit research operations. This seemed to me to be a textbook case for starting a company: to get it out of my hair. I thought it would be a good way of calling people’s bluffs and making sure they actually did want a whole-genome sequence.”
70
Church founded Knome (he pronounces it “Know me”; the CEO says “Nome”) in 2007; it began enrolling customers at the end of the year. For $350,000, you could get your entire genome sequenced—all 6 billion base pairs, five to ten times over.
71
When I mentioned it to my frugal wife she raised her eyebrows. “You’re getting quite a discount,” she said.
What do these origin stories mean? Commercial personal genomics was brought to term via multiple paths. For 23andMe, starting a company was a way to circumvent the inadequacies of publicly funded biomedical research and bring a “holistic view of genomics” to the masses, that is, genetics and self-knowledge in the form of social networking and ancestry. For Navigenics it was a way to make complex medical genetic risk information available to eager (and presumably well-heeled) consumers. For George, starting Knome was simply a way to make commercial personal genomics go away, to segregate it cleanly from his academic and nonprofit enterprises.
None of this is to say that the principals, with the possible exception of George, were not interested in making money—I am not that naïve. But there had to be easier ways to make money. And to lump the top-tier personal genomics companies in with Internet-based vitamin supplement salesmen and other varieties of modern snake oil or late-night infomercial fare was both facile and unfair. This was commerce, yes, but it was also rebellion. And it was dubious commerce, at least initially. While the VC dollars continued to flow, by 2010 no one had gotten rich selling personal genomic services to the public. Indeed, some had already lost their shirts.
In the near term, the elephant in the room would remain determining what all of this information meant. But what was “near term”? When discussing the PGP with friend and Broad Institute geneticist Stacey Gabriel, I said without thinking, “I’m not so worried about interpretation of the sequence. We have the rest of our lives to do that.”
“That’s good,” she said. “Because it’s going to take that long.”
72
Stacey’s colleague Pardis Sabeti conceded that we were still in the very early days of this stuff, but insisted that that was not the point. “It’s unavoidable. This knowledge will be accessible and people will access it.”
73
Statistical geneticist Tara Matise was more agnostic. She had arranged to have her father get his APOE genotype, since Alzheimer’s was his biggest concern. As for herself, she could afford it but was in no hurry. “My family is pretty healthy, luckily, and I have not made time to think about how many surprises I want.”
74
I was still many, many months away from getting the full sequence of my twenty thousand genes, but a few weeks before the Navigenics soirée, Jason Bobe sent me an email: “ … here is your snp data. I’ve taken a look at it and I’m sorry to report that it’s pretty much all junk DNA.”
75
Finally—something to peruse! But jeez Louise, the raw data file went on for days. I would need help to get through it. And I would get it. But even then, of the half million SNPs George’s lab typed me for, I would take a hard look at fewer than three hundred.
And even that slight peek inside of Pandora’s box was more than enough to blind me.
*
Another source of difference can be found in much bigger stretches of DNA that vary in how many times they are present in an individual. You, for example, may carry five copies of a million-base-pair stretch of DNA on chromosome 17 while your next-door neighbor may have seven copies of the same region. These recently discovered bits are known as copy-number variants (CNVs). In 2008, the personal genomics company Navigenics typed customers for approximately 1 million CNVs.
*
“Greg Mendel” was later “outed” as 23andMe cofounder Linda Avey’s husband.
†
This was true—even the companies themselves disagreed as to the extent that race and ethnicity had an impact on individual risks. deCODEme tended to discount ethnicity while Navigenics viewed it as very important. Why does this matter? Genetics is all about context; genes do not operate in a vacuum. You and I, for example, may both have the same gene variant that affects the way we metabolize vitamin D. But if I live in Greenland and you live in sunny West Africa, that variant may have much different effects on our resistance to melanoma, our skin pigmentation, etc. Furthermore, if your ancestors have lived somewhere for thousands of years, you have likely inherited a whole mess of gene variants that are of particular relevance to survival in the local climate.
*
In many ways, GWAS have been a disappointment: we’ve found a lot of disease genes, but they are weak—they don’t explain very much of the various diseases and that makes it hard to use that information in the clinic. If Crohn’s disease is caused by twenty genes, how do we design a drug that targets twenty proteins?
*
A notable exception was the Gene Sherpa (http://thegenesherpa.blogspot.com/), run by a clinical genetics fellow, Steve Murphy. Murphy was engaged in his own high-risk entrepreneurial pursuit, the first freestanding genomic medicine practice. From the first he was a vocal critic of personal genomics companies.
*
CLIA stands for Clinical Laboratory Improvement Amendments of 1988. It is the mechanism by which the Centers for Medicare & Medicaid Services regulate clinical laboratory testing of human specimens. There are some 189,000 CLIA-certified labs, most in the United States. CLIA is meant to ensure accuracy, timeliness, and reliability of lab test results. Some people—like me—are not convinced it does that, at least with respect to genetic testing.
Even though it was early February, with temperatures in the sixties and a gusty wind buffeting the Gulf Coast, the lobby of the Marco Island Marriott Beach Resort still smelled like coconut. And not cheap suntan-lotion coconut, mind you, but the actual coconut one would pull off a tree and crack open with a hammer—sweet but not overpowering; an upscale pheromone evocative of drinks with umbrellas, bikinis, white sand, and turquoise surf beckoning from just twenty yards away. Flip-flopped and sunglassed tourists waddled in from the heated pool and assorted Jacuzzis.
Sprinkled among them were a few hundred academic genome scientists and their industrial counterparts—that is, representatives from a select group of companies who cater to university molecular biology types. The nerds and the suits both expected to be pampered, and the opening night reception of the Advances in Genome Biology and Technology meeting suggested that they would not be disappointed. In the dark, spread out in the lush grass next to the pool, there was a tiki motif at work: palm trees, torches stuck in the ground, a band warbling Jimmy Buffett covers in the background, various meats on and off the bone, a bevy of other fresh food stations, and an open bar stocked with beer and wine and rum. The staff-to-attendee ratio was high: a festive-shirt-wearing person was always ready to help, to pour coffee or clear away one’s plate.
In part the lavishness came off almost as an act of defiance. With the memory of the 2005 hurricane season and the malevolent sisters Rita and Katrina already receding after a couple of years, this bit of Gulf Coast seemed intent on behaving as though it were oblivious to its precarious location, let alone to whatever curveballs La Niña and climate change might hurl its way. The Marco Island beach was, as ever, dotted with wealthy pensioners moving in and out of their space-age, Bauhaus-on-steroids condos.
At the meeting itself, the opulence was subsidized. When I checked in, the clerk at the front desk handed me two nondescript key cards inside a Marriott envelope. I asked where registration for the genomics meeting was and he stopped short for a moment and then pulled back the envelope before I could pick it up. “I am going to give you different keys, sir,” he announced. The new ones were emblazoned with a golden double helix and Applied Biosystems, Inc.’s catchphrase for its newest DNA sequencing machine: “The next generation is SOLiD™.” (SOLiD™, as I would be reminded on many occasions over the next three days, stands for “sequencing by oligonucleotide ligation and detection.”) I felt as though I had been given a backstage pass to a rock concert. But that was only the beginning: ABI and a handful of other sequencing companies had underwritten the coffee breaks, the meals, the poster sessions, and the boatloads of goodies stuffed inside our complimentary High Sierra backpacks—the pens, the lab notebook, the candy, the beach balls, the digital timer. Plenty of swag for my daughters.
DNA sequencing has only existed since the 1970s and neither of the two original methods was ever patented.
1
So how did it become a multibillion-dollar bonanza? Until the last couple of years, the overwhelming majority of DNA sequencing was based on a principle developed by the unassuming English biochemist Fred Sanger, earning him the second of his two Nobel Prizes in 1980 (the first, in 1958, was for figuring out how to deduce the sequence of amino acids that make up proteins, the end products whose identities are embedded in the DNA code). Around the same time, a chemically based method of DNA sequencing was developed by George’s mentor, Walter Gilbert, and Gilbert’s student, Allan Maxam. Both methods were labor-intensive in the beginning; Sanger’s method was easier to automate and eventually overtook Maxam and Gilbert’s. And even though one could not generate much sequence in a single experiment in the early days, Gilbert said that something had shifted. “In 1975 Allan and I and Fred made sequencing a [laboratory] staple. We changed the problem from impossible to pretty easy.”
2
Sanger’s DNA sequencing method, which seems to me no less ingenious now than it did when I first learned about it in the 1980s, exploited the same enzyme our cells use to manufacture DNA: DNA polymerase (most enzymes bear the suffix -ase). Essentially, “Sanger sequencing” involves putting DNA polymerase in a tube along with the DNA one wants to sequence, plus DNA building blocks, or deoxynucleotides, each of which contains one of the four DNA bases, adenine (A), thymine (T), guanine (G), and cytosine (C). The “deoxys” are the raw material that the polymerase enzyme uses to extend the DNA chain. But the key to the method was that Sanger also added a small quantity of slightly altered versions of the bases. These “dideoxy” versions could also be added to the growing chain the polymerase is churning out, but each one is a dead end—the chain cannot be extended from a dideoxynucleotide: imagine a section of railroad track with a bumper at one end preventing the track from being elongated. Thus, after the enzyme does its work, the test tube is full of DNA strands of various lengths, each one capped at a random place by a chain-ending dideoxy A, T, G, or C. If one could resolve those chains by size, Sanger reasoned, then it should become possible to read the sequence of a DNA molecule from one end to the other: each nucleotide like the rung of a ladder.
3
But how to resolve the different-sized molecules? For nearly two decades the method of choice was a gel made of a thin layer of a latex-like chemical called polyacrylamide. The principle is simple: when exposed to an electric field, shorter DNA molecules migrate through a polyacrylamide gel faster than longer ones. Thus, by loading the contents of the test tube full of different-sized DNA fragments into a vertical gel and cranking up the voltage, one could get an overlapping ladder of DNA molecules and read it in order. Automated Sanger sequencing yields reads of eight hundred bases per run, sometimes more (the protein-coding portion of a typical human gene is about two thousand bases long).
4
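The chain-termination idea and the gel readout described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not a model of real chemistry: the function names, the 10 percent stopping probability, the molecule count, and the example strand are all invented for the sketch, and the code tracks the newly synthesized strand rather than the template it is copied from.

```python
import random

def sanger_fragments(synthesized, n_molecules=5000, p_stop=0.1, seed=0):
    """Simulate chain termination (toy model, all parameters hypothetical).

    Returns a set of (fragment_length, terminal_base) pairs, one pair for
    each distinct fragment length that appears in the simulated tube.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for position, base in enumerate(synthesized, start=1):
            # Occasionally the polymerase adds a dideoxy version of the
            # base, which is a dead end: the chain stops at this length.
            if rng.random() < p_stop:
                fragments.add((position, base))
                break
        # Chains that never pick up a dideoxy base carry no terminator
        # and are simply ignored in this sketch.
    return fragments

def read_ladder(fragments):
    """Order fragments shortest first, as the gel does, and read the
    terminal base of each rung to spell out the sequence."""
    return "".join(base for _, base in sorted(fragments))

strand = "GATTACAGGCTA"  # made-up example sequence
print(read_ladder(sanger_fragments(strand)))
```

With enough simulated molecules, every fragment length from one base up to the full strand shows up at least once, so reading the rungs in size order recovers the strand. If a length were missing, the read would simply skip that base, which is one way to picture why a faint or absent band makes a real gel hard to read.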
This was all well and good—exciting even: Sanger sequencing meant that the genomic Rosetta Stone could now be sounded out, even if it wasn’t clear what the words meant. For me, reading a clean piece of DNA sequence that might be harboring a disease-causing mutation was one of the thrills of my graduate student experience. But the setup—the “workflow,” as the corporate sales reps call it—was decidedly less thrilling. Pouring gels between pairs of glass plates, starting over when they developed bubbles, letting them solidify (“polymerize”), loading them with extreme care, disassembling them several hours later, transferring them to paper and exposing the paper to film, and then reading the sequence by hand … all of it got to be a drag. When I was working on my master’s in the late 1980s, my adviser used to walk through his human genetics lab and insist to his technicians, students, and postdocs, “This is not a factory.” Methinks he protested too much: We had an assembly line where we performed nearly identical experiments examining genetic markers, running gels, and seeing what came of them. Day after laborious day, each one divided from the next only by lots of beer and a little bit of sleep. It wasn’t quite assembling widgets, but I’d argue it was every bit a factory. And so it became with sequencing and me: I seemed to be most successful when I got into a kind of Zen state and didn’t overthink things. (For me, this was difficult—without strong medication of one kind or another, I am a pretty lousy Buddhist.)
Whatever its sweatshop-like qualities, the public and private versions of the Human Genome Project initially used more or less the same assembly-line Sanger approach. Several of the major sequencing centers hired dozens of people whose only job was to pour gels; they would often come in to work in the middle of the night. Variations of this workflow led to the sequence of a composite human genome ahead of schedule and under budget.
*
Indeed, within a few years automated capillary DNA sequencing, spurred mainly by demand from the HGP, produced yet another revelation. Suddenly there was no gel; it had been replaced by capillary tubes into which the four sequencing reactions were injected. By the late 1990s, there were two commercial capillary platforms. Molecular Dynamics (later Amersham and eventually GE Healthcare) offered the MegaBACE beginning in 1998. In December of that year, Applied Biosystems (then PerkinElmer) began shipping its PRISM 3700, which was eventually succeeded by the 3730.
5
By then the game was afoot: the major taxpayer-funded public sequencing centers had had a fire lit under them by the upstart Craig Venter, the iconoclastic public face of a private initiative that wound up sequencing the human genome in parallel to—and in competition with—the government-funded Human Genome Project. With Venter’s heretical commercial entry into human genome sequencing and ambitious plans to annotate and sell the information, there was a huge, ready-made market for whichever sequencing platform could walk the walk.
6
And what of the face-off between the MegaBACE and the 3700? “It was a real pissing contest for eight or ten months,” recalled Steve Lombardi, who managed ABI’s sequencing operations in the Americas at the time. “But the 3700 was a better instrument. Better chemistry.”
7
And more savvy marketing. Well before Molecular Dynamics did, ABI saw an opportunity to arm both sides of the fight with its new sequencer. For his part, Venter ordered 230 of the 3700s. On the public side, Eric Lander’s genome center at MIT ordered another 115. “By early 2000,” Lombardi told me, “the game was over.”
8
For the next seven years, Sanger sequencing, as embodied by the ABI machine, remained an entrenched technology in hundreds of academic sequencing facilities and biotech companies. Even as late as 2009, Sanger-based DNA sequencing was still very much a growth business—oodles of cash were spent every year on half-million-dollar instruments and the various chemical reagents needed to keep them churning out the endless stream of A’s, G’s, T’s, and C’s, the simple digital DNA alphabet that encodes the tens of thousands of proteins that comprise life on earth.
9
But after the completion of the HGP … then what? What would we do with sequencing technology? Now that we’d completed our “moon shot,” was there something useful that could be done with the leftover launchpads and rockets? Both Craig Venter and his rival Francis Collins encouraged the technologists to continue to bring down the cost of sequencing dramatically—from the average HGP cost of $1 per finished base (itself down from $10 per base in 1990) to six orders of magnitude less: the $1,000 genome. Of course, this meant that the old rocket fleet would have to be either souped up in a quantum way or else mothballed in favor of new machines.
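The "six orders of magnitude" figure can be checked with back-of-the-envelope arithmetic. The sketch below takes the $1 per finished base and the $1,000 target from the text; the genome size of roughly 3 billion bases (one haploid copy) is an assumption added here, and the variable names are made up for illustration.

```python
import math

# Assumption (not from the text): one haploid human genome, ~3 billion bases.
GENOME_BASES = 3_000_000_000
HGP_COST_PER_BASE = 1.00    # dollars per finished base, as quoted above
TARGET_GENOME_COST = 1_000  # dollars, the "$1,000 genome"

# Cost of one genome at the HGP-era price per base.
hgp_genome_cost = GENOME_BASES * HGP_COST_PER_BASE

# How many times cheaper the target is, and that ratio as a power of ten.
fold_reduction = hgp_genome_cost / TARGET_GENOME_COST
orders_of_magnitude = math.log10(fold_reduction)

# What the target implies per base.
target_cost_per_base = TARGET_GENOME_COST / GENOME_BASES

print(f"Genome at HGP prices: ${hgp_genome_cost:,.0f}")
print(f"Fold reduction needed: {fold_reduction:,.0f}")
print(f"Orders of magnitude: {orders_of_magnitude:.2f}")
print(f"Target cost per base: ${target_cost_per_base:.10f}")
```

Under that assumption the required drop is a few million-fold, which sits between six and seven powers of ten. Counting both copies of the genome (6 billion base pairs) doubles the starting cost but leaves the answer in the same range.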
Accordingly, in 2003 the J. Craig Venter Science Foundation announced that it was offering a onetime $500,000 Genomic Technology Prize intended to goad the research community into advancing automated sequencing to the point where a $1,000 genome was feasible.
10
In 2004, perhaps in part as a response to Venter, the National Human Genome Research Institute awarded $38 million in grants to develop novel sequencing technologies aimed at sequencing a “mammalian-sized” genome for $100,000 as an interim step on the path toward the $1,000 genome.
11
And in October 2006, the X Prize Foundation announced the launch of the Archon X Prize for Genomics,
12
a $10 million bounty to be awarded to the first private team to sequence one hundred complete human genomes to high accuracy in ten days—a $10,000 genome. At that time, George Church was on the rules committee for the X Prize and had all but ruled out competing for it himself. A year later not only had he decided to throw his hat into the ring using his polony technology, but with a trademark blend of confidence and modesty he predicted that his lab actually had a pretty good shot at winning. But why? What had changed?
“For one thing the rules have sharpened up,” he told me. We were driving through the Palo Alto rain after a dinner with several Stanford alums and their Asian backers, entrepreneurs who were starting yet another next-generation sequencing company, this one to be called LightSpeed Genomics. For George, the attraction of the X Prize was the incentive it offered to create an infrastructure able to do large amounts of whole-genome sequencing fast—the infrastructure needed to complete, say, the Personal Genome Project. “It would be amazing to be able to sequence a hundred genomes in ten days, but then what?” said George. “You’ll be all dressed up with nowhere to go. Wouldn’t it be nice to have a project waiting in the wings that could use that infrastructure and all those machines?”
13
George had also come to believe that the X Prize shared some of the Personal Genome Project’s educational mission, namely, to raise the profile of genomics in the public consciousness, although he thought the X Prize’s approach was “a bit more flamboyant” than the PGP’s. He may have had misgivings about that, but he was a realist: people liked prizes and competitions. They had a seemingly bottomless appetite for shows like Dancing with the Stars, America’s Next Top Model, and American Idol.
“I would run into people and talk about how the PGP was going to inform us about genetics and medicine,” George said, “and their eyes would glaze over. And then I would mention the X Prize and they would get all excited about it. Like, ‘Where can I put my money?’ And I thought, ‘Gee, this is kind of weird. There are actually people out there who would rather invest in a yacht for the America’s Cup than they would in helping to provide for vaccines in Africa.’ ”
14
Many of the fruits of each of these various incentive programs to lower sequencing costs, some more ripe than others, were on display at Marco Island. The race to succeed Sanger sequencing was on like Donkey Kong.
“You picked the right year to come,” said Ian Goodhead of the Sanger Institute, the British sequencing powerhouse funded by the Wellcome Trust charity. “Last year [2006] it was all 454. By the end of the meeting people were pounding their heads on the table hoping to never hear those three digits again.”
15