The Numbers Behind NUMB3RS

Authors: Keith Devlin

Faced with eyewitness evidence from a witness who has demonstrated that he is right 4 times out of 5, you might be inclined to think it was a blue taxi that the witness saw. You might even think that the odds in favor of it being a blue taxi were exactly 4 out of 5 (that is, a probability of 0.8), those being the odds in favor of the witness being correct on any one occasion.

Bayes' method shows that the facts are quite different. Based on the data supplied, the probability that the accident was caused by a blue taxi is only 4 out of 9, or 44 percent. That's right, the probability is less than half. It was more likely to have been a black taxi. Heaven help the owner of the blue taxi company if the jurors can't follow Bayesian reasoning!

What human intuition often ignores, but what Bayes' rule takes proper account of, is the 5 to 1 odds that any particular taxi in this town is black. Bayes' calculation proceeds as follows:

The “prior odds” of a taxi being black are 5 to 1 (75 black taxis versus 15 blue).
The likelihood of X=“the witness identifies the taxi as blue” is:
- 1 out of 5 (20%) if it is black
- 4 out of 5 (80%) if it is blue.
The recalculation of the odds of black versus blue goes like this: P(taxi was black given witness ID)/ P(taxi was blue given witness ID) =
(5 / 1) × (20% / 80%) = (5 × 20%) / (1 × 80%) = 1 / 0.8 = 5/4.

Thus Bayes' calculation indicates that, after the witness's testimony, the odds are 5 to 4 that the taxi was black.
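The arithmetic above can be checked with a few lines of Python. This is only a sketch of the odds form of Bayes' rule using the taxi figures from the text; the function and variable names are illustrative and do not come from the book.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_a, likelihood_b):
    """Odds form of Bayes' rule: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * Fraction(likelihood_a) / Fraction(likelihood_b)


# Figures from the taxi example: 75 black taxis versus 15 blue.
prior_black_vs_blue = Fraction(75, 15)   # 5 to 1
p_says_blue_if_black = Fraction(1, 5)    # witness is wrong 1 time in 5
p_says_blue_if_blue = Fraction(4, 5)     # witness is right 4 times in 5

odds = posterior_odds(prior_black_vs_blue, p_says_blue_if_black, p_says_blue_if_blue)
p_blue = 1 / (1 + odds)   # convert odds of black versus blue into P(blue)

print(odds)    # 5/4
print(p_blue)  # 4/9
```

Exact fractions are used instead of floating-point numbers so that the results appear as the same 5/4 and 4/9 quoted in the text.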

If this seems counterintuitive (as it does initially to some people) consider the following “thought experiment.” Send out each of the 90 taxis on successive nights and ask the witness to identify the color of each under the same conditions as before. When the 15 blue taxis are seen, 80% of the time they are described as blue, so we can expect 12 “blue sightings” and 3 “black sightings.” When the 75 black taxis go out, 20% of the time they are described as blue, so we can expect 15 “blue sightings” and 60 “black sightings.” Overall, we can expect 27 taxis will be described by the witness as “blue”, whereas only 12 of them actually were blue and 15 were black. The ratio of 12 to 15 is the same as 4 to 5; in other words, only 4 times out of every 9 (44 percent of the time) when the witness says he saw a blue taxi was the taxi really blue.
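The thought experiment can be tallied directly. The following is a minimal sketch that counts the expected “blue sightings” for each true color, assuming the witness is equally accurate for both colors, as the text does; the names are hypothetical.

```python
from fractions import Fraction

fleet = {"blue": 15, "black": 75}   # the 90 taxis in the example
accuracy = Fraction(4, 5)           # witness is right 4 times out of 5

# Expected number of taxis of each true color that get described as blue.
said_blue = {}
for true_color, count in fleet.items():
    p_says_blue = accuracy if true_color == "blue" else 1 - accuracy
    said_blue[true_color] = count * p_says_blue

total_said_blue = sum(said_blue.values())
share_really_blue = said_blue["blue"] / total_said_blue

print(said_blue["blue"], said_blue["black"], total_said_blue)  # 12 15 27
print(share_really_blue)                                       # 4/9
```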

In an artificial scenario where the initial estimates are entirely accurate, a Bayesian network will give you an accurate answer. In a more typical real-life situation, you don't have exact figures for the prior probabilities, but as long as your initial estimates are reasonably good, then the method will take account of the available evidence to give you a better estimate of the probability that the event of interest will occur. Thus, in the hands of an expert, someone who is able to assess all the available evidence reliably, Bayesian networks can be a powerful tool.

HOW CHARLIE HELPED TRACK DOWN THE ESCAPED KILLER

As we mentioned at the start of the chapter, nothing in the “Manhunt” episode of NUMB3RS explained how Charlie analyzed the many reported sightings of the escaped convict. Apart from saying that he used “Bayesian statistical analysis,” Charlie was silent about his method. But, almost certainly, this is what he must have done.

The problem, remember, is that there is a large number of reports of sightings, many of them contradictory. Most will be a result of people seeing someone they think looks like the person they saw in the newspaper or on TV. It is not that the informants lack credibility; they are simply mistaken. Therefore the challenge is how to distinguish the correct sightings from the false alarms, especially when you consider that the false alarms almost certainly heavily outnumber the accurate sightings.

The key factor that Charlie can make use of depends on the fact that each report has a time associated with it, the time of the supposed sighting. The accurate reports, all being reports of sightings of the real killer, will refer to locations in the city that follow a geometric pattern, reflecting the movements of one individual. On the other hand, the false reports are likely to refer to locations that are spread around in a fairly random fashion, and are inconsistent with being produced by a single person traveling around. But how can you pick out the sightings that correspond to that hidden pattern?

In a precise way, you cannot. But Bayes' theorem provides a way to assign probabilities to the various sightings so that the higher the probability, the more likely that particular sighting is to be correct. Here is how Charlie will have done it.

Picture a map of Los Angeles. The goal is to assign to each grid square on the map whose coordinates are i, j, a probability figure p(i,j,n) that assesses the probability that the killer is in grid square (i,j) at time n. The idea is to use Bayes' theorem to repeatedly update the probabilities p(i,j,n) over time (that is, as n increases), say in five-minute increments.

To start the process off, Charlie needs to assign initial prior probabilities to each of the grid squares. Most likely he determines these probabilities based on the evidence from the recaptured prisoner as to where and when the two separated. Without such information, he could simply assume that the probabilities of the grid squares are all the same.

At each subsequent time point, Charlie calculates the new posterior probability distribution as follows. He takes each new report, a sighting in grid square (i,j) at time n+1, and on the basis of that sighting updates the probability of every grid square (x,y), using the likelihood of that sighting if the killer was in grid square (x,y) at time n. Clearly, for (x,y) = (i,j), Charlie calculates a high likelihood for the sighting at time n+1, particularly if the sighting report says that the killer was doing something that would take time, such as eating a meal or having a haircut.

If (x,y) is near to (i,j), the likelihood Charlie calculates for the killer being in square (i,j) at time n+1 is also high, particularly if the sighting reported that the killer was on foot, and hence unlikely to move far within a five-minute time interval. The exact probability Charlie assigns may vary depending on what the sighting report says the individual was doing. For example, if the individual was reported as “driving north on Third Street” at time n, then Charlie gives the grid squares farther north on Third a higher likelihood of sightings at time n+1 than squares elsewhere.

The probabilities Charlie assigns are also likely to take account of veracity estimations. For example, a report from a bank guard, who gives a fairly detailed description, is more likely to be correct than one from a drunk in a bar, and hence Charlie will assign higher probabilities based on the former than on the latter. Thus, the likelihood for the killer being at square (x,y) at time n+1 based on a high-quality report of him being at square (i,j) at time n is much higher if (x,y) is close to (i,j) than if the two were farther apart, whereas for a low-quality report the likelihood of getting a report of a sighting at square (i,j) is more “generic” and less dependent on (x,y).
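The grid update described in the last few paragraphs might be sketched as follows. This is a toy illustration under stated assumptions, not a reconstruction of any method shown in the episode: the likelihood is assumed to fall off with distance like a bell curve, a “reliability” weight between 0 and 1 stands in for the veracity estimate, movement between time points is folded into the `spread` parameter, and every name here is hypothetical.

```python
import math


def normalize(grid):
    """Rescale a grid of non-negative weights so that it sums to 1."""
    total = sum(sum(row) for row in grid)
    return [[p / total for p in row] for row in grid]


def sighting_likelihood(cell, sighting, reliability, spread, n_cells):
    """Likelihood weight of a report at `sighting` if the killer is in `cell`.

    Assumed form: a distance-dependent term (weighted by how reliable the
    report is) plus a "generic" term that is the same for every cell.
    """
    dx = cell[0] - sighting[0]
    dy = cell[1] - sighting[1]
    near = math.exp(-(dx * dx + dy * dy) / (2 * spread ** 2))
    return reliability * near + (1 - reliability) / n_cells


def update(prior, sighting, reliability, spread=1.0):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    rows, cols = len(prior), len(prior[0])
    n_cells = rows * cols
    unnormalized = [
        [
            prior[i][j]
            * sighting_likelihood((i, j), sighting, reliability, spread, n_cells)
            for j in range(cols)
        ]
        for i in range(rows)
    ]
    return normalize(unnormalized)


# Start from equal probabilities on a 5-by-5 grid, then fold in three reports.
size = 5
uniform = [[1 / size ** 2] * size for _ in range(size)]
posterior = update(uniform, sighting=(2, 2), reliability=0.9)    # detailed report
posterior = update(posterior, sighting=(0, 4), reliability=0.2)  # doubtful report
posterior = update(posterior, sighting=(2, 3), reliability=0.9)  # detailed report

best = max(
    ((i, j) for i in range(size) for j in range(size)),
    key=lambda c: posterior[c[0]][c[1]],
)
print(best, round(posterior[best[0]][best[1]], 3))
```

With a reliability of 0 the likelihood is the same for every square, so the report leaves the distribution unchanged; that is one way to model the “generic” low-quality report described above.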

Most likely Charlie also takes some other factors into account. For example, a large shopping mall on a Sunday afternoon will likely generate more false reports than an industrial area on a Tuesday night.

This process is, of course, heavily based on human judgments and estimates. On its own, it would be unlikely to lead to any useful conclusion. But this is where the power of Bayes' method comes into play. The large number of sightings, which at first seemed like a problem, now becomes a significant asset. Although the probability distribution Charlie assigns to the map at each time point is highly subjective, it is based on a reasonable rationale, and the mathematical precision of Bayes' theorem, when applied many times over, eventually overcomes the vagueness inherent in any human estimation. In effect, what the repeated application of Bayes' theorem does is tease out the underlying pattern in the sightings data that comes from the fact that sightings of the killer were all of the same individual as he moved through the city.

In other words, Bayes' paradigm provides Charlie with a sound quantitative way of simultaneously considering all possible locations at every point in time. Of course, what he gets is not a single “X marks the spot” on the map, but a probability distribution. But as he works through the process, he may reach some stage where high probabilities are assigned to two or three reasonably plausible locations based on recent reports of sightings. If he then gets one or two high-quality reports that dovetail well, Bayes' formula could yield a high probability to one of those locations. And at that point he would contact his brother Don and say, “Get an agent over there now!”

CHAPTER 7
DNA Profiling

We read a lot about DNA profiling these days, as a method used to identify people. Although the technique is often described as “DNA fingerprinting,” it has nothing to do with fingerprints. Rather, the popular term is analogous to an older, more established means of identifying people. Although both methods are highly accurate, in either case care has to be taken in calculating the likelihood of a false identification resulting from two different individuals having fingerprints (of either variety) that the test cannot distinguish. And that is where mathematics comes into the picture.

UNITED STATES OF AMERICA V. RAYMOND JENKINS

On June 4, 1999, police officers in Washington, D.C., found the body of Dennis Dolinger, age 51, at his home in Capitol Hill. He had been stabbed multiple times (at least twenty-five, according to reports) with a screwdriver that penetrated his brain.

Dolinger had been a management analyst at the Washington Metropolitan Area Transit Authority. He had lived in Capitol Hill for twenty years and was active in the community. He had a wide network of friends and colleagues across the city. In particular, he was a neighborhood politician and had taken a strong stand against drug dealing in the area.

Police found a blood trail leading from the basement where Dolinger was discovered to the first and second floors of his house and to the front walkway and sidewalk. Bloody clothing was found in the basement and in a room on the second floor. Police believed that some of the bloodstains were those of the murderer, who was cut during the assault. Dolinger's wallet, containing cash and credit cards, had been taken, and his diamond ring and gold chain were missing.

The police quickly identified several suspects: Dolinger's former boyfriend (Dolinger was openly gay), who had assaulted him in the past and had left the D.C. area around the time police discovered the body; a man who was observed fleeing from Dolinger's house but did not call the police; neighborhood drug dealers, including one in whose murder trial Dolinger was a government witness; neighbors who had committed acts of violence against Dolinger's pets; various homeless individuals who frequently visited Dolinger; and gay men whom Dolinger had met at bars through Internet dating services.

By far the strongest lead was when a man named Stephen Watson used one of Dolinger's credit cards at a hair salon and department store in Alexandria within fifteen hours of Dolinger's death. Watson was a drug addict and had a long criminal record that included drug offenses, property offenses, and assaults. Police spoke with a witness who knew Watson personally and saw him on the day of the murder in the general vicinity of Dolinger's home, “appearing nervous and agitated,” with “a cloth wrapped around his hand,” and wearing a “T-shirt with blood on it.” Another witness also saw Watson in the general vicinity of Dolinger's home on the day of the murder, and noted that Watson had several credit cards with him.

On June 9, police executed a search warrant at Watson's house in Alexandria, Virginia, where they found some personal papers belonging to Dolinger. They also noticed that Watson, who was present during the search, had a cut on his finger “that appeared to be several days old and was beginning to heal.” At this point, the police arrested him. When questioned at the police station, Watson “initially denied knowing the decedent and using the credit card” but later claimed that “he found a wallet in a backpack by a bank alongside a beige-colored tarp and buckets on King Street” in Alexandria. Based on those facts, the police charged Watson with felony murder.

That might seem to be the end of the matter: a clear-cut case, you might think. But things were about to become considerably more complicated. The FBI had extracted and analyzed DNA from various blood samples collected from the crime scene and none of it matched that of Watson. As a result, the U.S. Attorney's Office dropped the case against Watson, who was released from custody.

At this point, we need to take a look at the method of identification using DNA, a process known as DNA profiling.

DNA PROFILING

The DNA molecule comprises two long strands, twisted around each other in the now familiar double-helix structure, joined together in a rope-ladder-fashion by chemical building blocks called bases. (The two strands constitute the “ropes” of the “ladder,” the bonds between the bases its “rungs.”) There are four different bases, adenine (A), thymine (T), guanine (G), and cytosine (C). The human genome is made of a sequence of roughly three billion of these base-pairs. Proceeding along the DNA molecule, the sequence of letters denoting the order of the bases (a portion might be …AATGGGCATTTTGAC…) provides a “readout” of the genetic code of the person (or other living entity). It is this “readout” that provides the basis for DNA profiling.

Every person's DNA is unique; if you know the exact, three-billion-long letter sequence of someone's DNA, you know who that person is, with no possibility of error. However, using today's techniques, and most likely tomorrow's as well, it would be totally impractical to do a DNA identification by determining all three billion letters. What is done instead is an examination of a very small handful of sites of variation, and the use of mathematics to determine the accuracy of the resulting identification.

DNA is arranged into large structural bodies called chromosomes. Humans have twenty-three pairs of chromosomes which together make up the human genome. In each pair, one chromosome is inherited from the mother and one from the father. This means that an individual will have two complete sets of genetic material. A “gene” is really a location (locus) on a chromosome. Some genes may have different versions, which are referred to as “alleles.” A pair of chromosomes have the same loci along their entire length, but may have different alleles at some of the loci. Alleles are characterized by their slightly different base sequences and are distinguished by their different phenotypic effects. Some of the genes studied in forensic DNA tests have as many as thirty-five different alleles.

Most people share very similar loci, but some loci vary from person to person with high frequency. Comparing variations in these loci allows scientists to answer the question of whether two different DNA samples come from the same person. If the two profiles match at each of the loci examined, the profiles are said to match. If the profiles fail to match at one or more loci, then the profiles do not match, and it is virtually certain that the samples do not come from the same person.

A match does not mean that the two samples must absolutely have come from the same source; all that can be said is that, so far as the test was able to determine, the two profiles were identical, but it is possible for more than one person to have the same profile across several loci. At any given locus, the percentage of people having matching DNA fragments is small but not zero. DNA tests gain their power from the conjunction of matches at each of several loci; it is extremely rare for two samples taken from unrelated individuals to show such congruence over many loci. This is where mathematics gets into the picture.

THE FBI'S CODIS SYSTEM

In 1994, recognizing the growing importance of forensic DNA analysis, Congress enacted the DNA Identification Act, which authorized the creation of a national convicted offender DNA database and established the DNA Advisory Board (DAB) to advise the FBI on the issue.

CODIS, the FBI's DNA profiling system (the name stands for COmbined DNA Index System) had been started as a pilot program in 1990. The system weds computer and DNA technologies to provide a powerful tool for fighting crime. The CODIS DNA database comprises four categories of DNA records:

Convicted Offenders: DNA identification records of persons convicted of crimes
Forensic: analyses of DNA samples recovered from crime scenes
Unidentified Human Remains: analyses of DNA samples recovered from unidentified human remains
Relatives of Missing Persons: analyses of DNA samples voluntarily contributed by relatives of missing persons

The CODIS database of convicted offenders currently contains in excess of 3 million records.

The DNA profiles stored in CODIS are based on thirteen specific loci, selected because they exhibit considerable variation among the population.

CODIS utilizes computer software to automatically search these databases for matching DNA profiles. The system also maintains a population file, a database of anonymous DNA profiles used to determine the statistical significance of a match.

CODIS is not a comprehensive criminal database, but rather a system of pointers; the database contains only information necessary for making matches. Profiles stored in CODIS contain a specimen identifier, the sponsoring laboratory's identifier, the initials (or name) of DNA personnel associated with the analysis, and the actual DNA characteristics. CODIS does not store criminal-history information, case-related information, social security numbers, or dates of birth.

When two randomly chosen DNA samples match completely in a large number of regions, such as the thirteen used in the CODIS system, the probability that they could have come from two unrelated people is virtually zero. This fact makes DNA identification extremely reliable (when performed correctly). The degree of reliability is generally measured by using probability theory to determine the likelihood of finding a particular profile among a random selection of the population.

BACK TO THE JENKINS CASE

With their prime suspect cleared because his DNA profile did not match any found at the crime scene, the FBI ran the crime scene DNA profile through the CODIS database to see if a match could be found, but the search came out negative.

Six months later, in November 1999, the DNA profile of the unknown contributor of the blood evidence was sent to the Virginia Division of Forensic Science, where a computer search was carried out to compare the profile against the 101,905 offender profiles in its databank. This time a match was found, albeit at only eight of the thirteen CODIS loci, since the Virginia database, being older, listed profiles based on those eight loci only.

The eight-loci match was with a man listed as Robert P. Garrett. A search of law enforcement records revealed that Robert P. Garrett was an alias used by Raymond Anthony Jenkins, an African-American who was serving time in prison for second-degree burglary, a sentence imposed following his arrest in July 1999, a few weeks after Dolinger was murdered. From that point on, the police investigation focused only on Jenkins.

On November 18, 1999, police interviewed a witness (a man who was in police custody at the time with several cases pending against him) who claimed to know Jenkins. This witness reported that on the day after Dolinger's death he had seen Jenkins with several items of jewelry, including a ring with diamonds and some gold chains, and more than $1,000 in cash. Jenkins also appeared to have numerous scratches or cuts to his face, according to government documents.

Seven days later the police executed a search warrant on Jenkins and obtained blood samples. The samples were sent to the FBI's forensic science lab for comparison. In late December 1999, Jenkins' samples were analyzed and profiled on the FBI's thirteen CODIS loci, the eight used by the Virginia authorities plus five others. According to a police affidavit, the resulting profile was “positively identified as being the same DNA profile as that of the DNA profile of the unknown blood evidence that was recovered from the scene of the homicide.” The FBI analysis identified Jenkins' blood on a pair of jeans found in the basement near Dolinger, a shirt found in the upstairs exercise room, a towel on the basement bathroom rack, the sink stopper in the sink of the same bathroom, and a railing between the first and second floors of the residence. The FBI estimated that the probability that a random person selected from the African-American population would share Jenkins' profile is 1 in 26 quintillion. Based on that information, an arrest warrant was issued, and Jenkins was arrested on January 13, 2000.

In April 2000, Raymond Jenkins was formally charged with second-degree murder while armed and in possession of a prohibited weapon, a charge that was superseded in October of the same year by one of two counts of felony murder and one count each of first-degree premeditated murder, first-degree burglary while armed, attempted robbery while armed, and the possession of a prohibited weapon.

Such is the power of DNA profiling, one of the most powerful weapons in the law enforcement agent's arsenal. Yet, as we shall see, that power rests on mathematics as much as on biochemistry, and that power is not obtained without some cost.

THE MATH OF DNA PROFILING

By way of an introductory example, consider a profile based on just three sites. The probability that someone would match a random DNA sample at any one site is roughly one in ten (1/10). So the probability that someone would match a random sample at three sites would be about one in a thousand:

1/10 × 1/10 × 1/10 = 1/1,000

Applying the same probability calculation to all thirteen sites used in the FBI's CODIS system would mean that the chances of matching a given DNA sample at random in the population are about one in 10 trillion:

(1/10)¹³ = 1/10,000,000,000,000

This figure is known as the random match probability (RMP). It is computed using the product rule for multiplying probabilities, which is valid only if the patterns found at distinct sites are independent. During the early days of DNA profiling, this was a matter of considerable debate, but that issue seems to have largely, though not completely, died away.
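The product rule is easy to reproduce. The sketch below multiplies per-site match probabilities under the independence assumption just mentioned, using the rough one-in-ten figure from the introductory example rather than real allele frequencies; the function name is illustrative only.

```python
from fractions import Fraction


def random_match_probability(per_site_match_probs):
    """Multiply per-site match probabilities (valid only if sites are independent)."""
    rmp = Fraction(1)
    for p in per_site_match_probs:
        rmp *= Fraction(p)
    return rmp


one_in_ten = Fraction(1, 10)   # rough per-site figure from the example above

three_sites = random_match_probability([one_in_ten] * 3)
thirteen_sites = random_match_probability([one_in_ten] * 13)

print(three_sites)     # 1/1000
print(thirteen_sites)  # 1/10000000000000
```

Passing a list of different per-site values instead of thirteen copies of 1/10 gives the same calculation with site-specific figures, which is why the result depends so heavily on the independence assumption.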
