Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
Authors: Cathy O'Neil
Tags: #Business & Economics, #General, #Social Science, #Statistics, #Privacy & Surveillance, #Public Policy, #Political Science
WMDs, by contrast, tend to favor efficiency. By their very nature, they feed on data that can be measured and counted. But fairness is squishy and hard to quantify. It is a concept. And computers, for all of their advances in language and logic, still struggle mightily with concepts. They “understand” beauty only as a word associated with the Grand Canyon, ocean sunsets, and grooming tips in Vogue magazine. They try in vain to measure “friendship” by counting likes and connections on Facebook. And the concept of fairness utterly escapes them. Programmers don’t know how to code for it, and few of their bosses ask them to.
So fairness isn’t calculated into WMDs. And the result is massive, industrial production of unfairness. If you think of a WMD as a factory, unfairness is the black stuff belching out of the smokestacks. It’s an emission, a toxic one.
The question is whether we as a society are willing to sacrifice a bit of efficiency in the interest of fairness. Should we handicap the models, leaving certain data out? It’s possible, for example, that adding gigabytes of data about antisocial behavior might help PredPol predict the mapping coordinates for serious crimes. But this comes at the cost of a nasty feedback loop. So I’d argue that we should discard the data.
It’s a tough case to make, similar in many ways to the battles over wiretapping by the National Security Agency. Advocates of the snooping argue that it’s important for our safety. And those running our vast national security apparatus will keep pushing for more information to fulfill their mission. They’ll continue to encroach on people’s privacy until they get the message that they must find a way to do their job within the bounds of the Constitution. It might be harder, but it’s necessary.
The other issue is equality. Would society be so willing to sacrifice the concept of probable cause if everyone had to endure the harassment and indignities of stop and frisk? Chicago police have their own stop-and-frisk program. In the name of fairness, what if they sent a bunch of patrollers into the city’s exclusive Gold Coast? Maybe they’d arrest joggers for jaywalking from the park across W. North Boulevard or crack down on poodle pooping along Lake Shore Drive. This heightened police presence would probably pick up more drunk drivers and perhaps uncover a few cases of insurance fraud, spousal abuse, or racketeering. Occasionally, just to give everyone a taste of the unvarnished experience, the cops might throw wealthy citizens on the trunks of their cruisers, wrench their arms, and snap on the handcuffs, perhaps while swearing and calling them hateful names.
In time, this focus on the Gold Coast would create data. It would describe an increase in crime there, which would draw even more police into the fray. This would no doubt lead to growing anger and confrontations. I picture a double parker talking back to police, refusing to get out of his Mercedes, and finding himself facing charges for resisting arrest. Yet another Gold Coast crime.
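The loop described here can be sketched as a toy simulation. This is only an illustration under stated assumptions, not a model of PredPol or of any real police department: two areas share the same underlying rate of offenses, patrols record incidents in proportion to their presence, and each round the area with more recorded incidents is treated as the hot spot and receives the larger share of patrols. The function name, the parameters, and all of the numbers are hypothetical.

```python
def simulate_patrol_feedback(rounds=10, total_patrols=100,
                             incidents_per_patrol=0.5,
                             initial_recorded=(11.0, 10.0),
                             hot_share=0.7):
    """Return cumulative recorded incidents for two areas after each round.

    Assumption: both areas have the SAME true offense rate, so each patrol
    records the same number of incidents wherever it is sent.
    """
    recorded = list(initial_recorded)
    history = [tuple(recorded)]
    for _ in range(rounds):
        # The area with more recorded incidents is labeled the hot spot.
        hot = 0 if recorded[0] >= recorded[1] else 1
        patrols = [0.0, 0.0]
        patrols[hot] = total_patrols * hot_share
        patrols[1 - hot] = total_patrols * (1 - hot_share)
        # More patrols produce more recorded incidents, not more crime.
        for area in (0, 1):
            recorded[area] += patrols[area] * incidents_per_patrol
        history.append(tuple(recorded))
    return history


if __name__ == "__main__":
    for step, (area_a, area_b) in enumerate(simulate_patrol_feedback()):
        print(f"round {step}: area A = {area_a:.0f}, area B = {area_b:.0f}")
```

In this sketch a small initial difference in the records is enough to keep one area labeled the hot spot in every round, and the gap in recorded incidents widens even though the two areas are identical by construction.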
This may sound less than serious. But a crucial part of justice is equality. And that means, among many other things, experiencing criminal justice equally. People who favor policies like stop and frisk should experience it themselves. Justice cannot just be something that one part of society inflicts upon the other.
The noxious effects of uneven policing, whether from stop and frisk or predictive models like PredPol, do not end when the accused are arrested and booked in the criminal justice system. Once there, many of them confront another WMD that I discussed in chapter 1, the recidivism model used for sentencing guidelines. The biased data from uneven policing funnels right into this model. Judges then look to this supposedly scientific analysis, crystallized into a single risk score. And those who take this score seriously have reason to give longer sentences to prisoners who appear to pose a higher risk of committing other crimes.
And why are nonwhite prisoners from poor neighborhoods more likely to commit crimes? According to the data inputs for the recidivism models, it’s because they’re more likely to be jobless, lack a high school diploma, and have had previous run-ins with the law. And their friends have, too.
Another way of looking at the same data, though, is that these prisoners live in poor neighborhoods with terrible schools and scant opportunities. And they’re highly policed. So the chance that an ex-convict returning to that neighborhood will have another brush with the law is no doubt larger than that of a tax fraudster who is released into a leafy suburb. In this system, the poor and nonwhite are punished more for being who they are and living where they live.
What’s more, for supposedly scientific systems, the recidivism models are logically flawed. The unquestioned assumption is that locking away “high-risk” prisoners for more time makes society safer. It is true, of course, that prisoners don’t commit crimes against society while behind bars. But is it possible that their time in prison has an effect on their behavior once they step out? Is there a chance that years in a brutal environment surrounded by felons might make them more likely, and not less, to commit another crime? Such a finding would undermine the very basis of the recidivism sentencing guidelines. But prison systems, which are awash in data, do not carry out this highly important research.
All too often they use data to justify the workings of the system but not to question or improve the system.
Compare this attitude to the one found at Amazon.com. The giant retailer, like the criminal justice system, is highly focused on a form of recidivism. But Amazon’s goal is the opposite. It wants people to come back again and again to buy. Its software system targets recidivism and encourages it.
Now, if Amazon operated like the justice system, it would start by scoring shoppers as potential recidivists. Maybe more of them live in certain area codes or have college degrees. In this case, Amazon would market more to these people, perhaps offering them discounts, and if the marketing worked, those with high recidivist scores would come back to shop more. If viewed superficially, the results would appear to corroborate Amazon’s scoring system.
But unlike the WMDs in criminal justice, Amazon does not settle for such glib correlations. The company runs a data laboratory. And if it wants to find out what drives shopping recidivism, it carries out research. Its data scientists don’t just study zip codes and education levels. They also inspect people’s experience within the Amazon ecosystem. They might start by looking at the patterns of all the people who shopped once or twice at Amazon and never returned. Did they have trouble at checkout? Did their packages arrive on time? Did a higher percentage of them post a bad review? The questions go on and on, because the future of the company hinges upon a system that learns continually, one that figures out what makes customers tick.
If I had a chance to be a data scientist for the justice system, I would do my best to dig deeply to learn what goes on inside those prisons and what impact those experiences might have on prisoners’ behavior. I’d first look into solitary confinement. Hundreds of thousands of prisoners are kept for twenty-three hours a day in these prisons within prisons, most of them no bigger than a horse stall. Researchers have found that time in solitary produces deep feelings of hopelessness and despair. Could that have any impact on recidivism? That’s a test I’d love to run, but I’m not sure the data is even collected.
How about rape? In Unfair: The New Science of Criminal Injustice, Adam Benforado writes that certain types of prisoners are targeted for rape in prisons. The young and small of stature are especially vulnerable, as are the mentally disabled. Some of these people live for years as sex slaves. It’s another important topic for analysis that anyone with the relevant data and expertise could work out, but prison systems have thus far been uninterested in cataloging the long-term effects of this abuse.
A serious scientist would also search for positive signals from the prison experience. What’s the impact of more sunlight, more sports, better food, literacy training? Maybe these factors will improve convicts’ behavior after they go free. More likely, they’ll have varying impact. A serious justice system research program would delve into the effects of each of these elements, how they work together, and which people they’re most likely to help. The goal, if data were used constructively, would be to optimize prisons—much the way companies like Amazon optimize websites or supply chains—for the benefit of both the prisoners and society at large.
But prisons have every incentive to avoid this data-driven approach. The PR risks are too great: no city wants to be the subject of a scathing report in the New York Times. And, of course, there’s big money riding on the overcrowded prison system. Privately run prisons, which house only 10 percent of the incarcerated population, are a $5 billion industry. Like airlines, the private prisons make profits only when running at high capacity. Too much poking and prodding might threaten that income source.
So instead of analyzing prisons and optimizing them, we deal with them as black boxes. Prisoners go in and disappear from our view. Nastiness no doubt occurs, but behind thick walls. What goes on in there? Don’t ask. The current models stubbornly stick to the dubious and unquestioned hypothesis that more prison time for supposedly high-risk prisoners makes us safer. And if studies appear to upend that logic, they can be easily ignored.
And this is precisely what happens. Consider a recidivism study by Michigan economics professor Michael Mueller-Smith. After studying 2.6 million criminal court records in Harris County, Texas, he concluded that the longer inmates there spent locked up, the greater the chance that they would fail to find employment upon release, would require food stamps and other public assistance, and would commit further crimes. But to turn those conclusions into smart policy and better justice, politicians will have to take a stand on behalf of a feared minority that many (if not most) voters would much prefer to ignore. It’s a tough sell.
Stop and frisk may seem intrusive and unfair, but in short time it will also be viewed as primitive. That’s because police are bringing back tools and techniques from the global campaign against terrorism and focusing them on local crime fighting. In San Diego, for example, police are not only asking the people they stop for identification, or frisking them. On occasion, they also take photos of them with iPads and send them to a cloud-based facial recognition service, which matches them against a database of criminals and suspects. According to a report in the New York Times, San Diego police used this facial recognition program on 20,600 people between 2011 and 2015. They also probed many of them with mouth swabs to harvest DNA.
Advances in facial recognition technology will soon allow for much broader surveillance. Officials in Boston, for example, were considering using security cameras to scan thousands of faces at outdoor concerts. This data would be uploaded to a service that could match each face against a million others per second. In the end, officials decided against it. Concern for privacy, on that occasion, trumped efficiency. But this won’t always be the case.
As technology advances, we’re sure to see a dramatic growth of surveillance. The good news, if you want to call it that, is that once thousands of security cameras in our cities and towns are sending up our images for analysis, police won’t have to discriminate as much. And the technology will no doubt be useful for tracking down suspects, as happened in the Boston Marathon bombing. But it means that we’ll all be subject to a digital form of stop and frisk, our faces matched against databases of known criminals and terrorists.
The focus then may well shift toward spotting potential lawbreakers: not just neighborhoods or squares on a map but individuals. These preemptive campaigns, already well established in the fight against terrorism, are a breeding ground for WMDs.
In 2009, the Chicago Police Department received a $2 million grant from the National Institute of Justice to develop a predictive program for crime. The theory behind Chicago’s winning application was that with enough research and data they might be able to demonstrate that the spread of crime, like epidemics, follows certain patterns. It can be predicted and, hopefully, prevented.
The scientific leader of the Chicago initiative was Miles Wernick, the director of the Medical Imaging Research Center at the Illinois Institute of Technology (IIT). Decades earlier, Wernick had helped the US military analyze data to pick out battlefield targets. He had since moved to medical data analysis, including the progression of dementia. But like most data scientists, he didn’t see his expertise as tethered to a specific industry. He spotted patterns. And his focus in Chicago would be the patterns of crime, and of criminals.
The early efforts of Wernick’s team focused on singling out hot spots for crime, much as PredPol does. But the Chicago team went much further. They developed a list of the approximately four hundred people most likely to commit a violent crime. And the list ranked them on the probability that they would be involved in a homicide.