Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy


Author: Cathy O'Neil


My job, in many ways, was to help come up with a recidivism model. Much like the analysts building the LSI–R model, I was interested in the forces that pushed people back to shelters and also those that led them to stable housing. Unlike the sentencing WMD, though, our small group was concentrating on using these findings to help the victims and to reduce homelessness and despair. The goal was to create a model for the common good.

On a separate but related project, one of the other researchers had found an extremely strong correlation, one that pointed to a solution. A certain group of homeless families tended to disappear from shelters and never return. These were the ones who had been granted vouchers under a federal affordable housing program called Section 8. This shouldn’t have been too surprising. If you provide homeless families with affordable housing, not too many of them will opt for the streets or squalid shelters.

Yet that conclusion might have been embarrassing to then-mayor Michael Bloomberg and his administration. With much fanfare, the city government had moved to wean families from Section 8. It instituted a new system called Advantage, which limited subsidies to three years. The idea was that the looming expiration of their benefits would push poor people to make more money and pay their own way. This proved optimistic, as the data made clear. Meanwhile, New York's booming real estate market was driving up rents, making the transition even more daunting. Families without Section 8 vouchers streamed back into the shelters.

The researcher’s finding was not welcome. For a meeting with important public officials, our group prepared a PowerPoint presentation about homelessness in New York. After the slide with statistics about recidivism and the effectiveness of Section 8 was put up, an extremely awkward and brief conversation took place. Someone demanded the slide be taken down. The party line prevailed.

While Big Data, when managed wisely, can provide important insights, many of them will be disruptive. After all, it aims to find patterns that are invisible to human eyes. The challenge for data scientists is to understand the ecosystems they are wading into and to present not just the problems but also their possible solutions. A simple workflow data analysis might highlight five workers who appear to be superfluous. But if the data team brings in an expert, they might help discover a more constructive version of the model. It might suggest jobs those people could fill in an optimized system and might identify the training they’d need to fill those positions. Sometimes the job of a data scientist is to know when you don’t know enough.

As I survey the data economy, I see loads of emerging mathematical models that might be used for good and an equal number that have the potential to be great—if they're not abused. Consider the work of Mira Bernstein, a slavery sleuth. A Harvard PhD in math, she created a model to scan vast industrial supply chains, like the ones that put together cell phones, sneakers, or SUVs, to find signs of forced labor. She built her slavery model for a nonprofit company called Made in a Free World. Its goal is to use the model to help companies root out the slave-built components in their products. The idea is that companies will be eager to free themselves from this scourge, presumably because they oppose slavery, but also because association with it could devastate their brand.

Bernstein collected data from a number of sources, including trade data from the United Nations, statistics about the regions where slavery was most prevalent, and detailed information about the components going into thousands of industrial products, and incorporated it all into a model that could score a given product from a certain region for the likelihood that it was made using slave labor. "The idea is that the user would contact his supplier and say, 'Tell me more about where you're getting the following parts of your computers,' " Bernstein told Wired magazine. Like many responsible models, the slavery detector does not overreach. It merely points to suspicious places and leaves the last part of the hunt to human beings. Some of the companies find, no doubt, that the suspected supplier is legit. (Every model produces false positives.) That information comes back to Made in a Free World, where Bernstein can study the feedback.
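The general approach described here—blending regional prevalence data with component-level risk into a single flag for human follow-up—can be illustrated with a toy sketch. Every name, weight, and number below is invented for illustration; Bernstein's actual model and data are proprietary to Made in a Free World and are not reproduced here.

```python
# Toy illustration of a supply-chain risk score. All figures and
# field names are hypothetical; this is not Bernstein's real model.

# Hypothetical prevalence of forced labor by sourcing region (0.0-1.0)
REGION_RISK = {"region_a": 0.40, "region_b": 0.05}

# Hypothetical risk weights for raw or assembled components
COMPONENT_RISK = {"mined_mineral": 0.6, "assembled_chip": 0.2}

def risk_score(region, components):
    """Blend regional and component risk into a single 0-1 score.

    The score only flags suspicious combinations; as in the text,
    the final judgment is left to human investigators.
    """
    base = REGION_RISK.get(region, 0.1)  # mild default for unknown regions
    comp = max(COMPONENT_RISK.get(c, 0.1) for c in components)
    return round(0.5 * base + 0.5 * comp, 2)

print(risk_score("region_a", ["mined_mineral", "assembled_chip"]))  # 0.5
```

The key design point survives even in a toy: the model outputs a ranking of places to look, not a verdict, and false positives are expected and fed back in.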

Another model for the common good has emerged in the field of social work. It's a predictive model that pinpoints households where children are most likely to suffer abuse. The model, developed by Eckerd, a child and family services nonprofit in the southeastern United States, launched in 2013 in Florida's Hillsborough County, an area encompassing Tampa. In the previous two years, nine children in the area had died from abuse, including a baby who was thrown out a car window. The modelers included 1,500 child abuse cases in their database, including the fatalities. They found a number of markers for abuse, including a boyfriend in the home, a record of drug use or domestic violence, and a parent who had been in foster care as a child.

If this were a program to target potential criminals, you can see right away how unfair it could be. Having lived in a foster home or having an unmarried partner in the house should not be grounds for suspicion. What’s more, the model is much more likely to target the poor—and to give a pass to potential abuse in wealthy neighborhoods.

Yet if the goal is not to punish the parents, but instead to provide help to children who might need it, a potential WMD turns benign. It funnels resources to families at risk. And in the two years following implementation of the model, according to the Boston Globe, Hillsborough County suffered no fatalities from child abuse.

Models like this will abound in coming years, assessing our risk of osteoporosis or strokes, swooping in to help struggling students with calculus II, even predicting the people most likely to suffer life-altering falls. Many of these models, like some of the WMDs we’ve discussed, will arrive with the best intentions. But they must also deliver transparency, disclosing the input data they’re using as well as the results of their targeting. And they must be open to audits. These are powerful engines, after all. We must keep our eyes on them.

Data is not going away. Nor are computers—much less mathematics. Predictive models are, increasingly, the tools we will be relying on to run our institutions, deploy our resources, and manage our lives. But as I’ve tried to show throughout this book, these models are constructed not just from data but from the choices we make about which data to pay attention to—and which to leave out. Those choices are not just about logistics, profits, and efficiency. They are fundamentally moral.

If we back away from them and treat mathematical models as a neutral and inevitable force, like the weather or the tides, we abdicate our responsibility. And the result, as we’ve seen, is WMDs that treat us like machine parts in the workplace, that blackball employees and feast on inequities. We must come together to police these WMDs, to tame and disarm them. My hope is that they’ll be remembered, like the deadly coal mines of a century ago, as relics of the early days of this new revolution, before we learned how to bring fairness and accountability to the age of data. Math deserves much better than WMDs, and democracy does too.

*1 You might think that an evenhanded audit would push to eliminate variables such as race from the analysis. But if we're going to measure the impact of a WMD, we need that data. Currently, most of the WMDs avoid directly tracking race. In many cases, it's against the law. It is easier, however, to expose racial discrimination in mortgage lending than in auto loans, because mortgage lenders are required to ask for the race of the applicant, while auto lenders are not. If we include race in the analysis, as the computer scientist Cynthia Dwork has noted, we can quantify racial injustice where we find it. Then we can publicize it, debate the ethics, and propose remedies. Having said that, race is a social construct and as such is difficult to pin down even when you intend to, as any person of mixed race can tell you.

*2 Google has expressed interest in working to eliminate bias from its algorithm, and some Google employees briefly talked to me about this. One of the first things I told them was to open the platform to more outside researchers.

