Data and Goliath (4 page)

Authors: Bruce Schneier

I did a quick experiment with Google’s autocomplete feature. This is the feature that
offers to finish typing your search queries in real time, based on what other people
have typed. When I typed “should I tell my w,” Google suggested “should i tell my
wife i had an affair” and “should i tell my work about dui” as the most popular completions.
Google knows who clicked on those completions, and everything else they ever searched
on.

Google’s CEO Eric Schmidt admitted as much in 2010: “We know where you are. We know
where you’ve been. We can more or less know what you’re thinking about.”

If you have a Gmail account, you can check for yourself. You can look at your search
history for any time you were logged in. It goes back for as long as you’ve had the
account, probably for years. Do it; you’ll be surprised.
It’s more intimate than if you’d sent Google your diary. And even though Google lets
you modify your ad preferences, you have no rights to delete anything you don’t want
there.

There are other sources of intimate data and metadata. Records of your purchasing
habits reveal a lot about who you are. Your tweets tell the world what time you wake
up in the morning, and what time you go to bed each night. Your buddy lists and address
books reveal your political affiliation and sexual orientation. Your e-mail headers
reveal who is central to your professional, social, and romantic life.
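
As a rough illustration of the e-mail header point, the sketch below counts how often each address appears in the From, To, and Cc fields of a mailbox, without reading any message body. The sample messages, the addresses, and the function name contact_counts are all hypothetical; this is a minimal sketch of the idea, not a description of any real system.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses

# Hypothetical sample data: headers only, message bodies deliberately empty.
RAW_MESSAGES = [
    "From: me@example.com\nTo: alice@example.com\nSubject: a\n\n",
    "From: me@example.com\nTo: alice@example.com, bob@example.com\nSubject: b\n\n",
    "From: alice@example.com\nTo: me@example.com\nSubject: c\n\n",
    "From: carol@example.com\nTo: me@example.com\nCc: alice@example.com\nSubject: d\n\n",
]


def contact_counts(raw_messages, owner):
    """Count appearances of every address other than the mailbox owner."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Gather all sender and recipient header values for this message.
        fields = (
            msg.get_all("From", [])
            + msg.get_all("To", [])
            + msg.get_all("Cc", [])
        )
        for _name, addr in getaddresses(fields):
            addr = addr.lower()
            if addr and addr != owner:
                counts[addr] += 1
    return counts


counts = contact_counts(RAW_MESSAGES, "me@example.com")
for address, n in counts.most_common():
    print(address, n)
```

Even this crude tally ranks correspondents by how often they appear, which is the sense in which headers alone can show who is central to someone's life.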

One way to think about it is that data is content, and metadata is context. Metadata
can be much more revealing than data, especially when collected in the aggregate.
When you have one person under surveillance, the contents of conversations, text messages,
and e-mails can be more important than the metadata. But when you have an entire population
under surveillance, the metadata is far more meaningful, important, and useful.

As former NSA general counsel Stewart Baker said, “Metadata absolutely tells you everything
about somebody’s life. If you have enough metadata you don’t really need content.”
In 2014, former NSA and CIA director Michael Hayden remarked, “We kill people based
on metadata.”

The truth is, though, that the difference is largely illusory. It’s all data about
us.

CHEAPER SURVEILLANCE

Historically, surveillance was difficult and expensive. We did it only when it was
important: when the police needed to tail a suspect, or a business required a detailed
purchasing history for billing purposes. There were exceptions, and they were extreme
and expensive. The exceptionally paranoid East German government had 102,000 Stasi
surveilling a population of 17 million: that’s one spy for every 166 citizens, or
one for every 66 if you include civilian informants.
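
The ratios above can be checked with a few lines of arithmetic. The population and Stasi figures come from the text; the informant count is not stated there and is only what the one-in-66 ratio would imply, so treat it as a derived estimate.

```python
# Figures taken from the text.
POPULATION = 17_000_000
STASI_OFFICERS = 102_000
CITIZENS_PER_WATCHER = 66  # ratio once civilian informants are included

# Citizens per Stasi officer; the text gives this as one for every 166.
citizens_per_officer = POPULATION / STASI_OFFICERS

# Derived estimate, not a figure from the text: the total number of
# watchers implied by the one-in-66 ratio, and the informants among them.
implied_watchers = POPULATION / CITIZENS_PER_WATCHER
implied_informants = implied_watchers - STASI_OFFICERS

print(f"citizens per officer: {citizens_per_officer:.1f}")
print(f"implied watchers:     {implied_watchers:,.0f}")
print(f"implied informants:   {implied_informants:,.0f}")
```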

Corporate surveillance has grown from collecting as little data as necessary to collecting
as much as possible. Corporations always collected information on their customers,
but in the past they didn’t collect very much of it and held it only as long as necessary.
Credit card companies collected only the information about their customers’ transactions that they needed
for billing. Stores hardly ever collected information about their customers, and mail-order
companies only collected names and addresses, and maybe some purchasing history so
they knew when to remove someone from their mailing list. Even Google, back in the
beginning, collected far less information about its users than it does today. When
surveillance information was expensive to collect and store, corporations made do
with as little as possible.

The cost of computing technology has declined rapidly in recent decades. This has
been a profoundly good thing. It has become cheaper and easier for people to communicate,
to publish their thoughts, to access information, and so on. But that same decline
in price has also brought down the price of surveillance. As computer technologies
improved, corporations were able to collect more information on everyone they did
business with. As the cost of data storage became cheaper, they were able to save
more data and for a longer time. As big data analysis tools became more powerful,
it became profitable to save more information. This led to the surveillance-based
business models I’ll talk about in Chapter 4.

Government surveillance has gone from collecting data on as few people as necessary
to collecting it on as many as possible. When surveillance was manual and expensive,
it could only be justified in extreme cases. The warrant process limited police surveillance,
and resource constraints and the risk of discovery limited national intelligence surveillance.
Specific individuals were targeted for surveillance, and maximal information was collected
on them alone. There were also strict minimization rules about not collecting information
on other people. If the FBI was listening in on a mobster’s phone, for example, the
listener was supposed to hang up and stop recording if the mobster’s wife or children
got on the line.

As technology improved and prices dropped, governments broadened their surveillance.
The NSA could surveil large groups—the Soviet government, the Chinese diplomatic corps,
leftist political organizations and activists—not just individuals. Roving wiretaps
meant that the FBI could eavesdrop on people regardless of the device they used to
communicate with. Eventually, US agencies could spy on entire populations and save
the data for years. This dovetailed with a changing threat, and they
continued espionage against specific governments, while expanding mass surveillance
of broad populations to look for potentially dangerous individuals. I’ll talk about
this in Chapter 5.

The result is that corporate and government surveillance interests have converged.
Both now want to know everything about everyone. The motivations are different, but
the methodologies are the same. That is the primary reason for the strong public-private
security partnership that I’ll talk about in Chapter 6.

To see what I mean about the cost of surveillance technology, just look at how cheaply
ordinary consumers can obtain sophisticated spy gadgets. On a recent flight, I was
flipping through an issue of SkyMall, a catalog that airlines stick in the pocket of
every domestic airplane seat. It
offered an $80 pen with a hidden camera and microphone, so I could secretly record
any meeting I might want evidence about later. I can buy a camera hidden in a clock
radio for $100, or one disguised to look like a motion sensor alarm unit on a wall.
I can set either one to record continuously or only when it detects motion. Another
device allows me to see all the data on someone else’s smartphone—either iPhone or
Android—assuming I can get my hands on it. “Read text messages even after they’ve
been deleted. See photos, contacts, call histories, calendar appointments and websites
visited. Even tap into the phone’s GPS data to find out where it’s been.” Only $120.

From other retailers I can buy a keyboard logger, or keylogger, to learn what someone
else types on her computer—assuming I have physical access to it—for under $50. I
can buy call intercept software to listen in on someone else’s cell phone calls for
$100. Or I can buy a remote-controlled drone helicopter with an onboard camera and
use it to spy on my neighbors for under $1,000.

These are the consumer items, and some of them are illegal in some jurisdictions.
Professional surveillance devices are also getting cheaper and better. For the police,
the declining costs change everything. Following someone covertly, either on foot
or by car, costs around $175,000 per month—primarily for the salary of the agents
doing the following. But if the police can place a tracker in the suspect’s car, or
use a fake cell tower device to fool the suspect’s cell phone into giving up its location
information, the cost drops to about $70,000 per month, because it only
requires one agent. And if the police can hide a GPS receiver in the suspect’s car,
suddenly the price drops to about $150 per month—mostly for the surreptitious installation
of the device. Getting location information from the suspect’s cell provider is even
cheaper: Sprint charges law enforcement only $30 per month.

The difference is between fixed and marginal costs. If a police department performs
surveillance on foot, following two people costs twice as much as following one person.
But with GPS or cell phone surveillance, the cost is primarily for setting up the
system. Once it is in place, the additional marginal cost of following one, ten, or
a thousand more people is minimal. Or, once someone spends the money designing and
building a telephone eavesdropping system that collects and analyzes all the voice
calls in Afghanistan, as the NSA did to help defend US soldiers from improvised explosive
devices, it’s cheap and easy to deploy that same technology against the telephone
networks of other countries.
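
The fixed-versus-marginal distinction can be sketched as a toy cost model. The $175,000 per person per month for covert following and the $30 per month for cell provider location data come from the figures quoted earlier; the $1,000,000 setup cost for the automated system is purely an assumption chosen for illustration, as are the function names.

```python
def labor_cost(people, per_person_per_month=175_000):
    """Manual tailing: almost no fixed cost, so cost scales with headcount."""
    return people * per_person_per_month


def automated_cost(people, setup_cost, per_person_per_month=30):
    """Automated tracking: a one-time setup cost plus a small marginal cost."""
    return setup_cost + people * per_person_per_month


# Assumed one-time cost of building the system (hypothetical, not from the text).
ASSUMED_SETUP_COST = 1_000_000

for people in (1, 10, 1000):
    manual = labor_cost(people)
    automated = automated_cost(people, ASSUMED_SETUP_COST)
    print(f"{people:>5} people: manual ${manual:,} vs automated ${automated:,}")
```

Under these assumptions the automated system costs more for a single target, but its total barely moves as targets are added, while the manual cost grows in direct proportion. Which of the two is cheaper at small scale depends entirely on the assumed setup cost.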

MASS SURVEILLANCE

The result of this declining cost of surveillance technology is not just a difference
in price; it’s a difference in kind. Organizations end up doing more surveillance—a
lot more. For example, in 2012, after a Supreme Court ruling, the FBI was required
to either obtain warrants for or turn off 3,000 GPS surveillance devices installed
in cars. It would simply be impossible for the FBI to follow 3,000 cars without automation;
the agency just doesn’t have the manpower. And now the prevalence of cell phones means
that everyone can be followed, all of the time.

Another example is license plate scanners, which are becoming more common. Several
companies maintain databases of vehicle license plates whose owners have defaulted
on their auto loans. Spotter cars and tow trucks mount cameras on their roofs that
continually scan license plates and send the data back to the companies, looking for
a hit. There’s big money to be made in the repossession business, so lots of individuals
participate—all of them feeding data into the companies’ centralized databases. One
scanning company, Vigilant Solutions of Livermore, California, claims to have 2.5
billion records and collects 70 million scans in the US per month, along with date,
time, and GPS location information.

In addition to repossession businesses, scanning companies also sell their data to divorce
lawyers, private investigators, and others. They sometimes relay it, in real time,
to police departments, which combine it with scans they get from interstate highway
on-ramps, toll plazas, border crossings, and airport parking lots. They’re looking
for stolen vehicles and drivers with outstanding warrants and unpaid tickets. Already,
the states’ driver’s license databases are being used by the FBI to identify people,
and the US Department of Homeland Security wants all this data in a single national
database. In the UK, a similar government-run system based on fixed cameras is deployed
throughout the country. It enforces London’s automobile congestion charge system,
and searches for vehicles that are behind on their mandatory inspections.

Expect the same thing to happen with automatic face recognition. Initially, the data
from private cameras will most likely be used by bounty hunters tracking down bail
jumpers. Eventually, though, it will be sold for other uses and given to the government.
Already the FBI has a database of 52 million faces, and facial recognition software
that’s pretty good. The Dubai police are integrating custom facial recognition software
with Google Glass to automatically identify suspects. With enough cameras in a city,
police officers will be able to follow cars and people around without ever leaving
their desks.

This is mass surveillance, impossible without computers, networks, and automation.
It’s not “follow that car”; it’s “follow every car.” Police could always tail a suspect,
but with an urban mesh of cameras, license plate scanners, and facial recognition
software, they can tail everyone—suspect or not.

Similarly, putting a device called a pen register on a suspect’s land line to record
the phone numbers he calls used to be both time-consuming and expensive. But now that
the FBI can demand that data from the phone companies’ databases, it can acquire that
information about everybody in the US. And it has.

In 2008, the company Waze (acquired by Google in 2013) introduced a new navigation
system for smartphones. The idea was that by tracking the movements of cars that used
Waze, the company could infer real-time traffic data and route people to the fastest
roads. We’d all like to avoid traffic jams. In fact, all of society, not just Waze’s
customers, benefits when people are steered away from traffic jams so they don’t add
to them. But are we aware
of how much data we’re giving away?

For the first time in history, governments and corporations have the ability to conduct
mass surveillance on entire populations. They can do it with our Internet use, our
communications, our financial transactions, our movements . . . everything. Even the
East Germans couldn’t follow everybody all of the time. Now it’s easy.

HIDDEN SURVEILLANCE

If you’re reading this book on a Kindle, Amazon knows. Amazon knows when you started
reading and how fast you read. The company knows if you’re reading straight through,
or if you read just a few pages every day. It knows if you skip ahead to the end,
go back and reread a section, or linger on a page—or if you give up and don’t finish
the book. If you highlight any passages, Amazon knows about that, too. There’s no
light that flashes, no box that pops up, to warn you that your Kindle is sending Amazon
data about your reading habits. It just happens, quietly and constantly.

We tolerate a level of electronic surveillance online that we would never allow in
the physical world, because it’s not obvious or advertised. It’s one thing for a clerk
to ask to see an ID card, or a tollbooth camera to photograph a license plate, or
an ATM to ask for a card and a PIN. All of these actions generate surveillance records—the
first case may require the clerk to copy or otherwise capture the data on the ID card—but
at least they’re overt. We know they’re happening.
