Read Black Code: Inside the Battle for Cyberspace Online
Authors: Ronald J. Deibert
Signals intelligence gathering is highly secretive, but it is a world we should all get to know better. Originally, the objects of sigint operations were other states’ military and intelligence agencies: ballistic missile-test telemetry or operational instructions sent by high-ranking Politburo members. As the Cold War came to a close, however, this bipolar conflict atomized into a multitude of national security threats, some of which emanate from transnational terrorist groups and organized crime, and the scope of sigint operations became much broader and more widely dispersed across global civil society. As the volume of data flowing through global networks is exploding in all directions, and the tools to undertake signals intelligence have become more refined, cheaper, and easier to use, the application to cyberspace is obvious.
• • •
Although cyberspace is often experienced as an ethereal world separate from physical reality, it is supported by a very real infrastructure, a tangible network of code, applications, wires, and radio waves. Behind every tweet, chat message, or Facebook update, there is also a complex labyrinth of machinery, cables and pipes buried in trenches deep beneath the ocean, and thousands of orbiting satellites, some the size of school buses. In addition to being complex and fragile, this physical infrastructure contains a growing number of filters and chokepoints. Pulling back its layers is like pulling back curtains into dark hallways and hidden recesses, which, it turns out, are also objects of intense political contests.
There is another component of cyberspace, separate from its physical infrastructure, that is also growing in leaps and bounds and becoming a critical part of the domain: the data. Information related to each and every one of us (and everything we do) is taking on a life of its own. It, too, has become an object of geopolitical struggle. Every call we make, every text and email we send, increasingly everything we do as we go about our daily lives, is recorded as a data point, a piece of information in the ever-expanding world of “Big Data” that is insinuating itself deeper and deeper into our lives and the communications environment in which we live.
From August 31, 2009 to February 28, 2010, German citizen Malte Spitz had virtually every moment of his life tracked – every step he took, where he slept and shopped, flights and train trips he booked, every person he communicated with, every Internet connection he made. All of his movements and communications were cross-checked against open-source information that could be found out about him, including his Twitter, blog, and website entries. The surveillance net around him was total, and all this information was dutifully archived. In short, someone, somewhere, knew Malte Spitz better than he knew himself.
Who was behind it? Was it the Bundesamt für Verfassungsschutz, Germany’s formidable domestic intelligence agency responsible for monitoring threats to the German state? Did they plant a bug on him? Tap his phone lines? What did Spitz do to warrant such attention? Was he a criminal? A terrorist? A long-lost member of the 1970s-era Baader-Meinhof gang? None of the above.
Malte Spitz is a Green Party politician with a clean record. Deutsche Telekom, Germany’s largest cellphone company, collected the data on him through his mobile phone, but it was Spitz himself (along with Germany’s leading newspaper, Die Zeit) who collated it on an interactive map. He did so to demonstrate to the public the volume of data mobile carriers routinely collect about their users. Spitz asked Deutsche Telekom to send him all of the information they had on him. After several persistent appeals and the threat of a lawsuit, the company finally complied, sending Spitz a CD containing 35,830 lines of data. “Seen individually, the pieces of data are mostly inconsequential and harmless,” wrote Die Zeit, “[but] taken together, they provide what investigators call a profile – a clear picture of a person’s habits and preferences, and indeed, of his or her life.”
• • •
On a daily basis, most of us experience a dynamic and interactive communications ecosystem that only two decades ago was the stuff of science fiction. And today, after perhaps a decade of near total immersion, it is almost impossible for most people in the West to imagine going back to a world before instant access and 24/7 connectivity, a bustling tableau of images, text, and sounds always at our fingertips. As with any such wholesale social change, we should expect unintended consequences, not all of them desirable. Past experiences with the printing press, telegraph, radio, and television tell us that new media environments shape and constrain the realm of the possible, favouring some social forces and ideas over others. The world of “big data” is no exception.
Computer engineers understand big data as data sets that grow so large that they become awkward to work with or analyze using standard database management tools. I like to think of it in metaphorical terms: as endless digital grains of sand on an ever-expanding beach that we produce as we act in cyberspace. Big data comes from everywhere: from space satellites used to gather climate information to lunchtime jokes on social media sites; from digital pictures and videos posted online to transaction records from grocery stores; from signals emitted by our mobile phones to information buried in the packet headers of our emails. Every day, 2.5 quintillion bytes of data are created, and 90 percent of the data in the world today was created in the past two years. According to Dave Turek, IBM’s VP of exascale computing, from the beginning of recorded time up until 2003, humans created five “exabytes” of information (an exabyte = 1,000,000,000 gigabytes). In 2011, Turek estimates, we produced that same amount of information every two days.
IBM predicts that in 2013, we will be producing five exabytes every ten minutes. And it only grows. To take just one example: the Square Kilometer Array (SKA) telescope complex, currently under development by a consortium of countries and set to deploy in Australia in 2024, will produce one exabyte of data every day, roughly twice the volume of daily global Internet traffic in 2012.
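The figures above are quoted in different units and over different time windows, so it can help to put them on a single scale. The short Python sketch below converts each one to exabytes per day, using the decimal definition given in the text (one exabyte = 1,000,000,000 gigabytes, or 10^18 bytes). The function and variable names are introduced here purely for illustration.

```python
# Decimal (SI) units, matching the text: 1 exabyte = 1,000,000,000 gigabytes.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_EXABYTE = 10**9 * BYTES_PER_GIGABYTE  # 10**18 bytes

MINUTES_PER_DAY = 24 * 60


def exabytes_per_day(exabytes, period_minutes):
    """Scale 'this many exabytes every period_minutes' to a daily rate."""
    return exabytes * MINUTES_PER_DAY / period_minutes


# "2.5 quintillion bytes" created every day, expressed in exabytes.
quoted_daily = 2_500_000_000_000_000_000 / BYTES_PER_EXABYTE

# Turek's 2011 estimate: five exabytes every two days.
rate_2011 = exabytes_per_day(5, 2 * MINUTES_PER_DAY)

# IBM's prediction for 2013: five exabytes every ten minutes.
rate_2013 = exabytes_per_day(5, 10)

print(f"Quoted daily volume: {quoted_daily} EB/day")
print(f"2011 estimate:       {rate_2011} EB/day")
print(f"2013 prediction:     {rate_2013} EB/day")
print(f"Growth factor:       {rate_2013 / rate_2011:.0f}x")
```

On these numbers, the 2.5 quintillion bytes created each day is the same quantity as Turek's 2011 estimate (2.5 exabytes per day), while the 2013 prediction works out to 720 exabytes per day, a 288-fold increase.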
Most of this data – searches, software downloads, music purchases, tweets, Skype calls, et cetera – comes from ordinary people going about their ordinary lives. In 2011, 200 million tweets were posted every day (and over 30 billion have been written and sent since Twitter’s launch in 2006). Every sixty seconds, 168 million emails were sent, nearly 700,000 Google searches and Facebook status updates made, 375,000 Skype calls initiated, and 13,000 new iPhone apps downloaded.
Mobile forms of connectivity, including smartphones and tablets, have massively increased this volume of data. Being untethered to a fixed location allows us to be always on, always connected, always communicating. According to the multinational telecommunications company Cisco Systems, in 2012, and for the fifth year in a row, mobile data traffic more than doubled. It is expected that the number of mobile-connected devices will exceed the world’s population by 2013; that is, there will be at least one operative mobile device for every human on the planet, and people will be constantly searching, texting, linking, networking, sharing, photographing, recording, purchasing. The proliferation of high-end handsets, tablets, and laptops on mobile networks, all major generators of data owing to the more detailed information experiences they support, will make up a greater proportion of the market. Says Cisco, a “single smartphone can generate as much traffic as 35 basic-feature phones; a tablet as much traffic as 121 basic-feature phones; and a single laptop can generate as much traffic as 498 basic-feature phones.” Mobile data traffic is likely to grow at a compound annual rate of nearly 80 percent, reaching some eleven exabytes per month by 2016. We are immersed in a weightless but dense cloud of bits and bytes, percolating everywhere.
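Compound growth at this rate is easy to underestimate. As a rough sketch, the Python snippet below works backwards from the roughly eleven exabytes per month projected for 2016. It assumes, and this is not stated in the text, that the 80 percent rate applies across the five years from 2011 to 2016; the function names are illustrative only.

```python
def compound(start, annual_rate, years):
    """Value after growing `start` by `annual_rate` each year for `years` years."""
    return start * (1 + annual_rate) ** years


def implied_start(end, annual_rate, years):
    """Starting value that would reach `end` after `years` of compound growth."""
    return end / (1 + annual_rate) ** years


ANNUAL_RATE = 0.80    # "nearly 80 percent" per year, from the text
YEARS = 5             # assumption: growth measured from 2011 to 2016
TARGET_EB = 11.0      # "some eleven exabytes per month by 2016"

multiplier = compound(1.0, ANNUAL_RATE, YEARS)
start_eb = implied_start(TARGET_EB, ANNUAL_RATE, YEARS)

print(f"Growth over {YEARS} years: {multiplier:.1f}x")
print(f"Implied starting traffic: {start_eb:.2f} EB per month")
```

Under that assumption, traffic multiplies almost nineteen-fold in five years, which implies a starting point of a little under 0.6 exabytes per month.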
• • •
The data we create contains not just the information we send or interact with, but data about the data, or metadata. We rarely experience metadata directly, as it is buried in instructions and communications several layers below our interactions with our devices. But we can see it when we download photographs onto our computers or upload them to a photo-sharing site like Flickr. When we do so, we might notice that embedded in that digital photograph is data on the model of camera used to take the picture, the exact time it was taken, and the longitude and latitude of where it was taken (should such settings be activated by the user). Digital music and movie files typically contain metadata on the artist, album, date of the recording, and copyright information. Metadata on a mobile phone can contain information about a user’s number, receiver’s number, geographic coordinates of where the message was sent, date and time of the message, duration of any particular call, amount of data transmitted, and the cost of the transmission. Average users may have thousands of data points like these collected from them every day as they communicate through cyberspace. A typical smartphone emits a signal every few seconds, a “beacon” to nearby cellphone towers or wifi hotspots in order to triangulate the most efficient connection for the device. (It was this automatic beaconing that led Mark Wuergler to identify the security vulnerabilities on Apple products described in the previous chapter.) Every call, text, or email we send via mobile phones yields space-time coordinates accurate to within metres of where we are and with whom we communicate. This information is stored on a server somewhere, or in multiple places – on “clouds” of computers – spread out across the physical infrastructure of cyberspace. It is an embodiment of us, a kind of cyberspace biography and activity chart, and we have little control over it.
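To make the idea concrete, here is a minimal Python sketch of what a single mobile-phone metadata record might look like, using only the fields listed above, and of how a pile of such records turns into a day-by-day profile. The class, field names, and sample values are invented for illustration and do not reflect any carrier's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallRecord:
    """One hypothetical metadata record; no message content is stored."""
    caller: str
    receiver: str
    latitude: float
    longitude: float
    started_at: datetime
    duration_seconds: int
    bytes_transmitted: int
    cost: float


def daily_profile(records):
    """Group records by calendar day, in time order within each day."""
    days = defaultdict(list)
    for record in sorted(records, key=lambda r: r.started_at):
        days[record.started_at.date()].append(record)
    return dict(days)


def contacts(records):
    """Return the set of numbers that appear as receivers."""
    return {record.receiver for record in records}


# Invented sample data: fictional numbers, approximate Berlin coordinates.
records = [
    CallRecord("555-0100", "555-0199", 52.52, 13.40,
               datetime(2009, 9, 1, 8, 15), 120, 0, 0.30),
    CallRecord("555-0100", "555-0142", 52.50, 13.37,
               datetime(2009, 9, 1, 18, 40), 45, 0, 0.15),
    CallRecord("555-0100", "555-0199", 52.53, 13.41,
               datetime(2009, 9, 2, 9, 5), 300, 0, 0.60),
]

profile = daily_profile(records)
for day, entries in profile.items():
    places = [(r.latitude, r.longitude) for r in entries]
    print(day, len(entries), "records at", places)
print("Contacts:", sorted(contacts(records)))
```

Even this toy version shows where someone was, when, and whom they contacted, without touching the content of a single call.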
Enthusiasts say that this world of big data is a gift. Google engineers, for instance, show us through their Flu Trends project how they can harness the information collected from millions of real-time queries to predict the location and timing of disease outbreaks.
Simply by collating the number, location, and frequency of search queries for symptoms, they gain insight of planetary significance and proportions. If enough users in Chicago or San Francisco simultaneously search for information about a fever, Google can spot a virus before it spreads, and with a greater degree of accuracy than the early-warning tools employed by the U.S. Centers for Disease Control.
This ability to identify large-scale patterns can lead to new opportunities for humanitarian aid and development assistance, even in the most impoverished and dangerous of environments. In Haiti, for example, researchers used mobile-data patterns to monitor the movement of refugees and health risks following the massive earthquake that struck that small Caribbean country in 2010. Crowd-sourcing data through the Ushahidi platform – a free and open-source software tool developed for information collection, visualization, and interactive mapping after the 2008 Kenyan election – is used to monitor elections, conflicts, and numerous other issues around the world. The LRA Crisis Tracker uses crowd-sourced data plotted on Ushahidi from radios distributed to local communities and other means to monitor atrocities undertaken by the Lord’s Resistance Army (LRA), responsible for one of the most ruthless insurgencies in Africa. Each LRA-related incident is plotted on a map by type – civilian death, injury, abduction, looting – and once consolidated, the map shows the movements of the LRA across the region, and the scope, scale, and frequency of its actions. Incidents captured by cellphone cameras are linked to specific events on the website as corresponding evidence.
In Kibera, Nairobi, Kenya’s largest slum, an experiment in crowd-sourcing data may revolutionize access to basic health care and sanitary services. Conditions in Kibera are dire: most residents are illegal squatters, and local officials regularly withhold basic services, including electricity, sewage treatment, and garbage collection. The most important commodity, water, is extremely scarce – turned on and off by capricious officials, and grossly overpriced by private dealers. Despite the poverty, over 70 percent of Kiberans have mobile phones. They are cheap, plentiful, and can save lives.
Researchers at Stanford University are testing an app called M-Maji (“mobile water” in Swahili), which sends users text messages with up-to-date information on the location, price, and quality of water available from different vendors. They believe that this project can be replicated in impoverished communities around the world.
There are countless examples of big data being used to achieve such social goods, but such a rapid transformation of a global communications environment rarely avoids unanticipated negative consequences. To understand these, we need first to understand the political economy of big data, and this boils down to a simple question: Why are we able to use Gmail, Facebook, Twitter, and other cyber services for free?
• • •
“There is no free lunch,” the old saying goes, and to that we should add “and no free tweet, either.” The business model of big data rests on the repurposing of that which all of us routinely give away. Not surprisingly, the market to harvest the digital grains of sand on that constantly expanding beach has exploded: companies of all shapes and sizes systematically pick through our digital droppings, collating them, passing them around, inspecting them, and feeding them back to us. And this market shows no sign of slowing. In 2012, the open-source analyst firm Wikibon reported that the big-data market stood at just over $5 billion and predicted that it will grow to $50 billion by 2017. ISPs, web-hosting companies, cloud and mobile providers, massive telecommunications and financial companies, and a host of other new digital market organisms digest and process unimaginably large volumes of information about each and every one of us, each and every day, and it is then sold back to us as “value-added” products, services, or advertisements for yet more products and services!
Social networks may seem like secure, even cozy, playgrounds, but they are more like vacuum cleaners that hoover up every click and shared link, every status change, every tag and piece of personal history. As Facebook states frankly in its data-use policy, the company uses “the information we receive to deliver ads and to make them more relevant to you. This includes all of the things you share and do on Facebook, such as the Pages you like or key words from your stories, and the things we infer from your use of Facebook.” Facebook “likes” are translated into customized dating and vacation ads; geolocation data is used to advertise local products. Not a single bit or byte is ignored: the companies involved reap what we sow. Freedom in cyberspace is just another word for nothing left unused.