Rise of the Robots: Technology and the Threat of a Jobless Future
Authors: Martin Ford
In later chapters, we’ll look in more detail at some of the overall economic and social implications of digital technology’s relentless acceleration. But first, let’s look at how these innovations are increasingly threatening the high-skill jobs held by workers with college and even graduate or professional degrees.
*
The supersonic Concorde, for example, offered a new S-curve in terms of absolute performance, but it did not prove to be an economically sustainable technology and was never able to capture more than a tiny fraction of the airline passenger market. The Concorde was in service from 1976 until 2003.
*
The idea behind 3D chips is to begin stacking circuitry vertically in multiple layers. Samsung Electronics began manufacturing 3D flash memory chips in August 2013. If this technique proves economically viable for the far more sophisticated processor chips designed by companies like Intel and AMD (Advanced Micro Devices), it may represent the future of Moore’s Law. Another possibility is to turn to exotic carbon-based materials as an alternative to silicon. Graphene and carbon nanotubes, both of which are the result of recent nanotechnology research, may eventually offer a new medium for very high-performance computing. Researchers at Stanford University have already created a rudimentary carbon nanotube computer, although its performance falls far short of commercial silicon-based processors.
*
DARPA also provided the initial financial backing for the development of Siri (now Apple’s virtual assistant technology) and has underwritten the development of IBM’s new SyNAPSE cognitive computing chips.
On October 11, 2009, the Los Angeles Angels prevailed over the Boston Red Sox in the American League play-offs and earned the right to face the New York Yankees for the league championship and entry into the World Series. It was an especially emotional win for the Angels because just six months earlier one of their most promising players, pitcher Nick Adenhart, had been killed by a drunk driver in an automobile accident. One sportswriter began an article describing the game like this:
Things looked bleak for the Angels when they trailed by two runs in the ninth inning, but Los Angeles recovered thanks to a key single from Vladimir Guerrero to pull out a 7–6 victory over the Boston Red Sox at Fenway Park on Sunday.
Guerrero drove in two Angels runners. He went 2–4 at the plate.
“When it comes down to honoring Nick Adenhart, and what happened in April in Anaheim, yes, it probably was the biggest hit [of my career],” Guerrero said. “Because I’m dedicating that to a former teammate, a guy that passed away.”
Guerrero has been good at the plate all season, especially in day games. During day games Guerrero has a .794 OPS [on-base plus slugging]. He has hit five home runs and driven in 13 runners in 26 games in day games.
1
The author of that text is probably in no immediate danger of receiving any awards for his writing. The narrative is nonetheless a remarkable achievement: not because it is readable, grammatically correct, and an accurate description of the baseball game, but because the author is a computer program.
The software in question, called “StatsMonkey,” was created by students and researchers at Northwestern University’s Intelligent Information Laboratory. StatsMonkey is designed to automate sports reporting by transforming objective data about a particular game into a compelling narrative. The system goes beyond simply listing facts; rather, it writes a story that incorporates the same essential attributes that a sports journalist would want to include. StatsMonkey performs a statistical analysis to discern the notable events that occurred during a game; it then generates natural language text that summarizes the game’s overall dynamic while focusing on the most important plays and the key players who contributed to the story.
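The two stages described above, picking out the notable events and then turning them into sentences, can be illustrated with a toy sketch. This is not StatsMonkey's actual code or method: the data layout, the field names, the scoring rule (runs driven in, doubled for the eighth inning or later), and the sentence templates are all assumptions invented for illustration, and the sample game is made up.

```python
# Toy data-to-text sketch. Everything here (field names, scoring rule,
# templates, sample data) is hypothetical and only illustrates the idea.

game = {
    "winner": "Visitors", "loser": "Hosts",
    "winner_runs": 7, "loser_runs": 6,
    "venue": "Example Park",
    "plays": [
        {"inning": 2, "player": "Player A", "event": "home run", "runs_scored": 1},
        {"inning": 5, "player": "Player C", "event": "double", "runs_scored": 2},
        {"inning": 9, "player": "Player B", "event": "single", "runs_scored": 2},
    ],
}


def most_notable_play(plays):
    """Stage 1: score each play and return the highest-scoring one.

    Assumed rule: runs driven in, counted double from the eighth inning on.
    """
    return max(plays, key=lambda p: p["runs_scored"] * (2 if p["inning"] >= 8 else 1))


def write_recap(game):
    """Stage 2: fill sentence templates with the selected facts."""
    key = most_notable_play(game["plays"])
    margin = game["winner_runs"] - game["loser_runs"]
    verb = "edged" if margin == 1 else "beat"
    lead = (f"The {game['winner']} {verb} the {game['loser']} "
            f"{game['winner_runs']}-{game['loser_runs']} at {game['venue']}.")
    detail = (f"The key moment came in inning {key['inning']}, when {key['player']} "
              f"hit a {key['event']} that drove in {key['runs_scored']} "
              f"run{'s' if key['runs_scored'] != 1 else ''}.")
    return lead + " " + detail


print(write_recap(game))
```

A real system would presumably draw on far richer statistics and a much larger stock of phrasings; the sketch only shows how a selection step and a templating step fit together.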
In 2010, the Northwestern University researchers who oversaw the team of computer science and journalism students who worked on StatsMonkey raised venture capital and founded a new company, Narrative Science, Inc., to commercialize the technology. The company hired a team of top computer scientists and engineers; then it tossed out the original StatsMonkey computer code and built a far more powerful and comprehensive artificial intelligence engine that it named “Quill.”
Narrative Science’s technology is used by top media outlets, including Forbes, to produce automated articles in a variety of areas, including sports, business, and politics. The company’s software generates a news story approximately every thirty seconds, and many of these are published on widely known websites that prefer not to acknowledge their use of the service. At a 2011 industry conference, Wired writer Steven Levy prodded Narrative Science co-founder Kristian Hammond into predicting the percentage of news articles that would be written algorithmically within fifteen years. His answer: over 90 percent.
2
Narrative Science has its sights set on far more than just the news industry. Quill is designed to be a general-purpose analytical and narrative-writing engine, capable of producing high-quality reports for both internal and external consumption across a range of industries. Quill begins by collecting data from a variety of sources, including transaction databases, financial and sales reporting systems, websites, and even social media. It then performs an analysis designed to tease out the most important and interesting facts and insights. Finally, it weaves all this information into a coherent narrative that the company claims measures up to the efforts of the best human analysts. Once it’s configured, the Quill system can generate business reports nearly instantaneously and deliver them continuously—all without human intervention.
3
One of Narrative Science’s earliest backers was In-Q-Tel, the venture capital arm of the Central Intelligence Agency, and the company’s tools will likely be used to automatically transform the torrents of raw data collected by the US intelligence community into an easily understandable narrative format.
The Quill technology showcases the extent to which tasks that were once the exclusive province of skilled, college-educated professionals are vulnerable to automation. Knowledge-based work, of course, typically calls upon a wide range of capabilities. Among other things, an analyst may need to know how to retrieve information from a variety of systems, perform statistical or financial modeling, and then write understandable reports and presentations. Writing, which, after all, is at least as much art as it is science, might seem like one of the least likely tasks to be automated. Nevertheless, it has been, and the algorithms are improving rapidly. Indeed, because knowledge-based jobs can be automated using only software, these positions may, in many cases, prove to be more vulnerable than lower-skill jobs that involve physical manipulation.
Writing also happens to be an area in which employers consistently complain that college graduates are deficient. One recent survey of employers found that about half of newly hired two-year college graduates and over a quarter of those with four-year degrees had poor writing skills, and in some cases poor reading skills as well.
4
If intelligent software can, as Narrative Science claims, begin to rival the most capable human analysts, the future growth of knowledge-based employment is in doubt for all college graduates, especially the least prepared.
Big Data and Machine Learning
The Quill narrative-writing engine is just one of many new software applications being developed to leverage the enormous amounts of data now being collected and stored within businesses, organizations, and governments across the global economy. By one estimate, the total amount of data stored globally is now measured in thousands of exabytes (an exabyte is equal to a billion gigabytes), and that figure is subject to its own Moore’s Law–like acceleration, doubling roughly every three years.
5
Nearly all of that data is now stored in digital format and is therefore accessible to direct manipulation by computers. Google’s servers alone handle about 24 petabytes of data (a petabyte is equal to a million gigabytes), primarily information about what its millions of users are searching for, each and every day.
6
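To get a feel for these magnitudes, the arithmetic can be worked through in a few lines. The unit definitions, the three-year doubling period, and the 24-petabyte daily figure come from the text above; the thirty-year horizon is an arbitrary choice for illustration and is not a forecast.

```python
# Worked arithmetic for the figures quoted in the text.
GB_PER_PB = 10**6  # a petabyte is a million gigabytes
GB_PER_EB = 10**9  # an exabyte is a billion gigabytes


def growth_factor(years, doubling_period_years=3.0):
    """How many times larger a quantity becomes if it doubles every period."""
    return 2 ** (years / doubling_period_years)


# 24 petabytes a day, expressed in gigabytes and then as exabytes per year.
daily_gb = 24 * GB_PER_PB
yearly_eb = daily_gb * 365 / GB_PER_EB

print(f"Daily volume: {daily_gb:,} GB")
print(f"Yearly volume: {yearly_eb:.2f} EB")
print(f"Growth over 30 years at one doubling per 3 years: {growth_factor(30):,.0f}x")
```

Ten doublings fit into thirty years, which is why a steady doubling period that sounds modest compounds to a factor of roughly a thousand.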
All this data arrives from a multitude of different sources. On the Internet alone, there are website visits, search queries, emails, social media interactions, and advertising clicks, to name just a few examples. Within businesses, there are transactions, customer contacts, internal communications, and data captured in financial, accounting, and marketing systems. Out in the real world, sensors continuously capture real-time operational data in factories, hospitals, automobiles, aircraft, and countless other consumer devices and industrial machines.
The vast majority of this data is what a computer scientist would call “unstructured.” In other words, it is captured in a variety of formats that can often be difficult to match up or compare. This is very different from traditional relational database systems where information is arranged neatly in consistent rows and columns that make search and retrieval fast, reliable, and precise. The unstructured nature of big data has led to the development of new tools specifically geared toward making sense of information that is collected from a variety of sources. Rapid improvement in this area is just one more example of the way in which computers are, at least in a limited sense, beginning to encroach on capabilities that were once exclusive to human beings. The ability to continuously process a stream of unstructured information from sources throughout our environment is, after all, one of the things for which humans are uniquely adapted. The difference, of course, is that in the realm of big data, computers are able to do this on a scale that, for a person, would be impossible. Big data is having a revolutionary impact in a wide range of areas including business, politics, medicine, and nearly every field of natural and social science.
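The contrast between mismatched formats and consistent rows and columns can be made concrete with a small sketch. The three input formats, the field names, and the target schema below are invented for illustration. Truly unstructured data, such as free text or images, calls for far heavier tools than this.

```python
import json

# Hypothetical example: the same kind of fact (a customer spent an amount)
# arrives in three different formats and is normalized into one schema.


def from_json(text):
    rec = json.loads(text)
    return {"customer": rec["user"], "amount": float(rec["total"])}


def from_log_line(line):
    # Space-separated key=value pairs, e.g. "cust=bob amt=5.00"
    fields = dict(pair.split("=", 1) for pair in line.split())
    return {"customer": fields["cust"], "amount": float(fields["amt"])}


def from_csv_row(row):
    customer, amount = row.split(",")
    return {"customer": customer.strip(), "amount": float(amount)}


raw = [
    (from_json, '{"user": "alice", "total": "19.99"}'),
    (from_log_line, "ts=1700000000 cust=bob amt=5.00"),
    (from_csv_row, "carol, 12.50"),
]

# Once every record has the same columns, it can be searched and compared
# the way a relational table can.
rows = [parse(text) for parse, text in raw]
for row in rows:
    print(row)
```

Each source needs its own parser, and that per-source effort is what makes heterogeneous data hard to match up or compare at scale.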
Major retailers are relying on big data to get an unprecedented level of insight into the buying preferences of individual shoppers, allowing them to make precisely targeted offers that increase revenue while helping to build customer loyalty. Police departments across the globe are turning to algorithmic analysis to predict the times and locations where crimes are most likely to occur and then deploying their forces accordingly. The City of Chicago’s data portal allows residents to see both historical trends and real-time data in a range of areas that capture the ebb and flow of life in a major city, including energy usage, crime, performance metrics for transportation, schools and health care, and even the number of potholes patched in a given period of time. Tools that provide new ways to visualize data collected from social media interactions as well as sensors built into doors, turnstiles, and escalators offer urban planners and city managers graphic representations of the way people move, work, and interact in urban environments, a development that may lead directly to more efficient and livable cities.
There is a potential dark side, however. Target, Inc., provided a far more controversial example of the ways in which vast quantities of extraordinarily detailed customer data can be leveraged. A data scientist working for the company found a complex set of correlations involving the purchase of about twenty-five different health and cosmetic products that were a powerful early predictor of pregnancy. The company’s analysis could even estimate a woman’s due date with a high degree of accuracy. Target began bombarding women with offers for pregnancy-related products at such an early stage that, in some cases, the women had not yet shared the news with their immediate families. In an article published in early 2012, the New York Times reported one case in which the father of a teenage girl actually complained to store management about mail sent to the family’s home, only to find out later that Target, in fact, knew more than he did.
7
Some critics fear that this rather creepy story is only the beginning and that big data will increasingly be used to generate predictions that potentially violate privacy and perhaps even freedom.
The insights gleaned from big data typically arise entirely from correlation and say nothing about the causes of the phenomenon being studied. An algorithm may find that if A is true, B is likely also true. But it cannot say whether A causes B or vice versa, or if perhaps both A and B are caused by some external factor. In many cases, however, and especially in the realm of business where the ultimate measure of success is profitability and efficiency rather than deep understanding, correlation alone can have extraordinary value. Big data can offer management an unprecedented level of insight into a wide range of areas: everything from the operation of a single machine to the overall performance of a multinational corporation can potentially be analyzed at a level of detail that would have been impossible previously.
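The point about an external factor driving both A and B can be demonstrated with a small simulation. The setup is an assumption chosen for illustration: a hidden variable `z` feeds into both `a` and `b`, which have no direct influence on each other, and the noise levels and sample size are arbitrary.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hidden external factor z drives both a and b; a never affects b directly.
z = [rng.gauss(0, 1) for _ in range(5000)]
a = [zi + rng.gauss(0, 0.5) for zi in z]
b = [zi + rng.gauss(0, 0.5) for zi in z]

r_ab = pearson(a, b)

# Subtracting the hidden factor leaves only independent noise in each series.
r_residual = pearson([ai - zi for ai, zi in zip(a, z)],
                     [bi - zi for bi, zi in zip(b, z)])

print(f"Correlation between a and b: {r_ab:.2f}")
print(f"Correlation once z is removed: {r_residual:.2f}")
```

An algorithm that sees only `a` and `b` would find a strong relationship and could exploit it for prediction, yet intervening on `a` would leave `b` unchanged. Correlation of this kind can be profitable without being explanatory.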