Is the post-2012 acceleration in automation tech sustainable?
First, how good is today’s AI-enabled software? There is no question that it is a lot better at specific tasks than its predecessors. These tasks are more than toys – they include photo classification, language translation, speech recognition and text synthesis.
My photo library has lived on an iPad for many years now. More recently, Apple added AI-based image classification to its operating system. Search for “beach”, and scenes like this pop up:
Photo: Dave Heatley
But Apple’s AI makes some perplexing errors. On the left, Turoa ski field is a “beach” (a false positive), while on the right, Murray Beach on Stewart Island doesn’t make the cut (a false negative).
Photos: Dave Heatley
Such classification errors are fun and have low consequence in a photo search.1 But should I ask an autonomous vehicle to “take me to the beach”, I wouldn’t expect it to head for the snow!
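The trade-off behind such errors is easy to sketch. In this minimal illustration (the function and confidence scores are hypothetical, not Apple’s actual system), a classifier labels an image “beach” whenever its confidence clears a threshold – and both kinds of error follow from the model’s misplaced confidence relative to that threshold:

```python
# Hypothetical sketch -- not Apple's actual system. A classifier emits a
# confidence score, and a threshold turns that score into a yes/no label.
def classify_beach(confidence: float, threshold: float = 0.5) -> bool:
    """Label an image 'beach' when model confidence clears the threshold."""
    return confidence >= threshold

# Snowy Turoa ski field, but the model is (wrongly) confident: false positive.
print(classify_beach(0.83))  # True
# Real Murray Beach, but the model scores it low: false negative.
print(classify_beach(0.31))  # False
```

Raising the threshold trims false positives at the cost of more false negatives, and vice versa – the two error types trade off against each other rather than disappear.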
Competing predictions follow a patchy history of AI performance
AI research has gone through several cycles of high promise and deep pessimism. Histories document two major “AI winters”, covering 1974–80 and 1987–93, along with several smaller episodes of slow or seemingly backwards progress. Forecasters offer three starkly contrasting trajectories, as shown in this stylised chart.
Diagram: Dave Heatley
- The blue dotted line depicts predictions of continuing exponential performance improvement. For example, Kurzweil predicted in 2006 that “by the early 2030s, machines will persuasively display human emotions, human-level intelligence, and will claim to be conscious”. Continuing exponential performance improvement will lead to a technological singularity in 2045 “as artificial intelligences surpass human beings as the smartest and most capable life forms on the Earth”. Organisations such as Singularity University are built around the idea of “exponential technologies”. According to Jason Silva “AI is perhaps the granddaddy of all exponential technologies—sure to transform the world and the human race in ways that we can barely wrap our heads around”.
- The green dashed line treats the recent past as the new “normal” and foresees no constraints on further improvement. This is the position of many industry observers, and perhaps lies behind the recent sky-high valuations of tech companies with AI-related intangible capital.
- The brown dashed line represents a sceptic’s position, such as that of Gary Marcus of New York University2. He suggests that deep learning might be approaching a wall.
Which trajectory is more credible?
Clearly, current technology has room to improve, and it is yet to reach all possible applications. Still, deep learning has inherent limitations, and it is debatable whether the approach is extendable to situations that require more “general” intelligence. As Marcus points out, while deep learning is very good at interpolation (cases between known training examples), it performs poorly at extrapolation (cases beyond the range of those examples). The problem is that in many cases it is the ability to cope with unusual cases that matters.
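Marcus’ interpolation-versus-extrapolation distinction is easy to demonstrate numerically. The toy sketch below (my own illustration, not Marcus’) fits a flexible polynomial model to a curve over a limited range, then queries it inside and outside that range – the model is accurate between its training examples and wildly wrong beyond them:

```python
import numpy as np

# Toy illustration of interpolation vs extrapolation (assumed example).
x_train = np.linspace(0.0, 3.0, 50)   # training examples cover [0, 3] only
y_train = np.sin(x_train)

# A degree-5 polynomial stands in for any flexible learned model.
model = np.poly1d(np.polyfit(x_train, y_train, deg=5))

interp_err = abs(model(1.5) - np.sin(1.5))  # query inside the training range
extrap_err = abs(model(6.0) - np.sin(6.0))  # query well outside it

print(f"interpolation error: {interp_err:.5f}")
print(f"extrapolation error: {extrap_err:.5f}")
```

On a typical run the interpolation error is a tiny fraction of the extrapolation error – and adding more training points inside [0, 3] would not fix this; only examples from beyond that range would.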
The self-driving cars of Waymo, Uber and Tesla, for example, have been unable to reach human-equivalent driving performance, let alone earlier predictions of better-than-human performance. According to commentator Timothy B Lee:
Driverless cars seemed to reach peak hype some time in late 2017. Then in 2018, the industry plunged into the trough of disillusionment, with some people wondering if driverless technology might be decades away.
An illustrative example is responding to an unusual object on the road. A human driver can draw on other experience to distinguish between, for example, a cushion and a similarly shaped rock on the road, and respond accordingly. A self-driving car is unlikely to have sufficient examples of both in its training set, and so may take (potentially dangerous) evasive action to avoid hitting a soft cushion! Google’s Chief Economist, Hal Varian, points out that data has diminishing returns to scale – the value of data scales with the square root of its quantity. Collecting huge amounts of data cannot guarantee that all unusual cases are included in the training dataset.
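Varian’s square-root point can be illustrated with a simple simulation (my own sketch, not Varian’s calculation): estimate the mean of a noisy quantity from n samples, and watch the average error shrink only as 1/√n.

```python
import numpy as np

# Assumed toy example: the standard error of a sample mean falls as 1/sqrt(n),
# so each quadrupling of the data roughly halves the average error.
rng = np.random.default_rng(42)
for n in [100, 400, 1600, 6400]:
    # Average absolute estimation error over 2000 repeated trials.
    errs = [abs(rng.standard_normal(n).mean()) for _ in range(2000)]
    print(f"n = {n:5d}   mean |error| = {np.mean(errs):.4f}")
```

Each fourfold increase in data buys only a twofold gain in accuracy – which is why the rare cases a self-driving car most needs to handle cannot reliably be bought with data volume alone.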
Overall, I’m inclined to forecast a near-term plateau in performance improvement, for two reasons:
- The exponential and linear improvement projections discount earlier experience of sharp improvements punctuated by “winters” – periods spent waiting for improved methods and algorithms.
- Marcus’ case is strong: current AI technologies face impending technical roadblocks, and it will take time – perhaps decades – to deal with them comprehensively.
What does this mean for the future of work?
Early waves of automation tech replaced routine manual work with machines. They haven’t necessarily eliminated occupations. The quintessential manual worker – depicted on road signs as a person leaning on a spade – is still required, but today’s worker is most often seen working alongside an excavator. The excavator (and its operator) do the heavy lifting, while the spade operator handles the “edge cases” – exploiting their dexterity and decision-making capability to work near power and communication lines, for example.
Later waves of automation tech replaced routine cognitive work with machines. For example, “computers” – at one time, humans employed to do maths – were replaced by their electronic namesakes. But again, electronic computers handle the regular and standard cases, and we still need humans with maths skills to program those computers and to deal with the irregular.
I think what we are now looking at is a progressively expanding definition of “routine”, as AI technologies take on routine decision-making. Human decision makers will still be necessary for the non-routine – the special cases. And we will be even more valuable for that purpose. But I can’t see any strong evidence that this wave of automation will hit harder or faster than did previous ones.
Dave Heatley, Principal Advisor, Productivity Commission
- Of course, Apple’s AI may not represent the current pinnacle of image recognition. As a second datapoint, I tried the same test in the Google Photos app. It made a different set of similarly perplexing classification errors.
- Gary Marcus, Deep Learning: A Critical Appraisal