Humans in the Smart Machine Age

January 4, 2017

By Alessandro Zolla, Machine Learning Program Lead

Between the mid-18th and mid-19th centuries, the Industrial Revolution replaced manual labor with mechanical devices, and the factory was born.

Early in the 20th century, Henry Ford implemented the first moving assembly line and the Age of Mass Production began.

At the end of the 20th century, usage of the World Wide Web moved out of academia and grew at a breathtaking rate. By 2015, over one-third of the world’s population had logged on to the Internet at some point in the previous year; the Connected Revolution had arrived.

Now pundits say that we are on the verge of a fourth revolution: the Age of Smart Machines. And there are good grounds for believing they are right—machine learning is no longer confined to advanced research laboratories, but is becoming part of everyday life. Intelligent e-mail spam detection, face recognition, keyboards that anticipate words before they are typed, washing machines that reorder their own detergent, e-commerce recommendation systems, automatic stock market negotiation robots, and many other smart machines are commonplace.

Several trends have aligned to bring us to the tipping point of this fourth revolution: the development of better machine learning algorithms; the availability of huge processing power at reasonable cost, allowing complex algorithms to run economically on large data sets; and the easy availability of huge amounts of digital data as the basis for learning, from sources such as the Internet of Things, enterprise data lakes and internet image repositories.

In the next decades, machine learning will transform work as we know it. And unlike previous revolutions, which primarily affected blue-collar workers, the smart machine revolution has white-collar workers in its sights. A 2013 academic analysis predicts the smart machine revolution will put an estimated 47% of total U.S. employment at risk; telemarketers, credit analysts and administrative assistants are among the most likely to be affected, while sales and marketing managers are relatively safe. Previous revolutions succeeded because, in the long run, they had a positive effect on people’s lives: removing the need for debilitating physical work, increasing connectedness, and making the world more responsive to their needs and desires. Will the same be true this time around?

Office work has already been changed by tools like fax machines, computers, email, business productivity software, the Internet, the paperless office, voice over IP and others. These tools automated clerical work and eliminated some less skilled jobs, such as the typing pool. Smart machines will automate knowledge work: the work of those who think, rather than do, for a living. This will be the first time that knowledge work has been automated at scale. Smart machines will do autonomous work both for and alongside humans; they will be our colleagues, not our tools.

HOW MACHINES WILL HELP US

A lot of effort has gone into working out how to get machines to recognize and match things, whether images or text. Recognition and matching are critical capabilities to help us make sense of big data. The volume, variety and velocity of big data are too great for humans to be able to match the common elements in different data sets quickly, cheaply and at high quality. Big data overwhelms traditional data warehouse techniques, such as extract, transform and load, which have long setup times and rely on scarce resources such as data experts and software engineers. A smart machine capable of automatically matching entities based on their keys, textual descriptions, images, attributes, relationships and more is the only scalable solution to integrating big data.
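To make the idea of matching entities by their textual descriptions concrete, here is a minimal sketch in Python. It is only an illustration: it uses plain token overlap (Jaccard similarity) as a stand-in for a learned matcher, and the function names, the 0.5 cut-off and the product names are all hypothetical, not taken from any real system.

```python
import re


def tokens(text):
    """Lower-case a description and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of tokens the two descriptions have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(description, candidates, min_score=0.5):
    """Return the most similar candidate, or None if nothing is close enough.

    min_score is an assumed cut-off chosen for illustration only.
    """
    scored = [(jaccard(description, c), c) for c in candidates]
    score, match = max(scored, default=(0.0, None))
    return match if score >= min_score else None


# Hypothetical reference data from a second source.
reference = ["ACME cola can 330ml", "Acme lemonade 1l bottle"]
match = best_match("Acme Cola 330ml can", reference)
```

A production matcher would also draw on keys, images, attributes and relationships, as described above; token overlap simply shows the shape of the problem.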

Another area in which machine learning is of great value is product identification. Most critical business analytics revolve around products: their invention, manufacturing, sales, promotion, profitability and so on. The bedrock of this analysis is the ability to reliably identify products and isolate their important characteristics, such as brand, size and flavor. With e-commerce making a huge number of products readily available for purchase, and with product innovation cycles becoming shorter, automating product identification through machine learning is the only way to keep up. Smart machines can match and characterize products using textual descriptions, ingredients lists, product images and packaging designs.

We pick these examples (the ability to recognize and match things, and the ability to identify products) because they are fundamental to the way Nielsen itself is exploiting machine learning and automation. Extending and standardizing Nielsen’s product reference data across the entire store, in more than 90 countries, was an almost unattainable objective using primarily human labor. Machine learning has done more than just smooth the way; without it the effort would have taken too long and been too costly to be economically viable.

MACHINES AND TURKS COMING TOGETHER

Is there a non-machine solution? Some “smart machine” cloud services, such as Amazon Mechanical Turk (AMT), don’t use software at all, but farm tasks out to a crowd of paid human workers. Although AMT is not actually a smart machine, when viewed as a black box it has many of the characteristics of one: crowds can quickly scale up and down, depending on workload, and as a result can be more responsive and cost-effective than a permanent in-house roster of subject matter experts. If, for example, you want to identify the retailer and itemize the purchases from a paper till receipt, you could provide a photograph of the receipt to AMT and contract to have the Turk convert the image into electronic text. Bulk conversion of receipts would allow you to analyze purchases by retailer.

Eventually, however, no Mechanical Turk will be as cost-effective and swift as an intelligent machine, assuming the machine is intelligent enough. For instance, the receipt recognition problem could in principle be solved by Optical Character Recognition (OCR) or similar machine learning software. But, as is often the case, while smart machines are remarkably accurate (>80%), incorrect results are still so common that the human supervisory effort needed to ensure an acceptable level of quality outweighs their utility. AMT can provide a solution in such circumstances.

AMT can also advance the accuracy of smart machines. The sets of high quality results it provides can be used to train machines, thereby reducing the amount of quality control needed; ultimately the process can be fully automated, although that is currently years away.

AMT and other solutions like it provide a vital adjunct to smart machines as the machines get smarter. Additionally, many machine learning solutions are highly specific to a single problem, especially in the business world; identifying products is a prime example of this. Humans are general-purpose problem solvers: until smart software can handle all our problems, solutions like AMT are helpful in areas that machine learning hasn’t reached yet.

THE PERSISTENT NEED FOR HUMANS

Machine intelligence will never be 100% reliable—there are always edge cases, exceptions and incomplete data, where the smart machine is not able to carry out a task to a high degree of accuracy. For these cases, humans are a vital part of any solution as the guardians of quality control and the intelligence of last resort; humans have the ability to make good decisions based on limited information and possess that elusive quality, common sense.

Recognition and matching are critical cognitive capabilities that can deliver huge benefits when implemented in smart machines, but they won’t revolutionize the way we work. A smart machine should be able to do most of a category manager’s descriptive and diagnostic data analysis. Longer term, the machine’s cognitive capabilities should allow it to offer predictions (do X and Y will happen) and ultimately to be prescriptive (if you want W to happen, you should do Z).

Of course a great deal depends on how accurate and reliable the machine is. High-confidence recommendations, where the chances of making the wrong decision are low and the advantages of acting are significant, can be actioned automatically. Low-confidence recommendations, with the opposite characteristics, require human intervention and decision-making.
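The split between automatic action and human review can be sketched in a few lines of Python. This is a simplified illustration: it routes on a single confidence score and ignores the size of the benefit from acting, and the class name, the 0.9 threshold and the example recommendations are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    action: str
    confidence: float  # machine's estimated chance of being right, 0.0 to 1.0


def route(rec, threshold=0.9):
    """Send high-confidence recommendations to automation, the rest to a person.

    The 0.9 threshold is an assumption for illustration; in practice it would
    depend on the cost of a wrong decision and the benefit of acting.
    """
    return "auto" if rec.confidence >= threshold else "human"


# Hypothetical recommendations from a smart machine.
recs = [
    Recommendation("reorder detergent", 0.97),
    Recommendation("delist product", 0.55),
]
routed = [(r.action, route(r)) for r in recs]
```

Here the first recommendation would be actioned automatically and the second would be passed to a person for a decision.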

If there is one common thread running through all of these scenarios, it’s the need for humans and machines to collaborate to achieve the best outcome.

Humans are not capable of processing vast amounts of low-level data at a consistent level of quality. But they are good at abstracting knowledge from their experience and transferring this knowledge across domains. To be effective, machines must learn from people. Today, most learning happens in an explicit training cycle involving feedback from human quality control. In the future, the process will be much less intrusive, with machines observing human behavior, learning from it and putting these heuristics into practice. Humans won’t be looking over the shoulder of machines to check their work, but instead machines will be looking over the shoulder of humans to learn how to help them.
