Human supervision is called human in the loop (HTL). The general idea is that by putting humans in the loop we keep the AI human. There are different forms of keeping humans in the loop. By actively involving humans in the design and decision making of AI, we can keep AI from making bad decisions or even going rogue. Human controllers will have the final say in any decisions regarding or made by AI-applications. But "as technologies develop and outperform humans in many critical functions, humans may have little real contribution to the outcomes," say Joachim Meyer and Nir Douer from Tel Aviv University. So how do we tackle this problem?
“So we’re really thinking about augmenting the human experience as opposed to flipping out humans and putting machines, artificially enabled machines, in their place.” (Kay Firth-Butterfield, World Economic Forum)
Augmentation sees AI as an enabler for humans to make better decisions. Augmentation is probably the best way to keep AI ethical – like a helping hand, a co-worker, or an assistant. It sounds great, but will it work? In my opinion, one of the greatest AI-systems that augmented people was IBM Watson for Oncology. This system tried to provide oncologists with scientific information to help them diagnose cancer. There are many reasons why it never took off. In my opinion, an important one had to do with the economics of AI. Most of Watson’s advice was completely in line with the oncologists’ diagnosis. Watson didn’t provide any additional insights. From a medical point of view, this isn’t necessarily a bad thing, but from an economic perspective, it is questionable. Why invest in a multimillion-dollar system when it doesn’t provide added value, other than to reassure a few doctors? In some cases, Watson provided invaluable insights, even saving patients’ lives by suggesting the right diagnosis. But is that worth the money?
Tom Rikert, co-founder and CEO at Masterful AI, pointed out that to get a good return on investment (ROI) on an AI project, you must automate. Autonomous decision making by AI leaves out the costliest factor – human employees. Efficiency will rise. You can make more decisions in less time and at lower cost. That’s what mechanization and automation have always been for. Putting humans back in the loop increases the operating costs of your AI system. We have to establish whether keeping humans in the loop is worthwhile from a financial and risk perspective.
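To make that financial and risk trade-off a little more concrete, here is a minimal sketch of how one might compare the two options. Everything in it is an assumption for illustration: the `Scenario` fields, the function names, and above all the numbers, which are invented and not taken from any real project. It simply adds the operating cost of a decision to the expected loss from wrong decisions.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical cost model for one way of running an AI system."""
    error_rate: float         # share of decisions that turn out wrong
    cost_per_error: float     # average loss caused by one wrong decision
    cost_per_decision: float  # operating cost of one decision (compute, wages)


def expected_cost(s: Scenario) -> float:
    """Expected total cost of a single decision: operating cost plus risk."""
    return s.cost_per_decision + s.error_rate * s.cost_per_error


def review_is_worthwhile(automated: Scenario, reviewed: Scenario) -> bool:
    """True when human review lowers the expected cost per decision."""
    return expected_cost(reviewed) < expected_cost(automated)


# Made-up numbers, for illustration only.
automated = Scenario(error_rate=0.05, cost_per_error=200.0, cost_per_decision=0.10)
reviewed = Scenario(error_rate=0.01, cost_per_error=200.0, cost_per_decision=4.00)

print(round(expected_cost(automated), 2))
print(round(expected_cost(reviewed), 2))
print(review_is_worthwhile(automated, reviewed))
```

With these invented figures, the human reviewer pays for themselves because wrong decisions are expensive. If a wrong decision cost far less, the extra wages would outweigh the reduced risk. A real assessment would also have to account for costs that are hard to put a number on, such as reputational or legal damage.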
Keep in mind that a lot of human jobs in the field of AI are low-paying, tedious jobs. At the moment, researchers are finding out that these jobs aren’t executed as they should be. HTLs don’t get the training, time, or pay to focus on quality decisions and supervision. HTLs are only human, and their biases can carry over into the system.
“Any agent, whether human or machine, will fail at some point. […] If these failures cannot be mitigated properly, policy makers should block the AI interacting with the general public.” (Joost Verdoorn, FeedbackFruits)
Bias is currently one of the biggest issues in AI. Decisions made by AI can be discriminatory. There are several examples of discrimination based on gender, ethnicity, or geography. HTLs are used to detect these discriminatory decisions and rectify them. But new research finds that these humans are also biased. For example, employees who have to control their company’s AI-systems operate in the same organization where the biased data was created, and they could see these biases as business as usual. So how can these employees detect discrimination when they share the same biases? The problem gets even worse when we use AI to be less biased than human decision makers. AI is used in recruitment to select people with much less bias, or even none, compared to the old-fashioned human selection process. But how can humans, in this case, help to systematically find and remove the remaining bias in the AI? Do HTLs need to have superhuman capabilities?
“Broadly stated policies of keeping humans in the loop and having meaningful human control are misleading and cannot truly direct decisions on how to involve humans in advanced automation.” (Nir Douer & Joachim Meyer, Tel Aviv University)
Tasks and roles
Joachim Meyer and Nir Douer from Tel Aviv University are investigating how HTLs can be meaningfully employed to control AI. "Simply demanding human involvement does not assure that the human will have a meaningful part in creating the outcomes, even when important functions are allocated to the human," they state. Designers of AI and AI-related processes should consider how to use human supervision. What do HTLs have to do? What kind of tasks do they have to perform? Can they do it from a knowledge and training perspective? Can you burden them with the responsibility to override AI-based decisions? Are they empowered? What will happen when a correct AI-decision is overruled by a wrong human decision? Who will be responsible for the outcomes? "Falsely holding the human responsible, when having little real influence, may expose the human to unjustified legal liability and psychological burdens," state Joachim Meyer and Nir Douer. When the pressure is on, you can expect lower quality in the human supervisory tasks, as the science of human factors, or ergonomics, discovered long ago.
We also have to consider another very important thing. Is the task of being the HTL executable? Can the supervisory tasks be performed? Are humans capable of performing such tasks and if so, under what conditions? Google has rightly claimed that these tasks have a different nature than normal tasks in a decision-making process. Google is talking about an “AI Systems Operator.” This is a new position that looks more like an operator in the process industry than an employee in an office. An operator is specially trained to get systems back in line when things go wrong. Before that happens, alarm bells will go off when signal values are exceeded. But current AI systems do not send out signals when unusual or wrong decisions are made. How can any human operator stay focused for such a long time? How do you keep your attention alive when nothing in particular happens? With self-driving cars, we’ve already discovered that the reaction time between spotting something unusual and reacting appropriately is too long in traffic. That’s one of the reasons the industry wants autonomous vehicles without any passenger intervention. Always having humans in the loop can have serious implications for the overall function of the system, as Nir Douer concludes.
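What an "alarm bell" for AI decisions could look like can be sketched in a few lines. This is only an illustration under stated assumptions: it presumes the AI reports a confidence score with each decision, and the `Decision` class, the `needs_operator` and `triage` functions, and the 0.8 threshold are all hypothetical. Such scores are not necessarily well calibrated, so a low score is a hint that a decision is unusual, not proof that it is wrong.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Decision:
    """One AI decision, with the score the system assigns to itself."""
    case_id: str
    label: str
    confidence: float  # assumed to be in [0, 1] and reported by the AI


def needs_operator(decision: Decision, min_confidence: float = 0.8) -> bool:
    """Raise the alarm when confidence drops below the limit value."""
    return decision.confidence < min_confidence


def triage(decisions: List[Decision], min_confidence: float = 0.8) -> List[Decision]:
    """Return only the decisions a human operator should look at."""
    return [d for d in decisions if needs_operator(d, min_confidence)]


# Hypothetical example data.
batch = [
    Decision("A-1", "approve", 0.97),
    Decision("A-2", "reject", 0.55),
    Decision("A-3", "approve", 0.81),
    Decision("A-4", "reject", 0.79),
]

for alarm in triage(batch):
    print(alarm.case_id, alarm.label, alarm.confidence)
```

The point of such a design is that the operator does not have to watch every decision. Like the operator in a process plant, they are called in only when a limit value is crossed, which makes the attention problem more manageable but does not solve it.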
“Delegating tasks and decisions to a machine is not bad, even in high stakes settings, so long as people have meaningful choice about doing so, and can revise their decision.” (Google)
As I stated before, we must seriously consider which tasks we’re going to give the HTLs. We should also think about the qualifications they need. The oncologists in the Watson example are experts and their knowledge is on a par with the knowledge of the AI. Furthermore, Watson also gives reasons for its suggestions. But machine learning typically doesn’t do that, so how can we assess whether the computer is wrong and the human is right? Or, conversely, whether the computer is right and the human is wrong? Without insight into the reasoning of AI, judging results will be difficult. It will force the human to repeat the decision-making process and compare the results. But what will happen when the HTLs aren’t trained to control the outcomes, or when their knowledge and experience fall short compared to the AI? In that case, computer decisions cannot be challenged. HTLs shouldn’t be there only so that an AI operator can be blamed and held accountable for things they cannot help.
As long as AI isn’t perfect – will it ever be? – we need supervision over AI. HTL seems to be the most logical way of doing that. How we define these supervisory tasks determines whether controlling AI is successful. We should be aware that the role of “AI Systems Operator,” as Google calls it, is a strenuous, complicated, and responsible one. When designing HTL tasks and roles, we should give psychological, ergonomic, organizational, and financial aspects priority. And we should put proper care into designing these human factors too.
This article has been previously published on Medium.