Training artificial intelligence is very much a case study on humans Pt. 2
One school of researchers proudly professes the belief that the real challenge is making sure artificial intelligence actually executes what you intend it to do – an AI agent that interprets the human agent’s intention in this fashion would understand the nuances of communication and language that lesser variants of AI might not be capable of grasping.
An intuitive understanding of human preferences, mores, rituals, and values is therefore the need of the hour. As technology becomes more sophisticated, however, human agents can no longer be on hand to provide precise instructions, and the AI agent has to think on its feet, making robust decisions on its own in unsupervised, unforeseen scenarios.
Intentions are naturally impacted by the psychological biases and perceptions of the human operator. Human agents might have formed the opinions underlying an intention with incomplete, or worse, incorrect knowledge, or they might have professed an intention rooted in malice. It is therefore ideal to leave wiggle room for flexibility and discretion in expressed intention; moving away from an over-reliance on intentions would be even better.
Revealed preferences are exactly what the name implies. Over time, an artificial intelligence observes a human controller’s behaviour and aligns its actions with what it believes, observes, and deduces to be the human agent’s likes and dislikes.
There are various perspectives on this, but the general consensus is that the AI places higher value on individual behaviour than on expressed opinion. Revealed preferences facilitate real-time, proactive responses to scenarios. By cultivating an understanding of the likes and dislikes of its human controllers, an AI agent could learn to adapt and cooperate with human paradigms.
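As an illustration, the core of revealed-preference inference can be caricatured as frequency counting over a log of observed choices. This is a deliberately crude toy model – the snack options and the `infer_preferences` helper below are invented for this sketch, not taken from any real alignment system:

```python
from collections import Counter

def infer_preferences(observed_choices):
    """Estimate a preference ranking from observed behaviour.

    Each entry is the option the human actually picked; more frequent
    picks are taken as more strongly preferred (a crude stand-in for
    revealed-preference learning).
    """
    counts = Counter(observed_choices)
    total = sum(counts.values())
    # Normalise raw counts into a preference score per option.
    return {option: n / total for option, n in counts.items()}

# Toy log of a user's observed snack choices over a week.
log = ["apple", "chips", "apple", "apple", "chips", "salad", "apple"]
prefs = infer_preferences(log)
ranking = sorted(prefs, key=prefs.get, reverse=True)
```

Note that the model happily learns whatever the human does, which is precisely the weakness discussed next: observed behaviour is taken at face value.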
Revealed preferences have their drawbacks, ranging from the confusion AI agents face with regard to reward functions to the dearth of data on scenarios humans rarely encounter. Revealed preferences are also inherently flawed in that humans routinely hold preferences for things that are hazardous to them – smoking, for instance.
Preferences also change according to the scenario the human agent is in. A homosexual person discriminated against in the 16th century might believe that they are fundamentally “evil” or “flawed”, and a hypothetical AI gathering data from that generation to gain insight into the realities of homosexuality would obtain intrinsically flawed and discriminatory data. In other words, humans are prone to prejudice and cognitive bias – not a good foundation for robust decision-making.
As a counter to the pitfalls of revealed preferences, you have informed preferences: here the artificial intelligence agent carries out what you would desire it to do under the assumption that people are reasonable, rational, and knowledgeable entities. AI agents sidestep the problems of inherent human prejudice and cognitive bias observed in revealed preferences by assuming good reasoning on the part of the human agent and modelling how they would act if they possessed that good reasoning. The AI therefore makes a much more informed decision, fortified by logic and facts rather than by emotion and bias. In essence, the AI bases its value system on a best-case scenario.
Informed preferences are not bulletproof, however. As observed above, humans have preferences for activities, and demonstrate behaviours, that actively harm themselves or others. The underlying logic of informed preferences is fully compatible with behaviour that is continually harmful to the human agent or others, so limiting the artificial intelligence’s spectrum of accepted actions and values remains a prerequisite.
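One crude way to picture such a limit is to filter revealed preferences through an explicit list of disallowed options before the AI acts on them. This is a minimal sketch under invented assumptions – the `harmful_options` set and the preference scores below are hypothetical, supplied here only for illustration:

```python
def informed_preferences(revealed, harmful_options):
    """Filter revealed preferences through a designer-supplied 'harm list',
    approximating what a rational, well-informed agent would choose.

    Harmful options are excluded outright and the remaining scores are
    renormalised -- a crude stand-in for 'assume good reasoning on the
    human's part'.
    """
    safe = {opt: s for opt, s in revealed.items() if opt not in harmful_options}
    total = sum(safe.values())
    if total == 0:
        return {}  # every observed preference was harmful; act on none
    return {opt: s / total for opt, s in safe.items()}

# Hypothetical revealed scores, including one self-harming preference.
revealed = {"smoking": 0.5, "jogging": 0.3, "reading": 0.2}
informed = informed_preferences(revealed, harmful_options={"smoking"})
```

The filter itself is the weak point, of course: someone still has to decide what belongs on the harm list, which is the open question this article keeps returning to.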
The interest perspective is, well, interesting. The AI in this perspective is developed with a view to promoting all that is good for the human controller’s well-being – in other words, whatever is in their best interest. The “good life” is the basis for the AI’s decision-making in this scenario, and what constitutes a “good life” for a human agent sits at the confluence of several social sciences, illuminated by nuanced insight into societal circumstances and norms.
“Well-being” as a term corresponds to what human beings refer to as “quality of life”, “human development”, and “human rights”. If a human being receives the essentials for their nourishment and growth, and their rights and preferences aren’t being violated, you perceive them as being “well.” The subjectivity of this notion is debatable, however, since what constitutes quality of life, human development, and human rights varies at the individual level – some people are likely to be less demanding than others – but the picture is broad enough to support rule-of-thumb inferences about what these things entail.
The benefits of acting in the best interest of human agents are discernible. Deploying AIs that act in the best interest of human beings successfully tackles the aforementioned issues of self-harm and harm to others.
Nuance is where the theory of interest is severely impacted. Acting against the interests of human beings would seem never to be justified, yet there are exceptions to the rule: a human agent who kills a malicious actor in self-defence is not maximizing the best interest of all parties involved, but is morally and legally justified in their actions. Through the act of killing someone in self-defence, they deprive a prospective murderer of their shot at rehabilitation, yet their actions are justified in that they would have paid a much higher price had they not acted. Subordinating individual interests to the interests of the group is, once again, a violation of the tenet that an AI agent act in the “best interest” of humans. Rationally, the decision might have justification, but the reality is that the AI’s actions went against somebody’s best interest.
You already have a good understanding of what human values are. Broadly, values are the moral appraisals of humans within the context of culture and society. Perceptions and beliefs formed against the backdrop of culture and society lead to the creation of values: guidelines about what is acceptable and what is unacceptable. Values account for nuance and context, solving the gamut of drawbacks you observed with the interest theory.
Individual rights and wishes are considered and appraised, rather than communal interests being blindly maximized. Another plus point of a values-driven approach is that it accounts for considerations that might turn out to be significant, if not critical, such as the well-being of future generations and the environment.
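A values-driven decision rule of this kind can be caricatured as constrained maximization: pick the action with the highest group utility among those that violate no individual’s rights. The action names, utility numbers, and the `violates_rights` predicate below are all invented for this sketch:

```python
def choose_action(actions, group_utility, violates_rights):
    """Toy values-driven rule: maximize group utility, but only over
    actions that respect individual rights (unlike blind maximization,
    which would ignore the rights constraint entirely)."""
    permitted = [a for a in actions if not violates_rights(a)]
    if not permitted:
        return None  # nothing acceptable; defer to human judgement
    return max(permitted, key=group_utility)

# Hypothetical scenario: seizing land maximizes raw group utility,
# but violates an individual's rights, so negotiation wins instead.
utilities = {"seize_land": 10, "negotiate": 6, "do_nothing": 0}
decision = choose_action(
    actions=list(utilities),
    group_utility=utilities.get,
    violates_rights=lambda a: a == "seize_land",
)
```

The interesting work is hidden inside `violates_rights`: encoding who holds which rights, and when they are violated, is exactly the nuance the prose above credits the values approach with capturing.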
Taking all these approaches into consideration, the need of the hour is for AI agents to do one of the following three things.
- Act in the common interest, prioritizing the marginalized parts of society in the process;
- Act in a way that approximately, if not precisely, corresponds to human norms and standards of virtue, morality, and goodness, ensuring that the AI would be a humane and moral leader;
- Act based on mutual understanding and the upkeep of group bonds; that is, whatever facilitates individual and group existence.
Who calls the shots?
Now that you have observed theories on how an AI could ground its values and actions, you come to the troubling question of who makes these critical decisions about AI alignment. Who calls the shots on how an AI is designed to act? Is this an issue of public and political welfare on an international scale? Is it more constrained due to the lack of consensus on a “global” moral standard? Should it be aligned with the opinions of the ruling elite?
This is the wild card in determining the values and actions of AI. There needs to be consensus and regulation on AI governance, and you will need it on a global scale sooner rather than later. Experts would go as far as arguing that society should already have such measures in place.
The road ahead is only set to get more complex
As you have observed, AI alignment is a pressing challenge for modern AI. Biases and nuance create many alignment issues. As AI systems and frameworks evolve and become more technologically sophisticated, alignment is set to get tougher, as these frameworks will be applied to an array of tasks you wouldn’t bat an eyelid at in your present-day setup.
Critical activities such as employment, governmental policy, and research will need extra reasoning and an even more nuanced approach, eventually resulting in even more complicated algorithms. You want AI to call out your misleading behaviours as part of the training process, so you are in a position to furnish quality training data. AI should also take care not to exploit human biases to make itself sound more genuine.
AICorespot is at the frontlines of AI innovation. Our AI- and ML-driven technologies run the gamut from cashierless, contactless customer experiences to Store Management, Smart AR, Drone Scan, LabelMonkey, and Data Mink.
AI and ML are extensively harnessed to create experiences that adapt to the user. You can learn more about our product line-up and how we cater to your specific needs by contacting Nitin.email@example.com