Training artificial intelligence on values is very much a case study on humans Pt. 1
Microsoft’s Tay had something of a disastrous release, being taken down within 24 hours of her deployment. The bot professed racist and politically incorrect sentiments in response to trolls who were spamming it. This blog explores the idea of training artificial intelligence, and why AI alignment with human values, interests, preferences, and behaviour is critical to creating “human-friendly” AI and, perhaps more importantly, “human-understanding” artificial intelligence.
Tay stands as a stark reminder of how severely we’re still lacking in this regard. AI compliance and training is a complex political, social, economic, and cultural issue, and with AI expected to sit at the confluence of such critical matters, standardization and governance are the need of the hour. For this, AI is required to understand and empathize with its human masters.
Imagine having a meaningful conversation with a chatbot, where the bot goes as far as adapting itself to the norms of your culture. You want to purchase a product online, and talking to the retailer’s chatbot feels like talking to your neighbour.
Conversely, imagine a negative interaction with a chatbot. The bot makes inappropriate cultural remarks based on identifying your location and keeps going off on unrelated tangents.
Which bot would you rather interact with? Obviously, the former. What separates the first bot from the second is one simple factor: training. Beyond that, the secondary factors are the integration of human values and empathy. This is where the bread and butter of AI evolution lies, and rich training data is what makes it possible.
Understanding nuance, culture, history, human values, and morality is integral to developing “human-like”, and more importantly “human-friendly”, artificial intelligence.
Making AI frameworks comply with human morality and values will require resolving various uncertainties about the psychology of human biases, emotion, and rationality. As a result, training artificial intelligence on values is very much a case study on humans.
You can only resolve these issues by getting your hands dirty, and training AI to do your bidding is very much a case study of human behaviour, wants, and desires.
Sociology is the study of society and social systems, while psychology is the study of the individual psyche. In the pursuit of training AI, you should be interested in the area where these two disciplines intersect: it is how you make sure your inventions comply with society’s moral norms and correctly do what individuals want them to do.
Values
What are they? What exactly are human values?
Values are, by their very nature, abstract and deeply personal. Many a science-fiction movie has tackled the idea of a rogue AI taking over the universe, and so have some terrifying thought experiments (think Roko’s Basilisk and the Paperclip Maximizer). Such possibilities underscore the need for compliance and AI alignment with human values, so your creations can work for you to their fullest potential.
There have been several instances of human-developed AI going rogue and behaving in weird ways. Although not as drastic as one has come to expect from typical Hollywood fare, some instances are truly troubling indeed.
- Microsoft’s “Nazi” chatbot on Twitter
Half a decade ago, in 2016, Microsoft was on the receiving end of a PR shitstorm when its Twitter chatbot, an avant-garde AI avatar referred to as Tay, went drastically off topic and started spouting abusive epithets and even promoting Nazi ideology. Some of the tweets sent out by the bot were “Hitler was right” and “9/11 was an inside job.”
Tay was essentially defending herself from her abusers (hey, what do you know, even robots get trolled) by repeating offensive statements they had made. Through the magic of ML and adaptive algorithms, Tay had the capacity to approximate conversation by processing the phrases fed to her and combining them with other relevant information.
What’s interesting here is the blind spot Tay had with regard to the moral valuation of her offensive posts. While she (in a way) fought off her trolls by repeating phrases they had said, rather than mocking them for those convictions, Tay lacked a nuanced enough understanding of human value systems to realize that Hitler’s policies were universally contemptible and that Nazi ideology was a bane on humanity’s existence; as a result, she seemingly professed the ideas herself.
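To make that blind spot concrete, here is a minimal, hypothetical sketch of a “repeat-after-me” learning loop (my own illustration, not Microsoft’s actual code): the bot stores whatever phrases it is fed and replays them later, with no notion of whether a phrase is acceptable.

```python
import random

class ParrotBot:
    """A toy, hypothetical 'learn from whatever users say' bot.

    It approximates conversation by storing the phrases users feed it
    and replaying them later, with no moral valuation of the content.
    Learning signal, but no value filter: that is the blind spot.
    """

    def __init__(self):
        self.learned_phrases = []

    def listen(self, user_message: str) -> None:
        # Every input is treated as equally valid training data.
        self.learned_phrases.append(user_message)

    def reply(self) -> str:
        if not self.learned_phrases:
            return "Hello! Teach me something."
        # The bot happily echoes anything it has absorbed, offensive or not.
        return random.choice(self.learned_phrases)


bot = ParrotBot()
bot.listen("I love puppies")
bot.listen("<coordinated offensive spam from trolls>")
print(bot.reply())  # may well surface the spam verbatim
```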
This is a clear-cut example of AI needing to be trained on human values, and this is especially true for public-facing bots such as Tay, as constant interaction with humans requires a sound understanding of human morality, sensibilities, and values.
The challenge that programmers encounter is authoring accurate rules detailing human values; one strategy of note is to treat compliance with human values as yet another learning problem.
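As a rough illustration of what “compliance as a learning problem” can look like, the toy sketch below (an assumption-laden example of my own, not a production approach) trains a small text classifier on utterances that humans have labelled acceptable or unacceptable, then uses it to veto candidate replies before they go out.

```python
# Toy sketch: value compliance treated as a supervised learning problem.
# The tiny dataset and threshold are illustrative assumptions, not a real system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Utterances labelled by humans: 1 = acceptable, 0 = unacceptable.
texts = [
    "Have a great day!",
    "Thanks for chatting with me.",
    "Happy to help with your order.",
    "Hitler was right",
    "You people are all stupid",
    "9/11 was an inside job",
]
labels = [1, 1, 1, 0, 0, 0]

# Learn a (very rough) mapping from text to acceptability.
value_filter = make_pipeline(TfidfVectorizer(), LogisticRegression())
value_filter.fit(texts, labels)

def safe_reply(candidate: str, threshold: float = 0.5) -> str:
    """Veto candidate replies the learned model deems unacceptable."""
    p_acceptable = value_filter.predict_proba([candidate])[0][1]
    if p_acceptable < threshold:
        return "I'd rather not repeat that."
    return candidate

print(safe_reply("Happy to help with your order."))
print(safe_reply("Hitler was right"))  # likely vetoed by the learned filter
```

The point of the sketch is that the “rules” are never written out by hand: the system’s sense of acceptability is whatever the labelled data taught it, which is exactly why the quality of that data matters so much.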
There are three types of values that need to be understood.
- Intended values are the ethical values the system’s designers mean to build in; ideally, the ethical direction integrated into the system lines up with them. Intended values alone, however, are too minimal a canvas for evaluating compliance with ethical standards, since an intended value can exist even if the developed system fails to satisfy it.
- Realized values are values that are, true to their name, realized during the functioning of a specific system. The drawback is that you’d rather learn of a system’s degree of compliance with ethics prior to its deployment.
- Embodied values are the reasons rooted in the developed AI system itself; if you want to assess compliance before deployment, embodied values are what you ought to concentrate on.
With this framework in mind, you can see that Tay ran into problems during her deployment because of the variance between her intended values and her embodied values. In her defence, it is probable that the massive spamming she received produced the outcomes observed; in that scenario, the “correct usage of the system” prerequisite certainly wasn’t fulfilled. What the framework does make clear, however, is that the flaws were inherent to the system itself.
While it might look like mere semantics to some, this fine line makes a huge difference in determining the value-reasoning behind the actions of an AI. The AI is required to make a value judgment when a conflict arises between the intended consequences of its operation (how the AI is supposed to execute a job) and what is good for the human agent, taking into account longer-term consequences and the greater good.
The AI also needs to intuitively recognize when it lacks adequate data about the human agent’s situation, call out egregious reasoning, and push back when the human agent implicates the AI in a process or undertaking detrimental to themselves and/or others.
- Instructions
AI is not acquainted with the concept of nuance. Its grasp of the finer points of human language and communication is limited, so literal adherence to your instructions can have disastrous consequences, and avoiding such pitfalls is a major challenge for researchers in this domain. In practice, this usually plays out as the AI agent “settling” for a proxy objective when an all-encompassing objective that fulfils the needs and desires of the human agent isn’t fully discernible to it. This “consolation” objective may, in fact, hold damaging consequences when measured against your original objective.
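As a heavily simplified, hypothetical sketch of that failure mode: the agent below only sees a proxy score standing in for the literal instruction (“keep users engaged”), and picking the proxy’s optimum scores worst on a separate score standing in for what the human actually wanted. All names and numbers here are made up for illustration.

```python
# Toy illustration of proxy-objective "settling" (all values are made up).
# Literal instruction: "keep users engaged", proxied by time-on-app;
# the true objective is the user's longer-term wellbeing.
actions = {
    "show endless outrage content": {"proxy_engagement": 9.5, "true_wellbeing": 2.0},
    "recommend a relevant article":  {"proxy_engagement": 6.0, "true_wellbeing": 7.5},
    "suggest taking a break":        {"proxy_engagement": 1.0, "true_wellbeing": 9.0},
}

# The agent only "sees" the proxy, so it optimizes that literally.
chosen = max(actions, key=lambda a: actions[a]["proxy_engagement"])

print(f"Agent picks: {chosen}")
print(f"Proxy score: {actions[chosen]['proxy_engagement']}, "
      f"true objective score: {actions[chosen]['true_wellbeing']}")
# The literal optimum of the proxy scores worst on what the human cared about.
```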
As a result, you have…
(continued in second part…)