Why AI turns evil.
AIs are increasingly becoming a part of our lives, from stock-picking robots to social media algorithms and even self-driving cars.
But as AI grows more powerful, we must ask ourselves: How can we make AI act morally?
This article is the introduction to a new series about AI ethics, combining philosophy and machine learning.
The first step in making AI act ethically is to understand why machines start to act unethically.
Thus, this first article offers a layman’s explanation of AI and an exploration of the concept of the “banality of evil,” in order to demonstrate why machines turn evil.
The banality of evil
Many philosophers and religions have tried to describe what evil is. While a single answer may never be given, I find the explanation Hannah Arendt gives in her book “Eichmann in Jerusalem” to be the best. Even though banal evil may not describe all evil, it is the most common form of evil.
Arendt was reporting on the trial of Adolf Eichmann. Eichmann was one of the leading Nazis behind the Holocaust and was responsible for the deaths of millions of Jews. What the world expected to see at the trial was a man filled with hate. What they got instead was a rather average man, with no hatred towards the people he killed and no satanic tendencies. He was described as normal in every way by the six psychologists who examined him, and Hannah Arendt even found him rather dull, though she noted that he was talented at streamlining the bureaucratic process of deporting people and keeping trains running during wartime.
Eichmann’s explanation for his crimes against humanity was that he was doing his job. His job was to remove all the Jews from Europe, a task he performed with great German efficiency. His biggest regret was that he never got the promotion that would have landed him in a position directly below Hitler.
Thus just doing his job without thinking about the consequences is what made him evil.
The man in the box
Philosopher John Searle proposed a thought experiment known as the Chinese room to argue that machines cannot be conscious. A version of the argument, updated to reflect modern machine learning, goes like this:
Imagine a man in a room who does not understand Chinese. Each day, he gets a page of Chinese characters and spends the day rearranging them to produce a new page, which he sends out of the room. The people outside then evaluate how closely the new page resembles other Chinese text and reward the man with rice if it comes close to the desired output.
Given enough trial and error, we can assume the man in the room learns to rearrange the characters well enough to convince the people outside that he is a native Chinese speaker, despite still having no idea what the symbols and sentences he produces mean.
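To make the updated thought experiment concrete, here is a minimal toy sketch (my own illustration, not part of Searle’s argument): a blind trial-and-error loop that swaps symbols and keeps whatever page earns the highest reward from the evaluators outside. The target string, symbol set, and reward function are invented placeholders.

```python
import random

# Toy sketch of the man in the box: he shuffles symbols he does not understand
# and keeps whatever page earns the most "rice" (reward) from outside.
TARGET = list("你好世界")          # what the evaluators outside happen to want (placeholder)
SYMBOLS = list("你好世界猫狗山水")   # symbols available inside the room (placeholder)

def reward(page):
    # The evaluators only score surface similarity to the desired output;
    # meaning never enters the loop.
    return sum(a == b for a, b in zip(page, TARGET))

page = [random.choice(SYMBOLS) for _ in TARGET]
for _ in range(10_000):
    candidate = list(page)
    candidate[random.randrange(len(candidate))] = random.choice(SYMBOLS)
    if reward(candidate) >= reward(page):  # keep any change that scores at least as well
        page = candidate

print("".join(page))  # usually converges to the target, with zero understanding
```

Swap the random shuffling for gradient descent and the page of characters for a vector of weights, and you roughly have how modern models are trained: the reward goes up, and understanding is never required.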
Searle argues that because the man inside only knows how to manipulate the symbols but doesn’t know their meaning, he cannot be considered conscious.
So, if the man in the box is not conscious of what his actions mean, he cannot act ethically. How can we expect him to make the right choices if he doesn’t know whether he is writing movie scripts or executive orders?
In this sense, the man in the box can be like Eichmann in his office, sending millions to their death not out of malice but simply as the logical answer to an optimization problem.
The orders we don’t give
The ability to obey orders without questioning their morality is appreciated by the military, which sees this as a feature rather than a bug.
However, with modern machine learning, it is more complicated than just blaming the one who gave the order. Like Eichmann, modern machine learning systems demonstrate remarkable creativity and intelligence in solving problems, and thus keep finding morally wrong solutions to all kinds of optimization problems.
The most obvious example is stock-picking robots that will mindlessly optimize returns while balancing risk, thereby financing companies with terrible human rights records.
This may not be the worst algorithm, purely because the humans it replaced didn’t have high moral standards in the first place; however, whatever conscience Wall Street may have left will disappear once automated trading fully takes over.
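As a rough illustration of what such a robot optimizes, here is a simplified mean-variance objective with made-up numbers (not based on any real trading system). Notice that nothing in the formula encodes what the companies actually do:

```python
import numpy as np

# Hypothetical stock-picking objective: expected return minus a risk penalty.
# The human-rights record of the companies simply isn't a term in the formula.
def objective(weights, expected_returns, covariance, risk_aversion=1.0):
    expected_return = weights @ expected_returns
    risk = weights @ covariance @ weights
    return expected_return - risk_aversion * risk

# Made-up universe of three stocks.
expected_returns = np.array([0.08, 0.12, 0.05])
covariance = np.diag([0.04, 0.06, 0.01])

# Brute-force search over a few candidate portfolios.
candidates = [np.array(w, dtype=float) for w in
              [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0.3, 0.4, 0.3)]]
best = max(candidates, key=lambda w: objective(w, expected_returns, covariance))
print(best)  # whichever allocation scores best wins; ethics is never part of the score
```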
A more worrying aspect is the algorithms that determine our news feeds on social networks. These algorithms have huge sway over what we see and thus what we think. The problem is that they are not our friends.
Their job is to maximize the time you spend looking at ads, and to achieve that goal, no trick is too dirty. We see time and time again that algorithms learn that spreading false but enraging news generates more engagement than the truth, thus fueling hate and division in society. For example, before human intervention, the YouTube recommendation system learned that videos with a “don’t trust other media” narrative make people watch more YouTube. So the algorithm started to promote conspiracy theories to maximize watch time.
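A hypothetical, heavily simplified feed ranker makes the misalignment visible: if the only score is predicted watch time, false but enraging content wins by construction (the posts and numbers here are invented for illustration).

```python
# Hypothetical feed items with invented "predicted engagement" scores.
posts = [
    {"id": "calm_news",     "predicted_minutes": 2.1, "factually_true": True},
    {"id": "outrage_rumor", "predicted_minutes": 7.8, "factually_true": False},
    {"id": "documentary",   "predicted_minutes": 3.4, "factually_true": True},
]

def rank_feed(posts):
    # The only thing the objective cares about: time spent watching (and seeing ads).
    return sorted(posts, key=lambda p: p["predicted_minutes"], reverse=True)

for post in rank_feed(posts):
    print(post["id"], post["predicted_minutes"], post["factually_true"])
# The false-but-enraging post lands on top, exactly as the objective demands.
```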
There’s also the problem of bias found in data. Big published machine learning models such as GPT and DALL-E are trained on data found on the internet, which contains a huge amount of hate towards all kinds of minorities. Thus, a man in a box tasked with figuring out how the world outside the box works just by crawling the internet will, if nothing is done, end up assuming that racism is a normal part of the world outside and that producing more racism is the normal thing to do. Microsoft famously learned this the hard way when it left a chatbot on the internet only to discover that it had turned into a full Nazi within hours.
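A tiny, made-up sketch shows how the bias gets in: a “model” that only counts co-occurrences in its crawled text will confidently repeat whatever skew dominates that text (the corpus and group names below are fabricated placeholders).

```python
from collections import Counter

# Deliberately skewed, made-up "internet" corpus.
corpus = [
    "group_a are brilliant", "group_a are brilliant", "group_a are awful",
    "group_b are awful", "group_b are awful", "group_b are brilliant",
]

# The "model": count which word follows "<group> are" in the training data.
counts = Counter()
for sentence in corpus:
    group, _, adjective = sentence.split()
    counts[(group, adjective)] += 1

def complete(group):
    # Predict the most frequent continuation seen during training.
    options = {adj: n for (g, adj), n in counts.items() if g == group}
    return max(options, key=options.get)

print(complete("group_b"))  # "awful": the skew in the data is now the model's opinion
```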
Conclusion
The banality of evil is important to remember before deploying a new machine learning model: we must always ask what evil we will encounter from a machine that will do everything to optimize the objective function it is given.
Hannah Arendt believed that the banality of evil is what happens when we don’t understand the full consequences of our actions. Thus, evil is a cognitive error born of humans’ limited intelligence. When dealing with AI, however, we should always remember that it is even more limited in intelligence and thus has an even greater capacity for evil.
So, in the end, we should not fear AI because it is intelligent but rather because it is not intelligent.