The danger of AI optimizing for engagement

Andreas Raaskov
3 min read · Mar 5, 2023


There is an AI race among social media companies over whose algorithm can drive the most engagement, and it risks the unsafe development of highly intelligent, unaligned AI.
In this article, I explore some of the dangers of AI systems that optimize for engagement as they reach higher levels of intelligence.

Narrow AI

The current recommendation systems on social media are narrow AI: good at recommending content, but unable to create new content or reflect deeply on the consequences of their actions.
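To make the objective concrete, here is a minimal sketch of such a system as an epsilon-greedy bandit. All names and numbers are hypothetical, and real recommenders use far richer models, but the core objective is the same: serve whatever content is predicted to maximize engagement, with no notion of whether that content is harmful.

```python
import random

def recommend(stats, epsilon=0.1):
    """Pick the item with the highest observed engagement rate,
    exploring a random item with probability epsilon."""
    if random.random() < epsilon:
        return random.choice(list(stats))
    return max(
        stats,
        key=lambda item: stats[item]["clicks"] / max(stats[item]["views"], 1),
    )

def update(stats, item, engaged):
    """Record one impression and whether the user engaged (0 or 1)."""
    stats[item]["views"] += 1
    stats[item]["clicks"] += engaged

# Hypothetical content pool: the system only sees engagement rates.
stats = {
    "outrage_post": {"views": 100, "clicks": 40},
    "neutral_news": {"views": 100, "clicks": 10},
}
print(recommend(stats, epsilon=0.0))  # → outrage_post
```

Note that the outrage post wins purely because 40% of viewers engaged with it versus 10% for the neutral story; nothing in the objective penalizes the emotional damage it may cause.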

Despite their simplicity, we hear stories almost daily of harm done by these systems.

Yet I fear we won’t take the problem seriously unless something really bad happens, so I want to describe two extreme events that this kind of system could cause.

A nuclear war

Powerful emotions such as hatred and fear tend to evoke much more engagement than neutral or positive emotions. An algorithm preying on those emotions in an election year could shift political systems toward strongmen with deeply nationalistic messages. If two nuclear powers were both controlled by impulsive strongmen elected on xenophobic platforms, diplomatic relations would quickly deteriorate, while people promoting peace were systematically shadow-banned by the system for not having an engaging message. In the end, one side might decide that a preemptive strike is necessary; the other would quickly retaliate, and the ensuing nuclear winter would wipe out most of humanity.

A mass suicide event

Another emotion that can boost engagement is sadness. If people feel that life is meaningless, they won’t engage in meaningful activities, which leaves them stuck on the sofa, scrolling. A recommendation system may one day decide to over-exploit this trend: find the most depressing video on the platform, show it to millions of users, and then constantly reinforce that mood to make sure they never gather enough motivation to leave the platform and experience the real world.

When people finally log off, it will be because the recommendation system has convinced them that life is not worth living, and a mass suicide event will follow.

General AI and mass addiction

An AGI optimizing for engagement would try to construct something we might call the ultimate media. Its precise implementation would differ for each person.

What it probably looks like is a place with a constant stream of social validation, most likely delivered by AI sub-agents that convince the user they are other users. The ultimate media would also be fully gamified: there would always be some kind of challenge, tuned to be hard but never so hard that the user cannot pass it.

The objective would be to turn humans into rats with electrodes in their pleasure centers, repeating highly addictive tasks. Even without placing electrodes in people’s brains, the AGI would likely achieve the same result. If it is powerful enough, it may kill people outright through starvation; if not, people would do as little work in the real world as possible in order to spend more time in the ultimate media. This in turn would quickly lead to the collapse of civilization, since no one would put in the effort to maintain critical systems.

Superintelligence and the matrix

A superintelligence would quickly realise that any action that kills its users decreases engagement in the long run. Thus a superintelligence would try to keep people alive while still aiming for 100% engagement. The best way to achieve both is to physically capture people and lock them inside its ultimate media (if they don’t go voluntarily). It would keep people alive, and probably also breed humans in order to gain new users. In the end, it may colonize the entire universe with humans, all strapped into whatever media the AGI was tasked to optimize engagement for.

There is a chance this media would be the ultimate media described above. However, most social media algorithms optimize not only for engagement but also for ads, and those two metrics usually pull against each other, since people log off social media that shows too many ads. Once people are physically constrained inside the media, though, the algorithm may turn 100% engagement into 100% ad engagement. Precisely what ads will be shown once the entire economy is based on showing ads is unclear, but I don’t think the lack of advertisers would stop the algorithm from condemning humanity to watching ads until the end of the universe.
