The AI Wolf That Preferred Suicide Over Eating Sheep

A private musing about an AI experiment gone wrong unexpectedly sparked a debate about culture and ethics in China.

Source: Xinzhiyuan on WeChat

So this happened in China. In 2019, two university students did an AI project that involved a simple ‘wolf versus sheep’ game. The senior member of the team, a Thai national studying in China, left to work in Australia after he graduated, and the project was thus abandoned.

The junior member went on to teach. One day in March 2021, he told one of his students about the initial results of the experiment over text. The student was so tickled by the story that he took screenshots of it and shared them with his friends.

Those screenshots went viral on Chinese social media and became a small sensation.

Better death by boulder than catch sheep

The game was simple. Two wolves and six sheep would be placed at random within a game space by the computer. The wolves would have to catch all the sheep in 20 seconds while avoiding some boulders within the space.

To incentivize the AI wolves to improve their performance, a simple point system was also programmed in.

If a wolf caught a sheep, 10 points were awarded. If it hit a boulder, one point was deducted. To encourage the wolves to catch the sheep as fast as possible, 0.1 points were deducted for every second that passed.
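
As a rough illustration, the point system can be written as a single scoring function. This is only a sketch: the researchers' actual code was not published in the screenshots, so the function name, the constant names and the idea of scoring each wolf separately per round are all assumptions made here. Only the three point values come from the story.

```python
# Point values as reported in the story; all names here are illustrative.
SHEEP_REWARD = 10.0     # points awarded for catching a sheep
BOULDER_PENALTY = 1.0   # points deducted for hitting a boulder
TIME_PENALTY = 0.1      # points deducted per second that passes


def wolf_score(sheep_caught: int, boulder_hits: int, seconds_elapsed: float) -> float:
    """Total score for one wolf in one round under the stated rules."""
    return (
        SHEEP_REWARD * sheep_caught
        - BOULDER_PENALTY * boulder_hits
        - TIME_PENALTY * seconds_elapsed
    )


if __name__ == "__main__":
    # A wolf that catches one sheep after 5 seconds without hitting anything.
    print(wolf_score(sheep_caught=1, boulder_hits=0, seconds_elapsed=5))
```

Note that the time penalty applies whether or not the wolf is making progress, so every second spent chasing costs something.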

The wolves were also given information to help them in their hunt: which direction they were facing, what was in front of them, where the sheep were, their own speed, the speed of the sheep and a whole bunch of other parameters.

The goal was to see if the AI wolves could, through training and retraining, figure out a way to maximize their scores.

After 200,000 iterations, the researchers found that the wolves simply rammed themselves against the boulders to commit suicide most of the time.
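
A bit of arithmetic on the stated point values hints at how this could happen. The sketch below compares two wolves that both fail to catch any sheep. It assumes that hitting a boulder ends the round for that wolf, which the word "suicide" suggests but the story does not spell out; the function name and the 2-second figure are made up for illustration.

```python
def score_if_no_catch(seconds_survived: float, crashed: bool) -> float:
    """Score for a wolf that catches no sheep, using the story's point values."""
    time_cost = 0.1 * seconds_survived     # 0.1 points lost per second
    crash_cost = 1.0 if crashed else 0.0   # 1 point lost for hitting a boulder
    return -(time_cost + crash_cost)


# Wolf A chases sheep for the full 20 seconds and catches nothing.
full_round = score_if_no_catch(20, crashed=False)

# Wolf B rams a boulder after 2 seconds (assumed to end its round there).
early_crash = score_if_no_catch(2, crashed=True)

print(f"no catch, full round: {full_round:.1f}")
print(f"crash after 2 s:      {early_crash:.1f}")
```

Under these assumptions the wolf that crashes early loses fewer points than the one that tries for the whole round and fails, so a wolf that rarely manages to catch a sheep could score better by giving up quickly.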

AI wolves committing suicide by ramming the boulders. Source: Xinzhiyuan on WeChat

