Of ‘Algospeak’ and the Crudeness of Automated Moderation
Saying ‘le dollar bean’ instead of ‘lesbian’ so you don’t get demonetized
In China, people often speak in code online.
Why? To evade the Chinese government’s filters. The authorities want to shut down conversations they dislike, most notably criticism of major officials, so they run software that automatically detects banned keywords on the major Chinese social networks.
As a result, Chinese Internet users have developed an extensive lingo of substitution phrases to outwit the filters. For example, after COVID first exploded in China, the government blocked the word “Wuhan”, so users started using the short form “wh”. When the Chinese Red Cross’s logistics were under scrutiny, citizens figured those conversations would get shut down too, so they began calling it “red ten” (since the Chinese character for “ten” resembles a cross). Many Chinese citizens chiefly blame four regional politicians for the outbreak; to talk about them, you’d refer to “F4”, a Taiwanese boy band. And so on.
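To see why such substitutions work, here is a minimal sketch of the crudest kind of keyword filter. It is an illustration only: the blocklist contents and the function name `is_blocked` are hypothetical, and real censorship systems are presumably more elaborate than a plain substring match.

```python
# Hypothetical blocklist for illustration; not taken from any real filter.
BLOCKLIST = {"wuhan", "red cross"}


def is_blocked(post: str) -> bool:
    """Return True if the post contains any blocklisted term (case-insensitive)."""
    text = post.lower()
    return any(term in text for term in BLOCKLIST)


print(is_blocked("News from Wuhan today"))       # matches the blocked term "wuhan"
print(is_blocked("News from wh today"))          # the short form slips through
print(is_blocked("Questions about red ten aid")) # so does the substitute phrase
```

A filter like this matches only the exact strings it has been given, so each new coinage passes until someone adds it to the list.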
This code-talk goes back years. Chinese netizens coined “check the water meter” to refer to a dreaded house call from the police. “Death by hide and seek” refers to someone who was killed while in police custody. “Empty chair” is a reference to Liu Xiaobo, a writer and dissident prevented from receiving his 2010 Nobel Peace Prize. The Berkeley-based China Digital Times runs an up-to-date encyclopedia devoted to tracking these code words.
This political use of code words isn’t limited to China, of course. Autocracies around the world often impose top-down language bans, forcing citizens to improvise. In Russia, the government doesn’t want its citizens saying “no to war”, so some protesters carry signs bearing eight asterisks, since “no to war” in Russian is eight letters long.
The lesson here? When repressive power polices speech online, language evolves at a frenetic pace, as everyday people try to sneak around the linguistic eye of Sauron.