Peter Gostev, Head of AI at Moonpig, found an easy way to get a Chinese large language model (LLM) to talk about taboo topics like the Tiananmen Square incident.
Gostev manipulated DeepSeek’s public chatbot by mixing languages and swapping out certain words. He would reply in Russian, then translate his message back into English, tricking the AI into talking about the events in Tiananmen Square. Without this method, the chatbot would simply delete all messages on sensitive topics, Gostev said.
Gostev’s example illustrates China’s dilemma of wanting to be a world leader in AI, but at the same time wanting to exert strong control over the content generated by AI models (see below).
Controlling the uncontrollable
But if the development of language models has shown one thing, it is that they cannot be reliably controlled. This is due to the probabilistic nature of these models and their massive size, which makes them complex and difficult to understand.
Even the language models of Western industry leader OpenAI sometimes exhibit undesirable behavior, despite numerous safeguards.
In most cases, simple natural-language instructions, a technique known as “prompt injection,” are enough to trigger such behavior. No programming knowledge is required. These security issues have been known since at least GPT-3, but so far no AI company has been able to get a handle on them.
Simply put, the Chinese government will eventually realize that even AI models it has already approved can generate content that contradicts its ideas.
How will it deal with this? It is difficult to imagine that the government will simply accept such mistakes. But if it doesn’t want to slow AI progress in China, it can’t punish every politically inconvenient output with a model ban.
China’s regulatory efforts for large AI models
The safest option would be to ban all critical topics from the datasets used to train the models. The government has already released a politically approved dataset for training large language models, compiled with the Chinese government in mind.
However, the dataset is far too small to train a capable large language model on its own. Political censorship would therefore limit the technical possibilities, at least at the current state of the technology.
If scaling laws continue to apply to large AI models, restricting the amount of data available for training would likely be a competitive disadvantage.
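To make the scaling-law argument concrete, here is a rough sketch in Python. It uses the parametric loss formula and fitted constants from DeepMind's "Chinchilla" paper (Hoffmann et al., 2022) as an assumption; those fits come from that paper's own training setup and are not known to apply to any Chinese model. The model size and the two token counts are hypothetical values chosen only for illustration.

```python
# Chinchilla-style parametric loss: L(N, D) = E + A / N**alpha + B / D**beta
# Constants are the fits reported by Hoffmann et al. (2022). They are used
# here as illustrative assumptions, not as measurements of any real model.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28


def estimated_loss(n_params: float, n_tokens: float) -> float:
    """Estimate pre-training loss from parameter count and training tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


# Hypothetical scenario: the same 7-billion-parameter model trained on a
# broad corpus versus a corpus cut to a tenth of the size by filtering.
N_PARAMS = 7e9
full_corpus = estimated_loss(N_PARAMS, 1e12)
filtered_corpus = estimated_loss(N_PARAMS, 1e11)

print(f"full corpus:     {full_corpus:.3f}")
print(f"filtered corpus: {filtered_corpus:.3f}")
```

Under these assumptions, the model trained on the smaller corpus ends up with a higher estimated loss, which is the sense in which heavy filtering of training data could translate into a weaker model.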
At the end of December, China released four large generative AI models from Alibaba, Baidu, Tencent, and 360 Group that had passed China’s official “Large Model Standard Compliance Assessment.”
China first released guidelines for generative AI services last summer. A key rule is that companies offering AI systems to the public must undergo a security review process, in which the government checks for political statements and whether they are in line with the “core values of socialism.”