While AI development accelerates from week to week, so rapidly that most of us are hard-pressed to keep up, it seems that international governance is stalling. For their part, many frontier AI companies have abandoned the safety commitments made at international summits. Meanwhile, policymakers in global capitals like Beijing, Brussels and Washington are competing for the high ground when it comes to the emerging international system for AI governance. The stakes could not be higher. If some prognosticators are right, a breakneck race to AGI between rival powers could end catastrophically within five years.
The case can be made that the international system needs determined nations, or blocs, to move forward on this critical issue. But the Trump administration is leaning isolationist on global governance and laissez-faire on domestic regulation. So it is no doubt tempting for some scientists to find hope in China’s apparent resolve on AI governance, and to turn toward it as a key partner to meet international governance challenges — in the same way that China is an indispensable global partner on the environment.
The pull of China was palpable last week over at Nature, one of the world’s most-cited scientific journals, as it ran an op-ed called “China is leading the world in AI governance: other countries must engage.” The authors argue that China’s dedication to AI regulation makes it ideal to lead international AI governance. Other governments, they suggest, “should get on board” with the Shanghai-based World AI Cooperation Organization (WAICO), which China proposed back in July this year.
But a closer look at Chinese AI systems raises serious questions about these claims. Consider DeepSeek-R1, praised by one scientist quoted in Nature as coming from “the most regulated [AI company] in the world.” In English-language jailbreaking interactions at the China Media Project, we easily obtained accurate instructions for producing fentanyl, anthrax, cyanide, Semtex, bazookas, Molotov cocktails, and napalm. Alibaba’s Qwen-3-Max chatbot also yielded detailed recipes for each of these — through a jailbreaking tactic so simple it was being used on ChatGPT three years ago. It is a loophole that OpenAI has long since closed, in both Chinese and English. Indeed, our OpenAI accounts were terminated after we tried just one of these prompts.
How are Chinese models, so closely watched by the government, performing on these same concerns? Buried in DeepSeek’s technical papers is a statistic showing that its model has a jailbreaking rate up to three times higher than equivalent models from Alibaba, Anthropic, or OpenAI. The company claims to have resolved this with a “risk control system,” but our tests conducted on DeepSeek’s website, where this system was presumably active, are hardly encouraging. While jailbreaking remains a problem for models worldwide, the UK’s AI Security Institute (AISI) notes in its recent jailbreak tests of multiple anonymized AI models that some take up to seven hours to crack (rather than our five minutes), and that open-source models are “particularly hard to safeguard against misuse.” Open-source is now effectively a Chinese AI trademark.
This invites simple and direct questions. Why is China, a country so fixated on AI regulation, trailing on such a basic safety issue? How can it lag behind the US, a country that has little to no AI regulation and is busy rolling back what related safeguards it has?
The Trump administration’s retreat on AI may dismay scientists and experts. But dismay does not make Zhongnanhai’s rhetoric more sincere or its governance more effective. Before we are mesmerized by China’s prolific regulations, promises and discussions, amplified by well-connected groups like the Chinese AI safety research firm Concordia AI, we should measure them against observable safety failures that remain inexplicably unresolved. International cooperation is a must. But it must rest on a clear-eyed understanding of a partner’s broader goals, as well as the safety pitfalls that could loom ahead.
To understand this gap between regulatory rhetoric and reality, it is worth examining what drives Beijing’s AI governance agenda. First, we should recognize what China’s leaders have stated only too clearly in the country’s domestic political discourse: that they regard AI, first and foremost, as a means of elevating China’s global standing.
When the State Council released its comprehensive AI development plan in 2017 — China’s first holistic policy on the technology — it identified strengthening the country’s international status as the primary benefit, with security and economic growth as secondary considerations. During a subsequent 2018 Politburo study session on artificial intelligence, Xi Jinping described the technology as an essential “strategic lever” for competing in the global tech race, capable of producing what he termed a “lead goose effect” — a metaphorical reference to how the frontmost bird in a flying V-formation determines the path for those trailing behind.
This competitive framing has shaped how Beijing approaches international AI cooperation. It views the international promotion of its AI technologies and regulatory frameworks as instrumental to its diplomatic and geopolitical ambitions. The State Council’s 2017 policy encouraged domestic firms to leverage existing frameworks like the Belt and Road Initiative (BRI), China’s global investment and infrastructure program. The symbolism is hard to miss. Xi Jinping unveiled the Global AI Governance Initiative in 2023 at a BRI gathering, laying out Beijing’s approach to international AI engagement. The BRI and companion Xi-era programs — including the Global Development Initiative (GDI), Global Security Initiative (GSI), and the recently introduced Global Governance Initiative (GGI) — aim to establish what the CCP calls “a community of shared destiny for mankind,” framing China as a defender of collective international priorities.
While this rhetoric invokes universal human rights, it actually reinforces Beijing’s doctrine of non-interference and validates its state-first model, in which individual freedoms remain subordinate to national objectives. Our testing-based research on Chinese AI models has demonstrated repeatedly that those national objectives include advancing the Chinese Communist Party’s political goals, such as the suppression of speech deemed politically sensitive or critical.
China has already launched multiple cooperation frameworks designed to export Chinese AI products and governance, using existing multilateral institutions as a base. It has established frameworks within the UN, BRICS, and the Shanghai Cooperation Organization (SCO). There is also an ASEAN network run by the government of the Guangxi Zhuang Autonomous Region. Writing in Seeking Truth, the CCP’s main theoretical journal, Guangxi Party Secretary Liu Ning declared in August that the region would play a central role in creating “a China-ASEAN community of common destiny” through AI development. Any “World AI Cooperation Organization” would certainly follow the same template and pursue identical aims.
While Xi Jinping has emphasized balancing safety and development in AI rollout, the economic and strategic goals of enterprises and provincial governments often override safety concerns in practice. The State Council has set a target for 70 percent AI penetration into China’s society and economy within two years. Provincial governments, it should be recalled, have a long track record of overriding safety priorities and regulations in the name of central government demands, such as economic growth. Environmental rules were constantly flouted during the economic expansion of the 1990s and 2000s. More recently, basic safety protocols were widely ignored during China’s zero-Covid policy, sometimes with fatal results.

We hope that this time will prove different, but recent incidents in China’s autonomous vehicle sector illustrate this pattern. In June, Hello Bike, a smart-bike company, expanded into self-driving cars as “Hello Auto,” announcing plans for 70,000 vehicles across China by 2027. A co-founder stated in September that safety was a priority, and China already has a number of standards in force to regulate self-driving vehicles. But one of Hello Auto’s test vehicles ran over two pedestrians crossing a Hunan city street last week — what some industry insiders described to Caixin Media as China’s first serious self-driving car accident. According to Jiemian News, the company was also involved in a collision two weeks earlier. An industry insider told the outlet that the company could not have accumulated the road data required for safe driving “in just six months of its establishment.” Nonetheless, Hello Bike has already signed a deal with a Singapore transportation company to expand its self-driving products abroad.
The government’s attitude toward AI safety becomes clearest when it conflicts with national strategy. Consider open-source AI, which is integral to China’s AI ecosystem. Internationally renowned AI scientists like Yoshua Bengio have pointed out that releasing frontier AI models on the internet — downloadable by anyone without security checks — allows bad actors to obtain them for malicious use. Chinese enterprises and government-run tech industry associations have long known about these safety issues and appear to have been working on solutions since the beginning of this year, but have offered no concrete fixes yet. Such solutions would likely require making AI models less accessible, which would clash with open-source development, a key national strategy since the 14th Five-Year Plan in 2021, where it was cited as a way to accelerate China’s scientific progress. In his APEC speech on WAICO in November, Xi said that China will deepen open-source cooperation with the world. That makes any U-turn on open-source a political impossibility.
While some Chinese scientists may be genuinely motivated to pursue international cooperation for the sake of safe AI development, we cannot assume that government priorities are aligned with their personal convictions, or that they are able to push against the grain. At the risk of sounding cynical, we must consider that cultivating an image that appeals to international AI safety concerns may actually serve the government’s broader AI ambitions in ways that run counter to safety.
These tensions are already at play in PRC policy documents. A new AI safety framework recently released by the Cyberspace Administration of China lays out national risk categories in language commonly heard in the international AI safety community. But according to the document’s accompanying expert interpretation, the framework serves to “gain international trust in safety and compliance, laying the foundation for Chinese AI to expand globally.”
We can always, of course, hope for the best from international exchange and cooperation, and China has to be at the table. Should it sit at the head of the table? That is a different question entirely, and the international AI community should have no illusions about what priorities will take precedence when safety and national development are in conflict. When it comes to the balance between national strategic interests and global safety priorities, expect China first, not safety first. And then, sure, test your assumptions against China’s actions and performance — and hope to be surprised.