When China’s top internet control agency announced late last month that it would launch a security review of the country’s leading academic research database, China National Knowledge Infrastructure (CNKI), citing the need to protect “important data,” the reactions online were broadly of two types. While some wished to know exactly what kind of content concerned the authorities, many others cheered the action, noting how CNKI had drawn ire for many months over its skyrocketing subscription fees and its monopoly hold on the sector.
Both reactions were largely missing the point. The most recent action against CNKI is not about the sensitivity of certain types of data. Nor is it about the fair price for access to data. The issue for China’s authorities is the abundant public access the CNKI database provides to an astonishing range of information. On the face of it, most of the information available through CNKI, drawn from thousands of periodical titles, is non-sensitive. Once made widely available to researchers, however, and once contextualized and interpreted, it can be revealing in ways that unsettle China’s leadership.
How Much is “Too Much”?
CNKI’s most recent troubles were announced on June 24 by the Cyberspace Administration of China (CAC), which said it had met the previous day with management at Tongfang Knowledge Network Technology Co., Ltd., the company operating the CNKI site. The CAC notice said the company had been informed it would be subject to a “security review” (安全审查) on the grounds that its database service, which has been widely used for academic research in China and abroad, “contains large amounts of personal information as well as important data concerning national defense, industry, telecommunications, transportation, natural resources, health and sanitation, finance and other key industry sectors.”
The sheer breadth of the categories listed suggested that addressing these concerns could require extensive removals of articles and sources from the site. The haziness of the matter was captured nicely in the news headlines, such as in a report from the tech-related public account “Leitech” (雷科技), shared at popular portals like Netease, which said simply that CNKI “holds too much sensitive information” (掌握太多敏感信息).
What did this mean?
How, for starters, would the CAC define “personal information”? Academic articles in CNKI, for example, frequently include the e-mail addresses of authors. Would these addresses be regarded as personal information and redacted along with other “important data”? Would entire articles be removed?
Another key question is how sensitive information, “important data,” or unwarranted levels of personal information could have gotten through in the first place, given the fact that CNKI simply aggregates issues from Chinese periodicals across the country, which must already have passed muster prior to publication with the party-state organs supervising them.
For a taste of what CNKI has on offer, consider the Journal of Intelligence (情报杂志), one of the more than 8,500 periodical titles in the database. The publication sounds riveting if intelligence is your game. And fear of intelligence exposures, prompted by media coverage of the CAC notice, certainly motivated the comments of many internet users. “More investigations like this should be imposed on other institutions,” read one comment under a related post by foreign relations expert Jin Canrong (金灿荣) on Weibo. “I don’t want to see our education system infiltrated like this [by foreign threats].”
But the journal, published by the Science and Technology Department of Shaanxi Province (陕西省科学技术厅), an office under the provincial government, is not exactly a leaking tap. Most of the academic articles in the Journal of Intelligence deal with issues of cyber strategy that are common knowledge to those in China or studying it from the outside, such as efforts to gauge “public opinion risk” following sudden-breaking incidents, how influencers are formed online, or simply how to conduct content analysis.
This is where the questions of volume and accessibility come in as the primary points of sensitivity. Even if the trivial details discussed in publications like the Journal of Intelligence are prima facie non-sensitive, they can, taken collectively, help researchers piece together various aspects of Chinese politics, governance, society, industrial strategies and so on.
When, for example, CMP recently investigated how journalists for Chinese state media were using personal accounts on platforms like Twitter to amplify government talking points, this phenomenon could be connected to explicit strategies at outlets like Xinhua, as elucidated in studies available in the Chinese academic literature. These studies spoke of the use of “personal brands” of “individual netizens” (trusted personal account holders with Party-state affiliation) as an effective strategy for external propaganda.
The sensitivity here lies not in the information or data itself, but rather in the research that broad access to this information makes possible. And the volume of information available in CNKI is astonishing. As the largest aggregator of academic electronic resources in China, CNKI covers more than 95 percent of published academic resources. As of January this year, CNKI included more than 8,540 journals in total, with more than 60 million full-text documents.
The issue, then, is not that CNKI “holds too much sensitive information,” but that it simply holds too much information at a single access point, period. The breadth and volume of information potentially available to the public through the platform – all held, moreover, under a single state-owned software company linked to China’s Tsinghua University – has become a source of sensitivity for the leadership.
This is where the issues of pricing and monopoly that have trailed CNKI in recent years intersect with the question of national security.
CNKI came under renewed scrutiny in April after local media reported that the Chinese Academy of Sciences, the country’s top academy for the natural sciences, said it would suspend its use of CNKI, citing high subscription fees. An internal email circulated to CAS members outlining the decision explained that the academy’s subscription fees to CNKI had topped 10 million yuan in 2021.
Such complaints are by no means new. A number of universities in recent years have either discontinued use of CNKI or made their grievances known. In January 2016, the Wuhan University of Technology announced that it would end its subscription to the service, its library noting that the price quoted by CNKI had risen by more than 10 percent each year since 2000. From 2010 to 2016, the university said, the price for use of CNKI had increased by more than 132 percent.
Two months later, Peking University also announced through its school website that it would discontinue use. Accusations of plagiarism and price gouging have plagued CNKI ever since. In 2019, a law student at Suzhou University sued the service over unfair pricing that deprived consumers of the right to choose.
The announcement by the Chinese Academy of Sciences in April this year that it would discontinue use of CNKI was just the latest sign of discontentment within academia. Finally, on May 13, China’s State Administration for Market Regulation announced an anti-trust investigation into the company. The announcement brought praise and sighs of relief from many.
A report from the China Youth Daily newspaper quoted Zhu Jian (朱剑), the former editor-in-chief of the official school journal at Nanjing University, as saying that CNKI had grown into something far more powerful than an information business, giving it a central role in regulating relationships among writers, publications and readers. Another journal editor told the paper that CNKI had become “like the air,” and no one could do without air.
Breaking CNKI’s monopoly should have addressed the range of issues plaguing the platform and its relationship to academia. But then came the June bombshell. Not only was CNKI far too dominant, “like the air.” The authorities now seemed to be saying that there was something foul about the air itself.
If there were real national security issues at stake with CNKI content, why was this coming to light only now?
Part of the answer, overlooked in much coverage, is that the security-related action comes amidst a series of cybersecurity reviews for major tech companies, such as that for Didi Chuxing Technology Co., the ride-hailing company, which was recently completed after a full year. There have also been security reviews for US-listed companies, including the digital freight platform Full Truck Alliance – China’s “Uber for trucks” – and the online recruitment platform Kanzhun.
But CNKI’s case is also different. It is the first company to be targeted since the revision in February this year of China’s Cybersecurity Review Measures (网络安全审查办法), which the CAC said in a release would be the “cornerstone of strengthening national data security.”
According to a report from the 21st Century Business Herald (21世纪经济报道) last month, CNKI is likely to face a very different form of security review from those that faced the likes of Didi. According to Wu Shenguo (吴沈括), the deputy director of the Internet Society of China Research Center, this is because the platform is an immense ecosystem for the flow of intellectual products, or zhishi chanpin (知识产品). This, said the report, made the matter “more complex.”
As the CAC has made clear, one key distinction in the revised Cybersecurity Review Measures is that they require network platform operators, a designation that would include CNKI, to carry out “data handling activities” (数据处理活动) based on the “impact or possible impact on national security.” This essentially provides the basis for a full review of CNKI content on national security grounds.
“Law enforcement may systematically review the cybersecurity status of CNKI,” the 21st Century Business Herald report concluded, “and assess the risk of theft, leakage, destruction, and illegal use of data.”
Given that the content shared through CNKI has already been subjected to various levels of official scrutiny prior to publication, the periodicals themselves being controlled through a strict licensing system, the criterion of potential “illegal use” should be understood to have broader implications, recalling recent restrictions placed on other platforms providing general information to the public. These include the likes of business data search firm Tianyancha – which researchers at MIT were able to use, for example, to identify “active firms based in China producing facial recognition AI” – and China Judgements Online (中国裁判文书网), a collection of judgments and decisions from Chinese courts.
The fact that many databases, like Tianyancha, are not accessible outside of China except by use of a VPN is a further sign of the general sensitivity of public data access points for the Chinese authorities. Access itself, in other words, is a point of sensitivity, and something over which the government intends to exercise much greater control. Sure, there may be real needles of national security concern hidden somewhere in the haystack of CNKI. But the larger point is that the whole haystack, to the extent that it allows the broader public to root around, is a growing problem.
As one source in Shanghai told Radio Free Asia, “The real reason that CNKI is being investigated is not that it holds sensitive information as the reports suggest, but because search platforms for public information, just like China Judgements Online and Tianyancha, enable ordinary people to know how in the past, present and future the government will violate citizens’ personal rights, with power and money deals, academic corruption and so on.”