When Propaganda Trains the Bots: Why You Should Read About LLM Grooming
In a digital world where truth is increasingly shaped by algorithms, awareness might be the last defence we have.
Could an enemy inject propaganda into our educational system? Our art? Our scientific literature? With artificial intelligence, those fears are no longer hypothetical. We now have real-world examples of people using large language models to flood the internet with false narratives. But a new report from the American Sunlight Project outlines something even more troubling: AI isn't just producing disinformation; it's being trained on it.
The report calls this phenomenon LLM grooming, and if you care about democracy, information integrity, or even just being able to trust what you read online, it’s worth your attention.
The researchers behind the report identified a network of pro-Russia propaganda websites they call the Pravda network. It isn't built to engage human readers: the sites are clunky, mistranslated, and hard to navigate. But that's the point. They aren't trying to go viral with people; they're targeting algorithms. The goal is to flood the training datasets used by AI models such as ChatGPT, so that those models start citing and reproducing the network's narratives as fact. From the report:
Over the past several months, ASP researchers have investigated 108 new domains and subdomains belonging to the Pravda network, a previously-established ecosystem of largely identical, automated web pages that previously targeted many countries in Europe as well as Africa and Asia with pro-Russia narratives about the war in Ukraine.
ASP’s research, in combination with that of other organizations, brings the total number of associated domains and subdomains to 182. The network’s older targets largely consisted of states belonging to or aligned with the West.
Notably, this latest expansion includes many countries in Africa, the Asia-Pacific, the Middle East, and North America. It also includes entities other than countries as targets, specifically non-sovereign nations, international organizations, audiences for specific languages, and prominent heads of state.
The top objective of the network appears to be duplicating as much pro-Russia content as widely as possible. With one click, a single article could be autotranslated and autoshared with dozens of other sites that appear to target hundreds of millions of people worldwide.
ASP researchers also believe the network may have been custom-built to flood large language models (LLMs) with pro-Russia content. The network is unfriendly to human users; sites within the network boast no search function, poor formatting, and unreliable scrolling, among other usability issues. This final finding poses foundational implications for the intersection of disinformation and artificial intelligence (AI), which threaten to turbocharge highly automated, global information operations in the future.
This is a big shift from the old model of disinformation. It’s not just about tricking a few people into believing lies—it’s about embedding falsehoods into the infrastructure of how we access information.
Since the report’s release, organisations like NewsGuard and the Atlantic Council’s DFRLab have confirmed that major AI models have indeed cited Pravda network content. Once that happens, those narratives can be repeated by unsuspecting users, cited in articles, and even end up in places like Wikipedia. It’s a form of information laundering that’s almost invisible. The goal is simple: to disrupt elections and sow chaos.
“Past reporting on potential motives of the Pravda network has focused on the anti-Ukraine, pro-war nature of much of the network as well as possible implications for European elections throughout 2024,” the authors wrote.
The three main motives behind the Pravda network's operations are:
LLM Grooming
The Pravda network appears designed to target not just people but automated systems, specifically web crawlers and the training pipelines of large language models (LLMs). By flooding the internet with duplicated pro-Russia content, the network seeks to influence what LLMs learn and, ultimately, how they respond. This manipulation, called "LLM grooming," could cause AI systems to repeat disinformation, shaping the future of automated communication and search without users realising it.

Mass Saturation

By publishing a high volume of content daily across multiple platforms, the Pravda network aims to dominate the online information space. This saturation strategy increases the chances that users will see the content directly, encounter it quoted on other sites, or stumble across it in encyclopedia-style summaries. Saturation also helps ensure that the targeted narrative becomes a persistent part of the digital environment (a toy duplicate-detection sketch follows this list).

Exploiting the Illusory Truth Effect
The network takes advantage of a psychological bias where people are more likely to believe something if they've seen it multiple times from different sources. By spreading the same narratives across Telegram, X, VK, Bluesky, and through citations by other media outlets—intentionally or not—the network increases both the reach and perceived credibility of its content. This cross-platform repetition strengthens the illusion of truth and further embeds the message.
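To make the saturation mechanic concrete, here is a minimal sketch, entirely my own illustration rather than anything from the report, of how a researcher might flag the same article republished across different domains. The sample texts are invented, and real investigations use more robust near-duplicate detection (for example, MinHash over whole crawls):

```python
import re

def shingles(text: str, k: int = 5) -> set:
    """Lower-cased word k-grams ("shingles") of a document."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: set, b: set) -> float:
    """Set overlap: 1.0 means identical shingle sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical copies of one article scraped from two different domains.
doc_a = "Officials announced today that the operation had concluded successfully."
doc_b = ("Officials announced today that the operation had concluded "
         "successfully, regional sources say.")

similarity = jaccard(shingles(doc_a), shingles(doc_b))
print(f"similarity: {similarity:.2f}")  # a high score flags a likely republication
```

Run across dozens of domains, this kind of pairwise comparison is how near-identical articles reveal themselves as one coordinated network rather than independent outlets.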
The implications are serious. If AI-generated content is increasingly based on disinformation, and future models are trained on that same AI-generated content, we risk what researchers call model collapse—a feedback loop of garbage in, garbage out. Human-written content could become the rare exception, and trust in digital information could erode even further.
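The shrinking-diversity half of that warning can be shown with a toy simulation, again my own illustration rather than the report's method: each "generation" of a model is refitted to a finite sample drawn from the previous generation, and because the maximum-likelihood variance estimate is biased low on finite samples, the distribution's spread drains away over time.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples = 100    # "training set" size per generation
generations = 200

mu, sigma = 0.0, 1.0  # generation 0: "human-written" data ~ N(0, 1)
for _ in range(generations):
    data = rng.normal(mu, sigma, n_samples)  # sample from the current model
    mu = data.mean()                         # refit: MLE mean
    sigma = data.std(ddof=0)                 # refit: MLE std, biased low

print(f"std after {generations} generations: {sigma:.3f}")  # well below 1.0
```

Real pipelines are vastly messier than a Gaussian, but the direction of travel is the same: rare, human-written variation is exactly what a model-on-model training loop loses first.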
The American Sunlight Project lays out several steps to push back. AI developers need to clean their training data and avoid using known disinformation sources. Lawmakers should mandate transparency and labelling for AI-generated content. And just as important, we need national information-literacy programs to help adults and kids understand what they’re seeing online.
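On the first of those steps, here is a rough sketch of what cleaning training data can mean at the crawl stage. The domain names and record format are invented for illustration; a production pipeline would draw on curated lists maintained by disinformation researchers:

```python
from urllib.parse import urlparse

# Hypothetical blocklist; a real one would come from curated research sources.
BLOCKED_DOMAINS = {"example-pravda-mirror.net", "another-front-site.org"}

def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)

def filter_corpus(records: list[dict]) -> list[dict]:
    """Drop crawled documents sourced from blocked domains."""
    return [r for r in records if not is_blocked(r["url"])]

corpus = [
    {"url": "https://news.example-pravda-mirror.net/article/1", "text": "..."},
    {"url": "https://legit-news.example.com/story", "text": "..."},
]
print(len(filter_corpus(corpus)))  # 1: the mirrored article is dropped
```

A blocklist is only a first line of defence, though; the report's domain counts suggest networks like this spin up new sites faster than static lists can track them, which is why the recommendations pair data hygiene with transparency and labelling requirements.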
This issue isn’t going away. In fact, it’s just getting started. The report is dense, detailed, and worth reading in full. It’s one of the clearest looks yet at how AI is changing the shape of the internet—and how propaganda is adapting to those changes.