The Hidden Cost of the New Internet

The Hidden Cost of the New Internet
The Wikimedia Foundation is sounding the alarm on generative artificial intelligence — a technology that quietly consumes the resources of Wikipedia and its sister projects to feed large language models, threatening the sustainability of one of the world’s greatest repositories of free knowledge.

As artificial intelligence systems become ever more deeply embedded in daily life, a costly secret is emerging from behind the curtain: to function, these massive models depend on the continuous extraction of information from the web, including indispensable resources such as those of Wikimedia Commons. The warning comes directly from the Wikimedia Foundation, the organization behind Wikipedia, which has documented the accelerating use of crawlers — automated programs designed to index content — that represent not human traffic but an escalating burden on the Foundation’s servers. In a matter of months, download traffic from these bots has surged a striking 50% since January 2023: a rate of growth that first surprises, then alarms.
The situation grows more complex as the Wikimedia community — sustained almost entirely by volunteer labor and open content — confronts an unrelenting wave of automated requests. Although bots account for just 35% of page views on the platform, they generate 65% of its most resource-intensive traffic. That burden slows access to information during peak periods — breaking news events, for instance — and raises a more fundamental dilemma: what becomes of the human user? The growing disparity in content demand threatens to undermine Wikimedia‘s core mission as a guardian of free and accessible knowledge.
The Balance of Resource Use: A Call for Corporate Responsibility
The problem extends beyond resource consumption to the question of data value itself. Major technology companies have been leveraging Wikipedia’s community-generated content to train their generative models. These models — which depend on verified human data to avoid so-called “hallucinations,” the erroneous outputs that AI systems can produce — are drawing vital resources from a platform built on the premise of free knowledge. The conversion of that content into a commercial asset lays bare a fundamental inequity: the corporations that profit most from artificial intelligence are not compensating those who produce the quality material that powers their systems.

Wikimedia’s Chief Product Officer, Birgit Mueller, has stated unequivocally that AI’s purpose should be “to be useful to people.” Yet the technology’s deepening dependence on Wikimedia‘s resources reveals a relationship that is anything but balanced. Companies must find ways to consume content that do not quietly strangle the communities that produce it. The concern is pointed: when users pose a question and a chatbot delivers an immediate answer, the motivation to visit the original source — Wikipedia itself — diminishes. This self-reinforcing cycle threatens the very community that has labored for years to provide accurate, accessible information to anyone who seeks it.
The deepening interdependence between Wikimedia‘s data and the expansion of AI invites a harder question about the future: one in which bots grow increasingly dominant, steadily displacing human users from the information ecosystem. Are we allowing the mission of a free internet to be quietly eroded in favor of an AI-mediated era — one in which access to knowledge is brokered by machines rather than sought by people? The digital future is being built today, and the decisions we make now will determine whether the great commons of free knowledge can endure.


'