"So what, kinda like a bridge, but for AI"?
The conundrums of creating—and understanding—public digital infrastructure.
Next week is the AI Action Summit in Paris, the biggest global AI gathering since the 2023 summit in the UK. Half the people I know are going and the other half are having FOMO. One of the buzzwords there will be “public AI”—basically, the notion that governments need to build their own publicly owned and controlled AI infrastructure to serve societal goals instead of profit motives. (EDIT: It doesn’t have to be just governments.)
If that made you snooze, I’m not surprised. Even the name “public AI” is hardly a rallying cry like “free healthcare” or “save the planet.” Frankly, the people pushing it haven’t done a great job of explaining it in terms anyone but a policy wonk would care about. That’s in part because, surprisingly, there’s no really good historical parallel—no “like X, but for AI.” Certainly not a bridge.
But I think it’s a potentially important idea. So I’m going to try.
The TL;DR: Imagine an entire parallel ecosystem of AI tech, but built and owned by governments. By not being beholden to profit, it might provide things the private AI giants have no incentive to create, like AI tailored for doing research on hard social problems and cutting-edge science, or for serving specific geographic regions and languages. It wouldn’t spew misinformation and hate speech, and would generally try to support a healthy public sphere. And affordable access for all would be guaranteed. Nice idea—but will anyone pay to build it?
First: the problem with private AI
Historically, infrastructure development—think roads, railways, electricity, telephony—has gone like this. A company, or a few of them, start building a thing. But it’s not economic to build the thing everywhere, so access is patchy. If there’s a monopoly, the thing is expensive. If there’s competition, there may be incompatible versions of the thing, like different railway gauges. Perhaps the thing is also dangerous.
Eventually, the government realizes that society really needs the thing. So it steps in to expand access, standardize requirements, reduce prices, and impose safety regulations. It might, for instance, nationalize the thing (electricity in the UK), create incentives for more people to build the thing (electricity in the US), break up monopolies (Standard Oil), or regulate them (railroads). That’s how the thing goes from being just a thing to being infrastructure.
But with 21st-century digital services—think social media, cloud computing, online publishing, AI—things become infrastructure faster than the government can get its socks on.¹ The internet makes access to the thing almost ubiquitous. The price of the thing often starts at zero (aside from the hidden price of providing your personal data). The variations are a feature, not a bug—choose the version you like best. And because AI especially is a general-purpose technology, lots of people think it could be dangerous, but nobody can agree on exactly how, or whether the dangers outweigh the benefits.
So now the thing is already infrastructure, and it’s cheap for users, and choice is good, and the risks are impossible to pin down. So what kind of intervention can a government even make?
This, of course, is a story that suits tech companies very nicely. In fact, there are lots of problems with letting a handful of secretive, immensely rich organizations set the rules for a powerful general-purpose technology, which I probably don’t need to elaborate for readers of Futurepolis.²
Most people think the solution to these problems is regulation. Governments impose all sorts of rules on physical infrastructure companies. If the water supply is contaminated, the water company (hopefully) gets it in the neck. But that works only because those things have easily identifiable failure modes—water gets contaminated, bridges collapse, trains get derailed. It’s much harder to regulate a technology like AI that could take any number of forms and be used in any number of ways.
Another oft-touted solution is open-source or open-weight³ AI models—alternatives to proprietary ones like ChatGPT—which anyone can freely use and adapt. These do have the effect of taking power away from the big AI companies. But they don’t ensure that AI is being used for the public good—they just allow more people to create their own AI products, which could be exactly as bad for all the same reasons.
There’s also much talk about democratic AI governance, which essentially means getting AI companies, funders, and regulators to listen to citizens and incorporate their ideas as to what “good” AI is. This is great, but it does kinda depend on all those institutions being willing to play ball.
Hence the movement for public AI.
What would public AI consist of?
The idea is for governments to build publicly owned versions of the key components of the AI stack, chiefly:
datacenters that public sector organizations and researchers as well as small businesses can use for training and running models
training datasets that anyone can use—what some are calling an “open Library of Alexandria”—which don’t contain junk data, don’t include copyrighted material taken without permission, and can be tailored to specific cultural contexts or uses (e.g. for climate modeling)
truly open-source foundation models that countries, research labs, and companies can adapt and build on, trained on reliable data and with democratic values built in
standards, goals, and governance mechanisms to guide the development of AI in a socially beneficial direction
What would we get out of it?
This is where the lack of a straightforward historical parallel makes it a little hard to explain the point of public AI. When governments have built public infrastructure in the past, it’s usually been to fill a gap left by the private sector. Here the idea is to build a whole alternative system to ones that already exist. Sort of as if you built an entire railroad network alongside the existing one instead of just subsidizing branch lines to remote places.
So you can’t just say this is “bridges, but for AI” or “power grids, but for AI.” However, there are a few different “X for AI” metaphors that can at least explain different facets of it.
It’s a BBC or PBS for AI. Private media outlets are free to take whatever positions they want, including pushing misinformation or specific political views. But many governments created public broadcasters whose mission includes creating shared values and understandings and a healthy civil society. (And sometimes more than that: the BBC actually spurred the adoption of radio in Britain, because its funding was tied to the number of radios sold.) In the same way, public AI could create AI services that promote democratic values and a healthy public discourse instead of being used to spread misinformation or hate.
It’s a CERN or DARPA for AI. Many of the US’s biggest technological innovations in the 20th century came out of the research labs at firms like AT&T, Xerox, and IBM. But those firms still had profits in their sights, not societal goals. DARPA, however, funds research that’s crucial to US national security. CERN pools billions of dollars worth of research funding that individual countries wouldn’t be able to muster alone. Public AI could do the same, giving scientists the means to do cutting-edge research and develop AI models for uses the private sector might not. For example, a specialist medical AI for public-health research, a housing AI to help solve problems of affordable housing, or a legal AI to improve the justice system.
It’s a Post Office for AI. If DHL or FedEx stops serving certain areas or jacks up its prices, the postal service ensures everyone will still have an affordable way to send mail. Right now, anyone who wants to use AI for free has a plethora of options, but will they always? Just look at Twitter to see how a platform can change radically when a new owner takes over. Public AI would ensure that the public always has access to high-quality AI services for free or at a guaranteed low cost.
It’s public utilities for AI. A private company has one goal: to make money. It can be regulated against causing harms—pollution, for instance, or dangerous products—but it can’t be forced to do good. A public utility can be. A public power utility, for example, may have to make enough money not only to cover its own costs but to help maintain the electricity grid. A water utility may be required not just to provide clean water but to fund the sewage system or provide irrigation for public parks. In the same way, a public AI utility might be obliged to encode democratic values in its models or support the creation of public datasets. (Here’s a slide deck with some useful diagrams for this.)
It’s public libraries for AI. The library system ensures anyone can have access to knowledge. Public AI ensures anyone can have access to AI.
There are probably some other metaphors you could pick. Again, though, the point is that public AI is all of these things. That’s what I think makes it hard to get a handle on, because there’s no one description that covers it.
How about a supermarket metaphor?
OK, bear with me. Perhaps the best way to explain public AI is something like this.
There are lots of supermarkets, but a lot of the food they sell is unhealthy or produced in unsustainable ways, and in some places there are food deserts where one chain has a monopoly. So what if the government set up a whole alternative supermarket chain that sold only organic and local produce, foods with low sugar content, nothing too processed, etc., at cost prices, with branches everywhere, and also offered cooking and nutrition classes for free? The economies of scale of that chain would shift incentives for the food industry and boost sustainable and healthy food production, so even the private supermarkets would end up changing what they sell. And all of it would ultimately lead to better health outcomes, lower healthcare costs, higher tax revenues (because healthy people can work), and less environmental damage, more than compensating for any initial outlay on the supermarkets.
In this metaphor, the origins of the food—organic, low-sugar, etc.—are the training data. The food itself is the AI models. The government supermarkets are the datacenters and other physical infrastructure. The private supermarkets and the food industry are the AI private sector. And so on.
When you put it this way, it sounds crazy. No government in its right mind would do this for supermarkets. But maybe they should?
Anyway, I don’t know what to call it. But there’s gotta be something more thrilling than “public AI.”
But will it be built?
There are scattered efforts to build different parts of the public AI stack in different countries. For example, a project called OpenEuroLLM wants to build foundation models for various European languages. There’s Euro Stack, which wants to build “a complete digital ecosystem” for Europe. There are national AI projects in places like Sweden, Switzerland, Singapore, and a couple in the US (though what will become of them in the Trump administration is anyone’s guess). There are reports that an as-yet unnamed foundation for AI in the public interest will be launched in Paris next week.
But it’s nothing even remotely on the scale of what the private sector is doing. Some people are all abuzz about a report from a few days ago that the EU has decided to invest $56 million in an open-source European model (presumably, OpenEuroLLM, though the article doesn’t say). Some point out that China’s DeepSeek reportedly trained its R1 model, which took the world by storm a couple of weeks ago, for just $6 million.
But that number is probably a massive underestimate. Meanwhile, the $56 million is less than a tenth of what Mistral, Europe’s biggest homegrown AI company, raised in a funding round last summer. It’s less than one six-hundredth of the €30-35 billion that one study estimated it would cost to build a “CERN for AI” (and that’s just in the first three years). Never mind the tens or perhaps even hundreds of billions the US is supposedly planning to throw at the “Stargate” project, though I think we should take those claims with a giant heap of salt.
Still, people in the Public AI Network, a loose coalition of policy people and researchers working on this—and whom I must thank for helping me gain whatever meagre understanding I have of this topic—are planning to propose a handful of “moonshots” at the Paris summit next week. The open-source LLM is the main one, followed by the “library of Alexandria” (massive public datasets), the “CERN for AI” (i.e., massive computing infrastructure), and some frameworks for governance and regulation. I wish them luck… and a better name too.
Links
More about why finding the right name matters. “Jenga politics,” “reverse hockey-stick dismantlement,” “arson”… danah boyd tries to find framings to capture the first weeks of the Trump administration. (apophenia)
And yet more. A study of hundreds of words and phrases that Americans of different political persuasions use in different ways. The conclusion: “Americans seem to speak two different ‘languages’” composed of the same words. And a tool that helps you filter out politically charged language from your own writing. (Better Conflict Bulletin)
Europe’s first Gen-Z revolution. Protests led by Serbian youth forced the prime minister out of power, and they organized themselves using modern technological tools for direct democracy—a first for the country. (Marija Gavrilov via Exponential View)
Track the gutting of the US administrative state. Follow Henry Farrell’s list of experts on Bluesky who are keeping tabs. (Bluesky via Programmable Mutter)
How to actually save $2 trillion. It’s not even a crazy figure to shoot for. You just have to understand that government employees aren’t the source of the waste, they’re the solution to it—which, of course, Musk and Trump won’t. (Prospect, and summarized on Pluralistic)
Footnotes
1. In a short but excellent essay, Robin Berjon argues that the proliferation of digital infrastructure faster than governments can react to it “may be the biggest but least recognised shock delivered by the internet.”
2. But in case I do:
Enshittification. There’s a general trend that private-sector tech services get worse as they consolidate their hold on a market, because they lose the incentive to attract more users.
Lock-in. When the service does go to shit, companies try to keep their users from leaving—for example, by making it hard to export their data.
Perverse incentives. The big AI companies’ primary goal isn’t actually to serve their users’ needs. It’s to build the biggest baddest models possible, in the race to get (or so they hope) to artificial general intelligence. This makes them do things like suck up vast amounts of training data that may be copyrighted, or just downright trash.
Lowest common denominator. Nonetheless, private-sector firms do want to get as many users as possible along the way. So they value general-purpose usability over more specific applications aimed at pressing social problems.
Cultural biases. The big LLMs are trained disproportionately on English-language data. While they can answer you in any language you like, those answers are just translations: the underlying content is biased towards what it would be for English-speakers. For other users, it may be culturally insensitive or just plain wrong. That’s because there’s simply less training data available in other languages, and moreover, companies see a diminishing return in training models to serve smaller cultural groups.
No values. LLMs will happily spit out misinformation, hate speech, and justifications for Nazism, if that’s in their training data. Their creators may impose content moderation—or they may not. Just as Meta recently decided to stop fact-checking and allow some forms of hate speech it previously blocked, AI companies may decide it’s in their commercial interests to let their models run riot.
No accountability. Nobody really knows what goes into the building of models like ChatGPT—what data they were trained on or how that data was digested. Relying on them is like trusting a bridge to be safe without knowing what materials the construction company used. If the bridge collapses, who’s going to take the blame?
Concentration of wealth and power. Companies valued at hundreds of billions of dollars with a stranglehold on a technology can make governments do their bidding.
Other negative externalities. Such as climate impacts from building massive datacenters.
3. DeepSeek’s and Meta’s models, for example, are open-weight, despite often being described as open-source. The difference is that open-weight models don’t disclose the data they were trained on or details of the training process, only the outcome—as if you published the blueprints for a building but not the details of the construction process or where the materials came from. More details on this distinction here and here.