The promises and perils of voluntary commitments for AI safety
It's a better place to start than you might think, but stronger measures will soon be necessary
I’m writing this from Paris, where the AI Action Summit is underway and top tech companies are rolling out their latest voluntary AI commitments.
For example, OpenAI released an update on their progress towards their commitments, DeepMind updated their Frontier Safety Framework, Meta launched their Frontier AI Framework, Microsoft launched their Frontier Governance Framework, Amazon launched their Frontier Model Safety Framework, and xAI made a Risk Management Framework. Furthermore, this wasn’t just an American activity - G42, the UAE-based AI company, also released a Frontier AI Safety Framework. (…Yes, there does seem to be some sort of rule that no two frameworks can have the exact same name. Seems important to get yours in fast before all the good names are taken!)
The flurry of announcements suggests the industry takes these voluntary commitments seriously, and I applaud companies for that. But their true value lies not in the promises themselves, but in how they fit into the broader picture of how industries and governments ensure safety. How will this work? Upon review, there are good reasons to be both optimistic and pessimistic.
The Optimistic Case for Voluntary Commitments
The case for voluntary commitments rests on several key advantages that are often overlooked by critics.
Doing crimes with AI is already illegal
When a political consultant used an AI-powered deepfake to clone President Biden’s voice and discourage voters from participating in the New Hampshire primary, it was a novel use of AI that election law had not yet been tested against. However, the Federal Communications Commission zeroed in on the fact that, under the Telephone Consumer Protection Act (TCPA), it’s already illegal to send robocalls using prerecorded voices without consent. Although this law was passed in 1991, well before AI deepfakes, it was still sufficient to result in a $6 million fine against the consultant for the deepfake and a $1 million fine against the service provider that transmitted the calls for failing to properly verify caller IDs.
These enforcement actions demonstrate that criminal behavior mediated by AI is still criminal behavior, even if the laws were not written with AI in mind. If someone were to use AI to build a bioweapon or mount a cyberattack, they would be held liable under existing laws that already make such things illegal. Of course, actually enforcing these laws and containing the damage is another matter, so the problem is far from solved. But neither is AI a wild west where anything goes.
Good anticipatory regulation is hard
A core challenge in all regulation - not just AI - is that writing good rules is genuinely hard, especially when you have to anticipate the challenges in advance. Current AI regulation involves aspects of the technology that are poorly understood and harms that are not yet well studied. Moreover, the technologies being invoked to guard against these risks are themselves still in their infancy.
Furthermore, regulations involve making important trade-offs. Regulations that are too restrictive might prevent minor immediate harms but could block beneficial innovations that would create much greater long-term value. This is arguably the case with autonomous vehicles, where driverless cars are likely already safer than human drivers but are being held to very high standards and denied a fast rollout.
And we could face worse. Consider the National Environmental Policy Act (NEPA) as a cautionary tale. Despite the best of intentions towards protecting the environment, NEPA’s bloated review process has stalled clean energy and infrastructure projects for years, with full environmental impact statements sometimes taking over 4–5 years and a “litigation doom loop” that drags construction out even longer. This costs us billions of dollars in lost opportunity and discouraged investment, and ultimately harms the very environment the law was supposed to help.
The challenge is finding the sweet spot between enabling innovation and ensuring adequate safeguards, all while lacking historical precedent for knowing how different proposals will affect these trade-offs.
Demonstrating that a practice works in a voluntary setting is crucial for its eventual codification into law: it provides precedent and evidence that a particular regulation will be workable and won’t backfire.
Voluntary commitments reflect a typical regulatory process
Rather than imposing rigid rules immediately, governments can use voluntary commitments to establish desired outcomes while giving companies flexibility to develop practical solutions. This approach recognizes both the urgency of AI safety and the complexity of regulating rapidly evolving technology.
In fact, this is a common path for developing safety standards for industries. When aviation was first developing, airlines voluntarily adopted safety practices and technical standards through industry associations, which later formed the foundation for formal FAA regulations.
Today's rigorous aviation safety framework didn't appear overnight - it evolved through decades of industry experience, voluntary standards, and gradual formalization into law. It begins with individual companies establishing internal policies, like Anthropic's Responsible Scaling Policy. These individual efforts grow into broader industry-government collaborations, like the White House's Voluntary Commitments or the Seoul Commitments, which saw governments setting broad safety goals without immediate legal mandates. As companies figure out how to meet these goals and these practices prove workable, they crystallize into industry best practice. Then the cycle can complete - if industry best practice isn’t widely followed or is deemed insufficient, the practices can be codified into binding regulations - potentially with additional requirements beyond what companies initially volunteered.
Thus voluntary commitments aren't a sign of weakness, but rather a crucial stage in a well-established process.
Voluntary commitments can be the basis of liability law
Even without formal legislation, industry best practice, often born from voluntary commitments, can form the basis of negligence lawsuits, creating an incentive pushing companies towards safety.
In Tort Law and Frontier AI Governance, Matthew van der Merwe, Ketan Ramakrishnan, and Markus Anderljung argue that the current tort system can already provide a framework for holding AI companies accountable for damages caused by their systems. This approach helps fill some of the regulatory gap and complements future AI-specific regulations, providing immediate incentives for responsible AI development.
The authors outline three specific scenarios where tort liability might apply: careless development (e.g., a company's AI creating a computer worm that causes widespread damage), AI systems pursuing goals in harmful ways, and failing to implement reasonable safeguards against malicious use. If a company is being clearly careless in their development, they can already potentially be held liable for what their AI systems do.
But this also has an important connection to voluntary commitments. When courts need to decide if a company was negligent, they often look at what other similar companies typically do. This is because "reasonable care" isn't defined in a vacuum - it's based on what experts and professionals in the field consider necessary and appropriate.
Thus, when companies across an industry start consistently doing something to address a recognized risk or problem (such as adopting voluntary commitments), when experts and professionals in that field generally agree it’s a good approach, and when the practice becomes common (though not necessarily universal), it comes to be considered an industry best practice. At that point, it can potentially be invoked in negligence lawsuits.
Voluntary commitments come with further implicit threats and accountability
When a company makes a voluntary commitment, it knows that governments and the public are watching. If the system works as designed, backsliding on that commitment can lead to public outrage and government action. This implicit threat serves as a powerful incentive for making self-regulation more than a “fig leaf”. Voluntary commitments create reputational stakes that companies take seriously.
But critically, it does depend on governments - and people - paying attention.
The Pessimistic Case for Voluntary Commitments
However, while voluntary commitments offer meaningful advantages, they also face limitations, especially as AI capabilities advance. These challenges call into question whether the traditional path from voluntary standards to regulation can keep pace.
Voluntary commitments risk “safety washing” and backtracking
In 2018, following employee protests over Project Maven, Google announced it would no longer develop AI for use in weapons or surveillance. However, in 2021, Project Nimbus, a $1.2 billion contract with Israel, suggested Google was pivoting back towards national security applications, and employees were terminated for protesting it. This shift continued this year, with Google officially removing their commitment to avoid AI use in weapons and surveillance, citing competition and a complex geopolitical landscape.
Perhaps this shift was merited, and if so that would be an example of how the flexibility in voluntary commitments is generally helpful. But the pattern extends beyond just Google. Companies frequently make bold ethical commitments during periods of public scrutiny or concern, only to quietly walk them back when market conditions or strategic interests shift.
The problem is that companies can withdraw from voluntary commitments whenever they want, for whatever reason they want, and face minimal immediate repercussions - potentially no repercussions at all if governments, civil society, and the public don’t actually work to hold them accountable.
Companies face a lot of bad incentives and fall prey to a “Prisoner’s Dilemma”
While individual companies may choose to abandon commitments, this behavior is part of a broader systemic challenge — companies face intense competitive pressure to develop and deploy AI systems quickly. Even if everyone would be better off moving more carefully, individual companies may feel compelled to cut corners on safety to avoid falling behind.
This creates a classic prisoner's dilemma: if all companies stick to strong safety practices, everyone benefits from reduced risks and sustained public trust. But if some companies ignore safety while others maintain it, the “defectors” can gain significant market advantages - they move faster, spend less on safety measures, and potentially capture critical market share or technological leads. This creates pressure for everyone to defect from safety commitments, leading to a race to the bottom.
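To make the incentive structure concrete, here is a minimal sketch in Python with purely hypothetical payoff numbers (the choices and values are illustrative, not drawn from any real company or dataset): whichever choice the other company makes, cutting corners looks individually better, even though mutual safety beats mutual defection for both.

```python
# A stylized two-company "safety race" payoff matrix with hypothetical numbers,
# illustrating why individually rational choices can erode collective safety.

# Each company chooses to "maintain" its safety commitments or "defect" (cut corners).
# Payoffs are (company_a, company_b); higher is better for that company.
PAYOFFS = {
    ("maintain", "maintain"): (3, 3),  # shared benefit: reduced risk, sustained public trust
    ("maintain", "defect"):   (0, 5),  # the defector gains speed and market share
    ("defect",   "maintain"): (5, 0),
    ("defect",   "defect"):   (1, 1),  # race to the bottom: worse for both than mutual safety
}

def best_response(opponent_choice: str) -> str:
    """Return the choice that maximizes a company's own payoff, holding the other's fixed."""
    return max(["maintain", "defect"],
               key=lambda mine: PAYOFFS[(mine, opponent_choice)][0])

if __name__ == "__main__":
    for opponent in ["maintain", "defect"]:
        print(f"If the other company chooses '{opponent}', "
              f"my best response is '{best_response(opponent)}'")
    # Defecting is the best response either way, even though
    # (maintain, maintain) leaves both companies better off than (defect, defect).
```

The point of the toy model is not the specific numbers but the structure: as long as defection pays off individually regardless of what others do, voluntary restraint alone is unstable.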
This is exactly the kind of situation where government regulation is typically needed. Just as labor laws prevent a race to the bottom on worker protections, AI safety may require external enforcement to overcome these destructive incentives. This suggests that while voluntary commitments are valuable as an interim step, they likely need to evolve into binding regulations to fully address these coordination challenges.
There are limited opportunities for iteration when the risks are high-stakes
These competitive pressures become even more concerning when we consider that AI development may not give us second chances.
Traditional industries can learn from failures - for instance, the 1911 Austin Dam collapse tragically killed 78 people and led directly to Pennsylvania's first dam safety law. But if the dam had somehow killed everyone on Earth, there wouldn’t have been anyone left to implement the next round of dam safety measures.
The idea of AI being catastrophic may appear “sci-fi”, but there are legitimate reasons to be worried. The International AI Safety Report — authored by 100 leading AI experts and commissioned by the UK government and supported by over 30 nations plus the UN and EU — highlights how future advanced AI systems pose critical national security risks. Future AI could potentially attack critical infrastructure, enable novel bioweapons, or even become outright uncontrollable.
A dam failure is tragic but local; future AI could cause simultaneous global harm affecting billions. Traditional industries also give warning signs - a dam shows structural stress before failing - but once a powerful AI system starts behaving unexpectedly, it might already be too late to contain.
This combination of catastrophic potential, rapid speed, and complex failure modes means we may not be able to rely on the traditional regulatory approach of “try, fail, learn, and improve”. By the time we observe a major failure, it could be too late.
AI might be moving too fast for voluntary commitments
Another concern is that the traditional voluntary-to-regulatory pipeline works well when technological change is relatively slow. For example, the aviation industry had decades to develop safety standards. But AI capabilities are advancing at a blistering pace - we went from GPT-2 to GPT-4 with massive capability jumps. And AI regularly goes from random guessing to superhuman performance on tasks in just 2-3 years.
This rapid advancement means regulators will struggle to keep pace. In “Time's Up for AI Policy”, Miles Brundage, the former Head of Policy Research at OpenAI, argues that advanced AI systems surpassing human capabilities across domains could arrive in just a few years, while our existing policy frameworks and capacity for response remain critically unprepared. If time truly is up right now, we’re in big trouble.
This suggests that while voluntary commitments remain valuable, we may not have the time to see them through. We may have to step in with stronger guardrails earlier than we normally would, and accept the risk of bad trade-offs.
Conclusion: Threading the Needle
Ultimately, the question isn't whether voluntary commitments are good or bad - it's how to best use them as a tool, and how to ensure they match the magnitude and urgency of the challenge ahead. AI isn’t like other technologies, and our approach must reflect that.
We need to applaud and support voluntary AI commitments. They’re a great first step for companies, and they give people who care about risks from AI a basis for seeing where companies stand and pushing for more. Summits, such as the Paris AI Action Summit, play an important role in enabling governments to convene, take stock of where AI is, and decide how to craft policy.
We also shouldn’t rush to regulate, especially where we don’t yet understand the risks. Generally, the free market already has reason to work safety into its considerations, since companies that don’t risk public backlash. But with national security at stake in AI, government cannot be too careful.
We need to keep this up, and we need more. Governments must further develop the technical expertise to evaluate AI risks and be prepared to step in with meaningful oversight. Civil society must continue to maintain pressure for accountability and help the public understand the stakes. And the public must pay attention and hold industry, the government, and civil society accountable.
It’s on all of us to pay attention, figure out what we care about, and figure out how to ensure that those needs are met. The voluntary commitments emerging from the Paris AI Action Summit represent an important step in this evolution. But they must be viewed as a beginning rather than an end - a foundation for building the more comprehensive governance needed to truly address the unprecedented challenges of advanced AI systems.