31 Comments
Kat Woods:

Why do you say they're advocating for a "global surveillance state"?

They're advocating for regulating one particular thing at a global level, much like we already have global regulations around uranium or emissions. That doesn't make our current world a "global surveillance state".

Global surveillance state makes people think a world government and surveillance over *everything*. Not international treaties monitoring a very specific thing that the vast majority of people would never have anything to do with in the first place.

Peter Wildeford:

You can't get control at the level of 8x H100s, with the high reliability the book seeks, using a system similar to the ones we use for regulating uranium or HFCs. There are significant disanalogies that I don't think people grapple with enough.

8x H100s is only ~$240-320k of equipment that fits in a briefcase and can be set up in a closet, with negligible power and thermal signatures. Unlike uranium enrichment (which requires massive centrifuge cascades with distinctive power/thermal signatures) or industrial emissions (from fixed, visible facilities), this is fundamentally undetectable without invasive monitoring.

Consider: there are likely 100,000+ organizations worldwide with this compute level right now. Universities, startups, even wealthy individuals. The chips themselves weigh ~40 pounds total. Previous-gen equivalents (80x A100s) would also work. And enforcement will only get harder as compute continues to miniaturize and diffuse. The challenge will be more like "prevent anyone from acquiring 8 specific iPhones" than "control uranium."
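
For rough orders of magnitude, here's a minimal back-of-the-envelope sketch of those numbers; the per-unit price, weight, and A100:H100 throughput ratio are assumptions, not official specs:

```python
# Back-of-the-envelope check on the "8x H100" threshold.
# Every input here is a rough assumption, not an official spec or a figure from the book.

UNIT_PRICE_USD = (30_000, 40_000)  # assumed per-H100 price range
UNIT_WEIGHT_LB = 5                 # assumed per-card weight
A100_PER_H100 = 10                 # assumed rough training-throughput ratio

cluster = 8
print(f"Cost: ${cluster * UNIT_PRICE_USD[0]:,}-${cluster * UNIT_PRICE_USD[1]:,}")  # ~$240,000-$320,000
print(f"Weight: ~{cluster * UNIT_WEIGHT_LB} lb")                                   # ~40 lb
print(f"Prev-gen equivalent: ~{cluster * A100_PER_H100}x A100")                    # ~80x A100
```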

To achieve the reliability MIRI seeks, you'd need a complete prevention of private compute clustering and vast controls on the electronics market. This isn't like the Montreal Protocol tracking chemical plants or IAEA monitoring uranium facilities. It would require unprecedented visibility into private computing infrastructure. Maybe "surveillance state" is hyperbolic, but the practical enforcement requirements would be extraordinary.

Greg Colbourn:

>fundamentally undetectable without invasive monitoring.

All you need to do is (globally) regulate the (small number of) manufacturers and force them to build remote monitoring* and remote kill switches into the devices. Monitoring can be mostly automated. In the book they don't say "ban people from assembling more than 8 GPUs"; they say such clusters need to be monitored for what is being run on them ("say that it is illegal to have nine GPUs that powerful in your garage, unmonitored by the international authority").

EDIT to add: I realise now that there are also all the existing chips to deal with, and that does complicate things a bit. But recalls and retrofits can happen (as they already do for various products for safety related issues, e.g. cars).

Also: another thing that could offset a lot of the need for draconian measures is a global taboo on AGI/ASI. If it's just completely taboo to have illegal GPU clusters in your garage -- on the level that it is of having an illegal human cloning lab in your garage now -- then things will be a lot easier.

*for all we know, the NSA might already do this?

Peter Wildeford:

"All you need to do" makes it sound much easier than it is. As you mention in your edit, you also need to thoroughly confiscate all the millions of existing chips that don't support the monitoring and make sure you don't miss any. This will also be a large undertaking, to put it mildly. Chip smuggling is already an illicit enterprise that operates at the scale of hundreds of thousands of chips annually.

Additionally, you have to _build_ the remote monitoring and remote kill switches, which is an issue because monitoring at the scale and sophistication needed to intercept AI training runs while permitting other forms of computing is not currently technically possible. (The NSA definitely does not and cannot do this.)

Also this gets to my point -- "We don't want a surveillance state, we just want the government to be able to monitor the contents of all computers across the entire planet and remotely shut down your computer at will" is a bit suspect to put it mildly.

I agree that getting widespread buy-in such that 99%+ of the world's population completely and overwhelmingly buy into the taboo and would not do it and would report neighbors who do is both necessary and hugely helpful for such a ban. I also think the ban could maybe work if it was less thorough (say only target preventing any clusters of >1 million H100-equivalents).

Greg Colbourn:

>"You also need to thoroughly confiscate all the millions of existing chips that don't support the monitoring and make sure you don't miss any. This will also be a large undertaking, to put it mildly."

Is it really any bigger an undertaking than the product recalls that already happen? Or, e.g., gun bans that have happened in places like Australia (which involved extensive government buy-backs*)? And you don't need to mop up every last GPU. Diehard e/accs clinging to their single GPUs won't be much of a threat (they'd need a lot of coordination to become one).

>"Chip smuggling is already an illicit enterprise that operates at the scale of hundreds of thousands of chips annually."

Only because the authorities don't sufficiently care to enforce the (already weak) export control laws. Especially now with the US gov basically permitting NVidia to do what they want re selling to China.

>"..we just want the government to be able to monitor the contents of all computers across the entire planet and remotely shut down your computer at will"

Doesn't the NSA basically already do the first part re communication? And no one really cares / feels like they are in a totalitarian state.

>" I also think the ban could maybe work if it was less thorough (say only target preventing any clusters of >1 million H100-equivalents)."

For how long? Maybe a few years, tops, with improvements in algorithms?

*what if people were offered, say, 50% more for their GPU than what they bought it for? I think most people would be happy with that.

Peter Wildeford:

I think you haven't thought through the implementation details enough here and are instead pattern-matching to superficially similar but fundamentally different regulatory regimes.

On Australian guns - even with mandatory participation, registration requirements, and a generally law-abiding population supportive of the policy, they were still far from collecting every last gun. While the buyback was generally effective, estimates suggest a lot of illegal guns are still circulating - the buyback potentially didn't even get the majority of guns. This is very far from the level of confidence needed to prevent anyone from having 8 GPUs. If the rule were "if anyone in Australia had 8 guns, everyone dies", then we'd definitely be dead. And collecting registered firearms in Australia is very different from tracking tens of millions of unregistered chips globally.

On product recalls - product recalls typically get 10-30% response rates, maybe 60% at best for serious safety issues. And that's when the product is known to be defective, creating a clear incentive to return it. GPUs aren't defective products - they're valuable assets a decent number of people would want to keep.

On the NSA - no, they definitely do not monitor computational processes on private hardware at scale. They do metadata collection, which is orders of magnitude easier to do and orders of magnitude less invasive.

The "for how long" question about million-GPU clusters is valid but changes the subject rather than addressing the enforcement impossibility at 8 GPUs.

To re-up my main point, you need something very unprecedented to ensure all 8+ H100-equivalent clusters are tracked. I think this unprecedented effort would have to involve a powerful surveillance state. Maybe this is necessary to avoid extinction, but if so that would be very sad and bleak and I think we should acknowledge that rather than dance around it and claim it would be fine. I still want to find a different plan.

Greg Colbourn:

Some other relevant examples:

The seizure of gold by the US federal government: https://en.wikipedia.org/wiki/Executive_Order_6102

The (initial) registration of firearms: https://en.wikipedia.org/wiki/Federal_firearms_license

The (initial) registration of motor vehicles: https://en.wikipedia.org/wiki/Vehicle_registration_plates_of_the_United_States_for_1901

And high-end GPUs (H100 and above) are probably not much more common, in terms of the number of people/companies who own at least one, than cars were in 1901.

Actually, it would be interesting to see just how many individuals/corporate bodies would be affected by such an "8 H100+ GPU" rule passed today. Probably not more than a few tens of thousands, right? And it probably wouldn't take much work to figure out where 99% of them are just by OSINT.
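
For what it's worth, here's a crude Fermi sketch of that guess; every input below is an assumption for illustration, not a sourced figure:

```python
# Crude Fermi estimate of how many entities an "8+ H100" rule would touch today.
# Every number below is an assumption for illustration, not a sourced figure.

total_h100_class_units = 4_000_000   # assumed H100-class units in the wild
hyperscaler_share = 0.8              # assumed fraction held by a handful of big labs/clouds
avg_other_cluster_size = 32          # assumed average cluster size among everyone else

units_outside_big_players = total_h100_class_units * (1 - hyperscaler_share)
entities_affected = units_outside_big_players / avg_other_cluster_size
print(f"~{entities_affected:,.0f} entities affected")  # ~25,000 under these assumptions
```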

I think the NSA do more than that. There is an established history of them having back doors into computer and communication systems (a big part of which came to light with the Snowden leaks): https://ethanheilman.tumblr.com/post/70646748808/a-brief-history-of-nsa-backdoors

I don't think the surveillance state needs to be any more powerful than it already is (and it already is quite powerful - again, without the vast majority of people caring that much - re numbers of CCTV cameras, satellites, the NSA(!) etc).

David Spies:

> Today’s AIs are great at understanding why this is a bad idea and constructing a much more coherent strategy, and much worse at actually carrying out the strategy — today’s AIs would know not to order several tons of raw meat but also would not be able to even if they wanted to. This is an important inversion that gives some room for hope.

Yudkowsky's point (which he makes very, _very_ clearly in the book) has always been that ASI won't _do_ what we want, never that it won't _know_ what we want. An AI that is asked to order sandwiches and interprets that as "order raw meat" is not a super-intelligence. In fact it's quite stupid. This is a book about super-intelligence.

An ASI that, through some quirk of training, decides it likes large amounts of raw meat, if asked to order sandwiches, will order sandwiches. Then later, when you're unprepared, it will kill you for your meat (or actually it will kill you because you're standing in the way of the meat it wants).

Why do you repeat bad points which Yudkowsky clearly addressed? It almost feels like you only read half the book.

> A truly superintelligent being might develop goals that are completely incomprehensible but nonetheless non-competitive with humanity.

Page 89, "They won't leave us alone", _directly_ addresses this. If you found that section unconvincing, you could at least explain _why_ you found it unconvincing. But it's not a particularly good sign if you don't even acknowledge or respond to it.

Steve Byrnes:

I find the section “A global surveillance state would also be really bad” puzzling. I think you totally misunderstood that part of the book!

For example, “concentrate AI development in just one place run by a single world government” is very explicitly NOT something that the book advocates. The book advocates that AI development should not happen anywhere at all. Not in a centralized location. Not anywhere else either. Right?

Maybe you were confused by the sentence “All the computing power that could train or run more powerful new AIs, gets consolidated in places where it can be monitored by observers from multiple treaty-signatory powers, to ensure those GPUs aren’t used to train or run more powerful new AIs.” That’s very different from what you said. I think the authors are imagining that there will be USA data centers with monitors from China and elsewhere making sure that no development towards ASI happens there, and there will likewise be China data centers with monitors from USA and elsewhere making sure that no development towards ASI happens there either, and ditto in France, etc.

As for surveillance … Right now, people and companies and (many) states are not allowed to enrich uranium. If they try, intelligence agencies are likely to notice, and work aggressively to prevent it, including with threats and so on. But that doesn’t mean we live in a “totalitarian” “global surveillance state”, right? The intelligence agencies are tracking uranium and centrifuge parts and nuclear weapons experts and so on. They are not putting cameras and microphones around the necks of every person on Earth to watch them at all times.

I think that’s what Soares & Yudkowsky are imagining in regards to GPUs and data centers.

Next: By the same token, if you are developing and sharing new schematics for nuclear weapons or bioweapons, then you are breaking the law, and intelligence agencies will try to stop you. If you are very careful about it, then you can create bioweapons schematics and share them on a torrent, and probably the intelligence agencies will still arrest you but it will be too late, the cat will be out of the bag. We do not have an airtight prevention regime. That said, if developing new bioweapons schematics is illegal, that’s gonna happen much less than if it were legal. Most people don’t want to break the law. This kind of legal regime already exists today, and yet we are not right now in a totalitarian global surveillance state, right?

I think that’s what Soares & Yudkowsky are imagining in regards to AI R&D. They say that they don’t expect progress to halt 100%, but they expect progress to dramatically slow, because most people don’t want to risk imprisonment to do secret illegal research, they just want to get paid and cited and have an impressive CV etc.

See e.g. here → https://ifanyonebuildsit.com/13/what-would-it-take-to-shut-down-global-ai-development

David Schneider-Joseph:

There is a substantial disanalogy to the case of uranium: uranium is an input to a general-purpose technology (electricity), but it is not a general-purpose technology itself, and it is possible to control and monitor the uranium supply and its level of enrichment so as to prevent weaponization, without limiting the freedoms or invading the privacy with which people make use of the general-purpose technology of electricity on the electric grid.

If there is an analogy here to AI, it is that AI chips are just an input to the general-purpose technology of AI, and it is somehow possible to control or surveil AI chips so as to prevent only their excessively dangerous uses, without preventing or surveilling everyday applications of AI in industry and personal life. But access by industry and individuals to AI chips — and sometimes many of them — is necessary for many of these use cases, hence the analogy fails. Here, this is much more analogous to controlling electricity itself than to controlling uranium.

Maybe it’s worth it, if the risk from an unaligned AI due to a lack of such a draconian regime is greater than the risk of permanent authoritarian lock-in due to its presence. But either way, it is quite different from the uranium situation.

Peter Wildeford:

You're right that I should not have implied there would be centralized AI development, just potential centralized monitoring of AI development.

You're correct that we monitor uranium without being a surveillance state. But there's a crucial difference in scale and invasiveness. Kat raised a similar question and I replied here: https://peterwildeford.substack.com/p/if-we-build-ai-superintelligence/comment/158546166

You're right that intelligence agencies already monitor some research (bioweapons, nuclear). But MIRI's proposal would need to detect private computational clusters at universities, startups, even individual hobbyists. The enforcement challenge is orders of magnitude harder than our existing regimes and thus we'd need to be orders of magnitude stricter and more invasive.

Loic:

> It was previously conceived not too long ago that we wouldn’t get any chance to work with near-AGI systems at all — instead, AI would go from dumber-than-the-dumbest-human to superintelligent in a very quick amount of time. This has been falsified, and it has given us lots of opportunity to get early empirical feedback on how different alignment or control techniques may or may not work and how AIs may behave in a variety of different ways

I don't believe this has been falsified. LLMs and their use with extensive scaffolding might qualify as near-AGI, but it's not clear that any of the safety research or techniques developed on LLMs will be meaningfully useful if we make an AGI/ASI in the next 5 years. A plurality of safety researchers don't believe they'll be in time to try all of their ideas, or that current/future ideas will work once the system is independently capable.

Peter Wildeford:

I agree. The part that I think is falsified is the "AI would go from dumber-than-the-dumbest-human to superintelligent in a very quick amount of time" part.

The part that is far from known yet is the "existing techniques on current AIs will scale well to AGI and ASI".

Pablo:

Just to leave a quick note appreciating that, unlike the vast majority of folks on Substack, you do not use the “like” button to signal-boost the people who agree with you, but rather to highlight the most interesting or insightful comments.

Pardes_Logic:

Curious if you've assigned any actual probabilities here: odds of extinction and odds of the survival payoff. From an expected-value perspective, superintelligence seems like an extreme bet in both directions.

Anything to say about this?

Joshua merriam:

I've followed the run-up to this book release, and many of the reviews, and I just want to say that I think yours is the most well-thought-out and well-written summary. I look forward to digging through your thoughts and backlog of blog posts, and I'm very happy to have found you today.

Solongo Gansukh:

As a member of the general public who’s just taking a deeper look at AI, I have some questions for those of you who have been studying and working on it for years. Why are we creating AI knowing it could wipe us out? Is it denial of the danger and hoping it wouldn’t care to destroy us? Cause China is doing it and we have to be first? Or is it an inevitable next step in human evolution? Do the CEOs of these large AI companies just wanna be powerful billionaires and would rather die trying? I’m just trying to understand the fundamental human drive behind all of this.

Nostradamus 2:

Why would you want to live as an inferior being?

Greg Colbourn:

I think your optimism about AGI/ASI and the future is unfounded. Your p(doom|AGI/ASI) is far too low. Especially in light of what the book says.

>"the level of difficulty of alignment is actually just unknown rather than known to be hard"

You don't justify this, and I don't think Anthropic do either. To me it looks like it's more being used as an excuse to continue AI development, under the pretence of "empirical work" being necessary.

I think the theory and evidence that Yudkowsky & Soares provide (they reference a bunch of things that have happened already that demonstrate the difficulty of the problem) is ample to conclude that alignment (of ASI) is known to be hard!

You say "scheming might even be addressable", and then link to a recent OpenAI report. But to me that report is yet more evidence that we won't be able to meaningfully solve scheming (or any other fundamental misalignment). We need to be looking at something like (at least) 13 9s of safety in the limit of ASI, rather than the 2 or 3 that is emblematic of all current alignment techniques (so a 0.00000000001% failure rate, rather than a 0.3% failure rate as per the OpenAI research linked).

A fundamental problem with the current AGI paradigm is that it is statistical in nature. We can only ever (slowly) asymptote towards safety, and all the doom flows through the cracks between the asymptote and 100%. (I say 13 9s, envisaging ASI performing 10,000 tasks a second. And that would still only buy us a few years before something went (irreversibly) wrong.)
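
As a minimal sketch of that arithmetic, taking the 10,000 tasks per second above as given and naively assuming independent per-task failures:

```python
# Expected failures per year at a given per-task failure rate and task throughput.
# Illustrative only: treats failures as independent per-task events, which understates correlated risk.

SECONDS_PER_YEAR = 365 * 24 * 3600
tasks_per_second = 10_000  # figure taken from the comment above

def expected_failures_per_year(per_task_failure_rate):
    return per_task_failure_rate * tasks_per_second * SECONDS_PER_YEAR

print(expected_failures_per_year(3e-3))   # ~9.5e8 per year at 2-3 nines (0.3% per task)
print(expected_failures_per_year(1e-13))  # ~0.03 per year at 13 nines (1e-11% per task)
```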

>"Today’s AI systems also have (mostly) faithful and legible chain-of-thought reasoning"

It seems that this is unlikely to remain the case much longer, unless it is regulated into being. Y & S' extinction scenario references a paper from last year showing more efficient reasoning happening in models employing opaque "AI-ese".

>"A lot of little things can be done, each of which have no hope of solving the entire problem of risk from AI superintelligence, but some paths might get lucky and go a lot further than we expect, and collectively things might turn out ok."

I really don't think the second part of this sentence follows from the first! Sounds like a cope.

>"A truly superintelligent being might develop goals that are completely incomprehensible but nonetheless non-competitive with humanity."

Another commenter (David Spies) already addresses this. Suffice to say: all it takes is the ASI having one open-ended (sub)goal out of whatever myriad goals it may have. The onus is on you to explain why that would be at all likely.

>"We may also be able to get AGI-level intelligences to help us scale our alignment techniques further and bootstrap even more."

This also needs justification. It's an infinite regress when you need an AI to be aligned to a very high degree in the first place to be able to trust it to do alignment research. And Y & S do a good job of quashing this hope in Ch.12 (p.188-192).

I think you agree that a lot of things need to be done, and a lot of things need to go right, for things to turn out well with ASI. But surely the corollary of this (not to mention the considerations above) is that the _default_ outcome is (the bulk of the probability mass is on) doom, not "things will go well"?

Greg Colbourn:

I should mention that you being a top forecaster does give your opinions on the future some extra weight! But I'd like to dive into your forecasting record a bit, especially as concerns consensus-low-probability events that actually ended up happening. My guess is that you probably have a fair number of such predictions related to the Covid-19 pandemic? (If so, I, like many EAs, would've been with you there.) But: how many other examples do you have of calling events where the consensus was that they would happen with low probability, before they actually happened? Or, say, investing in things early that later became big?

Nicolò Bagarin - 404_NOT_FOUND:

You mentioned the legibility of AI's chain-of-thought reasoning. What are your thoughts concerning their faithfulness and reliability?

Studies are showing that, at least on a few occasions, LLMs generate post-hoc explanations that are highly readable to humans, yet may not accurately represent the true process that led to the output. Could it be some sort of perverse Texas sharpshooter fallacy where the AI nails the correct answer, but then draws the target around the bullet just to make it clearer to humans where it was shooting in the first place?

Greg Colbourn:

It could well be! You're right that it's hard to interpret LLMs' true thoughts (/goals/values) even when there are human-readable outputs. Going to non-human-readable outputs makes it much, much harder still.

Alvin Ånestrand:

For what it's worth, I appreciate your nuance!

Lukas Wald:

Nice review and I always appreciate your concise summaries!

Carolyn Meinel:

Well written. I have just one quibble. "CoT May Be Highly Informative Despite 'Unfaithfulness'" https://metr.org/blog/2025-08-08-cot-may-be-highly-informative-despite-unfaithfulness/ cites many sources for the allegation that CoT across all GenAIs may be more performative than informative.

You clearly hope for a book deal. As I have gotten a fair number of books published, may I suggest how to maximize your book deal? Support your proposal with news clippings (nowadays generally URLs) about you and your work; how many followers you have on which social media platforms, plus Substack paid subscribers; and blurbs by famous people. Then an outline of the book and a sample chapter. You might think the outline and chapter are most important, but the bean counters looking at your fan base will make the final decision.

Oscar Delaney:

> literally all of life on Earth

I thought it was specific to humans? Other species might die out due to the AIs making Earth less habitable for them, but in this world only humans pose a threat to the AI and would need to be actively dealt with.

Greg Colbourn:

The ASI may well boil the oceans with fusion reactors and block out the Sun with solar panels to power data centres that cover the Earth (see ch9 of the book). All bio life is on the line.
