Thank you for your write-up on this!
But the Claude models are still the best at coding, and Claude Code is still the best coding/agentic tool. If you really believe in a self-improving feedback loop, why would you care about short-term consumer demand? Maybe I’m missing something, but it feels like OpenAI is not banking on AGI anymore.
Short-term consumer demand pays the Stargate bills! Self-improving feedback loops aren't cheap!
not to mention (theoretically) reducing compute costs to serve its current users! Check out them per token costs, wowee
This post gave me an interesting thought: what if OpenAI had
* Postponed the December o3 announcement to February
* Named o3 "GPT-5" instead of "o3"
* Credibly announced its progress on ARC-AGI and FrontierMath
What would our impression of the rate of AI progress be now? It seems like the decision to create the o-series reflected a hope that pre-training would remain the most powerful lever, but the result has been people getting impatient and claiming AI progress has been underwhelming, perhaps prematurely.
Yeah I think this would've kept the perception of progress higher.
At some point, the best benchmark is revenue / GDP growth, etc.
It bothers me that we keep finding indirect ways to measure what we really want to measure: does this change the world?
Yes, but it's a lagging indicator.
Yes, but it's really hard to build a benchmark that's strongly correlated with the ability to provide economic value. The METR task horizon benchmarks may be the closest.
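To spell out what I mean by task horizon: METR's headline number is roughly the human-time task length at which a model's success rate falls to 50%, estimated by fitting a logistic curve of success against log task duration. A minimal sketch of that calculation on made-up data (not METR's actual code or numbers):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical task results: length in human-minutes, and whether the model succeeded.
# (Made-up numbers purely to illustrate the calculation.)
task_minutes = np.array([1, 2, 4, 8, 15, 30, 60, 120, 240, 480], dtype=float)
succeeded    = np.array([1, 1, 1, 1,  1,  1,  0,   1,   0,   0])

# Fit P(success) as a logistic function of log2(task length).
X = np.log2(task_minutes).reshape(-1, 1)
clf = LogisticRegression(C=1e6).fit(X, succeeded)  # large C ~= no regularization

# The "50% time horizon" is the task length where predicted success crosses 0.5,
# i.e. where the linear term a + b * log2(minutes) equals 0.
a = clf.intercept_[0]
b = clf.coef_[0][0]
horizon_minutes = 2 ** (-a / b)
print(f"Estimated 50% time horizon: {horizon_minutes:.0f} minutes")
```

The appeal is that task length in human-time is at least a proxy for economic value, even if an imperfect one.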
By annualized revenue, do you mean revenue earned in December 2025 multiplied by 12, or the revenue earned in the whole of 2025? According to this: https://www.wheresyoured.at/howmuchmoney/, OpenAI and Anthropic made around $7.5B combined in the first 7 months of 2025, and their ARRs based on July 2025 are $12B and $5B respectively, equating to $17B.
Yes, I mean December 2025 revenue multiplied by 12. I have added a footnote to clarify this. Thanks!
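To make the distinction concrete, here is the arithmetic with the figures mentioned above (the per-company split is just the rough July run rates from the linked post):

```python
# Two readings of "annualized revenue", using the rough figures quoted above.

# Reading 1: run rate = one month's revenue * 12. The post annualizes December 2025;
# the July 2025 run rates from the linked article were roughly:
run_rate_july_2025 = {"OpenAI": 12e9, "Anthropic": 5e9}
combined_run_rate = sum(run_rate_july_2025.values())   # ~ $17B

# Reading 2: revenue actually booked over a calendar period.
revenue_jan_to_jul_2025 = 7.5e9                         # ~ $7.5B in the first 7 months

print(f"Combined July 2025 run rate (annualized): ${combined_run_rate / 1e9:.0f}B")
print(f"Actual Jan-Jul 2025 revenue:              ${revenue_jan_to_jul_2025 / 1e9:.1f}B")
# A December-based run rate will be higher still if monthly revenue keeps growing.
```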
Great point about revenue. But I am unsure how this release affects the chances of this paradigm reaching real general intelligence. Isn't it marginally to moderately bearish for it scaling to AGI?
For me, GPT-5 produced exactly the benchmark results I was expecting given my projections. So I think the pace towards AGI is still on track for sometime in the 2029-2040 range (median 2033). But if you were expecting something like 'AI 2027', I agree GPT-5 looks bearish for that.
Thank God