- Can AI developers continue scaling up reasoning models like o3? @justjoshinyou.bsky.social reviews the available evidence in this week’s Gradient Update and finds that the rapid scaling of reasoning training, like the jump from o1 to o3, will likely slow down within a year or so.
- Reasoning models are based on traditional LLMs, but they undergo reinforcement learning training to improve their reasoning abilities. Scaling up this reasoning training seems to yield better models, similar to scaling up pretraining.
- How much training compute is used for reasoning training now? OpenAI hasn’t said how much compute went into o3 in absolute terms, but o3 was trained on 10× more compute than o1 (released four months prior). This 10× figure probably refers to RL reasoning training compute, not total compute.
- So what’s the scale of o1 and o3 in absolute terms? We could get hints from similar models. For example, DeepSeek-R1 was trained on ~6e23 FLOP in the reasoning stage. Nvidia’s Llama-Nemotron Ultra is at a similar scale. x.com/EpochAIResea...
- Another hint is that Anthropic CEO Dario Amodei suggested in January that reasoning training to date has scaled to $1M per model, or ~5e23 FLOP. But it’s unclear whether Amodei was referring to particular models, or how much insight he has into non-Anthropic models. (A rough cost-to-FLOP conversion is sketched after this thread.)
- So reasoning training probably isn’t yet at the scale of the largest pretraining runs (>1e26 FLOP), but it’s unclear exactly how far o1 and o3 are from the frontier. However, the pace of growth is so rapid that this rate of scale-up can’t continue for long.
- Training compute at the frontier grows ~4× per year, so labs can’t keep scaling up reasoning training by 10× every few months (as OpenAI did from o1 to o3) once reasoning training compute approaches the overall frontier. A quick extrapolation of when that happens is sketched below.
- There could still be rapid algorithmic improvements after the reasoning scaling slows down, but at least one factor in the recent pace of progress in reasoning models is not sustainable. Check out the full newsletter issue below! epoch.ai/gradient-upd...
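As a rough sanity check on the “$1M per model ≈ 5e23 FLOP” figure above, here is a minimal back-of-envelope conversion. The GPU-hour price and effective per-GPU throughput are illustrative assumptions, not numbers from the thread or the newsletter:

```python
# Back-of-envelope: convert a ~$1M training budget into FLOP.
# The price and throughput below are illustrative assumptions only.

dollars = 1e6                    # Amodei's ~$1M per model
price_per_gpu_hour = 3.0         # assumed rental price, $/GPU-hour
effective_flops_per_gpu = 4e14   # assumed sustained FLOP/s per GPU (well below peak)

gpu_hours = dollars / price_per_gpu_hour
total_flop = gpu_hours * 3600 * effective_flops_per_gpu
print(f"{total_flop:.1e} FLOP")  # ~4.8e23 FLOP, close to the ~5e23 estimate
```

Different price or utilization assumptions shift the answer by a small factor, but the order of magnitude (a few times 1e23 FLOP) is robust.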
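And a minimal sketch of why the current rate of scale-up can’t last long, using the figures from the thread: reasoning training at roughly DeepSeek-R1 scale (~6e23 FLOP) growing ~10× every four months, versus a pretraining frontier of ~1e26 FLOP growing ~4× per year. The exact starting values are assumptions within the ranges discussed above:

```python
import math

# Rough extrapolation: how long can reasoning training grow 10x every
# ~4 months before it catches up with the overall training-compute frontier?

reasoning_start = 6e23      # ~DeepSeek-R1-scale reasoning training, FLOP (assumed)
frontier_start = 1e26       # largest pretraining runs, FLOP
reasoning_growth = 10 ** 3  # 10x every 4 months ~= 1000x per year
frontier_growth = 4         # frontier training compute, ~4x per year

# Solve reasoning_start * reasoning_growth**t == frontier_start * frontier_growth**t for t (years)
t_years = math.log(frontier_start / reasoning_start) / math.log(reasoning_growth / frontier_growth)
print(f"Catch-up after ~{t_years:.1f} years")  # ~0.9 years
```

Under these assumptions, reasoning training compute reaches the overall frontier in roughly a year, after which it could only grow at the ~4×-per-year frontier rate — consistent with the thread’s conclusion that the current pace of reasoning scale-ups will slow within a year or so.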