Me: all these massive AI data centers – but what if the training compute is reduced? but then again what if that is offset by an exponential surge in use cases?
You’re hitting on one of the most important unresolved debates in the AI scaling era: is the explosive growth in AI data centers justified primarily by training compute, or is it increasingly being driven by inference (i.e., the “exponential surge in use cases” you mentioned)?
Let’s break it down into the two scenarios you’re describing:
Scenario 1: Training compute growth slows or plateaus
Several things could cause this:
- Hitting diminishing returns on scale (Chinchilla-type scaling laws start bending harder).
- Algorithmic breakthroughs (e.g., better architectures, synthetic data, test-time compute, Mamba-style models, etc.) that deliver big jumps in effective capability per FLOP.
- Regulatory or energy constraints that physically cap new training clusters.
- Economic reality check: if frontier models stop delivering proportional economic value, the $100B+ training runs get harder to fund.
If training compute growth flattens while inference demand keeps exploding, the data-center build-out starts looking much more “inference-heavy”: most new GPUs will be running millions or billions of users’ queries, agents, video generation, real-time AI, etc., rather than the next GPT-5 training run.
In that world, the current wave of 100–500 MW data centers actually makes sense, because inference is embarrassingly parallel and has basically infinite legitimate demand if prices drop fast enough (see the smartphone analogy: train a model once, then ship it to 8 billion devices).
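To see why inference can end up dominating total cycles, here is a rough back-of-envelope sketch in Python. Every number in it (model size, training tokens, tokens per query, query volume) is an illustrative assumption, not a figure from the discussion above; only the standard rules of thumb for training and inference FLOPs are borrowed.

```python
# Back-of-envelope: one-time training compute vs cumulative inference compute.
# All numbers below are illustrative assumptions, not measured figures.

# Common rules of thumb: training FLOPs ~ 6 * params * training tokens,
# inference FLOPs ~ 2 * params * generated tokens.
params           = 1e12    # 1T-parameter model (assumed)
train_tokens     = 15e12   # 15T training tokens (assumed)
tokens_per_query = 1_000   # average tokens generated per request (assumed)
daily_queries    = 1e9     # 1 billion requests per day (assumed)

training_flops        = 6 * params * train_tokens
daily_inference_flops = 2 * params * tokens_per_query * daily_queries

days_to_match_training = training_flops / daily_inference_flops
print(f"Training run:      {training_flops:.2e} FLOPs (one-time)")
print(f"Inference per day: {daily_inference_flops:.2e} FLOPs")
print(f"Days of serving needed to equal the training run: {days_to_match_training:.0f}")
```

With these made-up but plausible inputs, serving catches up to the one-time training run in roughly a month and a half, which is the basic intuition behind an “inference-heavy” build-out.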
Scenario 2: Training compute keeps growing exponentially anyway
This is the classic “scaling hypothesis” bet (OpenAI, Anthropic, xAI, etc.). Even if inference demand grows fast, training runs still dominate capex for the foreseeable future because:
- Each new model is 5–30× larger than the previous one.
- You still need to retrain/rebuild the frontier model frequently to stay competitive.
- New modalities (video, robotics, scientific simulation) are insanely compute-hungry.
In this world, today’s data centers will look tiny in 3–5 years. We’ll be talking about multi-gigawatt clusters, and the “training vs inference” ratio stays heavily tilted toward training.
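A quick toy projection shows how that compounding plays out in cluster power terms. The starting compute, run length, per-generation growth factor, and effective FLOPs-per-watt below are all assumptions chosen only to illustrate the shape of the curve, not claims about any specific company’s plans.

```python
# Toy projection: per-generation growth in training compute translated into
# cluster power. Every constant here is an illustrative assumption.

base_training_flops      = 1e26   # compute for a current frontier run (assumed)
growth_per_gen           = 10     # growth factor per generation, within the 5-30x range above
train_days               = 100    # wall-clock length of a training run (assumed)
effective_flops_per_watt = 3e11   # delivered FLOPs/s per watt, whole facility (assumed)

seconds = train_days * 24 * 3600
for gen in range(4):  # today plus the next three generations
    flops = base_training_flops * growth_per_gen ** gen
    watts = flops / seconds / effective_flops_per_watt
    print(f"Generation {gen}: {flops:.1e} FLOPs -> ~{watts / 1e6:,.0f} MW cluster")
```

Under these assumptions the cluster goes from tens of megawatts to tens of gigawatts in three generations, which is the arithmetic behind the “multi-gigawatt clusters” expectation.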
What’s actually happening right now (late 2025 evidence)
- NVIDIA’s quarterly guidance and the hyperscaler capex numbers ($65B+ from Microsoft alone in FY2025, similar from Google/Meta/Amazon) are so large that they can only be explained if both training and inference are growing extremely quickly at the same time.
- Inference is already a huge fraction of total cycles in many clouds today (reports from Oracle, CoreWeave, Lambda, etc. suggest 60–80% of their GPUs are doing inference), but the absolute amount of training compute is still growing faster than almost anyone predicted two years ago (e.g., Grok 3, GPT-5-class runs, Llama 4, Gemini 2, Claude 4 all rumored to be 10–50× larger than 2024 models).
- Electricity demand forecasts from grid operators (ERCOT, PJM, CAISO) and utilities are being revised upward every quarter, and the main driver cited is always “AI data centers” with both training and inference mentioned.
Bottom line
The massive data-center build-out is currently being justified by both forces simultaneously:
- Inference use cases are exploding even faster than most people expected (agents, video gen, AI companions, enterprise copilot seats, etc.).
- Training compute is not slowing down yet (actually accelerating).
So we’re in the odd regime where the “training compute reduced” scenario is not happening, and the “exponential surge in use cases” is very much happening.
The bet the industry is making (and so far winning) is that these two curves will keep feeding each other for several more years: better models → way more use cases → way more revenue → fund even bigger training runs → repeat.
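To make that flywheel concrete, here is a deliberately toy simulation of the loop. The exponent, reinvestment fraction, and starting values are invented purely for illustration and carry no empirical weight; the point is only that two mutually reinforcing curves compound quickly even with diminishing returns built in.

```python
# Toy model of the flywheel: better models -> more use cases and revenue ->
# bigger training budgets -> better models. All coefficients are invented.

compute = 1.0   # training compute for the current frontier run (arbitrary units)
revenue = 1.0   # annual AI revenue attributable to that model (arbitrary units)

for year in range(2025, 2030):
    capability = compute ** 0.3              # assumed diminishing returns to scale
    revenue    = revenue * (1 + capability)  # assumed: capability unlocks new use cases
    budget     = 0.5 * revenue               # assumed fraction of revenue reinvested in training
    compute    = max(compute, budget)        # next run is funded by that budget
    print(f"{year}: compute={compute:7.1f}  revenue={revenue:7.1f}")
```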
Whether that loop breaks in 2026–2028 because of energy, returns-to-scale, or regulation is the multi-trillion-dollar question. For now, the data centers keep getting approved and built at a pace that would have seemed insane in 2023.