framing 03 / 05
Training-FLOP Commodities
The FLOP-hour of training compute, at specified precision and interconnect grade, treated as a raw material distinct from inference. The supply-demand dynamics on each side are different enough that pretending they're the same will obscure both.
Training and inference look like the same workload from a distance (both put racks of accelerators in large rooms), but the markets they would form look almost nothing alike. Inference is bursty, latency-sensitive, geographically distributed, and dominated by the cost of under-utilisation. Training is bursty on a different timescale: a single run consumes thousands of accelerators continuously for weeks, with tight interconnect requirements and very loose latency requirements. Pricing the two in the same unit is a category error.
The training-FLOP framing treats compute as a raw material. The underlying is roughly one petaFLOP-hour at FP8 with H-bandwidth interconnect grade at scale N — a deliberately unwieldy specification, because the specification is the thing. A buyer of training compute needs to know they are getting an interconnect topology that will not collapse their effective FLOPS to a quarter of the nameplate; a seller knows that a half-built cluster is worth far less than a fully-built one. There is no honest way to express that without grading the substrate.
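To make the grading concrete, here is a minimal sketch of the unit as a data structure, with delivered capacity derated by interconnect grade. Every field name and derating number is a hypothetical illustration, not a proposed standard; the 0.25 figure simply echoes the quarter-of-nameplate collapse described above.

```python
# Hypothetical sketch of a graded training-FLOP unit. All field names
# and numbers are illustrative assumptions, not a real contract spec.
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingFlopUnit:
    precision: str            # e.g. "FP8"
    interconnect_grade: str   # "H" = high-bandwidth, "L" = degraded
    cluster_scale: int        # accelerators reachable at that grade

# Illustrative derating: the fraction of nameplate throughput that
# survives the interconnect at scale. Real values depend on topology.
DERATING = {"H": 0.85, "L": 0.25}

def delivered_pflop_hours(nameplate_pflops: float, hours: float,
                          unit: TrainingFlopUnit) -> float:
    """Nameplate petaFLOPS times hours, scaled by the grade's derating."""
    return nameplate_pflops * hours * DERATING[unit.interconnect_grade]

# The half-built cluster ("L" grade) delivers a quarter of nameplate;
# the fully built one ("H" grade) delivers most of it.
unit_h = TrainingFlopUnit("FP8", "H", cluster_scale=16_384)
unit_l = TrainingFlopUnit("FP8", "L", cluster_scale=16_384)
print(delivered_pflop_hours(1.0, 1.0, unit_h))  # 0.85
print(delivered_pflop_hours(1.0, 1.0, unit_l))  # 0.25
```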
What this framing makes visible — and what the inference-token framing hides — is that the training compute market has very few buyers. Maybe twenty organisations in the world are training frontier-scale models in any given year. That is not enough buyers to support a public spot market in the conventional sense. What it can support is a forward market (see capacity future): lab A commits to lab B’s data centre for Q3 2027 at a price struck today. The closest analogue here is not electricity at all but the way large LNG buyers contract long-dated capacity from specific liquefaction trains.
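As a sketch of the shape of that trade, here is a hypothetical capacity forward as a contract object, with a simple mark-to-market. The parties, quantities, and prices are all invented; what matters is the structure: grade and quantity fixed now, delivery later, price struck today.

```python
# Hypothetical capacity forward; every name and number is invented.
from dataclasses import dataclass

@dataclass(frozen=True)
class CapacityForward:
    buyer: str
    seller: str
    delivery_window: str        # e.g. "2027-Q3"
    pflop_hours: float          # graded units, per the spec above
    strike_usd_per_unit: float  # price struck today

    def mark_to_market(self, spot_usd_per_unit: float) -> float:
        """Buyer's unrealised gain (or loss) against today's strike."""
        return (spot_usd_per_unit - self.strike_usd_per_unit) * self.pflop_hours

fwd = CapacityForward("lab_a", "lab_b_datacentre", "2027-Q3",
                      pflop_hours=5_000_000, strike_usd_per_unit=2.00)
# If graded capacity trades at $2.50 by delivery, the buyer locked in value:
print(fwd.mark_to_market(2.50))  # 2500000.0
```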
The wild card is regulation. If compute thresholds become the unit of export control and capability governance, then a “training-FLOP” is not just a commercial unit but a regulatory one — and at that point the unit definition stops being the labs’ problem to negotiate and starts being something written into law. That is the moment the framing either matures or fragments.
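A back-of-envelope sketch of why the unit definition bites once it is regulatory, using the common approximation of six FLOPs per parameter per token for dense-transformer training. The 1e26 figure echoes the reporting threshold in the 2023 US executive order on AI, and the model size and token count are invented; treat all three as illustrative.

```python
# Does a hypothetical run cross a compute-governance threshold?
# Uses the standard estimate C = 6 * N * D (N params, D tokens).
THRESHOLD_FLOP = 1e26  # illustrative; echoes the 2023 US EO reporting trigger

def training_flop(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

# A hypothetical 500B-parameter model trained on 40T tokens:
run = training_flop(500e9, 40e12)
print(f"{run:.2e} FLOPs, above threshold: {run > THRESHOLD_FLOP}")
# -> 1.20e+26 FLOPs, above threshold: True
```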
open questions we're exploring
- What is a 'FLOP-hour' once mixed precision and sparsity are in play? (One candidate normalisation is sketched after this list.)
- Should interconnect bandwidth be priced into the unit, or sold separately?
- Does training capacity even want to be a spot market, or only a forward market?
- Who buys this besides frontier labs — and what does the long tail look like?
- How do regulatory regimes (export controls, compute thresholds) shape this market?
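On the first question above, one candidate answer is to normalise every operation to an FP8-equivalent count, weighting higher precisions by the silicon time they consume and deflating any nameplate inflation from structured sparsity. The conversion factors below are invented for illustration; settling them is precisely what the question asks.

```python
# Hypothetical FP8-equivalent normalisation; the weights are invented.
# One FP32 op is counted as four FP8-equivalents on the assumption that
# it occupies roughly the silicon time of four FP8 ops.
FP8_EQUIVALENT = {"FP32": 4.0, "FP16": 2.0, "BF16": 2.0, "FP8": 1.0}

def fp8_equivalent_flop(flop_by_precision: dict[str, float],
                        sparsity_inflation: float = 1.0) -> float:
    """Sum FLOPs across precisions in FP8-equivalent units, then remove
    nameplate inflation from structured sparsity (2:4 sparsity doubles
    nameplate throughput without doubling useful work)."""
    total = sum(FP8_EQUIVALENT[p] * f for p, f in flop_by_precision.items())
    return total / sparsity_inflation

# A mixed run: mostly FP8 matmuls, some BF16, counted at 2:4 sparse nameplate.
print(fp8_equivalent_flop({"FP8": 8e24, "BF16": 1e24}, sparsity_inflation=2.0))
# -> 5e+24
```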
essays under this framing