framing 01 / 05

Compute-Backed Inference Tokens

A token of model output, at a defined quality grade and latency window, treated as the unit of trade. The most direct analogue to electricity-by-the-kWh.

The cleanest version of “AI as commodity” treats one token of model output as the underlying unit of trade. Buyers specify a quality grade and a latency envelope; sellers commit to deliver within both, and the price clears wherever supply meets demand at that grade in that window. This is the framing that maps most directly onto electricity: a fungible product, metered at the point of consumption, priced in standard units, and indifferent to the specific generator on the other end of the wire.
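The mechanics sketched above (bids at a grade, offers at a grade, a price per window) can be made concrete with a toy double auction. Everything here is illustrative, from the `Order` fields to the marginal-pair midpoint pricing; no real inference exchange works this way yet.

```python
# A minimal sketch of trades clearing at one grade in one window,
# assuming a toy double auction. Structure and pricing rule are
# hypothetical, not any existing market's mechanism.
from dataclasses import dataclass

@dataclass
class Order:
    side: str      # "buy" or "sell"
    price: float   # $ per million tokens at this grade
    qty: int       # millions of tokens

def clear(orders):
    """Return (price, volume) for one grade/window; price is None if no cross."""
    bids = sorted((o for o in orders if o.side == "buy"), key=lambda o: -o.price)
    asks = sorted((o for o in orders if o.side == "sell"), key=lambda o: o.price)
    price, volume = None, 0
    while bids and asks and bids[0].price >= asks[0].price:
        traded = min(bids[0].qty, asks[0].qty)
        volume += traded
        price = (bids[0].price + asks[0].price) / 2  # midpoint of marginal pair
        bids[0].qty -= traded
        asks[0].qty -= traded
        if bids[0].qty == 0: bids.pop(0)
        if asks[0].qty == 0: asks.pop(0)
    return price, volume

book = [Order("buy", 3.00, 10), Order("buy", 2.50, 5),
        Order("sell", 2.00, 8), Order("sell", 2.80, 10)]
print(clear(book))
```

Note that the unmatched low bid simply sits in the book: without an enforceable grade, even this trivial mechanism has nothing well-defined to clear.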

What makes this framing interesting is that the grade definition does all the work. “One token” is meaningless without an agreement about what model class produced it, at what context length, with what acceptable error rate against an evaluation set, and within what latency. Move any of those parameters and you are pricing a different commodity. The first generation of this market — visible already in inference brokers and the $/M-token dashboards that aggregate provider pricing — is barely past the stage of comparing apples to slightly different apples. There is no enforceable grade.

For this framing to mature, three things have to happen. A neutral or quasi-neutral body has to define the grades. A settlement window has to clear trades faster than the actual inference takes (or at least faster than buyers can re-route around a missed delivery). And a disputes regime has to exist — somebody has to be able to demand the trace of an inference call after the fact and prove that what was delivered was not what was paid for.
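The disputes piece reduces to a conformance check over a delivered trace. The `Grade` and `Trace` fields below are hypothetical stand-ins; a real regime would need signed, replayable traces rather than self-reported numbers.

```python
# A sketch of the after-the-fact disputes check: given the trace of an
# inference call and the grade it was sold at, decide whether what was
# delivered is what was paid for. Fields are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class Grade:
    model_class: str
    max_latency_ms: int
    max_error_rate: float   # against the agreed evaluation set

@dataclass(frozen=True)
class Trace:
    model_class: str
    latency_ms: int
    eval_error_rate: float  # re-measured on the agreed set during the dispute

def conforms(trace: Trace, grade: Grade) -> bool:
    """True iff delivery met every term of the grade."""
    return (trace.model_class == grade.model_class
            and trace.latency_ms <= grade.max_latency_ms
            and trace.eval_error_rate <= grade.max_error_rate)

g = Grade("tier-2", 200, 0.05)
assert conforms(Trace("tier-2", 180, 0.04), g)
assert not conforms(Trace("tier-2", 450, 0.04), g)  # missed the latency window
```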

The closest historical parallel is not electricity itself but the standardisation of grain grades in nineteenth-century Chicago. Wheat became a commodity not when farmers started selling it, but when the grading system made “No. 2 Spring” mean the same thing to a buyer in Liverpool as it did to an elevator operator in Iowa. The interesting research question is what the AI equivalent of “No. 2 Spring” looks like — and whether the people building inference infrastructure understand that they are, in effect, waiting for someone to write that grade book.

open questions we're exploring

essays under this framing