I think we are about to stop paying flat fees for AI

==========
2026-06-26 · #ai #pricing #models #industry #ndc

I pay a flat monthly fee for the AI tools I use to do my work. So does almost everyone I know who uses these tools heavily. ChatGPT Plus. Claude Max. ChatGPT Team. The price is the same whether I use the tool five times in a month or five hundred. The whole subscription model is built around the assumption that the cost of serving me does not vary much with how much I actually use.

That assumption is going to break. I'm not sure exactly when. What I have is a hunch from watching how the API side of these companies has priced for years. The API has always been per-token. Every call has a unit cost. Every retry costs money. The pricing on the developer side already tells you what a request actually costs to serve. The consumer side does not. That gap is not going to last forever.

When it closes, here is what I think changes.

Behavior changes first. Right now I do not think about cost when I ask a model a question. The cost is sunk in the monthly subscription. There is no friction between thinking of a question and asking it. When every call has a visible unit cost, that friction comes back. The "let me just ask Claude real quick" pattern stops being free at the margin. Some questions get asked anyway. Some questions do not. The ones that do not are the ones I will not even notice I stopped asking.

Who pays for what changes next. Today the heavy user is the winner. If I run a thousand queries against a flat plan and the next person runs five, our bills are identical. Per-token pricing inverts that. The heavy user pays more, the light user pays less. The shape of who buys also changes. A casual user who would never pay twenty dollars a month might happily pay fifty cents for a one-off task. The entry point gets lower. The ceiling gets higher.

How teams budget changes too. Right now the AI line item on a small company's expense sheet is a fixed number. Seats times the seat price. It is predictable. It is approvable once and forgotten. Per-token spend is variable. It moves with how much the team actually does. That changes the conversation with finance. It changes who has the authority to use the tool freely versus who has to ask permission.

The biggest change is how the tools get designed. Today an app that wraps a model can assume the user has a flat-fee subscription absorbing the cost. When the user is paying per token directly, every feature has a cost attached to it. A long response costs more than a short one. A product that does ten model calls per user action costs ten times what a product that does one call costs. Builders are going to have to surface those tradeoffs in a way they are not today.

None of this is bad. Per-token pricing is honest. It tells the user what the work actually costs and lets them decide whether it is worth it. Flat-fee pricing hides that information. The shift will change the relationship between user and tool. Flat-fee treats the model as an infinite resource you have already paid for. Per-token treats it as a service with a price per use. Both can work. They produce different behavior, different product design, and different adoption.

I think the shift is imminent. I'm not sure when. What I have is the structure of the developer-side pricing telling me the consumer-side pricing is a temporary equilibrium. The question of when that equilibrium breaks is more interesting to me than the question of whether it will.

Get new posts by email

Friday digest, no filler. Drop your email below and I'll send what I publish.