AI Costs Are Rising: Why ROI Is a FinOps Problem

A year ago, artificial intelligence looked like an infinite discount on labour. Everyone in every boardroom ran the same back-of-the-envelope sum: a salaried person costs you tens of thousands a year; the same work in tokens costs a few euros a month. AI won by roughly three orders of magnitude, the budget got approved, and the pilots began.

Now the first renewal invoices are landing. Usage has sprawled across every team with no one able to say what it cost or what it returned. And the most capable model of its generation just went dark overnight — everywhere, the EU included. The cheapest part of your AI stack — the tokens — is quietly becoming the part you can least afford to leave ungoverned.

The maths that sold AI is breaking

The original sum was seductive because it was simple. Put two numbers side by side:

A mid-level knowledge worker — roughly €70,000 a year fully loaded, before you count the management overhead, the sick days, or the ramp time.
The same volume of drafting, summarising, and triage in tokens — a few euros a month, maybe a few tens of euros if the team is heavy.

On that spreadsheet, AI doesn't just win — it embarrasses the alternative. But the sum quietly assumed two things that no longer hold. It assumed prices only fall. And it assumed the best model is always available to you. Both of those assumptions failed this year, and when they failed, the easy maths failed with them.

Force one: tokens are getting more expensive, not less

For two years the headline price per token mostly fell, and everyone extrapolated the line to zero. That line has bent. Not because providers raised list prices in a way you'd notice on a rate card — but because the way we use models changed underneath the price.

Reasoning models think before they answer, spending tokens you never see. Longer context windows mean every call drags more history with it. And agentic workflows — the ones everyone now wants — don't make one call, they make twenty, each one feeding the next. The per-token price stopped being the story. The story is tokens-per-outcome, and that number has been quietly exploding while the rate card looked stable.
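To make tokens-per-outcome concrete, here is a minimal Python sketch. Every figure in it is an assumption chosen for illustration: the prices per million tokens, the call counts, the token volumes, and the simplification that hidden reasoning tokens are billed at the output rate. None of it comes from a real rate card.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    calls: int                 # model calls needed for one finished outcome
    input_tokens: int          # prompt plus context tokens per call
    output_tokens: int         # visible answer tokens per call
    reasoning_tokens: int = 0  # hidden tokens per call (assumed billed as output)


def cost_per_outcome(w: Workflow, price_in: float, price_out: float) -> float:
    """Cost in EUR of one outcome; prices are EUR per million tokens."""
    billed_output = w.output_tokens + w.reasoning_tokens
    per_call = w.input_tokens * price_in + billed_output * price_out
    return w.calls * per_call / 1_000_000


# Hypothetical rate card, identical for both workflows.
PRICE_IN, PRICE_OUT = 2.0, 8.0

single = Workflow("single prompt", calls=1, input_tokens=1_000, output_tokens=500)
agent = Workflow("agentic chain", calls=20, input_tokens=8_000,
                 output_tokens=500, reasoning_tokens=2_000)

for w in (single, agent):
    print(f"{w.name}: EUR {cost_per_outcome(w, PRICE_IN, PRICE_OUT):.3f} per outcome")
```

Under these assumptions the rate card never changes, yet one outcome costs EUR 0.006 as a single prompt and EUR 0.72 as an agentic chain: 120 times more, driven entirely by call count, context size, and hidden reasoning.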

Left unwatched, AI spend behaves exactly like early cloud did: convenient, sprawling, and shocking at renewal. The leak is rarely a single runaway bill. It's that nobody owns the number, nobody can attribute it to a result, and nobody can forecast next quarter. You don't have an AI budget. You have an AI bill, and you meet it after the fact.

Force two: the best model might not be yours to use

The second assumption broke harder. In June 2026, Fable 5 — and the Mythos 5 model beneath it — went dark. Not throttled, not price-hiked: switched off. A US export-control directive ordered the provider to suspend access for any foreign national, and because nationality can't be verified per request in real time, the provider complied completely. The single most capable model of its generation was suddenly serving, in its own words, “exactly zero traffic” — to anyone.

Read that as a European buyer. The model your team might have architected an entire workflow around didn't become unavailable because your EU compliance slipped. It became unavailable because a different government, on the other side of the world, changed its mind about who was allowed to use it. Your EU AI Act homework was immaculate and it made no difference at all.

This is not a one-off, and it cuts in every direction. Frontier models have been withheld from the EU before over the bloc's own regulatory uncertainty; now they vanish over Washington's. Either way, the model you bet on is subject to forces you don't control: where your data sits, which way a regulator leans, how an export-control office classifies a capability this quarter. If your AI roadmap depends on continuous access to one frontier model, you have engineered a single point of failure — and given it a foreign address.

Put the two forces together and the picture is clear. Token economics are getting harder, not softer. And the best tool for the job may simply not be available to you. The era of treating model access as cheap, abundant, and permanent is over.

Stop shopping for models. Start governing spend.

Here is the reframe that separates the companies getting real return from the ones writing off pilots: AI ROI is a financial-operations problem, not a model-shopping problem. You don't win by chasing the cheapest model this month or the biggest model this quarter. You win by treating AI spend like the operational cost it has become — visible, attributed, controlled, and resilient.

Cloud computing went through exactly this. The first wave was “it's cheap, just let teams use it.” The reckoning produced a discipline — FinOps — that made cloud spend legible and governable. AI is arriving at the same reckoning, earlier and sharper, because the volatility is worse: prices move under you and capabilities disappear by region. The four pieces that matter:

Visibility — what every token actually buys

You cannot govern what you cannot see. The first job is a live view of where tokens go: which models, which teams, which use cases, on which surfaces — desktop, browser, embedded in the SaaS you already buy. Most organisations discover their AI spend the way they discover a leak: from the invoice. The goal is to see it as it happens, mapped to the work it's doing.
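As a rough sketch of what such a view computes, the Python below sums tokens along whichever dimensions you ask for. The record fields, team names, and model names are hypothetical; real usage exports from a provider or gateway will look different.

```python
from collections import defaultdict

# Hypothetical usage records; real field names depend on your provider or gateway.
usage_log = [
    {"team": "claims",  "model": "model-a", "use_case": "triage",   "surface": "browser", "tokens": 120_000},
    {"team": "claims",  "model": "model-b", "use_case": "drafting", "surface": "desktop", "tokens": 300_000},
    {"team": "support", "model": "model-a", "use_case": "triage",   "surface": "saas",    "tokens": 80_000},
    {"team": "support", "model": "model-a", "use_case": "summary",  "surface": "browser", "tokens": 50_000},
]


def tokens_by(records, *dimensions):
    """Sum tokens grouped by any combination of record fields."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record["tokens"]
    return dict(totals)


print(tokens_by(usage_log, "team"))
print(tokens_by(usage_log, "model", "use_case"))
```

The same function answers "which team", "which model", and "which surface" without a separate report for each, which is the point of capturing the dimensions at the moment of use rather than reconstructing them from the invoice.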

Attribution — who spends, against which outcome

A number with no owner is a number nobody manages. Spend has to attach to a team and, more importantly, to an outcome — the claim processed, the ticket resolved, the brief drafted. Cost-per-token is an input metric and it lies to you. Cost-per-trusted-outcome is the number that tells you whether a use case earns its budget or quietly burns it.
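A small Python sketch shows how the two metrics can point in opposite directions. The figures are invented, and "trusted outcome" is assumed here to mean an outcome accepted without rework; your own definition may differ.

```python
from dataclasses import dataclass


@dataclass
class UseCaseSpend:
    team: str
    use_case: str
    spend_eur: float
    tokens: int
    trusted_outcomes: int  # outcomes accepted without rework (assumed definition)


def cost_per_million_tokens(u: UseCaseSpend) -> float:
    return u.spend_eur * 1_000_000 / u.tokens


def cost_per_trusted_outcome(u: UseCaseSpend) -> float:
    if u.trusted_outcomes == 0:
        return float("inf")  # spend with nothing usable to show for it
    return u.spend_eur / u.trusted_outcomes


# Hypothetical month of spend for two use cases.
cheap_tokens = UseCaseSpend("support", "ticket triage", 100.0, 50_000_000, 200)
dear_tokens = UseCaseSpend("claims", "claim processing", 150.0, 30_000_000, 600)

for u in (cheap_tokens, dear_tokens):
    print(u.use_case,
          f"EUR {cost_per_million_tokens(u):.2f} per million tokens,",
          f"EUR {cost_per_trusted_outcome(u):.2f} per trusted outcome")
```

In this example the use case with the lower token rate (EUR 2 against EUR 5 per million) is twice as expensive per trusted outcome (EUR 0.50 against EUR 0.25).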

Controls — budgets, caps, and guardrails

Visibility without controls is just a nicer way to watch the bill grow. Teams need budgets, usage caps, and policy guardrails — the ability to say “this use case gets this much, on these models, with these data rules” — and to have that enforced rather than hoped for. This is also where cost governance and risk governance stop being separate conversations.
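One way to picture "enforced rather than hoped for" is a check that runs before each request is sent. The Python below is a minimal sketch: the Policy and Request types, their fields, and the model names are all hypothetical, and a real gateway would also need authentication, logging, and more precise cost estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    monthly_budget_eur: float
    allowed_models: frozenset
    allow_personal_data: bool


@dataclass(frozen=True)
class Request:
    use_case: str
    model: str
    estimated_cost_eur: float
    contains_personal_data: bool


def check(request: Request, policy: Policy, spent_so_far_eur: float):
    """Return (allowed, reason) for a request under a use-case policy."""
    if request.model not in policy.allowed_models:
        return False, "model not allowed for this use case"
    if request.contains_personal_data and not policy.allow_personal_data:
        return False, "personal data not permitted for this use case"
    if spent_so_far_eur + request.estimated_cost_eur > policy.monthly_budget_eur:
        return False, "monthly budget exceeded"
    return True, "ok"


# Hypothetical policy: this use case gets EUR 500 a month on two models, no personal data.
triage_policy = Policy(500.0, frozenset({"model-a", "model-b"}), False)

print(check(Request("triage", "model-a", 0.40, False), triage_policy, 120.0))
print(check(Request("triage", "model-c", 0.40, False), triage_policy, 120.0))
print(check(Request("triage", "model-a", 0.40, False), triage_policy, 499.9))
```

The data rule and the budget cap sit in the same function, which is the practical sense in which cost governance and risk governance become one conversation.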

Resilience — survive a price spike or a model going dark

The Fable 5 lesson generalises: you need to keep delivering when a price spikes or a model is pulled. In practice that means knowing, per use case, which model it depends on, what it would cost to move, and whether a compliant fallback exists. The organisations that shrugged off the cutoff were the ones who could see their exposure and switch, not the ones who found out when the API started returning errors.
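An exposure register of that kind can be sketched in a few lines of Python. The use cases, model names, and switching costs below are hypothetical placeholders, and "compliant fallback" is reduced to a single optional field for the sake of illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UseCase:
    name: str
    primary_model: str
    fallback_model: Optional[str]  # None when no compliant fallback is known
    switch_cost_eur: float         # estimated one-off cost of moving


def exposure(use_cases, unavailable_model):
    """Split the use cases hit by an outage into switchable and stranded."""
    affected = [u for u in use_cases if u.primary_model == unavailable_model]
    switchable = [u for u in affected
                  if u.fallback_model and u.fallback_model != unavailable_model]
    stranded = [u for u in affected if u not in switchable]
    return {
        "switchable": [u.name for u in switchable],
        "stranded": [u.name for u in stranded],
        "switch_cost_eur": sum(u.switch_cost_eur for u in switchable),
    }


# Hypothetical portfolio of use cases.
portfolio = [
    UseCase("claims triage", "frontier-x", "regional-y", 4_000.0),
    UseCase("contract review", "frontier-x", None, 0.0),
    UseCase("ticket summary", "regional-y", "frontier-x", 1_500.0),
]

print(exposure(portfolio, "frontier-x"))
```

Run against the hypothetical model "frontier-x", the register shows one use case that can move for an estimated EUR 4,000 and one that is stranded. That is the answer you want before the outage, not during it.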

What this means for an owner or a CXO

The instinct that produced the problem is the comfortable one: it's cheap, just let people use it. That instinct, in a world of rising token costs and disappearing model access, is precisely how you end up with an unbudgeted, unattributed, geographically fragile cost line — and a Gen-AI spend with no P&L line next to it.

The companies converting AI budget into return aren't the ones with the biggest model or the lowest per-token rate. They're the ones who made their AI spend legible and governable before it scaled, so that when prices moved and a model went dark, they adjusted instead of absorbing the hit. That discipline is what we build at Prompt Shields: turning sprawling, invisible AI usage into something you can see, attribute, control, and stand behind in front of a board or a regulator.

AI's free lunch is over. What comes next isn't cheaper tokens — it's better governance of the ones you're already paying for.
