May 30, 2026

Variable AI pricing suppresses the exploration vendors need: the lesson from Uber’s token blowout

TL;DR: Variable pricing during AI’s current maturity phase suppresses the use-case discovery that vendors need their customers to do. Uber’s recent public disclosure of token-budget burn through Claude Code isn’t a buyer-governance failure. It’s exploration, the work of testing where AI creates value in a specific business context. When the vendor’s pricing model makes that exploration visible as cost, the customer pulls back. The vendor never learns which workflows would have justified higher pricing once proven. This is the cloud-2009 phase of AI pricing architecture, and the vendors who solve for exploration during this phase will earn the production budget when use cases mature.


Last week, Uber’s chief operating officer went on the record about something most enterprises only discuss behind closed doors. The company had burned through what was projected as a year’s worth of AI token budget in a fraction of that time. The vendor was Anthropic; the product was Claude Code. The news coverage described a more interesting dynamic than just runaway consumption: Uber’s leadership was actively encouraging maximum token usage, building incentive structures that pushed employees to explore the boundaries of where AI could create value. The burn wasn’t a failure of governance. It was the deliberate output of an exploration strategy, and Uber isn’t alone in running that playbook.

The story isn’t about reckless spending or the rising cost of doing business with AI. Enterprises are executing deliberate exploration strategies under variable pricing models that turn the strategy’s intended outputs into a public-relations problem. Vendors who haven’t yet seen this dynamic in their own customer base are about to.

What actually happened at Uber

The COO’s public disclosure is itself the data point. It’s unusual at this enterprise scale for a senior operator to talk openly about an AI cost overrun. The fact that it surfaced at all signals the dynamic is widespread enough that one of the most operationally sophisticated companies in the market couldn’t quietly absorb it.

The workload wasn’t passive exploration. Per the news coverage, leadership was building incentives for high token usage rather than merely tolerating it. Multiple other enterprises have been adopting similar deliberate-encouragement strategies, treating high consumption as the intended output of an exploration phase rather than a sign of indiscipline. That distinction matters more than the absolute dollar figure. This wasn’t steady-state production load that should have been forecast against a stable baseline. It was discovery work, the kind that has to happen before any enterprise knows which AI use cases will return value in its specific context.

The token burn ran ahead of what the budget model assumed. By how much, in absolute terms, isn’t the most important detail. The shape of the story is what matters: a sophisticated enterprise customer, exploring rationally, generated consumption that ran well past projection. And the COO felt the dynamic was worth talking about publicly, which says something about how broadly the same pattern is playing out across the market.
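To make the shape of that overrun concrete, here is a minimal sketch of the run-rate arithmetic a finance team might apply to a token budget. The function names are our own, and the figures in the usage note are hypothetical; they are not Uber’s numbers, which the coverage does not give us in this form.

```python
def burn_multiple(annual_budget: float, spent_so_far: float, months_elapsed: float) -> float:
    """Ratio of observed monthly burn to the budgeted monthly burn.

    Assumes the annual budget was meant to be spread evenly over 12 months.
    """
    if months_elapsed <= 0:
        raise ValueError("months_elapsed must be positive")
    planned_monthly = annual_budget / 12
    actual_monthly = spent_so_far / months_elapsed
    return actual_monthly / planned_monthly


def months_until_exhausted(annual_budget: float, spent_so_far: float, months_elapsed: float) -> float:
    """Months of budget left if the observed burn rate simply continues."""
    if months_elapsed <= 0:
        raise ValueError("months_elapsed must be positive")
    actual_monthly = spent_so_far / months_elapsed
    if actual_monthly == 0:
        return float("inf")  # nothing spent yet, so the budget never runs out
    remaining = max(annual_budget - spent_so_far, 0.0)
    return remaining / actual_monthly
```

With a hypothetical 1,200,000 annual budget and 900,000 spent in the first three months, burn_multiple returns 3.0 and months_until_exhausted returns 1.0: the budget is gone in month four. That is the kind of number that turns an exploration program into a finance conversation.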

The pattern we wrote about in April

Six weeks ago, we published the credit-based pricing framework, an argument that credit-based and token-based AI pricing systems share a structural pattern: a vendor-controlled conversion rate between work done and tokens consumed, with opacity that customers’ procurement teams correctly pattern-match as renewal-pressure leverage. The article described how buyers respond: internal usage policing, suppression of exploratory use, renegotiation pressure at renewal.

We weren’t alone in seeing it. Procurement-advisory firms working active enterprise contracts had been documenting the same dynamic from the procurement side. One published case study from February 2026 described a 6,000-seat Atlassian customer flagging $180,000 in projected Rovo credit overage over a 24-month period and renegotiating an enterprise upgrade with a credit cap and price-protection clause. Procurement specialists and pricing-architecture specialists were converging on the same observation from different angles, in real enterprise contracts, before the dynamic surfaced publicly at brand-name scale.

Uber is that public surfacing: the same pattern moving from advisory-firm case studies into public admission by a brand-name enterprise. When an operator as sophisticated as Uber’s COO talks publicly about it, the dynamic has moved from specialist signal to market-wide phenomenon. That raises the harder question this article exists to address: if the pattern is real and now publicly admitted at brand-name scale, what’s the lesson for AI vendors who haven’t yet had their Uber moment?

Token-maxxing isn’t waste. It’s exploration.

The dominant reading misses what actually happened

The news coverage of Uber’s situation describes what Uber was actually doing: leadership was deliberately encouraging maximum token usage and incentivizing employees to push the boundaries of where AI could create value. Two interpretations of that story will dominate among readers. One: Uber was reckless. Two: this is just the cost of doing business with AI, plan accordingly.

Both interpretations miss what’s actually happening. Token-maxxing, meaning actively encouraging maximum usage as a way to find where AI creates value, is a rational exploration strategy at the current stage of AI maturity. No enterprise knows in advance which workflows will return value in its specific context. The technology is too new, the use cases are too diverse, and the integration patterns are too immature for any organization to confidently project consumption from a stable production baseline. The only way to find out is to push hard, encourage experimentation across teams, and build incentive structures that get employees to test the boundaries rather than play it safe.

Exploration is the only thing that produces use-case knowledge

What Uber’s teams were doing was the only thing that produces that knowledge: testing. Pushing AI capability at the fringes of where it might create value, discovering which experiments turn into validated use cases and which ones don’t. Some of that testing produces measurable outcomes. Some of it produces nothing. The ratio between the two is exactly what enterprise teams are trying to learn.

This is exploration. It is the work that has to happen for AI to move from experimentation budget to production budget inside any large organization. And it is exactly what every AI vendor in the market needs their customers to be doing.

Exploration is the substrate for the vendor’s own growth

Customer-validated use cases are the substrate for pricing power. They justify expansion revenue. They produce the case studies and reference customers that mature the category. Without them, the vendor has a product nobody quite knows how to use at scale and a pricing model nobody can budget for. With them, the vendor has the foundation for sustained growth that pricing-architecture work in AI software pricing builds on.

Uber wasn’t undisciplined. Uber was running a deliberate exploration strategy that’s becoming standard practice for enterprises trying to find AI’s value at the fringes. The pricing model is what made that strategy look like a failure from the outside.

Why variable pricing suppresses the exploration vendors need

Innovation happens in the engineering department, but scaling happens in the finance department.

That’s the tension variable AI pricing exposes. Engineering wants to explore where AI creates value in the business. Finance, watching the cost lines accumulate, needs to bound the exploration within what the budget can survive. The two functions pull in opposite directions, and the vendor’s pricing model determines how sharply they collide.

Visible costs get governed

Variable pricing transmits every exploratory action into a measurable cost line. Token passthrough means every “let’s see if AI can help with this” request, every fringe experiment, every iteration loop on a prompt design is visible on a budget. Enterprise procurement, finance, and engineering leadership respond rationally to visible cost lines by suppressing the actions that produce them.

This is the predictable outcome of every variable-cost line in business. Visible costs get governed. The corollary is also true: visible costs get reduced by reducing the activity that generates them, regardless of whether that activity is creating future value.

The future use case never gets discovered

For AI workloads, the activity in question is exploration. The visible cost is the immediate token spend. The future value is the use case that hasn’t been discovered yet. The rational response, from the buyer’s standpoint, is to draw boundaries around “approved AI use cases,” the workflows that have already proven themselves, and stop the exploratory work that would have surfaced the next generation of approved uses.

This is the perverse incentive structure for the vendor. The vendor needs its customer to test the fringes, because that’s where category-defining use cases emerge. The vendor’s pricing model creates direct financial pain for the customer when they test those fringes. The customer rationally pulls back. The vendor never learns which workflows would have justified higher pricing once proven, never gets the reference customers that mature the category, never sees the expansion revenue that exploration would have unlocked.

This isn’t a new dynamic. We’ve watched it play out before AI was the technology in question. A B2B aerial-imagery vendor that billed by bandwidth saw the same pattern: customers learned panning and zooming consumed bandwidth, watched it appear as a budget line, and stopped. They typed in addresses instead.

This wasn’t a one-off response. Across the customer base, the same pattern repeated: customers appointed internal usage czars to police access, reserved the paid software for premium jobs only, and routed everything else through Google Earth. Workflows molded themselves around the free alternative and didn’t come back. Customers downgraded their bandwidth subscriptions.

The vendor lost not just current consumption revenue but customer habits, expansion paths, and reference-customer status. At scale. Across the base.

The dynamic is sharpest on agentic AI workflows, where a single experimental session can produce more token consumption than a quarter of normal usage. Industry observers across the entitlement-platform space have acknowledged that fast-growing AI products eventually face runaway-usage incidents that compound before alerts are seen. That observation is correct but incomplete. The runaway is how customers learn what’s valuable. Pricing-architecture choices determine whether vendors capture or punish that learning.

When pure pass-through is the right call

Variable pricing during exploration phases isn’t categorically wrong. There are vendor and customer situations where pure pass-through is the right call. Early-stage products where the vendor itself is co-discovering value alongside the customer. Partnership terms where the buyer has agreed to absorb variability as part of the relationship. Engagements where the economics just don’t support a bounded model.

The point isn’t that variable pricing during exploration is a mistake. The point is that variable pricing during exploration has implications most vendors aren’t designing for, and the Uber story is what those implications look like at scale.

AI is at cloud-2009. Variable pricing is for mature technologies.

Variable pricing isn’t bad. It’s appropriate at a specific point in a technology’s lifecycle: after use cases are proven, after baseline volume is predictable, after the buyer can confidently project consumption from established production patterns. The market has seen this maturity curve before. Cloud computing’s pricing evolution is the analogue.

How cloud pricing evolved from variable-only to portfolio

Amazon Web Services launched EC2 in 2006 with per-hour variable pricing. The first three years of enterprise cloud adoption were marked by exactly the dynamic AI vendors are now living through. Buyers couldn’t budget. CFOs blocked spend approval. Procurement teams insisted on commit-spend conversations before signing larger contracts. Variable per-hour pricing was structurally incompatible with enterprise procurement workflows that operated on annual budgets and quarterly forecasts.

The market evolved its pricing architecture in response. Reserved instances arrived in 2009. Committed spend agreements followed. Savings plans came later. Hybrid models bounded variability during baseline workloads and reserved variability for spike events. The pure-variable model didn’t disappear. It became one of several pricing structures customers could choose deliberately, depending on where their workload sat between predictable and spiky.
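The difference between the pure-variable model and a committed model with variable overage can be sketched in a few lines. This is an illustrative simplification, not any cloud provider’s actual billing logic; the function names, rates, and usage figures are hypothetical, and amounts are in integer cents to keep the arithmetic exact.

```python
def pure_variable_bill(usage_hours: int, on_demand_rate: int) -> int:
    """Every hour is billed at the on-demand rate, so the bill tracks usage exactly."""
    return usage_hours * on_demand_rate


def committed_bill(usage_hours: int, committed_hours: int,
                   committed_rate: int, on_demand_rate: int) -> int:
    """The commitment is paid whether or not it is used; only the excess is variable."""
    overage_hours = max(usage_hours - committed_hours, 0)
    return committed_hours * committed_rate + overage_hours * on_demand_rate


# Hypothetical months: steady baseline of 1,000 hours with one spike to 1,600.
monthly_usage = [1000, 1000, 1600, 1000]
variable = [pure_variable_bill(h, on_demand_rate=10) for h in monthly_usage]
hybrid = [committed_bill(h, committed_hours=1000, committed_rate=6, on_demand_rate=10)
          for h in monthly_usage]
```

Under these assumed rates, the baseline months become a fixed, budgetable 6,000 cents instead of 10,000, and only the spike month carries a variable component. The trade-off is that the commitment is still owed in a month where usage falls below it.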

By 2015, mature enterprise cloud buyers had a portfolio of pricing options. Variable for spike and unpredictable workloads. Reserved and committed capacity for baseline. Savings plans, which AWS added in 2019, later extended that into portfolio-level optimization. The market had learned how to absorb variability at the architectural level. Cloud went from being structurally incompatible with enterprise budgeting to being native to it, not because the underlying technology became cheaper, but because the pricing architecture matured around the realities of how enterprise buyers actually work.

Where AI sits on the maturity curve right now

AI in 2026 is at the 2009 phase of cloud-pricing maturity. Token passthrough is the EC2-2009 of AI pricing. The market hasn’t yet built the architectural mechanisms that absorb variability during the exploration ramp. The vendors that build the 2010-2012 generation of AI pricing structures, the ones that bound exploration variability while preserving variable pricing for proven workflows, will earn the enterprise spend that the pure-passthrough vendors are currently losing.

Peer-reviewed research on multi-tenant cloud workload economics documents the same maturity dynamic: cost predictability becomes a prerequisite for enterprise adoption as a category moves from experimentation into production, and architectures that don’t deliver that predictability lose share to ones that do.

That isn’t a prediction. It’s a pattern the market has already worked through once, with a technology category that has more in common with AI than most people remember. The companies that took the architectural lessons from cloud and applied them earned the next decade of growth. The companies that stayed pure-variable longer than the market lifecycle warranted didn’t.

The questions vendors should be asking

For software vendors selling AI capabilities, the active question after reading the Uber story isn’t “what pricing pattern should I copy.” It’s a set of questions about your own customer base, your own competitive posture, and your own AI capability maturity. The answers determine what the right pricing architecture looks like for your situation specifically.

Customer position versus pricing assumption

Where are your customers on the exploration-to-production curve, and does your pricing match their reality or your assumptions? Most vendors are pricing as if their customers had already validated the use cases the vendor is selling against. The customers haven’t. The gap between the two is where exploration happens, and where vendor pricing models either support that work or suppress it.

What failure mode does your current pricing model create if a customer’s exploration produces an Uber-shaped burn? Is that failure mode commercially survivable for you and for them? Vendors who can answer concretely have already done the work to understand their exposure. Most haven’t. The first big public failure inside your customer base will tell you the answer, but by then the renewal conversation will be different from the one you were planning.

Short-term capture versus long-term capture

Are you optimizing for immediate variable revenue or for the proven use cases that justify higher pricing later? Which one your pricing model is actually optimizing for may not be the one you intended. The two strategies pull in different directions during the exploration phase. The math that looks favorable on a quarterly board slide can look different when it’s the cause of the next quarter’s churn.

Pricing as an ongoing function

Who owns the answer to these questions inside your organization, and is that ownership designed for the 12-18 month evolution ahead? Pricing architecture isn’t a project. It’s an ongoing function. Companies that treat it as a project will be reworking it every two quarters for the next three years.

The vendors who solve this well are designing pricing architectures that bound variability without eliminating it. The right architecture is situation-specific. There isn’t a universal answer, which is partly why most vendors are getting this wrong right now: they’re pattern-matching to whatever the leading AI labs do, and the leading labs are themselves figuring this out in real time. The frame from cloud’s history is that pricing architecture sophistication arrives in waves, not all at once. The vendors who design deliberately during this wave will look like architectural leaders three years from now.

This is also the buyer-side defense

Where vendors don’t build exploration-aware pricing into their architecture, buyers will build defenses against the failure modes that variable-during-exploration creates. Internal cost-control architectures. Application-layer governance. Hard caps. Vendor reallocation away from pure-passthrough providers toward vendors with bounded alternatives.
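As a rough illustration of what the simplest of those defenses looks like in practice, the sketch below combines a budget alert with a hard cap at the application layer. The class name, field names, and return values are hypothetical and not drawn from any specific vendor’s or buyer’s tooling.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Per-team token budget with a soft alert threshold and a hard cap."""
    hard_cap: int            # tokens allowed in the budget period
    alert_percent: int = 80  # percent of the cap at which alerts begin
    used: int = 0

    def request(self, tokens: int) -> str:
        """Record a request if it fits under the cap and report what happened."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.hard_cap:
            # Hard cap: the request is refused and usage is left unchanged.
            return "blocked"
        self.used += tokens
        # Integer comparison avoids floating-point rounding at the threshold.
        if self.used * 100 >= self.alert_percent * self.hard_cap:
            return "allowed_with_alert"
        return "allowed"
```

For a budget of 1,000 tokens, a 500-token request is allowed, a further 300 crosses the 80 percent threshold and raises an alert, and a third request for 300 is blocked outright. The blocked request is the part the vendor should worry about: it is exploratory work that simply does not happen.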

We wrote about that buyer-side architecture in hard caps vs budget alerts, architecting AI cost controls for production workloads. The two articles cover the same problem from opposite sides of the transaction. Vendors who haven’t designed exploration-aware pricing into their architecture are creating the failure modes that buyers are now building cost-of-failure architecture to defend against. Same problem, two seats.

Where this matters most for vendors: the buyers who build those defenses don’t unbuild them. The Uber story is one public example of a dynamic playing out in dozens of enterprise procurement conversations right now.

The vendors who lose the production budget to bounded competitors won’t get it back by adjusting their pricing in year three. The architectural decision happens now, in the next two to four quarters, while the market is still figuring out which vendors will be the architectural leaders of this phase.

The ones that figure out exploration-phase pricing first will compound that advantage. The ones that don’t will spend the next few years explaining their pricing to procurement teams that have already developed pattern-recognition for the alternative.

What SPP advises

SPP’s vendor pricing-architecture engagements treat exploration-phase pricing as a first-class question. Not “what’s your pricing model” but “what’s the pricing architecture during the exploration phase, and how does it graduate as use cases mature?” The answers are situation-specific to your customer base, competitive posture, and AI capability maturity.

The five-position consumption-risk transfer spectrum from our GitHub Copilot analysis maps roughly to where vendors sit on this question today. The vendors most exposed to the Uber dynamic are at Position 1, pure token passthrough with no exploration-phase mechanisms. The vendors structurally protected are at Position 5, fully bundled, though most of those may be undermonetizing elite workflows. The middle of the spectrum is where intentional architecture lives, and it’s where most B2B software companies selling AI capabilities will need to land. The detailed work of getting to that middle deliberately is what pricing-architecture engagements address.

For software companies running AI capabilities through LevelSetter, the same discipline applies internally. The pricing structures our own product uses for exploratory versus production workloads have evolved with the technology’s maturity, and they will continue to. Pricing architecture is a function, not a project.

If you’re a software vendor thinking about how your pricing model will hold up as your customers’ AI exploration meets your next renewal cycle, our pricing-architecture engagements work through these questions for your specific situation. Book a working session if you want to talk about it.
