TL;DR — Usage-based pricing backfires for AI products in the same way every time, for the same structural reason. The model makes a customer’s exploration visible as a cost line. Finance governs that cost line by cutting the activity, and the activity it cuts is the discovery work that would have surfaced the next use case worth paying for.
This is a field guide to the cases that have surfaced publicly, each mapped to its position on the five-position consumption-risk spectrum. The pattern is not “AI vendors should not meter.” It is that metering during the exploration phase, before use cases are proven, carries consequences vendors overlook.
How to read this guide
Every case here shows one mechanism: visible costs get governed. When a pricing model turns each exploratory action into a measurable cost line, finance responds the way it responds to any variable cost line, by cutting the activity that drives it. For AI, that activity is exploration. And exploration is the only thing that produces use-case knowledge this early in the technology’s life.
We map each case to the five-position consumption-risk spectrum from our GitHub Copilot pricing analysis, which ranks pricing models by who carries the cost-curve risk:
- Pos 1 — Pure passthrough. Customer carries all variability. Token or credit metering with no bound.
- Pos 2 — Surrogate metering. Vendor-controlled credit/point conversion layer over the raw usage.
- Pos 3 — Bounded hybrid. Variable within a committed envelope, overage beyond it.
- Pos 4 — Capped / included allowance. Variability absorbed into a tier up to a ceiling.
- Pos 5 — Fully bundled. Vendor carries all variability inside seat or platform pricing.
The closer a model sits to Pos 1, the more exposed it is to the backfire pattern.
A note on how to weigh these. Each entry is either a publicly reported incident or a pattern we observed directly in client engagements, not a measured survey of customer sentiment. That distinction matters less than it looks. In enterprise pricing, the story travels faster than the experience. It does not take mass failure for a pricing model to get repriced at the negotiating table. It takes a few vivid stories and the pattern-matching procurement does with them.
A handful of six-figure-invoice anecdotes reshapes how an entire buying population approaches a pricing model, well ahead of the actual base rate. So these cases are not offered as proof of a customer exodus. They are the stories that travel, and the traveling is the mechanism.
The AI cases
GitHub Copilot — token-based AI Credits
What happened. GitHub moved Copilot to token-based AI Credits, with consumption priced per model and converted to credits at a published rate. When metered billing took effect in June 2026, developers publicly reported bill-shock. Single requests consumed double-digit percentages of a monthly allowance with little to show for it. Developers shared screenshots of GitHub’s own cost estimator projecting monthly bills several times higher than before, in one widely-circulated case jumping from $44.68 to $754.29. Allowances depleted in hours rather than weeks, and some developers began rerouting to direct model access.
Spectrum position. Pos 1–2. Token passthrough wrapped in a branded credit layer.
Diagnosis. The model puts the full cost-curve risk on the developer at exactly the moment they are exploring where the tool helps. A multi-hour agentic session and a quick chat question stop being budgetable, so the rational user response is to stop exploring or to leave. The vendor captures short-term consumption revenue and trains its most active users to use the product less. See the mechanism in full: variable AI pricing suppresses exploration.
Cursor — the credit switch that ran out in a few prompts
What happened. In June 2025, the AI coding tool Cursor switched its $20 monthly plan from a fixed request allowance to “$20 worth of usage” billed at API rates. Users reported burning through the month’s allowance after a handful of prompts on the newest models, then hitting surprise overage charges they had not set limits for. The CEO publicly apologized for the rollout and offered refunds.
Spectrum position. Pos 1–2. Usage billed at passthrough API rates, with a thin prepaid allowance on top.
Diagnosis. This is the GitHub Copilot pattern in miniature, on the same buyer. A developer exploring where the tool helps cannot predict whether a session costs cents or dollars, so the safe move is to throttle back or set a hard limit. The vendor booked a few weeks of higher revenue and taught its most engaged users to ration the product, then reversed the change under public pressure.
Uber — an entire AI budget burned in four months
What happened. Uber rolled out Anthropic’s Claude Code to its engineers in December 2025, and by April 2026 had burned through its entire 2026 AI budget, four months in. Uber’s CTO disclosed the overrun in April; by May its president and COO was publicly questioning whether the token spend tied to any real product improvement, and the company moved to cap employee use.
Spectrum position. Pos 1. Pure token passthrough.
Diagnosis. This is the cleanest example of exploration being mistaken for waste. The enterprise was doing exactly what every AI vendor needs its customers to do, testing the fringes to discover validated use cases. Then the bill forced a premature value audit. The customer pulled back usage sharply and went looking for the return, and at the exploration stage the return isn’t there yet, which is the whole reason to explore in the first place.
So usage was cut before the value could surface, which is exactly how exploration dies. Under pure passthrough, that arc reads as a public-relations problem instead of a maturing-customer signal.
Atlassian Rovo — surrogate-unit overage at renewal
What happened. A February 2026 procurement case study documented a large Atlassian Rovo customer facing roughly $180,000 in projected credit overage over a 24-month term. The customer renegotiated an enterprise upgrade with a credit cap and a price-protection clause.
Spectrum position. Pos 2. Vendor-controlled credit conversion over raw usage.
Diagnosis. Credit and point systems are surrogate units, vendor-controlled accounting layers, not value metrics. Procurement teams correctly pattern-match the opacity as renewal-pressure leverage and respond by demanding caps and price protection. The vendor’s metering choice converts a growth account into a defensive negotiation.
Salesforce Agentforce — three pricing models in under a year
What happened. Salesforce launched its Agentforce AI agents in late 2024 at $2 per conversation. Customers pushed back on what counted as a conversation and how fast the meter ran. One support-team lead calculated that five agents at roughly 70 conversations a day would run past $20,000 a month. Salesforce moved to a per-action credit model by mid-2025, then to flat per-user licensing by late 2025, three pricing models in under a year.
Spectrum position. Pos 1–2. Per-conversation, then per-action consumption, before retreating toward bounded per-seat pricing.
Diagnosis. The per-conversation meter made the cost of trying the agents impossible to forecast, so buyers hesitated and the model drew open criticism. The repeated repricing is itself the signal: a consumption metric that suppresses adoption forces the vendor to keep rebuilding the architecture in public until it lands on something buyers can budget.
Does Your AI Credit System Create GitHub’s Conversion Chaos?
Token-to-credit conversions that seemed logical at launch can spiral into customer confusion within months. We’ll stress-test your AI pricing architecture against real behavioral patterns.
Pre-AI precedents
The pattern predates AI. The same mechanism showed up on entirely different consumption units years before AI was the technology in question, which is the strongest evidence that this is an architectural mistake, not an AI novelty.
Bandwidth-metered software
What happened. A B2B aerial-imagery vendor billed by bandwidth. Customers learned that panning and zooming consumed bandwidth and watched it appear as a budget line. They changed their behavior: typing in addresses instead of exploring, appointing internal usage tzars, reserving the paid tool for premium jobs, and routing everything else to a free alternative. Workflows reformed around the free tool and did not come back. Customers downgraded their subscriptions.
Spectrum position. Pos 1. Pure passthrough on a consumption metric the buyer could not bound.
Diagnosis. The same mechanism, years before AI was the technology in question. Metering the exploratory action (looking around the map) taught customers to stop doing it. The vendor lost not just consumption revenue but customer habits, expansion paths, and reference-customer status, across the base.
Attribution-metered mobile analytics — when the pilot invoice kills the pilot
What happened. A mobile-app attribution platform billed by attribution events: impressions, clicks, installs, and in-app events, each a billable unit. A brand ran a pilot to evaluate the platform. Real campaign volume drove the event count far past anyone’s model, and the trial alone generated a roughly $600,000 invoice. The brand terminated the pilot.
Spectrum position. Pos 1. Pure passthrough on a consumption metric the buyer could not bound, scaling with campaign volume rather than with realized value.
Diagnosis. The pilot was the exploration: a brand testing whether the platform earned a place in its stack before committing. The event-based metric turned that evaluation into a six-figure liability before any value was proven, so the rational move was to kill the pilot. The platform’s own value metric punished the trial meant to sell it. The same mechanism as the AI cases, years earlier, on a different consumption unit.
Observability metering at enterprise scale
What happened. In 2021, a major crypto exchange accumulated an observability bill of roughly $65 million for the year, on usage-based monitoring pricing, during a period of growth-at-all-costs. When it surfaced publicly, it set off a wave of similar complaints about unpredictable consumption billing. The company built a team to migrate off, and the vendor renegotiated to keep the account.
Spectrum position. Pos 1. Uncapped consumption, priced per host, metric, and event ingested.
Diagnosis. Non-AI, enterprise scale, same shape. Usage that nobody governed during the growth phase became a bill nobody could defend at the cost-cutting phase, and the buyer’s response was to build the bounded alternative in-house. The pattern is not specific to AI or to any one vendor. It is what uncapped consumption pricing does when the buyer cannot see the meter until the invoice arrives.
The architectural fix
None of these outcomes argue against variable pricing. Variable pricing is appropriate once use cases are proven and baseline consumption is predictable. The problem is variable pricing during the exploration phase, before the customer knows which workflows pay off.
The market has solved this once already. Cloud computing launched on pure per-hour variable pricing in 2006 and spent its first three years structurally incompatible with enterprise budgeting. The fix was not abandoning variable pricing. It was building reserved instances, committed-spend agreements, and savings plans, mechanisms that bound variability during baseline workloads while preserving it for genuine spikes. By the mid-2010s, mature buyers had a portfolio of pricing options matched to workload predictability.
The same correction is already underway outside AI. Uncapped usage-based billing has produced catastrophic surprise bills, including a 2024 six-figure invoice to the owner of a small static site hit by a denial-of-service traffic spike. In response, infrastructure vendors have started shipping flat-rate options that cap the variability, the same move the cloud market made 15 years ago.
AI pricing in 2026 sits at that same early phase. The vendors who build exploration-aware pricing architecture now, models that bound exploration variability without eliminating it, will earn the production budget when use cases mature. We cover the buyer-side half of this, how customers defend themselves when vendors do not, in hard caps vs budget alerts, and the upstream metric decision in agentic AI pricing strategy. If you want to know whether your own architecture is exposed before the market tells you, the free Pricing Architecture Assessment scores it across licensing, packaging, and pricing in about four minutes.
And it is a continuous function, not a maturity model. The slow-moving consulting version treats pricing as a staged climb a company completes over years. AI moves too fast for a multi-year ladder. The architecture has to be tuned and graduated continuously as customers move from exploration to production, which is the discipline of continuous monetization.
Where your pricing sits
If your AI pricing lives near Pos 1 and your customers are still exploring, you are exposed to every backfire in this guide. The architectural question is not price level. It is what your pricing does to a customer’s exploration, and how it graduates as their use cases mature.
Our pricing-architecture engagements work through that question for your specific customer base, competitive posture, and AI capability maturity. Book a working session to scope where you sit on the spectrum and what the graduation path looks like.
If you’re closer to a backfire than you think, a same-day pricing diagnostic reads your own deal data for the early-warning signals — the proactive prevention before the public chapter.
For the original case for this discipline, see SPP’s 2023 piece on why continuous monetization became urgent for software companies.