Build agents, get paid
Understand your margins and get paid for the value your agents create.
By now everyone knows usage-based pricing charges customers based on product consumption instead of flat subscription fees. In traditional SaaS, you paid per API call, per gigabyte stored, or per message sent.
For AI agents, this model is creating a margin crisis. That is why usage-based pricing is the wrong fit for AI, and for AI agents in particular.
Definition: A pricing model where costs scale directly with measurable product usage like API calls, compute hours, or data processed.
You pay for what you use. Nothing more, nothing less.
The model powered some of the biggest SaaS companies:
These work because costs stay relatively predictable. An API call costs roughly the same whether it's your first or your millionth (assuming you didn't pre-buy a package with overage fees).
| Factor | Subscription | Usage-Based |
|---|---|---|
| Bill predictability | As predictable as you can get | Variable |
| Customer risk | You pay regardless of use | Usage can exceed what is planned |
| Revenue predictability | Guaranteed | Variable |
| Expansion potential | Limited | Unlimited |
| CFO preference | Preferred (makes forecasting easy) | Cautious (harder to forecast on) |
Traditional SaaS: If an API call costs $0.001 and you charge $0.005, your margin holds at 80%.
Compare with AI agents: A "resolved ticket" costs anywhere from $0.99 to $3.00.
One customer service platform charges $0.99 per resolved ticket. Simple questions cost them $0.04 in resources. Complex issues cost $2.80. Average margin is ~60%, but engaged customers generate losses: every complex issue costs $1.81 more than it brings in.

This variance isn't a bug. Agents assess complexity and adapt. Simple queries need one LLM call, while complex ones trigger research, use memory, run multiple reasoning steps, and generate more detailed responses. It's hard to price for the average when the range is this wide.
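To make the spread concrete, here is a minimal Python sketch that uses only the figures quoted above ($0.99 price, $0.04 and $2.80 costs, ~60% average margin). The `gross_margin` helper and the simplification that there are only two ticket types are assumptions for illustration, not the platform's actual accounting.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin expressed as a fraction of the price charged."""
    return (price - cost) / price


PRICE = 0.99         # charged per resolved ticket (from the example above)
SIMPLE_COST = 0.04   # cost of a simple question
COMPLEX_COST = 2.80  # cost of a complex issue

saas_margin = gross_margin(0.005, 0.001)  # the traditional SaaS API-call example
simple_margin = gross_margin(PRICE, SIMPLE_COST)
complex_margin = gross_margin(PRICE, COMPLEX_COST)

# Assumption: every ticket is either "simple" or "complex".
# Share of complex tickets that produces a 60% blended margin.
blended_cost = PRICE * (1 - 0.60)
complex_share = (blended_cost - SIMPLE_COST) / (COMPLEX_COST - SIMPLE_COST)

# Share of complex tickets at which a customer stops being profitable.
break_even_share = (PRICE - SIMPLE_COST) / (COMPLEX_COST - SIMPLE_COST)

print(f"API call margin:       {saas_margin:.0%}")
print(f"Simple ticket margin:  {simple_margin:.0%}")
print(f"Complex ticket margin: {complex_margin:.0%}")
print(f"Complex share at 60% blended margin: {complex_share:.1%}")
print(f"Complex share at break-even:         {break_even_share:.1%}")
```

Under that two-type assumption, the healthy-looking average only holds while complex issues stay a small minority of tickets. Any customer whose mix drifts past the break-even share is served at a loss, even though the price per ticket never changed.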
Manny Medina (our CEO who built Outreach to a $4.4B valuation) sees this playing out across the industry:
Traditional software waited for instructions. Click send, email goes out, charge applies.
AI agents make independent decisions. An AI SDR might send five touchpoints instead of two, adjusting based on engagement. Do you charge for two emails or five? For the outcome or the process? For successful actions or every attempt?
Cursor faces this daily. When their AI generates code, refactors it, then fixes a bug, how many "completions" was that? Developers wanted working auth. They don't know if that costs 1,000 tokens or 10,000.
The whole metering approach of "count the things, charge per thing" breaks down here, as one outcome could have many usage events.
Charging for tokens maps directly to infrastructure costs but obscures the value. Anthropic does it. OpenAI does it. But customers think in outcomes, not tokens, which is why so many people complain about their bills.
"Implement user authentication" is clear. Estimating 8,347 input tokens and 2,156 output tokens requires expertise most don't have.
This creates two BIG problems:
AI model costs drop 90% annually while better models launch constantly.
When you price based on tokens or compute, your pricing tracks the collapsing cost structure. As GPT-4 gets cheaper, customers expect matching price drops.
One document processing company started at $0.50 per page. When costs dropped and competitors entered at $0.30, then $0.15, they had no defense.
Companies charging for outcomes maintained pricing power. A reviewed contract is worth the same whether processing costs $2 or $0.20.
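A rough sketch of why this happens, using the $2 and $0.20 processing costs above. The 2.5x markup and the $10 outcome price are hypothetical numbers chosen only to show the shape of the effect.

```python
COSTS = [2.00, 0.20]   # processing cost per contract, before and after the drop
MARKUP = 2.5           # hypothetical cost-plus multiplier
OUTCOME_PRICE = 10.00  # hypothetical fixed price per reviewed contract


def profit_cost_plus(cost: float) -> float:
    """Profit per unit when the price is a fixed multiple of cost."""
    return cost * MARKUP - cost


def profit_outcome(cost: float) -> float:
    """Profit per unit when the price is pinned to the outcome."""
    return OUTCOME_PRICE - cost


for cost in COSTS:
    print(
        f"cost ${cost:.2f}: "
        f"cost-plus profit ${profit_cost_plus(cost):.2f}, "
        f"outcome profit ${profit_outcome(cost):.2f}"
    )
```

With a price that tracks cost, the percentage margin stays the same but the dollars earned per contract shrink along with the cost. With a price tied to the outcome, every drop in cost goes to margin instead.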
Usage-based pricing feels simple in theory - you count, charge, and you're done.
But a customer service agent doesn't just "resolve tickets." It searches your knowledge base, generates responses, checks policies, formats output, and learns from interactions.
Five usage events for one outcome. Which do you charge for? All of them? How do you explain that?
Usage-based pricing assumes you can measure usage accurately and cheaply.
For traditional SaaS, counting API calls is trivial. For AI agents, you need to track LLM calls across providers, attribute costs to workflows, monitor tokens in real time, calculate per-interaction margins, and catch cost spikes before they kill profitability.
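As a minimal sketch of what that tracking involves, the Python below rolls raw usage events up into one cost per workflow and flags the workflows that lose money. Everything here is assumed for illustration: the `UsageEvent` shape, the placeholder provider names, and the per-step costs, which are chosen to add up to the $0.04 and $2.80 tickets from the earlier example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    workflow_id: str  # assumption: one resolved ticket maps to one workflow
    provider: str     # placeholder names, not real vendors
    step: str
    cost_usd: float


def cost_per_workflow(events: list[UsageEvent]) -> dict[str, float]:
    """Sum raw usage-event costs into a single cost figure per workflow."""
    totals: dict[str, float] = defaultdict(float)
    for event in events:
        totals[event.workflow_id] += event.cost_usd
    return dict(totals)


def unprofitable_workflows(events: list[UsageEvent], price: float) -> list[str]:
    """Return workflows whose total cost exceeds the price charged."""
    return [wf for wf, cost in cost_per_workflow(events).items() if cost > price]


EVENTS = [
    UsageEvent("ticket-1", "provider-a", "search_knowledge_base", 0.01),
    UsageEvent("ticket-1", "provider-a", "generate_response", 0.03),
    UsageEvent("ticket-2", "provider-a", "search_knowledge_base", 0.40),
    UsageEvent("ticket-2", "provider-b", "generate_response", 1.60),
    UsageEvent("ticket-2", "provider-b", "check_policies", 0.80),
]

for workflow, cost in cost_per_workflow(EVENTS).items():
    print(f"{workflow}: cost ${cost:.2f}")
print("losing money at $0.99:", unprofitable_workflows(EVENTS, price=0.99))
```

The key design choice is the `workflow_id` on every event. Without a shared identifier that ties each LLM call back to the outcome it served, usage data can be counted but not attributed, and per-interaction margin cannot be calculated at all.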
Stripe counts transactions but doesn't know workflow costs. Chargebee generates invoices but can't tell you which customers are profitable.
The infrastructure gap means flying blind while thinking you're data-driven because you're "usage-based."
Traditional SaaS had simple economics:
AI agents invert everything:
At this point, you might be thinking: "Fine, usage-based is broken. Seats are dying. What's actually supposed to work?"...
We have a surprisingly simple answer:
"You don't need seats. You don't need a ton of thinking. The moment you establish what the exchange value is going to be, then you're in good shape to price."
This sounds almost too simple. But it cuts through the complexity: figure out the value you're creating, take 20-50% of that, and build a system that lets you price differently for each customer. Not a perfect formula. Not a one-size-fits-all model. A framework for thinking about value exchange.
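In code, the framework is little more than a per-customer lookup and a multiplication. This sketch assumes you already have a value estimate for each customer; the customer names and dollar figures are made up, and only the 20-50% share comes from the text above.

```python
def price_band(value_created: float,
               low_share: float = 0.20,
               high_share: float = 0.50) -> tuple[float, float]:
    """Return the (low, high) price range as a share of value created."""
    return value_created * low_share, value_created * high_share


# Hypothetical monthly value estimates, one per customer.
VALUE_BY_CUSTOMER = {
    "customer-a": 10_000.0,
    "customer-b": 2_500.0,
}

for customer, value in VALUE_BY_CUSTOMER.items():
    low, high = price_band(value)
    print(f"{customer}: value ${value:,.0f} -> price ${low:,.0f} to ${high:,.0f}")
```

The arithmetic is trivial on purpose. The hard part is the value estimate that feeds it, which is why the price ends up different for each customer.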
Martin Casado and Kyle Poyar recently wrote about why all pricing models have flaws. Some of the criticism is fair, but not every model requires a PhD or is a pipe dream. Where they are right is that usage-based pricing for AI is misaligned with the economics.
Usage-based pricing worked in SaaS because it aligned incentives. Customers paid for what was performed, and vendors captured revenue in proportion.
For AI agents, it creates misalignment:
Our SaaS to AI Agent transition report analyzed 250+ companies. Pure usage-based pricing correlates with 70% churn and negative margins, while companies that evolve to workflow or outcome models maintain 94% margins.
At this point, you have to recognize that usage-based pricing optimizes for infrastructure efficiency when the world demands outcome delivery.