Every SaaS company is adding AI. Most of them are losing money on every query. Traditional B2B SaaS ran at 80% to 90% gross margins because the marginal cost of an additional user was close to zero. AI-first companies are operating at 50% to 60% margins, with some early-stage companies as low as 25%. The pricing models haven't caught up. That's the trap.
But here's what most of the conversation gets wrong. Not all SaaS companies are the same. Not all products need an AI component. Some products have multiple AI components maturing in real time, each with a different cost structure, usage pattern, and value delivery mechanism. Treating them as one category and debating "seat vs. usage vs. outcome" at the company level is the wrong conversation.
The real architecture happens at the workload level.
The problem statement is simple: not all AI workloads do the same thing or cost the same amount. How can you price them without understanding the cost to serve each one? A single product might contain standard inference, RAG-based retrieval, agentic orchestration, and batch processing, each with fundamentally different economics. Pricing them with one model is like charging the same rate for electricity, water, and gas because they all come through pipes.
The Structural Shift: Software as Labor
The industry has moved from an Ownership Era (perpetual licenses) through an Access Era (seat-based subscriptions) into what Andreessen Horowitz calls the Value Era: customers pay for a job to be done. Software is becoming labor. Bessemer reports buyers now evaluate AI as a productive teammate capable of independent execution. If your AI handles 45% of support tickets autonomously, the customer needs fewer seats, not more. For the vendor to grow revenue under a per-seat model, they'd need the AI to fail.
Seat-based pricing dropped from 21% to 15% of the market in a single year. Companies sticking to per-seat pricing for AI products are seeing 40% lower gross margins and 2.3x higher churn.
The Margin Wedge
GitHub Copilot launched at $10/month with unlimited usage. Compute costs ran as high as $80 per user/month for heavy users, with average losses of $20/user. By mid-2025, Microsoft introduced $0.04/request pricing beyond caps. Replit scaled from $2M to $144M ARR but only achieved positive gross margins by moving to usage-based models. Clay accelerated from $2M to $37M ARR by iterating pricing twice per year (seat, then hybrid, then pure usage). Even as inference costs drop 80-90% per year, user consumption accelerates faster. This is the Jevons Paradox applied to AI compute.
Not All AI Workloads Are Created Equal
A SaaS product might contain six or more distinct AI workload types, each with a different cost driver, scaling pattern, and pricing implication. Inference now accounts for 55%+ of total AI cloud infrastructure spend ($20.6B of $37.5B), surpassing training for the first time. Deloitte estimates inference will reach two-thirds of all AI compute by end of 2026. But inference is not one thing. Real-time chat, batch processing, RAG retrieval, and agentic orchestration all fall under "inference" with wildly different cost profiles.
Key insight: the true cost of a resolved AI task is often 10 to 50 times higher than the posted per-call price when vector search, memory, concurrency, and moderation are included. A $0.01 model call becomes a $0.40 to $0.70 workflow. OpenAI burned roughly $8.7 billion on Azure inference in the first three quarters of 2025. That's not training. That's serving outputs.
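That 10-to-50x multiplier is easy to verify with back-of-the-envelope arithmetic. The per-component figures below are illustrative assumptions chosen to match the ranges cited above, not any vendor's published rates:

```python
# Illustrative cost-to-serve stack for one resolved AI task.
# Every per-component figure here is a hypothetical assumption.
COST_STACK = {
    "model_calls":   0.01 * 8,  # ~8 LLM calls per workflow at the posted $0.01 each
    "vector_search": 0.06,      # embeddings + retrieval against a vector store
    "memory_state":  0.05,      # conversation / agent state persistence
    "concurrency":   0.12,      # reserved capacity and latency headroom
    "moderation":    0.04,      # safety and compliance filtering
    "retries":       0.08,      # failed steps and verification loops
}

posted_price = 0.01                   # the advertised per-call price
true_cost = sum(COST_STACK.values())  # fully loaded cost per resolved task
multiplier = true_cost / posted_price

print(f"true cost per resolved task: ${true_cost:.2f} "
      f"({multiplier:.0f}x the posted per-call price)")
```

Under these assumptions the $0.01 call lands at $0.43 per resolved task, squarely inside the $0.40 to $0.70 range, and the posted per-call price understates cost-to-serve by roughly 40x.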
The ROI Divide: Soft vs. Hard
Bessemer draws a sharp line between Soft ROI (copilots that advise, hard to measure, high churn risk) and Hard ROI (agents that execute, concrete metrics, premium pricing power). If your AI delivers measurable outcomes, your pricing should capture a share of that value.
Soft ROI (copilots that advise):
- Value: "better emails," "faster drafts" → hard to prove at renewal.
- McKinsey: 14% issue-resolution increase, 9% handling-time reduction.
- Examples: Grammarly, Notion AI, GitHub Copilot (original flat fee)

Hard ROI (agents that execute):
- Value: tickets resolved, revenue recovered, hours replaced → auditable.
- Intercom Fin: resolution rate from 15% to 45% in 5 months at $0.99/resolution.
- Examples: Intercom Fin, Chargeflow (25% of recovered $), HighRadius (zero upfront)
Matching the Charge Metric to the Workload
The model isn't something you pick from a menu. It's something you match to the level of autonomy your product delivers and the cost structure of each workload underneath it. Don't force outcome pricing on a copilot. Don't sell an autonomous agent per-seat. 65% of vendors now use a hybrid approach. 85% of SaaS leaders adopted usage or hybrid pricing by 2025.
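The matching logic can be sketched as a toy decision rule. The category names and branches below are illustrative assumptions for demonstration, not a published framework:

```python
def recommend_charge_metric(autonomy: str, cost_variance: str) -> str:
    """Toy rule for matching a charge metric to a workload.

    autonomy: "advisory" (copilot), "supervised", or "autonomous" (agent)
    cost_variance: "low" or "high" variance in cost-to-serve per unit of work
    """
    if autonomy == "advisory":
        # Copilots deliver soft ROI; a flat per-seat add-on is defensible.
        return "per-seat add-on"
    if autonomy == "autonomous" and cost_variance == "low":
        # Agents with auditable results and predictable costs can
        # price against the outcome itself.
        return "per-outcome"
    # Supervised work, or anything with volatile cost-to-serve:
    # meter usage so revenue tracks compute.
    return "usage-based"

print(recommend_charge_metric("advisory", "low"))     # copilot drafting
print(recommend_charge_metric("autonomous", "low"))   # ticket-resolving agent
print(recommend_charge_metric("autonomous", "high"))  # open-ended research agent
```

Run per workload, not per product: the same vendor might end up with a per-seat add-on for its copilot, metered usage for its batch pipeline, and per-outcome pricing for its resolution agent.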
Packaging: Fence for Willingness-to-Pay, Not Features
Traditional SaaS packaging drew tier boundaries based on feature checklists: Basic gets 5 features, Pro gets 10, Enterprise gets 15. In AI, the fencing dimensions have shifted. The question isn't what capabilities you can access. It's how well, how securely, and how independently the AI performs for you. Every tier gets the AI. The fence determines the quality of execution, the level of trust, and the degree of autonomy.
These aren't arbitrary upsell levers. Each fence maps directly to a different cost-to-serve: a frontier model costs 10x more to run than a basic one, a private instance costs more than shared infrastructure, and full autonomous execution triggers more compute steps than reactive prompting.
The Cost Visibility Problem
None of this discussion of pricing models and packaging matters if the buyer can't predict their cost at the time of use. This is the gap most pricing articles ignore, and it's the dimension that determines whether customers adopt or throttle their usage.
The unsolved problem is agentic workloads. A customer sends what looks like a simple request: "research competitors and write a report." Underneath, the agent triggers 50+ steps: multiple LLM calls, tool use, web retrieval, verification loops, and state management. The customer has no idea at the time of send what that will cost.
This is why guardrails, spend alerts, and session caps matter as much as the pricing model itself. The vendors that solve cost visibility will win adoption. The ones that don't will watch customers throttle usage out of fear, regardless of how much value the AI delivers.
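A minimal version of that guardrail is a per-session spend cap checked before every agent step. This sketch assumes per-step costs can be estimated up front; all class names and figures are illustrative (integer cents are used to avoid float drift on money):

```python
class BudgetExceeded(Exception):
    pass


class SessionBudget:
    """Per-session spend cap: refuse the next agent step once it would
    exceed the cap, and fire a spend alert at a configurable threshold."""

    def __init__(self, cap_cents: int, alert_fraction: float = 0.8):
        self.cap_cents = cap_cents
        self.alert_cents = int(cap_cents * alert_fraction)
        self.spent_cents = 0
        self.alerted = False

    def charge(self, step_cost_cents: int) -> None:
        # Check the cap BEFORE spending, so the session never overshoots.
        if self.spent_cents + step_cost_cents > self.cap_cents:
            raise BudgetExceeded(
                f"next step would reach {self.spent_cents + step_cost_cents} cents; "
                f"cap is {self.cap_cents}")
        self.spent_cents += step_cost_cents
        if not self.alerted and self.spent_cents >= self.alert_cents:
            self.alerted = True
            print(f"ALERT: {self.spent_cents / self.cap_cents:.0%} "
                  f"of session budget used")


# A 50-step agentic request under a $2.00 session cap, $0.05 per step:
budget = SessionBudget(cap_cents=200)
completed = 0
try:
    for _ in range(50):
        budget.charge(5)
        completed += 1
except BudgetExceeded:
    pass

print(f"completed {completed} of 50 steps, spent ${budget.spent_cents / 100:.2f}")
```

The session halts at 40 of 50 steps with exactly $2.00 spent, after alerting at 80% of budget. The point of checking before charging, rather than after, is that the customer's statement can never exceed the number they agreed to.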
The Renewal Cliff Is Coming
Most AI deals closed in 2025 were subsidized. Those contracts are hitting renewal in 2026. The double-cost transition gap (AI agent + human salary during pilot) stalls enterprise sales. 85% of organizations misestimate AI project costs by more than 10%. 80% of AI projects fail before production due to cost overruns. The companies that survive will have defined success metrics before contract signature and built pricing structures with graduated adoption paths.
The Blueprint
The era of selling potential is over. In 2026, software must earn its keep by delivering measurable work. Expect to spend three dollars on change management for every dollar on technology. The companies that win won't have the best AI models. They'll have figured out how to price and package what those models actually deliver.
Download the complete article with all exhibits including the AI Workload Taxonomy, Bain Strategic Scenario Matrix, and full packaging fence analysis.
Download PDF →
Massoud Ashrafi is the founder of Ashrafi Consulting, where he advises PE-backed and growth-stage companies on pricing architecture, monetization strategy, and commercial governance. He previously held senior pricing and product leadership roles at Amazon, Twilio, GoDaddy, and PwC.
Sources & References
1. The Economics of AI-First B2B SaaS in 2026, Monetizely
2. AI Is Driving a Shift Towards Outcome-Based Pricing, a16z (Dec 2024)
3. The AI Pricing and Monetization Playbook, Bessemer Venture Partners
4. Per-Seat Pricing Isn't Dead, but New Models Are Gaining Steam, Bain
5. From Seats to Consumption: Why SaaS Pricing Has Entered Its Hybrid Era
6. AI Agent Monetization: Lessons from the Real World, Stactize
7. How to Monetize Generative AI Features in SaaS, Simon-Kucher
8. Value Monetization in the Age of AI, Simon-Kucher
9. Evolving Models and Monetization Strategies in the New AI SaaS Era, McKinsey
10. Economic Potential of Generative AI, McKinsey
11. Zuora COMPASS Framework for Agentic AI Pricing
12. Bain SaaS Workflow Scenario Framework (via iMerge Advisors)
13. AI Compute Predictions 2026, Deloitte
14. AI Inference Costs: 55% of Cloud Spending in 2026, byteiota
15. Guide to Inference Cost, CloudZero
16. Cost Estimation of AI Workloads, FinOps Foundation
17. OpenAI Azure inference spend: ~$8.7B in 9 months
18. Agentic AI Price Metric Spectrum, Simon-Kucher
19. Ibbaka Four-Layer Pricing Framework (HICSS)
If your AI margins are eroding faster than your inference costs are dropping, the pricing architecture is the problem.
Request a Diagnostic →