Why are AI startup gross margins lower than traditional SaaS?

AI cost structure resembles tech-enabled services rather than zero-marginal-cost software. Andreessen Horowitz puts AI gross margins at 50-60%, below the 60-80%+ SaaS benchmark, because every enterprise requires bespoke integration, evaluation, and continuous human alignment that recurs over the contract's life.

What is the Eval Trap in enterprise AI?

The Eval Trap is the need to build a custom, human-maintained evaluation pipeline for every enterprise client to catch hallucinations before production. Because LLM output is probabilistic, this work can't be written once and shipped to every customer, which breaks the build-once-sell-infinitely SaaS model.

Why does removing forward deployed engineers threaten AI revenue?

Much AI recurring revenue depends on continuous human labor to exist. When forward deployed engineers leave, the product degrades, hallucinations return, and renewals fall into doubt. The revenue evaporates instead of shrinking gracefully, as genuine SaaS revenue would.

Probabilistic COGS: Why AI ARR Is Disguised Consulting

The ARR That Disappears When the Engineers Leave

Pull apart an AI startup's revenue and it stops looking like software. Freeze the implementation team and the product gets worse. Stop tuning prompts and hallucinations return. Remove the forward deployed engineer from the customer account and renewal risk shows up fast. SaaS does not work like that. A Salesforce seat keeps doing its job whether or not someone hovers over it. An enterprise AI deployment usually does not.

SaaS-style multiples on AI names still assume margins will expand like software margins. Series B investors and public-market buyers kept underwriting that story through 2024. The problem is simple: integration is not a phase you finish. The human work required to keep a probabilistic system useful inside bad enterprise data keeps coming back. That cost belongs in cost of goods sold.

The thesis is blunt. A large share of reported AI recurring revenue is Disguised Retainer Revenue: consulting and integration labor billed as subscription software. Large language models force ongoing human alignment for each customer. That breaks the build-once, sell-forever model and keeps gross margins closer to an agency than a software platform. Sooner or later, valuation follows the economics.

Why the SaaS Premium Was Always Misapplied

The SaaS premium exists because software has near-zero marginal cost. It does not fit AI very well, because AI often looks like a tech-enabled services business. Andreessen Horowitz put gross margins for AI companies in the 50 to 60 percent range, below the 60 to 80 percent-plus range for traditional SaaS.¹ That spread is not a footnote. It is the story.

The number matters. The reason matters more. Traditional SaaS spreads one engineering effort across thousands of customers. AI cannot, because each enterprise has its own data stack and its own error tolerance. Tomasz Tunguz gets to the point: AI startups face margin pressure from inference on the back end and implementation labor on the front end.² Investor notes usually model inference costs explicitly but treat implementation labor as temporary onboarding. That assumption is doing a lot of work.

"In many cases, AI companies simply don't have the same economic construction as software businesses. At times, they can even look more like traditional services businesses." (Andreessen Horowitz, The New Business of AI)

When the firm's own analysts say the model looks like services, the burden shifts. The debate is no longer whether margins are lower. The debate is whether they ever become software margins at all.

[Image: a sleek consumer app resting on top of a heavy industrial bank vault, the vault dwarfing the app. Caption: The visible product is light; the deployment machinery underneath carries the weight.]

The Eval Trap and the ARR Mirage

The Eval Trap turns AI software into AI services. LLM output is probabilistic, so every enterprise ends up needing its own evaluation stack to catch failures before they hit production. LlamaIndex says it plainly: building a RAG pipeline is easy, but evaluating it and making it production-ready for a specific enterprise is hard.³ In normal software, you build a feature once and ship it widely. In AI, the feature has to be validated against the client's data and the failure modes that matter to that client. Harrison Chase of LangChain makes the same point from the architecture side. Each company needs a slightly different setup, and there is no universal agent.⁴ The eval does not get written once. It gets maintained, retuned, and checked again as the model drifts and the data changes. The labor stays attached to the contract.

Disguised Retainer Revenue is recurring revenue that depends on ongoing human labor, even though it is billed like subscription software. The test is simple: remove the people and see if the revenue survives. With real SaaS, it does. With a lot of enterprise AI, it does not. C3.ai shows the pattern in its filings. The company says services revenue comes from implementation projects and training engagements, and it also says the model requires significant investment in customer deployment.⁷ A company like Salesforce does not carry forward-deployment labor as a core recurring delivery cost for each account. Palantir's S-1 is even clearer: its Forward Deployed Software Engineers handle deployment and configuration, and their costs are a significant part of cost of revenue.⁸ Once a company books integration labor above the gross margin line, it is admitting that labor is part of the product. Many enterprise AI vendors do some version of this while still presenting ARR as if it were pure software.

Strip the forward deployed engineers out of an AI contract and the ARR doesn't shrink gracefully; it evaporates.

[Chart: Gross margin, SaaS benchmark vs AI reality. Caption: Per Andreessen Horowitz's benchmark, AI margins sit a full tier below the SaaS band the premium assumes.]

Revenue Type	Survives FDE Removal?	Gross Margin Profile	Correct Classification
Traditional SaaS seat	Yes	70-80%	Software
AI subscription (productized)	Yes	55-65%	Software
AI subscription (FDE-dependent)	No	30-50%	Services / Retainer
Pure professional services	No	20-35%	Services
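
The gap between the rows above comes down to where deployment labor is booked. A minimal sketch of that arithmetic follows. Every figure in it (contract value, inference cost, FDE labor) is a hypothetical assumption chosen for illustration, not data from any company or source cited here.

```python
def gross_margin(revenue: float, cogs: float) -> float:
    """Return gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


# Hypothetical contract; all numbers are illustrative assumptions.
revenue = 1_000_000        # annual contract value
inference_cost = 200_000   # model and hosting costs
fde_labor = 300_000        # forward deployed engineer time on the account

# Margin if deployment labor is treated as one-off onboarding (kept out of COGS).
margin_as_pitched = gross_margin(revenue, inference_cost)

# Margin if the same labor recurs and is booked in cost of revenue.
margin_with_labor = gross_margin(revenue, inference_cost + fde_labor)

print(f"labor treated as onboarding: {margin_as_pitched:.0%}")
print(f"labor booked in COGS:        {margin_with_labor:.0%}")
```

With these made-up inputs the same contract moves from an 80% margin to a 50% margin. Nothing about the product changed, only the honesty of the cost line.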

Selling Work Means Owning the Liability

Bessemer describes the shift as software-as-a-service turning into service-as-a-software.⁹ The wording sounds cute. The accounting consequence is not. Sell a tool and the customer owns the outcome. Sell the work and you own the outcome. Sarah Tavel of Benchmark says the next wave will not sell software. It will sell work.¹⁰ Higher-value pitch, worse margin profile. Labor does not go to zero.

Selling an AI agent is economically closer to hiring a worker than buying a seat license. TechCrunch captured the mood inside the industry: startups have realized that selling AI looks more like onboarding a digital employee.¹¹ An employee needs ongoing supervision and correction. So does an agent. Bret Taylor, co-founder of Sierra, says you cannot just drop an LLM into an enterprise. You have to connect it to systems of record, which means deep integration work.¹² Snowflake says the same from the data side. No AI strategy without a data strategy, and data prep is still the bottleneck for generative AI.¹³ Early-stage AI vendors often spend weeks cleaning customer data pipelines before the product works well enough to renew.

[Image: a humanoid robot in a business suit sitting at an onboarding desk, being handed a stack of training manuals by a human mentor. Caption: An agent enters the enterprise like a new hire, not like a license key.]

Who Actually Captures the Last Mile

The Palantir Illusion traps a lot of founders. Yes, Palantir built a real business with forward deployed engineers and its AIP Bootcamp model, putting top engineers directly in the room with customer data.¹⁵ But Palantir can carry that model because it has large contracts that absorb expensive deployment labor. A seed-stage startup pointing to Palantir is copying the theater without the balance sheet.

Follow the money. Accenture reported more than $600 million in new generative AI bookings in one quarter, and its clients keep discovering that scaling AI starts with a lot of data and services work.¹⁶ The consultancy wins because the last mile is a services problem, and Accenture never pretended otherwise. AI startups often do the same work while calling it software. Elad Gil's framing is right: many of them are acting as outsourced R&D and integration teams for enterprises.¹⁷ Sequoia's David Cahn has pointed to the widening gap between the revenue implied by AI infrastructure spending and the revenue actually showing up.¹⁸ Part of that gap is human throughput. Deployment cannot scale at the speed of capital because each enterprise needs custom integration and ongoing evaluation.

What Founders and Investors Should Do Before the Correction

If revenue depends on FDE labor, investors will eventually value that revenue more like services than software. The only real question is timing.

Start with one number: subscription revenue that survives if you pull the forward deployed engineer. That is the software business. Everything else deserves a different label, a different margin expectation, and probably a different multiple.
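
One way to put that number on a page is sketched below. The `Account` record, the `needs_fde` flag, and the sample book of business are all hypothetical; in practice the flag is a judgment call (or the result of actually pulling the engineer), not something a script can determine.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    arr: float        # annual recurring revenue billed to this account
    needs_fde: bool   # assumption: would this account churn without an FDE?


def fde_independent_arr(accounts: list[Account]) -> float:
    """Sum the ARR of accounts expected to renew with no FDE attached."""
    return sum(a.arr for a in accounts if not a.needs_fde)


# Hypothetical book of business, for illustration only.
accounts = [
    Account("A", 400_000, needs_fde=True),
    Account("B", 250_000, needs_fde=False),
    Account("C", 350_000, needs_fde=True),
]

total_arr = sum(a.arr for a in accounts)
software_arr = fde_independent_arr(accounts)
software_share = software_arr / total_arr

print(f"reported ARR:        {total_arr:,.0f}")
print(f"FDE-independent ARR: {software_arr:,.0f} ({software_share:.0%})")
```

In this invented example only a quarter of reported ARR would count as the software business; the rest is the part that deserves the different label.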

Use a harder operating rule. If an account churns when you remove the FDE, that revenue is services. Track implementation labor in COGS against subscription revenue every quarter. If the ratio rises as you grow, you are scaling a consultancy. And wherever the same eval work shows up twice, turn it into product immediately. That is the only path out of labor-backed ARR.
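
The quarterly check described above can be sketched in a few lines. The function names and the three quarters of data are hypothetical, and "rising" is defined here, as an assumption, to mean the ratio increased in every consecutive quarter; a real dashboard might prefer a smoothed trend.

```python
def labor_ratios(quarters: list[tuple[float, float]]) -> list[float]:
    """Implementation labor in COGS divided by subscription revenue, per quarter.

    Each entry is (subscription_revenue, implementation_labor).
    """
    return [labor / revenue for revenue, labor in quarters]


def is_rising(ratios: list[float]) -> bool:
    """True if there are at least two quarters and the ratio rose every time."""
    return len(ratios) >= 2 and all(
        later > earlier for earlier, later in zip(ratios, ratios[1:])
    )


# Hypothetical quarters: revenue doubles, but labor grows even faster.
quarters = [
    (1_000_000, 300_000),
    (1_500_000, 480_000),
    (2_000_000, 700_000),
]

ratios = labor_ratios(quarters)
for i, ratio in enumerate(ratios, start=1):
    print(f"Q{i}: labor is {ratio:.0%} of subscription revenue")

if is_rising(ratios):
    print("ratio rising with growth: this is scaling like a consultancy")
```

A flat or falling ratio under growth is the signal that eval and integration work is being turned into product rather than re-billed as labor.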

Before anyone slaps a SaaS multiple on an AI company, they should ask for FDE-independent revenue. Once one public company is forced to split software ARR from labor-backed revenue, the sector's multiples will reset in a week. When boards start asking for that number, half the category will find out it never had software margins in the first place.

Key Takeaways

1. Andreessen Horowitz pegs AI gross margins at 50-60%, well below the 60-80%+ that traditional SaaS valuations assume, exposing a structural mispricing.
2. The Eval Trap forces AI startups to build bespoke human evaluation pipelines per client, breaking the build-once-sell-infinitely model that justifies software multiples.
3. Forward deployed engineer costs land in cost of revenue, not OpEx, structurally depressing AI gross margins toward consulting levels.
4. Selling an AI agent resembles onboarding a digital employee that needs training and alignment, not provisioning a self-sustaining SaaS seat.
5. Consultancies like Accenture are capturing the GenAI windfall because the last mile of enterprise AI is fundamentally a services problem.

Keywords

AI economics, SaaS valuations, venture capital, enterprise AI, gross margins
