You see the API pricing: GPT-4o at $2.50 per million input tokens and $10 per million output tokens (a token is roughly ¾ of a word). You calculate your monthly OpenAI bill at $10,000 and call that your AI cost.
Your actual spend is likely $16,000-23,000 higher.
The AI Cost Iceberg
| Cost Category | % of Total | Monthly $ (if API = $10K) |
|---|---|---|
| API Costs (OpenAI/Anthropic) | 30-40% | $10,000 (visible) |
| Infrastructure (AWS/GCP/Azure) | 40-50% | $12,000-15,000 |
| Monitoring & Observability | 5-10% | $1,500-3,000 |
| Caching & Storage | 3-5% | $900-1,500 |
| Failed Requests & Retries | 2-3% | $600-900 |
| Development & Testing | 3-5% | $900-1,500 |
| TOTAL | 100% | $25,900-32,900 |
A $10K/month API bill becomes $26K-33K in total cost. This pattern explains why Cursor's AWS bill doubled from $6.2M to $12.6M/month even as they optimized API usage.
What's in the Hidden 70%
Infrastructure (40-50%): Compute for application servers and background workers. Databases for usage tracking and vector search. Storage for conversation history. Networking, load balancers, and container orchestration.
Monitoring (5-10%): LLM observability tools (LangSmith, Helicone), application performance (Datadog, Sentry), and product analytics. At 10,000 users, monitoring typically runs $2,000-3,500/month.
Caching & Storage (3-5%): Prompt caching has write costs: Anthropic charges 1.25x the input price to write to the cache and 0.1x to read from it, so a cached prefix pays for itself on its first reuse, but prefixes that are written and never reused (or that expire before reuse) cost more than no caching at all. Vector databases for semantic search add $200-1,000/month at scale.
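The break-even point can be checked directly from Anthropic's published multipliers (1.25x input price per cache write, 0.1x per cache read); a sketch, assuming the prefix stays warm in cache between uses:

```python
# Prompt-cache break-even: compare N uses of a prompt prefix with and
# without caching. Multipliers follow Anthropic's published cache pricing:
# writes cost 1.25x the base input price, cache hits cost 0.1x.
WRITE_MULT, READ_MULT = 1.25, 0.10

def cached_cost(n_uses: int) -> float:
    """Relative cost of n_uses of a prefix: one cache write, then hits."""
    return WRITE_MULT + READ_MULT * (n_uses - 1)

def uncached_cost(n_uses: int) -> float:
    return 1.0 * n_uses

# Find the first use count at which caching is cheaper.
break_even = next(n for n in range(1, 100) if cached_cost(n) < uncached_cost(n))
print(break_even)  # 2 -- caching wins on the first cache hit
```

In practice cache entries expire (Anthropic's default TTL is short), so infrequently reused prefixes pay the 1.25x write repeatedly, which is where real-world cache spend exceeds the naive model.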
Failed Requests (2-3%): An 8% error rate where every failure is retried 3 times wastes up to 24% of API spend; on a $10K bill that's as much as $2,400/month in the worst case. Even well-behaved systems leak 2-3% of budget ($600-900/month) that most companies don't track.
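The worst-case retry math above, assuming every failed request exhausts its retry budget:

```python
# Worst-case retry waste: if error_rate of requests fail and each failure
# is retried max_retries times, the extra calls add this fraction of spend
# on top of the base bill.
def retry_waste(error_rate: float, max_retries: int) -> float:
    return error_rate * max_retries

waste = retry_waste(error_rate=0.08, max_retries=3)
print(f"{waste:.0%} of API spend")      # 24% of API spend
print(f"${10_000 * waste:,.0f}/month")  # $2,400/month on a $10K bill
```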
Development (3-5%): Local testing, CI/CD integration tests, staging environments, and prompt iteration. Roughly $900-1,500/month in the $10K scenario above; larger teams can spend several thousand.
Cursor's Cost Discovery
Cursor reached $500M ARR and discovered their AWS costs were 79% of their Anthropic costs.
| Month | AWS Bill | Anthropic (est.) | AWS as % of API |
|---|---|---|---|
| May 2025 | $6.2M | ~$8M | 77% |
| June 2025 | $12.6M | ~$16M | 79% |
Why so high? Massive conversation history storage (200K token context windows), real-time collaboration infrastructure, code indexing, and distributed caching.
Their assumed economics: 64% gross margin based on API costs alone.
Their actual economics: 36% gross margin including infrastructure.
The 28-point margin difference led to four repricing cycles in 12 months, usage limits, a $200/month Ultra tier, and June 2025 pricing adjustments.
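Cursor's two margin figures are internally consistent, which the arithmetic below checks. The revenue here is back-solved from the 64% API-only margin for illustration; it is not a disclosed number:

```python
# Back out Cursor's implied monthly revenue from the 64% API-only margin,
# then recompute gross margin with the AWS bill included (June 2025 figures).
api_cost = 16.0          # $M/month, estimated Anthropic spend
aws_cost = 12.6          # $M/month, reported AWS bill
api_only_margin = 0.64   # margin computed from API costs alone

revenue = api_cost / (1 - api_only_margin)            # ~$44.4M/month implied
full_margin = (revenue - api_cost - aws_cost) / revenue

print(f"{full_margin:.0%}")  # ~36%, a 28-point drop from the API-only view
```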
Real-World TCO: AI Chatbot Example
Product: Customer support chatbot with 10,000 users and 500K conversations/month on GPT-4o.
| Category | Cost | % of Total |
|---|---|---|
| API Costs | $5,000 | 54% |
| Infrastructure | $1,850 | 20% |
| Monitoring | $1,100 | 12% |
| Storage | $600 | 6% |
| Failed Requests | $700 | 8% |
| TOTAL | $9,250 | 100% |
True cost per customer: $0.93/month
At a $15/month price point, actual gross margin is 93.8%. If you only tracked API costs ($0.50/user), you'd calculate 96.7% margin, a nearly 3-point error. (At this smaller scale API is still the majority of spend; as Cursor's numbers show, infrastructure's share grows as you scale.)
Why ~3 points matters at scale:
- At $1M ARR: ~$30K/year difference
- At $10M ARR: ~$300K/year difference
- At $100M ARR: ~$3M/year difference
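The per-customer math follows directly from the line items; recomputing it from first principles:

```python
# Recompute the chatbot's unit economics from the cost line items.
costs = {"api": 5_000, "infrastructure": 1_850, "monitoring": 1_100,
         "storage": 600, "failed_requests": 700}
users, price = 10_000, 15.0   # 10K users at $15/month

total = sum(costs.values())                  # $9,250/month all-in
per_user = total / users                     # ~$0.93/user/month
true_margin = (price - per_user) / price     # ~93.8%
api_only_margin = (price - costs["api"] / users) / price  # ~96.7%

print(f"{(api_only_margin - true_margin) * 100:.1f}-point margin error")
# 2.8-point margin error
```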
Key Takeaways
- API costs are only 30-40% of total spend; infrastructure, monitoring, and overhead make up the remaining 60-70%
- Cursor's AWS bill ran at 77-79% of its Anthropic costs; at scale, infrastructure can rival API spend
- Failed requests and retries waste 2-3% of budget, and most companies don't track it
- Margin errors compound: an error of a few points becomes millions per year at $100M ARR
Complete cost visibility enables accurate margin calculations.
Bear Lumen tracks API costs, infrastructure allocation, per-customer margins, and failed request overhead automatically.
Join our waitlist for early access.
Related Reading
- Usage Variance in AI Products — Per-customer cost distribution
- GitHub Copilot Unit Economics — AI margin case study
- AI API Costs 2025 — Model pricing comparison