Introduction
By 2025 the AI ecosystem has moved from the initial shock of rapid innovation into a phase Bessemer calls “First Light”: clearer clusters of companies and patterns are forming, even as significant unknowns remain. This expanded summary paraphrases "The State of AI 2025" (Bessemer’s Atlas) and deepens the original synthesis with additional specifics on benchmarks, infrastructure evolution, developer tooling, vertical and consumer opportunities, unresolved "dark matter," and the five predictions shaping 2025–2026.
Context
Since ChatGPT pushed AI into public consciousness, capital and product focus have surged: Bessemer reports having deployed over $1B in AI-native startups since 2023. The landscape today is defined by intense competition, fast adoption cycles, and new success metrics—what counted as great in the SaaS era no longer maps cleanly to AI-native businesses.
AI benchmarks: two archetypes in detail
AI Supernovas
Supernovas exhibit top-line growth at historic scale: in Bessemer’s sample, roughly $40M ARR in the first year of commercialization and ~$125M ARR in year two on average. Yet they often run thin gross margins (~25%), sacrificing margin for distribution, and retention can be fragile where switching costs are low. Remarkably, Supernovas average ~$1.13M ARR per FTE, signaling exceptional revenue efficiency that could translate into sustained scale if retention and unit economics improve.
AI Shooting Stars
Shooting Stars look more like accelerated SaaS winners: roughly $3M ARR in year one, year-over-year growth that quadruples, and healthier gross margins (~60%) with ARR/FTE near $164K in early years. Bessemer frames their five-year path as Q2T3 (quadruple, quadruple, triple, triple, triple), faster than classic SaaS's T2D3 but still grounded in durable retention and margins.
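To make the T2D3 and Q2T3 trajectories concrete, here is a small sketch projecting ARR from a hypothetical $3M year-one base (the multipliers come from the report; the starting ARR is illustrative, not a benchmark):

```python
def project_arr(start_arr: float, multipliers: list[float]) -> list[float]:
    """Project ARR year by year by applying successive growth multipliers."""
    arr = [start_arr]
    for m in multipliers:
        arr.append(arr[-1] * m)
    return arr

# Classic SaaS T2D3: triple, triple, double, double, double
t2d3 = project_arr(3.0, [3, 3, 2, 2, 2])

# Bessemer's Q2T3 for Shooting Stars: quadruple twice, then triple three times
q2t3 = project_arr(3.0, [4, 4, 3, 3, 3])

print(t2d3[-1])  # 216.0  -> $216M ARR after five growth years
print(q2t3[-1])  # 1296.0 -> $1,296M ARR after five growth years
```

The gap compounds quickly: from the same base, the Q2T3 path ends the period at six times the T2D3 outcome.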
Roadmaps: model layer and infrastructure
The model layer remains dominated by a few large labs—OpenAI, Anthropic, Google DeepMind (Gemini), and xAI—yet open-weight entrants like Kimi, Qwen, and Mixtral continue to push efficiency and domain-specialized performance. Research trends include adaptive-depth approaches (Mixture-of-Recursions), revived Mixture-of-Experts, and inference-time techniques such as test-time RL and adaptive reasoning that improve few-shot accuracy without linear compute scaling.
"The second half of AI—starting now—will shift focus from solving problems to defining them."
Shunyu Yao, OpenAI (quoted in Bessemer’s report)
Infrastructure's Second Act
The next chapter emphasizes connecting models to real-world experience: RL environments and task curation (Fleet, Matrices, Kaizen), continuous evaluation frameworks (Bigspin.ai, Kiln AI, Judgment Labs), and compound systems combining retrieval, memory, planning and optimized inference. The emphasis shifts from pure scale to grounded systems that define, measure, and act on problems in production.
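Continuous evaluation, at its simplest, means scoring model outputs against a curated golden set on every change. A toy sketch of that loop follows; the cases, scoring rule, and stand-in "model" are all illustrative and do not represent any of the products named above:

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str
    expected: str

def exact_match_score(cases: list[EvalCase], model) -> float:
    """Fraction of cases where the model's answer exactly matches the expected one."""
    hits = sum(1 for c in cases if model(c.prompt).strip() == c.expected)
    return hits / len(cases)

# Stand-in "model" for demonstration; a real harness would call an LLM here.
def toy_model(prompt: str) -> str:
    return {"capital of France?": "Paris"}.get(prompt, "unknown")

cases = [
    EvalCase("capital of France?", "Paris"),
    EvalCase("capital of Spain?", "Madrid"),
]
print(exact_match_score(cases, toy_model))  # 0.5
```

Production frameworks layer on semantic scoring, LLM-as-judge, regression tracking, and lineage, but the core contract is the same: a versioned dataset plus a scoring function run on every model or prompt change.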
Developer platforms: MCP and memory
Model Context Protocol (MCP) is emerging as a universal spec for agent integration—persistent memory, multi-tool flows, and permissioning—analogous to USB-C for agentic AI. FastMCP (Prefect), Arcade and Keycard are early ecosystem pieces that make agentic integration tractable. For developers, MCP lowers integration friction and unlocks true agentic products that act on users' behalf across systems.
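The core idea MCP standardizes—exposing typed tools that an agent can discover and invoke by name—can be sketched in plain Python. This is a toy registry to illustrate the pattern, not the actual MCP SDK or FastMCP API:

```python
from typing import Callable

class ToolRegistry:
    """Minimal stand-in for an MCP-style server: tools register once,
    and any agent can list them and call them by name."""
    def __init__(self):
        self._tools: dict[str, Callable] = {}

    def tool(self, fn: Callable) -> Callable:
        """Decorator that registers a function under its own name."""
        self._tools[fn.__name__] = fn
        return fn

    def list_tools(self) -> list[str]:
        return sorted(self._tools)

    def call(self, name: str, **kwargs):
        return self._tools[name](**kwargs)

registry = ToolRegistry()

@registry.tool
def search_orders(customer_id: str) -> list[str]:
    # Hypothetical tool body; a real server would query a database or API.
    return [f"order-001 for {customer_id}"]

print(registry.list_tools())                              # ['search_orders']
print(registry.call("search_orders", customer_id="acme"))  # ['order-001 for acme']
```

Real MCP adds the pieces that make this safe across organizational boundaries: typed schemas for arguments, transport, permissioning, and discovery—which is exactly why a shared spec matters more than any single registry implementation.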
Memory as product primitive
Memory—short-term via huge context windows and long-term via vector DBs and memory OSes (e.g., MemOS)—is framed as the future moat: persistent, semantic, multi-session memory turns functionality into a personal, sticky asset. Trade-offs remain: cost and latency for long contexts, brittleness of naive persistent memory, and the need for dynamic selection, compression and task isolation. Startups like mem0, Zep and LangMem are actively exploring solutions.
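Long-term memory via vector stores boils down to storing (embedding, text) pairs and recalling the nearest ones at query time. A minimal sketch, with tiny hand-made vectors standing in for a real embedding model (the stored facts are invented for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class MemoryStore:
    """Toy long-term memory: store embedded snippets, recall the closest ones."""
    def __init__(self):
        self.items: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], text: str) -> None:
        self.items.append((vector, text))

    def recall(self, query: list[float], k: int = 1) -> list[str]:
        ranked = sorted(self.items, key=lambda it: cosine(query, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]

store = MemoryStore()
store.add([1.0, 0.0], "user prefers dark mode")
store.add([0.0, 1.0], "user's billing plan is annual")
print(store.recall([0.9, 0.1]))  # ['user prefers dark mode']
```

The trade-offs the report names live precisely in this layer: naive append-only stores accumulate stale or contradictory facts, which is why dynamic selection, compression, and task isolation become the real engineering work.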
Enterprise & Horizontal AI
AI is enabling challengers to threaten traditional Systems of Record by offering Systems of Action: platforms that ingest unstructured signals, generate code, automate mappings and act on data. Key unlocks include AI Trojan horse features that auto-capture data flow, dramatically faster implementation through codegen, and data translation that enables one-day migrations—making vendor lock-in less permanent.
Areas still hard to disrupt
Large-scale enterprise ERP and the long tail of domain-specific SoRs (identity platforms, dispatch systems, specialized CMS) remain hard to fully replace due to breadth and regulatory complexity. Bessemer suggests disruption here will be a multi-year, possibly decade-long effort.
Vertical AI
Vertical AI adoption is accelerating across healthcare, legal, education, real estate and home services where language and multimodal tasks were previously underserved. Pattern for winners: start with an embedded wedge solving a high-friction, high-value task (often voice-enabled), capture proprietary vertical data, and expand into broader workflow automation. Examples include Abridge (clinical notes), SmarterDx (coding), EvenUp (legal demand packages), and EliseAI (property management).
Consumer AI
General-purpose assistants remain central to consumer behavior, but modality shifts (voice) and richer memory enable deeper habitual use. Perplexity and agentic browsers like Comet point to new UX paradigms. Generative creative tools are proliferating across music, video, and multi-modal production, but specialized consumer apps must deliver persistent, differentiated value to supplant generalist assistants.
Five predictions (summary)
- Agentic browsers become a dominant interface layer for autonomous workflows
- Generative video crosses the chasm in 2026 with commercial-scale quality and tooling
- Private, business-grounded evals and data lineage become essential for enterprise adoption
- A new AI-native social giant could emerge around agents, memory and multimodal expression
- Incumbents accelerate M&A to buy capabilities, driving consolidation especially in vertical AI
Founder guidance and final takeaways
Design for memory and context as product primitives, instrument private evals and lineage from day one, pick an AI wedge with clear 10x ROI, and balance go-to-market speed with durable unit economics. Expect acquisition interest from incumbents and plan defensibility around data, integrations and unique workflows. Success in this era favors teams that combine rapid execution with clarity about which real-world problems they are solving.
FAQ
- How do I measure AI startup benchmarks for my company? Track ARR growth, gross margin, ARR/FTE and cohort retention, and run private evals for model performance and lineage.
- What distinguishes a Supernova from a Shooting Star? Supernova = explosive top-line growth with possible weak retention and low margins; Shooting Star = faster-than-SaaS growth with healthier margins and retention.
- Why is memory a competitive moat for AI startups? Persistent memory enables deeper personalization and higher switching costs, making products stickier.
- What technical priorities should an enterprise AI product have? Private evals, data lineage, RAG/vector DBs, MCP integration, and observability for model drift.
- How to prepare for incumbent M&A interest? Protect data assets, demonstrate customer ROI, and document integrations and APIs to preserve negotiating leverage.