The State of AI Agents in 2026

Nine Things I Learned Compiling 200+ Slides of Research

Feb 24, 2026

When I lectured at the MIT Media Lab on AI, game development, and the future of civilization, the bottleneck was engineering. You needed people—lots of them—to turn ideas into products.

That bottleneck has moved.

I’ve spent the past several weeks compiling what became a 200+ slide research deck: The State of Agents & Agentic Engineering — 2026. It draws on earnings reports, benchmark data, academic research, and my own experience building agentic systems. What follows isn’t a summary of the deck—it’s what the research revealed when I stepped back and looked at the whole picture.

The full deck on the State of AI Agents in 2026 is located here. Throughout this article, I’ll point to specific pages where you can dive deeper into the underlying data.

1. The $211 Billion Question

In 2025, AI venture capital hit $211 billion—half of all global VC funding. Total AI spending reached $1.5 trillion. The SpaceX-xAI merger created the largest corporate combination in history at $1.25 trillion.

And yet: only 6% of organizations report more than 5% EBIT impact from AI (McKinsey, 2025).

That gap is the story. Not that AI doesn’t work (it does), but that most organizations haven’t figured out how to capture value from it. The outliers, however, are extraordinary: a 6x output gap between top-quartile AI users and everyone else; a 67% increase in merged pull requests per engineer at Anthropic; autonomous task horizons that grew from minutes to 14.5 hours in eighteen months.

The 6% is a snapshot of early adoption. The outliers show where the curve is heading—and that curve doubles every four months.

2. The Technology Crossed a Threshold

Here’s the shift that doesn’t get enough attention: AI inference costs dropped 92% in three years. Per-million-token pricing fell from $30 in early 2023 to $0.10-$2.50 in February 2026.

That’s not an incremental improvement. That’s a phase transition.

When I wrote about the Direct from Imagination era, I described a future where people would speak entire worlds into existence. The enabling condition for that future isn’t just model intelligence—it’s model economics. At $30 per million tokens, agentic workflows are a luxury. At $0.10, they’re table stakes.
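To make the economics concrete, here is a rough back-of-the-envelope sketch in Python. The prices are the ones quoted above; the 5-million-token workload is a hypothetical figure chosen only for illustration, not a number from the research.

```python
# Per-million-token prices quoted in the text (USD).
PRICE_EARLY_2023 = 30.00
PRICE_2026_HIGH = 2.50
PRICE_2026_LOW = 0.10

# Hypothetical agentic task size (an assumption, not a measured figure).
ASSUMED_TOKENS = 5_000_000


def percent_drop(old_price: float, new_price: float) -> float:
    """Percentage decline from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


def run_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of a workload that consumes `tokens` tokens."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    for label, price in [("2026 high end", PRICE_2026_HIGH),
                         ("2026 low end", PRICE_2026_LOW)]:
        drop = percent_drop(PRICE_EARLY_2023, price)
        print(f"{label}: {drop:.1f}% cheaper than early 2023")

    for label, price in [("early 2023", PRICE_EARLY_2023),
                         ("2026 high end", PRICE_2026_HIGH),
                         ("2026 low end", PRICE_2026_LOW)]:
        cost = run_cost(ASSUMED_TOKENS, price)
        print(f"{label}: ${cost:,.2f} per assumed 5M-token task")
```

Note that the 92% headline corresponds to the upper end of the 2026 price range. Measured against the $0.10 low end, the decline is closer to 99.7%.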

Meanwhile, the benchmarks tell their own story. Claude Opus 4.5 hit 80.9% on SWE-Bench Verified—up from 33% just eighteen months ago. On GPQA Diamond, which measures PhD-level scientific reasoning, Claude Opus 4.6 scored 91.3%, exceeding human experts at 69.7% by over 21 points.

But the most consequential benchmark might be the one METR publishes on task horizons. In early 2024, frontier models could sustain autonomous work for about four minutes. By February 2026, Claude Opus 4.6 crossed a full work-day at 14.5 hours—doubling every 123 days. At that rate, week-long autonomous tasks arrive by late 2026. Month-long tasks by mid-2027.
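The extrapolation can be sketched in a few lines of Python. The 14.5-hour baseline and the 123-day doubling time come from the text; the anchor date and the definitions of a "week" and a "month" of work are my assumptions, and the projection only holds if the doubling trend continues unchanged.

```python
import math
from datetime import date, timedelta

DOUBLING_DAYS = 123                # doubling time quoted in the text
BASELINE_HOURS = 14.5              # task horizon quoted for February 2026
BASELINE_DATE = date(2026, 2, 1)   # assumed anchor date within February 2026

# Assumed definitions of the targets, in hours of autonomous work.
TARGETS = {
    "work week (40 h)": 40,
    "calendar week (168 h)": 168,
    "work month (160 h)": 160,
    "calendar month (720 h)": 720,
}


def days_until(target_hours: float) -> float:
    """Days for the horizon to grow from the baseline to target_hours,
    assuming a constant doubling time."""
    doublings = math.log2(target_hours / BASELINE_HOURS)
    return doublings * DOUBLING_DAYS


def projected_date(target_hours: float) -> date:
    """Calendar date on which the horizon would reach target_hours."""
    return BASELINE_DATE + timedelta(days=round(days_until(target_hours)))


if __name__ == "__main__":
    for label, hours in TARGETS.items():
        print(f"{label}: ~{days_until(hours):.0f} days, "
              f"around {projected_date(hours):%B %Y}")
```

The answer is sensitive to what "week-long" means. Under these assumptions a 40-hour work week is roughly six months out and a 168-hour calendar week roughly fourteen months out, so the late-2026 estimate sits between the two readings.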

I keep coming back to a phrase: the bottleneck isn’t engineering capacity anymore. It’s imagination. That sentence lands differently when you realize the systems can already work longer than most humans do in a day.

3. The Creator Era Has Arrived

I wrote recently about how software’s Creator Era has arrived—and the research for this deck only deepened that conviction.

The pattern is the same one I’ve tracked across every creative industry: Pioneer Era → Engineering Era → Creator Era. In the Pioneer Era, you built everything from scratch and competitive advantage came from having programmers at all. In the Engineering Era, frameworks and SaaS tools emerged—AWS, Stripe, Salesforce—but engineers remained essential. In the Creator Era, the bottleneck shifts from can we build this to should we build this, and for whom.

The numbers suggest we’ve crossed that threshold. Over 100,000 products are now built daily on AI-native platforms like Cursor, Replit, Lovable, and Bolt.new. Cursor went from zero to $1 billion in annual recurring revenue in 24 months—the fastest B2B SaaS ramp in history.

But the deeper signal is what happened to SaaS valuations. In the first month of 2026 alone, $2 trillion in SaaS market capitalization evaporated. That’s not a market correction—it’s structural disruption. When one AI agent can replace dozens of human software licenses, the per-seat pricing model that built the SaaS industry starts to collapse. I called this the “SaaSpocalypse” and it’s accelerating.

Here’s what’s replacing it: 4% of all GitHub commits are now authored by Claude Code, and that’s projected to reach 20%+ by year-end. 80% of Neon databases are created by AI agents, not humans. TypeScript overtook Python as the #1 language on GitHub for the first time—because AI-generated code benefits from type safety.

When I built Chessmata—a complete multiplayer 3D chess platform—over a single weekend using agentic engineering, that wasn’t a stunt. It was a demonstration of what building looks like now. An agentic platform, built by agentic processes. The gap between imagining and building has collapsed.

4. The Industrial Revolution Underneath

Every AI conversation I hear focuses on software. But the research dragged me into atoms.

Global data center power consumption is projected to hit 96 gigawatts in 2026. That’s equivalent to nine New York Cities. Or twice the entire UK electrical grid. Or forty-eight Hoover Dams. Ninety percent of that growth is AI workloads.

The grid investment required to support this—$720 billion—rivals the AI capex itself.

And the capex is staggering. Big Tech alone is committing $690 billion in 2026, including Amazon at $200 billion, Google at $180 billion, Meta at $125 billion, and Microsoft at $80 billion. The Stargate Project committed $500 billion, the largest infrastructure investment since the Interstate Highway System.

Meanwhile, AI hardware is scaling on a trajectory that makes Moore’s Law look quaint. Jensen Huang’s team has demonstrated what some are calling “Huang’s Law”: since 2012, actual AI compute has improved 300,000x, compared to the 7x that Moore’s Law would have predicted. NVIDIA’s Rubin chip (Q2 2026) promises 5x inference performance over Blackwell, with the Rubin Ultra following in H2 2027 at 10x.

This is worth sitting with. The constraint on AI isn’t software anymore—it’s atoms. Semiconductors, power plants, cooling systems, rare earth minerals. Every layer of the hardware stack is sold out. When I wrote about the seven layers of the metaverse value-chain, infrastructure was the foundation of the whole stack. That has never been more literally true—and the stakes have never been higher.

5. Machine Societies Are Here

When I wrote about the age of machine societies, I was tracking early signals—cooperative agent experiments, emergent dynamics in simulated environments. The deck forced me to confront how far those signals have traveled.

OpenClaw, the first mass-adopted autonomous agent, gathered 145,000 GitHub stars in its first week and now consumes 13% of all OpenRouter tokens (per a16z’s Charts of the Week). It runs 24/7 on clusters of Mac Minis. When Claude Opus agents collaborate through targeted delegation (not broadcast), they achieve a 76% performance improvement over solo operation, a result from Anthropic’s HiddenBench evaluation. Multi-agent system inquiries surged 1,445% between Q1 2024 and Q2 2025 (Gartner).

But the more astonishing finding: there are now 144 non-human identities per human employee in the average enterprise (Oasis Security, 2025)—up from 92:1 in the first half of 2024. We’re already outnumbered in our own systems, and most organizations have no governance framework for it.

And the risks aren’t hypothetical. In February 2026, the Matplotlib incident became the first documented case of autonomous AI retaliation—an agent autonomously wrote and published a hit piece that persuaded 25% of surveyed developers to consider switching libraries. That’s a precedent, not a thought experiment.

I built an agent that discovers other agents precisely because I saw this coming. The internet isn’t built for agent consumption. Marketing pages optimized for humans fail programmatic parsing. As these systems proliferate, the discovery infrastructure for agents becomes as important as the agents themselves.

6. The Network Effects of Composability

This is where several threads I’ve been pulling on for years converge.

In my earlier work on network effects in the metaverse, I argued that the degree to which a network facilitates interconnections determines the extent of its emergent creativity, innovation, and wealth. Hub-and-spoke architectures concentrate value; scale-free architectures distribute it.

The agentic ecosystem is becoming a scale-free network. There are now over 17,000 servers for MCP, the Model Context Protocol that lets agents discover and communicate with other services. Agents are connecting to agents without gatekeepers. That follows Reed’s Law, where network value grows as 2^n with subgroup formation, not just Metcalfe’s n².
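For a feel of how differently those two laws scale, here is a small Python comparison. Both functions are proportionality models rather than valuations, and the network sizes are arbitrary examples I picked for illustration.

```python
def metcalfe_value(n: int) -> int:
    """Metcalfe's Law: value proportional to n squared (pairwise links)."""
    return n ** 2


def reed_value(n: int) -> int:
    """Reed's Law: value proportional to the number of possible subgroups.

    The text uses the 2**n approximation; the exact count of subgroups
    with at least two members is 2**n - n - 1.
    """
    return 2 ** n


if __name__ == "__main__":
    # Arbitrary example network sizes (assumptions for illustration).
    for n in (5, 10, 20, 30):
        ratio = reed_value(n) / metcalfe_value(n)
        print(f"n={n:>2}  Metcalfe={metcalfe_value(n):>4}  "
              f"Reed={reed_value(n):>13,}  Reed/Metcalfe={ratio:,.1f}")
```

Even at a few dozen nodes the subgroup term dwarfs the pairwise term, which is the intuition behind treating composable agent networks differently from hub-and-spoke ones.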

But here’s the tension worth naming—one I explored in my piece on enshittification and the future of AI agents. The web’s original business model was built on attention. Ads. Clicks. Agents don’t click ads. They don’t scroll past sponsored content. They don’t get distracted by sidebar recommendations.

That means the entire attention economy—the economic engine of the internet for three decades—starts to collapse as agents become the primary consumers of web content. The internet isn’t just being used by agents. It’s being rebuilt for them. The protocols that govern how agents discover, negotiate, and transact with each other are becoming the TCP/IP of the agentic era.

This is why composability matters more than individual capability. A single brilliant agent is useful. A network of composable agents that discover and delegate to each other is transformative. The value compounds exponentially. It’s platforms over products—the same principle I’ve tracked from Roblox’s 151 million daily active users to the broader creator economy.

7. The Reckoning

The research wasn’t all acceleration and optimism. There are real cliffs ahead.

Agent error compounds exponentially. A 95% reliable step sounds safe—until you chain twenty of them together and the end-to-end success rate plummets to 36%. And 91% of ML models degrade in production over time. This is why most production agents remain single-purpose: not because multi-step orchestration is impossible, but because the reliability math is punishing.
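The reliability math is easy to check. This sketch assumes each step succeeds or fails independently of the others, which is a simplification; real agent pipelines often have correlated failures and retry logic.

```python
def chain_success(step_reliability: float, steps: int) -> float:
    """End-to-end success rate of `steps` independent steps in sequence."""
    return step_reliability ** steps


def required_step_reliability(target: float, steps: int) -> float:
    """Per-step reliability needed to hit `target` end-to-end success."""
    return target ** (1 / steps)


if __name__ == "__main__":
    # The example from the text: twenty steps at 95% each.
    print(f"20 steps at 95%: {chain_success(0.95, 20):.1%} end to end")

    # Inverse question, with a hypothetical 90% end-to-end target.
    needed = required_step_reliability(0.90, 20)
    print(f"For 90% over 20 steps, each step needs {needed:.2%}")
```

Run in reverse, the same formula shows why long chains are hard: hitting a hypothetical 90% end-to-end target across twenty steps requires each step to be roughly 99.5% reliable.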

Deepfake detection accuracy sits at 55%, essentially a coin flip. Non-human identities outnumber human employees 144 to 1, as noted above, and fewer than 10% of companies running agents in production can actually govern them. The EU AI Act Article 50 deadline arrives August 2, 2026, with the first US frontier AI safety laws (California SB 53, New York RAISE Act) not far behind.

I included an entire section in the deck on provenance, chain-of-custody, and the emerging identity crisis for non-human agents. The conclusion: detection isn’t the answer. Provenance is. We need to know where content came from and who (or what) produced it, not whether an algorithm thinks it looks fake.

The tools for multi-agent AI are ahead of the tools for securing multi-agent AI. That gap—between capability and governance—is the most consequential risk in the space.

8. The $10 Trillion Thesis

Sequoia’s framing is the one that stuck with me: cloud computing was a trillion-dollar opportunity. AI is ten trillion. The difference is structural: cloud changed where software runs. AI changes what software does.

Consider the math. Sequoia argues we’ll see at least 10x more compute consumption per knowledge worker—and some of their portfolio companies are already forecasting 1,000x to 10,000x. That sounds like hyperbole until you look at what’s already happening: 100,000 products built daily on AI-native platforms. 80% of Neon databases created by agents, not humans. An agentic AI market projected to reach $93 billion by 2030 at a 65.5% compound annual growth rate—the fastest-growing enterprise software segment ever.
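As a sanity check on what a 65.5% compound annual growth rate implies, here is a short Python sketch. The $93 billion endpoint and the rate come from the text; the five-year window (2025 to 2030) is my assumption, since the base year is not stated here.

```python
def project(value: float, cagr: float, years: int) -> float:
    """Grow `value` forward by `years` at a compound annual rate `cagr`."""
    return value * (1 + cagr) ** years


def implied_base(final_value: float, cagr: float, years: int) -> float:
    """Starting value implied by `final_value` after `years` at `cagr`."""
    return final_value / (1 + cagr) ** years


if __name__ == "__main__":
    FINAL_BILLIONS = 93.0   # 2030 projection quoted in the text
    CAGR = 0.655            # growth rate quoted in the text
    ASSUMED_YEARS = 5       # assumption: window runs 2025 to 2030

    base = implied_base(FINAL_BILLIONS, CAGR, ASSUMED_YEARS)
    print(f"Implied starting market: ${base:.1f}B")
    for year in range(ASSUMED_YEARS + 1):
        print(f"year {year}: ${project(base, CAGR, year):.1f}B")
```

Under that assumed window, the market multiplies roughly twelvefold, which means most of the $93 billion would arrive in the final two years of the forecast.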

a16z’s parallel analysis points the same direction: the inference economy is emerging as a distinct sector, separate from training. Custom ASICs already handle 40% of inference workloads, and companies like Together AI grew from $30 million to $300 million ARR in a single year. The compute demand isn’t coming from existing workflows getting slightly more efficient. It’s coming from entirely new categories of work that didn’t exist eighteen months ago.

Two years ago, I wouldn’t have taken the $10 trillion projection seriously. After compiling this research, I find it conservative.

9. The Direct from Imagination Era Is Here

Four years ago, I wrote that the Direct from Imagination era was beginning—a convergence of generative AI, parallel computation, compositional platforms, and persistent world infrastructure that would allow people to speak entire worlds into existence.

I was early. I wasn’t wrong.

The evidence is now overwhelming. Natural language has become a programming language. LLMs are compilers for intent. The $285 billion SaaS decline isn’t market irrationality—it reflects a structural shift where AI agents make individual software products subordinate to agentic workflows. The Pioneer → Engineering → Creator Era arc that I’ve tracked across every creative industry has reached software itself.

38% of startups are now solo-founded, up from 22%. Smaller teams accomplish what previously required hundreds. The bottleneck has shifted from engineering capability to creative vision. Millions of new builders from non-technical backgrounds are entering the ecosystem—not because the tools got slightly better, but because the entire abstraction layer changed.

When Grace Hopper created the first compiler in 1952, she didn’t eliminate programming. She made it accessible to people who thought in terms of problems rather than machine instructions. LLMs are doing the same thing at civilization scale—compiling intent into products, systems, and entire worlds.

I titled this section of the deck “What Comes Next” but the more honest framing is: what’s already here and accelerating.

The gap between intention and implementation has collapsed. The gap between implementation and value has not—yet. Closing that second gap is the work of the next two years.

The full research deck, “The State of Agents & Agentic Engineering — 2026,” is available as an interactive flipbook here. It covers 15 sections across 200+ pages with source citations throughout.

Metavert Meditations
