TechReaderDaily.com

Agent-Native Runtimes Rewrite the Cloud: Workflow Engines as Database

As every major platform ships an agent framework, the real challenge is the stateful runtime that survives crashes, remembers its place, and keeps running between invocations. Workflow engines are the new database.

Illustration: the agentic layer sitting between cloud infrastructure and AI applications, showing the control plane relationships that govern agent-native workloads. (Image: siliconangle.com)

On April 3, 2026, Microsoft shipped version 1.0 of its Agent Framework, unifying two previously separate agent SDKs into a single package. The release was meant to clarify the path for developers building on Azure. Instead, it surfaced the deeper question that every cloud provider is now being forced to answer: what is the runtime that an agent actually runs on? A framework that dispatches a tool call is not a runtime. A runtime is the substrate that holds execution state across minutes, hours, or days while a model reasons, waits for a human in the loop, calls an external API that is down, retries with exponential backoff, and resumes from the exact checkpoint where it left off. By that definition, most of the agent stacks announced in the first half of 2026 are frameworks with ambitions, not runtimes with guarantees.
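The checkpoint-and-resume property that separates a runtime from a framework can be sketched in a few lines of plain Python. This is an illustrative pattern only; the names (`run_step`, the in-memory `checkpoints` store) are assumptions for the sketch, not any vendor's API, and a real runtime would persist checkpoints durably rather than in memory:

```python
# Minimal sketch of checkpointed execution: each step's result is persisted,
# so a restarted run resumes from the last completed step instead of redoing
# work. All names here are illustrative, not any real runtime's API.

checkpoints: dict[str, object] = {}  # stands in for a durable store
executed: list[str] = []             # records which steps actually ran

def run_step(name: str, fn):
    """Run `fn` once; on later invocations, return the checkpointed result."""
    if name in checkpoints:
        return checkpoints[name]     # resume path: skip already-completed work
    result = fn()
    executed.append(name)
    checkpoints[name] = result       # checkpoint before moving on
    return result

def workflow():
    a = run_step("fetch", lambda: "data")
    b = run_step("reason", lambda: a.upper())
    return run_step("act", lambda: f"acted on {b}")

first = workflow()   # initial run executes all three steps
second = workflow()  # simulated restart: resumes from checkpoints, runs nothing
print(first, second, executed)
```

A workflow restarted after a crash re-enters at the top, but every completed step short-circuits to its recorded result, which is the "resumes from the exact checkpoint" behavior described above.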

The distinction matters because of a number that keeps appearing in production incident reports. At RSA Conference 2026, CrowdStrike CEO George Kurtz noted in his keynote that the fastest recorded adversary breakout time had dropped to 27 seconds, with the average now at 29 minutes, down from 48 minutes a year earlier. In a world where an attacker can move from initial access to lateral movement in under half a minute, an agent that loses its execution context on a cold start, or forgets which step of a remediation workflow it had already completed, is not a safety feature; it is a liability. The agent runtime question stopped being academic the moment security teams began deploying agentic SOC tools that needed to act faster than the adversary.

CrowdStrike was not alone. VentureBeat reported that Cisco and Palo Alto Networks both shipped agentic SOC tools at the same conference, and all three vendors left unaddressed what the outlet called the "agent behavioral baseline gap": the absence of a standard mechanism to record what an agent did, why it did it, and what state it held at each decision point. The gap is not a logging problem. It is an execution-model problem. If the agent's decision trail is scattered across ephemeral function invocations and model API calls with no durable event ledger, auditing it after the fact requires reconstructing state from side effects, which is fragile, slow, and often impossible under the time pressure of an incident review.

The industry has been here before, and the historical parallel is instructive. In the late 1980s, the database research community debated what came to be called the "workflow long-running transaction" problem: how do you execute a business process that spans multiple database transactions, where any individual transaction can commit or abort independently, and the process itself must survive a crash of the coordinator? The answer, crystallized in papers by Hector Garcia-Molina and Kenneth Salem and later commercialized in systems like IBM's MQSeries and Microsoft's DTC, was a durable, replayable log of state transitions maintained outside any single database. The modern agent runtime problem is structurally identical, except that the "business process" is now a chain of LLM inferences, tool calls, and human approvals that may run for days, and the crash of the coordinator is no longer a rare event; it is the expected failure mode of any system that depends on a model API with a rate limit, a token budget, and a non-zero probability of returning a hallucinated function name.
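The saga idea from that era can be sketched in miniature: each step pairs an action with a compensating action, and an append-only log records every transition so a coordinator can see exactly how far a process got. This is a generic illustration of the pattern, not the design of MQSeries, DTC, or any modern engine; the step names and in-memory log are assumptions:

```python
# Illustrative saga sketch: each step carries a compensating action, and an
# append-only log of transitions stands in for the durable ledger that lets a
# crashed coordinator reconstruct where the process stood.

log: list[tuple[str, str]] = []  # (event, step) pairs; stands in for durable storage

def run_saga(steps):
    """steps: list of (name, action, compensation). Returns True on success."""
    done = []
    for name, action, compensate in steps:
        log.append(("begin", name))
        try:
            action()
        except Exception:
            log.append(("abort", name))
            for prev_name, prev_comp in reversed(done):
                prev_comp()                      # undo completed steps in reverse
                log.append(("compensate", prev_name))
            return False
        log.append(("commit", name))
        done.append((name, compensate))
    return True

state = {"charged": False, "emailed": False}

def send_email():
    raise RuntimeError("SMTP down")  # the step that aborts the saga

ok = run_saga([
    ("charge", lambda: state.update(charged=True),
               lambda: state.update(charged=False)),
    ("email",  send_email,
               lambda: state.update(emailed=False)),
])
print(ok, state, log)
```

When the email step fails, the charge is compensated and the ledger shows the full begin/commit/abort/compensate trail, which is precisely the replayable record of state transitions the 1980s systems maintained outside any single database.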

This structural recognition is driving a quiet consolidation around stateful workflow engines as the foundational layer of the agent-native stack. On April 28, VentureBeat reported that Mistral AI launched Mistral Workflows, an orchestration engine built on Temporal, already running millions of daily executions at the time of its announcement. The choice of Temporal as the underlying engine was not incidental. Temporal provides exactly the property that agent frameworks lack: durable execution with exactly-once semantics, meaning that if an agent step fails and is retried, the side effects of any already-completed step are not duplicated. For a workflow that charges a customer, sends an email, or updates a security policy, that guarantee is the difference between a production system and a prototype that looks good in a demo.
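The retry-without-duplication guarantee can be illustrated with the classic idempotency-key pattern. This is a generic sketch of the idea, not Temporal's actual implementation; the key format and in-memory `completed` set are assumptions:

```python
# Sketch of retry without duplicated side effects: an idempotency key makes a
# retried charge a no-op rather than a second charge. Generic pattern only,
# not any engine's real mechanism.

completed: set[str] = set()   # durable record of side effects already applied
charges: list[int] = []       # the external side effect (e.g. a billing API)

def charge_customer(idempotency_key: str, amount: int):
    if idempotency_key in completed:
        return "duplicate-suppressed"   # retry after a failure: do not charge again
    charges.append(amount)
    completed.add(idempotency_key)      # record completion before acknowledging
    return "charged"

first = charge_customer("order-42-step-3", 100)
retry = charge_customer("order-42-step-3", 100)  # the step timed out and was retried
print(first, retry, charges)
```

The customer is charged exactly once even though the step ran twice, which is the difference between a production system and a demo that the paragraph above describes.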

Temporal is not the only contender. Restate, an open-source project that emerged from the same lineage of thinking about durable execution, positions itself as a lightweight alternative that embeds the state machine directly in the application process rather than requiring a separate server cluster. Both systems trace their intellectual lineage to the same observation: serverless functions, for all their operational simplicity, are a poor fit for agentic workloads because they are stateless by design. A Lambda function or a Cloud Run container that spins down between invocations loses the in-memory context that an agent accumulates across a multi-step task. Rebuilding that context on each invocation, either by re-fetching data or by replaying a conversation history from an external store, adds latency, cost, and failure modes that compound with every additional step in the workflow.

The cloud providers have noticed. MSN reported on May 10 that Google Cloud Run is adding ephemeral storage and remote MCP server capabilities, pivoting explicitly toward AI workloads that need to carry state across invocations. The same report noted that edge platforms, including Cloudflare's newly announced Dynamic Workers, are shifting to agentic workloads by replacing container-based isolation with lighter-weight isolates that can spin up in single-digit milliseconds. Cloudflare's Dynamic Workers, covered by VentureBeat in March, represent a bet that the agent runtime needs to be as close to the request origin as possible, because agentic workflows that call multiple models and tools from a centralized cloud region will accumulate enough round-trip latency to render them unusable for real-time use cases.

The edge-versus-region debate, however, obscures a more fundamental architectural choice that every agent runtime must make: whether execution state is durable by default or ephemeral by default. The traditional serverless model, refined over a decade by AWS Lambda, Google Cloud Run, and Azure Functions, treats ephemerality as a virtue. Functions are stateless, idempotent, and horizontally scalable precisely because they carry no memory. The workflow engine model, by contrast, treats durability as the baseline. Every step, every decision, every external call is recorded in an event history that can be replayed after a crash. The trade-off is operational complexity. Running a Temporal cluster at scale requires managing a database for the event history, tuning shard counts, and monitoring replay latency. Running a fleet of stateless functions requires none of that, until the agent that was supposed to close a critical vulnerability at 3 a.m. silently lost its place and nobody noticed until the morning incident review.
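The durable-by-default model's replay mechanism can be sketched as well. Rather than persisting state directly, the engine persists the sequence of step results; after a crash, re-running the deterministic workflow code against that history reconstructs the same state without re-executing side effects. Again a hedged illustration with assumed names, not Temporal's real replay machinery:

```python
# Sketch of event-history replay: completed step results are recorded in order,
# and a post-crash re-run of the deterministic workflow consumes the history
# instead of re-executing the steps. Names are illustrative.

history: list[object] = []   # durable event history of completed step results

def step(fn, cursor: list[int]):
    i = cursor[0]
    cursor[0] += 1
    if i < len(history):
        return history[i]    # replay mode: feed the recorded result back
    result = fn()            # live mode: execute the step and record it
    history.append(result)
    return result

def workflow():
    cursor = [0]
    x = step(lambda: 10, cursor)
    y = step(lambda: x + 5, cursor)
    return x + y

live = workflow()      # executes both steps, appending to the history
replayed = workflow()  # simulated post-crash replay: same result, no new events
print(live, replayed, history)
```

The replayed run produces an identical result from the ledger alone, which is why this model demands deterministic workflow code and why replay latency becomes something operators must monitor.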

This trade-off is not lost on enterprises that are moving agents into production. At ServiceNow's Knowledge 2026 conference in early May, Forbes reported that the company expanded its AI Control Tower with new agent governance capabilities, including what the company calls a "Context Engine" and an MCP Registry that tracks which tools each agent is authorized to invoke. The governance framing is telling. ServiceNow is not selling agent runtime infrastructure; it is selling the auditability and policy enforcement that only become possible when agent execution is mediated by a durable control plane. A context engine that records which data an agent accessed, which model it called, and which workflow step it was executing at each point in time is effectively an event-sourced ledger for agent behavior. It is the same pattern that Temporal and Restate implement at the infrastructure layer, recast as an enterprise governance product.

At Google Cloud Next 2026 in late April, the control plane theme was even more explicit. SiliconANGLE's preview of the conference argued that the real story was not another wave of Gemini model announcements but the emergence of an "agentic layer": a set of services that sit between the model APIs and the application, handling state management, tool routing, policy enforcement, and observability. The preview noted that without this layer, the agent stack collapses into what one observer called "a pile of prompts and API keys," which is generous. In practice, it collapses into a pile of prompts, API keys, and an on-call engineer who gets paged at 2 a.m. because a model returned malformed JSON that the orchestrator could not parse and the workflow silently terminated.

The investor and analyst community has begun to frame this shift as the transition from "cloud native" to "AI native." SiliconANGLE's coverage of KubeCon EU 2026, published in March, introduced the concept of "context density" as the distinguishing characteristic of AI-native infrastructure. A cloud-native application, the argument goes, processes requests that are relatively self-contained: a REST call carries its parameters in the headers and the body, and the service can fulfill it by consulting a database and returning a response. An agentic application, by contrast, processes a task whose full context accumulates over time. The user's initial prompt is only the starting point. Each tool call, each model inference, each human approval adds to the context. A runtime that treats each step as an independent request, with no shared memory except what the developer manually persists, is fighting the fundamental shape of the workload.

The concept of context density also explains why the major cloud providers are converging on similar architectural patterns despite their competitive differences. At DigitalOcean's Deploy 2026 conference, the company unveiled a five-layer AI-Native Cloud platform that includes a dedicated Inference Engine, a model router, and managed agents for production workloads. The five layers are infrastructure, inference, data, agents, and applications, and the agent layer includes state management as a first-class primitive. That DigitalOcean, a company whose historical brand was simplicity for small-to-medium workloads, is building a managed agent runtime with durable state is a signal that the market considers this capability table stakes, not a differentiator for hyperscalers with dedicated research teams.

Yet the platform that has drawn the most developer confusion also illustrates how fragmented the agent runtime landscape remains. Forbes reported on April 6 that Microsoft's agent stack now spans at least five distinct surfaces: the Agent Framework SDK, Copilot Studio, Foundry Agent Service, Semantic Kernel, and the Azure AI Agent Service. Each surface has its own state management model, its own tool-calling conventions, and its own path to production. A developer who starts building an agent in Copilot Studio, the low-code environment, discovers that the state model does not easily port to the Agent Framework SDK, the pro-code path, and vice versa. The result is not a platform but a portfolio, and the migration tax for moving an agent from prototype to production, or from one surface to another, is borne entirely by the development team.

The contrast with Google's approach is instructive. The same Forbes analysis noted that Google's Agent Development Kit provides a cleaner default path: developers write agent logic once, and the runtime handles state, tool routing, and deployment consistently across environments. AWS, for its part, has kept its agent stack deliberately thin with a product called Strands that provides durable execution as a managed service, leaving the agent logic to frameworks of the developer's choosing. The divergence is not merely a matter of documentation quality or API design. It reflects a philosophical disagreement about where the boundary between the framework and the runtime should be drawn, a disagreement that will take years to resolve because it is being litigated in production, not in white papers.

Peter FitzGibbon, senior vice president and head of the Google solution line at Insight Enterprises, captured the market momentum in an interview with CRN during Google Cloud Next 2026: "Agentic development has absolutely gone mainstream. There is no more tire-kicking going on like we had in 2024 and '25. The customers that are leaning into it are leaning into it hard." The quote is worth reading carefully. FitzGibbon is not saying that enterprises are experimenting with agents. He is saying that the experimentation phase is over, and the customers who have committed are deploying at a scale and velocity that creates new demands on the underlying infrastructure. A runtime that was good enough for a proof of concept, where the developer restarts the workflow manually when it fails, is not good enough for a production deployment where the workflow runs unattended across a weekend and the first indication of failure is a customer complaint on Monday morning.


The question that follows from FitzGibbon's observation is what "leaning into it hard" looks like at the infrastructure layer. According to Google's own metrics, three-quarters of Google Cloud customers are now using AI products, 330 customers each processed more than a trillion tokens over the preceding 12 months, and Google's first-party models are processing more than 16 billion tokens per minute through direct API use, up 60 percent quarter on quarter. These numbers describe a throughput problem that is well understood. What they do not describe is the state problem: among those trillions of tokens, how many belong to workflows that ran for more than an hour, involved more than three tool calls, and required a retry because a downstream API returned a transient error? The answer, in any production deployment of meaningful scale, is "most of them."

This is why the Forbes Technology Council identified in April what it called "the new execution-layer security gap" around MCP, the Model Context Protocol that has become the default standard for agent tool access. The gap is that most security programs still focus on human-driven activity, such as people logging into SaaS applications or clicking through browser sessions, while an increasing share of enterprise activity is now driven by agents that authenticate as services, invoke tools programmatically, and operate at speeds and volumes that make human-scale audit logs irrelevant. Securing the agent runtime means instrumenting the workflow engine itself: recording not just that a tool was called, but why the agent chose that tool, what context it had at the time, and whether the choice was consistent with the policy that a human administrator defined.
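The kind of audit record this implies can be sketched as a mediated tool call that logs rationale, context, and a policy check together. Everything here is a hedged illustration: the policy shape, field names, and tools are assumptions, not MCP or any vendor's schema:

```python
# Sketch of an audited, policy-checked tool call: each invocation records the
# agent's stated rationale, a snapshot of its context, and the policy decision,
# so a reviewer can later ask "why" as well as "what". Illustrative names only.

import time

audit_log: list[dict] = []
POLICY = {"allowed_tools": {"read_ticket", "post_reply"}}  # hypothetical policy

def call_tool(tool: str, rationale: str, context: dict):
    allowed = tool in POLICY["allowed_tools"]
    audit_log.append({
        "ts": time.time(),
        "tool": tool,
        "rationale": rationale,      # why the agent chose this tool
        "context": dict(context),    # what it knew at the time
        "policy_ok": allowed,        # whether the call matched policy
    })
    if not allowed:
        raise PermissionError(f"tool {tool!r} not permitted by policy")
    return f"{tool}: ok"

ctx = {"task": "customer refund", "step": 2}
result = call_tool("read_ticket", "need the order id before replying", ctx)
try:
    call_tool("update_firewall", "model suggested it", ctx)  # off-policy call
except PermissionError:
    pass
denied = [e for e in audit_log if not e["policy_ok"]]
print(result, len(audit_log), len(denied))
```

The denied call is still recorded, which matters: an audit trail that only captures successful invocations cannot answer the "was the choice consistent with policy" question the paragraph raises.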

The same pattern is visible in identity and access management. CSO Online reported in April that Curity, a company known for API-driven IAM, is developing runtime authorization specifically for AI agents, a product category that did not exist two years ago and is now being demanded by enterprises that have agents accessing customer data, financial systems, and infrastructure controls. Runtime authorization for agents is a harder problem than static API key management because an agent's permissions are contextual: the same agent that is authorized to read a database during a customer support workflow may not be authorized to read the same database during a model training run, and distinguishing between those contexts requires the authorization engine to have access to the workflow state that only the runtime holds.
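Context-dependent authorization reduces to a decision function that takes workflow state as an input, which is exactly what a purely static key check cannot do. A minimal sketch, with entirely hypothetical agents, resources, and context names, and no relation to Curity's actual product:

```python
# Sketch of context-dependent authorization: the same agent identity gets a
# different decision depending on which workflow it is executing, which only
# works if the authorizer can see runtime state. All names are illustrative.

RULES = {
    # (agent, resource) -> workflow contexts in which access is allowed
    ("support-agent", "customers_db"): {"customer_support"},
}

def authorize(agent: str, resource: str, workflow_context: str) -> bool:
    allowed_contexts = RULES.get((agent, resource), set())
    return workflow_context in allowed_contexts

in_support = authorize("support-agent", "customers_db", "customer_support")
in_training = authorize("support-agent", "customers_db", "model_training")
print(in_support, in_training)
```

The same (agent, resource) pair yields opposite answers in the two contexts, and the `workflow_context` argument is precisely the piece of state that, as the paragraph notes, only the runtime holds.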

The enterprise demand for agent governance is also reshaping the SaaS market. At ServiceNow's Knowledge 2026, the company positioned its AI Control Tower as an answer to what some analysts have begun calling the "SaaSpocalypse" thesis: the argument that AI agents will disintermediate SaaS applications by interacting directly with data and APIs, rendering the application layer obsolete. Forbes noted that ServiceNow's counter-argument is that the more autonomous agents become, the more valuable a centralized governance and audit layer becomes, and that layer is most naturally provided by the platform that already owns the workflow definitions and the access control policies. Whether that argument holds over the long term depends on whether enterprises trust a single vendor to govern agents that may be built on any framework, deployed on any cloud, and accessing resources across multiple identity domains.

What is clear from the first half of 2026 is that the agent runtime is hardening into a distinct layer of the stack, with its own requirements, its own failure modes, and its own competitive dynamics. The requirements are durability, observability, and policy enforcement. The failure modes are state loss under restart, silent workflow termination, and context drift across long-running executions. The competitive dynamics pit framework vendors, who want the runtime to be a thin library, against infrastructure vendors, who want it to be a managed service with SLAs and a control plane. The resolution of that tension will determine whether the agent-native stack converges on a small set of durable execution engines, as the database market converged on a small set of SQL dialects, or fragments into a dozen incompatible state models that each require their own tooling, their own operators, and their own incident response playbooks.

The checkpoint to watch is the next generation of incident postmortems. When an agentic workflow fails in production at a major enterprise, and the postmortem is published, the root cause will almost certainly not be "the model hallucinated." It will be that the runtime had no durable record of what the agent was doing, no ability to replay the execution from the last known-good checkpoint, and no mechanism to prove, after the fact, that the agent's actions were consistent with the policy that was in effect at the time. The industry has spent three years celebrating the intelligence of the model. The next three years will be spent building the infrastructure that makes that intelligence reliable, auditable, and survivable, and that infrastructure will be judged not by how it performs in a demo, but by what it remembers after a crash.
