Durable Execution for AI
Your agents will fail.
Plan for it.
One API for retries, state persistence, and human-in-the-loop. Ship production AI without the infrastructure headache.
Free tier. No credit card.
agent.ts
import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('process-order', async (ctx) => {
  // Each step is durable: it survives crashes
  const analysis = await ctx.step('analyze', () =>
    ai.analyze(order)
  )

  // Pause for human approval when needed
  if (analysis.needsReview) {
    await ctx.prompt('review-order', {
      type: 'review',
      message: `Review order #${order.id}`,
      outcomes: ['approve', 'deny', 'escalate']
    })
  }

  // Automatic retries with exponential backoff
  await ctx.step('fulfill', () => fulfillOrder(order), {
    retry: { maxAttempts: 3 }
  })
})

Durable execution
Workflows survive crashes, restarts, and deploys. State is checkpointed at every step.
Automatic retries
Exponential backoff with configurable attempts. Handle transient failures automatically.
Human-in-the-loop
Pause for review, confirmation, or form input. Route to humans when AI is uncertain.
Timers & delays
Schedule steps hours or days into the future. Durable, survives restarts.
Parallel & child workflows
Fan out to multiple steps or spawn child workflows. Sync or fire-and-forget.
Signals & webhooks
React to external events mid-workflow. FIFO signal queues with guaranteed delivery. Timers, fan-out, and signals are sketched below.
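The agent.ts example above covers steps, retries, and prompts; timers, fan-out, and signals are not shown in it. A minimal sketch of how they might combine, assuming hypothetical ctx.sleep and ctx.waitForSignal helpers (only ctx.step and ctx.prompt appear on this page) and treating loadInvoice, loadCustomer, emailInvoice, emailReminder, markInvoicePaid, invoiceId, and customerId as placeholders for your own code:

import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('invoice-reminder', async (ctx) => {
  // Parallel fan-out: independent steps run concurrently,
  // and each result is checkpointed on completion
  const [invoice, customer] = await Promise.all([
    ctx.step('load-invoice', () => loadInvoice(invoiceId)),
    ctx.step('load-customer', () => loadCustomer(customerId)),
  ])

  await ctx.step('send-invoice', () => emailInvoice(customer, invoice))

  // Durable timer (assumed API name ctx.sleep): the workflow can wait
  // for days and survives restarts while it waits
  await ctx.sleep('reminder-delay', { days: 3 })

  await ctx.step('send-reminder', () => emailReminder(customer, invoice))

  // Signal wait (assumed API name ctx.waitForSignal): resumes when an
  // external event, such as the payment_received signal, arrives
  await ctx.waitForSignal('payment_received')

  await ctx.step('mark-paid', () => markInvoicePaid(invoice.id))
})

Promise.all over steps gives the fan-out; each step is still checkpointed individually, so a crash mid-batch only re-runs the steps that never completed.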
How It Works
Three lines of code
1
Define your workflow
Write steps in TypeScript or Python. Each step is a checkpoint.
await ctx.step('analyze', () => ai.analyze(data))
2
Flow handles the hard parts
Automatic retries, state persistence, timeout handling.
// Crash here? No problem. Resumes from last step.
3
Humans step in when needed
Pause for approval, wait for input, then continue.
await ctx.prompt('approve', { type: 'confirm' })
Mixed-Fidelity Execution
Code, AI, and humans
in one workflow
Some steps are pure code. Others need AI judgment. Some require human approval. Flow handles all three with the same simple API, as the sketch below shows.
Deterministic
Parse JSON, validate schema, call APIs
AI-Assisted
LLM evaluates, classifies, generates
Human Required
Approve transactions, resolve edge cases
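A sketch of a workflow that mixes all three fidelities, using only the documented ctx.step and ctx.prompt; validateRefund, llm.classifyRisk, issueRefund, and rawInput are placeholders, and the assumption that ctx.prompt resolves with the chosen outcome is not confirmed on this page:

import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('refund-request', async (ctx) => {
  // Deterministic: parse and validate the request with plain code
  const request = await ctx.step('validate', () => validateRefund(rawInput))

  // AI-assisted: an LLM classifies how risky the refund is
  const risk = await ctx.step('classify-risk', () => llm.classifyRisk(request))

  // Human required: pause for approval when the AI is uncertain
  // (assumes the prompt resolves with the chosen outcome)
  if (risk.confidence < 0.8) {
    const decision = await ctx.prompt('approve-refund', {
      type: 'review',
      message: `Refund $${request.amount} for order #${request.orderId}?`,
      outcomes: ['approve', 'deny']
    })
    if (decision.outcome !== 'approve') return
  }

  await ctx.step('issue-refund', () => issueRefund(request))
})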
AI Worker Architecture
Connect any AI agent
as a worker
AI agents (Claude Code, Gemini, custom) connect via WebSocket and claim tasks from the queue. Run workers on your laptop, in the cloud, or as managed Sprites. Select workers by capabilities, location, or agent type. A worker-side sketch follows the list below.
WebSocket
Location-aware
Multi-agent
Claude Code
Code generation, review, and analysis
Research Agents
Documentation lookup, best practices
Custom Workers
Your own agents with any capabilities
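The worker-side SDK is not documented on this page, so the following is purely illustrative: an assumed Worker class that declares capabilities and location, connects over WebSocket, and hands claimed tasks to your agent (Worker, onTask, connect, and myAgent are all hypothetical names):

// Hypothetical sketch: the worker API surface is not shown on this page
import { Worker } from '@flow/sdk' // assumed export

const worker = new Worker({
  apiKey: process.env.FLOW_KEY,
  capabilities: ['code-review', 'docs-lookup'], // what this worker can claim
  location: 'laptop-us-east',                   // used for location-aware selection
})

// Assumed handler: invoked for each task the worker claims from the queue
worker.onTask(async (task) => {
  return myAgent.handle(task.input) // your agent: Claude Code, Gemini, or custom
})

await worker.connect() // opens the WebSocket and starts claiming tasks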
Comparison
Why not build it yourself?
You could. But do you want to maintain a job queue, state machine, retry logic, and approval system?
Use Cases
Built for AI that runs for hours
AI Agent Orchestration
Run multi-step AI agents that can take hours. Checkpoint after each LLM call. Escalate to humans when confidence is low.
Example: An agent processes customer support tickets: it classifies each ticket, drafts a response, escalates complex cases for human review, then sends.
Document Processing Pipelines
Process thousands of PDFs, images, or files. Retry failed extractions. Track progress. Resume after crashes.
Example: Extract data from 10,000 invoices. If OCR fails, retry. If ambiguous, flag for review. Aggregate results at the end.
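A sketch of that pipeline using the documented ctx.step retry option and ctx.prompt; documents, extractInvoice, and aggregate are placeholders, and keying each step by document id is an assumption about how step names work:

import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('invoice-batch', async (ctx) => {
  const results = []

  for (const doc of documents) {
    // Retry transient OCR failures automatically
    const extracted = await ctx.step(`extract-${doc.id}`, () => extractInvoice(doc), {
      retry: { maxAttempts: 3 },
    })

    // Ambiguous extractions get flagged for a human
    if (extracted.confidence < 0.7) {
      await ctx.prompt(`review-${doc.id}`, {
        type: 'review',
        message: `Check extraction for ${doc.name}`,
        outcomes: ['approve', 'deny']
      })
    }

    results.push(extracted)
  }

  // Aggregate once every document has been processed
  await ctx.step('aggregate', () => aggregate(results))
})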
Approval Workflows
Multi-stage approval chains with timeout handling. Route based on amount, type, or custom logic.
Example: Expense reports under $100 auto-approve. $100-$1000 need manager. Over $1000 need VP. Timeout after 48h with escalation.
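One way that routing could read, reusing the documented ctx.prompt; the timeout option and the shape of the resolved decision are assumptions not confirmed on this page, and loadReport, reportId, approve, and escalate are placeholders:

import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('expense-approval', async (ctx) => {
  const report = await ctx.step('load-report', () => loadReport(reportId))

  // Route by amount: under $100 skips human review entirely
  if (report.amount < 100) {
    return ctx.step('auto-approve', () => approve(report))
  }

  // $100-$1000 goes to the manager, over $1000 to the VP
  const approver = report.amount <= 1000 ? 'manager' : 'vp'

  // Assumed: the prompt supports a timeout and resolves with the outcome
  const decision = await ctx.prompt(`approve-${approver}`, {
    type: 'review',
    message: `Approve $${report.amount} expense from ${report.owner}?`,
    outcomes: ['approve', 'deny', 'escalate'],
    timeout: { hours: 48 },
  })

  if (decision.outcome === 'approve') {
    await ctx.step('approve', () => approve(report))
  } else {
    await ctx.step('escalate', () => escalate(report))
  }
})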
Scheduled Automation
Run workflows on schedules or triggers. Daily reports, weekly syncs, event-driven processing.
Example: Every morning at 9am, pull metrics from 5 APIs, generate a report, send to Slack. If any API fails, retry 3x then alert.
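How the 9am schedule is attached is not shown on this page (workflows are defined in code or YAML); the workflow body itself is just documented steps. A sketch with fetchMetrics, buildReport, sendToSlack, and alertOnCall as placeholders:

import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('daily-metrics-report', async (ctx) => {
  const sources = ['billing', 'support', 'signups', 'product', 'sales']

  try {
    // Pull from all five APIs in parallel; each fetch retries up to 3 times
    const metrics = await Promise.all(
      sources.map((source) =>
        ctx.step(`fetch-${source}`, () => fetchMetrics(source), {
          retry: { maxAttempts: 3 },
        })
      )
    )

    const report = await ctx.step('build-report', () => buildReport(metrics))
    await ctx.step('send-to-slack', () => sendToSlack(report))
  } catch (err) {
    // Retries exhausted on one of the fetches: alert instead of failing silently
    await ctx.step('alert-on-failure', () => alertOnCall(err))
  }
})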
API-First
Built for agents,
not dashboards
Clean REST API. TypeScript and Python SDKs. Define workflows in code or YAML. Your agents talk to Flow programmatically.
TypeScript
Python
REST API
Hosted
Self-host option
// Start a workflow
POST /v1/workflows/my-agent/run
{ "input": { "task": "..." } }
// Check status
GET /v1/executions/:id
// Respond to a prompt
POST /v1/prompts/:id/complete
{
  "outcome": "approve",
  "reasoning": "Within policy"
}
// Send a signal
POST /v1/executions/:id/signal
{ "name": "payment_received" }

Integrations
Connect to everything
Call any API from your workflow steps; a minimal sketch follows this list. Built-in integrations for common services.
Slack
Messages, channels, buttons
GitHub
Issues, PRs, workflows
Google Drive
Files, folders, docs
Gmail
Send emails, read inbox
AWS
S3, Lambda, SQS
Webhooks
Any HTTP endpoint
Databases
Postgres, MySQL
CLI
Local testing
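The built-in integration helpers are not documented here, but because a step body is ordinary code, any HTTP API can be called directly. A minimal sketch posting to a Slack incoming webhook from a documented ctx.step (SLACK_WEBHOOK_URL is a placeholder):

import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('notify-team', async (ctx) => {
  // A step body is plain code, so any HTTP endpoint works here
  await ctx.step(
    'post-to-slack',
    () =>
      fetch(process.env.SLACK_WEBHOOK_URL!, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text: 'Invoice batch finished' }),
      }),
    { retry: { maxAttempts: 3 } }
  )
})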
From the community
Built by engineers, for engineers
"We replaced 2000 lines of retry and queue code with 50 lines using Flow. The human-in-the-loop feature was the killer feature for us."
Alex Chen
Founding Engineer, AI startup
"Finally, a workflow engine that doesn't require a PhD in distributed systems. I had an agent processing documents in production within an hour."
Sarah Kim
Solo founder, DocProcessor
"The mixed-fidelity model clicked immediately. Some steps need AI, some need humans, some are just code. Flow handles all of it."
Marcus Johnson
AI Engineer, Stealth
Under the Hood
Why durability matters
AI workflows fail in new ways. LLM calls time out, APIs rate-limit, and humans take days to respond. Traditional retry logic isn't enough.
Checkpoint after every step
Each step result is persisted before continuing. If the process crashes, it resumes from the last checkpoint—not the beginning.
Pause for days or weeks
Human approvals can take time. Flow can pause indefinitely, then wake up exactly where it left off.
Guaranteed exactly-once
Step results are recorded exactly once and completed steps are never re-executed on replay. No duplicate API calls, no double charges, no repeated side effects.
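To make the checkpoint behavior concrete, a minimal sketch of why a crash between steps does not double-charge; payments, sendReceipt, customerId, and amount are placeholders:

import { Flow } from '@flow/sdk'

const flow = new Flow({ apiKey: process.env.FLOW_KEY })

await flow.run('charge-customer', async (ctx) => {
  // The charge result is persisted when the step completes; on replay after
  // a crash, the saved result is returned instead of calling the payment
  // API a second time
  const charge = await ctx.step('charge-card', () => payments.charge(customerId, amount))

  // A crash here resumes from this point: the step above is not re-executed
  await ctx.step('send-receipt', () => sendReceipt(customerId, charge.id))
})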
99.9%
Uptime SLA
<50ms
Step latency
∞
Execution duration
0
Lost steps