Stripe processes millions of payments per day. Each payment involves charging a card, updating an account balance, sending a receipt, and notifying downstream services. If any step fails — network timeout, service crash, database hiccup — the entire transaction needs to either complete correctly or roll back cleanly. No double charges. No lost receipts. No inconsistent state.
Stripe uses Temporal for this. So do Netflix, Snap, Datadog, JPMorgan Chase, and OpenAI. Temporal, the company behind the platform, raised $300 million at a $5 billion valuation in February 2026. And the core idea is surprisingly simple: what if your code could survive crashes?
What Durable Execution Actually Means
Here's the traditional way to handle a multi-step backend process:
1. Charge the customer's card
2. Reserve inventory
3. Send confirmation email
4. Update analytics
If step 3 fails, you need to manually undo steps 1 and 2. You write retry logic. You add timeout handling. You build a state machine to track which steps completed. You store intermediate state in a database. You add monitoring to detect stuck workflows. You write compensation logic for every possible failure combination.
Pairing each step with a compensating action like this is called the Saga pattern, and implementing it by hand is brutally hard. Most teams get it wrong. Not because they're bad engineers, but because distributed systems are fundamentally unreliable, and accounting for every failure mode takes more code than the actual business logic.
Durable execution flips this. Instead of you managing failures, the platform manages them. You write your workflow as normal sequential code. The execution engine records every step. If the process crashes (server dies, network drops, deployment restarts), it replays the recorded steps and resumes exactly where it left off. Short of a bug in the workflow code itself or a step that fails on every retry, the workflow runs to completion.
// This looks like normal code. But it survives crashes.
async function orderWorkflow(order: OrderInput): Promise<OrderResult> {
  // Step 1: Charge the card
  const payment = await activities.processPayment(order.paymentInfo);

  // Step 2: Reserve inventory
  const reservation = await activities.reserveInventory(order.items);

  // Step 3: Send confirmation
  await activities.sendConfirmation(order.email, payment.id);

  // Step 4: Update analytics
  await activities.trackOrder(order.id, payment.amount);

  return { orderId: order.id, status: "completed" };
}
If the server crashes between step 2 and step 3, Temporal replays steps 1 and 2 (using recorded results — not re-executing them), then continues with step 3. The payment isn't charged again. The inventory isn't reserved again. The workflow just picks up where it stopped.
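To make the replay idea concrete, here is a toy model in plain TypeScript. It is a sketch, not Temporal's implementation: `ToyEngine`, `eventLog`, and `orderWorkflowToy` are names invented for this example, and the "crash" is just a thrown error. The point is only that a recorded step returns its stored result instead of running again.

```typescript
// Toy model of replay. Not Temporal's implementation, just the core idea.
type HistoryEvent = { step: string; result: unknown };

class ToyEngine {
  private cursor = 0;
  private events: HistoryEvent[];

  constructor(events: HistoryEvent[]) {
    this.events = events;
  }

  // Run a step once. On replay, return the recorded result instead.
  step<T>(stepName: string, fn: () => T): T {
    const recorded = this.events[this.cursor];
    if (recorded !== undefined) {
      if (recorded.step !== stepName) {
        throw new Error(`replay mismatch: expected ${recorded.step}, got ${stepName}`);
      }
      this.cursor++;
      return recorded.result as T;
    }
    const result = fn(); // first execution: the side effect happens here
    this.events.push({ step: stepName, result });
    this.cursor++;
    return result;
  }
}

let charges = 0;      // counts how often the "card" is really charged
let crashOnce = true; // simulate one crash between step 2 and step 3

function orderWorkflowToy(engine: ToyEngine): string {
  const paymentId = engine.step("processPayment", () => {
    charges++;
    return "pay_1";
  });
  engine.step("reserveInventory", () => "res_1");
  if (crashOnce) {
    crashOnce = false;
    throw new Error("simulated crash");
  }
  engine.step("sendConfirmation", () => "sent");
  return paymentId;
}

const eventLog: HistoryEvent[] = []; // stands in for the persisted event history
try {
  orderWorkflowToy(new ToyEngine(eventLog));
} catch {
  // the process "died" after step 2
}
// A fresh engine with the same log plays the role of a restarted worker.
const finalResult = orderWorkflowToy(new ToyEngine(eventLog));
console.log(finalResult, charges, eventLog.length);
```

The second run walks through steps 1 and 2 without calling their functions, so the charge counter stays at one.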
The Numbers
Temporal is no longer a niche tool. The growth numbers from their Series D announcement are striking:
The 380% revenue growth is partly driven by AI. Temporal reports 1.86 trillion action executions from AI-native companies on their Cloud product alone. AI agents — which chain multiple LLM calls, tool invocations, and external API calls — are exactly the kind of multi-step processes where durable execution shines.
How Temporal Works (The Parts That Matter)
Workflows
A Workflow is your business logic — the sequence of steps. Workflows must be deterministic: given the same inputs, they must produce the same sequence of commands. This is how Temporal replays them after crashes.
import { proxyActivities } from "@temporalio/workflow";
import type * as activities from "./activities";

const { processPayment, reserveInventory, sendConfirmation } =
  proxyActivities<typeof activities>({
    startToCloseTimeout: "30 seconds",
    retry: {
      maximumAttempts: 3,
      nonRetryableErrorTypes: ["CreditCardDeclined"],
    },
  });

export async function orderFulfillment(order: Order): Promise<void> {
  const payment = await processPayment(order);
  const reservation = await reserveInventory(order);
  await sendConfirmation(order, payment);
}
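Why the determinism rule matters is easier to see with a small experiment. The sketch below is hypothetical and uses no Temporal APIs: a "workflow" is reduced to a function that returns the list of commands it would issue, and replay is reduced to comparing that list with what was recorded earlier.

```typescript
// Sketch: why workflow code must be deterministic. All names are hypothetical.
type WorkflowCommand = string;

// A workflow reduced to its decisions: which commands does it issue?
function decideCommands(random: () => number): WorkflowCommand[] {
  const commands: WorkflowCommand[] = ["processPayment"];
  // Branching on a random value makes the command sequence unpredictable.
  if (random() < 0.5) {
    commands.push("applyDiscount");
  }
  commands.push("reserveInventory");
  return commands;
}

// Replay check: re-run the code and compare with what was recorded.
function matchesHistory(recorded: WorkflowCommand[], replayed: WorkflowCommand[]): boolean {
  return (
    recorded.length === replayed.length &&
    recorded.every((cmd, i) => cmd === replayed[i])
  );
}

// Small seeded generator (a linear congruential generator), so that
// "random" values can be reproduced from a stored seed.
function seeded(seed: number): () => number {
  let state = seed % 4294967296;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Same seed on the original run and on replay: the sequences agree.
const replayOk = matchesHistory(decideCommands(seeded(42)), decideCommands(seeded(42)));

// Different random values on replay: the sequences diverge.
const replayDiverged = !matchesHistory(
  decideCommands(() => 0.1),
  decideCommands(() => 0.9)
);
console.log(replayOk, replayDiverged);
```

With a reproducible source of randomness the replayed commands line up with the recorded ones. With an unpredictable one, the replay issues a different sequence and the recorded results no longer fit the code.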
Activities
Activities are the non-deterministic parts — the actual side effects. HTTP calls, database writes, sending emails, calling external APIs. Activities can fail, retry, and timeout independently.
// activities.ts: these run on Workers as ordinary code, not inside the Workflow
export async function processPayment(order: Order): Promise<PaymentResult> {
  const result = await stripe.charges.create(
    {
      amount: order.total,
      currency: "usd",
      source: order.paymentToken,
    },
    // Activities can be retried, so give Stripe an idempotency key to keep a
    // retry from charging twice. Assumes order.id is unique per order.
    { idempotencyKey: `charge-${order.id}` }
  );
  return { id: result.id, status: result.status };
}

export async function reserveInventory(order: Order): Promise<void> {
  await inventoryService.reserve(order.items);
}

export async function sendConfirmation(
  order: Order,
  payment: PaymentResult
): Promise<void> {
  await emailService.send({
    to: order.email,
    template: "order-confirmation",
    data: { orderId: order.id, paymentId: payment.id },
  });
}
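The retry options shown in the workflow (`maximumAttempts`, `nonRetryableErrorTypes`) can be illustrated with a hand-written helper. This is a simplified sketch and not the SDK's internals: `runWithRetry` is a made-up function, it is synchronous, and it skips the waiting between attempts.

```typescript
// Hypothetical retry helper. It mirrors the shape of the retry options,
// not how the Temporal SDK implements them.
interface RetryOptions {
  maximumAttempts: number;
  nonRetryableErrorTypes: string[];
}

function runWithRetry<T>(activity: () => T, options: RetryOptions): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maximumAttempts; attempt++) {
    try {
      return activity();
    } catch (err) {
      lastError = err;
      const errorName = err instanceof Error ? err.name : "";
      if (options.nonRetryableErrorTypes.indexOf(errorName) !== -1) {
        break; // retrying a declined card will not help
      }
    }
  }
  throw lastError;
}

class CreditCardDeclined extends Error {
  constructor(message: string) {
    super(message);
    this.name = "CreditCardDeclined";
  }
}

const retryOptions: RetryOptions = {
  maximumAttempts: 3,
  nonRetryableErrorTypes: ["CreditCardDeclined"],
};

// A flaky activity: fails twice with a transient error, then succeeds.
let flakyCalls = 0;
const receipt = runWithRetry(() => {
  flakyCalls++;
  if (flakyCalls < 3) throw new Error("network timeout");
  return "receipt_1";
}, retryOptions);

// A declined card: listed as non-retryable, so it is attempted only once.
let declinedCalls = 0;
let declinedError = "";
try {
  runWithRetry(() => {
    declinedCalls++;
    throw new CreditCardDeclined("card declined");
  }, retryOptions);
} catch (err) {
  declinedError = err instanceof Error ? err.name : "";
}
console.log(receipt, flakyCalls, declinedCalls, declinedError);
```

The transient failure is absorbed by the retries, while the declined card surfaces immediately as an error the workflow can handle.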
Workers
Workers are the processes that execute your code. You can run as many as you want. If a Worker crashes, another picks up the workflow. Workers pull tasks from Task Queues — they don't need to know about each other.
import { Worker } from "@temporalio/worker";
import * as activities from "./activities";

async function run() {
  const worker = await Worker.create({
    workflowsPath: require.resolve("./workflows"),
    activities,
    taskQueue: "order-fulfillment",
  });
  await worker.run();
}

run().catch(console.error);
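The pull model is worth a quick illustration. The following is a toy in-memory queue, assuming nothing about how the Temporal Server stores or dispatches tasks; `ToyTaskQueue` and `workerStep` are invented names. It shows two things: workers ask for work rather than being addressed directly, and a task lost to a crashed worker ends up with whichever worker polls next.

```typescript
// Toy in-memory task queue. The real queues live in the Temporal Server.
interface QueueTask {
  workflowId: string;
}

class ToyTaskQueue {
  private pending: QueueTask[] = [];

  add(task: QueueTask): void {
    this.pending.push(task);
  }

  // Workers ask for work. The queue never pushes to a specific worker.
  poll(): QueueTask | undefined {
    return this.pending.shift();
  }

  // A task whose worker died goes back to the front of the queue.
  requeue(task: QueueTask): void {
    this.pending.unshift(task);
  }
}

// One polling step of a worker. Returns false when the queue was empty.
function workerStep(
  taskQueue: ToyTaskQueue,
  workerName: string,
  finished: string[],
  crashMidTask = false
): boolean {
  const task = taskQueue.poll();
  if (task === undefined) return false;
  if (crashMidTask) {
    // In this toy the task is handed back immediately. A real server would
    // notice the lost worker through a timeout instead.
    taskQueue.requeue(task);
    return true;
  }
  finished.push(`${workerName} finished ${task.workflowId}`);
  return true;
}

const orderQueue = new ToyTaskQueue();
orderQueue.add({ workflowId: "order-1" });
orderQueue.add({ workflowId: "order-2" });

const workerLog: string[] = [];
workerStep(orderQueue, "worker-a", workerLog, true); // picks order-1, then crashes
workerStep(orderQueue, "worker-b", workerLog);       // picks up order-1
workerStep(orderQueue, "worker-b", workerLog);       // picks up order-2
console.log(workerLog);
```

Neither worker knows the other exists. The queue is the only thing they share.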
The Temporal Server
The Temporal Server (or Temporal Cloud) is the orchestration brain. It stores workflow state, manages task queues, handles timeouts, and coordinates replays. You can self-host it (open source, MIT license) or use Temporal Cloud.
The Saga Pattern Without the Pain
Here's where Temporal gets really interesting. Remember that Saga pattern? Here's how you'd implement compensating transactions — rollback logic when a later step fails:
export async function orderWorkflow(order: Order): Promise<OrderResult> {
  // Step 1: Charge payment
  const payment = await processPayment(order);
  try {
    // Step 2: Reserve inventory
    const reservation = await reserveInventory(order);
    try {
      // Step 3: Arrange shipping
      await arrangeShipping(order, reservation);
    } catch (err) {
      // Shipping failed: undo the reservation here, then rethrow so the
      // outer catch refunds the payment exactly once
      await cancelReservation(reservation.id);
      throw err;
    }
  } catch (err) {
    // Inventory or shipping failed: undo payment. No instanceof check on a
    // custom error class, because activity errors reach the workflow wrapped
    // in an ActivityFailure
    await refundPayment(payment.id);
    throw err;
  }
  return { status: "completed", paymentId: payment.id };
}
This looks like normal try/catch error handling. But Temporal guarantees that the compensation logic (refund, cancellation) will execute even if the Worker crashes during the rollback. No lost refunds. No orphaned reservations. The engine ensures completion.
Without Temporal, you'd need a state machine, a persistent store, a retry scheduler, dead letter queues, and monitoring — hundreds of lines of infrastructure code. With Temporal, it's a try/catch block.
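Nested try/catch gets awkward once a workflow has more than three or four steps. A common alternative is to keep a list of completed steps and undo them in reverse order. The sketch below shows that idea in plain TypeScript with no Temporal APIs. `SagaStep` and `runSaga` are hypothetical names, and the steps are synchronous stand-ins for activities.

```typescript
// Saga sketch with a list of completed steps. Names are hypothetical.
interface SagaStep {
  label: string;
  action: () => void;
  compensate: () => void;
}

// Runs the steps in order. On failure, undoes the completed ones in
// reverse order and reports false.
function runSaga(steps: SagaStep[], trace: string[]): boolean {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      step.action();
      trace.push(`done:${step.label}`);
      completed.push(step);
    } catch {
      trace.push(`failed:${step.label}`);
      for (const done of completed.slice().reverse()) {
        done.compensate();
        trace.push(`undone:${done.label}`);
      }
      return false;
    }
  }
  return true;
}

const sagaTrace: string[] = [];
const sagaOk = runSaga(
  [
    { label: "payment", action: () => {}, compensate: () => {} },
    { label: "inventory", action: () => {}, compensate: () => {} },
    {
      label: "shipping",
      action: () => {
        throw new Error("carrier unavailable");
      },
      compensate: () => {},
    },
  ],
  sagaTrace
);
console.log(sagaOk, sagaTrace);
```

In this run shipping fails, so the trace shows the inventory reservation being undone first and the payment second. Inside a real workflow the same loop would await compensating activities, and the engine's durability is what keeps the loop going across crashes.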
Temporal vs The Alternatives
| Feature | Temporal | AWS Step Functions | Inngest | Celery |
|---|---|---|---|---|
| Programming Model | Code-first (official SDK languages) | JSON/YAML state machine | Serverless functions | Task queue |
| Self-hostable | Yes (MIT license) | No (AWS only) | Yes (source-available server) | Yes |
| Language Support | TypeScript, Python, Go, Java, .NET | Any (via Lambda) | TypeScript, Python | Python |
| Durable Execution | Yes | Yes (state machine) | Yes | No (retry only) |
| Long-running (days/months) | Yes | Yes (Express: 5min limit) | Yes | No |
| Debugging | Replay + event history | CloudWatch | Dashboard | Logs only |
| GitHub Stars | 12,000+ | N/A (AWS service) | 5,000+ | 25,000+ |
| Learning Curve | Steep | Medium | Low | Low |
| Best For | Complex distributed systems | AWS-native workflows | Simple serverless flows | Background jobs |
When Step Functions Win
If you're all-in on AWS and your workflows are visual state machines with AWS service integrations, Step Functions make sense. No infrastructure to manage, tight integration with Lambda, SQS, DynamoDB, and the rest of the AWS ecosystem. For simple "do A, then B, then C" workflows that stay within AWS, Step Functions are simpler.
When Inngest Wins
Inngest is the right choice when you want serverless-first, event-driven workflows with minimal setup. It's excellent for Next.js background jobs, webhook processing, and simple multi-step flows. The learning curve is much lower than Temporal. If your use case is "run this function when this event happens, with retries," Inngest gets you there faster.
When Celery Isn't Enough
Celery is a task queue, not a workflow engine. It can retry failed tasks, but it can't replay workflows, manage multi-step transactions, or handle long-running processes that span days. If you're using Celery and finding yourself building state tracking, compensation logic, and timeout management on top of it — you need Temporal, not a better Celery configuration.
Real Use Cases (Not Toy Examples)
Payment Processing
The canonical Temporal use case. A payment workflow involves: authorization, capture, fraud check, ledger update, receipt, and downstream notifications. If the fraud check service is down for 5 minutes, the workflow pauses and resumes automatically. If the ledger update fails, compensation runs. With traditional backends, this requires a state machine with 15+ states. With Temporal, it's a sequence of function calls.
User Onboarding
New user signs up. Send welcome email. Wait 3 days. Send follow-up if they haven't activated. Wait 7 more days. Send a second follow-up. If they activate at any point, cancel the remaining steps and send an activation confirmation.
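import { sleep } from "@temporalio/workflow";

export async function onboardingWorkflow(user: User): Promise<void> {
  await sendWelcomeEmail(user);
  // Wait 3 days. Temporal persists the timer.
  await sleep("3 days");
  // Activation is only checked after each timer. Reacting the moment a user
  // activates would need a Signal in addition to these checks.
  if (!(await checkActivation(user.id))) {
    await sendFollowUp(user, "first");
    await sleep("7 days");
    if (!(await checkActivation(user.id))) {
      await sendFollowUp(user, "second");
      // Still not activated: finish without sending a confirmation
      return;
    }
  }
  await sendActivationConfirmation(user);
}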
That sleep("3 days") is durable. Your server can restart, deploy new code, crash and recover — the timer persists. Try doing that with setTimeout or a cron job.
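One way to picture a durable timer: what gets stored is the deadline, not the countdown. The sketch below is an assumption about the general technique rather than a description of Temporal's storage format. `TimerRecord`, `startTimer`, and `remainingMs` are hypothetical, and `JSON.stringify` stands in for a database write.

```typescript
// Toy durable timer: the deadline is stored, not the countdown.
interface TimerRecord {
  workflowId: string;
  fireAtMs: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Called when the workflow asks to sleep. The record would go to a database.
function startTimer(workflowId: string, nowMs: number, durationMs: number): TimerRecord {
  return { workflowId, fireAtMs: nowMs + durationMs };
}

// Called after any restart: how long is left, based only on the stored record?
function remainingMs(record: TimerRecord, nowMs: number): number {
  return Math.max(0, record.fireAtMs - nowMs);
}

const t0 = 0; // hypothetical start time in milliseconds
const stored = JSON.stringify(startTimer("onboarding-42", t0, 3 * DAY_MS));

// The process crashes and comes back one day later with nothing in memory.
const restored: TimerRecord = JSON.parse(stored);
const leftAfterOneDay = remainingMs(restored, t0 + 1 * DAY_MS);

// If the outage lasts longer than the timer, it fires as soon as it is checked.
const leftAfterFiveDays = remainingMs(restored, t0 + 5 * DAY_MS);
console.log(leftAfterOneDay / DAY_MS, leftAfterFiveDays);
```

A `setTimeout` keeps its countdown in process memory, which is exactly what a restart throws away.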
Cross-Service Data Pipeline
Extract data from Service A, transform it, load it into Service B, validate the result, then notify Service C. If Service B is temporarily unavailable, retry with exponential backoff. If the validation fails after all retries, roll back the extraction from Service A and alert the team.
This is the kind of workflow that traditionally requires a complex DAG orchestrator (Airflow, Prefect), a message queue (RabbitMQ, SQS), and custom retry/compensation logic. With Temporal, it's a function with try/catch and Activity retries.
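For the "retry with exponential backoff" part, the waiting times follow a simple rule: start from an initial interval, multiply by a coefficient after each failure, and cap the result. The field names below echo the ones in Temporal's retry policy (initial interval, backoff coefficient, maximum interval, maximum attempts), but the function itself and the numbers are illustrative assumptions.

```typescript
// Backoff schedule sketch. The numbers are illustrative, not defaults.
interface BackoffPolicy {
  initialIntervalMs: number;
  backoffCoefficient: number;
  maximumIntervalMs: number;
  maximumAttempts: number;
}

// Delay before each retry. There is no delay before the first attempt,
// so a policy with N attempts yields N - 1 delays.
function retryDelays(policy: BackoffPolicy): number[] {
  const result: number[] = [];
  let delay = policy.initialIntervalMs;
  for (let retry = 1; retry < policy.maximumAttempts; retry++) {
    result.push(Math.min(delay, policy.maximumIntervalMs));
    delay *= policy.backoffCoefficient;
  }
  return result;
}

const schedule = retryDelays({
  initialIntervalMs: 1000,
  backoffCoefficient: 2,
  maximumIntervalMs: 10000,
  maximumAttempts: 6,
});
console.log(schedule); // [1000, 2000, 4000, 8000, 10000]
```

The cap matters for long outages: without it, the sixth or seventh retry of a doubling schedule is already minutes away.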
Getting Started (For Real)
Option 1: Temporal Cloud (Fastest)
# Install the CLI
brew install temporal # macOS
# or: curl -sSf https://temporal.download/cli.sh | sh
# Create a free dev namespace at https://cloud.temporal.io
Temporal Cloud pricing starts with a free Dev tier. The Growth tier is $200/month with 1 million actions included.
Option 2: Self-Hosted (Free, More Control)
# Run Temporal locally with Docker
git clone https://github.com/temporalio/docker-compose.git
cd docker-compose
docker compose up -d
The self-hosted Temporal Server is fully open source under the MIT license. It requires a database backend (PostgreSQL, MySQL, or Cassandra) and four services. It's production-ready, but scaling requires database expertise.
Option 3: TypeScript Quick Start
# Create a new Temporal project
npx @temporalio/create my-temporal-app
cd my-temporal-app
npm install
Then follow the official TypeScript tutorial — it walks you through workflows, activities, workers, and task queues with runnable code.
Option 4: Python Quick Start
pip install temporalio
The Python SDK uses async/await and decorators. A workflow looks like:
from datetime import timedelta

from temporalio import activity, workflow


@activity.defn
async def process_payment(order_id: str) -> str:
    # Your payment logic here
    return f"payment_{order_id}"


@workflow.defn
class OrderWorkflow:
    @workflow.run
    async def run(self, order_id: str) -> str:
        payment = await workflow.execute_activity(
            process_payment,
            order_id,
            start_to_close_timeout=timedelta(seconds=30),
        )
        return payment
When Temporal Is Overkill
Temporal is not the answer for everything. Here's when you should not use it:
Simple background jobs. If you just need "run this function in the background with retries," use a task queue (Celery, Bull, Sidekiq) or a serverless function (Inngest, Trigger.dev). Temporal's infrastructure overhead isn't worth it for simple fire-and-forget tasks.
Transient processes. If the process doesn't need to survive crashes — it's a quick computation that can just be re-triggered — durable execution adds unnecessary complexity. Not everything needs to be durable.
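Rapid prototyping. Temporal enforces strict determinism in workflow code. You can't make HTTP requests directly in a workflow, and randomness and the current time have to come from deterministic sources: the TypeScript SDK swaps in deterministic versions of Math.random() and Date inside its workflow sandbox, while other SDKs expose workflow-specific APIs for them. This discipline is valuable for production reliability but slows down iteration speed when you're exploring ideas.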
Small teams without infrastructure experience. Self-hosting Temporal requires managing a database cluster, four services, and understanding distributed systems concepts. If your team is three developers shipping a SaaS app, the operational burden may outweigh the benefits. Consider Temporal Cloud or Inngest instead.
What I Actually Think
Temporal solves a real problem that most backend developers have been hacking around for years. The "state machine + retry queue + compensation logic + monitoring" stack that every team builds from scratch is exactly the kind of accidental complexity that a platform should handle.
But here's my honest assessment: Temporal is infrastructure for infrastructure people. The learning curve is steep. The determinism constraints are surprising (workflow code can't read the real wall clock, so time has to come from the SDK's deterministic clock). The debugging model requires understanding event sourcing and replay semantics. The community forum is full of questions from developers struggling with non-obvious constraints.
For complex, multi-step, long-running, business-critical workflows — payment processing, order fulfillment, subscription lifecycle, cross-service coordination — Temporal is the best tool available. Nothing else gives you the same combination of durable execution, language-native SDKs, and self-hosting capability.
For everything else, it's probably too much.
My recommendation: if you're building a backend where a failure in step 5 of an 8-step process means manual intervention and a support ticket, you need Temporal. If your "workflows" are just background jobs with retries, you don't.
The sweet spot is organizations with 10+ developers building systems where reliability matters more than development speed. At that scale, the upfront learning investment pays for itself in reduced operational incidents, simpler error handling, and workflows that just work — even when servers don't.
Temporal made complex backends boring. And boring backends are the best kind.
Sources
- Temporal raises $300M Series D at $5B valuation
- Temporal raises Series D funding — Press release
- Temporal adoption statistics
- Temporal raises $300M — GeekWire
- The definitive guide to Durable Execution — Temporal
- Durable Execution: Build reliable software — The New Stack
- Saga Pattern in Microservices — Temporal
- Microservices Pattern: Saga
- Temporal Workflow documentation
- Self-hosted Temporal guide
- Temporal Cloud pricing
- Temporal Pricing 2026 — Automation Atlas
- Get Started with TypeScript — Temporal Learn
- Get Started with Python — Temporal Learn
- Temporal order fulfillment demo — GitHub
- Inngest vs Temporal — Inngest
- Inngest vs Temporal comparison — OpenAlternative
- AWS Step Functions vs Temporal — Ready Set Cloud
- Self-Hosting vs Temporal Cloud whitepaper
- Temporal.io: Thoughts after 3 months — Hollis Wu
- Temporal community: Cases not suitable for Temporal
- Temporal advantages and drawbacks — Restack
- Celery vs Temporal.io — pedrobuzzi.dev