Distributed transaction comparison

Two-phase commit (2PC) vs Saga pattern

Both patterns answer the same question: how do you keep a multi-step, multi-service operation consistent when any step can fail? Two-Phase Commit answers it with locking and a coordinator; the Saga pattern answers it by letting each step commit independently and undoing the completed steps with compensations if a later step fails.

TL;DR

  • 2PC holds locks on every participant until a coordinator confirms all of them are ready, then commits everywhere atomically — strongly consistent, but blocking and hard to scale across services.
  • Saga lets each step commit independently and immediately, and rolls back completed steps with compensating transactions if a later step fails — no cross-service locks, but only eventually consistent.
  • 2PC fits best within a controlled environment (one team, reliable network, e.g. XA transactions across databases); Saga fits best across independently owned microservices.

Side-by-side comparison

Aspect | Two-Phase Commit (2PC) | Saga pattern
Consistency model | Strong: atomic across all participants | Eventual: intermediate states are observable
Coordination role | Central coordinator drives prepare/commit | Orchestrator (explicit) or choreography (event chain)
Resource locking | Yes: participants lock resources until final decision | No: each step commits and releases immediately
Failure recovery | Abort the whole transaction; risk of blocking on coordinator crash | Run compensating transactions for completed steps
Availability | Lower: participants can be stuck waiting | Higher: no cross-service blocking locks
Scales across many services? | Poorly: lock contention and coordinator dependency grow | Well: this is its primary design goal
Typical context | Multiple databases/resource managers in one trust domain (XA) | Independently deployed microservices
Rollback mechanism | Native abort: nothing was ever committed | Explicit compensating logic you must write yourself

Code / protocol side-by-side

2PC — coordinator flow

// Phase 1: PREPARE -- every participant locks its resources and votes
decision = COMMIT
for each participant p:
  vote = p.prepare(txId)
  if vote != YES:          // a NO vote or a missing vote
    decision = ABORT
    break

// Phase 2: COMMIT or ABORT -- record the decision, then broadcast it
coordinator.decide(txId, decision)
for each participant p:
  p.finalize(txId, decision)
  // Participants that voted YES have held
  // their locks since PREPARE and release
  // them only now, after the final decision
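
The coordinator flow above can be made runnable with in-memory stand-ins. The following is a minimal sketch, not a real transaction manager: the `Participant` class, its `canCommit` flag, and the `twoPhaseCommit` function are hypothetical names introduced for illustration, and a real implementation would also log decisions durably and handle timeouts.

```javascript
// Hypothetical in-memory participant: a sketch, not a real resource manager.
class Participant {
  constructor(name, canCommit = true) {
    this.name = name;
    this.canCommit = canCommit; // whether this participant will vote YES
    this.state = "IDLE";        // IDLE -> PREPARED -> COMMITTED | ABORTED
    this.locked = false;
  }

  // Phase 1: take the lock and vote. A NO vote aborts locally right away.
  async prepare(txId) {
    if (!this.canCommit) {
      this.state = "ABORTED";
      return "NO";
    }
    this.locked = true;
    this.state = "PREPARED";
    return "YES";
  }

  // Phase 2: apply the coordinator's decision and release the lock.
  async finalize(txId, decision) {
    this.state = decision === "COMMIT" ? "COMMITTED" : "ABORTED";
    this.locked = false;
  }
}

async function twoPhaseCommit(txId, participants) {
  // Phase 1: collect votes, stopping at the first non-YES vote.
  let decision = "COMMIT";
  for (const p of participants) {
    const vote = await p.prepare(txId);
    if (vote !== "YES") {
      decision = "ABORT";
      break;
    }
  }

  // Phase 2: broadcast the single decision to every participant.
  for (const p of participants) {
    await p.finalize(txId, decision);
  }
  return decision;
}
```

Calling `twoPhaseCommit("tx1", [new Participant("a"), new Participant("b", false)])` resolves to `"ABORT"`, and both participants end with their locks released. Note that between `prepare` and `finalize` every YES voter keeps `locked` set to `true`, which is the blocking window the comparison table refers to.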

Saga (orchestration) — with compensation

const steps = [
  { do: bookFlight, undo: cancelFlight },
  { do: reserveHotel, undo: cancelHotel },
  { do: chargeCard, undo: refundCard },
];

async function runSaga(steps) {
  const completed = [];
  try {
    for (const step of steps) {
      await step.do();       // commits immediately
      completed.push(step);  // remember for rollback
    }
  } catch (err) {
    // Run compensations in REVERSE order
    for (const step of completed.reverse()) {
      await step.undo();
    }
    throw err;  // surface the original failure to the caller
  }
}
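
To see the compensation order concretely, here is a self-contained sketch of the same loop with stubbed steps. The `record` and `fail` helpers, the `tripSteps` array, and the `executeSaga` function are hypothetical stand-ins for real service calls, and the card charge is made to fail on purpose.

```javascript
// Hypothetical stand-ins for real service calls; each records what ran.
const log = [];
const record = (name) => async () => { log.push(name); };
const fail = (name) => async () => {
  log.push(name);
  throw new Error(`${name} failed`);
};

const tripSteps = [
  { do: record("bookFlight"), undo: record("cancelFlight") },
  { do: record("reserveHotel"), undo: record("cancelHotel") },
  { do: fail("chargeCard"), undo: record("refundCard") }, // simulated failure
];

async function executeSaga(steps) {
  const completed = [];
  try {
    for (const step of steps) {
      await step.do();
      completed.push(step);
    }
    return { ok: true };
  } catch (error) {
    // Undo only the steps that finished, newest first.
    for (const step of completed.reverse()) {
      await step.undo();
    }
    return { ok: false, error };
  }
}
```

After `await executeSaga(tripSteps)`, `log` holds `bookFlight`, `reserveHotel`, `chargeCard`, `cancelHotel`, `cancelFlight`, in that order. `refundCard` never runs because the charge itself never completed, so there is nothing to refund.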

When to use 2PC

  • All participants are in one trust/operational domain. Coordinating multiple database resource managers under a single application server (classic XA transactions) is where 2PC still fits comfortably.
  • Strong, immediate consistency is non-negotiable. If intermediate, partially-applied states are unacceptable for even a moment, 2PC's all-or-nothing atomicity is worth the availability cost.
  • The transaction is short-lived and low-latency. 2PC's blocking window is much less risky when prepare-to-commit takes milliseconds, not the seconds or minutes a cross-service saga step might take.
  • You control the coordinator's reliability tightly. If the coordinator itself is highly available (replicated, monitored), the risk of the "blocking problem" is much lower, though never zero.

When to use the Saga pattern

  • The operation spans independently deployed microservices. Each service owns its own database; you cannot (and should not) hold a lock inside another team's service for the duration of a multi-step business process.
  • Availability matters more than instantaneous consistency. Order processing, travel booking, and e-commerce checkouts commonly accept a brief inconsistent window in exchange for never locking resources across services.
  • Steps can take a long time or involve external systems. Waiting on a third-party payment gateway or a slow partner API inside a 2PC prepare phase would hold locks for far too long; sagas let each step complete and release independently.
  • You can design meaningful compensating actions. If every step has a sensible "undo" (cancel, refund, release), sagas give you resilience without the coordination overhead of 2PC.

English phrases engineers use

2PC conversations

  • "The coordinator is still waiting on one participant's vote."
  • "That's the classic blocking problem — the coordinator died mid-transaction."
  • "We're using XA transactions to coordinate these two databases."
  • "A missing vote is treated as a no-vote — the whole transaction aborts."

Saga conversations

  • "Step 3 failed, so we're running compensations for steps 1 and 2."
  • "This is a choreography saga — no central orchestrator, just event listeners."
  • "We need a compensating transaction to reverse the hotel reservation."
  • "Between steps, the system is only eventually consistent — that's expected."

Quick decision tree

  • All participants in one trust domain, short transaction → 2PC / XA transactions
  • Multi-step process across independent microservices → Saga pattern
  • A step may take seconds/minutes or hit external APIs → Saga (2PC locks too long)
  • Strong, instant, all-or-nothing consistency is mandatory → 2PC (accept the availability cost)
  • Few steps, need simpler failure reasoning → Orchestration-based Saga
  • Many services, high autonomy, event-driven culture → Choreography-based Saga
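
The last two branches differ mainly in who decides what happens next. In an orchestrated saga a central loop calls each step; in a choreographed saga each service reacts to events published by the others. The following is a rough sketch of the choreography style, assuming Node's built-in `EventEmitter` as a stand-in for a real message broker. The event names (`TripRequested`, `FlightBooked`, and so on), the `publish` helper, and the `hotelAvailable` flag are all hypothetical.

```javascript
const { EventEmitter } = require("events");

// Hypothetical in-process event bus standing in for a real message broker.
const bus = new EventEmitter();
const events = []; // trace of every event published, in order

function publish(name, payload) {
  events.push(name);
  bus.emit(name, payload);
}

// Flight service: books on request, and compensates if the hotel step fails.
bus.on("TripRequested", (trip) => publish("FlightBooked", trip));
bus.on("HotelFailed", (trip) => publish("FlightCancelled", trip));

// Hotel service: reacts to the flight event; no orchestrator tells it to.
bus.on("FlightBooked", (trip) => {
  publish(trip.hotelAvailable ? "HotelReserved" : "HotelFailed", trip);
});
```

Publishing `TripRequested` with `{ hotelAvailable: false }` leaves `events` as `TripRequested`, `FlightBooked`, `HotelFailed`, `FlightCancelled`. No single piece of code describes the whole flow, which is the trade-off behind the "simpler failure reasoning" branch: the orchestrated version keeps that flow in one place.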

Frequently asked questions

What problem do both 2PC and Saga solve?

Both coordinate a single business operation that must update data across multiple independent services or databases — for example, "book a flight and reserve a hotel" where each lives in a different service. Without coordination, one part could succeed while the other fails, leaving the system in an inconsistent state. 2PC and Saga are two very different answers to "how do we keep that from happening."

Why is 2PC called "blocking"?

If the coordinator crashes after sending "prepare" but before sending the final "commit" or "abort," participants that voted yes are stuck holding their locks until the coordinator recovers or someone intervenes. They cannot safely proceed without knowing the coordinator's decision, because committing or aborting on their own could contradict what the coordinator eventually decides. This is called the "blocking problem," and it is the main reason 2PC is avoided in systems that need high availability.
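
The participant's side of this can be sketched as a tiny state machine. This is an illustration under simplifying assumptions, not a real recovery protocol: the `PreparedParticipant` class and its `onDecision` method are hypothetical names, and an unknown outcome is modelled as `undefined`.

```javascript
// Hypothetical participant that has voted YES and is waiting for a decision.
class PreparedParticipant {
  constructor() {
    this.state = "PREPARED";
    this.lockHeld = true;
  }

  // Called when the coordinator's decision arrives (or fails to arrive).
  onDecision(decision) {
    if (decision === "COMMIT" || decision === "ABORT") {
      this.state = decision === "COMMIT" ? "COMMITTED" : "ABORTED";
      this.lockHeld = false;
    }
    // Any other value (e.g. undefined after a timeout) means the outcome is
    // unknown: guessing could contradict the coordinator, so nothing changes.
    return this.state;
  }
}
```

Calling `onDecision(undefined)` on a fresh instance returns `"PREPARED"` and leaves `lockHeld` set to `true`; only a real `"COMMIT"` or `"ABORT"` releases the lock. However many times the timeout fires, the participant stays in the same state.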

What is a compensating transaction in a Saga?

A compensating transaction is the "undo" step for a saga step that already succeeded. If step 3 of a 5-step saga fails, the saga must run compensations for steps 2 and 1, in that reverse order (e.g. "cancel the hotel reservation," then "cancel the flight booking"). Because the underlying operations already committed independently, you cannot simply roll them back like a database transaction; you must explicitly reverse their business effect.