multi-region, availability, consistency, global-systems

Geo-Replication (Active-Active)

Serve traffic from multiple regions simultaneously while synchronizing state across them.

Definition

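Active-active geo-replication lets multiple regions accept reads and writes at the same time. Concurrent writes to the same data are reconciled by a conflict resolution rule so that all regions eventually converge on the same state.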
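To make "conflict resolution and convergence" concrete, here is a minimal sketch of one common rule, last-write-wins. The `Versioned` type, the `merge` function, and the region names are illustrative assumptions, not part of any particular database. The sketch also assumes timestamps are comparable across regions, which real clocks only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # assumed comparable across regions (e.g. a logical clock)
    region: str     # tie-breaker so equal timestamps resolve deterministically


def merge(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the larger (timestamp, region) pair."""
    return a if (a.timestamp, a.region) >= (b.timestamp, b.region) else b


# Two regions accept conflicting writes with the same timestamp.
us = Versioned("blue", timestamp=10, region="us-east")
eu = Versioned("green", timestamp=10, region="eu-west")

# Merge order does not matter, so both regions converge on the same value.
assert merge(us, eu) == merge(eu, us)
```

Note that last-write-wins converges by silently discarding one of the two writes, which may or may not be acceptable for a given domain.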

When To Use

Global products that need low latency on multiple continents.
High availability targets where a region outage must not stop writes.
Workloads whose conflict resolution strategy keeps working during network partitions.
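A workload of the last kind can be illustrated with a grow-only counter (G-Counter), a simple conflict-free replicated data type. This is a sketch under stated assumptions: the `GCounter` alias, the function names, and the region names are hypothetical. Each region increments only its own slot, and merging takes the element-wise maximum.

```python
from typing import Dict

GCounter = Dict[str, int]  # region name -> increments accepted by that region


def increment(counter: GCounter, region: str, amount: int = 1) -> GCounter:
    """Return a new counter with the local region's slot increased."""
    updated = dict(counter)
    updated[region] = updated.get(region, 0) + amount
    return updated


def merge_counters(a: GCounter, b: GCounter) -> GCounter:
    """Element-wise max: commutative, associative, and idempotent."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def total(counter: GCounter) -> int:
    """The counter's value is the sum of all regional slots."""
    return sum(counter.values())


# Two regions accept writes independently during a partition.
us = increment({}, "us-east", 3)
eu = increment({}, "eu-west", 2)

# After the partition heals, each side merges the other's state.
assert merge_counters(us, eu) == merge_counters(eu, us)
assert total(merge_counters(us, eu)) == 5
```

Because the merge is idempotent, replaying the same replication message twice does not change the result, which is what makes this structure safe to use across an unreliable link.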

When Not To Use

Domains requiring strict single-copy serializability across regions.
Low-scale products where active-passive is simpler and sufficient.
Teams without mature operational tooling for handling conflicts and replication lag.

Tradeoffs

Improves latency and availability globally, but adds conflict resolution complexity.
Reduces dependence on any single region, at higher replication and operational cost.
Improves regional resilience, while making correctness reasoning harder.

Common Failure Modes

Replication lag causes divergent user views and stale policy enforcement.
Conflict resolver bug corrupts canonical state.
Region isolation causes split-brain writes with difficult reconciliation.
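Split-brain writes are easier to reconcile when they can at least be detected. One well-known technique for this is the version vector; the sketch below is illustrative only, and the `VersionVector` alias, function names, and region names are assumptions. It distinguishes a write that supersedes another from two writes made without knowledge of each other.

```python
from typing import Dict

VersionVector = Dict[str, int]  # region name -> writes seen from that region


def dominates(a: VersionVector, b: VersionVector) -> bool:
    """True if a has seen every write that b has seen."""
    return all(a.get(region, 0) >= count for region, count in b.items())


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify the causal relationship between two versions."""
    a_ge, b_ge = dominates(a, b), dominates(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"
    if b_ge:
        return "b_newer"
    return "concurrent"  # neither side saw the other's write


# One version strictly extends the other: safe to overwrite.
print(compare({"us-east": 2, "eu-west": 1}, {"us-east": 1, "eu-west": 1}))

# Each region wrote while isolated: a conflict that needs reconciliation.
print(compare({"us-east": 2, "eu-west": 1}, {"us-east": 1, "eu-west": 2}))
```

The first call prints `a_newer` and the second prints `concurrent`. Only the concurrent case needs to reach the conflict resolver, which narrows the surface on which a resolver bug can corrupt state.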

Interview Framing

Use this structure when the interviewer asks for this pattern explicitly.

Explicitly state conflict strategy, write routing model, lag SLOs, and user-visible semantics during partitions.
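A lag SLO is easier to discuss with a concrete check in hand. The sketch below is a hypothetical illustration: `lag_breaches`, its parameters, and the sample numbers are assumptions. It treats lag as the time since the commit timestamp of the most recent remote write each region has applied.

```python
from typing import Dict, List


def lag_breaches(
    last_applied: Dict[str, float], now: float, slo_seconds: float
) -> List[str]:
    """Return regions whose replication lag exceeds the SLO, sorted by name.

    last_applied maps a region to the commit timestamp (in seconds) of the
    newest remote write it has applied.
    """
    return sorted(
        region
        for region, applied_at in last_applied.items()
        if now - applied_at > slo_seconds
    )


now = 1000.0
last_applied = {"us-east": 999.5, "eu-west": 990.0, "ap-south": 998.0}

# eu-west is 10 seconds behind, which breaches a 5 second SLO.
breaches = lag_breaches(last_applied, now=now, slo_seconds=5.0)
assert breaches == ["eu-west"]
```

This measure overstates lag when the source region is idle, since no new writes arrive to advance the timestamp. Periodic heartbeat writes are one way to work around that.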

Related Project Deep Dives

API Rate-Limiting as a Multi-Region Service

Design a globally consistent rate limiting service with low latency and multi-region enforcement.

Advanced, Free

Distributed Cache System

Design a distributed cache system like Redis or Memcached that handles millions of requests per second with sub-millisecond latency, high availability, and intelligent eviction policies.

Beginner, Premium

Global Video Streaming Platform

Design a large-scale video streaming system with upload processing, adaptive bitrate delivery, CDN distribution, recommendation signals, and strict playback reliability targets.

Advanced, Premium

Related Concepts

Leader Election

Select a single coordinator for shared work while preserving failover safety.

Quorum Consistency

Use read/write quorum sizes to balance consistency, availability, and latency in replicated stores.

Sharding Strategies

Partition data/work across shards to scale throughput and storage while controlling skew.