multi-region, availability, consistency, global-systems
Geo-Replication (Active-Active)
Serve traffic from multiple regions simultaneously while synchronizing state across them.
Definition
Active-active geo-replication lets multiple regions accept reads and writes at the same time; conflicting writes are resolved so that all regions converge on the same state.
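One common way to get conflict resolution and convergence is a last-writer-wins (LWW) register. The sketch below is only an illustration, not a prescribed implementation: the `LWWRegister` class, its `timestamp` and `region` fields, and the region names are all hypothetical, and it assumes timestamps come from a clock that is comparable across regions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    """Last-writer-wins register: one value plus the stamp of the write that set it."""
    value: str
    timestamp: int  # hypothetical logical or hybrid clock reading
    region: str     # hypothetical region id, used only to break timestamp ties

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep whichever write carries the larger (timestamp, region) stamp.
        # Comparing tuples makes the outcome deterministic in every region.
        if (other.timestamp, other.region) > (self.timestamp, self.region):
            return other
        return self


us = LWWRegister("blue", timestamp=10, region="us-east")
eu = LWWRegister("green", timestamp=12, region="eu-west")

# Merge order does not matter, so both regions converge on the same value.
assert us.merge(eu) == eu.merge(us)
print(us.merge(eu).value)  # green
```

LWW is simple but silently discards the losing write, so it suits data where dropping a concurrent update is acceptable. Domains that cannot lose writes usually need a merge function that combines both sides instead.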
When To Use
- Global products requiring low latency in multiple continents.
- High availability targets where region outage must not stop writes.
- Workloads that have a partition-tolerant conflict resolution strategy.
When Not To Use
- Domains requiring strict single-copy serializability across regions.
- Low-scale products where active-passive is simpler and sufficient.
- Teams without mature operational tooling for handling conflicts and replication lag.
Tradeoffs
- Improves latency and availability globally, but adds conflict resolution complexity.
- Reduces single-region dependency, with higher replication and ops cost.
- Improves regional resilience, while making correctness reasoning harder.
Common Failure Modes
- Replication lag causes divergent user views and stale policy enforcement.
- A bug in the conflict resolver corrupts canonical state.
- Region isolation causes split-brain writes with difficult reconciliation.
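Split-brain writes can only be reconciled if they are detected first. One widely used detection technique is a version vector per record, sketched below under stated assumptions: the `compare` function, its return labels, and the region names are hypothetical, and each region is assumed to increment its own counter on every local write.

```python
from typing import Dict


def compare(a: Dict[str, int], b: Dict[str, int]) -> str:
    """Compare two version vectors (region id -> count of writes seen from that region)."""
    regions = set(a) | set(b)
    # A missing entry means no writes from that region have been seen yet.
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in regions)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in regions)
    if a_ahead and b_ahead:
        return "concurrent"  # neither side has seen the other's write: a real conflict
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


# Both regions started from {"us-east": 1} and then wrote while isolated from each other.
us_copy = {"us-east": 2}
eu_copy = {"us-east": 1, "eu-west": 1}
print(compare(us_copy, eu_copy))  # concurrent
```

A `"concurrent"` result is the signal to invoke the conflict resolver; the other outcomes mean one copy strictly supersedes the other and can be applied without reconciliation.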
Interview Framing
Use this structure when the interviewer asks for this pattern explicitly.
State the conflict resolution strategy, the write routing model, the replication lag SLOs, and the semantics users see during a partition.
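To make a lag SLO concrete, it can help to show how it would be checked. The following is a minimal sketch, assuming each region reports the commit timestamp of the last write it has applied; the function name, parameters, and the 2-second threshold are hypothetical.

```python
from typing import Dict, List


def regions_over_lag_slo(
    source_commit_ts: float,
    applied_ts_by_region: Dict[str, float],
    slo_seconds: float,
) -> List[str]:
    """Return the regions whose replication lag exceeds the SLO, sorted by name."""
    return sorted(
        region
        for region, applied_ts in applied_ts_by_region.items()
        # Lag = how far the region's last applied write trails the latest commit.
        if source_commit_ts - applied_ts > slo_seconds
    )


applied = {"eu-west": 99.5, "ap-south": 94.0}
print(regions_over_lag_slo(100.0, applied, slo_seconds=2.0))  # ['ap-south']
```

In an interview, pairing a check like this with the user-visible consequence (for example, serving reads from a lagging region only when staleness is acceptable) ties the SLO back to semantics.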
Related Project Deep Dives
API Rate-Limiting as a Multi-Region Service
Design a globally consistent rate limiting service with low latency and multi-region enforcement.
Distributed Cache System
Design a distributed cache system like Redis or Memcached that handles millions of requests per second with sub-millisecond latency, high availability, and intelligent eviction policies.
Related Concepts
Leader Election
Select a single coordinator for shared work while preserving failover safety.
Quorum Consistency
Use read/write quorum sizes to balance consistency, availability, and latency in replicated stores.
Sharding Strategies
Partition data/work across shards to scale throughput and storage while controlling skew.