data-pipelines, streaming, schema, replication
Change Data Capture (CDC)
Stream database changes from logs to downstream systems with low-latency propagation.
Definition
CDC reads a database's transaction log (for example the PostgreSQL WAL, MySQL binlog, or MongoDB oplog) and emits ordered change events for consumers.
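As a rough illustration of what "ordered change events" can look like, the sketch below models an event and replays a short stream into an in-memory replica. The `ChangeEvent` fields (`position`, `op`, `table`, `key`, `after`) and the `apply_event` helper are hypothetical names chosen for this example, not the format of any particular connector.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ChangeEvent:
    # All field names are assumptions made for this sketch.
    position: int                 # log position (e.g. an LSN or offset)
    op: str                       # "insert", "update", or "delete"
    table: str
    key: Any                      # primary key of the changed row
    after: Optional[dict] = None  # row image after the change; None for deletes


def apply_event(replica: dict, event: ChangeEvent) -> None:
    """Apply one change event to a replica keyed by (table, key)."""
    slot = (event.table, event.key)
    if event.op == "delete":
        replica.pop(slot, None)
    elif event.op in ("insert", "update"):
        replica[slot] = dict(event.after or {})
    else:
        raise ValueError(f"unknown op: {event.op}")


events = [
    ChangeEvent(1, "insert", "users", 7, {"name": "Ada"}),
    ChangeEvent(2, "update", "users", 7, {"name": "Ada L."}),
    ChangeEvent(3, "insert", "users", 8, {"name": "Bob"}),
    ChangeEvent(4, "delete", "users", 8),
]

replica = {}
# Replaying in log-position order reproduces the source's final state.
for ev in sorted(events, key=lambda e: e.position):
    apply_event(replica, ev)
```

After the replay, the replica holds only the latest image of user 7, since user 8 was inserted and then deleted.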
When To Use
- Need near-real-time sync from OLTP systems to caches/search/analytics.
- Decoupling read models from transactional databases.
- Integration pipelines where direct DB polling is too costly or stale.
When Not To Use
- Tiny systems where periodic batch export is sufficient.
- Datastores without a reliable log-based capture mechanism.
- Teams without a schema registry or governance process for downstream contracts.
Tradeoffs
- Low-latency propagation, but adds operational complexity and backfill logic.
- Reduces heavy query polling, while introducing ordering and dedup concerns.
- Enables broad fanout, with stronger schema/version management burden.
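The ordering and dedup concerns above are often handled on the consumer side. A minimal sketch, assuming each event carries a monotonically increasing log position per key: the consumer remembers the highest position applied for each key and skips anything at or below it. `IdempotentApplier` and its method names are hypothetical.

```python
class IdempotentApplier:
    """Skips duplicate or stale deliveries using a per-key high-water mark."""

    def __init__(self):
        self.state = {}        # key -> latest value
        self.applied_pos = {}  # key -> highest log position applied so far

    def handle(self, key, position, value) -> bool:
        """Return True if the event was applied, False if it was skipped."""
        if position <= self.applied_pos.get(key, -1):
            return False  # duplicate redelivery or out-of-order stale event
        self.applied_pos[key] = position
        if value is None:  # None stands in for a delete in this sketch
            self.state.pop(key, None)
        else:
            self.state[key] = value
        return True


applier = IdempotentApplier()
# The last two deliveries simulate an at-least-once redelivery and a stale replay.
deliveries = [("k", 1, "a"), ("k", 2, "b"), ("k", 2, "b"), ("k", 1, "a")]
results = [applier.handle(*d) for d in deliveries]
```

In a real consumer the high-water mark would need to be persisted atomically with the state it protects; this sketch keeps both in memory only.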
Common Failure Modes
- Connector lag grows and downstream staleness exceeds SLO.
- Loss of a replication slot or connector offset forces an expensive re-snapshot.
- Schema drift breaks consumers during production deploys.
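Two of these failure modes, lag and schema drift, can be caught with simple per-event checks. The sketch below assumes each event exposes a source commit timestamp (`commit_ts`) and a row payload (`row`); both names, along with `check_event` itself, are hypothetical and not taken from any specific connector.

```python
def check_event(event: dict, required_fields: set, now_ts: float, slo_seconds: float) -> list:
    """Return a list of human-readable problems found for one change event."""
    problems = []

    # Lag: time between the commit at the source and the moment we observe it.
    lag = now_ts - event["commit_ts"]
    if lag > slo_seconds:
        problems.append(f"lag {lag:.0f}s exceeds SLO {slo_seconds:.0f}s")

    # Schema drift: fields the consumer depends on that the event no longer carries.
    missing = required_fields - set(event["row"])
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")

    return problems


event = {"commit_ts": 100.0, "row": {"id": 1, "email": "a@example.com"}}
problems = check_event(event, {"id", "name"}, now_ts=190.0, slo_seconds=60.0)
```

Here the event is 90 seconds old against a 60-second SLO and lacks the `name` field the consumer expects, so both problems are reported. A production setup would more likely alert on aggregated lag metrics and validate schemas before deploy rather than per event.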
Interview Framing
Use this structure when the interviewer asks for this pattern explicitly.
Cover ordering guarantees, snapshot strategy, lag monitoring, and consumer idempotency requirements.
Related Project Deep Dives
Change Data Capture (CDC) Pipeline
Design a system that captures database changes in real-time and streams them to downstream systems with schema evolution support, exactly-once delivery, and multi-database compatibility.
Schema Evolution & Compatibility Platform
Design a platform that manages schema changes across services with compatibility guarantees, automated migration workflows, and data contract enforcement at scale.
Related Concepts
Transactional Outbox Pattern
Atomically persist business state and event records in one DB transaction, then publish asynchronously.
Exactly-Once Processing (Practical)
Achieve effective exactly-once outcomes via idempotency, transactions, and dedup rather than magic guarantees.
Event Sourcing
Persist state as an append-only event log and rebuild current state by replay.