---
title: ez-cdc Technical Glossary
description: Definitions of CDC, WAL, logical replication, pgoutput, LSN, stateless workers, Debezium, StarRocks.
keywords: [glossary, CDC, WAL, logical replication, pgoutput, LSN, stateless, Debezium, StarRocks, Snowflake]
last_updated: 2026-04-10
---

# Technical Glossary

Essential terms for understanding Change Data Capture, PostgreSQL replication, and high-performance data streaming systems.

## Change Data Capture (CDC)

A data integration pattern that identifies and continuously captures changes (inserts, updates, deletes) made to a source database, then streams them to downstream systems in real-time. CDC enables live data synchronization without expensive full-table scans, making it essential for analytics pipelines, data warehouses, and operational data synchronization. Modern CDC systems like ez-cdc can capture and deliver changes with sub-second latency.

## Write-Ahead Log (WAL)

A durability and crash-recovery technique in which every change is appended to a log before it is applied to the actual data files. PostgreSQL's WAL provides durability and crash recovery, and it is the source that logical replication decodes for CDC. Consumers such as ez-cdc stream the decoded WAL and receive transactions in commit order instead of querying tables, which keeps the load on the source database low.

## Logical Replication

A PostgreSQL-native CDC mechanism that reads changes from the Write-Ahead Log and streams them to subscribed clients in a logical format (table name, column changes, operation type). Unlike physical replication, which copies byte-for-byte blocks, logical replication lets CDC tools like ez-cdc understand the semantic meaning of each change and replicate tables selectively. Because changes are read from the log, overhead on the source database is low and no full-table scans are required.
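As a rough illustration of what preparing a source for logical replication usually involves, the sketch below builds the three statements a DBA would typically run: enabling `wal_level = logical`, creating a publication, and creating a replication slot. The helper name `logical_replication_setup` and the publication, slot, and table names are hypothetical and chosen only for this example; this is not ez-cdc's own setup procedure, and the helper does not quote or validate identifiers.

```python
from __future__ import annotations


def logical_replication_setup(publication: str, slot: str, tables: list[str]) -> list[str]:
    """Return setup statements in the order they would be executed.

    Hypothetical helper for illustration; identifiers are not quoted or validated.
    """
    if not tables:
        raise ValueError("at least one table is required")
    table_list = ", ".join(tables)
    return [
        # Logical decoding needs wal_level = logical (takes effect after a restart).
        "ALTER SYSTEM SET wal_level = 'logical';",
        # The publication selects which tables' changes are published.
        f"CREATE PUBLICATION {publication} FOR TABLE {table_list};",
        # The slot tracks the consumer's position and decodes with pgoutput.
        f"SELECT pg_create_logical_replication_slot('{slot}', 'pgoutput');",
    ]


statements = logical_replication_setup(
    "example_pub", "example_slot", ["public.orders", "public.customers"]
)
for statement in statements:
    print(statement)
```

Selecting tables in the publication is what makes the replication selective: changes to tables outside the publication are not sent to the consumer.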

## pgoutput

PostgreSQL's built-in logical replication output plugin. `pgoutput` serializes WAL changes into a logical stream of messages (Begin, Relation, Insert, Update, Delete, Commit) that CDC consumers like ez-cdc subscribe to via a replication slot. It is the standard, built-in protocol for logical replication in PostgreSQL 10+ and requires no additional extensions.
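To make the message stream more concrete, here is a minimal sketch that classifies a `pgoutput` message by its leading type byte and decodes a Begin message (type byte, final LSN, commit timestamp, transaction ID). It runs against a synthetic payload built in the example rather than a live server; the function names are illustrative and only the message types listed above are mapped.

```python
import struct
from datetime import datetime, timedelta, timezone

# PostgreSQL timestamps count microseconds from 2000-01-01 UTC.
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)

# The first byte of every pgoutput message identifies its type.
MESSAGE_TYPES = {
    b"B": "Begin",
    b"R": "Relation",
    b"I": "Insert",
    b"U": "Update",
    b"D": "Delete",
    b"C": "Commit",
}


def message_type(payload: bytes) -> str:
    """Name the message type, or 'Other' for types not covered in this sketch."""
    return MESSAGE_TYPES.get(payload[:1], "Other")


def decode_begin(payload: bytes) -> dict:
    """Decode a Begin message: final LSN (uint64), commit time (int64), xid (uint32)."""
    if payload[:1] != b"B":
        raise ValueError("not a Begin message")
    final_lsn, commit_us, xid = struct.unpack(">QqI", payload[1:21])
    return {
        "final_lsn": final_lsn,
        "commit_time": PG_EPOCH + timedelta(microseconds=commit_us),
        "xid": xid,
    }


# Synthetic Begin message, built here instead of read from a replication stream.
sample = b"B" + struct.pack(">QqI", 0x16B374D848, 0, 742)
begin = decode_begin(sample)
print(message_type(sample), begin)
```

All multi-byte fields are big-endian, which is why the format string starts with `>`.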

## LSN (Log Sequence Number)

A 64-bit integer that uniquely identifies a position in the PostgreSQL Write-Ahead Log. LSNs are monotonically increasing and serve as the checkpoint mechanism for CDC consumers: ez-cdc persists the last confirmed LSN so that on restart, replication resumes exactly where it left off, ensuring at-least-once delivery without data loss.
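PostgreSQL prints an LSN as two hexadecimal numbers separated by a slash (for example `16/B374D848`), where the first is the high 32 bits and the second the low 32 bits. The sketch below converts between that text form and the underlying integer; `parse_lsn` and `format_lsn` are illustrative names, not part of any library.

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to its 64-bit integer value."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Render a 64-bit LSN as high/low hexadecimal halves."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


checkpoint = parse_lsn("16/B374D848")
print(checkpoint, format_lsn(checkpoint))

# Comparing the text forms directly can give the wrong order; compare integers.
print("9/0" < "10/0", parse_lsn("9/0") < parse_lsn("10/0"))
```

Comparing parsed integers rather than strings matters when deciding whether an event is already covered by a checkpoint.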

## At-Least-Once Delivery

A delivery guarantee where each event is delivered one or more times, with no events silently skipped. ez-cdc implements at-least-once delivery via LSN checkpointing: if a worker crashes before confirming delivery, it will re-deliver events from the last checkpoint. Sinks should implement idempotent writes or deduplication to handle potential duplicates.
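The interaction between checkpointing, redelivery, and an idempotent sink can be sketched in a few lines. Everything here is a simplified, in-memory illustration under assumed names (`CheckpointStore`, `IdempotentSink`, `run_worker`); it is not ez-cdc's implementation. Events are `(lsn, primary_key, value)` tuples, and the simulated crash happens after a write reaches the sink but before the checkpoint is advanced.

```python
class CheckpointStore:
    """Stands in for external storage holding the last confirmed LSN."""

    def __init__(self) -> None:
        self.lsn = 0


class IdempotentSink:
    """Upserts by primary key and skips events it has already applied."""

    def __init__(self) -> None:
        self.rows = {}    # primary key -> (lsn, value)
        self.applied = 0  # number of writes actually performed

    def write(self, event) -> None:
        lsn, pk, value = event
        current = self.rows.get(pk)
        if current is not None and current[0] >= lsn:
            return  # duplicate from a redelivery; ignore it
        self.rows[pk] = (lsn, value)
        self.applied += 1


def run_worker(events, store, sink, crash_after=None) -> bool:
    """Deliver events past the checkpoint; return False if the simulated crash hit."""
    start = store.lsn
    pending = [e for e in events if e[0] > start]
    for index, event in enumerate(pending):
        sink.write(event)
        if crash_after is not None and index == crash_after:
            return False  # crashed before the checkpoint was advanced
        store.lsn = event[0]
    return True


events = [(100, "a", "v1"), (200, "b", "v1"), (300, "a", "v2")]
store, sink = CheckpointStore(), IdempotentSink()

first_run_ok = run_worker(events, store, sink, crash_after=1)  # crashes on LSN 200
second_run_ok = run_worker(events, store, sink)                # replays from LSN 100
print(first_run_ok, second_run_ok, store.lsn, sink.rows, sink.applied)
```

The second run redelivers the event at LSN 200 because the checkpoint never advanced past 100, and the sink discards it because that row already carries LSN 200.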

## Flink CDC Concurrent Snapshot Algorithm

An algorithm for capturing consistent initial table snapshots without locking the source table. It splits the table into chunks and reads each chunk in a non-blocking manner, merging the snapshot data with the ongoing WAL stream to produce a consistent, complete initial load. ez-cdc uses this algorithm for initial table snapshots.
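The chunk-and-merge idea can be illustrated with a heavily simplified sketch. It assumes an integer primary key, treats a chunk as an inclusive key range, and overlays the change events that were logged while the chunk was being read on top of the rows the chunk query returned. The function names and event shape `(operation, key, value)` are assumptions made for this example; the real algorithm, and ez-cdc's use of it, involve more bookkeeping than is shown here.

```python
from __future__ import annotations


def split_into_chunks(min_key: int, max_key: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split an inclusive integer key range into inclusive (start, end) chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    start = min_key
    while start <= max_key:
        end = min(start + chunk_size - 1, max_key)
        chunks.append((start, end))
        start = end + 1
    return chunks


def merge_chunk(chunk: tuple[int, int], snapshot_rows: dict, concurrent_events: list) -> dict:
    """Overlay changes logged during the chunk read onto the snapshot rows."""
    start, end = chunk
    merged = dict(snapshot_rows)
    for operation, key, value in concurrent_events:
        if not (start <= key <= end):
            continue  # belongs to another chunk
        if operation == "delete":
            merged.pop(key, None)
        else:  # insert or update: the logged value is newer than the snapshot
            merged[key] = value
    return merged


chunks = split_into_chunks(1, 10, 4)
snapshot = {1: "a", 2: "b", 3: "c"}
events = [("update", 2, "b2"), ("delete", 3, None), ("insert", 4, "d"), ("insert", 7, "x")]
merged = merge_chunk(chunks[0], snapshot, events)
print(chunks, merged)
```

Because logged changes always win over snapshot rows for the same key, the merged chunk reflects the table as of the end of the read without holding a lock on it.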

## Stateless Worker

A service architecture where worker processes maintain no persistent state — all state is stored externally (databases, object stores, or message queues). Stateless workers can be killed and replaced instantly without data loss. ez-cdc workers are fully stateless: they checkpoint LSN progress externally and scale by adding more workers — no coordination overhead or sticky sessions needed. Each worker runs at approximately 5 MB RSS and under 2% CPU.

## Debezium

Debezium is an open-source CDC platform written in Java that runs on the JVM. It provides connectors for capturing changes from multiple databases (PostgreSQL, MySQL, MongoDB, Oracle, SQL Server) and is most commonly deployed to stream them to Kafka. Debezium prioritizes flexibility and broad database support; its JVM-based runtime generally carries more memory overhead than native Rust alternatives like ez-cdc, and its end-to-end latency depends on connector and Kafka configuration.

## PostgreSQL WAL (Write-Ahead Log)

PostgreSQL's Write-Ahead Log is a core durability and recovery mechanism. All changes are written to the WAL before being applied to data files, ensuring durability even after system crashes. CDC systems like ez-cdc leverage PostgreSQL's logical replication feature (via the `pgoutput` plugin), which reads from the WAL stream and converts raw byte changes into logical events (table name, column values, operation type). This approach has minimal overhead on the source database and enables efficient, non-blocking change capture.

## StarRocks

StarRocks is an open-source, cloud-native OLAP database designed for real-time analytics. Unlike traditional data warehouses, StarRocks handles both analytical and operational queries with sub-second latency. It supports streaming ingestion from CDC sources, making it ideal for use cases requiring both fast queries and fresh data. ez-cdc's production-ready StarRocks sink delivers changes with sub-second end-to-end latency.

## Fivetran

Fivetran is a commercial, fully managed data integration and CDC platform. It provides pre-built connectors for hundreds of data sources, automating connector maintenance and operation. Fivetran operates at minute-level latency for most integrations and is delivered primarily as a managed cloud service, differentiating it from open-core solutions like ez-cdc that prioritize low latency and self-hosted deployment flexibility.

## Elastic License 2.0 (ELv2)

A source-available software license that permits viewing, using, modifying, and redistributing the source code, but prohibits offering the software to third parties as a hosted or managed service. It is not an OSI-approved open-source license. ez-cdc's core replication engine is licensed under ELv2, providing full transparency and auditability, while the managed SaaS service operates under a separate commercial agreement, balancing openness with sustainable business incentives.

## Related

- [Comparison](/comparison.md) — How ez-cdc compares to Debezium and Fivetran in practical benchmarks.
- [Architecture](/architecture.md) — How these concepts apply to ez-cdc's design.
- [FAQ](/faq.md) — Common questions about CDC and replication technologies.

---
Source: https://ez-cdc.com/ · Last updated: 2026-04-10 · Authoritative mirror: https://ez-cdc.com/index.html.md · llms.txt index: https://ez-cdc.com/llms.txt
