When building change data capture (CDC) pipelines with PostgreSQL, selecting the right output plugin determines how database changes get formatted and delivered—whether you're replicating to another PostgreSQL instance, streaming to Kafka, or delivering changes to a webhook. Understanding these plugins is crucial for implementing efficient real-time data synchronization across systems.
This guide examines the different PostgreSQL logical decoding output plugins, their technical characteristics, and practical implementation considerations for modern data architectures.
To understand how different output plugins work, let's start with a concrete scenario. We'll set up logical replication, make a change, and see how different plugins format that change.
First, confirm that logical replication is enabled in PostgreSQL by checking your WAL level:
SHOW wal_level;
If it's not set to 'logical', configure logical replication for your PostgreSQL provider (AWS RDS, Google Cloud SQL, Azure).
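On a self-hosted server, a minimal sketch of that change looks like the following; managed providers expose the same parameter through parameter groups or instance flags instead. Note that changing `wal_level` only takes effect after a server restart:

```sql
-- Requires superuser; takes effect only after a restart
ALTER SYSTEM SET wal_level = 'logical';

-- Optional: headroom for additional slots and WAL senders
ALTER SYSTEM SET max_replication_slots = 10;
ALTER SYSTEM SET max_wal_senders = 10;

-- After restarting, confirm the setting
SHOW wal_level;
```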
```sql
-- 1. Create a test table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    email VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- 2. Set replica identity for complete change tracking
ALTER TABLE users REPLICA IDENTITY FULL;

-- 3. Insert initial data
INSERT INTO users (name, email) VALUES ('John Doe', '[email protected]');

-- 4. Create a publication
CREATE PUBLICATION user_changes FOR ALL TABLES;

-- 5. Create a logical replication slot with the test_decoding plugin
SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
```
-- The change we'll observe across different plugins
UPDATE users SET name = 'John Smith' WHERE id = 1;
-- View the formatted output
SELECT lsn, xid, data FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL);
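pg_logical_slot_peek_changes leaves the changes in the slot so they can be re-read; the get variant consumes them and advances the slot. A short cleanup sketch for when you're done experimenting with the hypothetical test_slot:

```sql
-- Consume the changes so the slot stops holding back old WAL
SELECT lsn, xid, data
FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);

-- When the experiment is over, drop the slot entirely
SELECT pg_drop_replication_slot('test_slot');
```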
This simple update produces dramatically different output depending on your chosen plugin, a choice that directly shapes integration complexity and how you automate data sync between applications.
Understanding the data flow helps optimize your real-time data synchronization architecture:
PostgreSQL's WAL stores low-level binary records, not human-readable messages:
Simplified WAL record structure:
WAL Record: LSN 0/1A2B3C4
- Relation OID: 16384 (internal table identifier)
- Transaction ID: 12345
- Operation: UPDATE
- Block/offset: physical storage location
- Old tuple: [binary data for old row]
- New tuple: [binary data for new row]
The WAL contains only internal identifiers and binary data—no table names, column names, or readable values.
This architecture allows PostgreSQL to support many output formats without changing the underlying WAL format or storing duplicate information. The core database only needs to log changes once in the WAL, and then any number of output plugins can decode those logs and present the data in JSON, SQL, binary, etc., as needed.
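You can see this one-WAL-many-formats design directly by attaching several slots, each using a different plugin, to the same database. A sketch, assuming the wal2json extension is installed (slot names are hypothetical):

```sql
-- Two slots decoding the same WAL with different plugins
SELECT pg_create_logical_replication_slot('text_slot', 'test_decoding');
SELECT pg_create_logical_replication_slot('json_slot', 'wal2json');

-- A change made after both slots exist shows up in each of them
UPDATE users SET name = 'John Smith' WHERE id = 1;

-- Same change, two formats
SELECT data FROM pg_logical_slot_peek_changes('text_slot', NULL, NULL);
SELECT data FROM pg_logical_slot_peek_changes('json_slot', NULL, NULL);
```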
The decoding process:
Every logical decoding plugin receives the same core information about the change; what differs is how they output it. The test_decoding plugin formats this as human-readable text, wal2json converts it to JSON, and pgoutput encodes it in PostgreSQL's binary logical replication protocol.
Each plugin receives identical decoded information:
- Table: public.users
- Operation: UPDATE
- New row: {id: 1, name: "John Smith", email: "[email protected]"}
- Old values: {name: "John Doe"}

This standardized input enables consistent behavior across different output formats while supporting diverse integration requirements.
PostgreSQL ships with two logical decoding plugins out of the box. These don't require any additional installations—they're ready to use on any Postgres 10+ server.
pgoutput is PostgreSQL's default plugin for logical replication. If you're using the built-in publish/subscribe system, you're already using this plugin behind the scenes.
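With native logical replication you never name the plugin explicitly; creating a subscription on a downstream PostgreSQL instance creates a slot that uses pgoutput under the hood. A minimal sketch with a hypothetical connection string, reusing the user_changes publication from earlier:

```sql
-- Run on the downstream (subscriber) database
CREATE SUBSCRIPTION user_changes_sub
    CONNECTION 'host=source-db.example.com dbname=app user=replicator password=secret'
    PUBLICATION user_changes;

-- On the source, the automatically created slot reports pgoutput as its plugin
SELECT slot_name, plugin FROM pg_replication_slots;
```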
Sample output (conceptual representation):
BEGIN LSN: 0/1A2B3C4
TABLE: public.users
UPDATE: id[integer]=1 name[text]='John Smith' (old: 'John Doe') email[text]='[email protected]'
COMMIT LSN: 0/1A2B3C4
The actual output uses a binary protocol requiring specialized parsing tools.
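You can still inspect a pgoutput stream from SQL, but only through the binary-changes functions, which return raw bytea. A sketch with a hypothetical slot name; proto_version and publication_names are the two options pgoutput requires:

```sql
-- Create a slot that uses pgoutput directly
SELECT pg_create_logical_replication_slot('pgoutput_slot', 'pgoutput');

-- Peek at the raw binary protocol messages
SELECT lsn, xid, data
FROM pg_logical_slot_peek_binary_changes(
    'pgoutput_slot', NULL, NULL,
    'proto_version', '1',
    'publication_names', 'user_changes'
);
```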
Technical characteristics:
- Built into PostgreSQL core since version 10, so no extension needs to be installed
- Emits the binary logical replication protocol rather than text
- Works with publications, so replication can be scoped to specific tables and operations
- The plugin behind CREATE SUBSCRIPTION, and supported by most managed services and CDC tools such as Debezium
test_decoding is PostgreSQL's example plugin, primarily useful for understanding logical decoding mechanics and debugging.
Sample output:
BEGIN 12345
table public.users: UPDATE: id[integer]:1 name[text]:'John Smith' email[text]:'[email protected]'
COMMIT 12345
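test_decoding accepts a few options at read time that enrich the text output; a quick sketch against the same test_slot:

```sql
-- Include transaction IDs and commit timestamps in the text output
SELECT data
FROM pg_logical_slot_peek_changes(
    'test_slot', NULL, NULL,
    'include-xids', 'on',
    'include-timestamp', 'on'
);
```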
Technical characteristics:
- Ships with PostgreSQL as a contrib module intended for testing and demonstration
- Produces line-oriented plain text that is easy to read but awkward to parse reliably
- Supports only basic options such as include-xids and include-timestamp
- Not designed for production pipelines
The wal2json extension allows streaming all changes in a database to a consumer, formatted as JSON.
Sample output:
{
"change": [{
"kind": "update",
"schema": "public",
"table": "users",
"columnnames": ["id", "name", "email", "created_at"],
"columnvalues": [1, "John Smith", "[email protected]", "2024-01-15T10:30:00"],
"oldkeys": {
"keynames": ["id"],
"keyvalues": [1]
},
"oldvalues": [1, "John Doe", "[email protected]", "2024-01-15T10:30:00"]
}]
}
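Reading from a wal2json slot (such as the hypothetical json_slot created earlier) accepts options that shape the JSON, for example restricting output to specific tables or pretty-printing it:

```sql
-- Read changes as JSON, limited to one table and pretty-printed
SELECT data
FROM pg_logical_slot_peek_changes(
    'json_slot', NULL, NULL,
    'pretty-print', '1',
    'add-tables', 'public.users',
    'include-timestamp', '1'
);
```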
Technical characteristics:
- A separately installed open-source extension, available on the major managed services (AWS RDS, Google Cloud SQL, Azure)
- Two output formats: format version 1 emits one JSON document per transaction, version 2 emits one per change
- Options for table filtering, timestamps, type information, and pretty-printing
- JSON serialization adds CPU and payload overhead, especially for large transactions
decoderbufs uses Protocol Buffers for efficient binary serialization, targeting high-throughput scenarios.
Sample output (conceptual protobuf structure):
RowMessage {
transaction_id: 12345
table: "public.users"
op: UPDATE
new_tuple: {
columns: [
{name: "id", type: INTEGER, value: 1},
{name: "name", type: TEXT, value: "John Smith"},
{name: "email", type: TEXT, value: "[email protected]"}
]
}
old_tuple: {
columns: [
{name: "name", type: TEXT, value: "John Doe"}
]
}
}
Technical characteristics:
- Emits Protocol Buffers messages instead of text or JSON
- Maintained under the Debezium project and originally built for its PostgreSQL connector
- Must be compiled and installed on the database server, so it is generally unavailable on managed services
- Consumers need the matching .proto definitions to decode the stream
Your choice depends on environment constraints and performance requirements.
| Plugin | Best Use Case | Trade-offs |
|---|---|---|
| pgoutput | PostgreSQL-to-PostgreSQL replication, high-volume CDC with managed services | Binary format requires specialized parsing tools, not human-readable |
| test_decoding | Learning logical decoding, debugging replication issues | Not production-ready, limited features, text parsing unreliable |
| wal2json | External integrations, Kafka streaming, webhook delivery | Higher overhead at scale, slower with large transactions |
| decoderbufs | High-throughput non-PostgreSQL targets, Debezium pipelines | Self-hosted only, requires protobuf compilation, added complexity |
| wal2mongo | Direct PostgreSQL to MongoDB replication | Specialized use case, limited community support |
| decoder_raw | SQL statement generation for heterogeneous databases | Niche scenarios, requires target database compatibility |
pgoutput is the default choice for managed PostgreSQL services and delivers the best performance for most CDC workloads.
Avoid test_decoding in production—it lacks filtering, type handling, and reliable parsing for automated pipelines.
Choose wal2json when integrating with non-PostgreSQL systems; use decoderbufs for self-hosted high-throughput needs.
Managed Services (AWS RDS, Google Cloud SQL, Azure): Choose pgoutput for Postgres-to-Postgres replication or wal2json for external integrations.
Self-hosted environments: You have full flexibility. Consider decoderbufs for high-performance scenarios or stick with pgoutput for simplicity.
For high-volume real-time data synchronization scenarios, slot health is the first thing to watch: an unconsumed logical slot prevents PostgreSQL from recycling WAL, so consumer lag translates directly into disk growth on the primary.
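A monitoring sketch using standard catalog functions; what threshold you alert on is up to you:

```sql
-- How much WAL is each logical slot holding back?
SELECT slot_name,
       plugin,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
       ) AS retained_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';
```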
While logical decoding plugins provide the foundation for change data capture, building production-ready consumers requires significant engineering investment. Organizations implementing automated data sync between applications often face challenges that go beyond plugin selection.
Traditional logical decoding implementation challenges:
- Managing replication slots so an idle consumer doesn't cause unbounded WAL retention
- Guaranteeing ordering and at-least-once delivery across restarts and failovers
- Handling schema changes, since logical decoding streams row changes but not DDL
- Performing initial backfills and reconciling them with the ongoing change stream
Modern data synchronization platforms address these challenges by providing real-time, bi-directional data synchronization that propagates changes between connected systems in milliseconds. When a record is updated in your CRM, the change is reflected quickly in your production database, and vice-versa. This helps ensure that all systems share a single, consistent state, reducing data integrity issues.
Ready to reduce integration complexity? Explore how Stacksync delivers real-time, bi-directional synchronization without heavy infrastructure or brittle pipelines.