/
Data engineering

PostgreSQL Logical Decoding Plugins: A Developer's Guide

Discover PostgreSQL logical decoding plugins for building efficient CDC pipelines: explore built-in options like pgoutput, third-party plugins like wal2json, examples, and implementation tips.

PostgreSQL Logical Decoding Plugins: A Developer's Guide

When building change data capture (CDC) pipelines with PostgreSQL, selecting the right output plugin determines how database changes get formatted and delivered—whether you're replicating to another PostgreSQL instance, streaming to Kafka, or delivering changes to a webhook. Understanding these plugins is crucial for implementing efficient real-time data synchronization across systems.

This guide examines the different PostgreSQL logical decoding output plugins, their technical characteristics, and practical implementation considerations for modern data architectures.

A Simple Example: See an Update Flow End-to-End

To understand how different output plugins work, let's start with a concrete scenario. We'll set up logical replication, make a change, and see how different plugins format that change.

Setting Up Logical Replication

First, enable logical replication in PostgreSQL by checking your WAL level:

SHOW wal_level;

If it's not set to 'logical', configure logical replication for your PostgreSQL provider (AWS RDS, Google Cloud SQL, Azure).

Create a Table and Replication Slot

-- 1. Create a test table
CREATE TABLE users (
   id SERIAL PRIMARY KEY,
   name VARCHAR(255) NOT NULL,
   email VARCHAR(255) NOT NULL,
   created_at TIMESTAMP DEFAULT NOW()
);

-- 2. Set replica identity for complete change tracking
ALTER TABLE users REPLICA IDENTITY FULL;

-- 3. Insert initial data
INSERT INTO users (name, email) VALUES ('John Doe', 'john@example.com');

-- 4. Create a publication
CREATE PUBLICATION user_changes FOR ALL TABLES;

-- 5. Create logical replication slot with test_decoding plugin
SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');

Make the Change We're Tracking

-- The change we'll observe across different plugins
UPDATE users SET name = 'John Smith' WHERE id = 1;

Reading Changes from the Replication Slot

-- View the formatted output
SELECT lsn, xid, data FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL);

This simple update produces dramatically different output depending on your chosen plugin—a critical factor for automated data sync between applications and integration complexity.

How Changes Flow from Table to Output Plugin

Understanding the data flow helps optimize your real-time data synchronization architecture:

1. Write-Ahead Log (WAL) Storage

PostgreSQL's WAL stores low-level binary records, not human-readable messages:

# Simplified WAL record structure
WAL Record: LSN 0/1A2B3C4
- Relation OID: 16384 (internal table identifier)  
- Transaction ID: 12345
- Operation: UPDATE
- Block/offset: physical storage location
- Old tuple: [binary data for old row]
- New tuple: [binary data for new row]

The WAL contains only internal identifiers and binary data—no table names, column names, or readable values.

2. Logical Decoding Process

This architecture allows PostgreSQL to support many output formats without changing the underlying WAL format or storing duplicate information. The core database only needs to log changes once in the WAL, and then any number of output plugins can decode those logs and present the data in JSON, SQL, binary, etc., as needed. [1]

The decoding process:

  1. Reads WAL records sequentially from slot position (LSN)
  2. Resolves internal identifiers using system catalogs
  3. Transforms binary data into logical representation
  4. Assembles complete transactions in commit order
  5. Passes structured data to the output plugin

3. Plugin-Specific Formatting

Every logical decoding plugin receives the same core information about the change; what differs is how they output it. The test_decoding plugin formats this as human-readable text, wal2json converts it to JSON, and pgoutput encodes it in PostgreSQL's binary logical replication protocol. [1]

Each plugin receives identical decoded information:

  • Table name: public.users
  • Operation: UPDATE
  • New values: {id: 1, name: "John Smith", email: "john@example.com"}
  • Old values: {name: "John Doe"}

This standardized input enables consistent behavior across different output formats while supporting diverse integration requirements.

Built-in Output Plugins

PostgreSQL ships with two logical decoding plugins out of the box. These don't require any additional installations—they're ready to use on any Postgres 10+ server. [1]

pgoutput

pgoutput is PostgreSQL's default plugin for logical replication. If you're using the built-in publish/subscribe system, you're already using this plugin behind the scenes. [1]

Sample output (conceptual representation):

BEGIN LSN: 0/1A2B3C4
TABLE: public.users
UPDATE: id[integer]=1 name[text]='John Smith' (old: 'John Doe') email[text]='john@example.com'
COMMIT LSN: 0/1A2B3C4

The actual output uses a binary protocol requiring specialized parsing tools.

Technical characteristics:

  • ✅ Binary format provides efficiency and compactness
  • ✅ Handles complex PostgreSQL data types without information loss
  • ✅ High performance with incremental streaming
  • ✅ Production-ready, universally supported on managed services
  • ❌ Requires specialized tools for consumption (can't use SQL functions directly)
  • ❌ Binary protocol isn't human-readable for debugging

test_decoding

PostgreSQL's example plugin, primarily useful for understanding logical decoding mechanics and debugging.

Sample output:

BEGIN 12345
table public.users: UPDATE: id[integer]:1 name[text]:'John Smith' email[text]:'john@example.com'
COMMIT 12345

Technical characteristics:

  • ✅ Human-readable text format for easy interpretation
  • ✅ Excellent for learning logical decoding concepts
  • ✅ Included with PostgreSQL by default
  • ✅ Useful for debugging replication issues
  • ❌ Output format not designed for production parsing
  • ❌ Limited functionality compared to specialized plugins
  • ❌ No advanced features like filtering or sophisticated type handling

Popular Third-Party Plugins

wal2json

The wal2json extension allows streaming all changes in a database to a consumer, formatted as JSON. [2]

Sample output:

{
 "change": [{
   "kind": "update",
   "schema": "public",
   "table": "users",
   "columnnames": ["id", "name", "email", "created_at"],
   "columnvalues": [1, "John Smith", "john@example.com", "2024-01-15T10:30:00"],
   "oldkeys": {
     "keynames": ["id"],
     "keyvalues": [1]
   },
   "oldvalues": [1, "John Doe", "john@example.com", "2024-01-15T10:30:00"]
 }]
}

Technical characteristics:

  • ✅ Easy parsing in any programming language
  • ✅ Human-readable format for development and debugging
  • ✅ Supported on most managed services (Cloud SQL supports pgoutput, test_decoding, and wal2json) [2]
  • ❌ Higher overhead than binary formats
  • ❌ Performance degradation with very large transactions

decoderbufs

Uses Protocol Buffers for efficient binary serialization, targeting high-throughput scenarios.

Sample output (conceptual protobuf structure):

RowMessage {
 transaction_id: 12345
 table: "public.users"
 op: UPDATE
 new_tuple: {
   columns: [
     {name: "id", type: INTEGER, value: 1},
     {name: "name", type: TEXT, value: "John Smith"},
     {name: "email", type: TEXT, value: "john@example.com"}
   ]
 }
 old_tuple: {
   columns: [
     {name: "name", type: TEXT, value: "John Doe"}
   ]
 }
}

Technical characteristics:

  • ✅ Very efficient binary format
  • ✅ Schema-defined structure with strong typing
  • ✅ Excellent for high-throughput scenarios
  • ❌ Requires compilation and installation
  • ❌ Not available on most managed services
  • ❌ Added complexity for development teams

Specialized Plugins

  • wal2mongo: Outputs MongoDB-compatible JSON operations for direct MongoDB replication
  • decoder_raw: Generates raw SQL statements executable on target databases

Choosing the Right Plugin

Your choice depends on environment constraints and performance requirements:

Environment Constraints

Managed Services (AWS RDS, Google Cloud SQL, Azure): Choose pgoutput for Postgres-to-Postgres replication or wal2json for external integrations. [1]

Self-hosted environments: Self-hosted: You have full flexibility. Consider decoderbufs for high-performance scenarios or stick with pgoutput for simplicity. [1]

Performance Requirements

For high-volume real-time data synchronization scenarios:

  1. pgoutput: Best overall performance for most use cases
  2. decoderbufs: Optimal for high-throughput non-PostgreSQL targets (self-hosted only)
  3. wal2json: Convenient but potential bottleneck at scale

Modern Alternative: Purpose-Built Synchronization Platforms

While logical decoding plugins provide the foundation for change data capture, building production-ready consumers requires significant engineering investment. Organizations implementing automated data sync between applications increasingly choose purpose-built platforms that eliminate this complexity.

Traditional logical decoding implementation challenges:

  • Custom consumer development and maintenance
  • API rate limiting and error handling
  • Infrastructure scaling and monitoring
  • Security and compliance implementation
  • Bi-directional sync complexity and conflict resolution

Modern data synchronization platforms like Stacksync address these challenges by providing:

  • Real-time, bi-directional data synchronization that propagates changes between connected systems in milliseconds. When a record is updated in your CRM, the change is reflected quickly in your production database, and vice-versa. This helps ensure that all systems share a single, consistent state, reducing data integrity issues.
  • Enterprise connector ecosystem supporting 200+ systems
  • Built-in conflict resolution and error handling
  • No-code configuration with enterprise security compliance

This eliminates the need for custom logical decoding consumer development while providing comprehensive bi-directional sync capabilities across CRMs, ERPs, and databases.

Output Plugin vs. Final Format

The output plugin format isn't necessarily what your application consumes. For example:

  • CDC tools can use pgoutput to receive binary data from PostgreSQL
  • Convert it to JSON for delivery to downstream systems
  • Your application sees JSON format, not the pgoutput format
  • Modern synchronization platforms handle this transformation automatically

This architecture separation allows PostgreSQL to maintain efficient internal formats while supporting diverse integration requirements through automated data sync between applications.

Ready to move beyond custom logical decoding implementations? Modern platforms like Stacksync provide enterprise-grade real-time data synchronization without the complexity of building and maintaining custom logical decoding consumers. Experience bi-directional sync across 200+ systems with no-code configuration and automated reliability.