What Is Database Replication?
Replication is the process of copying data from one database server (the primary) to one or more other servers (replicas) in near real time. Replicas stay in sync with the primary and can serve two purposes:
- 1Read replicas — distribute read traffic to reduce load on the primary
- 2Standby replicas — take over automatically if the primary fails (high availability)
Understanding replication is essential for any application that needs to scale beyond a single database or that cannot tolerate downtime.
How PostgreSQL Streaming Replication Works
PostgreSQL uses Write-Ahead Logging (WAL). Every change to the primary is first written to the WAL, then applied to the actual data files. Replicas stream the WAL from the primary and apply the same changes:
Primary DB ──── WAL stream ────► Replica 1 (read replica)
└───► Replica 2 (hot standby / failover)Replication lag is typically under 100ms on the same network — your replica is almost always current.
Setting Up a Read Replica (PostgreSQL)
On the primary, enable WAL streaming and create a replication user:
-- In postgresql.conf
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB -- keep enough WAL for replica to catch up after downtime
-- Create a dedicated replication user
CREATE USER replicator REPLICATION LOGIN PASSWORD 'strongpassword';# In pg_hba.conf — allow replica to connect
echo "host replication replicator replica-ip/32 md5" >> pg_hba.confOn the replica, take a base backup and configure it to stream from the primary:
# Take base backup from primary
pg_basebackup -h primary-ip -U replicator -D /var/lib/postgresql/data -P -Xs -R
# The -R flag creates standby.signal and sets primary_conninfo automatically
# Start the replica
pg_ctl start -D /var/lib/postgresql/dataCheck replication status on the primary:
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
(sent_lsn - replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;Routing Reads to the Replica
Your application should direct read-only queries to the replica and writes to the primary:
# Environment variables
PRIMARY_DB_URL="postgresql://user:pass@primary-host:5432/myapp"
REPLICA_DB_URL="postgresql://user:pass@replica-host:5432/myapp"In practice, most application frameworks let you configure a read/write split. For example with a load balancer or PgBouncer, you can route based on query type automatically.
Queries safe to send to the replica:
-- Read-only reporting queries
SELECT date_trunc('day', created_at) AS day, COUNT(*) AS orders
FROM orders
WHERE created_at >= NOW() - INTERVAL '30 days'
GROUP BY day
ORDER BY day;
-- Dashboard aggregates
SELECT status, COUNT(*) FROM orders GROUP BY status;
-- User lookups for authentication (if slight lag is acceptable)
SELECT id, email, password_hash FROM users WHERE email = 'alice@example.com';Always send writes and transactions that require read-your-writes consistency to the primary.
Replication Lag: What to Monitor
-- On the replica: check how far behind it is
SELECT NOW() - pg_last_xact_replay_timestamp() AS replication_lag;
-- Alert if lag exceeds 30 seconds# Shell check for monitoring scripts
LAG=$(psql "$REPLICA_DB_URL" -t -c "SELECT EXTRACT(EPOCH FROM (NOW() - pg_last_xact_replay_timestamp()))::INT;")
if [ "$LAG" -gt 30 ]; then
echo "ALERT: replication lag ${LAG}s"
fiAutomatic Failover
A hot standby replica can be promoted to primary if the primary fails. Tools like Patroni (PostgreSQL) or ProxySQL (MySQL) automate this:
# Manual promotion — only when primary is confirmed down
pg_ctl promote -D /var/lib/postgresql/data
# or
patronictl -c /etc/patroni.yml failover myapp-cluster --master old-primary --candidate replica-1After promotion, update your application's PRIMARY_DB_URL to point to the new primary. With a connection load balancer (HAProxy, AWS RDS Proxy, or PandaStack's pooling layer) this can be transparent to your application.
MySQL Replication
MySQL uses binary logs (binlog) instead of WAL, but the concepts are identical:
-- On the primary
SHOW MASTER STATUS;
-- Note the File and Position values
-- On the replica
CHANGE MASTER TO
MASTER_HOST='primary-ip',
MASTER_USER='replicator',
MASTER_PASSWORD='strongpassword',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS= 154;
START SLAVE;
SHOW SLAVE STATUS\G
-- Check: Seconds_Behind_Master: 0 (no lag)Replication vs. Backups
Replication is not a backup. If you accidentally run DELETE FROM orders; on the primary, that change replicates to all replicas within milliseconds. You need both:
- Replication for high availability and read scale
- Backups for point-in-time recovery from data loss
See our [Database Backup Strategies](/blog/database-backup-strategies) guide for details.
Managed Replication on PandaStack
[PandaStack](https://dashboard.pandastack.io) manages replication, failover, and connection routing for PostgreSQL and MySQL automatically. Replica promotion, lag monitoring, and connection pooling are handled by the platform — you get a single connection endpoint that always points to the current primary.
All database types (PostgreSQL, MySQL, Redis, MongoDB) are available from the Databases panel. Docs: [docs.pandastack.io](https://docs.pandastack.io).
Summary
| Concept | Purpose |
|---|---|
| Streaming replication | Copy WAL from primary to replicas in real time |
| Read replica | Offload SELECT queries to reduce primary load |
| Hot standby | Replica ready to promote on primary failure |
| Replication lag | Monitor — lag > 30s is a warning sign |
| Failover | Promote replica to primary when primary fails |