Translate into your own language

Tuesday, January 13, 2026

Step-by-Step : PostgreSQL Logical Replication Setup

From prod-1, I need to replicate only two tables — employees and departments — to prod-2 using PostgreSQL logical replication.

Prerequisites:

  • PostgreSQL version 10 or higher on both servers
  • Network connectivity between prod-1 and prod-2
  • wal_level must be set to logical on prod-1
  • Superuser or replication privileges are available


1. Configure Source Server (prod-1)

1.1. Ensure wal_level is logical

Edit postgresql.conf on prod-1:

        wal_level = logical

        max_replication_slots = 10

        max_wal_senders = 10

        Reload the cluster -

        SELECT pg_reload_conf();


1.2. Allow prod-2 to connect to connect prod-1

In prod-1 pg_hba.conf:

    host    replication    replicator    <PROD-2 IP>/32    md5

    host    all            replicator    <PROD-2 IP>/32    md5

    Reload the cluster:

    SELECT pg_reload_conf();


1.3. Create a replication user

On prod-1:

    CREATE USER replicator WITH REPLICATION LOGIN PASSWORD 'StrongPassword';


1.4. Grant SELECT on tables to replication user

    GRANT SELECT ON TABLE public.employees TO replicator;

    GRANT SELECT ON TABLE public.departments TO replicator;

    Also grant usage on schema:

    GRANT USAGE ON SCHEMA public TO replicator;

2. Create Publication (prod-1)

Note - Replicate only employees and departments:

CREATE PUBLICATION pub_hr FOR TABLE employees, departments;

Verify:

    \dRp+ pub_hr

3. Ensure Tables Have Primary Keys

Logical replication requires PK or unique index.

On prod-1:

\d employees

\d departments

If missing:

ALTER TABLE employees ADD PRIMARY KEY (id);

ALTER TABLE departments ADD PRIMARY KEY (dept_id);

4. Prepare Target Server (PROD-2)

4.1. Create same tables on analytics-prod

Tables must exist with same structure (except constraints/triggers).

Either:

• create manually, or

• use pg_dump -s:

pg_dump -s -t employees -t departments prod1_db > hr_schema.sql

psql analytics_db < hr_schema.sql


5. Create Subscription on PROD-2

CREATE SUBSCRIPTION sub_hr CONNECTION 'host=<PROD-1-IP> port=5432 dbname=prod1_db user=replicator password=StrongPassword' PUBLICATION pub_hr;

This will:

• copy initial data

• begin replication

6. Check Replication Status

On prod-2:

SELECT * FROM pg_subscription;

SELECT * FROM pg_stat_subscription;

On prod-1:

SELECT * FROM pg_stat_replication;

7. What Is Replicated Automatically?

Object Replicated?

INSERT Yes

UPDATE Yes

DELETE Yes

8. To Add More Tables Later

On prod-1:

ALTER PUBLICATION pub_hr ADD TABLE some_table;

On prod-2:

ALTER SUBSCRIPTION sub_hr REFRESH PUBLICATION;

9. To Remove the Subscription

On analytics-prod:

DROP SUBSCRIPTION sub_hr;

On prod-1:

DROP PUBLICATION pub_hr;







Script: Automated Reindexing of Indexes with Bloat > 30%

 #!/bin/bash

set -euo pipefail


# ================= CONFIG =================

PGDATABASE="<dbname>"

LOGFILE="/tmp/reindex_concurrently_$(date +%F_%H%M).log"


# DRY RUN mode:

# true  = only print commands (no execution)

# false = execute REINDEX commands

DRY_RUN=true


# Login command (runs psql as postgres user)

psql="psql -XqtA -d $PGDATABASE"

# ==========================================


echo "===== REINDEX CONCURRENTLY STARTED: $(date) =====" | tee -a "$LOGFILE"

echo "DRY_RUN = $DRY_RUN" | tee -a "$LOGFILE"


$psql <<'EOF' |

WITH btree_index_atts AS (

    SELECT

        nspname,

        indexclass.relname AS index_name,

        indexclass.reltuples,

        indexclass.relpages,

        indrelid,

        indexrelid,

        indexclass.relam,

        tableclass.relname AS tablename,

        regexp_split_to_table(indkey::text, ' ')::smallint AS attnum,

        indexrelid AS index_oid

    FROM pg_index

    JOIN pg_class AS indexclass ON pg_index.indexrelid = indexclass.oid

    JOIN pg_class AS tableclass ON pg_index.indrelid = tableclass.oid

    JOIN pg_namespace ON pg_namespace.oid = indexclass.relnamespace

    JOIN pg_am ON indexclass.relam = pg_am.oid

    WHERE pg_am.amname = 'btree'

      AND indexclass.relpages > 0

      AND nspname NOT IN ('pg_catalog','information_schema')

),

index_item_sizes AS (

    SELECT

        ind_atts.nspname,

        ind_atts.index_name,

        ind_atts.reltuples,

        ind_atts.relpages,

        ind_atts.relam,

        indrelid AS table_oid,

        index_oid,

        current_setting('block_size')::numeric AS bs,

        8 AS maxalign,

        24 AS pagehdr,

        CASE

            WHEN max(coalesce(pg_stats.null_frac,0)) = 0 THEN 2

            ELSE 6

        END AS index_tuple_hdr,

        SUM((1 - coalesce(pg_stats.null_frac, 0))

            * coalesce(pg_stats.avg_width, 1024)) AS nulldatawidth

    FROM pg_attribute

    JOIN btree_index_atts AS ind_atts

        ON pg_attribute.attrelid = ind_atts.indexrelid

       AND pg_attribute.attnum = ind_atts.attnum

    JOIN pg_stats

        ON pg_stats.schemaname = ind_atts.nspname

       AND (

            (pg_stats.tablename = ind_atts.tablename

             AND pg_stats.attname = pg_catalog.pg_get_indexdef(

                 pg_attribute.attrelid, pg_attribute.attnum, TRUE))

            OR

            (pg_stats.tablename = ind_atts.index_name

             AND pg_stats.attname = pg_attribute.attname)

        )

    WHERE pg_attribute.attnum > 0

    GROUP BY 1,2,3,4,5,6,7,8,9

),

index_aligned_est AS (

    SELECT

        maxalign, bs, nspname, index_name, reltuples,

        relpages, relam, table_oid, index_oid,

        COALESCE (

            CEIL (

                reltuples * (

                    6 + maxalign

                    - CASE

                        WHEN index_tuple_hdr % maxalign = 0 THEN maxalign

                        ELSE index_tuple_hdr % maxalign

                      END

                    + nulldatawidth

                    + maxalign

                    - CASE

                        WHEN nulldatawidth::int % maxalign = 0 THEN maxalign

                        ELSE nulldatawidth::int % maxalign

                      END

                )::numeric / (bs - pagehdr::numeric) + 1

            ),

            0

        ) AS expected

    FROM index_item_sizes

),

raw_bloat AS (

    SELECT

        nspname,

        index_name,

        bs * relpages::bigint AS totalbytes,

        expected,

        CASE

            WHEN relpages <= expected THEN 0

            ELSE (relpages - expected) * 100.0 / relpages

        END AS realbloat,

        bs * GREATEST(relpages - expected,0)::bigint AS wastedbytes

    FROM index_aligned_est

),

format_bloat AS (

    SELECT

        nspname AS schema_name,

        index_name,

        ROUND(realbloat) AS bloat_pct,

        ROUND(wastedbytes / (1024^3)::numeric, 3) AS bloat_gb

    FROM raw_bloat

)

SELECT

    'REINDEX INDEX CONCURRENTLY '

    || quote_ident(schema_name) || '.'

    || quote_ident(index_name) || ';'

FROM format_bloat

WHERE bloat_pct > 90

  AND bloat_gb > 0.01

ORDER BY bloat_gb DESC;

EOF

while read -r CMD

do

    [ -z "$CMD" ] && continue


    echo "--------------------------------------------" | tee -a "$LOGFILE"

    echo "Command: $CMD" | tee -a "$LOGFILE"


    if [ "$DRY_RUN" = true ]; then

        echo "DRY RUN: Not executing (just printing)" | tee -a "$LOGFILE"

    else

        echo "Executing..." | tee -a "$LOGFILE"

        $psql -c "$CMD" >> "$LOGFILE" 2>&1

        echo "Done." | tee -a "$LOGFILE"

        sleep 30

    fi

done


echo "===== REINDEX CONCURRENTLY COMPLETED: $(date) =====" | tee -a "$LOGFILE"

echo "Log file: $LOGFILE"