Tuesday, January 13, 2026

Step-by-Step: PostgreSQL Logical Replication Setup

From prod-1, I need to replicate only two tables — employees and departments — to prod-2 using PostgreSQL logical replication.

Prerequisites:

  • PostgreSQL version 10 or higher on both servers
  • Network connectivity between prod-1 and prod-2
  • wal_level must be set to logical on prod-1
  • Superuser or replication privileges are available


1. Configure Source Server (prod-1)

1.1. Ensure wal_level is logical

Edit postgresql.conf on prod-1:

        wal_level = logical

        max_replication_slots = 10

        max_wal_senders = 10

        Restart the cluster (wal_level, max_replication_slots, and max_wal_senders all require a restart; a reload is not enough):

        sudo systemctl restart postgresql


1.2. Allow prod-2 to connect to prod-1

In prod-1 pg_hba.conf:

    host    replication    replicator    <PROD-2 IP>/32    md5

    host    all            replicator    <PROD-2 IP>/32    md5

    Reload the cluster:

    SELECT pg_reload_conf();


1.3. Create a replication user

On prod-1:

    CREATE USER replicator WITH REPLICATION LOGIN PASSWORD 'StrongPassword';


1.4. Grant SELECT on tables to replication user

    GRANT SELECT ON TABLE public.employees TO replicator;

    GRANT SELECT ON TABLE public.departments TO replicator;

    Also grant usage on schema:

    GRANT USAGE ON SCHEMA public TO replicator;

2. Create Publication (prod-1)

Note - Replicate only employees and departments:

CREATE PUBLICATION pub_hr FOR TABLE employees, departments;

Verify:

    \dRp+ pub_hr

3. Ensure Tables Have Primary Keys

Logical replication requires a primary key or unique index (replica identity) on each published table for UPDATE and DELETE to replicate.

On prod-1:

\d employees

\d departments

If missing:

ALTER TABLE employees ADD PRIMARY KEY (id);

ALTER TABLE departments ADD PRIMARY KEY (dept_id);

4. Prepare Target Server (PROD-2)

4.1. Create the same tables on prod-2

Tables must already exist on prod-2 with the same column names and types (constraints/triggers may differ).

Either:

• create manually, or

• use pg_dump -s (dump the schema on prod-1, then load it into the target database on prod-2):

pg_dump -s -t employees -t departments prod1_db > hr_schema.sql

psql analytics_db < hr_schema.sql


5. Create Subscription on PROD-2

CREATE SUBSCRIPTION sub_hr CONNECTION 'host=<PROD-1-IP> port=5432 dbname=prod1_db user=replicator password=StrongPassword' PUBLICATION pub_hr;

This will:

• copy initial data

• begin replication
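
To confirm the initial copy finished, each table's srsubstate in pg_subscription_rel should reach 'r' (ready) - a quick catalog check on prod-2:

    SELECT srrelid::regclass AS table_name, srsubstate
    FROM pg_subscription_rel;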

6. Check Replication Status

On prod-2:

SELECT * FROM pg_subscription;

SELECT * FROM pg_stat_subscription;

On prod-1:

SELECT * FROM pg_stat_replication;

7. What Is Replicated Automatically?

    Object                   Replicated?

    INSERT                   Yes

    UPDATE                   Yes

    DELETE                   Yes

    TRUNCATE                 Yes (PostgreSQL 11+)

    DDL (CREATE/ALTER)       No

8. To Add More Tables Later

On prod-1:

ALTER PUBLICATION pub_hr ADD TABLE some_table;

On prod-2:

ALTER SUBSCRIPTION sub_hr REFRESH PUBLICATION;

9. To Remove the Subscription

On prod-2:

DROP SUBSCRIPTION sub_hr;

On prod-1:

DROP PUBLICATION pub_hr;







Script: Automated Reindexing of Indexes with Bloat > 90%

#!/bin/bash

set -euo pipefail


# ================= CONFIG =================

PGDATABASE="<dbname>"

LOGFILE="/tmp/reindex_concurrently_$(date +%F_%H%M).log"


# DRY RUN mode:

# true  = only print commands (no execution)

# false = execute REINDEX commands

DRY_RUN=true


# Login command (assumes the script runs as a user that can connect to $PGDATABASE, e.g. postgres)

psql="psql -XqtA -d $PGDATABASE"

# ==========================================


echo "===== REINDEX CONCURRENTLY STARTED: $(date) =====" | tee -a "$LOGFILE"

echo "DRY_RUN = $DRY_RUN" | tee -a "$LOGFILE"


$psql <<'EOF' |

WITH btree_index_atts AS (

    SELECT

        nspname,

        indexclass.relname AS index_name,

        indexclass.reltuples,

        indexclass.relpages,

        indrelid,

        indexrelid,

        indexclass.relam,

        tableclass.relname AS tablename,

        regexp_split_to_table(indkey::text, ' ')::smallint AS attnum,

        indexrelid AS index_oid

    FROM pg_index

    JOIN pg_class AS indexclass ON pg_index.indexrelid = indexclass.oid

    JOIN pg_class AS tableclass ON pg_index.indrelid = tableclass.oid

    JOIN pg_namespace ON pg_namespace.oid = indexclass.relnamespace

    JOIN pg_am ON indexclass.relam = pg_am.oid

    WHERE pg_am.amname = 'btree'

      AND indexclass.relpages > 0

      AND nspname NOT IN ('pg_catalog','information_schema')

),

index_item_sizes AS (

    SELECT

        ind_atts.nspname,

        ind_atts.index_name,

        ind_atts.reltuples,

        ind_atts.relpages,

        ind_atts.relam,

        indrelid AS table_oid,

        index_oid,

        current_setting('block_size')::numeric AS bs,

        8 AS maxalign,

        24 AS pagehdr,

        CASE

            WHEN max(coalesce(pg_stats.null_frac,0)) = 0 THEN 2

            ELSE 6

        END AS index_tuple_hdr,

        SUM((1 - coalesce(pg_stats.null_frac, 0))

            * coalesce(pg_stats.avg_width, 1024)) AS nulldatawidth

    FROM pg_attribute

    JOIN btree_index_atts AS ind_atts

        ON pg_attribute.attrelid = ind_atts.indexrelid

       AND pg_attribute.attnum = ind_atts.attnum

    JOIN pg_stats

        ON pg_stats.schemaname = ind_atts.nspname

       AND (

            (pg_stats.tablename = ind_atts.tablename

             AND pg_stats.attname = pg_catalog.pg_get_indexdef(

                 pg_attribute.attrelid, pg_attribute.attnum, TRUE))

            OR

            (pg_stats.tablename = ind_atts.index_name

             AND pg_stats.attname = pg_attribute.attname)

        )

    WHERE pg_attribute.attnum > 0

    GROUP BY 1,2,3,4,5,6,7,8,9

),

index_aligned_est AS (

    SELECT

        maxalign, bs, nspname, index_name, reltuples,

        relpages, relam, table_oid, index_oid,

        COALESCE (

            CEIL (

                reltuples * (

                    6 + maxalign

                    - CASE

                        WHEN index_tuple_hdr % maxalign = 0 THEN maxalign

                        ELSE index_tuple_hdr % maxalign

                      END

                    + nulldatawidth

                    + maxalign

                    - CASE

                        WHEN nulldatawidth::int % maxalign = 0 THEN maxalign

                        ELSE nulldatawidth::int % maxalign

                      END

                )::numeric / (bs - pagehdr::numeric) + 1

            ),

            0

        ) AS expected

    FROM index_item_sizes

),

raw_bloat AS (

    SELECT

        nspname,

        index_name,

        bs * relpages::bigint AS totalbytes,

        expected,

        CASE

            WHEN relpages <= expected THEN 0

            ELSE (relpages - expected) * 100.0 / relpages

        END AS realbloat,

        bs * GREATEST(relpages - expected,0)::bigint AS wastedbytes

    FROM index_aligned_est

),

format_bloat AS (

    SELECT

        nspname AS schema_name,

        index_name,

        ROUND(realbloat) AS bloat_pct,

        ROUND(wastedbytes / (1024^3)::numeric, 3) AS bloat_gb

    FROM raw_bloat

)

SELECT

    'REINDEX INDEX CONCURRENTLY '

    || quote_ident(schema_name) || '.'

    || quote_ident(index_name) || ';'

FROM format_bloat

WHERE bloat_pct > 90

  AND bloat_gb > 0.01

ORDER BY bloat_gb DESC;

EOF

while read -r CMD

do

    [ -z "$CMD" ] && continue


    echo "--------------------------------------------" | tee -a "$LOGFILE"

    echo "Command: $CMD" | tee -a "$LOGFILE"


    if [ "$DRY_RUN" = true ]; then

        echo "DRY RUN: Not executing (just printing)" | tee -a "$LOGFILE"

    else

        echo "Executing..." | tee -a "$LOGFILE"

        $psql -c "$CMD" </dev/null >> "$LOGFILE" 2>&1   # /dev/null keeps psql off the loop's stdin

        echo "Done." | tee -a "$LOGFILE"

        sleep 30

    fi

done


echo "===== REINDEX CONCURRENTLY COMPLETED: $(date) =====" | tee -a "$LOGFILE"

echo "Log file: $LOGFILE"


Tuesday, December 30, 2025

What are the pg_read_all_data & pg_write_all_data roles in PostgreSQL

                    pg_read_all_data & pg_write_all_data

pg_read_all_data is a built-in PostgreSQL role that allows a user to:

- Read all tables (SELECT)

- Read all schemas

- Read all sequences

Across all databases in the cluster (CONNECT privilege on each database is still required)

It is read-only (no INSERT/UPDATE/DELETE).

Step 1: Check if pg_read_all_data role exists
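
A simple way to check (pg_roles is a standard catalog view):

    SELECT rolname FROM pg_roles WHERE rolname = 'pg_read_all_data';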



Expected result:

Role exists (PostgreSQL 14+)

If it does not exist → the PostgreSQL version is older than 14

Step 2: Create a read-only user

CREATE ROLE readonly_user LOGIN PASSWORD 'postgres@123';


Step 3: Grant pg_read_all_data to the user

    This instantly gives access to all existing & future tables.
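
    The grant itself, for the readonly_user created in Step 2:

    GRANT pg_read_all_data TO readonly_user;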


Step 4: Grant CONNECT privilege on database
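
For example, assuming the database is named hrdb (substitute your own database name):

    GRANT CONNECT ON DATABASE hrdb TO readonly_user;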


Step 5: Verify SELECT (read-only) access for user readonly_user.

Select should work.
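
A minimal check, assuming the employees table from the earlier posts exists in hrdb:

    \c hrdb readonly_user
    SELECT count(*) FROM employees;    -- succeeds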


Step 6: Verify an INSERT into a table as readonly_user.

It will not work.
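
For example (still connected as readonly_user; the table and values are placeholders):

    INSERT INTO employees (id) VALUES (999);
    -- ERROR:  permission denied for table employees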



Step 7: Verify pg_write_all_data role.
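
Same catalog check as Step 1:

    SELECT rolname FROM pg_roles WHERE rolname = 'pg_write_all_data';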



Step 8: Create a user and grant the pg_write_all_data role. Also grant CONNECT on the database.
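
A sketch, using a hypothetical writeonly_user and the hrdb database from Step 4:

    CREATE ROLE writeonly_user LOGIN PASSWORD 'postgres@123';
    GRANT pg_write_all_data TO writeonly_user;
    GRANT CONNECT ON DATABASE hrdb TO writeonly_user;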


    


Step 9: Run insert command -

        It will work.

Step 10: Now run a SELECT on the departments table as the pg_write_all_data user -

    It will not work.
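
Demonstration of Steps 9 and 10 together (writeonly_user and the dept_id value are placeholders):

    \c hrdb writeonly_user
    INSERT INTO departments (dept_id) VALUES (42);   -- works
    SELECT * FROM departments;
    -- ERROR:  permission denied for table departments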



Step 11: If a user needs SELECT, INSERT, UPDATE, and DELETE permissions, grant both pg_read_all_data and pg_write_all_data.
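
For example, for a hypothetical readwrite_user:

    GRANT pg_read_all_data, pg_write_all_data TO readwrite_user;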



Authentication Methods - MD5, SCRAM-SHA-256, password, trust, peer, ident, cert

 In PostgreSQL, an authentication method defines how a client proves its identity to the database server during connection.

We have multiple authentication methods available in Postgres -

1. MD5 Authentication

md5 is the older PostgreSQL password authentication method.

When we set:

local   all   all   md5

PostgreSQL stores the user password like this:

md5<md5(password + username)>

Pros

• Supported by all old PostgreSQL clients.

• Easy to set up.

Cons (VERY IMPORTANT)

• Old and weak hashing.

• MD5 is vulnerable to:

○ hash collisions

○ brute force attacks

○ replay attacks

MD5 is deprecated, but still available for compatibility.
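
To find roles still using md5 hashes (requires superuser, since pg_authid stores the password hashes):

    SELECT rolname FROM pg_authid WHERE rolpassword LIKE 'md5%';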

2. SCRAM-SHA-256 Authentication (Recommended)

scram-sha-256 is the modern, secure password authentication introduced in PostgreSQL 10.

When we set:

local   all   all   scram-sha-256

PostgreSQL stores passwords using:

Salted, hashed, iterative SHA-256 (RFC 5802 & RFC 7677)

Example stored format:

SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>

Pros

• MUCH stronger security

• Uses salted, iterated hashing

• Resistant to:

○ brute force attacks

○ rainbow table attacks

○ replay attacks

• Password never sent in clear or reusable form

• Industry standard authentication

Cons

• Very old clients (PostgreSQL <10) do not support SCRAM.


Important Difference

MD5
• Server stores MD5-hashed password.
• Client sends MD5-hash of hash (double hash).
• Vulnerable to interception and cracking.
SCRAM-SHA-256
• Server stores a salted hash.
• Client and server perform a challenge–response exchange.
• Password is NEVER sent or derivable.
Much safer authentication handshake.
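
Migrating an existing cluster from md5 to SCRAM follows this pattern (a sketch; app_user is a placeholder role name):

    -- 1. In postgresql.conf, set password_encryption = 'scram-sha-256', then:
    SELECT pg_reload_conf();

    -- 2. Re-enter each password so it is re-hashed in SCRAM format:
    ALTER USER app_user PASSWORD 'NewStrongPassword';

    -- 3. Switch the md5 entries in pg_hba.conf to scram-sha-256 and reload again.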


3. Password

local / host   all   all   password

What it does - 
Client sends the raw, plain-text password to PostgreSQL.

Bad because:
• Password goes over the network unencrypted (unless using SSL).
• Very insecure.
• Rarely recommended.

Use only for testing.

4. peer (local connections only)

local   all   all   peer

What it does

Checks Linux OS username == PostgreSQL database username.

Example:
• OS user: postgres
• DB user: postgres → login works
• OS user: ubuntu
• DB user: user1 → FAILS

Problem:
Users must exist in the Linux system.
You cannot \c db user1 from psql unless you are OS user user1.

Note - Use only on local servers where OS users map to DB users (rare in production).

5. ident (remote equivalent of peer)

host   all   all   0.0.0.0/0   ident

What it does -

Checks username using an external ident server.

Note -  Not commonly used today. Most systems don’t run ident servers. Complicated and outdated.
        Avoid unless we have a special ident-based setup.

6. trust

local / host   all   all   trust

What it does

Allows anyone to connect without a password.

Extremely insecure

Whoever can reach the server can log in as any DB user.

Use ONLY for:
• Dev VM
• Inside Docker container
• Local development

Never use in production.

7. cert (SSL certificate authentication)

hostssl   all   all   0.0.0.0/0   cert

What it does

User authenticates using an SSL client certificate, not a password.

Requirements:
• SSL enabled
• Client presents valid certificate
• Common Name (CN) must match DB username

Most secure option for enterprises

Used in:
• Banking
• Secure corporate networks
• Zero-trust environments
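
A minimal configuration sketch (file paths and host names are placeholders):

    # postgresql.conf
    ssl = on
    ssl_cert_file = 'server.crt'
    ssl_key_file  = 'server.key'
    ssl_ca_file   = 'root.crt'

    # pg_hba.conf
    hostssl   all   all   0.0.0.0/0   cert

    # client side (libpq); the CN of alice.crt must equal the DB user name
    psql "host=db.example.com dbname=appdb user=alice sslmode=verify-full sslcert=alice.crt sslkey=alice.key sslrootcert=root.crt"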






Monday, December 29, 2025

Real-Time Scenario - Step-by-Step Slow-Running Query Tuning (Performance Tuning)

                        Real-Time Scenario - Step-by-Step Slow-Running Query Tuning (Performance Tuning)

1) We received a report that the query below was running slowly.

SELECT creationtime, serviceId FROM (

SELECT DISTINCT ON (serviceuuid, n.siteId) e.id AS serviceId, creationtime, status 

FROM NotificationMessages n

JOIN EquinetServices e ON n.serviceuuid = e.uuid AND n.siteid = e.siteid

JOIN sites_view sv ON sv.id = e.siteid AND NOT sv.decommissioned

WHERE e.id IN (SELECT instanceId FROM Alarms WHERE ruleName=$1 AND outAlarmTime IS NULL)

AND notificationTypeId=$2

ORDER BY serviceuuid, n.siteId, creationtime DESC, n.id DESC

) notalarmed


2) Checked the tables and indexes used in this query -



3) Got the $1 and $2 values from the tables -



4) Generated the explain plan for the above query -
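
The plan was generated with EXPLAIN (ANALYZE, BUFFERS) after substituting the values from step 3 ('SERVICE_DOWN' and 5 below are placeholders, not the real values):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT creationtime, serviceId FROM (
        SELECT DISTINCT ON (serviceuuid, n.siteId) e.id AS serviceId, creationtime, status
        FROM NotificationMessages n
        JOIN EquinetServices e ON n.serviceuuid = e.uuid AND n.siteid = e.siteid
        JOIN sites_view sv ON sv.id = e.siteid AND NOT sv.decommissioned
        WHERE e.id IN (SELECT instanceId FROM Alarms WHERE ruleName = 'SERVICE_DOWN' AND outAlarmTime IS NULL)
          AND notificationTypeId = 5
        ORDER BY serviceuuid, n.siteId, creationtime DESC, n.id DESC
    ) notalarmed;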


Here we can see the total execution time is 1178 milliseconds.

5) Fix - index creation. Upon investigation, we found that the indexes listed in step 9 below were needed to improve performance.

6) Explain plan after index creation - 



7) Summary of the issue
The query was taking approximately 1180 ms to execute. The slowness was caused by sequential scans and expensive sort operations on large tables such as NotificationMessages, Alarms, and Sites. This occurred because required composite indexes were missing on key filtering and join columns.

8) Root Cause of the issue -

  • No supporting indexes for serviceuuid, siteid, and creationtime ordering
  • Missing indexes on notificationtypeid filter and alarm rule filtering
  • Inefficient site-state filtering due to lack of index support
  • As a result, the query performed full-table scans and expensive sort operations.
9) Fix Applied - 

To improve the query performance, the following indexes were created:

  1. NotificationMessages Table –

                 CREATE INDEX idx_notificationmessages_type_uuid_site_ctime

                ON NotificationMessages (notificationtypeid, serviceuuid, siteid, creationtime DESC);

 

  2. Alarms Table –

                 CREATE INDEX idx_alarms_rule_out_instance

                ON Alarms (ruleName, outAlarmTime, instanceId);

 

  3. Sites Table –

                 CREATE INDEX idx_sites_notdeleted_notdecomm

                ON sites (deleted, decommissioned, id);

 

  4. EquinetServices Table –

                 CREATE INDEX idx_equinetservices_uuid_siteid

                    ON EquinetServices(uuid, siteid);

10) Final Result

    Metric              Before       After

    Execution Time      1180 ms      5.862 ms



Also, after applying the new indexes, please run the following commands:

VACUUM ANALYZE NotificationMessages;

VACUUM ANALYZE Alarms;

VACUUM ANALYZE EquinetServices;

VACUUM ANALYZE sites;

 

These commands will refresh the table statistics so the query planner can fully utilize the newly created indexes. If you don't have permissions to run them, please let me know and I can run them right after the index creation.


============================The End================================












Friday, December 26, 2025

Streaming Replication Useful Queries

How to Check Replication Status Properly

On Primary -

SELECT pid, usename, application_name, state

 --, client_addr

 , date_trunc('minute', backend_start) as backend_start

 , sent_lsn, round(((pg_current_wal_lsn()-sent_lsn)/(1024*1024.0)), 2) as mb_diff

from pg_stat_replication;


SELECT pid, state, sync_state, write_lag, flush_lag, replay_lag

FROM pg_stat_replication;


1) SELECT pg_current_wal_lsn();


2) SELECT pg_current_wal_insert_lsn();


3) WAL Switch Functions -

SELECT pg_switch_wal();



Run this on STANDBY

SELECT 

    pg_last_wal_receive_lsn()  AS receive_lsn,

    pg_last_wal_replay_lsn()   AS replay_lsn,

    pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()) AS byte_lag;



1) Last received LSN

SELECT pg_last_wal_receive_lsn();

        

2) Last replayed LSN

        SELECT pg_last_wal_replay_lsn();

    

3) WAL difference

    SELECT pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn());

    SELECT pg_wal_lsn_diff(sent_lsn, replay_lsn) AS byte_lag
    FROM pg_stat_replication;   -- run on the primary
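
Also worth checking on the primary - an inactive replication slot retains WAL and can fill pg_wal:

    SELECT slot_name, slot_type, active, restart_lsn
    FROM pg_replication_slots;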





Missing WAL File live scenario in PostgreSQL and how to recover from it

 

            Missing WAL File live scenario in PostgreSQL and how to recover from it

If WAL files are missing on primary side then replication will break and this error message will be displayed in the log file.

Answer - 
When a standby requests a WAL segment that the primary has already recycled or which is missing from archive storage, the standby cannot continue recovery. PostgreSQL physical replication does not support skipping missing WAL files. The only recovery is to re-initialize the standby with a fresh base backup.
I simulated this scenario by stopping the standby, generating excessive WAL on the primary, manually removing old WAL files, and restarting the standby.

On PROD-1



On Replica -



Took the fresh pg_basebackup and built the replica again.
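
A sketch of the rebuild (paths, version, and the replicator user are assumptions - adjust for your install):

    # on the replica, as the postgres OS user
    pg_ctl stop -D /var/lib/postgresql/16/main
    rm -rf /var/lib/postgresql/16/main/*
    pg_basebackup -h <PROD-1-IP> -U replicator -D /var/lib/postgresql/16/main -X stream -R -P
    pg_ctl start -D /var/lib/postgresql/16/main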