
Tuesday, February 17, 2026

PostgreSQL - Administrator Role and SUPERUSER

 Step 1: Creating an Administrator Role with SUPERUSER Privileges (Standard PostgreSQL)

 In a self-managed PostgreSQL environment (on-premises or cloud VM), you can create an administrator role with full SUPERUSER privileges using:

 CREATE ROLE administrator SUPERUSER LOGIN PASSWORD 'StrongSecurePassword';

This role can:

• Access all databases

• Create, alter, and drop any database object

• Manage roles and permissions

• Read system catalogs and perform backups

• Enable or disable extensions

• Perform maintenance and configuration tasks

 Behavior in Amazon RDS for PostgreSQL

 In Amazon RDS, even though the master user is granted the rds_superuser role, it is NOT a true PostgreSQL superuser.

 AWS restricts certain superuser capabilities for security and managed service operations.

 Important Limitation:

Creating another user with the SUPERUSER attribute is NOT permitted in Amazon RDS, even if you are using the rds_superuser role.

 The SUPERUSER attribute is one of the restricted operations in RDS.

 Workaround / Alternatives in Amazon RDS

 

Although you cannot create a true SUPERUSER role in RDS, you can create a role with almost all administrative privileges by granting the rds_superuser role.

 Example:

 CREATE ROLE administrator NOLOGIN;

GRANT rds_superuser TO administrator;

 You can then grant this role to specific users as needed.

 

Key Takeaways

 

• True SUPERUSER roles are only available in self-managed PostgreSQL.

• Amazon RDS uses rds_superuser as a controlled alternative.

• SUPERUSER creation is blocked in RDS environments.

• Administrative access can still be delegated safely using role grants.

PostgreSQL - Read Write Role

 This guide explains how to create a PostgreSQL role that allows users to:

• Read data and schema
• View query execution plans
• Modify data (INSERT / UPDATE / DELETE) without schema changes
• Create and manage objects (CRUD) within their own schema only

Step 1: Create a Base Role (No Login)

CREATE ROLE read_write NOLOGIN;

Step 2: Grant Read Access on Shared Schema

GRANT USAGE ON SCHEMA public TO read_write;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO read_write;

Step 3: Allow Data Modification (Without Schema Changes)

GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO read_write;

Step 4: Ensure Future Tables Inherit Permissions

ALTER DEFAULT PRIVILEGES IN SCHEMA public 
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO read_write;

Step 5: Enable Query Performance Insights

ALTER ROLE read_write SET track_io_timing TO on;

Note: role-level settings take effect at session start, so a setting on the NOLOGIN read_write role does not propagate to its members. Apply the setting to the login users (or at the database level) if it must take effect for them.

Step 6: Create User with Personal Schema

CREATE ROLE rw_user LOGIN PASSWORD 'securepassword';
GRANT read_write TO rw_user;

CREATE SCHEMA AUTHORIZATION rw_user;

Conclusion

The user now has:
• Read and write access to shared schema tables
• No permission to alter shared schema structure
• Full CREATE / ALTER / DROP privileges within their own schema

This approach follows PostgreSQL security best practices by separating:
• Permission roles (NOLOGIN)
• User login roles
• Shared schema access
• Personal schema ownership

PostgreSQL - Read-Only Role

 


Overview

This guide explains how to create a reusable READ_ONLY role in PostgreSQL that allows users to:

• Read data from tables

• Access schema objects

• Use temporary tables

 The approach uses a NOLOGIN role for permissions and a LOGIN role for end users. All database, schema, and user names are generic so the steps can be reused across environments.

Step 1: Create the Read-Only Role (No Login)

CREATE ROLE read_only NOLOGIN;

Explanation

A NOLOGIN role acts as a permission group. Users inherit privileges by being granted membership in this role.

Step 2: Grant Schema Usage

GRANT USAGE ON SCHEMA schema_name TO read_only;

GRANT USAGE ON SCHEMA public TO read_only;

Explanation

USAGE allows the role to see objects inside the schema.

Step 3: Grant SELECT on Existing Tables

GRANT SELECT ON ALL TABLES IN SCHEMA schema_name TO read_only;

GRANT SELECT ON ALL TABLES IN SCHEMA public TO read_only;

Explanation

This grants read access to all current tables in the schemas.

Step 4: Grant SELECT on Future Tables

ALTER DEFAULT PRIVILEGES IN SCHEMA schema_name

GRANT SELECT ON TABLES TO read_only;

Explanation

This ensures new tables created later automatically grant SELECT permission to the read_only role.

Step 5: Allow Temporary Tables (Database Level)

GRANT TEMP ON DATABASE database_name TO read_only;

Explanation

This allows users to create temporary tables, which is often required for reporting or analytics queries.

Step 6: Create a User and Assign Read-Only Role

CREATE ROLE user_name LOGIN PASSWORD 'secure_password';

GRANT read_only TO user_name;

Explanation

This creates a login user and grants membership in the read_only role.

Example: Applying to Another Schema

If the read_only role was already created above, skip the CREATE ROLE statement and run only the grants below:

CREATE ROLE read_only NOLOGIN;

 

GRANT USAGE ON SCHEMA another_schema TO read_only;

 

GRANT SELECT ON ALL TABLES IN SCHEMA another_schema TO read_only;

 

ALTER DEFAULT PRIVILEGES IN SCHEMA another_schema

GRANT SELECT ON TABLES TO read_only;

 

CREATE ROLE another_user LOGIN PASSWORD 'secure_password';

GRANT read_only TO another_user;

Best Practices

• Use group roles for permission management

• Avoid granting privileges directly to users

• Periodically review role memberships

• Restrict write access unless required

• Use strong passwords or IAM/SSO authentication

Conclusion

Using a READ_ONLY role simplifies access management and improves security by centralizing permissions. This approach is scalable and recommended for production environments.

PostgreSQL - Manual VACUUM Execution Using psql

 

Overview

This guide explains how to connect to a PostgreSQL database using psql, configure session-level memory parameters, and run VACUUM ANALYZE on selected tables. All hostnames, database names, schema names, and table names are kept generic for reuse across environments.

1. Connect to the Database

psql -h <db-host> -U <username> -d <database_name>

Example Output

psql (client_version, server_version)

SSL connection established.

Type "help" for help.

Note: psql prints a warning here when the client and server major versions differ.

2. Execute a SQL Script (Optional)

\i /path/to/script.sql

Explanation

This command executes SQL commands stored in a file.

3. Enable Timing

\timing on

Explanation

This shows execution time for each command.

4. Set Session Memory Parameters

SET work_mem = '64MB';

SET maintenance_work_mem = '10GB';

Explanation

work_mem is used for query operations like sorting and hashing.

maintenance_work_mem is used for maintenance operations such as VACUUM and CREATE INDEX.

Increasing maintenance_work_mem can significantly improve VACUUM performance for large tables.

5. Validate Session Time

SELECT now();

6. Run VACUUM ANALYZE on Tables

VACUUM ANALYZE schema_name.table1;

VACUUM ANALYZE schema_name.table2;

VACUUM ANALYZE schema_name.table3;

VACUUM ANALYZE schema_name.table4;

VACUUM ANALYZE schema_name.table5;

VACUUM ANALYZE schema_name.table6;

VACUUM ANALYZE schema_name.table7;

Explanation

VACUUM removes dead tuples and frees space, while ANALYZE updates planner statistics.

Running both together ensures optimal query performance.

Best Practices

• Run during maintenance windows for large tables

• Increase maintenance_work_mem temporarily for large operations

• Monitor disk I/O and CPU during execution

• Avoid running many VACUUM operations concurrently

• Use autovacuum for routine maintenance

Conclusion

Manual VACUUM ANALYZE is useful for large tables or after heavy data modifications. Proper memory tuning and monitoring help reduce execution time and improve performance.

PostgreSQL - Autovacuum Tuning and Troubleshooting

 

Introduction

This document explains important PostgreSQL autovacuum parameters, how to monitor autovacuum activity, memory tuning strategies, and steps for performing manual VACUUM FREEZE operations. All examples use generic database, schema, and table names so the guide can be reused in any environment.

1. Important Autovacuum Parameters

SELECT short_desc, name, setting, unit

FROM pg_settings

WHERE name IN (

'autovacuum_max_workers',

'autovacuum_analyze_threshold',

'autovacuum_analyze_scale_factor',

'autovacuum_vacuum_threshold',

'autovacuum_vacuum_scale_factor',

'maintenance_work_mem',

'autovacuum_naptime',

'vacuum_cost_limit',

'autovacuum_freeze_max_age',

'vacuum_freeze_min_age');

2. Check If Autovacuum Is Running (Performance or I/O Issues)

SELECT datname,

       usename,

       pid,

       state,

       current_timestamp - xact_start AS xact_runtime,

       query

FROM pg_stat_activity

WHERE upper(query) LIKE '%VACUUM%'

ORDER BY xact_start;

Terminate Long Running Autovacuum (If Required)

SELECT pg_terminate_backend(<PID>);

3. Why Autovacuum Runs for a Long Time

One common reason is insufficient memory allocation, particularly a very small maintenance_work_mem setting.

4. What Is maintenance_work_mem

Memory allocated for maintenance operations such as VACUUM and CREATE INDEX. It stores row identifiers that are candidates for cleanup during vacuum.

5. maintenance_work_mem Details

• Default value: 64MB

• Higher values improve vacuum and index build performance

• Usually safe to set higher than work_mem

6. autovacuum_work_mem

autovacuum_work_mem sets the per-worker memory for autovacuum; when it is -1 (the default), maintenance_work_mem is used instead. Because up to autovacuum_max_workers workers can run at once, total memory use can reach autovacuum_max_workers times this value. Regardless of the setting, VACUUM uses at most 1GB for dead tuple identifiers (prior to PostgreSQL 17).
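As a rough worked example (illustrative numbers, not an official PostgreSQL formula), the worst-case memory footprint of concurrent autovacuum workers can be estimated as:

```python
ONE_GB_MB = 1024  # pre-PG17 cap on dead-tuple identifier storage per VACUUM, in MB

def autovacuum_peak_mb(max_workers: int, per_worker_mem_mb: int) -> int:
    """Worst-case MB that concurrent autovacuum workers can claim for dead TIDs."""
    per_worker = min(per_worker_mem_mb, ONE_GB_MB)  # each worker is capped at 1GB
    return max_workers * per_worker

# 3 workers (the default) with maintenance_work_mem = 2GB:
# each worker still caps at 1GB for dead tuple IDs, so about 3GB total.
print(autovacuum_peak_mb(3, 2048))  # 3072
```

This is why raising maintenance_work_mem beyond 1GB helps index builds but, on older versions, does not give a single vacuum worker more dead-tuple memory.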

7. Memory Allocation Recommendations

• Large servers: maintenance_work_mem up to 2GB

• Very large workloads: 2GB–4GB

• Tune based on workload

8. Verify Databases Requiring Vacuum (Wraparound Risk)

SELECT datname,

       age(datfrozenxid)

FROM pg_database

ORDER BY age(datfrozenxid) DESC;

Check Current Setting

SHOW autovacuum_freeze_max_age;

9. Performing Manual VACUUM FREEZE

In addition to autovacuum, manual VACUUM FREEZE should be performed periodically on high-churn tables during maintenance windows.

Identify Running Autovacuum

SELECT datname,

       usename,

       pid,

       current_timestamp - xact_start AS xact_runtime,

       query

FROM pg_stat_activity

WHERE upper(query) LIKE '%VACUUM%'

ORDER BY xact_start;

10. Increase Memory for Large Tables (Session Level)

SET maintenance_work_mem = '2GB';

After the vacuum completes, set the value back for normal session work:

SET maintenance_work_mem = '128MB';

Run Manual Vacuum

\timing on

VACUUM FREEZE VERBOSE schema_name.table_name;

Terminate Autovacuum If Needed

SELECT pg_terminate_backend(<PID>);

Conclusion

Proper tuning of autovacuum memory and monitoring processes is essential for PostgreSQL performance and stability.

PostgreSQL - Aggressive Vacuuming

 


Overview

This guide explains how to monitor dead tuples in PostgreSQL tables and tune autovacuum settings to improve performance. All schema names, table names, and database references are kept generic for reuse across environments.

1. Identify Tables with High Dead Tuples

SELECT schemaname,

       relname,

       last_vacuum,

       last_analyze,

       vacuum_count,

       analyze_count,

       last_autoanalyze,

       last_autovacuum,

       autovacuum_count,

       autoanalyze_count,

       n_dead_tup

FROM pg_stat_all_tables

WHERE n_dead_tup > 10000

ORDER BY n_dead_tup DESC

LIMIT 50;

Explanation

This query identifies the top tables with significant dead tuples, which may require manual vacuuming or autovacuum tuning.

2. Tune Autovacuum Settings for a Specific Table

ALTER TABLE schema_name.table_name

SET (autovacuum_vacuum_scale_factor = 0.0);

 

ALTER TABLE schema_name.table_name

SET (autovacuum_vacuum_threshold = 10000);

 

ALTER TABLE schema_name.table_name

SET (autovacuum_analyze_scale_factor = 0.0);

 

ALTER TABLE schema_name.table_name

SET (autovacuum_analyze_threshold = 10000);

Explanation

These settings ensure autovacuum triggers based on a fixed threshold instead of percentage growth, which is useful for large or frequently updated tables.
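The trigger condition behind these settings can be sketched numerically. Autovacuum processes a table once dead tuples exceed threshold + scale_factor × reltuples; the table size below is illustrative:

```python
def vacuum_trigger(reltuples: int, threshold: int, scale_factor: float) -> float:
    """Dead tuples required before autovacuum processes the table."""
    return threshold + scale_factor * reltuples

big_table = 100_000_000  # a 100M-row table

# Default settings (autovacuum_vacuum_threshold=50, scale_factor=0.2):
print(vacuum_trigger(big_table, 50, 0.2))      # 20000050.0 dead rows before vacuum
# Tuned settings from above (threshold=10000, scale_factor=0.0):
print(vacuum_trigger(big_table, 10_000, 0.0))  # 10000.0 dead rows before vacuum
```

With the defaults, a 100M-row table accumulates roughly 20 million dead tuples before autovacuum fires; the fixed threshold keeps that bound constant regardless of table growth.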

3. Generate Vacuum Commands Dynamically

SELECT 'VACUUM VERBOSE ANALYZE ' || schemaname || '.' || relname || ';'

FROM pg_stat_all_tables

WHERE n_dead_tup > 0

ORDER BY n_dead_tup DESC

LIMIT 50;

Explanation

This query generates VACUUM commands for tables with dead tuples, allowing administrators to execute maintenance efficiently.

4. Example: Applying Autovacuum Settings to Another Table

ALTER TABLE schema_name.another_table

SET (autovacuum_vacuum_scale_factor = 0.0);

 

ALTER TABLE schema_name.another_table

SET (autovacuum_vacuum_threshold = 10000);

 

ALTER TABLE schema_name.another_table

SET (autovacuum_analyze_scale_factor = 0.0);

 

ALTER TABLE schema_name.another_table

SET (autovacuum_analyze_threshold = 10000);

Best Practices

• Monitor pg_stat_all_tables regularly

• Tune autovacuum per table for high-write workloads

• Use VACUUM VERBOSE to observe maintenance behavior

• Avoid aggressive settings on small tables

• Ensure autovacuum workers and memory settings are adequate

Conclusion

Proper monitoring and tuning of autovacuum significantly improves PostgreSQL performance and prevents table bloat. This guide provides a reusable approach for maintaining healthy database tables.

PostgreSQL - Users and Roles Queries - \du+ in DBeaver

 


Introduction

In PostgreSQL, there is no strict distinction between users and roles. Everything is a role; a user is simply a role that has the LOGIN privilege (rolcanlogin = true). This document provides practical SQL queries to audit roles, role memberships, and privileges in a PostgreSQL database environment. All names are kept generic for reuse.

Key Concept

Users: Roles with LOGIN privilege (rolcanlogin = true)

Groups/Roles: Roles without LOGIN privilege (rolcanlogin = false)

A role can be a member of other roles, inheriting permissions.

1. List All Roles with Attributes and Membership

 

SELECT r.rolname,

       r.rolsuper,

       r.rolinherit,

       r.rolcreaterole,

       r.rolcreatedb,

       r.rolcanlogin,

       r.rolconnlimit,

       r.rolvaliduntil,

       ARRAY(

           SELECT b.rolname

           FROM pg_catalog.pg_auth_members m

           JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)

           WHERE m.member = r.oid

       ) AS memberof,

       r.rolreplication,

       r.rolbypassrls

FROM pg_catalog.pg_roles r

WHERE r.rolname !~ '^pg_'

ORDER BY 1;

Explanation

This query lists all non-system roles and shows:

• Superuser privileges

• Database creation privileges

• Login capability

• Connection limits

• Membership in other roles

• Replication privileges

• Row-level security bypass capability

2. List Users and Their Granted Roles

SELECT

    u.rolname AS username,

    r.rolname AS granted_role

FROM pg_auth_members m

JOIN pg_roles u ON m.member = u.oid

JOIN pg_roles r ON m.roleid = r.oid

WHERE u.rolcanlogin = true

ORDER BY u.rolname, r.rolname;

Explanation

This query shows which roles are granted to login users only, making it useful for access audits.

3. Get Roles Assigned to a Specific User

SELECT roleid::regrole,

       member::regrole

FROM pg_auth_members

WHERE member = 'username'::regrole;

Explanation

Replace 'username' with the target user. This query returns all roles granted to that specific user.

4. Alternative Query — User Membership View

SELECT

    r.oid AS role_id,

    r.rolname,

    r.rolcanlogin,

    COALESCE(b.rolname, '') AS memberof

FROM pg_catalog.pg_roles r

LEFT JOIN pg_catalog.pg_auth_members m ON r.oid = m.member

LEFT JOIN pg_catalog.pg_roles b ON m.roleid = b.oid

WHERE r.rolcanlogin = true

  AND r.rolname !~ '^pg_';

Explanation

This query focuses only on login roles (users) and shows which group roles they belong to.

Common Audit Use Cases

• Security audits

• Access reviews

• Compliance checks

• Migration planning

• Privilege troubleshooting

Best Practices

• Avoid granting superuser unless absolutely required

• Use group roles for permission management

• Periodically audit role memberships

• Monitor replication and bypass RLS privileges

• Remove unused roles

Conclusion

Understanding PostgreSQL roles and memberships is essential for secure database administration. The queries provided in this guide help administrators audit and manage access efficiently.

Monday, February 16, 2026

PostgreSQL - Step by Step Logical Replication Setup

 

Prerequisites

• PostgreSQL version 10 or higher on both servers

• Network connectivity between source and target servers

• wal_level must be set to logical on the source server

• Superuser or replication privileges available

1. Configure Source Server

1.1 Ensure wal_level is logical (postgresql.conf):

wal_level = logical

max_replication_slots = 10

max_wal_senders = 10

 

Reload configuration:

SELECT pg_reload_conf();

 

1.2 Allow target server to connect (pg_hba.conf):

host replication replication_user <TARGET_IP>/32 md5

host all replication_user <TARGET_IP>/32 md5

 

Reload configuration:

SELECT pg_reload_conf();

 

1.3 Create replication user:

CREATE USER replication_user WITH REPLICATION LOGIN PASSWORD 'StrongPassword';

 

1.4 Grant SELECT on tables to replication user:

GRANT SELECT ON TABLE schema_name.table1 TO replication_user;

GRANT SELECT ON TABLE schema_name.table2 TO replication_user;

GRANT USAGE ON SCHEMA schema_name TO replication_user;

2. Create Publication on Source

CREATE PUBLICATION publication_name FOR TABLE schema_name.table1, schema_name.table2;

 

Verify:

\dRp+ publication_name

3. Ensure Tables Have Primary Keys

Logical replication requires primary keys or unique indexes.

 

Check tables:

\d table1

\d table2

 

If missing:

ALTER TABLE table1 ADD PRIMARY KEY (id);

ALTER TABLE table2 ADD PRIMARY KEY (id);

4. Prepare Target Server

Create same tables on target with identical structure (excluding constraints/triggers).

 

Using schema-only dump:

pg_dump -s -t table1 -t table2 source_db > schema.sql

psql target_db < schema.sql

5. Create Subscription on Target

CREATE SUBSCRIPTION subscription_name

CONNECTION 'host=<SOURCE_IP> port=5432 dbname=source_db user=replication_user password=StrongPassword'

PUBLICATION publication_name;

 

This will:

• Copy initial data

• Start ongoing replication

6. Check Replication Status

On target:

SELECT * FROM pg_subscription;

SELECT * FROM pg_stat_subscription;

 

On source:

SELECT * FROM pg_stat_replication;

7. What Gets Replicated Automatically

INSERT → Yes

UPDATE → Yes

DELETE → Yes

TRUNCATE → Yes (PostgreSQL 11 and later)

DDL (schema changes) and sequence values are not replicated; apply them manually on the target.

8. Add More Tables Later

On source:

ALTER PUBLICATION publication_name ADD TABLE new_table;

 

On target:

ALTER SUBSCRIPTION subscription_name REFRESH PUBLICATION;

9. Remove Replication

On target:

DROP SUBSCRIPTION subscription_name;

 

On source:

DROP PUBLICATION publication_name;

Conclusion

This guide provides a generic approach for configuring logical replication between PostgreSQL servers. Following these steps ensures reliable data synchronization between environments.

PostgreSQL - File transfer from local to remote and vice versa

 

Secure File Transfer Between Windows and Linux Using SCP

Overview

This document explains how to securely transfer files between Windows and Linux servers using SCP (Secure Copy Protocol). It includes examples for downloading files from a Linux server to a Windows machine and uploading files from Windows to Linux. All server names, database names, schema names, and table names are kept generic for reuse across environments.

 

SCP uses SSH for secure file transfer and is commonly used by database administrators for moving logs, reports, backups, and configuration files.

Prerequisites

• SSH access to the Linux server

• SCP available on Windows (Git Bash, PowerShell with OpenSSH, or WSL)

• Proper file permissions on the Linux server

• Network connectivity between systems

Download File from Linux Server to Windows

Example command:

 

scp user@linux-server:/path/to/file /c/Users/youruser/Downloads/

 

Explanation:

This command copies a file from the Linux server to the Windows Downloads directory.

Upload File from Windows to Linux Server

Example command:

 

scp /c/Users/youruser/Downloads/file.zip user@linux-server:/home/user

 

Explanation:

This uploads a file from the Windows machine to the Linux server home directory.

Common Use Case — Copy Logs with Permission Restrictions

In many environments, log files are owned by root or restricted users. In such cases, first copy the file to a user-accessible directory before downloading.

Step 1: Copy File to Home Directory (Linux Server)

As root or sudo user:

 sudo su -

cp /var/log/application/logfile.log /home/user/

chown user:user /home/user/logfile.log

 

Explanation:

This ensures the file has proper permissions for SCP transfer.

Step 2: Download File from Windows

From Windows (Git Bash or PowerShell):

 

scp user@linux-server:/home/user/logfile.log /c/Users/youruser/Downloads/

 

Explanation:

This safely downloads the file to the Windows machine.

Upload Configuration or Application Files to Linux

Example:

 

scp /c/Users/youruser/Downloads/config.properties user@linux-server:/home/user

 

Explanation:

This is commonly used for transferring configuration files or application packages.

Copy Multiple Files Using Wildcards

Example:

 

scp "user@linux-server:/home/user/logfile*" /c/Users/youruser/Downloads/

 

Explanation:

Wildcards allow transferring multiple matching files in one command. Quote the remote pattern so it is expanded by the remote server rather than by the local shell.

Best Practices

• Use home directory as a staging area for transfers

• Verify file permissions before copying

• Use SSH keys instead of passwords when possible

• Compress large files before transfer

• Validate file size after transfer

Troubleshooting

Permission Denied:

Ensure file ownership and read permissions are correct.

 

Connection Timeout:

Verify network connectivity and firewall rules.

 

File Not Found:

Check the correct file path on the server.

Conclusion

SCP provides a simple and secure way to transfer files between systems. Following proper permission handling and staging practices ensures smooth and reliable transfers in production environments.

PostgreSQL - pg_repack Installation

Overview

This document explains how to install the pg_repack utility on a Linux system and enable the pg_repack extension in a PostgreSQL database. All database, schema, and table names are kept generic for reuse across environments.

 pg_repack is used to remove table and index bloat without requiring long locks, making it suitable for production systems.

Step 1: Check Installed PostgreSQL Packages

dpkg -l | grep postgresql

 

Explanation:

This command lists all installed PostgreSQL-related packages on the server.

Step 2: Configure PostgreSQL Repository

sudo rm -f /etc/apt/sources.list.d/pgdg.list

 

echo "deb http://apt-archive.postgresql.org/pub/repos/apt focal-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list

 

curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/pgdg-archive.gpg

 

sudo apt update

 

Explanation:

These steps configure the PostgreSQL package repository and refresh package metadata.

Step 3: Install pg_repack Package

sudo apt install postgresql-XX-repack

 

Explanation:

Replace XX with your PostgreSQL major version (for example 14, 15, etc.).

Alternative: Build pg_repack from Source (If Package Installation Pulls PostgreSQL Server)

sudo apt update

sudo apt install build-essential libpq-dev postgresql-server-dev-XX

 

cd /tmp

git clone https://github.com/reorg/pg_repack.git

cd pg_repack

make

sudo make install

 

Explanation:

This method compiles pg_repack manually when package dependencies are not suitable.

Step 4: Verify Installation

pg_repack --version

 

Expected Output:

pg_repack X.X.X

Step 5: Enable pg_repack Extension in Database

sudo su - postgres

psql

\c database_name

 

CREATE EXTENSION pg_repack;

 

Explanation:

This creates the pg_repack extension inside the selected database so it can be used.

Step 6: Verify Extension Installation

SELECT * FROM pg_extension WHERE extname = 'pg_repack';

 

Explanation:

This query confirms that the extension is successfully installed.

Common pg_repack Usage Example

pg_repack -h hostname -p port -U username -d database_name -t schema_name.table_name

 

Explanation:

This command rebuilds a specific table to remove bloat.

Prerequisites and Notes

• Database user must have sufficient privileges

• Tables should have a primary key or unique index

• Ensure enough disk space for temporary objects

• Monitor locks during execution

Conclusion

pg_repack is a powerful tool for online table maintenance in PostgreSQL environments. Proper installation and verification ensure safe execution in production systems.


PostgreSQL - Log Analysis Using pgBadger


Overview

This document explains how to install pgBadger and generate PostgreSQL performance reports from log files. All database, schema, and table names are kept generic so the guide can be reused across environments.

 

pgBadger is a powerful PostgreSQL log analyzer that converts database logs into detailed HTML reports, helping identify slow queries, errors, connections, locks, and workload patterns.

Prerequisites

• Linux server with PostgreSQL logs enabled

• Access to PostgreSQL log directory

• Perl installed (pgBadger dependency)

• sudo/root privileges for installation

Step 1: Download pgBadger

Navigate to a working directory and download the pgBadger package.

 

cd /opt

curl -L -o pgbadger.tar.gz https://github.com/darold/pgbadger/archive/refs/tags/v12.4.tar.gz

 

Explanation:

This command downloads a specific pgBadger version from GitHub and saves it locally.

Step 2: Extract the Package

tar xzf pgbadger.tar.gz

 

Explanation:

This extracts the downloaded archive into a directory such as pgbadger-12.4.

Step 3: Install pgBadger Binary

Copy the executable to a system-wide path.

 

sudo cp /opt/pgbadger-12.4/pgbadger /usr/local/bin/

sudo chmod +x /usr/local/bin/pgbadger

 

Explanation:

This makes pgBadger available globally so it can be executed from anywhere.

Step 4: Verify Installation

pgbadger --version

 

Expected Output:

pgBadger version X.X

PostgreSQL Logging Requirements

Ensure PostgreSQL logging is configured properly for meaningful reports.

 

Recommended parameters in postgresql.conf:

logging_collector = on

log_min_duration_statement = 0

log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h'

log_checkpoints = on

log_connections = on

log_disconnections = on

log_lock_waits = on

 

Explanation:

These settings capture the detailed query and connection information required for analysis. Note that log_min_duration_statement = 0 logs every statement; on busy production systems, set a higher threshold (for example 1000 ms) to limit log volume.

Generate Report — Last 24 Hours

pgbadger /path/to/logfile \
  --begin "$(date -d '1 day ago' '+%Y-%m-%d %H:%M:%S')" \
  --end "$(date '+%Y-%m-%d %H:%M:%S')" \
  --jobs 8 \
  --sample 10 \
  -o /path/to/output/report_last24hrs.html

Explanation of Parameters

--begin / --end:

Defines the time window for analysis.

 

--jobs:

Number of parallel workers to speed up processing.

 

--sample:

Samples queries for performance statistics instead of processing every occurrence.

 

-o:

Output HTML report file.

Generate Report for Specific Time Window

/usr/local/bin/pgbadger \
  -p '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h' \
  /path/to/logfile* \
  -b 'YYYY-MM-DD HH:MM:SS' \
  -e 'YYYY-MM-DD HH:MM:SS' \
  -o report_specific_window.html

Explanation

-p:

Specifies log_line_prefix format so pgBadger can correctly parse logs.

 

-b / -e:

Defines begin and end timestamps for the report.

 

Wildcard (*):

Allows processing multiple rotated log files.

Example — Analytical Environment

pgbadger /path/to/logfile-YYYY-MM-DD* \
  --begin "YYYY-MM-DD HH:MM:SS" \
  --end "YYYY-MM-DD HH:MM:SS" \
  --jobs 8 \
  --sample 10 \
  -p '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h' \
  -o /path/to/output/report.html

Best Practices

• Run pgBadger during low system load when processing large logs

• Archive old logs before analysis

• Use parallel jobs for faster report generation

• Monitor disk space for HTML output

• Schedule periodic reports using cron jobs

Common Use Cases

• Identify slow queries

• Analyze peak load periods

• Detect locking and contention

• Monitor connection spikes

• Investigate production incidents

Conclusion

pgBadger provides deep insights into PostgreSQL workload behavior through log analysis. Regular report generation helps database administrators proactively detect performance issues and optimize database operations.

PostgreSQL - Logical Replication Reset

 

Introduction – Logical Replication Reset

A logical replication reset is performed when replication is not functioning correctly or when the source and target databases become out of sync. This may occur due to replication lag, failed table synchronization, replication slot issues, or configuration problems.

 

The reset process involves stopping the existing replication, cleaning up the old configuration, truncating subscriber data if required, and recreating the replication so that data can synchronize again correctly from the publisher (source) to the subscriber (target) database.

 

This document describes the basic steps and checks required to safely reset logical replication using generic database, schema, and table names.

1. Host Details

Publisher Host (Source):

• HOSTNAME_OR_IP_PUBLISHER

 Subscriber Host (Target):

• HOSTNAME_OR_IP_SUBSCRIBER

2. Drop the Existing Subscription (On Subscriber)

Drop the subscription on the subscriber database.

 DROP SUBSCRIPTION subscription_name;

 

Important Subscription Attributes:

• subname → Name of the logical replication subscription.

• subenabled → Indicates whether subscription is enabled.

• subconninfo → Connection details of the publisher database.

• subslotname → Logical replication slot name on publisher.

• subpublications → Publication used by the subscription.

• subsynccommit → Usually off for logical replication.

3. Verify and Remove Replication Slot (On Publisher)

After dropping the subscription, verify whether the replication slot still exists on the publisher.

 

SELECT slot_name FROM pg_replication_slots;

 

If the slot still exists, drop it manually:

 

SELECT pg_drop_replication_slot('subscription_name');

4. Truncate Tables on Subscriber

Before reinitializing replication, truncate the tables on the subscriber to avoid conflicts.

 

You can retrieve the list of tables included in the publication using the following query:

Query to List Tables in Publication

SELECT

    p.pubname,

    n.nspname AS schema_name,

    c.relname AS table_name

FROM pg_publication p

JOIN pg_publication_rel pr ON p.oid = pr.prpubid

JOIN pg_class c ON pr.prrelid = c.oid

JOIN pg_namespace n ON c.relnamespace = n.oid

WHERE p.pubname = 'publication_name'

ORDER BY schema_name, table_name;

5. Create a New Subscription

Recreate the subscription on the subscriber database.

 

CREATE SUBSCRIPTION subscription_name

CONNECTION 'host=PUBLISHER_HOST port=5432 dbname=database_name user=replication_user password=PASSWORD'

PUBLICATION publication_name

WITH (

    copy_data = true,

    create_slot = true,

    enabled = true

);

Parameter Explanation

copy_data = true

→ Copies existing data from publisher to subscriber during initialization.

 

create_slot = true

→ Automatically creates a replication slot on the publisher.

 

enabled = true

→ Starts replication immediately after creation.

6. Monitor Replication Status (On Publisher)

Monitor replication until the subscriber catches up.

 

SELECT

    pid,

    usename,

    application_name,

    state,

    date_trunc('minute', backend_start) AS backend_start,

    sent_lsn,

       ROUND((pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) / (1024 * 1024.0)), 2) AS mb_lag

FROM pg_stat_replication;
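The mb_lag arithmetic above can be reproduced off-line. A pg_lsn value such as 16/B374D848 is a 64-bit byte position written as two hexadecimal halves, so subtracting two of them yields a byte offset. A minimal Python sketch (the function names are mine, not PostgreSQL's):

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a pg_lsn string like '16/B374D848' to a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def lag_mb(current_wal_lsn: str, sent_lsn: str) -> float:
    """Replication lag in megabytes, mirroring the mb_lag column in the query above."""
    return round((lsn_to_int(current_wal_lsn) - lsn_to_int(sent_lsn)) / (1024 * 1024.0), 2)

print(lag_mb("16/B374D848", "16/B3000000"))  # 7.3
```

A subscriber is considered caught up when this lag approaches zero and state shows streaming.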

Best Practices

• Perform replication reset during a maintenance window.

• Ensure network connectivity between publisher and subscriber.

• Validate row counts after replication completes.

• Monitor replication slots to prevent WAL accumulation.

• Confirm that replication user has required privileges.

Conclusion

Resetting logical replication is a safe and effective way to recover from replication issues when performed with proper validation and cleanup steps. Following a structured approach ensures minimal downtime and consistent data synchronization between environments.

PostgreSQL - Maintenance Plan: Autovacuum Tuning and Configuration Updates

Overview

This document outlines the planned maintenance activities for OLTP and OLAP PostgreSQL environments. The tasks include dropping obsolete tables, updating autovacuum settings at table and database levels, and applying memory and configuration changes where required.

 All database, schema, and table names have been kept generic for reuse across environments.

Maintenance Scope

1. Dropping old or unused tables in both OLTP and OLAP environments.

2. Updating autovacuum settings in OLTP and OLAP databases.

3. Updating memory and other configuration parameters in OLAP environment.

4. Reloading configuration after applying changes.

Autovacuum Parameters Explained

autovacuum_vacuum_scale_factor:

This parameter defines the fraction of table size (in terms of number of tuples) that must be updated or deleted before an autovacuum is triggered. Setting it to 0.0 disables percentage-based triggering, meaning only the threshold value will control when vacuum runs.

 autovacuum_vacuum_threshold:

This specifies the minimum number of dead tuples that must accumulate before autovacuum starts. Lower values cause vacuum to run more frequently, which is useful for high-write tables to prevent bloat.

 autovacuum_analyze_scale_factor:

Similar to vacuum scale factor, but used for ANALYZE operations. It determines how much data change must occur before PostgreSQL updates table statistics used by the query planner.

autovacuum_analyze_threshold:

Minimum number of inserted, updated, or deleted rows required before auto-analyze runs. Lower values ensure statistics stay fresh, improving query plans.

Why Set Scale Factor to 0.0?

For very large tables, percentage-based triggering can delay vacuum significantly. Setting the scale factor to 0.0 ensures autovacuum runs based only on a fixed threshold, providing predictable maintenance behavior.
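The trigger condition PostgreSQL evaluates is: vacuum runs once dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × reltuples (the analyze pair works the same way). A small Python sketch shows why the scale factor dominates on big tables:

```python
def vacuum_trigger_point(reltuples: int, scale_factor: float, threshold: int) -> float:
    """Dead-tuple count at which autovacuum starts vacuuming a table."""
    return threshold + scale_factor * reltuples

# Default settings (scale 0.2, threshold 50) on a 10M-row table:
# vacuum waits for roughly 2 million dead tuples.
print(vacuum_trigger_point(10_000_000, 0.2, 50))       # 2000050.0
# scale_factor = 0.0 turns the trigger into a fixed, predictable count.
print(vacuum_trigger_point(10_000_000, 0.0, 500_000))  # 500000.0
```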

Autovacuum Configuration — OLTP Environment (Table Level)

ALTER TABLE schema_name.table_name

SET (

    autovacuum_vacuum_scale_factor = 0.0,

    autovacuum_vacuum_threshold = 500000,

    autovacuum_analyze_scale_factor = 0.0,

    autovacuum_analyze_threshold = 250000

);

Example for High-Volume Table

ALTER TABLE schema_name.large_transaction_table

SET (

    autovacuum_vacuum_scale_factor = 0.0,

    autovacuum_vacuum_threshold = 2000000,

    autovacuum_analyze_scale_factor = 0.0,

    autovacuum_analyze_threshold = 1000000

);

Autovacuum Configuration — OLTP Environment (Database Level)

ALTER DATABASE database_name SET autovacuum_vacuum_scale_factor = 0.0;

ALTER DATABASE database_name SET autovacuum_vacuum_threshold = 25000;

ALTER DATABASE database_name SET autovacuum_analyze_scale_factor = 0.0;

ALTER DATABASE database_name SET autovacuum_analyze_threshold = 1000;

Note: ALTER DATABASE ... SET takes a plain parameter = value, without the parentheses used in the ALTER TABLE storage-parameter form. Also be aware that the autovacuum_* parameters are server-level (sighup-context) settings, and PostgreSQL may reject them at the database level; in that case, change the defaults instance-wide via ALTER SYSTEM or postgresql.conf and rely on table-level storage parameters for per-table tuning.

Autovacuum Configuration — OLAP Environment (Table Level)

ALTER TABLE schema_name.reporting_table

SET (

    autovacuum_vacuum_scale_factor = 0.0,

    autovacuum_vacuum_threshold = 2500000,

    autovacuum_analyze_scale_factor = 0.0,

    autovacuum_analyze_threshold = 1000000

);

Additional OLAP Tables Example

ALTER TABLE schema_name.analytics_table

SET (

    autovacuum_vacuum_scale_factor = 0.0,

    autovacuum_vacuum_threshold = 500000,

    autovacuum_analyze_scale_factor = 0.0,

    autovacuum_analyze_threshold = 750000

);

Autovacuum Configuration — OLAP Environment (Database Level)

ALTER DATABASE database_name SET autovacuum_vacuum_scale_factor = 0.0;

ALTER DATABASE database_name SET autovacuum_vacuum_threshold = 500;

ALTER DATABASE database_name SET autovacuum_analyze_scale_factor = 0.0;

ALTER DATABASE database_name SET autovacuum_analyze_threshold = 250;

Reload PostgreSQL Configuration

SELECT pg_reload_conf();

Best Practices

• Perform changes during a maintenance window.

• Monitor autovacuum activity after applying new settings.

• Validate table bloat and dead tuples before and after changes.

• Adjust thresholds based on workload characteristics.

• Ensure sufficient I/O capacity before aggressive vacuum tuning.

Conclusion

Proper autovacuum tuning is essential for maintaining database performance, especially in high-write OLTP systems and large OLAP workloads. Applying table-specific thresholds combined with database-level defaults helps prevent table bloat, improves query performance, and ensures consistent maintenance behavior.



PostgreSQL - Table Migration Using CSV (Legacy to Current Environment)

 Overview

This document describes the step-by-step process to extract data from a legacy PostgreSQL environment, transfer it to a new production environment, and load it into an audit table. The migration includes handling a range datatype (tstzrange) for a time period column.

Source Requirement

Extract the following columns from the legacy table:

• id

• modified_date

• created_date

• organization_id

• template_id

• time_template_id

• time_period

Step 1: Export Required Columns to CSV

\copy (SELECT id, modified_date, created_date, organization_id, template_id, time_template_id, time_period FROM schema_name.source_table) TO '/opt/data_backup/data_audit.csv' CSV HEADER;

Step 2: Compress the Export File

tar -czf /opt/data_backup/data_audit.tar.gz -C /opt/data_backup data_audit.csv

Step 3: Copy File to New Production Server

Copy the compressed file from the legacy server to the new production server using a secure file transfer method such as scp or rsync.

Step 4: Unzip the File on the Target Server

tar -xzf /opt/data_backup/data_audit.tar.gz -C /opt/data_backup

Step 5: Validate Column Data Types in Source Table

Before creating the target table, verify the column data types in the source database to ensure compatibility during import.

Step 6: Create Audit Table in Target Environment

CREATE TABLE generic_tstzrange_audit (

    id BIGINT,

    modified_date TIMESTAMPTZ,

    created_date TIMESTAMPTZ,

    organization_id BIGINT,

    template_id BIGINT,

    time_template_id BIGINT,

    time_period tstzrange

);

Step 7: Import CSV Data into Audit Table

\copy generic_tstzrange_audit FROM '/opt/data_backup/data_audit.csv' CSV HEADER;
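When validating the file, note that a tstzrange column travels through the CSV as a text literal such as ["2024-01-01 00:00:00+00","2024-02-01 00:00:00+00"), where a bracket marks an inclusive bound and a parenthesis an exclusive one. A sketch of pulling the literal apart in Python (assuming the CSV-level quoting has already been stripped, e.g. by the csv module):

```python
def parse_tstzrange(literal: str):
    """Split a tstzrange text literal into (lower, upper, lower_inc, upper_inc)."""
    lower_inc = literal[0] == "["            # '[' inclusive, '(' exclusive
    upper_inc = literal[-1] == "]"
    lower, upper = literal[1:-1].split(",")  # timestamp bounds contain no commas
    return lower.strip('"'), upper.strip('"'), lower_inc, upper_inc

r = parse_tstzrange('["2024-01-01 00:00:00+00","2024-02-01 00:00:00+00")')
print(r)  # ('2024-01-01 00:00:00+00', '2024-02-01 00:00:00+00', True, False)
```

This kind of spot check helps confirm timezone consistency before the import.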

Best Practices

• Ensure sufficient disk space before export.

• Validate row counts before and after migration.

• Use transactions where applicable.

• Confirm timezone consistency when working with TIMESTAMPTZ and tstzrange types.

• Perform the activity during a maintenance window if the dataset is large.

Conclusion

This approach provides a reliable way to audit and migrate data between PostgreSQL environments while preserving complex datatypes such as tstzrange. Proper validation at each step ensures data integrity and minimizes risk.

PostgreSQL - Bulk Updating PostgreSQL Table Using CSV and Temporary Table


Introduction

In this document, we walk through a practical approach to bulk update records in a PostgreSQL table using a CSV file. This method is particularly useful when dealing with large datasets (millions of rows) where direct updates may be inefficient or error-prone.

 The scenario involves updating columns in a target table using data provided in a CSV file.

Step 1: Create a Temporary Staging Table

We first create a temporary table to load the CSV data. Temporary tables are session-specific and automatically dropped when the session ends, making them safe for intermediate operations.

 CREATE TEMP TABLE tmp_target_table (

    id bigint,

    column_a varchar(45),

    column_b bigint

);

Step 2: Load CSV Data Using \copy Command

Since the CSV file resides on the client machine, we use the \copy command in psql, which reads the file from the client side.

 \copy tmp_target_table

FROM 'C:/path/to/your/file.csv'

DELIMITER ','

CSV HEADER;

Step 3: Validate Loaded Data

Always verify the data count after loading.

 

SELECT count(*) FROM tmp_target_table;

Step 4: Verify Matching Records with Target Table

Before performing updates, ensure that all IDs exist in the target table.

 

SELECT count(*)

FROM tmp_target_table s

JOIN schema_name.target_table t

ON t.id = s.id;

 

If needed, check for any missing records:

 

SELECT s.*

FROM tmp_target_table s

LEFT JOIN schema_name.target_table t

ON t.id = s.id

WHERE t.id IS NULL;

Step 5: Perform Bulk Update

Now update the target table using the staging table.

 

UPDATE schema_name.target_table t

SET

    column_a = s.column_a,

    column_b = s.column_b

FROM tmp_target_table s

WHERE t.id = s.id;
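The UPDATE ... FROM join touches only rows whose id appears in the staging table; everything else is left untouched, and staging rows with no match (the ones found in Step 4) are simply ignored. In miniature, with plain Python dicts standing in for the two tables:

```python
# Target table keyed by id; staging rows as loaded from the CSV.
target = {1: {"column_a": "old", "column_b": 10},
          2: {"column_a": "old", "column_b": 20}}
staging = [{"id": 2, "column_a": "new", "column_b": 99},
           {"id": 3, "column_a": "new", "column_b": 77}]  # id 3 has no match

for row in staging:
    if row["id"] in target:  # mirrors WHERE t.id = s.id
        target[row["id"]]["column_a"] = row["column_a"]
        target[row["id"]]["column_b"] = row["column_b"]

print(target)  # id 1 unchanged, id 2 updated, id 3 never inserted
```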

Step 6: Post-Update Validation

Finally, confirm that updates were applied correctly.

 

SELECT count(*)

FROM schema_name.target_table t

JOIN tmp_target_table s

ON t.id = s.id;

Key Benefits of This Approach

• Efficient for millions of records

• Minimal locking compared to row-by-row updates

• Easy validation before applying changes

• Safe rollback if wrapped inside a transaction

Best Practices

• Always test in a lower environment first.

• Wrap the update inside a transaction for safety:

BEGIN;

-- update statement

COMMIT;

• Consider indexing the staging table if join performance is slow:

CREATE INDEX idx_tmp_target_table_id ON tmp_target_table(id);

• Run VACUUM ANALYZE after large updates if needed.

Conclusion

Using a temporary table with the \copy command is one of the most reliable and performant ways to perform bulk updates in PostgreSQL. This method ensures data integrity while handling large volumes efficiently.