
Monday, December 30, 2024

STEP BY STEP - HOW TO SETUP PATRONI WITH ETCD AND HAPROXY


I have taken two CentOS EC2 instances:

Machine    Public IP        Private IP
VM1        3.91.208.95      172.31.95.231
VM2        54.146.6.100     172.31.90.199

 

Step 1 – ETCD INSTALLATION AND CONFIGURATION

- Install etcd on VM1 and VM2.

# sudo yum install etcd

- Edit the configuration file on each node at /etc/etcd/etcd.conf.

Below is the conf file for vm1 after editing. Make sure to uncomment the parameters as shown below and set the appropriate values. To add another node to the cluster, add its IP to the ETCD_INITIAL_CLUSTER parameter.

 [root@ip-172-31-95-231 etcd]# cat etcd.conf

ETCD_DATA_DIR="/var/lib/etcd/default.etcd"

ETCD_LISTEN_PEER_URLS="http://172.31.95.231:2380"

ETCD_LISTEN_CLIENT_URLS="http://172.31.95.231:2379"

ETCD_NAME="vm1"

#[Clustering]

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.31.95.231:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://172.31.95.231:2379"

ETCD_INITIAL_CLUSTER="vm1=http://172.31.95.231:2380,vm2=http://172.31.90.199:2380"

  =============Conf file for vm2=============

 [root@ip-172-31-90-199 etcd]# cat etcd.conf

ETCD_DATA_DIR="/var/lib/etcd/default.etcd"

ETCD_LISTEN_PEER_URLS="http://172.31.90.199:2380"

ETCD_LISTEN_CLIENT_URLS="http://172.31.90.199:2379"

ETCD_NAME="vm2"

#[Clustering]

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.31.90.199:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://172.31.90.199:2379"

ETCD_INITIAL_CLUSTER="vm1=http://172.31.95.231:2380,vm2=http://172.31.90.199:2380"

#ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"

 Step 2 – Start ETCD on VM1 and VM2

# sudo systemctl start etcd

Check the status of ETCD after starting it.
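For example, to check the status (and optionally enable etcd to start at boot):

# sudo systemctl status etcd
# sudo systemctl enable etcd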

 

Step 3 – Check the cluster member from VM1 and VM2

VM1-

# ETCDCTL_API=3 etcdctl member list --write-out=table --endpoints=http://172.31.95.231:2379

VM2 –

# ETCDCTL_API=3 etcdctl member list --write-out=table --endpoints=http://172.31.90.199:2379


Step 4 - Install PostgreSQL 12 on VM1 and VM2

# sudo yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

# sudo yum -y install epel-release yum-utils

#sudo yum-config-manager --enable pgdg12

#sudo yum install postgresql12-server postgresql12 postgresql12-contrib

Step 5 - Install Python package manager on VM1 and VM2

# sudo yum install python-pip

Step 6 - Upgrade Python package manager VM1 and VM2

# sudo pip install --upgrade "pip < 21.0"

Step 7 - Install Python setuptools

#sudo pip install --upgrade setuptools

Step 8 - Install PostgreSQL database adapter for the Python

#sudo pip install psycopg2-binary

Step 9 - Install Python library to access installed packages metadata

# sudo pip install importlib-metadata

Step 10 – Install Python wheel

# sudo pip install wheel

Step 11 - Install Patroni

# sudo pip install patroni

Step 12 - Install Python etcd client

# sudo pip install python-etcd

Step 13 – Configure the Patroni config file on VM1 and VM2.

I have created the config file named pg.yml under /var/lib/pgsql/ on both nodes. The file is YAML, so indentation matters; if it is not formatted properly, Patroni will throw lots of errors. The layout below shows how it should look.

Pg.yml in VM1 –
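A minimal sketch of what pg.yml on VM1 could look like for this setup. The scope name, replication user, and passwords are placeholders of my own; the listen addresses, data_dir, bin_dir, and unix socket directory match the startup log shown in Step 16. Adjust to your environment.

scope: pg_cluster                     # placeholder cluster name - must be identical on both nodes
name: vm1                             # unique per node

restapi:
  listen: 172.31.95.231:8008
  connect_address: 172.31.95.231:8008

etcd:
  host: 172.31.95.231:2379            # local etcd member

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
  initdb:
    - encoding: UTF8
  pg_hba:
    - host replication replicator 172.31.0.0/16 md5
    - host all all 0.0.0.0/0 md5

postgresql:
  listen: 172.31.95.231:5432
  connect_address: 172.31.95.231:5432
  data_dir: /tmp/data
  bin_dir: /usr/pgsql-12/bin
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator            # placeholder user/password
      password: replicator-pass
    superuser:
      username: postgres
      password: postgres-pass         # placeholder password
  parameters:
    unix_socket_directories: '/tmp'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false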

                               

pg.yml  in VM2 –
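pg.yml on VM2 follows the same sketch, with name: vm2, the restapi and postgresql addresses set to 172.31.90.199, and etcd pointing at its local member (172.31.90.199:2379).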


                                    


Step 14 – Change the ownership of the Patroni config file (pg.yml) to postgres:postgres

#sudo chown postgres:postgres /var/lib/pgsql/pg.yml

Step 15 – Log in as the postgres user and create the directory for Patroni on VM1 and VM2.

# su - postgres

# mkdir /var/lib/pgsql/patroni

Step 16 – Start Patroni on VM1 and VM2. Run the command below as the postgres user.

===============From VM1 output should be as below============

-bash-4.2$ patroni pg.yml

The files belonging to this database system will be owned by user "postgres".

This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".

The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /tmp/data ... ok

creating subdirectories ... ok

selecting dynamic shared memory implementation ... posix

selecting default max_connections ... 100

selecting default shared_buffers ... 128MB

selecting default time zone ... UTC

creating configuration files ... ok

running bootstrap script ... ok

performing post-bootstrap initialization ... ok

syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections

You can change this by editing pg_hba.conf or using the option -A, or

--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/pgsql-12/bin/pg_ctl -D /tmp/data -l logfile start

 2022-04-21 10:58:56.886 UTC [29955] LOG:  starting PostgreSQL 12.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

2022-04-21 10:58:56.886 UTC [29955] LOG:  listening on IPv4 address "172.31.95.231", port 5432

2022-04-21 10:58:56.890 UTC [29955] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"

2022-04-21 10:58:56.907 UTC [29955] LOG:  redirecting log output to logging collector process

2022-04-21 10:58:56.907 UTC [29955] HINT:  Future log output will appear in directory "log".

172.31.95.231:5432 - rejecting connections

172.31.95.231:5432 - accepting connections

^C-bash-4.2$

 ==================From VM2 output should be as below========

-bash-4.2$ patroni pg.yml

172.31.90.199:5432 - no response

2022-04-21 11:24:11.514 UTC [4032] LOG:  starting PostgreSQL 12.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

2022-04-21 11:24:11.514 UTC [4032] LOG:  listening on IPv4 address "172.31.90.199", port 5432

2022-04-21 11:24:11.517 UTC [4032] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"

2022-04-21 11:24:11.530 UTC [4032] LOG:  redirecting log output to logging collector process

2022-04-21 11:24:11.530 UTC [4032] HINT:  Future log output will appear in directory "log".

172.31.90.199:5432 - accepting connections

172.31.90.199:5432 - accepting connections

-bash-4.2$

Step 17 – Check the status of Patroni from VM1 and VM2.

# systemctl status patroni
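Checking with systemctl assumes Patroni has been set up as a systemd service. A minimal sketch of such a unit file (/etc/systemd/system/patroni.service, with an assumed install path for the patroni binary) could be:

[Unit]
Description=Patroni PostgreSQL HA
After=network.target etcd.service

[Service]
Type=simple
User=postgres
Group=postgres
# adjust ExecStart to wherever pip installed patroni (check with: which patroni)
ExecStart=/usr/local/bin/patroni /var/lib/pgsql/pg.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target

Run systemctl daemon-reload and systemctl start patroni before checking the status.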



VM2 –


Step 18 – Check if all the cluster nodes are part of Patroni configuration.

VM1 –

# patronictl -c /var/lib/pgsql/pg.yml list


VM2 –

#patronictl -c /var/lib/pgsql/pg.yml list


Step 19 – Check that the Postgres cluster has started on VM1 and VM2 (I had to manually start the cluster on VM2). Then run the query below to check which node is the primary: pg_is_in_recovery() returning 'f' means the node is the primary; 't' means it is a replica.

-bash-4.2$ ./psql -p 5432 -h 172.31.95.231 -U postgres postgres

psql (12.10)

postgres=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 f

VM2 -

-bash-4.2$ ./psql -p 5432 -h 172.31.90.199 -U postgres postgres

psql (12.10)

Type "help" for help.

postgres=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

Step 20 – Perform failover testing

[root@ip-172-31-95-231 ~]# patronictl -c /var/lib/pgsql/pg.yml failover

Candidate ['vm2'] []: vm2



Step 21 – Install HAProxy on VM1

#sudo yum install haproxy

Step 22 – Configure SELinux to allow HAProxy to connect to TCP port 5432

# sudo setsebool -P haproxy_connect_any 1

Step 23 – Configure the HAProxy config file, located at /etc/haproxy/haproxy.cfg.

Modify this configuration file as shown below.
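A minimal sketch of the kind of haproxy.cfg used with Patroni. It assumes the Patroni REST API is on port 8008 on both nodes (the default, and what the pg.yml sketch above uses) and exposes the current primary on port 5000, which matches the psql test in Step 25. The stats listener and timeouts are placeholders.

global
    maxconn 100

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen postgres_write
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    # health check against the Patroni REST API; only the primary returns 200
    server vm1 172.31.95.231:5432 maxconn 100 check port 8008
    server vm2 172.31.90.199:5432 maxconn 100 check port 8008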

  

Step 24 - Start the HAProxy server and check the status

#systemctl start haproxy

# systemctl status haproxy

     


             

Step 25 - Check by connecting through HAProxy; it should route psql to the current primary.

-bash-4.2$ ./psql -p 5000 -h 172.31.95.231 -U postgres postgres

psql (12.10)

Type "help" for help.

 

postgres=#  select inet_server_addr();

 inet_server_addr

------------------

 172.31.95.231

(1 row)

VM2 is my current primary –


 ====================THE END=======================

 

Hypothetical index in Postgres

A hypothetical index in PostgreSQL is a special type of index that doesn't actually exist physically on disk but is "imagined" or "simulated" by PostgreSQL. This allows you to evaluate the potential benefits of an index without incurring the cost of creating it.

Hypothetical indexes are used primarily for performance tuning and query optimization. You can analyze how a query would behave if a particular index were to exist, helping you decide whether creating the index is worthwhile.

#Install the extension

CREATE EXTENSION hypopg ;

\dx

Create a hypothetical index

 CREATE TABLE hypo (id integer, val text) ;

INSERT INTO hypo SELECT i, 'line ' || i FROM generate_series(1, 100000) i ;

VACUUM ANALYZE hypo ;

EXPLAIN SELECT val FROM hypo WHERE id = 1;

SELECT * FROM hypopg_create_index('CREATE INDEX ON hypo (id)') ;

EXPLAIN SELECT val FROM hypo WHERE id = 1;

EXPLAIN ANALYZE SELECT val FROM hypo WHERE id = 1;

  

Manipulate hypothetical indexes

SELECT * FROM hypopg_list_indexes ;

SELECT * FROM hypopg() ;

SELECT indexname, hypopg_get_indexdef(indexrelid) FROM hypopg_list_indexes ;

SELECT hypopg_create_index('CREATE INDEX ON hypo (id, val text_pattern_ops)');


EXPLAIN SELECT * FROM hypo

WHERE id < 1000 and val LIKE 'line 100000%';

 

How to do index size estimation?

SELECT indexname,pg_size_pretty(hypopg_relation_size(indexrelid))

FROM hypopg();

 

Test for size validation:

CREATE INDEX ON hypo (id);

CREATE INDEX ON hypo (id, val text_pattern_ops);

SELECT relname,pg_size_pretty(pg_relation_size(oid))

FROM pg_class WHERE relkind = 'i' AND relname LIKE '%hypo%';

  


Concurrent indexes in Postgres

A concurrent index in PostgreSQL refers to an index that is created or rebuilt in a way that allows normal operations (INSERT, UPDATE, DELETE, SELECT) to continue on the table during the process. This feature is especially useful in production environments where downtime or blocking operations need to be minimized.

 Demo -

 create table demo_table ( id int, name varchar(100));

insert into demo_table

select id.*, md5(id::varchar) from generate_series(1,200000) id;

create index idx_id on demo_table(id);

delete from demo_table where id='100000';

Concurrent index demo

drop index idx_id;

create index concurrently idx_id on demo_table(id);

  

Note: if the creation fails, the index still exists (marked INVALID) and needs to be dropped:

 

\d demo_table

explain select * from demo_table where id= 100000;

select relpages from pg_class where relname = 'idx_id';

Note - When a create index operation fails in concurrent mode, make sure that you drop the invalid index immediately.
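A quick way to spot such leftovers (a sketch, not from the original demo):

-- list indexes that are not valid, e.g. left over from a failed CREATE INDEX CONCURRENTLY
SELECT c.relname AS index_name
FROM pg_class c
JOIN pg_index i ON i.indexrelid = c.oid
WHERE NOT i.indisvalid;

-- then drop the leftover
DROP INDEX idx_id;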

One more thing to keep in mind: when you create an index concurrently and another session is already modifying the data, the create index command waits until that other transaction completes.
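A sketch of how to see this with two psql sessions (the inserted row value is arbitrary):

-- Session 1: open a transaction that modifies demo_table and leave it open
BEGIN;
INSERT INTO demo_table VALUES (200001, md5('200001'));

-- Session 2: this blocks until session 1 commits or rolls back
CREATE INDEX CONCURRENTLY idx_id ON demo_table (id);

-- Session 1: once this commits, the index build in session 2 proceeds
COMMIT;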