Streaming Replication
Set up a high available PostgreSQL
setup using streaming replication and Pgpool-II
for high availability and fail over.
End Goal
The diagram below shows the set-up that will be configured in this guide.
In a first step streaming replication will be setup up, in a second step Pgpool-II
will be used for both load balancing and high availability.
//ToDo: Add Diagram
Setup
For the steps describe here the following environment will be set up:
Name | IP Address | Initial Role |
---|---|---|
db01.nomadesk.org | 192.168.122.10 | primary |
db02.nomadesk.org | 192.168.122.11 | standby |
db03.nomadesk.org | 192.168.122.12 | standby |
db.nomadesk.org | 192.168.122.20 | high available virtual IP |
The roles describe above are not fixed except for the virtual IP
The primary and standby role can change when a fail over event occurs.
The role only matters for the initial setup.
It is recommended to set up three identical machines.
When a fail over occurs each needs to be able to become the primary in the setup.
Additionally when setting up Pgpool-II
load balancing will be configured so that read queries will be done on the standby nodes and write actions on the primary.
The virtual IP
is what will make sure that even if the primary
node goes down, the cluster will stay up and running.
In the sections below next to the title it will be mentioned on what server to run the commands.
Configuring the Base OS [All]
For this guide AlmaLinux 9 will be used but can easily be modified for any other distribution.
The steps assume that the starting point is a AlmaLinux 9 minimal installation.
SELinux [All]
SELinux is out of scope for this guide where we want to focus on setting up a high available PostgreSQL
setup.
To disable SELinux:
sudo sed -i 's/enforcing/disabled/g' /etc/selinux/config
sudo setenforce 0
Firewall [All]
Configure the firewall:
sudo firewall-cmd --zone=public --add-port=5432/tcp --permanent
sudo firewall-cmd --reload
In later steps the ports for Pgpool-II
will be opened, the first step is to have the streaming replication up and running.
Install PostgreSQL
[All]
Add the EPEL
and PostgreSQL
repositories:
sudo dnf install -y elrepo-release
Most EPEL
packages require packages from PowerTools/CRB
repository as dependencies. Run the suitable command to enable the PowerTools/CRB
repository:
sudo dnf config-manager --set-enabled crb
Add the PostgreSQL
repository:
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
Make sure PostgreSQL
does not update automatically:
sudo dnf -qy module disable postgresql
Now install PostgreSQL
, Pgpool-II
will also be installed as this is used in the next section to set up high availability:
sudo dnf install -y postgresql16-server postgresql16 pgpool-II-pg16-*
#install some tools used in this guide
sudo dnf install -y vim rsync
Create the directory /var/lib/pgsql/archivedir
to store WAL
segments on all servers, this will be used in the postgres.conf
in a later step.
mkdir -p /var/lib/pgsql/archivedir
chown postgres:postgres /var/lib/pgsql/archivedir
Setting up the Primary PostgreSQL
server [db01]
Initialize the database on the primary (db01) node:
sudo /usr/pgsql-16/bin/postgresql-16-setup initdb
sudo systemctl enable --now postgresql-16
It is not needed to initialize the databases on the other nodes as these will be copied from the primary node with pg_basebackup
.
Setting up Replication [db01]
On the primary server (db01), modify the postgres.conf
file in /var/lib/pgsql/16/data
:
vim /var/lib/pgsql/16/data/postgresql.conf
At the bottom, add the following:
listen_addresses = '*'
archive_mode = on
archive_command = 'cp "%p" "/var/lib/pgsql/archivedir/%f"'
max_wal_senders = 10
wal_keep_size=256
wal_level = replica
wal_log_hints = on
max_replication_slots = 10
hot_standby = on
Create the replication user, by switching to the postgres
user and running the CREATE USER
query:
su - postgres
Log into the database:
psql
The query to create the user is:
SET password_encryption = 'scram-sha-256';
CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD '<my password>';
Note*: Make sure to update the password in the query.
Allow access from the secondary servers by modifying the pg_hba.conf
file:
vim /var/lib/pgsql/16/data/pg_hba.conf
At the bottom, add the following:
host replication replicator 192.168.122.10/24 trust #trust db01 without password
host replication replicator 192.168.122.11/24 trust #trust db02 without password
host replication replicator 192.168.122.12/24 trust #trust db03 without password
host all all 0.0.0.0/0 scram-sha-256 #enable password authentication to this host from anywhere else
Make sure to update the IP
addresses to match your own setup.
Restart the PostgreSQL
server to apply all the new settings:
sudo systemctl restart postgresql-16
Configuring the Standby PostgreSQL
Servers [db02, db03]
Mirror the state of the primary server it's database:
Switch to the postgres
user:
su - postgres
pg_basebackup -h 192.168.122.10 -U replicator -Fp -Xs -P -R -D /var/lib/pgsql/16/data/
The pg_basebackup
command uses streaming replication to mirror the databases.
Exit the postgres
user and start the databases and enable it on boot:
sudo systemctl enable --now postgresql-16
Testing the Replication
Insert Data [db01]
Log into the database on the primary database server and create a table:
su - postgres
psql
CREATE TABLE cars (
brand VARCHAR(255),
model VARCHAR(255),
year INT
);
Insert some data:
INSERT INTO cars (brand, model, year) VALUES ('Ford', 'Mustang', 1964);
Read the Data [db02,db03]
On the standby servers also log into the database and run a select query.
su - postgres
Now run the query:
psql -c "select * from cars;"
Both of the secondary servers should return:
[postgres@secondary01 data]$ psql -c "select * from cars;"
brand | model | year
-------+---------+------
Ford | Mustang | 1964
Conclusion
Streaming replication is now set up and you have a backup in case anything goes wrong with the primary and/or standby nodes.
This does not mean the setup is high available yet as the connection string used in the application would be that of the primary only.
In the next step Pgpool-II
will be used to set up the high availability.