This repository contains code for the setup of a Postgresql Database Cluster with a Master server and 2 replicas. It also uses HAProxy to implement loadbalancing and monitoring within the cluster.
The following technologies were used in the set up:
- Google Cloud: The whole infrastructure for the project is built on Google Cloud.
- Terraform: Terraform makes it easy to spin up and tear down complex infrastructure. It does this through Infrastructure As Code (IAC). The infrastructure for this project is built with Terraform.
- Postgres: Postgres is an extensible Object Relation Database Management System (ORDBMS). The databases used in this project are Postgres databases.
- HAProxy: HAProxy is a free, fast, and efficient open source software that provides high availability loadbalancing and proxy servers for TCP and HTTP-based applications. HAProxy is used to loadbalance and monitor the cluster.
- Create an account on Google Cloud. Skip this step if you already have an account
- Go to https://console.cloud.google.com
- Click on the drop down shown in the image below
- From the modal that pops up, copy the project ID of the project you want to create the instance in and paste it somewhere. For a new Google Cloud account, a default project is automatically created for you. You can also create a new project by clicking on the NEW PROJECT button in the top right corner of the screen.
a. Click on the menu icon > APIs and Services > Credentials
b. Click on Create credentials > Service account key
c. Click the Service account dropdown, then select New service account
d. Give the service account a name. Select Project > Owner for role; this would give that service account full access. Leave the JSON key type selected then click on the Create button to download the service key.
Go here for instructions on how to install Terraform. Proceed to the next stage once you have Terraform installed.
As stated in the Technology Stack section, Terraform is used to build the infrastructure for this project. There are 4 instances; 1 Master DB, 2 Replica DBs, and 1 NAT instance which doubles as the HAProxy server. Follow the steps below to build the infrastructure.
- Clone this project if you haven't done so already by running
git clone https://github.com/baasbank/output2.git
- Change directory into the terraform folder by running
cd output2/terraform
-
In the provider section of the
main.tf
file, replacekey.json
in credentials with the path to the Service account key downloaded from the Google Cloud Setup section above. -
In the
variables.tf
file, replace the current project name with the one you copied from Step 4 of the Google Cloud setup above. -
Make sure you are in the terraform directory, then run the following
terraform init
terraform plan
terraform apply
Running the last command produces a prompt. Enter yes
.
-
Go back to https://console.cloud.google.com
-
Click on the menu icon > Compute Engine > VM instances
You should see the instances you just created, similar to the image below.
As a matter of best practices, the databases are placed in a private subnet and do not have external IP addresses. The NAT(Network Address Translation) instance however, even though it is in the same VPC(Virtual Private Cloud), is in the public subnet. Hence, the only way to access the database instances is through the NAT instance. Follow the steps below to set up the replica databases and HAProxy.
There is no setup necessary for the MasterDB. The master_db_setup.sh
script gets executed when the machine starts, and does the necessary setup on the MasterDB.
More information about the workings of this is available here.
- From the list of instances, click on the
SSH
button for the nat-instance. This will open a new ssh connection to the nat-instance in the browser.
- Once you're in the nat-instance, run the following command to connect to replicationdb1.
gcloud beta compute ssh --internal-ip replicationdb1
- Press
enter
at the prompts for passphrase, then typeY
at the prompt for zone confirmation. Wait for a second a two, and it takes you intoreplicationdb1
.
- Clone this repository by running
git clone https://github.com/baasbank/output2.git
- Change into the output2 directory by running
cd output2
- Execute the
replica_db_setup1.sh
script by running
. replica_db_setup1.sh
- Run the following command to take a backup of the MasterDB. Make sure to replace
10.0.0.4
with the IP address for your masterdb
sudo su postgres -c "pg_basebackup -h 10.0.0.4 -D /var/lib/postgresql/9.6/main -P -U replication --xlog-method=stream;"
Running the above command prompts for a password. This is the replication user password which was set in master_db_setup.sh
, which in this case is password
. Type in the password and press Enter.
A backup of the MasterDB is created as shown in the image below
- Execute the
replica_db_setup2.sh
script by running
. replica_db_setup2.sh
Repeat the above steps for replicaDB2. Be sure to change the name of the DB you're connecting to in step 2 to replicationdb2
As stated below, the NAT instance doubles as the HAProxy server. Follow the steps below to set up HAProxy for loadbalancing and monitoring.
- From the list of instances, click on the
SSH
button for the nat-instance. This will open a new ssh connection to the nat-instance in the browser.
- Clone this repository by running
git clone https://github.com/baasbank/output2.git
- Change into the output2 directory by running
cd output2
- Edit the
haproxy_setup.sh
script to make sure the IP addresses for the MasterDB and the ReplicaDBs match what you have on Google Cloud. The edit should be made in thelisten postgres
section of the file as shown below. Run the following command to open the file.
sudo vim haproxy_setup.sh
When you're done editing, save and exit the file.
- Execute the
haproxy_setup.sh
script by running
. haproxy_setup.sh
- Copy the external IP address of the nat-instance, and open that in the browser. Ensure you use http and you specify port 7000. You should see the HAProxy Statistics Report Page, showing the status of the loadbalanced Databases. The Report Page address for my setup is http://35.246.134.67:7000/ . Below is an image of the page.