2 Node Cluster

Overview

In this scenario, we are going to set up two Storware Backup & Recovery servers in High Availability, Active/Passive mode. This is achieved using Pacemaker and Corosync, so at least a basic understanding of these tools is highly desirable. This how-to is intended for RPM-based systems such as Red Hat / CentOS. If you run Storware Backup & Recovery on a different OS, you may need to refer to your distribution docs.

Our environment is built on the following assumptions:

  1. node1 - first Storware Backup & Recovery server + Storware Backup & Recovery node, IP: 10.41.0.4

  2. node2 - second Storware Backup & Recovery server + Storware Backup & Recovery node, IP: 10.41.0.5

  3. Cluster IP: 10.41.0.10 - We will use this IP to connect to our active Storware Backup & Recovery service. This IP will float between our servers and will always point to the active instance.

  4. MariaDB master <-> master replication

Make sure to run all of the commands with administrative privileges. For simplicity, the following commands will be executed as root.

HA cluster setup

Preparing the environment

  1. Stop and disable the Storware Backup & Recovery server, node and database as the cluster will manage these resources.
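
For example, assuming the default systemd unit names vprotect-server, vprotect-node and mariadb (adjust if your units are named differently):

```bash
# stop the services now and keep them from starting at boot -
# from now on the cluster will start them on the active node
systemctl stop vprotect-server vprotect-node mariadb
systemctl disable vprotect-server vprotect-node mariadb
```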

  2. Enable HA repo:
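
The repository name varies between distributions and releases (e.g. ha, HighAvailability or highavailability), so adjust the example below accordingly:

```bash
# CentOS / Rocky / AlmaLinux
dnf config-manager --set-enabled ha

# RHEL with an active subscription (repo id depends on your release and architecture)
subscription-manager repos --enable=rhel-8-for-x86_64-highavailability-rpms
```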

  3. Use yum to check for pending updates:
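
For example:

```bash
yum check-update
```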

  4. Check the hosts file /etc/hosts, as you might find an entry such as:
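
For instance, a line mapping the node's own hostname to the loopback address (the exact form may differ on your system):

```
127.0.0.1   node1 localhost localhost.localdomain
```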

    Delete it, as this prevents the cluster from functioning properly (your nodes will not "see" each other) and add entries of your two nodes:

    In this case, we will add:
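
With the example environment used in this guide, the entries are:

```
10.41.0.4    node1
10.41.0.5    node2
```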

Installation

Run these commands on both servers

  1. On both servers, install the required cluster packages:
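
A typical package set looks like this (exact package names may differ slightly between releases):

```bash
yum install -y pcs pacemaker corosync fence-agents-all
```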

  2. Add a firewall rule to allow HA traffic - TCP ports 2224, 3121, and 21064, and UDP port 5405 (both servers)
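
With firewalld you can either open the individual ports or enable the predefined high-availability service, which covers them all. For example:

```bash
firewall-cmd --permanent --add-port=2224/tcp --add-port=3121/tcp --add-port=21064/tcp --add-port=5405/udp
# or, alternatively:
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload
```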

  3. (Optional) While testing, depending on your environment, you may encounter problems related to network traffic, permissions, etc. While it might be a good idea to temporarily disable the firewall and SELinux, we do not recommend disabling these mechanisms in a production environment, as this creates significant security issues. If you choose to disable the firewall, bear in mind that Storware will no longer be available on ports 80/443. Instead, connect to ports 8080/8181 respectively.
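
If you decide to do this in a test environment only, the commands are:

```bash
# testing only - do not leave this in place in production
systemctl stop firewalld
systemctl disable firewalld
setenforce 0   # switches SELinux to permissive mode until the next reboot
```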

  4. Enable and start PCS daemon
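
On both servers:

```bash
systemctl enable pcsd
systemctl start pcsd
```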

Cluster configuration

Installation of the pcs package automatically creates a user hacluster with no password set. While this may be fine for running commands locally, you will need a password for this account to perform the rest of the configuration - configure the same password on both nodes:

  • Set password for hacluster
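
On both nodes:

```bash
passwd hacluster
```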

Corosync configuration

  1. On node 1, issue a command to authenticate as the hacluster user:
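
The exact subcommand depends on the pcs version; for example:

```bash
# pcs 0.10+ (RHEL/CentOS 8 and later)
pcs host auth node1 node2 -u hacluster -p <hacluster-password>

# pcs 0.9 (RHEL/CentOS 7)
pcs cluster auth node1 node2 -u hacluster -p <hacluster-password>
```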

  2. Generate and synchronise the corosync configuration
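
The cluster name used below (vprotect_cluster) is only an example - use any name you like:

```bash
# pcs 0.10+ (RHEL/CentOS 8 and later)
pcs cluster setup vprotect_cluster node1 node2

# pcs 0.9 (RHEL/CentOS 7)
pcs cluster setup --name vprotect_cluster node1 node2
```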

    Review the output of this command and make sure the cluster has been set up on both nodes without errors.

  3. Enable and start your new cluster
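
For example:

```bash
pcs cluster start --all
pcs cluster enable --all
```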

  4. OK! Your cluster is now enabled. You have not created any resources (such as the floating IP) yet, but before you proceed, there are still a few settings to modify. Because you are using only two nodes, you need to disable the default quorum policy (this command should not return any output):
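
For example:

```bash
pcs property set no-quorum-policy=ignore
```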

  5. You should also define the default failure settings. Combined, these two settings define how many failures can occur before a node is marked as ineligible to host a resource, and after what time this restriction is lifted. You define the defaults here, but it may be a good idea to also set these values at the resource level, depending on your experience. Run these commands:
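
The values below are only an example - tune them to your environment (newer pcs versions prefer pcs resource defaults update):

```bash
pcs resource defaults migration-threshold=5
pcs resource defaults failure-timeout=30s
```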

  6. As long as you are not using any fencing devices in your environment (here we are not), you need to disable STONITH. The second command verifies the running configuration. These commands normally do not return any output:
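
For example:

```bash
pcs property set stonith-enabled=false
crm_verify -L -V
```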

Resource creation

  1. First, you will create a resource that represents our floating IP 10.41.0.10. Adjust your IP and cidr_netmask, and you're good to go. From this moment on, you need to use this IP when connecting to your Storware Backup & Recovery server.
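
A minimal sketch using the IPaddr2 resource agent - the resource name ClusterIP and the /24 netmask are assumptions, so adjust them to your network:

```bash
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip=10.41.0.10 cidr_netmask=24 op monitor interval=30s
```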

  2. Immediately, you should see that the IP is up and running on one of the nodes (most likely the one where you issued this command):
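
You can verify this with, for example:

```bash
pcs status
ip addr show ens160   # ens160 is the interface name used in this example
```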

  3. As you can see, our floating IP 10.41.0.10 has been successfully assigned as the second IP of interface ens160. We should also check if the Storware Backup & Recovery web interface is up and running. You can do this by opening the web browser and typing in https://10.41.0.10.

  4. The next step is to define a resource responsible for monitoring network connectivity. Note that you need to use your gateway IP in the host_list parameter:
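
A sketch using the ocf:pacemaker:ping agent - the gateway address 10.41.0.1 is only an example:

```bash
pcs resource create ping ocf:pacemaker:ping \
    dampen=5s multiplier=1000 host_list=10.41.0.1 \
    op monitor interval=15s clone
```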

  5. You have to define a set of cluster resources responsible for the other services crucial for the Storware node and the server itself. Here, we will logically link these services with our floating IP: whenever the floating IP disappears from a server, these services will be stopped there. You also have to define the proper order in which services start and stop, since, for example, starting the Storware server without a running database makes little sense.
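
A minimal sketch, assuming the systemd units are named mariadb, vprotect-server and vprotect-node (the cluster resource names themselves are arbitrary):

```bash
# resources wrapping the systemd services
pcs resource create vprotect-db systemd:mariadb op monitor interval=30s
pcs resource create vprotect-server-svc systemd:vprotect-server op monitor interval=60s
pcs resource create vprotect-node-svc systemd:vprotect-node op monitor interval=60s

# start order: floating IP -> database -> server -> node (stopping happens in reverse)
pcs constraint order ClusterIP then vprotect-db
pcs constraint order vprotect-db then vprotect-server-svc
pcs constraint order vprotect-server-svc then vprotect-node-svc
```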

    These commands do not return any output.

  6. Define resource colocation
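
For example, keep all service resources on the node that currently holds the floating IP (resource names as in the sketch above):

```bash
pcs constraint colocation add vprotect-db with ClusterIP INFINITY
pcs constraint colocation add vprotect-server-svc with ClusterIP INFINITY
pcs constraint colocation add vprotect-node-svc with ClusterIP INFINITY
```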

  7. Set node preference
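
For example, prefer node1 as the primary location of the floating IP (the score of 50 is arbitrary):

```bash
pcs constraint location ClusterIP prefers node1=50
```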

At this point, the Pacemaker HA cluster is functional.

However, there is still one thing we need to consider - creating DB replication.

MariaDB replication

In this section, we explain how to set up master<->master MariaDB replication.

  1. On both nodes, if you have the firewall enabled, allow communication via port 3306:
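
For example, with firewalld:

```bash
firewall-cmd --permanent --add-port=3306/tcp
firewall-cmd --reload
```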

Steps to run on the first node - in this case 10.41.0.4

This server will be the source of DB replication.

  1. Stop the Storware server, node and database

  2. Copy your license and node information from the first node to the second node:

  3. Edit the config file, enable binary logging, and start MariaDB again. Depending on your distribution, the config file location may vary. Most likely it is /etc/my.cnf or /etc/my.cnf.d/server.cnf

    In the [mysqld] section, add the lines:
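
A sketch of the relevant settings - the binary log base name is arbitrary, and binlog_do_db (an assumption here) limits replication to the vprotect database:

```ini
[mysqld]
server_id = 1
log_bin = mysql-bin
binlog_do_db = vprotect
```

Then start the database again:

```bash
systemctl start mariadb
```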

  4. Now log in to your MariaDB, create a user used for replication, and assign appropriate rights to it.

    For the purpose of this task, we will set the username to 'replicator' and the password to R3pLic4ti0N
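
Logged in to MariaDB (for example with mysql -u root -p), the statements look like this; the '%' host wildcard is used for simplicity and can be restricted to the other node's IP:

```sql
CREATE USER 'replicator'@'%' IDENTIFIED BY 'R3pLic4ti0N';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;
```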

    Don't log out just yet - we still need to check the master status:
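
Still in the same MariaDB session:

```sql
SHOW MASTER STATUS;
```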

  5. Write down the log file name and position, as it is required for proper slave configuration.

  6. Dump the vprotect database and copy it onto the second server (node2).
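
For example (the dump file location /tmp/vprotect.sql is arbitrary):

```bash
mysqldump -u root -p vprotect > /tmp/vprotect.sql
scp /tmp/vprotect.sql root@10.41.0.5:/tmp/
```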

Steps to run on the 2nd server, node2: 10.41.0.5

  1. Stop the vprotect server, node, and database

  2. Edit the MariaDB config file. Assign a different server id, for example: 2. Then start MariaDB.
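
The same settings as on node1, but with a different server_id, for example:

```ini
[mysqld]
server_id = 2
log_bin = mysql-bin
binlog_do_db = vprotect
```

Then start MariaDB:

```bash
systemctl start mariadb
```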

  3. Load the database dump copied from node1.
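
Assuming the dump was copied to /tmp/vprotect.sql:

```bash
mysql -u root -p vprotect < /tmp/vprotect.sql
```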

At this point, you have two identical databases on our two servers.

  1. Log in to the MariaDB instance and create a replication user with a password - use the same user and password as on node1. Grant the necessary permissions.

    Set the master host. You must use the master_log_file and master_log_pos values written down earlier. Change the IP of the master host to match your network configuration.
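
A sketch of the statements - replace the placeholders with the file name and position noted on node1:

```sql
CREATE USER 'replicator'@'%' IDENTIFIED BY 'R3pLic4ti0N';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;

CHANGE MASTER TO
    MASTER_HOST='10.41.0.4',
    MASTER_USER='replicator',
    MASTER_PASSWORD='R3pLic4ti0N',
    MASTER_LOG_FILE='<file from node1>',
    MASTER_LOG_POS=<position from node1>;
```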

  2. Start the slave, check the master status, and write down the file name and position.
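
For example:

```sql
START SLAVE;
SHOW MASTER STATUS;
-- you can also verify that replication is running with: SHOW SLAVE STATUS\G
```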

Go back to the first server (node1)

  1. Stop the slave, then change the master host using the parameters noted down in the previous step. Also, change the master host IP to match your network configuration.
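
A sketch, using the file name and position noted on node2 in the previous step:

```sql
STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST='10.41.0.5',
    MASTER_USER='replicator',
    MASTER_PASSWORD='R3pLic4ti0N',
    MASTER_LOG_FILE='<file from node2>',
    MASTER_LOG_POS=<position from node2>;
START SLAVE;
```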

At this point, you have successfully configured MariaDB master<->master replication.

Testing the setup

The fastest way to test our setup is to invoke:
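
The exact subcommand depends on your pcs version:

```bash
pcs node standby node1        # older pcs versions: pcs cluster standby node1
# when you are done testing, bring the node back with:
#   pcs node unstandby node1
```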

This puts node1 into standby mode, which prevents it from hosting any cluster resources.

After a while, you should see your resources up and running on node2.

Note that if you perform a normal OS shutdown (not a forced one), Pacemaker will wait a long time for the node to come back online, which in practice prevents the shutdown from completing. As a result, resources will not switch correctly to the other node.
