Failing Over and Failing Back PostgreSQL Streaming Replication

Failing Over to the Secondary Database Server

The secondary database acts as a passive database, meaning it can only process read requests. If the primary database fails, you must fail over to the secondary database server, making it active, to ensure continuity of the platform.

Note

You can also switch roles while the primary database server is still up, making the secondary database server the primary database server and the primary database server a secondary server.

To fail over to the secondary server:

  1. Using SSH, log in as root to the CTERA Portal secondary database server.
  2. In the command line, enter the following command: portal-failover.sh become_master

The primary database server becomes the secondary server, and the secondary database server becomes the primary server.
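
The two steps above can be sketched as a single SSH command. This is a minimal sketch, not a CTERA-provided tool: the host name is a hypothetical placeholder, and it assumes that portal-failover.sh is on root's PATH in a non-interactive SSH session, which this article does not confirm. The script only prints the command so that you can review it before running it.

```shell
#!/bin/sh
# Sketch only: SECONDARY_HOST is a hypothetical placeholder, not a CTERA default.
SECONDARY_HOST="${SECONDARY_HOST:-secondary-db.example.com}"

# Build the SSH command that runs the promotion command from step 2
# as root on the given host.
failover_command() {
    printf 'ssh root@%s portal-failover.sh become_master' "$1"
}

# Print the command for review instead of executing it.
echo "Command to run: $(failover_command "$SECONDARY_HOST")"
```

Printing rather than executing keeps the promotion a deliberate manual step, which matters because running become_master on the wrong server could leave two servers acting as primary.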

Failing Back to the Primary Database Server

When the original primary server comes back online, you can fail back to it to return to the original configuration.

Before you can return the original primary server to the primary role, you must first define it as a replication (secondary) database. This prevents both database servers from being identified as the primary database server, which would cause inconsistencies in the databases.
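
One general PostgreSQL way to see which role a server currently holds is the built-in pg_is_in_recovery() function, which returns true on a streaming-replication standby and false on a primary. The sketch below only interprets that value. How you would connect with psql on a CTERA Portal database server (user, database name, and whether psql is available at all) is an assumption that this article does not cover, and role_from_recovery_flag is a hypothetical helper name.

```shell
#!/bin/sh
# Map the output of:  psql -At -c 'SELECT pg_is_in_recovery();'
# to a role name. "t" means the server is in recovery (a standby),
# "f" means it is accepting writes (a primary).
role_from_recovery_flag() {
    case "$1" in
        t) echo "secondary" ;;
        f) echo "primary" ;;
        *) echo "unknown" ;;
    esac
}

# Example with a literal value; in practice, pass the psql output instead.
role_from_recovery_flag t
```

If both servers report "primary", do not continue with the failback until the original primary server has been defined as a replica.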

To fail back to the primary database server:

  1. When the original primary database server is running again, use SSH to log in to it as root.
  2. In the command line, enter the following command: portal-failover.sh become_replica
  3. Wait for the script to finish so that the original primary database is recognized as the secondary database.
  4. Log in to the portal as a global administrator and, in the global administration view, select Main > Servers in the navigation pane and click the replication server name.
  5. Click DB Replication in the server window that is displayed and, under Database Replication, verify that the Status value is set to OK.
    Note

    If there is a mismatch between the requested WAL files and their location on the server, the Status value can be set to Failed until the mismatch is resolved. The mismatch is resolved when the WAL file position reaches the location, which can take a few hours. CTERA recommends the following manual procedure to resolve this issue:

    1. Log in to the portal as a global administrator and, in the global administration view, select Main > Servers in the navigation pane.
      The Servers page is displayed, listing all the servers for the CTERA Portal.
    2. Click the replication server name and, in the server window that is displayed, under General Settings, uncheck Replication of.
    3. Click Save.
    4. Click the replication server name again and, in the server window that is displayed, under General Settings, recheck Replication of.
    5. Click Save.

    The replication process reinitializes. To monitor it, click DB Replication in the server window and, under Database Replication, verify that the Status value is set to Reinitializing. After the replication has reinitialized, which can take some time depending on the portal size and the amount of data to be replicated, the Status value is set to OK.

  6. Only after the Status value is set to OK, enter the following command in the command line: portal-failover.sh become_master
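
The gating in the last step can be sketched as a small helper that maps the Status value to the next action. This is an illustrative sketch under stated assumptions: the Status value is read by hand from the DB Replication page and passed in through the hypothetical STATUS variable, since this article does not describe a way to query it from the command line. The helper name next_failback_step is also hypothetical.

```shell
#!/bin/sh
# Sketch only: STATUS is typed in by the operator after reading it from
# the DB Replication page; it is not queried from the portal.
STATUS="${STATUS:-Reinitializing}"

# Print the next action for the failback procedure, given the Status value.
next_failback_step() {
    case "$1" in
        OK)
            echo "portal-failover.sh become_master" ;;
        Reinitializing)
            echo "wait: replication is still reinitializing" ;;
        Failed)
            echo "wait: resolve the WAL mismatch first" ;;
        *)
            echo "unknown status: $1" ;;
    esac
}

next_failback_step "$STATUS"
```

For example, running the script with STATUS=OK prints the promotion command, while any other value prints a reason to wait instead.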
