Restore Coordinator Mirroring After a Recovery
After you activate a standby coordinator for recovery, the standby coordinator becomes the primary coordinator. You can continue running that instance as the primary coordinator if it has the same capabilities and dependability as the original coordinator host.
You must initialize a new standby coordinator to continue providing coordinator mirroring unless you have already done so while activating the prior standby coordinator. Run gpinitstandby on the active coordinator host to configure a new standby coordinator. See Enabling Coordinator Mirroring.
You can restore the primary and standby coordinator instances on the original hosts. This process swaps the roles of the primary and standby coordinator hosts, and it should be performed only if you strongly prefer to run the coordinator instances on the same hosts they occupied prior to the recovery scenario.
Restoring the primary and standby coordinator instances to their original hosts is not an online operation. The coordinator host must be stopped to perform the operation.
For information about the Apache Cloudberry utilities, see the Apache Cloudberry Utility Guide.
To restore the coordinator mirroring after a recovery
-
Ensure the original coordinator host is in dependable running condition and that the cause of the original failure is fixed.
-
On the original coordinator host, move or remove the data directory, gpseg-1. This example moves the directory to backup_gpseg-1:
$ mv /data/coordinator/gpseg-1 /data/coordinator/backup_gpseg-1
You can remove the backup directory once the standby is successfully configured.
-
Initialize a standby coordinator on the original coordinator host. For example, run this command from the current coordinator host, scdw:
$ gpinitstandby -s cdw
-
After the initialization completes, check the status of the standby coordinator, cdw. Run gpstate with the -f option to check the standby coordinator status:
$ gpstate -f
The standby coordinator status should be passive, and the WAL sender state should be streaming.
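The directory move in step 2 can be guarded so that an earlier backup is never overwritten. The following is a minimal sketch, not part of the Apache Cloudberry utilities: backup_data_dir is a hypothetical helper name, and the backup_<name> naming scheme is assumed from the example above.

```shell
#!/bin/sh
# Sketch only: backup_data_dir is a hypothetical helper, not a Cloudberry utility.
# It moves a data directory to <parent>/backup_<name> (naming assumed from the
# example in step 2) and prints the new location on success.
backup_data_dir() {
    src=$1
    dest="$(dirname "$src")/backup_$(basename "$src")"

    # Nothing to do if the data directory is not there.
    if [ ! -d "$src" ]; then
        echo "nothing to move: $src does not exist" >&2
        return 1
    fi
    # Never clobber a backup left over from an earlier recovery.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing $dest" >&2
        return 1
    fi
    mv "$src" "$dest" && echo "$dest"
}
```

With the paths used in this topic, the call would be backup_data_dir /data/coordinator/gpseg-1, run on the original coordinator host before gpinitstandby.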
To restore the coordinator and standby instances on original hosts (optional)
Before performing the steps in this section, be sure you have followed the steps in the previous section, To restore the coordinator mirroring after a recovery.
-
Stop the Apache Cloudberry coordinator instance on the standby coordinator. For example:
$ gpstop -m
-
Run the gpactivatestandby utility from the original coordinator host, cdw, that is currently a standby coordinator. For example:
$ gpactivatestandby -d $COORDINATOR_DATA_DIRECTORY
Where the -d option specifies the data directory of the host you are activating.
Note: Before running gpactivatestandby, be sure to run gpstate -f to confirm that the standby coordinator is synchronized with the current coordinator node. If synchronized, the final line of the gpstate -f output will look similar to this:
20230607:06:50:06:004205 gpstate:test1-m:gpadmin-[INFO]:--Sync state: sync
-
After the utility completes, run gpstate with the -b option to display a summary of the system status:
$ gpstate -b
The coordinator instance status should be Active. When a standby coordinator is not configured, the command displays No coordinator standby configured for the standby coordinator state.
-
On the standby coordinator host, move or remove the data directory, gpseg-1. This example moves the directory:
$ mv /data/coordinator/gpseg-1 /data//backup_gpseg-1
You can remove the backup directory once the standby is successfully configured.
-
After the original coordinator host runs the primary Apache Cloudberry coordinator, you can initialize a standby coordinator on the original standby coordinator host. For example:
$ gpinitstandby -s scdw
After the command completes, you can run the gpstate -f command on the primary coordinator host to check the standby coordinator status.
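The synchronization check that the procedure asks for before gpactivatestandby can be scripted rather than read by eye. This is a hedged sketch: standby_is_synced is a hypothetical helper name, and the only output format it relies on is the final "Sync state: sync" line quoted in the note above.

```shell
#!/bin/sh
# Sketch only: standby_is_synced is a hypothetical helper, not a Cloudberry utility.
# It reads gpstate -f output on stdin and succeeds only when the last
# non-empty line ends with "Sync state: sync", as in the sample line above.
standby_is_synced() {
    # Drop blank lines, then keep only the final line of the report.
    last_line=$(grep -v '^[[:space:]]*$' | tail -n 1)
    printf '%s\n' "$last_line" | grep -Eq 'Sync state: sync[[:space:]]*$'
}

# Possible usage (assumes gpstate is on PATH and can reach the coordinator):
#   gpstate -f | standby_is_synced || echo "standby not in sync; do not activate" >&2
```

Matching only the last line keeps the check close to what the note describes; any other sync state on that line makes the helper return a non-zero status.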
To check the status of the coordinator mirroring process (optional)
You can run the gpstate utility with the -f option to display details of the standby coordinator host.
$ gpstate -f
The standby coordinator status should be passive, and the WAL sender state should be streaming.
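The two conditions above can also be tested in a script. The sketch below is illustrative only: standby_status_ok is a hypothetical helper, and the line labels it matches (a line mentioning the standby together with "passive", and a "WAL Sender State" line with "streaming") are assumptions about the gpstate -f report layout that you should confirm against your own output.

```shell
#!/bin/sh
# Sketch only: standby_status_ok is a hypothetical helper, not a Cloudberry utility.
# Reads gpstate -f output on stdin. The matched labels are assumptions about
# the report layout; adjust the patterns to match your actual output.
standby_status_ok() {
    report=$(cat)
    # Standby status should be passive.
    printf '%s\n' "$report" | grep -Eqi 'standby.*passive' || return 1
    # WAL sender state should be streaming.
    printf '%s\n' "$report" | grep -Eqi 'wal sender state.*streaming' || return 1
}

# Possible usage (assumes gpstate is on PATH):
#   gpstate -f | standby_status_ok && echo "coordinator mirroring looks healthy"
```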