Failback after failover

After a Failover with master server participation, failback restores the pre-failover master server to its previous role. After a successful failback:

  • The pre-failover master server is restored to its original role as the master server.

  • (Optional) The pre-failover standby server is restored to its original role of standby server.

Prerequisites for using the failback command

  • Both the master and the standby servers must be version 2022.1 or later.

  • The original p4 failover command must have been performed with master participation.

  • The p4 failback command must be performed with master participation.

  • When the master is an Edge Server, the failback command requires Commit Server participation.

  • The server ID for the failed-over server must not be manually altered prior to performing the Procedure for using the failback command.

  • Journals created after failover or failback must still exist in their expected location before the standby-to-be is started as a standby. In other words, keep the journal locations and journals from failover through completion of all intended failback operations.

  • We recommend that a DNS alias point to the IP address of the master server. This allows the same DNS alias to point to the new master server (the former standby server).

  • We recommend that a DNS alias point to the IP address of the standby server.
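The version prerequisite can be checked from the Server version line that p4 info reports. The following is a minimal sketch rather than an official tool: version_ok is a hypothetical helper name, and the sketch assumes the line has the form Server version: P4D/platform/release/change (date). The change number and date in the comment are illustrative only.

```shell
# Hypothetical helper: reads `p4 info` output on stdin, prints the release,
# and succeeds only if that release is 2022.1 or later.
# Assumed line format (illustrative values):
#   Server version: P4D/LINUX26X86_64/2022.1/1234567 (2022/06/01)
version_ok() {
    awk -F'[ /]+' '
        /^Server version:/ {
            # Fields: Server, version:, P4D, platform, release, change, ...
            split($5, v, ".")
            ok = (v[1] + 0 > 2022) || (v[1] + 0 == 2022 && v[2] + 0 >= 1)
            print $5
        }
        END { exit (ok ? 0 : 1) }'
}
```

For example, p4 -p master-host:port info | version_ok could be run once against the master and once against the standby, where master-host:port is a placeholder for your own address.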

Procedure for using the failback command

1. On the machine where the pre-failover master server was running, convert the pre-failover master server to a standby.

  1. Preview the p4d -Fm command described at Failback options:
    p4d -r p4root -Fm masterServerID standbyServerID
    where
    masterServerID and standbyServerID refer to the pre-failover server IDs of the master and standby, and
    the p4root location is the P4ROOT location of the pre-failover master server.

  2. If the preview looks acceptable, include the -y option to perform the operation:
    p4d -r p4root -y -Fm masterServerID standbyServerID

Note

This command changes this pre-failover master server to a restricted standby, which allows it to ignore configuration settings that would interfere with its operation as the target for the failback command.
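The preview-then-apply pattern of Step 1 can be sketched as a small shell function. This is only an illustration under stated assumptions: convert_master_to_standby is a hypothetical name, the confirmation prompt is not part of p4d, and the three arguments are the same p4root, masterServerID, and standbyServerID values described above.

```shell
# Hypothetical wrapper around the two commands in Step 1.
# $1 = P4ROOT of the pre-failover master
# $2 = pre-failover master server ID
# $3 = pre-failover standby server ID
convert_master_to_standby() {
    # Preview: without -y the operation is not performed.
    p4d -r "$1" -Fm "$2" "$3"

    # The prompt is an addition of this sketch, not a p4d feature.
    printf 'Apply this conversion? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        p4d -r "$1" -y -Fm "$2" "$3"
    else
        echo "Conversion not applied."
        return 1
    fi
}
```

Keeping the preview and the -y run in one function makes it harder to apply the conversion with different server IDs than the ones that were previewed.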

2. Start the pre-failover master server, which is the standby server you converted in Step 1 above, and wait for its replication to catch up with the current master.

Note

This restricted standby server uses the same IP address as the pre-failover master server. At this point, the current master is using the IP address of the pre-failover standby. When failback completes, the master that was running until then goes down, and the restored master uses its original IP address.

3. (Optional) Change the standby server spec to make it a mandatory standby.

4. On the pre-failover master server:

  1. As the standby service user, log in to the post-failover master server.

  2. Verify that the login ticket is stored in the file indicated by the standby server’s P4TICKETS configurable.

  3. Check whether failback is possible by running the p4 failback command.

  4. If so, run the p4 failback -y command. Otherwise, see If p4 failback cannot be used.
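The four sub-steps of Step 4 can be sketched as one shell function. Treat this as a hedged illustration, not the documented procedure: failback_from_standby is a hypothetical name, all four arguments are site-specific values you supply, and the sketch assumes that p4 failback is addressed to the restricted standby and that your default P4USER has the permissions the command needs. The confirmation prompt is an addition of the sketch.

```shell
# Hypothetical wrapper for Step 4. All arguments are site-specific:
# $1 = standby service user
# $2 = P4PORT of the post-failover (current) master
# $3 = P4PORT of the restricted standby (the pre-failover master); assumed target of p4 failback
# $4 = ticket file named by the standby P4TICKETS configurable
failback_from_standby() {
    # Log in to the current master; the ticket is written to the file P4TICKETS names.
    P4TICKETS="$4" p4 -u "$1" -p "$2" login || return 1
    # List the stored tickets so the new entry can be confirmed by eye.
    P4TICKETS="$4" p4 tickets

    # Without -y, p4 failback only checks whether failback is possible.
    p4 -p "$3" failback
    printf 'Run p4 failback -y? [y/N] '
    read -r answer
    [ "$answer" = "y" ] || return 1
    p4 -p "$3" failback -y
}
```

If the preview reports a problem, answer n and continue with If p4 failback cannot be used.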

5. After the completion of a successful failback:

  1. Verify that the pre-failback standby has been restarted as the new master by issuing the p4 info command and checking the ServerID to ensure that it is the same ServerID that the pre-failover master server used.

  2. Site-specific changes might be needed to use the new master server. It might be necessary to make DNS changes so that users and replicas can connect to the new master server. For example,

If you have a DNS alias set up for the master server:

Update the IP address of that DNS alias to point to the IP address of its new location.

After this is done, users will be able to connect to the master using the same P4PORT as before.

The DNS alias for the standby must be changed to point to the IP address of the pre-failover standby server.

If you do not have a DNS alias set up for the master server:

Update the P4TARGET environment variable and server specifications to use the correct host. The port number should remain the same, but the host name must be changed to use its new location.

  • Change your P4PORT to point to the post-failback master host and the same port number so that you can connect to the new master.

  • On the post-failback master server, change the P4TARGET for each replica or Edge Server by issuing the p4 configure show allservers command and then issuing the p4 configure set "replica-name#P4TARGET=new-master-server:port-number" command.

  • Update each replica's own P4TARGET by issuing the p4d -r $P4ROOT "-cset replica-name#P4TARGET=new-master-server:port-number" command.

  • Update your server specifications with the proper hostname and port number by issuing the p4 server servername command.

  • Inform your users if they need to update their P4PORT to connect to the post-failback master host. The port number should remain the same as before, and your users can now issue new commands.
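The ServerID check and the P4TARGET update in Step 5 can be sketched as two small helpers. Both function names are hypothetical, and the first assumes that p4 info prints a line of the form ServerID: followed by the ID.

```shell
# Hypothetical helpers for Step 5.

# Reads `p4 info` output on stdin and succeeds if its ServerID line
# matches the expected pre-failover master server ID given as $1.
server_id_is() {
    actual=$(awk -F': ' '/^ServerID:/ { print $2 }')
    [ "$actual" = "$1" ]
}

# Points one replica at the restored master, using the p4 configure set
# form shown above.
# $1 = replica server ID, $2 = host:port of the post-failback master
set_replica_target() {
    p4 configure set "$1#P4TARGET=$2"
}
```

For example, p4 -p new-master-server:port-number info | server_id_is masterServerID succeeds only when the restored master reports the expected ServerID.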

 

6. (Optional) On the machine where the pre-failover standby was running, restore the pre-failover standby to its former status of standby for the pre-failover master server that is now the restored master.

  1. Preview the p4d -Fs command described at Failback options:
    p4d -r p4root -Fs masterServerID standbyServerID
    where
    masterServerID and standbyServerID refer to the pre-failover server IDs of the master and standby, and
    the p4root location is the P4ROOT location of the pre-failover standby.

  2. If the preview looks acceptable, include the -y option to perform the operation:
    p4d -r p4root -y -Fs masterServerID standbyServerID

Note

This command changes this server to a restricted standby. When this server discovers from replication that p4 failback has been successfully run, it will function as an unrestricted standby for the restored master.

7. (Optional) Change the standby server spec to make it a mandatory standby.

If p4 failback cannot be used

Failback might still be possible even if the Prerequisites for using the failback command are not fully met.

Prerequisites if p4 failback cannot be used

Ensure that the pre-failover master server (the original commit or master server) is:

  • Reconfigured as a standby before starting it.

  • Assigned the standby server's serverID, and that this serverID is different from the serverID of the current commit or master server.

Reseeding the original commit or master server

If p4 failback cannot be used, it is a best practice to reseed the pre-failover master server from the post-failover master before performing the Failback steps if p4 failback cannot be used. Consider the following:

  • If the pre-failover master did not participate in the failover, the metadata of the pre-failover master server might contain transactions that did not make it to the pre-failover standby at the time of failover. Those transactions could reappear at failback. To avoid this possibility, reseed the pre-failover master from the post-failover master before performing the failback steps.

  • If the pre-failover master participated in the failover, reseeding might not be necessary. However, if you have any doubts about metadata integrity, the safest option is to reseed the pre-failover master server from the post-failover master before performing the failback steps.

Failback steps if p4 failback cannot be used

At the new master server

  1. Verify that the standby is pulling from the master server by issuing p4 servers -J.
  2. Check the result, which might be something like:

    commit '2019/07/09 16:41:36' commit-server 40/13642 40/13642 wadL/1 1

    standby '2019/07/09 16:41:31' standby 40/10000 40/10000 wAdl/4 1

    where 10000 is lower than 13642, which indicates that the standby is not yet fully caught up with the master server.

  3. Wait a moment, then reissue p4 servers -J to verify that the standby is fully caught up with the master server. For example:

    commit '2019/07/09 16:41:36' commit-server 40/13642 40/13642 wadL/1 1

    standby '2019/07/09 16:41:36' standby 40/13642 40/13642 wAdl/4 1
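The comparison in the steps above can be automated with a short shell function. This is a sketch under stated assumptions: journal_state is a hypothetical name, and it assumes the p4 servers -J layout of the sample output, where the two journal/offset pairs are the fifth and sixth whitespace-separated fields. It treats the standby as caught up only when both pairs equal those of the master.

```shell
# Hypothetical helper: reads `p4 servers -J` output on stdin.
# $1 = server ID of the master, $2 = server ID of the standby
# Prints "caught-up", "behind", or "unknown"; the exit status is 0 only for "caught-up".
journal_state() {
    awk -v m="$1" -v s="$2" '
        # The quoted timestamp spans fields 2 and 3, so the pairs are fields 5 and 6.
        $1 == m { mpos = $5 " " $6 }
        $1 == s { spos = $5 " " $6 }
        END {
            if (mpos == "" || spos == "") { print "unknown"; exit 2 }
            if (mpos == spos) { print "caught-up"; exit 0 }
            print "behind"
            exit 1
        }'
}
```

For example, p4 servers -J | journal_state commit standby reports behind for the first sample output above and caught-up for the second.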

At the post-failover standby that was the pre-failover master server

  1. Issue the failover command: p4 failover
  2. Follow the steps at Failover.