Perforce Public Knowledge Base - Failing over to a replica server

Failing over to a replica server

Problem

A replica server is being maintained as a standby server for disaster recovery. If the master server goes down, how do I fail over to the standby server?

Solution

Failing over to a replica

To fail over to the standby server:

  1. Ensure that the replica server has its own valid license file installed in its P4ROOT directory.

  2. Log on to the replica.

  3. Confirm the consistency and age of the replica when failover occurred.

  4. Restart the replica server as the new master.

  5. Point Perforce end users and other clients at the new master.

  6. Adjust any replicas to use the new master server.

  7. Perform end user tasks.

  8. Optionally convert the old master back into the master again.
 

Log on to the replica

It is helpful to gather replica information before the replica is converted to a master. The replica may already be running with a super user logged in. If you cannot run p4 commands because no user with super privileges is logged in, disable any authorization server, disable rpl.forward.login, and then log in.

Disable an authorization server by removing the -a flag in the startup script or by unsetting the P4AUTH configurable. 

Disable rpl.forward.login by running p4d with the "-cunset" option against the replica root. Running "-cshow" before and after confirms the change.

cd Perforce replica root
p4d -r . "-cshow"
p4d -r . "-cunset rpl.forward.login"
p4d -r . "-cshow"


Then log in as a super user. Because the master is unreachable, you will receive an error message about the P4TARGET server, but you will still be able to run commands. For example, you can still run "p4 pull -lj" despite the warnings:

$ p4 -u super -p gabriel:44108 login
Enter password:
Replica access to P4TARGET server failed.
Remote server refused request. Please verify that service user is correctly logged in to remote server, then retry.
TCP connect to 10.12.18.242:44106 failed.
connect: 10.12.18.242:44106: Connection refused

$ p4 pull -lj
Current replica journal state is:       Journal 24,     Sequence 767.
The statefile was last modified at:     2016/10/24 12:02:45.
The replica server time is currently:   2016/10/24 12:45:51 -0700 PDT
Replica access to P4TARGET server failed.
Remote server refused request. Please verify that service user is correctly logged in to remote server, then retry.
TCP connect to 10.0.0.242:44106 failed.
connect: 10.0.0.242:44106: Connection refused

 
 

Confirm the consistency and age of the replica when failover occurred

Because Perforce stores metadata and versioned file data separately, each needs to be checked for consistency with the other.

Note: These steps should be carried out before restarting the standby server as the new master.

 

Confirm the metadata

 

Consistency of the replica database files at time of failover

 

p4 journaldbchecksums

The p4 journaldbchecksums command can be run on a regular basis against the master server. It writes journal notes containing checksums of the master's database tables. When a replica receives these journal notes, it performs the same checksum computations on its own database files and writes the results to the replica server log file. The entries in the replica server log will appear similar to the following:

Perforce server info:
    Table db.config checksums match. 2011/09/16 12:36:23 version 1: expected 0xB5D23219, actual 0xB5D23219.

In this case the checksums match, and it can be assumed that the data in the replica server's db.config table is the same as the data on the master. An example of output where the checksums do not match is as follows:

Perforce server info:
    Table db.working checksums DIFFER. 2011/09/10 22:58:42 version 9: expected 0x3201495D, actual 0x4BBE7670.

This tells us that the data in the replica server's db.working table differs from the data on the master. If "checksums DIFFER" output appears in the Perforce log, contact Perforce support for assistance.
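As a rough aid for scanning a large replica log, the sketch below lists the tables reported as differing. It assumes the log lines follow the "Table <name> checksums DIFFER." format shown above; the function name differing_tables is hypothetical and not part of Perforce.

```shell
# Hypothetical helper: print each table name that the replica log reports
# as "checksums DIFFER". Assumes the log line format shown in this article.
differing_tables() {
    # $1: path to the replica server log file
    sed -n 's/^[[:space:]]*Table \([^[:space:]]*\) checksums DIFFER\..*/\1/p' "$1" | sort -u
}
```

An empty result only means no DIFFER lines were found in that log file; it does not prove that checksums were ever computed.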
 

Age of the replica metadata at time of failover

 

Choose one or more of the following to determine the point where work must be resubmitted.



state

The replica server maintains a file named "state" containing a journal position token that indicates which journal records from the master have been processed. The journal position token consists of a journal number and a position (byte) offset, for example:

cd Perforce replica root
cat state

22/6494

In this example the replica server has replicated metadata from the master through journal number 22 up to byte offset 6494.
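If the token needs to be handled in a script, it can be split into its two parts. This is a minimal sketch with hypothetical function names. Treating a token without a slash as offset 0 is an assumption, based on the behavior described later in this article for a state file that holds only a journal number.

```shell
# Hypothetical helpers: split a journal position token such as "22/6494".
state_journal() {
    # Everything before the first "/" is the journal number.
    printf '%s\n' "${1%%/*}"
}

state_offset() {
    # Everything after the first "/" is the byte offset.
    # Assumption: a token with no "/" means the start of the journal (0).
    case "$1" in
        */*) printf '%s\n' "${1#*/}" ;;
        *)   printf '0\n' ;;
    esac
}
```

For example, token=$(cat state) followed by state_journal "$token" would print the journal number.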
 

rpl.pull.position

If desired, the rpl.pull.position configurable can be enabled. Additional metadata replication status messages are then written to the replica server log file each time a metadata pull completes. The output in the replica server log file will look something like the following:

Debug 2011/07/15 06:12:10 pid 27375: MetaData Pull position 22/6494

Use the contents of the state file and the metadata pull messages in the replica server log, and compare the position token values to the journal files on the master server, if available. It is advisable to contact Perforce support for advice about replaying any later journal records against the replica server.
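To find the most recent pull position recorded in the log, something like the following could be used. It is a sketch that assumes the "MetaData Pull position" message format shown above; last_pull_position is a hypothetical name.

```shell
# Hypothetical helper: print the last "MetaData Pull position" token
# recorded in a replica server log.
last_pull_position() {
    # $1: path to the replica server log file
    sed -n 's/.*MetaData Pull position \([0-9][0-9]*\/[0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

The result can then be compared by eye with the contents of the state file.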


p4 changes

In addition to the state file contents, run commands against the replica server which will give data points to work from.  Use p4 changes to determine the last submitted change on the replica:

p4 changes -m1 -t -s submitted
Change 356967 on 2011/09/15 18:15:58 by p4@support 'Merge down from Release to Main'

The "-t" flag adds the date and time to the output, showing when the last changelist was submitted. Users should be advised that any submissions made to the master server after this changelist (that is, after this date) will need to be resubmitted.
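When recording this data point in a failover log, the change number and timestamp can be pulled out of the output. The sketch below assumes the "Change N on DATE TIME by ..." layout shown above; last_change_time is a hypothetical name.

```shell
# Hypothetical helper: read "p4 changes -t" output on stdin and print
# "<change> <date> <time>" for the first change line.
last_change_time() {
    awk '$1 == "Change" && $3 == "on" { print $2, $4, $5; exit }'
}
```

A possible invocation would be: p4 changes -m1 -t -s submitted | last_change_time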
 

p4 jobs

If Perforce Jobs is used, or if Perforce Jobs are integrated with an external defect tracking system, you can use p4 jobs to check the last job submitted.

p4 -Ztag jobs -m1 -r  | grep ReportedDate
... ReportedDate 2011/09/15 18:20:04

All jobs after this date will need to be resubmitted.  It is advised that when a third party defect tracker is being used that it be re-seeded from this time onwards. Consult the defect tracker's documentation on how to do this.


Confirm the versioned files

 

Age of the replica versioned file data at time of failover

Before taking the replica offline and promoting it to master, capture the output of the following command:

p4 pull -l

This command lists pending versioned file content transfers, that is, file content that has not yet made it from the master to the replica. When promoting the replica to a master, this list can be used to determine which files and revisions are missing versioned file content on the replica. Be aware that these missing file archives may need to be replaced.

The output of this command looks similar to:

//depot/unicycle.txt 1.990 text new edit 1D346A0E3555561CA05C9ADB29D2C47B 287588 3573 2011/09/16 15:08:49 0
//depot/users.txt 1.1029 text new edit 8374C36CC6DD04821A5B7C52832CA632 699 3573 2011/09/16 15:08:49 0

At this point, ask end users whether they have these files available in their workspaces. If using Perforce Proxy Servers (P4P), consider checking the P4PCACHE for these files.
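To turn the captured list into something that can be circulated to users, the depot paths can be extracted from the saved output. This sketch assumes the path is the first whitespace-separated field, as in the sample above, so it will not handle depot paths that contain spaces; pending_depot_paths is a hypothetical name.

```shell
# Hypothetical helper: read captured "p4 pull -l" output on stdin and
# print the unique depot paths.
# Assumption: paths contain no spaces (the first field is the whole path).
pending_depot_paths() {
    awk '/^\/\// { print $1 }' | sort -u
}
```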

It is likely that the metadata on the replica server will be more current than the versioned files. This is why Perforce strongly recommends configuring multiple p4 pull -u startup commands for the replica (visible in p4 configure show allservers). This helps the replica stay as up to date as possible with versioned file data submitted to the master.
 

Consistency of the replica versioned file data at time of failover

To determine the consistency of the replica's versioned file data, run the following command and capture the output:

p4 verify -q //... > verify.txt 2>&1
cat verify.txt

Note: This command may take a significant time to run. This gives a definitive list of versioned files that are missing based on the metadata that is available on the replica server. Details on how to handle MISSING errors are covered in MISSING! errors from p4 verify.
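For a quick sense of scale, the captured file can be scanned for MISSING! entries. The sketch below only counts lines containing that marker and makes no other assumption about the verify output format; count_missing is a hypothetical name.

```shell
# Hypothetical helper: count the lines in captured "p4 verify" output
# that contain the MISSING! marker.
count_missing() {
    # $1: path to the captured output, e.g. verify.txt
    awk '/MISSING!/ { n++ } END { print n + 0 }' "$1"
}
```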

Since a verify of the entire contents of the server can take significant time to run, an alternative approach is to verify a smaller subset of files: those submitted most recently to the server. For example, if you determine that the latest change submitted on the replica happened on 2011/09/15, you can verify only the file revisions submitted to the server in the last few days by running:

p4 verify -q "//...@>2011/09/13"


Restart the standby server as the new master

To make the standby server the new master, stop the replica and then restart the instance under a new name, which turns the former replica into a master. A Perforce replica server can receive its replication options in several ways when the Perforce Server process starts, so the exact steps depend on how those options are being passed.
 

Replica configuration variables in the startup script or command line

Replica options can be set directly on the p4d command line used to start the replica server:

-t host:port - Sets the target address for the replica (default $P4TARGET); this is equivalent to the P4TARGET variable.

-M readonly - Indicates readonly replication of metadata; this is equivalent to db.replication = readonly.

-D readonly - Indicates readonly replication of depot contents; this is equivalent to lbr.replication = readonly.

-In name - Indicates the name of the replica seen in p4 configure show allservers.

To make a replica into a master where the replica is defined by command line options, remove (or comment out) the replica options from the P4D startup command on the replica server.

Before

p4d -t oldmaster:1666 -M readonly -D readonly -In replica1 -r /replica/p4root -p 1667 -L pathto/log -J pathto/journal -d

After

p4d -r /replica/p4root -p 1667 -L pathto/log -J pathto/journal -In newmaster -d

This causes the (former) replica server to start as a new master server.
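Before restarting, it may be worth checking that no replica options are left in the edited command. The following is a minimal sketch with a hypothetical function name. It only recognizes -t, -M and -D when they appear as separate words, as in the Before example above, and would miss forms such as a flag joined to its value.

```shell
# Hypothetical helper: succeed (exit 0) if a p4d startup command line
# still contains the replica options -t, -M or -D as separate words.
has_replica_flags() {
    # $1: the full p4d startup command line as one string
    case " $1 " in
        *" -t "*|*" -M "*|*" -D "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

The match is case-sensitive, so the lowercase -d (daemon) option in the After example is not mistaken for -D.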
 

Replica options stored in server configuration

Replica configurables may be defined by p4 configure set and seen in  p4 configure show. 
 
$ p4 configure show replica1
replica1: db.replication = readonly
replica1: lbr.replication = readonly
replica1: P4TARGET = master:1666
replica1: serviceUser = service
replica1: startup.1 = pull -i 5
replica1: startup.2 = pull -i 5 -u
replica1: startup.3 = pull -i 5 -u

To make a replica whose options are selected by -In or P4NAME into a master, start the replica under a new name or without the previous name.

Before

When the replica server starts using -In (or P4NAME):

p4d -In replica1 -r /replica/p4root -p 1667 -d

the replica name allows the replica server to pick up the appropriate options from the server configuration.


After

p4d -In newmaster -r /replica/p4root -p 1667 -d

Choose a new name for your new master. Because the server will have a new name, it will not pick up the variables that p4 configure show allservers lists under the old replica name.

If the -L and -J options are not in the startup script, you can set the log and journal after the new master has been restarted with

p4 configure set "newmaster#P4JOURNAL=pathto/journal"
p4 configure set "newmaster#P4LOG=pathto/log"

Replica options defined by a server specification record

Later versions of the Perforce Server can use server specifications defined by the 'p4 server' command. A server specification, in conjunction with the 'server.id' file in the P4ROOT directory, can define the replication role without any options being passed on the command line. If server specifications are used, the 'server.id' file must be removed or renamed before the replica is restarted as a master.

To make a replica into a master where the replica is defined by server.id:

  1. Save off the "p4 server" information.
p4 -ztag servers > serverinfo.txt

  2. Stop the replica and rename or erase the server.id file.

On Unix:

cd Perforce root
mv server.id server.id.orig

On Windows:

cd Perforce root
rename server.id server.id.orig

  3. Adjust the startup script to point to the running journal and current log.
Make a backup of the startup script, then adjust the startup script to add the -J journal and -L log options, either on the command line or with "p4 configure set" as needed.


Before

p4d -In replica1 -r /replica/p4root -p 1667 -d

After

p4d -In newmaster -r /replica/p4root -p 1667 -d

If the -L and -J options are not in the startup script, you can set the log and journal once the new master is running with

p4 configure set "newmaster#P4JOURNAL=pathto/journal"
p4 configure set "newmaster#P4LOG=pathto/log"

  4. Start the new master and create a new server.id file with a new name of your choice.
p4 serverid newmaster

Alternatively, create a new file named "server.id" with the new name and place this file into the replica server root.

  5. Create a new server spec for this new master.
p4 server newmaster

You can use the former master specification as a guide when creating the new master specification. You may want to keep the original replica specification if you want to convert the new master back to a replica later, or you can delete the original replica specification.
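The rename of server.id and the alternative of creating the file by hand can be combined in one small script. This is a sketch under stated assumptions: the server is stopped, server.id is a plain text file holding the server name on a single line, and replace_server_id is a hypothetical name. Where the server is running, "p4 serverid" remains the documented route.

```shell
# Hypothetical helper: keep the old server.id as server.id.orig and
# write a new server.id containing the new server name.
# Assumptions: the server is stopped; server.id is a one-line text file.
replace_server_id() {
    # $1: server root (P4ROOT)   $2: new server name
    root=$1
    new_id=$2
    if [ -f "$root/server.id" ]; then
        mv "$root/server.id" "$root/server.id.orig" || return 1
    fi
    printf '%s\n' "$new_id" > "$root/server.id"
}
```

For example: replace_server_id /replica/p4root newmaster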
 

Point Perforce end-users and other clients at the new master server

Perforce clients, Perforce Proxy Servers, Perforce brokers, and potentially third-party integrations or scripts that connect to the Perforce Server may need to be reconfigured so that their P4PORT setting points to the host:port of the new master server. The degree to which this is necessary depends on the particular setup. If no mechanism is in place to point clients, proxies, brokers, or third-party integrations to the new master server (such as a DNS server entry for the Perforce server), update each Perforce instance accordingly.
 

Adjust any replicas to use the new master server

It is likely that a replica that now points to the new master will not resume replication for two reasons:

  1. The service user needs to be logged into the new master from the machine the replica is running on and the ticket needs to be in the tickets file defined for the replica.

  2. The offset that the replica currently has into the old master's journal is not correct for the new master journal (see Why replicated journals are not identical); the replica log file will potentially contain messages showing 'Bad opcode'. The replica should be stopped, its state file copied to a safe place, and the original state file modified to remove the offset, leaving only the journal number. A restart of the replica will cause replication to start from the beginning of the journal specified in the state file and to continue until it is up to date. Check progress with 'p4 pull -l -j'.
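The state file edit described in item 2 can be sketched as follows, assuming the replica is already stopped and the state file holds a single journal/offset token. Keeping the backup next to the original as state.bak is an illustrative choice, not a Perforce convention, and reset_state_offset is a hypothetical name.

```shell
# Hypothetical helper: back up a replica state file, then rewrite it so
# that it holds only the journal number (the byte offset is dropped).
# Assumption: the replica is stopped before this runs.
reset_state_offset() {
    # $1: path to the replica's state file
    state_file=$1
    cp -p "$state_file" "$state_file.bak" || return 1
    token=$(head -n 1 "$state_file")
    # Keep everything before the first "/", e.g. "22/6494" becomes "22".
    printf '%s\n' "${token%%/*}" > "$state_file"
}
```

For example: reset_state_offset /replica/p4root/state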

Perform end user tasks

In the event end user workspaces are out of date compared to the metadata that the replica has, they should consider following the steps outlined in Working Disconnected From The Perforce Server. This allows them to bring their workspace back into line and potentially restore files that may be missing from the server.

Optionally convert the old master back into the master again

Once the old server hardware is fixed, you may want to reseed the old master's database (db.*) files with a checkpoint from the "new" master machine. For guidance, follow the steps outlined in Chapter 10 of the Perforce System Administrator's Guide. Most of the replication settings will already exist in the server configuration from the previous replica setup.

Note:  The "P4TARGET" variable may have changed as the current master may exist on a different host than before. A new ticket for the service user will need to be created on the new replica machine as outlined in the System Administrator's Guide.

Once the new replica is up and running, p4 verify -t should be run to set up the transfer of any missing files. If the archive size is significant, archives should first be restored to the new replica location from a current backup, and then a verify scheduled. To schedule the transfer, run:

p4 verify -t //...


Any missing files will then be scheduled for transfer. The following command can be used for a summary of the remaining files to be transferred:

p4 pull -l -s


The output of this command is similar to the following:

File transfers: 10 active/1327 total, bytes: 1551384 active/106561581 total.
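To track the backlog from a script, the total number of outstanding file transfers can be extracted from that summary line. The sketch below assumes the exact "File transfers: N active/M total" layout shown above; pending_transfer_total is a hypothetical name.

```shell
# Hypothetical helper: read "p4 pull -l -s" output on stdin and print
# the total number of file transfers from the summary line.
pending_transfer_total() {
    sed -n 's/^File transfers: [0-9][0-9]* active\/\([0-9][0-9]*\) total,.*/\1/p'
}
```

A possible invocation would be: p4 pull -l -s | pending_transfer_total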


Notes

Note 1: This article applies to the case where both metadata and versioned file data are being replicated by p4 pull.

Note 2: The following commands and configurables referenced in this article are available only with Perforce Server 2011.1 and later:

p4 journaldbchecksums
p4 verify -t
p4 pull -l -s             (the '-s'  option is new for 2011.1)
rpl.pull.position

Note 3: Replica servers do not always have their own copy of a license file and may rely on the license in the master server. When failing over to a replica, the replica must have a valid license file of its own. If it does not have one and the replica may be used in a DR scenario, you can obtain a license here:

 https://www.perforce.com/support-services/duplicate-server-request
