Perforce Public Knowledge Base - Why replicated journals are not identical
Why replicated journals are not identical

Problem

Rotated journal files that were replicated using p4 pull have the same journal file number on the main server and on each replica, but differ slightly in size from one server to the next.  If the data was replicated identically and correctly, why are the files different?

Solution

Within a Perforce journal file there are special meta-records used for parsing and processing.  These allow Perforce to properly group and replay the other records contained within the file.

They include the following types of records:

  • @nx@ -- Journal notes, which contain event information
  • @mx@ -- Mid-transaction markers
  • @ex@ -- End-transaction markers
Journal notes contain information including the location of P4ROOT and the name of the journal file. If this data is different between the servers, there will be different information stored in the journal files created by the servers.

The second data field in an @nx@ record is a timestamp.  Timestamps are unlikely to be identical when comparing journal records written by a master and a replica.  Furthermore, journal header notes (type 2 in the first field) store the P4ROOT and the name of the journal file.  If these differ between servers, the data on these lines will differ as well.

The three lines listed here come from three different replicated journal files.  The first comes from a master; the other two come from replicas.  Note how the information in the journal note is different for each:

@nx@ 2 1321655625 @31@ 1 0 0 0 0 @/perforce/p4root@ @journal@ @@ @@ @@
@nx@ 2 1321654991 @31@ 1 0 0 0 0 @/perforce/p4replica@ @journal@ @@ @@ @@
@nx@ 2 1321655614 @31@ 1 0 0 0 0 @/perforce/p4replica@ @journal@ @@ @@ @@

A thorough discussion of the @nx@ journal note entries can be found in this knowledge base article.
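
The three header notes shown earlier can be compared field by field to see exactly where they diverge. The following is a minimal sketch, not a Perforce tool: the helper names `split_fields` and `differing_fields` are made up for illustration, and the parser assumes that text values are wrapped in `@...@` and contain no embedded `@` characters.

```python
import re

# Assumption: a field is either @text@ (possibly empty, as in @@) or a bare token.
FIELD = re.compile(r"@([^@]*)@|(\S+)")

def split_fields(line):
    """Split one journal line into its fields, unwrapping @...@ values."""
    fields = []
    for m in FIELD.finditer(line):
        fields.append(m.group(1) if m.group(1) is not None else m.group(2))
    return fields

def differing_fields(line_a, line_b):
    """Return the 0-based positions at which two journal lines differ."""
    a, b = split_fields(line_a), split_fields(line_b)
    longest = max(len(a), len(b))
    a += [None] * (longest - len(a))
    b += [None] * (longest - len(b))
    return [i for i in range(longest) if a[i] != b[i]]

master = "@nx@ 2 1321655625 @31@ 1 0 0 0 0 @/perforce/p4root@ @journal@ @@ @@ @@"
replica1 = "@nx@ 2 1321654991 @31@ 1 0 0 0 0 @/perforce/p4replica@ @journal@ @@ @@ @@"
replica2 = "@nx@ 2 1321655614 @31@ 1 0 0 0 0 @/perforce/p4replica@ @journal@ @@ @@ @@"

# Position 0 is the record marker, 1 the note type, 2 the timestamp.
print(differing_fields(master, replica1))    # timestamp and P4ROOT differ
print(differing_fields(replica1, replica2))  # only the timestamp differs
```

In this sample the master and a replica differ in the timestamp and the P4ROOT path, while the two replicas differ only in the timestamp.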

The @mx@ and @ex@ records in journal files contain process ID and timestamp information. These, too, will differ across machines. Because process ID values are between one and five digits long on most systems, the same marker can occupy a different number of bytes on each server, which also causes the journal file size to vary.
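
The effect of process ID width on file size can be estimated with simple arithmetic. The sketch below assumes an end-transaction marker is written as `@ex@ <pid> <timestamp>` followed by a newline; that layout, the function names, and the sample process IDs and transaction count are assumptions for illustration and should be checked against a real journal.

```python
def end_marker(pid, timestamp):
    """Build one end-transaction marker line (assumed layout, see above)."""
    return f"@ex@ {pid} {timestamp}\n"

def marker_bytes(pid, timestamp, transactions):
    """Total bytes used by end markers for a number of transactions."""
    return len(end_marker(pid, timestamp).encode("ascii")) * transactions

# Hypothetical values: a one-digit pid on one server, a five-digit pid on another.
timestamp = 1321655625
transactions = 10_000

short_pid_total = marker_bytes(7, timestamp, transactions)
long_pid_total = marker_bytes(48213, timestamp, transactions)

# Four extra digits per marker add up across many transactions.
print(long_pid_total - short_pid_total)
```

Under these assumptions the two journals would differ by four bytes per transaction from the end markers alone, before any other difference is considered.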

[Adding the following:

The replica's journals are *logically* equivalent, but not *physically* identical.

During replication, the journal processing logic does some re-arrangement
of the journal data, which can result in a more compact journal on the
replica side.

For example, consider the case in which you modify a group definition
using 'p4 group', and add the 101st user to the group.

The master will carry out this change by journaling 100 @dv@ operations
to delete the prior members of the group, then 101 @pv@ operations to add
the current members of the group.

The replica, however, will process these records differently: after sorting
the records, it will realize that 100 of those users were deleted and then
immediately re-added, and only one record was new, so it will just journal
the 101 current members of the group as 100 @rv@ operations and 1 @pv@ operation.
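
The group example can be modeled to show that both record streams lead to the same final state even though they differ in length. This is a simplified sketch, not Perforce's replay logic: it treats a group as a plain set of user names, treats @pv@ as an add, @dv@ as a delete, and @rv@ as a replace of an existing entry, and uses made-up user names and a hypothetical `replay` helper.

```python
def replay(members, records):
    """Apply (op, user) records to a set of group members and return the result."""
    result = set(members)
    for op, user in records:
        if op == "dv":            # delete
            result.discard(user)
        elif op in ("pv", "rv"):  # put or replace: the entry exists afterwards
            result.add(user)
        else:
            raise ValueError(f"unknown operation: {op}")
    return result

old_members = [f"user{i:03d}" for i in range(1, 101)]   # 100 existing members
new_members = old_members + ["user101"]                  # the 101st is added

# Master: delete all prior members, then put all current members.
master_records = [("dv", u) for u in old_members] + [("pv", u) for u in new_members]

# Replica: replace the 100 unchanged members, put the single new one.
replica_records = [("rv", u) for u in old_members] + [("pv", "user101")]

print(len(master_records), len(replica_records))
print(replay(old_members, master_records) == replay(old_members, replica_records))
```

In this model the master stream contains 201 records and the replica stream 101, yet replaying either one produces the same 101 group members.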

There are other cases like this; there are also a few odd cases in which
the replica can journal *more* data than the master (for example, a Build Farm
replica's journals include all the work replicated from the master, as well
as the 'sync' operations performed by build automation clients on the Build Farm).

The bottom line is that you can't directly compare the raw bytes in the
journals of the two servers, as they are only logically equivalent, not
physically identical.

... ]

Related Links
On a similar note, the following KB article may also be of interest. 
Content of a replica journal
