Perforce Public Knowledge Base - Filtering Data for Replica Servers
Filtering Data for Replica Servers

Problem

I do not want my replica to be a full copy of all the master server's data. I only want specific data to be replicated.

Can I filter the data between the master and replica server?

Solution

All Helix SCM servers at version 2013.1 or later allow filtering of data destined for a replica server. See Distributing Perforce for more information.

Caveats

Be aware that, in the current implementation, file and revision data filtering works best if you re-seed the replica or edge server after a Perforce administrator updates the server specification.

Detail

In addition to the existing -T option, which provides table-level exclusion filters, additional filtering options are available to address the following use cases:

  1. A replica server is to contain a subset of the master data.
  2. A checkpoint dump is to contain a subset of the data.
  3. A journal export is to contain a subset of the data.

Replica filtering adds new functionality to three commands: p4 pull, p4d -jd, and p4 export. This provides a common mechanism to describe the filtering.

The optimal mechanism for describing metadata filtering is to specify it as part of a server specification, and then to specify that server as an argument to the p4 pull, p4 export, and p4d commands.

Setting up Filtered Replication

Let us assume that the Perforce Replica named gabriel is a forwarding replica with the following p4 info and configurables output:

p4 info

User name: bruno
Client name: client1
Client host: replica1
Client root: /home/perforce/bruno/client1
Current directory: /home/perforce/bruno
Peer address: 127.0.0.1:44768
Client address: 127.0.0.1
Server address: replicaClientDataFilter1:1666
Server root: /home/perforce/bruno
Server date: 2013/06/27 12:46:51 -0700 PDT
Server uptime: 00:00:06
Server version: P4D/LINUX26X86_64/2013.1/659207 (2013/06/18)
ServerID: myforward
Server license: Perforce Software Inc. 3000 users (expires 2014/01/30)
Server license-ip: 10.20.30.222
Case Handling: sensitive

p4 configure show gabriel

gabriel: P4LOG = /home/perforce/bruno/repllog
gabriel: P4TARGET = master1:1666
gabriel: P4TICKETS = /home/perforce/bruno/.p4tickets
gabriel: db.replication = readonly
gabriel: lbr.replication = readonly
gabriel: monitor = 2
gabriel: rpl.forward.all = 1
gabriel: server = 3
gabriel: serviceUser = service
gabriel: startup.1 = pull -i 1
gabriel: startup.2 = pull -u -i 1
gabriel: startup.3 = pull -u -i 1
  1. Add "-P serverID" to the startup.1 configurable

    To enable the new replication filtering for a specific replica, change the replica's startup.1 configurable to include the -P option followed by the replica's server ID. Note that this is not one of the -Pic, -Pxc, -Pif, or -Pxf options, which are also available for filtering.

    For example:

    p4 configure set "gabriel#startup.1=pull -P myforward -i 1"
    For server 'gabriel', configuration variable 'startup.1' set to 'pull -P myforward -i 1'

    The startup.1 configurable now shows as:

    gabriel: startup.1 = pull -P myforward -i 1
    

    Note: p4 monitor show will not reflect this change until the replica server is restarted.

    Note: The journal records produced for this change will look similar to this example:

    @rv@ 1 @db.config@ @gabriel@ @startup.1@ @pull -P myforward -i 1@
    @rv@ 1 @db.config@ @gabriel@ @configurationVersion@ @69@
    
  2. Run p4 server serverid and add RevisionDataFilter and/or ClientDataFilter

    The new spec fields are ClientDataFilter: and RevisionDataFilter:, both described in the p4 server documentation.
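
As a rough illustration of step 1, the following Python sketch builds the new startup.1 value and the matching p4 configure set command line. The helper names add_server_filter and configure_set_command are hypothetical and not part of any Perforce tool. The sketch also assumes that pull lines carrying -u are left unchanged, because the example above only modifies startup.1.

```python
import shlex


def add_server_filter(pull_cmd, server_id):
    """Return pull_cmd with '-P server_id' inserted right after 'pull'."""
    tokens = shlex.split(pull_cmd)
    if not tokens or tokens[0] != "pull":
        raise ValueError("not a pull command: %r" % pull_cmd)
    if "-P" in tokens or "-u" in tokens:
        # Already filtered, or a '-u' line (assumption: leave these as they are).
        return pull_cmd
    return " ".join(["pull", "-P", server_id] + tokens[1:])


def configure_set_command(server_name, variable, value):
    """Format the 'p4 configure set' command line for one configurable."""
    return 'p4 configure set "%s#%s=%s"' % (server_name, variable, value)


new_value = add_server_filter("pull -i 1", "myforward")
print(configure_set_command("gabriel", "startup.1", new_value))
```

The printed command has the same form as the p4 configure set example in step 1. The configurable is keyed by the server name (gabriel), while -P takes the server ID (myforward).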

RevisionDataFilter

In this example, we specifically allow revision data from "//depot/..." but we specifically disallow (filter) data for "//depot/main/release4/perl_proj/...".

Note: Make sure to run the p4 server command using the ServerID and not the server name:

p4 serverid

Server ID:	myforward

p4 server myforward

ServerID:	myforward
Type:		server
Name:		gabriel
Address:	tcp:replica1:1666
Services:	forwarding-replica
Description:
	Forwarding replica pointing to master1:1666
RevisionDataFilter:
	//depot/...
	-//depot/main/release4/perl_proj/...

p4 admin restart

On the master, add a file to release4:

echo "release4 file" > release4.txt

p4 add release4.txt
//depot/main/release4/perl_proj/release4.txt#1 - opened for add

p4 submit -d "adding release4.txt"
Submitting change 24508.
Locking 1 files ...
add //depot/main/release4/perl_proj/release4.txt#1
Change 24508 submitted.

On the replica, we see the changelist, but it does not contain the file because the file is filtered out:

p4 changes -m 1
Change 24508 on 2013/06/27 by bruno@Bruno_Perl 'adding release4.txt'

p4 describe 24508
Change 24508 by bruno@Bruno_Perl on 2013/06/27 14:34:58

        adding release4.txt

Affected files ...

Differences ...

The file release4.txt is not on the replica server while it is on the master:

p4 files //depot/main/release4/perl_proj/release4.txt
//depot/main/release4/perl_proj/release4.txt - no such file(s).

p4 fstat //depot/main/release4/perl_proj/release4.txt
//depot/main/release4/perl_proj/release4.txt - no such file(s).

On the master, by way of testing, we add a file that is not in release 4, and therefore will not be filtered:

echo "release1 file" > release1.txt

p4 add release1.txt
//depot/main/release1/perl_proj/release1.txt#1 - opened for add

p4 submit -d "adding release1.txt"
Submitting change 24509.
Locking 1 files ...
add //depot/main/release1/perl_proj/release1.txt#1
Change 24509 submitted.

On the replica, the file appears normally as expected:

p4 changes -m 1
Change 24509 on 2013/06/27 by bruno@Bruno_Perl 'adding release1.txt'

p4 describe 24509
Change 24509 by bruno@Bruno_Perl on 2013/06/27 14:38:04

        adding release1.txt

Affected files ...

... //depot/main/release1/perl_proj/release1.txt#1 add

Differences ...

p4 sync //depot/main/release1/perl_proj/release1.txt
//depot/main/release1/perl_proj/release1.txt#1 - added as /home/perforce/bruno/myworkspace/main/release1/perl_proj/release1.txt

The corresponding journal record is:

@rv@ 0 @db.svrview@ @myforward@ 1 0 0 @//depot/...@
@rv@ 0 @db.svrview@ @myforward@ 1 1 1 @//depot/main/release4/perl_proj/...@

Note: Schema values are explained in the Helix SCM database schema.
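
To make the include and exclude behaviour of the RevisionDataFilter lines above easier to reason about, here is a minimal Python sketch of how such a filter could be evaluated. It is not the server's implementation. It assumes the usual Perforce view convention that the last matching line decides, and it assumes that a path matched by no line is replicated (the default parameter), which this article does not state. The function names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a depot pattern into a compiled regular expression.

    '...' matches any run of characters, including '/'.
    '*' matches within a single path component.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_replicated(depot_path, filter_lines, default=True):
    """Apply filter lines top to bottom; the last matching line decides."""
    decision = default  # assumption: unmatched paths are replicated
    for line in filter_lines:
        exclude = line.startswith("-")
        pattern = line[1:] if exclude else line
        if pattern_to_regex(pattern).fullmatch(depot_path):
            decision = not exclude
    return decision


REVISION_FILTER = ["//depot/...", "-//depot/main/release4/perl_proj/..."]
print(is_replicated("//depot/main/release4/perl_proj/release4.txt", REVISION_FILTER))
print(is_replicated("//depot/main/release1/perl_proj/release1.txt", REVISION_FILTER))
```

With the filter from this section, the release4 file is reported as not replicated and the release1 file as replicated, which matches what the replica shows in the transcripts above.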

ClientDataFilter

If you no longer want have data (the record of which revisions a workspace has synced) for the client workspace named myworkspace sent to the replica, because the client is not relevant to the replica's remote location, add a ClientDataFilter line to exclude this workspace:

p4 info | grep "Client name"

Client name: myworkspace

p4 server myforward

ServerID:	myforward
Type:		server
Name:		gabriel
Address:	tcp:replica1:1666
Services:	forwarding-replica
Description:
	Forwarding replica pointing to master1:1666
ClientDataFilter:
	-//myworkspace/...

p4 admin restart

The sync works as expected:

p4 sync -f //depot/main/release1/perl_proj/test2
//depot/main/release1/perl_proj/test2#1 - refreshing /home/perforce/bruno/myworkspace/main/release1/perl_proj/test2

But note the replica does not know the file has been synced:

p4 have //depot/main/release1/perl_proj/test2
//depot/main/release1/perl_proj/test2 - file(s) not on client.

The master does contain this information; it just was not replicated:

p4 -c myworkspace -H replica1 have //...
//depot/main/release1/perl_proj/fromrep.txt#1 - /home/perforce/bruno/myworkspace/main/release1/perl_proj/fromrep.txt
//depot/main/release1/perl_proj/test2#1 - /home/perforce/bruno/myworkspace/main/release1/perl_proj/test2
//depot/main/release1/perl_proj/utf8.txt#1 - /home/perforce/bruno/myworkspace/main/release1/perl_proj/utf8.txt

ArchiveDataFilter (requires Helix SCM 2013.2 or later)

If the master has large versioned files that are not relevant to the remote replica location, you can defer replicating those versioned files until they are specifically requested.

Example: To not transfer *.c files to the replica unless specifically requested:

  1. On the server, change the server specification to not transfer .c versioned files in the depot named depot.

    $ p4 server -o myforward
    # A Perforce Server Specification.                   
    <...>
    ArchiveDataFilter:
    	-//depot/....c
  2. Restart the replica server:

    $ p4 admin restart
  3. On the master, make changes to file networker.c at changelist 24833:

    $ p4 edit networker.c
    //depot/main/release1/perl_proj/networker.c#5 - opened for edit
    $ vi networker.c
    $ p4 submit -d "test of archive filter"
    Submitting change 24833.
    Locking 1 files ...
    edit //depot/main/release1/perl_proj/networker.c#6
    Change 24833 submitted.
    $
  4. On the replica, note that versioned file transfer is up to date (p4 pull -l lists no pending transfers):

    $ p4 pull -l
    $
  5. On the replica, the versioned file, networker.c,v, has not been updated. Note the head revision is at changelist 24832 and not the latest changelist 24833:

    $ head networker.c,v
    head     1.24832;
    access   ;
    symbols  ;
    locks    ;comment  @@;
    
    
    1.24832
    date     2013.12.09.12.46.17;  author p4;  state Exp;
    branches ;
    next     1.24831;
  6. On the replica, get the latest revision of networker.c:

    $ p4 sync networker.c
    //depot/main/release1/perl_proj/networker.c#6 - updating /home/perforce/rfong/centclient/main/release1/perl_proj/networker.c
  7. On the replica, only after a sync is the versioned file of networker.c updated. Now the versioned file on the replica shows changelist 24833:

    $ head networker.c,v
    head     1.24833;
    access   ;
    symbols  ;
    locks    ;comment  @@;
    
    1.24833
    date     2013.12.09.13.06.31;  author p4;  state Exp;
    branches ;
    next     1.24832;

Filtered Replication Will Show Versions Before Filtering

Below is an example of an edit on the master server to a filtered branch. The filtering was put in place after revision 13 of file gx10 was created, so revision 13 is on the replica server, but revision 14 and later exist only on the master server. When RevisionDataFilter is in use, this can confuse users who expect the replica to show a revision that exists on the master.

For example, a user on the master changes //depot/branch2/gx10:

$ p4 edit //depot/branch2/gx10
//depot/branch2/gx10#13 - opened for edit
... //depot/branch2/gx10 - also opened by giles@ws10200
... //depot/branch2/gx10 - also opened by giles@ws12100

$ p4 submit -d edit
Submitting change 2170.
Locking 1 files ...
edit //depot/branch2/gx10#14
Change 2170 submitted.

Because the branch is filtered, a user on the replica does not see this latest (filtered) change:

$ p4 fstat //depot/branch2/gx10
... depotFile //depot/branch2/gx10
... clientFile /home/giles/workspaces/filtered-client/depot/branch2/gx10
... isMapped
... headAction edit
... headType binary
... headTime 1358168377
... headRev 13               <--- note: previous revision (filtered)
... headChange 2169          <--- note: previous change   (filtered)
... headModTime 1358168158
... haveRev 13
... ... otherOpen0 giles@ws10200
... ... otherAction0 edit
... ... otherChange0 2009
... ... otherOpen1 giles@ws12100
... ... otherAction1 edit
... ... otherChange1 2165
... ... otherOpen 2

On the master, the output looks like this:

$ p4 fstat //depot/branch2/gx10
... depotFile //depot/branch2/gx10
... headAction edit
... headType binary
... headTime 1358169395
... headRev 14
... headChange 2170
... headModTime 1358168158
... ... otherOpen0 giles@ws10200
... ... otherAction0 edit
... ... otherChange0 2009
... ... otherOpen1 giles@ws12100
... ... otherAction1 edit
... ... otherChange1 2165
... ... otherOpen 2

Additionally, we can see from checkpoints of the master and the replica that the data has in fact been filtered for change 2170:

grep gx10 replica.ckp.170 | grep 2170
No data returned

grep gx10 master.ckp.170 | grep 2170

@pv@ 9 @db.rev@ @//depot/branch2/gx10@ 14 65539 1 2170 1358169395 1358168158 E77BBA67A14ABA34C3E2B1FC573F3873 48 0 0 @//depot/branch2/gx10@ @1.2170@ 65539
@pv@ 0 @db.revcx@ 2170 @//depot/branch2/gx10@ 14 1
@pv@ 9 @db.revhx@ @//depot/branch2/gx10@ 14 65539 1 2170 1358169395 1358168158 E77BBA67A14ABA34C3E2B1FC573F3873 48 0 0 @//depot/branch2/gx10@ @1.2170@ 65539

In the db.svrview journal records shown in the RevisionDataFilter section above, the second of the three numeric fields is a sequence number that keeps the rows ordered. If the server's filter has multiple lines, the first line is seq=0, the second line is seq=1, the third line is seq=2, and so on.
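
The following Python sketch shows one way to read db.svrview journal records back into ordered filter lines. Only the meaning of the sequence number comes from this article. Treating the first numeric field as a filter kind and the third as an exclude flag is an assumption inferred from the two example records, and the function names are hypothetical.

```python
import re

# A journal field is either an @-quoted string (@@ is an escaped @) or a bare token.
FIELD = re.compile(r"@((?:[^@]|@@)*)@|(\S+)")


def split_fields(record):
    """Split one journal record into its fields, removing the @ quoting."""
    fields = []
    for match in FIELD.finditer(record):
        quoted, bare = match.groups()
        fields.append(quoted.replace("@@", "@") if quoted is not None else bare)
    return fields


def parse_svrview(record):
    """Parse a db.svrview record into a dict (field meanings partly assumed)."""
    op, version, table, server_id, kind, seq, flag, path = split_fields(record)
    if table != "db.svrview":
        raise ValueError("not a db.svrview record: %r" % record)
    return {
        "server_id": server_id,
        "kind": int(kind),       # assumption: identifies which filter field
        "seq": int(seq),         # sequence number, as described above
        "exclude": flag == "1",  # assumption: 1 marks an exclusion line
        "path": path,
    }


def filter_lines(records):
    """Rebuild the filter lines in sequence order."""
    rows = sorted((parse_svrview(r) for r in records),
                  key=lambda row: (row["kind"], row["seq"]))
    return [("-" if row["exclude"] else "") + row["path"] for row in rows]


RECORDS = [
    "@rv@ 0 @db.svrview@ @myforward@ 1 1 1 @//depot/main/release4/perl_proj/...@",
    "@rv@ 0 @db.svrview@ @myforward@ 1 0 0 @//depot/...@",
]
print(filter_lines(RECORDS))
```

Even though the records are supplied out of order here, sorting on the sequence number restores the order in which the lines appear in the server specification.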

Note: Metadata already present in the server will be used as the basis for 'p4 verify -t'; the filter specification will not stop the verify command from pulling archive files referenced in the replica's metadata.

Seeding a filtered replica using a filtered dump file will alleviate this issue.

Filtered Replication Will Still Allow Retrieval of Filtered Files

Even if filtered replication is in place, files that are filtered out of the replica can still be synced through it. This is because read-write commands such as p4 sync go directly to the master.

For example, the following server specification replicates only files in release1, so files in directory release2 are not replicated:

$ p4 server -o myforward
# A Perforce Server Specification.
<snip>
ServerID:	myforward
Type:		server
Name:		gabriel
Services:	forwarding-replica

Description:
	Filtered forwarding replica pointing to master:20152

RevisionDataFilter:
	-//depot/...
	//depot/main/release1/perl_proj/...

  1. Create a file on the master that will be filtered:

    $ echo "filterrep1" > filterrep1.txt
    $ p4 add filterrep1.txt
    //depot/main/release2/perl_proj/filterrep1.txt#1 - opened for add
    $ p4 submit -d "adding filterrep1.txt"
    Submitting change 41593.
    Locking 1 files ...
    add //depot/main/release2/perl_proj/filterrep1.txt#1
    Change 41593 submitted.
  2. On the master, the new file exists:

    $ p4 fstat -Oc //depot/main/release2/perl_proj/filterrep1.txt
    ... depotFile //depot/main/release2/perl_proj/filterrep1.txt
    ... clientFile /home/perforce/p4work/20152/client/main/release2/perl_proj/filterrep1.txt
    ... isMapped
    ... headAction add
    ... headType text
    ... headTime 1478301002
    ... headRev 1
    ... headChange 41593
    ... headModTime 1478300981
    ... haveRev 1
    ... lbrFile //depot/main/release2/perl_proj/filterrep1.txt
    ... lbrRev 1.41593
    ... lbrType text
    ... lbrIsLazy 0
    
    $ p4 files //depot/main/release2/perl_proj/filterrep1.txt
    //depot/main/release2/perl_proj/filterrep1.txt#1 - add change 41593 (text)
  3. On the master the versioned file exists:

    $ p4 info | grep -i "Server root"
    Server root: /home/perforce/p4work/20152
    
    $ ls /home/perforce/p4work/20152/depot/main/release2/perl_proj/filterrep1.txt,v
    /home/perforce/p4work/20152/depot/main/release2/perl_proj/filterrep1.txt,v
  4. Switching to the replica, the replica is set to filter out release2 files:

    $ p4 configure show | grep pull
    startup.1=pull -P myforward -i 1 (configure)
    startup.2=pull -u -i 1 (configure)
    startup.3=pull -u -i 1 (configure)
    
    $ p4 server -o myforward
    <...>
    ServerID:	myforward
    Type:		server
    Name:		gabriel
    Services:	forwarding-replica
    
    Description:
    	Filtered forwarding replica pointing to linux-perforce:20152
    	
    RevisionDataFilter:
            -//depot/...
            //depot/main/release1/perl_proj/...
  5. The replica cannot see the release2 files as expected:

    $ p4 files //depot/main/release2/perl_proj/filterrep1.txt
    //depot/main/release2/perl_proj/filterrep1.txt - no such file(s).
    
    $ p4 fstat -Oc //depot/main/release2/perl_proj/filterrep1.txt
    //depot/main/release2/perl_proj/filterrep1.txt - no such file(s).
  6. The versioned file is not on the replica as expected:

    $ p4 info | grep -i "Server root"
    Server root: /home/perforce/replica
    
    $ ls /home/perforce/p4work/20152/depot/main/release2/perl_proj/filterrep1.txt,v
    ls: cannot access /home/perforce/p4work/20152/depot/main/release2/perl_proj/filterrep1.txt,v: No such file or directory
  7. Yet the file can still be synced:

    $ p4 sync //depot/main/release2/perl_proj/filterrep1.txt
    
    //depot/main/release2/perl_proj/filterrep1.txt#1 - added as /home/perforce/replica/replclient/main/release2/perl_proj/filterrep1.txt

    This is because the read-write command is serviced by the master rather than the replica. To prevent these files from being retrieved, use the "p4 protect" command to exclude them explicitly.

Filtering a Dump File

The server spec is used to determine what data is filtered when creating a dump file; for instance:

$ p4d-2013.1 -r. -P Replica13FR -jd filtered-dump

Comparing the result with an unfiltered dump shows the difference:

$ grep branch2 unfiltered-dump | wc -l
40498

$ grep branch2 filtered-dump | wc -l
217

$ p4 -p popeye:13288 files //depot/branch2/...
//depot/branch2/... - no such file(s).
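
The grep and wc counts above give a single total. A small Python sketch such as the one below can break that total down by database table, which may help when checking which tables still mention a filtered path. It assumes each record sits on one line with the table name as the third field, as in the master checkpoint records shown earlier in this article. Records whose text spans several lines are not handled. The function name count_by_table is hypothetical.

```python
from collections import Counter


def count_by_table(lines, needle):
    """Count dump records containing needle, grouped by table name."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        fields = line.split(None, 3)
        # The third field is the table name, for example @db.rev@.
        if len(fields) >= 3 and fields[2].startswith("@db."):
            counts[fields[2].strip("@")] += 1
    return counts


# Records copied from the master checkpoint example earlier in this article.
SAMPLE = [
    "@pv@ 9 @db.rev@ @//depot/branch2/gx10@ 14 65539 1 2170 1358169395 1358168158 E77BBA67A14ABA34C3E2B1FC573F3873 48 0 0 @//depot/branch2/gx10@ @1.2170@ 65539",
    "@pv@ 0 @db.revcx@ 2170 @//depot/branch2/gx10@ 14 1",
    "@pv@ 9 @db.revhx@ @//depot/branch2/gx10@ 14 65539 1 2170 1358169395 1358168158 E77BBA67A14ABA34C3E2B1FC573F3873 48 0 0 @//depot/branch2/gx10@ @1.2170@ 65539",
]
for table, count in sorted(count_by_table(SAMPLE, "branch2").items()):
    print(table, count)
```

To run it against a real dump, pass an open file object instead of SAMPLE, for example count_by_table(open("filtered-dump"), "branch2").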

Filtering p4 export Output

If we export all of the records for checkpoint 170 we find 54243 records containing branch2:

$ p4 export -c 170 | grep branch2 | wc -l
54243

Using the filter definition to exclude //depot/branch2/... records:

$ p4 export -c 170 -P Replica13FR | grep branch2 | wc -l
273

We can also specify the actual filter on the command line:

$ p4 export -c 170 -Pxf://depot/branch2/... | grep branch2 | wc -l
273

The full list of options:

-Pic://client/pattern  -- client records to include
-Pxc://client/pattern  -- client records to exclude
-Pif://depot/pattern   -- depot records to include
-Pxf://depot/pattern   -- depot records to exclude

More information can be found in p4 help export.
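
As an illustration of the option syntax listed above, this Python sketch turns filter lines written in server-spec style (a leading "-" means exclude) into command-line options. Mapping spec lines to these options one-to-one is an assumption made for the example, the sketch does not check whether several -P options may be combined in one command, and the function name export_filter_args is hypothetical.

```python
def export_filter_args(client_lines=(), depot_lines=()):
    """Build -Pic/-Pxc/-Pif/-Pxf options from spec-style filter lines."""
    args = []
    groups = (
        (client_lines, "-Pic:", "-Pxc:"),
        (depot_lines, "-Pif:", "-Pxf:"),
    )
    for lines, include_flag, exclude_flag in groups:
        for line in lines:
            if line.startswith("-"):
                args.append(exclude_flag + line[1:])
            else:
                args.append(include_flag + line)
    return args


args = export_filter_args(depot_lines=["-//depot/branch2/..."])
print(" ".join(["p4", "export", "-c", "170"] + args))
```

For the single exclusion line used here, the printed command matches the -Pxf example shown above.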

Filtering and Performance

-P filter interpretation requires the master to parse journal records. Unlike the pre-2013.1 -T journal record filtering, which works with the un-parsed journal record data (it only needs to know the table name and the raw journal record), the 2013.1 -P filters need to interpret the columns in the journal record.

This means that with release 2013.1 and later, if the p4 pull and p4 export commands include -P filters, the master does more processing of journal records than it did in prior releases.
