Filter during replication or edge-to-edge chaining

For performance reasons, you might want to ensure that replication occurs only where necessary. Rules for replica filtering are therefore useful.

Filtering rules

As part of an HA/DR solution, you typically want to ensure that all the metadata and all the versioned files are replicated. In most other use cases, particularly build servers and forwarding replicas, this leads to a great deal of redundant data being transferred.

Metadata is the information that P4 Server maintains, such as who created file revisions in the depot, whether the file is a 'lazy copy', the current state of client workspaces, protections, groups, users, labels, streams, and branches. Metadata is stored in the server database and is separate from the 'archive files' that users submit from their client workspace into the depot.

Versioned files are the source files stored in the depot, including one or more revisions to each file. They are also known as archive files, archives, and depot files. Versioned files typically use the naming convention 'filename,v' or '1.changelist.gz'.

It is often advantageous to configure your replica servers to filter data on client workspaces and file revisions.
For example:

Filters apply only to metadata and files that are handled by the p4 pull threads defined in the startup.N configurables.
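As a rough illustration, the pull threads that filters apply to are typically declared with startup.N configurables like the ones printed below. This is a sketch, not a recommended configuration: the server ID is borrowed from the Replica scenario example, the helper name pull_thread_commands is hypothetical, and the 1-second polling interval and the choice of one metadata thread plus one versioned-file thread are assumptions. The script only prints the commands so that they can be reviewed before being run.

```shell
#!/bin/sh
# Print (do not run) the configure commands that declare pull threads
# for a replica. Hypothetical helper; intervals and thread counts are
# assumptions to adjust for your site.
pull_thread_commands() {
    server_id=$1
    # Metadata (journal) pull thread.
    printf 'p4 configure set "%s#startup.1=pull -i 1"\n' "$server_id"
    # Versioned-file pull thread (-u transfers archive files).
    printf 'p4 configure set "%s#startup.2=pull -u -i 1"\n' "$server_id"
}

pull_thread_commands "site1-1668"
```

Filter fields in the server spec then govern what these threads transfer; data that is not handled by a pull thread is outside the scope of the filters.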

Changes to the filtering rules in the spec of a replica server or edge server affect the replication of server data from the target server to the replica server or edge server. The target server is the immediately upstream server for replica servers, edge servers, standby servers, proxies, and brokers. A replica is a P4 Server that automatically maintains a full or partial copy of the central server's metadata and that might contain related file content. The replica copies by using 'p4 pull' or 'p4 journalcopy', and can be used as a backup server for disaster recovery.

In the case of edge servers, the filters affect what the upstream server replicates to the downstream replicas, where the upstream server is either the commit server or an upstream edge server in a chain of edge servers. An upstream server is any server in the inward direction, that is, toward the central server. For example, in an edge-to-edge configuration with a commit, edge1, and edge2, both edge1 and the commit server are upstream servers for edge2. The commit server is the innermost P4 Server in a topology with one or more edge servers.

When are filtering rules applied?

Changing a filter to exclude file revisions does not cause versioned files to be removed from the edge or replica server.

When filtering rules are applied depends on the version of the server.

In 2025.2 and later, changes to filtering rules to include previously excluded versioned files cause the replica (or edge) to be updated with retrospective metadata for the versioned file revisions the next time that either of the following runs:

Prior to 2025.2, changes to filtering rules to include previously excluded file revisions take effect on future revisions of the file.

In 2025.2 and later, P4 Server does not require recreating (reseeding) the replica if the replica filter changes, because automatic replica filter reconciliation applies the change. This feature also causes the transfer of the relevant versioned files to the replica or edge server.

Prior to 2025.2, P4 Server requires recreating (reseeding) the replica if the replica filter changes. This is because changes to the filtering rules in the server specification are not applied retrospectively.

The replica filter reconcile feature combines the ability to purge records that are no longer needed with the ability to fetch records that have become necessary. Be aware that changes in replication might have a performance impact.

By default, the journal pull thread automatically performs the reconciliation when it detects a change in the replica filter.

These configurables control the behavior of the reconciliation:

Purge database records

  • To change the number of threads used when purging multiple database tables in parallel, use the rpl.filter.restrict.threads configurable.

  • To change the number of database records that are processed in parallel, use the rpl.filter.restrict.batch configurable.

  • To disable automatic purging, set the rpl.filter.restrict configurable to 0.

Fetch database records

  • To change the number of threads used to process database tables in parallel, use the rpl.filter.expand.threads configurable.

  • To change the block size used for calculating the checksum that compares database records to those of the upstream server, use the rpl.filter.expand.batch configurable. Blocks are used to avoid dumping the entire table to the journal if only a few records are missing on the replica.

  • To disable automatic fetching, set the rpl.filter.expand configurable to 0.
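The reconciliation configurables can be set per server with p4 configure set. The sketch below only prints candidate commands for review. The helper name reconcile_config_commands is hypothetical, the server ID is taken from the Replica scenario example, and the thread count of 4 is a placeholder rather than a recommended or documented value.

```shell
#!/bin/sh
# Print (do not run) "p4 configure set" commands for a list of
# name=value settings scoped to one server ID. Hypothetical helper.
reconcile_config_commands() {
    server_id=$1
    shift
    for setting in "$@"; do
        printf 'p4 configure set "%s#%s"\n' "$server_id" "$setting"
    done
}

# Placeholder thread counts for purging and fetching (assumed values).
reconcile_config_commands "site1-1668" \
    "rpl.filter.restrict.threads=4" \
    "rpl.filter.expand.threads=4"

# Disable automatic purging only; automatic fetching stays enabled.
reconcile_config_commands "site1-1668" "rpl.filter.restrict=0"
```

Scoping each setting with the serverid# prefix keeps the change local to one replica or edge server instead of applying it to every server that shares the configuration.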

An administrator can run a manual replica filter reconciliation from the command line, which provides an option to specify which tables the command applies to. To learn more, see p4 admin replica-filter-reconcile in the P4 CLI Reference.

Sensitive or unneeded versioned files can be removed by running p4 cachepurge before reseeding the replica or edge server.

Reseed the replica

First, on the target server, create a filtered checkpoint:

p4d -r /p4/london -P site1-1668 -jd myCheckpoint

where

  • /p4/london is the P4ROOT directory of the target server, and

  • site1-1668 is the name of the server spec containing filter rules used to create a filtered checkpoint.

Then, replay that checkpoint file on the replica server:

p4d -r /p4/replica -jr myCheckpoint

Referring to the example server spec in Replica scenario, the effects of filtering are:

(a) the exclusion of client metadata for workspace names matching site2-ws-* and site3-ws-*

(b) that pull threads no longer transfer versioned files with the .mp4 suffix.

The p4 verify -t command does not respect the ArchiveDataFilter that applies to a specific server instance. If metadata exists on that instance and p4 verify -t detects that the archive file is missing within the file[revRange] provided, the command schedules the file for transfer. This can affect performance and can populate the server with archive files that it was not intended to contain.

Two ways to filter

You can either exclude entire database tables or filter by fields of the server spec.

The simplest way to filter metadata is by using the -T tableexcludelist option with the p4 pull command. For example, if you know that a build server has no need to refer to any of your users' have lists or the state of their client workspaces, you can filter out db.have and db.working entirely with p4 pull -T db.have,db.working.

Excluding entire database tables is a coarse-grained method of managing the amount of data passed between servers, requires some knowledge of which tables are most likely to be referred to during P4 Server command operations, and offers no means of control over which versioned files are replicated.
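For a persistent pull thread, the table exclusion list would normally be part of a startup.N configurable. The sketch below prints one such command. The helper name table_exclude_command is hypothetical, the server ID comes from the Build server scenario example, and the 1-second interval and the placement of -T after -i are assumptions.

```shell
#!/bin/sh
# Print (do not run) a configure command that declares a metadata pull
# thread which skips the listed database tables. Hypothetical helper.
table_exclude_command() {
    server_id=$1
    tables=$2    # comma-separated list, for example db.have,db.working
    printf 'p4 configure set "%s#startup.1=pull -i 1 -T %s"\n' \
        "$server_id" "$tables"
}

table_exclude_command "builder-1669" "db.have,db.working"
```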

You can have fine-grained control over what data is replicated by using the ClientDataFilter:, RevisionDataFilter:, and ArchiveDataFilter: fields of the p4 server form. These fields enable you to replicate only a subset of the server metadata and versioned files to a replica or edge server. For this feature to work, the Services: field in the server spec must be set to a value other than commit-server or standard.
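When auditing which filters a server currently uses, it can help to pull a single filter field out of the spec text. The following is a minimal sketch that assumes the spec layout shown in the examples on this page (field names at column 0, values indented). The helper names spec_field and sample_spec are hypothetical, and the sample text stands in for the output of p4 server -o.

```shell
#!/bin/sh
# Print the value lines of one multi-line field from a server spec
# read on standard input. Hypothetical helper.
spec_field() {
    awk -v field="$1:" '
        /^[^ \t#]/ { active = ($1 == field); next }  # field name at column 0
        active && NF { sub(/^[ \t]+/, ""); print }   # indented value lines
    '
}

# Sample input copied from the Replica scenario example spec. On a live
# server the input would come from: p4 server -o site1-1668
sample_spec() {
    cat <<'EOF'
ServerID:       site1-1668
Services:       replica
ClientDataFilter:
        //...
        -//site2-ws-*/...
        -//site3-ws-*/...
RevisionDataFilter:
ArchiveDataFilter:
        //....c
        -//....mp4
EOF
}

sample_spec | spec_field ClientDataFilter
```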

Replica scenario

Example:  Filtering out client workspace data and files

If workspaces for users in each of three sites are named with site[123]-ws-username, a replica intended to act as partial backup for users at site1 could be configured as follows:

ServerID:       site1-1668
Name:           site1-1668
Type:           server
Services:       replica
Address:        tcp:site1bak:1668
Description:
        Replicate all client workspace data, except the states of
        workspaces of users at sites 2 and 3.
        Automatically replicate .c files in anticipation of user
        requests. Do not replicate .mp4 video files, which tend
        to be large and impose high bandwidth costs.
ClientDataFilter:
        //...
        -//site2-ws-*/...
        -//site3-ws-*/...
RevisionDataFilter:
ArchiveDataFilter:
        //....c
        -//....mp4

When you start the replica, your p4 pull metadata thread might resemble the following:

p4 configure set "site1-1668#startup.1=pull -i 30"

In this configuration, only those portions of db.have that are associated with site1 are replicated. All metadata concerning workspaces associated with site2 and site3 is ignored.

All file-related metadata is replicated. All files in the depot are replicated, except for those with the .mp4 extension. Files ending in .c are transferred automatically to the replica when submitted.
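To make the effect of the ClientDataFilter in this example concrete, the sketch below mimics its three lines with shell glob patterns and reports which workspaces would have their metadata replicated. It is only an approximation of the filter for workspace names, the function name workspace_is_replicated is hypothetical, and the user names alice, bob, and carol are invented for illustration.

```shell
#!/bin/sh
# Return 0 if the workspace's metadata would be replicated under the
# example ClientDataFilter, 1 if it would be filtered out.
workspace_is_replicated() {
    case "$1" in
        # Mirrors -//site2-ws-*/... and -//site3-ws-*/...
        site2-ws-*|site3-ws-*) return 1 ;;
        # Mirrors //...
        *) return 0 ;;
    esac
}

# Hypothetical workspace names following the site[123]-ws-username scheme.
for ws in site1-ws-alice site2-ws-bob site3-ws-carol; do
    if workspace_is_replicated "$ws"; then
        echo "$ws: replicated"
    else
        echo "$ws: filtered out"
    fi
done
```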

Build server scenario

Consider a build server scenario. The ongoing work of the organization (such as code, business documents, or videos) can be stored anywhere in the depot. In contrast, this build farm is dedicated to building releasable products, and therefore only needs a subset of the organization’s output:

Example:  Replicating metadata and file contents for a subset of a depot

Releasable code is placed into //depot/releases/... and automated builds are based on these changes.

ServerID:       builder-1669
Name:           builder-1669
Type:           server
Services:       build-server
Address:        tcp:built:1669
Description:
        Exclude all client workspace data
        Replicate only revisions in release branches
RevisionDataFilter:
        //depot/releases/...
ArchiveDataFilter:
        //depot/releases/...

Exclude a subset of paths

If you want to exclude a subset of paths, ensure that the inclusionary lines precede the exclusionary lines. For example:

RevisionDataFilter:
    //... 
    -//depot/releases/...
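The ordering requirement can be illustrated with a small model of how filter lines might be evaluated. The sketch assumes that a later matching line overrides an earlier one, as in other P4 view-style mappings, and it handles only the "..." wildcard. The function name path_included and the sample depot paths are hypothetical, and this is not how P4 Server itself implements filtering.

```shell
#!/bin/sh
# Return 0 if a depot path is included by the given filter lines,
# 1 otherwise. Assumption: the last matching line wins.
path_included() {
    candidate=$1
    shift
    result=1    # excluded unless some line includes the path
    for line in "$@"; do
        exclude=0
        case "$line" in
            -*) exclude=1; line=${line#-} ;;
        esac
        # Translate the "..." wildcard into a shell glob
        # (assumption: no other wildcards are used).
        glob=$(printf '%s' "$line" | sed 's/\.\.\./*/g')
        case "$candidate" in
            $glob) if [ "$exclude" -eq 1 ]; then result=1; else result=0; fi ;;
        esac
    done
    return $result
}

for depot_path in //depot/main/app.c //depot/releases/1.0/app.c; do
    if path_included "$depot_path" '//...' '-//depot/releases/...'; then
        echo "$depot_path: replicated"
    else
        echo "$depot_path: excluded"
    fi
done
```

Under the same assumption, reversing the two lines would let the later //... line re-include everything under //depot/releases/..., which is why the inclusionary line is placed first.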