There are many factors that can affect Perforce Server (P4D) performance. These factors are related to the hardware and operating system on the machine where P4D executes, the filesystem where the db.* files are located, and how Perforce is used at a site. Improvement in any of these areas might have an effect on general P4D performance, though improvement in one area can have more of an effect than improvement in the others.
For best performance, a machine should be dedicated to a single Perforce Server. By dedicating a machine, the single Perforce Server is able to utilize all available resources as needed. An operating system's filesystem cache is generally more effective when caching metadata for a single Perforce Server than if the filesystem cache is shared with other Perforce Servers or other applications.
The memory, I/O subsystem, and processors in the machine where P4D executes can all have an effect on performance. Maximizing the amount of memory is usually a good recommendation, since most modern operating systems will use a significant portion of the memory not needed for process space as filesystem cache. I/O requests that can be satisfied from a larger filesystem cache complete faster than requests that must be satisfied from beyond the filesystem cache.
For I/O requests that must be satisfied from beyond the filesystem cache, there might be several improvements possible for the I/O subsystem. The storage subsystem containing the db.* files should have a memory cache; maximizing the storage subsystem's memory cache is also a good recommendation. For best performance, write-back caching should be enabled, which of course requires that the storage subsystem's memory have battery backup power. I/O latency to the logical drive where the db.* files are located should be minimized, including the rotational latency of the physical drives themselves. Minimizing I/O latency might require direct connections between the host and the storage subsystem, and usually requires physical drives with the fastest rotational speed (such as 15K RPM).
RAID 1+0 (or RAID 10) is usually the better performing RAID configuration, and is recommended for the logical drive where the db.* files are located. The number of physical drives in the logical drive can also have an effect on P4D performance. Generally, performance improves as the number of physical drives in the logical drive increases. For a given amount of disk space required, better performance might result from using a larger number of smaller-capacity physical drives. The stripe size for the logical drive can also affect performance; the optimal stripe size might be dependent upon the number of physical drives in the logical drive.
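The trade-off between drive capacity and drive count can be sketched with a little arithmetic. The example below is only an illustration: it assumes a plain RAID 10 layout in which half of the raw capacity is usable, ignores formatting overhead and hot spares, and uses hypothetical drive sizes. The function name `raid10_drive_count` is made up for this sketch.

```python
import math


def raid10_drive_count(required_gb, drive_gb):
    """Return the number of physical drives needed for a RAID 10 logical drive.

    Assumption: every drive is mirrored once, so usable capacity is half of
    the raw capacity. Formatting overhead and spares are ignored.
    """
    data_drives = math.ceil(required_gb / drive_gb)
    return 2 * data_drives  # each data drive has one mirror


# Hypothetical requirement: 600 GB of usable space for the db.* files.
print(raid10_drive_count(600, 300))  # fewer, larger drives
print(raid10_drive_count(600, 146))  # more, smaller drives
```

For the same usable space, the smaller drives yield more physical drives in the logical drive, which is the situation the paragraph above describes as generally performing better.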
Hardware-based RAID implementations (that is, RAID logic that is not implemented as software running on the host) usually have good performance characteristics. Software-based RAID implementations can require CPU cycles that might otherwise be needed for P4D processes. Therefore, software-based RAID implementations should be avoided.
Journal and Versioned File Location
For recoverability, the live journal should not be on the same physical device that contains the db.* files. The separation of the live journal and the db.* files is also motivated by performance considerations. During operations that write to the db.* files, entries are written to the live journal as records are written to the db.* files. If the live journal and the db.* files are on the same physical device, the I/O throughput to the db.* files is degraded. For best performance, the live journal should be on a separate storage subsystem connected to a separate host adapter. The live journal should be on a logical drive and filesystem that is optimized for sequential writes.
The versioned files should be located on a different logical drive from the logical drives where the db.* files and the live journal are located. For best performance, the logical drive where the versioned files are located should be on a separate storage subsystem connected to a separate host adapter. Since the versioned files typically require significantly more disk space and the I/O throughput is not as critical as for the db.* files, a more economical RAID configuration, such as RAID 5, can be used for the logical drive where the versioned files are located.
Faster processors and memory in the machine where P4D executes might result in faster execution of P4D commands. Since portions of some commands acquire and hold resources that might block other commands, it is important that these portions of the commands execute as fast as possible. For example, most P4D commands have a "compute phase", during which shared locks are acquired and held on some of the db.* files. A shared lock on a db.* file will block an operation that might write to the same db.* file. If the data needed for a command's compute phase is cached within the operating system's filesystem cache, only the processor and memory speeds constrain the compute phase. It is therefore important that the processors and memory in the machine where P4D executes are as fast as possible for best performance.
For most Perforce installations, performance is generally better when preference is given to faster processors rather than more processors or processors with more cores for the machine where P4D executes. But factors such as the complexity of the site's protections table and client views can affect the CPU requirements. CPU utilization can be monitored using OS utilities such as "top" (on Linux and Unix) and "perfmon" (on Windows). Installations with high CPU utilization on the machine where P4D executes that are already using faster processors might need more processors and/or processors with more cores while maintaining the speed of the processors.
Some processors and operating systems support dynamic frequency scaling, which allows the processor to vary power consumption by dynamically adjusting the processor voltage and core frequency. As more demand is placed on the processor, the voltage and core frequency increase. Until the processor is ramped up to full speed, P4D performance might be impacted. Although the power-saving capability of the dynamic frequency scaling feature is useful for mobile computers, it is not recommended for the machine where P4D executes.
Two examples of dynamic frequency scaling are:
- Intel SpeedStep - available on some Xeon processors and generally available on mobile computers
- AMD PowerNow! - available on an array of AMD processors, including server-level processors
Both of these features are supported on Linux (and enabled by default in some SuSE distributions), Windows, and Mac OS X platforms. If this feature is enabled on the machine where P4D executes, we recommend disabling it. In some Linux distributions, such as SuSE, this feature can be disabled by setting the "powersaved" service to "off".
You might be able to determine the current speed of the processors on your computer. On Linux, the current speed of each core is reported on the "cpu MHz" line in the output from the "cat /proc/cpuinfo" OS command.
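As a rough illustration of reading those values programmatically, the sketch below extracts every "cpu MHz" line from text in the /proc/cpuinfo format. The function name `parse_cpu_mhz` and the sample text are made up for this example; on some architectures the "cpu MHz" line might be absent, in which case the function returns an empty list.

```python
def parse_cpu_mhz(cpuinfo_text):
    """Return the value of each "cpu MHz" line as a float, one per core."""
    speeds = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            speeds.append(float(value))
    return speeds


# Hypothetical excerpt in the /proc/cpuinfo format, for illustration only.
sample = (
    "processor\t: 0\n"
    "cpu MHz\t\t: 1200.000\n"
    "processor\t: 1\n"
    "cpu MHz\t\t: 2394.230\n"
)
print(parse_cpu_mhz(sample))
```

On a Linux machine, the same function could be given the contents of /proc/cpuinfo (for example, `parse_cpu_mhz(open("/proc/cpuinfo").read())`). Cores reporting a speed well below the processor's rated speed might indicate that dynamic frequency scaling is active.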
Choice of operating system for the machine where P4D executes can also affect performance. 32-bit operating systems might not be able to address large amounts of physical memory, which can restrict the effective size of the filesystem cache. The various 64-bit operating systems each have their own performance characteristics that can favor a particular Perforce workload. In general, Linux distributions using later Linux 2.6 64-bit kernels have good performance characteristics for most Perforce workloads.
Filesystem performance is an important component of operating system performance. The various operating systems usually offer several filesystems, each with their own performance characteristics that can favor a particular Perforce workload. For best P4D performance, the db.* files should be located on a high-performance filesystem. In general, the XFS filesystem has good performance characteristics for most Perforce workloads. The XFS filesystem is available on several operating systems, including Linux distributions using later Linux 2.6 64-bit kernels.
Reading pages into a cache in anticipation of being requested is an optimization that is often implemented within various I/O subsystem components. This optimization is commonly known as "read-ahead". In some implementations, read-ahead can be tuned, which might result in better performance. But tuning read-ahead can be a bit of an art. For example, increasing the read-ahead size might result in better performance for operations requiring mostly sequential reads. But the same increased read-ahead size applied consistently during random reads might unnecessarily discard previously-cached data that might have satisfied subsequent requests.
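The trade-off can be illustrated with a toy model. The sketch below is not how any particular operating system or storage subsystem implements read-ahead; it assumes a simple least-recently-used page cache in which every miss fetches the requested page plus a fixed number of following pages. The cache size, page numbers, and the function name `hit_rate` are all hypothetical. A repeated pass over a small set of widely scattered pages stands in for random reads.

```python
from collections import OrderedDict


def hit_rate(accesses, cache_pages, read_ahead):
    """Simulate an LRU page cache and return the fraction of cache hits.

    On a miss, the requested page and the next `read_ahead` pages are
    loaded, evicting the least recently used pages as needed.
    """
    cache = OrderedDict()
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            for p in range(page, page + read_ahead + 1):
                cache[p] = True
                cache.move_to_end(p)
                while len(cache) > cache_pages:
                    cache.popitem(last=False)  # evict least recently used
    return hits / len(accesses)


sequential = list(range(1000))
scattered = [i * 1000 for i in range(48)] * 10  # 48 hot pages, revisited

print(hit_rate(sequential, 64, 0), hit_rate(sequential, 64, 7))
print(hit_rate(scattered, 64, 0), hit_rate(scattered, 64, 15))
```

In this model, read-ahead helps the sequential scan considerably, while for the scattered pattern the prefetched pages crowd the hot pages out of the cache before they are requested again.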
How Perforce is used can also affect performance, and several usage patterns have a direct effect. Since the depot filenames are the leading portion of the key in several important db.* files (db.rev, db.revhx and db.integed are among the more notable), the length of the paths in the depot filenames has a direct effect on performance. As the length of the paths increases, performance decreases. It is therefore prudent to discourage the use of overly-descriptive paths in the depot filenames.
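One plausible contributor to this effect is that longer keys leave room for fewer entries on each page of a db.* file. The sketch below is back-of-the-envelope arithmetic only: the page size, the per-entry overhead, and the function name `keys_per_page` are assumptions made for illustration, not documented P4D values.

```python
def keys_per_page(page_bytes, key_bytes, overhead_bytes):
    """Estimate how many entries fit on one page.

    Assumption: each entry occupies the key length plus a fixed overhead.
    """
    return page_bytes // (key_bytes + overhead_bytes)


PAGE_BYTES = 8192      # hypothetical page size
OVERHEAD_BYTES = 20    # hypothetical per-entry overhead

print(keys_per_page(PAGE_BYTES, 60, OVERHEAD_BYTES))   # shorter depot paths
print(keys_per_page(PAGE_BYTES, 140, OVERHEAD_BYTES))  # longer depot paths
```

Under these assumptions, the longer paths roughly halve the number of entries per page, so the same number of records occupies more pages and each key comparison examines more bytes.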
The development methodology can also have a direct effect on performance. If the development methodology calls for frequent creation of full branches (perhaps branching for each bug fix), then the amount of metadata rapidly increases, resulting in more levels within the db.* file btrees. As the number of levels increases, more key comparisons and I/O requests are required to traverse to the leaf pages, which degrades performance. Creating full branches also requires more metadata to be read and written; this additional I/O might displace other data from the filesystem cache, to the detriment of other Perforce tasks. Rather than frequent creation of full branches, it might be prudent to branch only those files needed for each bug fix, or consider a development methodology in which multiple bug fixes can occur on the same branch.
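The relationship between the amount of metadata and the number of btree levels can be sketched as follows. This is an idealized model that assumes every page is full and holds the same number of entries; the fan-out of 100, the record counts, and the function name `btree_levels` are hypothetical.

```python
def btree_levels(records, fanout):
    """Return the minimum number of levels needed to index `records` entries.

    Assumption: an idealized tree in which each page holds exactly
    `fanout` entries, so `levels` levels can address fanout**levels records.
    """
    levels = 1
    capacity = fanout
    while capacity < records:
        capacity *= fanout
        levels += 1
    return levels


# Hypothetical record counts before and after heavy branching.
print(btree_levels(1_000_000, 100))
print(btree_levels(50_000_000, 100))
```

Because the number of levels grows with the logarithm of the record count, a single additional branch rarely adds a level, but sustained creation of full branches can eventually do so, and every lookup then pays for the extra page traversal.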
Perforce strives to improve P4D performance in each release. In general, best performance will result from using the latest P4D release. Before deploying a new P4D release, a site might want to perform some amount of acceptance testing using data and a workload that is meaningful for the site.