Perforce Public Knowledge Base - Interpreting server log files
Perforce Software logo
Reset Search
 

 

Article

Interpreting server log files

« Go Back

Information

 
Problem

Beginning with Perforce Server version 2005.2, detailed tracking of Perforce Server activity is automatically enabled and verbose diagnostic output is displayed in the log file. Prior to the 2005.2 version of the Perforce Server, only errors were logged, by default. This article describes the verbose log output and options.

Solution

Turning on verbose logging

The default level of verbose logging is governed by hard-coded thresholds as noted in the release note excerpt below. Verbosity of the log file produced by Perforce Server can be set by invoking p4d with the -vtrack flag. The following description from the 2005.2 section of the P4D release notes offers more details:

Server Performance Tracking - #86623 **

The p4d server now produces diagnostic output to help identify
performance problems. The new support is on by default but
can be turned off or adjusted with the '-vtrack=x' flag to
the server. Any user commands that exceed certain thresholds
for resource usage (CPU, lapse time, database I/O, network
I/O, among other things) automatically get logged into the
server error log P4LOG. The levels that can be set with
-vtrack=x are:

0 turn off tracking
1 track all commands
2 track excess usage for a server < 10 users
3 track excess usage for a server < 100 users
4 track excess usage for a server < 1000 users
5 track excess usage for a server > 1000 users

If -vtrack is not provided on the server command line or
set with P4DEBUG, the tracking level is computed from the
number of users listed in the server license file.

The exact format of the tracking output is not documented,
and subject to change.


Unless a Perforce Server is started using the "-v track=1" option, it is not guaranteed to log every command. To ensure that every command is logged (not just those commands exceeding the default performance thresholds), set the "-v server=1" flag in the p4d invocation for the installation. Note that multiple -v flags can be added to a p4d start command. More information on server trace flags can be found in KB Article: Perforce Server Trace Flags.
 

Deciphering the first lines

Perforce server info:
        2007/08/02 14:25:10 pid 16256 cpflaum@work2 127.0.0.1
[p4/2007.2/LINUX26X86_64/122958] 'user-sync -p //depot/WORK/...'
--- lapse 109s


The first two lines provide the information about what command was run, when it was run, what server was run on, and by whom. The lapse information on the third line is wall clock time, and shows that the command took 109 seconds to run.

The usage information format is based on the output of the getrusage command, common to many Unix systems.

Note: Not all fields are populated. For example, from the Linux getrusage man page:

"Not all fields are meaningful under various operating systems. Right now (Linux 2.4 [kernel version]) only the fields ru_utime, ru_stime, ru_minflt, ru_majflt, and ru_nswap are maintained. Since Linux 2.6 [kernel version], ru_nvcsw and ru_nivcsw are also maintained"

On some operating systems, like Solaris, the getrusage call is simulated and might result in zero values or meaningless negative values.
 

Deciphering usage

--- usage 9383+3937us 0+0io 0+0net 0k 29916pf

9383+3937us

Time spent executing user instructions + Time spent in operating system code on behalf of processes. Times are represented in milliseconds (us)
 

0+0io

The number of times the file system had a 512-byte physical read from the disk on behalf of processes + the number of times the file system had a 512-byte physical write on behalf of processes.

If the io is seen in a parallel sync such as

'user-transmit -t 61063 -b 8 -s 524288'=20 
--- usage 56739+97512us 3448+21128824io 0+0net 12064k 0pf

this would be interpreted as a large number of physical disk writes (21,128,824 512-byte disk blocks) which includes pages written to db.have to update the client's have list after receiving files and pages written to db.sendq removing records corresponding to the files that this transmit thread sent to the client.  The physical disk writes are those that occur prior to the fsync() call on a modified db.*.

0+0net

IPC (Inter Process Communication) messages received on behalf of processes + IPC message sent on behalf of processes.

0k

The maximum number of kilobytes of physical memory that processes used simultaneously.

29916pf

The number of page faults that were serviced by doing I/O.

 

RPC (remote procedure call)

--- rpc msgs/size in+out 3+431751/0mb+1417mb {himarks 64835/64835}

The above output example displays the number of RPC messages and size of the data transfers to and from the disk, in megabytes. In this case, a 1.417 GB was transferred from the server to the local client during a sync command. While 3 messages were sent back, the size of those messages was small enough to be a rounding error, thus zero megabytes. A submit command would show the opposite condition, with a large number of messages and data being sent to the server, with fewer coming back to the client.

Note: Optionally "himarks" is displayed to show the send/receive window size used for the RPC messages. This can be useful to catch some client issues where the window sizes (in bytes) may be set incorrectly by the OS. For example, an issue with older Perforce Windows clients with an artificially low send window size would appear as:

msgs/size in+out 0+2/0mb+0mb himarks 64835/2000
--- rpc receive errors, duplexing F/R 0/0 


Deciphering lock times

This section concerns how many files and revisions were scanned in affected database files, and information about file locking on the Perforce server.

Read Locks:

A read lock still allows other processes to read the particular table in the Perforce database. However, any write lock must wait for the read lock to complete.

Write Locks:

A write lock blocks any other processes from reading or writing to the particular table in the Perforce database.

If a process is waiting its turn for an action to complete, it is a victim. If a process locks the database table and other processes must wait, then it is a culprit. If a large number for read or write lock wait time is seen, look for an earlier corresponding large write lock hold time. This will identifye the process holding other processes off.
 

Good performance Example

--- db.resolve
---   total lock wait+held read/write 0ms+252ms/0ms+0ms

No time was spent waiting for the db.resolve table locks, and the process held a read lock for 252ms. The lock is only held for a fraction of of a second and therefore is not tying up the Perforce db.resolve database file.
 

Slow performance example

---   total lock wait+held read/write 23763ms+463ms/582568ms+343ms
  • The process was blocked for 23.763s waiting for a read lock (so it is a victim of another process holding the table).
  • The process held a read lock for 0.463 seconds (a relatively fast operation).
  • The process waited for 582.568 seconds before it could issue its own write lock (so now it is very much a victim of one or more processes that could be holding locks on multiple tables).
  • The process wrote for 0.343 seconds (still a fairly fast operation).

Therefore this process is not causing a performance slowdown, but it was a victim. An end user might believe that Perforce was stalled or crashed during this time, however the server is still running, albeit very slowly. This is typical of a server undergoing a heavy usage spike, and will often "unwind" when load drops.
 

Buffering

In the course of processing a command the server reads and writes data pages to and from a buffer cache:

--- db.revhx
---   pages in+out+cached 182412+0+64

in = number of pages read into the Perforce buffer/cache.
out = number of pages written out of the Perforce buffer/cache.
cached = number of pages retained in the Perforce buffer/cache.

Note: The "cached" number of pages can be changed through the undocumented dbopen.cache tunable.  It may be helpful on tables that have a high number of positions (pos).  The cached number applies per process so it will not help another user if the command is re-run. The default value is 96, and should only be changed at the guidance of Perforce support.

Table scanning

"Get" fetches a single row given a key. "Position" and "Scan" work together to fetch many rows -- the key is given to "Position", and then "Scan" returns rows after that key.

For example, "Get" is used to get the db.domain record for a client. "Position/Scan" are used to get all db.have records for a client.

--- db.revhx
---   locks read/write 1/0 rows get+pos+scan put+del 0+1+26635 0+0

The database only had to position once and scanned 26,635 revisions. If you see a lot of positioning to read relatively few rows (records), you might need to examine your protection table, client mappings, and command structure. Typically the fewer positions per file, the more efficient the process.

The -Ztrack flag

You can also use the -Ztrack flag on the p4 command-line to give database usage and locking information:

p4 -Ztrack submit

Once the command is completed the resulting database statistics are displayed in the same format as the log file.

More information

Some additional tools to help analyze your server and log files:

  • The Perforce Server Log Analyzer - a publicly available web application that allows you to upload your log file, and quickly see some of the commands that are adversely affecting server performance.

  • TRACK2SQL - a publicly available script that will create a mysql database of process information from a log file. This can be useful for identifying large processes that may affect overall server performance.

  • Simple P4D Log Analysis - a simple command to analyze vtrack results.

If you have any questions about your server performance and would like assistance in your log file analysis, please contact Perforce support.

Related Links

Feedback

 

Was this article helpful?


   

Feedback

Please tell us how we can make this article more useful.

Characters Remaining: 255