If the Helix Server is not responding for a user, follow these steps to isolate the problem. It is important to determine whether the Helix Server is down, slow, or hung.
1. Run p4 info from the end user workstation.
Have the user who is experiencing the problem run
p4 -p <server IP>:1666 info
where 1666 is the port. Using the IP address bypasses DNS. If p4 is not installed, have the user download the P4: Command-Line Client. The p4 info command uses very few resources on the Perforce database.
If p4 info fails with a "check $P4PORT" error, double-check that the user entered the correct IP address and port number after the -p flag. A "check $P4PORT" error indicates a network connectivity issue, a firewall issue, or, most likely, that Perforce is down.
If p4 info does return, then the Helix Server is up. The Helix Server may be slow, but it is not down.
If p4 info returns quickly, have the user run a command that uses more resources, such as
p4 changes -t -m 10
Check how fast this runs, or if it hangs.
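To make "how fast" and "hangs" concrete, the check can be wrapped in a time limit. The following is a minimal sketch, assuming a Unix-like workstation with the GNU coreutils timeout command. The function name probe and the labels it prints are illustrative and are not part of Perforce.

```shell
# probe LIMIT COMMAND...: run COMMAND with a time limit of LIMIT seconds.
# Prints "ok" if it finished successfully, "timeout" if it was still running
# when the limit expired, and "failed" for any other error.
# Hypothetical helper; assumes GNU coreutils `timeout` is installed.
probe() {
    limit=$1
    shift
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "ok"
    else
        rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "timeout"   # GNU timeout exits with 124 when the limit is reached
        else
            echo "failed"
        fi
    fi
}

# Example calls (substitute the real address for <server IP>):
#   probe 10 p4 -p <server IP>:1666 info
#   probe 60 p4 -p <server IP>:1666 changes -t -m 10
```

A "timeout" result from p4 info suggests a hung or very slow server, while "failed" corresponds to the connection error case described above. The limits of 10 and 60 seconds are arbitrary starting points.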
To find out whether the Helix Server is up but waiting on the database, check the output from
p4 lockstat -C
The "p4 lockstat" command reports whether the Helix Server is processing commands that currently hold locks on the database. If database tables are locked, run this command repeatedly to check whether the locks are being released.
The "p4 lockstat -C" command checks for client or global metadata locks, as described in "Client Workspace and Global Metadata Locks".
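Running the lock check repeatedly by hand is tedious, so a small loop can help. This is a sketch only: poll is a hypothetical helper name, and the count and pause values in the example are arbitrary.

```shell
# poll COUNT PAUSE COMMAND...: run COMMAND COUNT times, sleeping PAUSE seconds
# between runs, and print a timestamp line before each run.
# Hypothetical helper, not part of Perforce.
poll() {
    count=$1
    pause=$2
    shift 2
    n=0
    while [ "$n" -lt "$count" ]; do
        echo "--- $(date '+%H:%M:%S') ---"
        "$@"
        n=$((n + 1))
        if [ "$n" -lt "$count" ]; then
            sleep "$pause"
        fi
    done
}

# Example call: check locks six times, ten seconds apart.
#   poll 6 10 p4 lockstat -C
```

If the same locks appear in every run, they are not being released on their own.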
If locks are present, you may have to ask the client to stop their processes or perhaps reboot their workstation.
Alternatively, to stop the processes on the server side, run
p4 monitor show
p4 monitor terminate <pid>
Before running "p4 monitor terminate", make sure that db.monitor.interval is set by checking
p4 configure show allservers
You can set db.monitor.interval by running
p4 configure set db.monitor.interval=30
2. Run p4 info while logged onto the Helix Server.
Use Remote Desktop or ssh to log in to the Helix Server.
A. If p4 info does not run on the Helix Server
If p4 info returns on the server, then the Helix Server is up and running. But if p4 info cannot connect, then check if the Helix Server parent process is running.
On Unix, run
ps -elf | grep p4d
and look for a running p4d process. Make sure you do not mistake the grep command line for the actual p4d process.
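One common way to avoid mistaking the grep command line for p4d is to put a bracket around the first letter of the pattern. The sketch below wraps that in a filter. The name p4d_lines is hypothetical.

```shell
# p4d_lines: read process listing lines on stdin and print those that
# mention p4d. The pattern [p]4d matches the text "p4d", but the grep
# process itself shows up in ps as "grep [p]4d", which does not contain
# the consecutive characters "p4d" and so is filtered out.
# Hypothetical helper, not part of Perforce.
p4d_lines() {
    grep '[p]4d'
}

# Example call:
#   ps -elf | p4d_lines
```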
On Windows, check that the Perforce service under Control Panel, Services, is running.
On either platform, if you cannot run p4 info on the Helix Server machine, the server is effectively unusable, so you might as well stop it. Kill the parent (not child!) pid, then allow up to five minutes for all p4d processes to exit, running "ps -elf | grep p4d" frequently to check. On Unix, use the "kill" or "kill -15" command on the parent Perforce pid. There is usually no need for "kill -9". On Windows servers, kill the parent process by stopping the Perforce service or by using Task Manager.
If child processes remain five minutes after the parent pid has been killed, make sure that your running journal is not growing, then kill some of the child p4d processes until the rest terminate normally. Then wait until all p4d or p4s processes are gone and restart Perforce. If Perforce does not start, view the Perforce log for clues.
If you want to investigate further, check whether CPU and memory are adequate.
On Unix, run
top
and press the number 1 to see each processor. Determine whether the p4d process is consuming all the CPU or memory.
On Windows, run Task Manager and determine whether the p4d or p4s process is consuming all the CPU or memory. Note that overall CPU usage may be low while a single processor sits at 100% for minutes at a time, which indicates that there is not enough CPU to run a process.
B. If p4 info does run properly on the Helix Server
If p4 info works properly on the Helix Server, the server is up. If p4 info runs without errors on the server but not on the end user's workstation, the likely cause is a network or firewall issue. Assuming p4 info runs on the server, try a more resource-intensive command such as
p4 changes -t -m 10
If this command hangs, the Helix Server is running slowly.
In any case, run
p4 lockstat -C
to see if the Helix Server database is locked. If the database is locked, run
p4 monitor show -ael
and look at the oldest commands that are not forms waiting to be filled out. (Ignore forms like "p4 client" or "p4 label" because these are just waiting for a user to complete the form.) While the elapsed times of the oldest commands may not always be accurate, they provide candidates to kill. To free up commands that are locking the Helix Server, contact the user to stop their command or, assuming db.monitor.interval is set as described earlier, run
p4 monitor terminate <pid>
where <pid> is taken from "p4 monitor show -ael". Start with the oldest non-form pids.
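Picking out the oldest non-form commands can be scripted. The sketch below assumes that each line of "p4 monitor show -ael" output begins with the pid and contains an elapsed time in HH:MM:SS form immediately followed by the command name. Verify this against your server's output before relying on it. The name oldest_commands and the short list of form commands are illustrative.

```shell
# oldest_commands: read `p4 monitor show -ael` output on stdin and print
# "elapsed-seconds pid command" for each non-form command, oldest first.
# Assumption: field 1 is the pid, and the field right after the HH:MM:SS
# elapsed time is the command name. Hypothetical helper.
oldest_commands() {
    awk '
    {
        for (i = 2; i < NF; i++) {
            if ($i ~ /^[0-9]+:[0-9][0-9]:[0-9][0-9]$/) {
                cmd = $(i + 1)
                # Skip forms that are only waiting on a user; extend as needed.
                if (cmd == "client" || cmd == "label") next
                split($i, t, ":")
                print t[1] * 3600 + t[2] * 60 + t[3], $1, cmd
                next
            }
        }
    }' | sort -rn
}

# Example call:
#   p4 monitor show -ael | oldest_commands
```

The first lines of the result are the candidates to discuss with their owners or to pass to "p4 monitor terminate".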
If the Perforce database is not locked according to p4 lockstat, check whether the Helix Server journal is growing.
tail -f <pathto>/journal
If the journal is growing rapidly, then the Helix Server is processing commands. Simply wait for the running command to complete. Killing a sub-process that is accessing the Helix Server can corrupt the database files.
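As an alternative to watching tail output by eye, the growth can be measured as a number. This is a rough sketch. journal_growth is a hypothetical helper, and what counts as "rapid" growth depends on your site.

```shell
# journal_growth FILE SECONDS: print how many bytes FILE grew during
# SECONDS seconds. Hypothetical helper, not part of Perforce.
journal_growth() {
    before=$(wc -c < "$1")
    sleep "$2"
    after=$(wc -c < "$1")
    echo $((after - before))
}

# Example call (use the same journal path as in the tail command above):
#   journal_growth <pathto>/journal 10
```

A result of 0 over several intervals means nothing is being written to the journal.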
If the journal seems sluggish and shows only occasional db.user and db.domain entries (logins), then check hardware resources for adequate CPU and memory. On Unix, use
top
and on Windows, use Task Manager. If "top" or Task Manager shows a lack of memory or 100% CPU, find the process or thread that is using up the memory or CPU.
In any case, you may want to use "p4 monitor terminate <pid>" to remove a process or thread as mentioned above. If CPU or memory is at 100%, run "p4 monitor terminate <pid>" to kill the high-resource command, then re-run the command on fewer files (for example, by running "p4 obliterate", "p4 integrate" or "p4 revert" on one smaller directory at a time).
3. From here, you will have plenty to go on.
- From p4 lockstat and p4 lockstat -C, you will know if the Helix Server is running, but a command is locking the database.
- From running p4 info commands on the same machine as the server, you will isolate network issues from your client to the server.
- From "top" or Task Manager, you will know whether you are out of CPU or memory.
- From "p4 monitor show -ael", you will know if the Helix Server is overwhelmed with processes and you can start guessing which process you can stop by running "p4 monitor terminate".
Feel free to run "p4 monitor terminate <pid>" at any time. It is a safe way to kill processes and will not harm the Perforce database. If you do this, let the end user know you terminated their command.
If you need off-hours support, note that each of our UK, Canada, US, and Australia offices will support you from Monday through Friday excluding holidays. Use our support page and dial our international number, or send a new email to email@example.com. If you have an existing case number, place this into the body of the message.