Runaway Processes
What is a runaway process?
Occasionally a process will stop responding and run wild. Such a process consumes as much CPU time as the system will give it, often close to 100% of a CPU. Because other processes can only get limited access to the CPU, the machine begins to run very slowly.
How can I identify a runaway process on my computer?
The 'w' command at the terminal will print out a list of current users of a machine, and it will tell you the machine's “load average.” The load average is the average number of processes that are running or waiting to run (on Linux, processes blocked on disk input/output are counted as well), reported over the last 1, 5, and 15 minutes. On a machine with a single CPU core, a load average of 1 means the machine is under full load; on a multi-core machine, full load corresponds to a load average equal to the number of cores. Anything above that means the machine is getting behind on its processing. If your machine has a load average near or over that level, and you are not running anything really resource intensive on the machine, then you probably have a runaway process sapping your machine's processing power.
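Because the meaning of a load average depends on how many CPU cores a machine has, it can help to divide one by the other. The following is a minimal sketch, assuming a Linux machine that provides /proc/loadavg and the nproc command; load_per_core is a hypothetical helper name, not a standard utility.

```shell
# Print the 1-minute load average divided by the number of CPU cores.
# Assumes Linux: /proc/loadavg and the coreutils `nproc` command.
load_per_core() {
    cores=$(nproc)
    # The first field of /proc/loadavg is the 1-minute load average.
    load=$(cut -d ' ' -f 1 /proc/loadavg)
    # awk performs the floating-point division that plain sh cannot do.
    LC_ALL=C awk -v l="$load" -v c="$cores" 'BEGIN { printf "%.2f\n", l / c }'
}

load_per_core
```

A result near or above 1.00 suggests that every core is busy; a result well below 1.00 suggests the machine still has CPU capacity to spare.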
The 'top' command lists the processes that are taking up the most system resources. The left column of top's output is the PID, or process ID. This number is necessary to identify the process if you want to kill it. top also shows the user that owns the process, the priority and nice value with which the process is running, the percentage of CPU and memory it is currently using, how much CPU time it has consumed in total, and the command that was executed to start it. We use this information to determine whether a process is truly a runaway or a resource intensive program that we should allow to continue executing.
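When a one-time snapshot is more convenient than top's interactive display, ps can print similar columns sorted by CPU usage. This is a sketch, assuming the procps version of ps found on most Linux systems (the --sort option is not part of POSIX); top_cpu is a hypothetical helper name.

```shell
# Print the N processes using the most CPU (default 5), plus a header line.
# Assumes procps `ps`; --sort=-pcpu sorts by CPU usage, highest first.
top_cpu() {
    count=${1:-5}
    ps -eo pid,user,pri,ni,pcpu,pmem,time,comm --sort=-pcpu \
        | head -n "$((count + 1))"
}

top_cpu 5
```

A process that stays at the top of this list with a steadily growing TIME value, and that nobody is knowingly running, is a likely runaway.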
The 'ps aux' command gives you much of the same information about processes that top provides. (On most Linux systems the options are written without a leading dash; 'ps -aux' may print a warning or be interpreted differently.) When used in conjunction with grep it can be a very useful utility. For example,
ps aux | grep vim
will list the details of all of the vim processes running on the system.
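One wrinkle with piping ps into grep is that the grep command itself usually shows up in the results, since its own command line contains the search word. A common workaround is to put the first letter of the name in brackets. The sketch below assumes a plain process name with no special characters; find_procs is a hypothetical helper name.

```shell
# List processes matching a name without matching the grep command itself.
# The pattern "[v]im" still matches "vim", but grep's own command line
# contains the literal text "[v]im", which the pattern does not match.
find_procs() {
    rest=${1#?}            # the name without its first character
    first=${1%"$rest"}     # the first character alone
    ps aux | grep "[$first]$rest"
}

find_procs vim || echo "no vim processes found"
```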
See man w, man top, man ps, and man grep for more information about any of these commands.
Killing runaways
kill
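kill is the standard Unix utility for terminating nasty processes. It takes the PID of the process to terminate as its argument. You only have the right to kill your own processes. By default kill sends the TERM signal, which asks the process to exit and gives it a chance to clean up. Some processes do not respond to the standard kill command. These processes might need a more forceful signal such as -9 (the KILL signal), which a process cannot catch or ignore. The command killall selects processes by name rather than by PID, so it can help you to kill multiple processes at once. man kill and man killall will give you the details about these commands.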
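It is usually kinder to try the standard signal first and fall back to -9 only if the process is still running after a short wait. The following is one possible sketch of that approach; stop_process and the 5 second default wait are hypothetical choices, not part of kill itself.

```shell
# Ask a process to exit, then force it only if it is still alive.
# Usage: stop_process PID [SECONDS_TO_WAIT]
stop_process() {
    pid=$1
    grace=${2:-5}                               # seconds before escalating
    # Send the default TERM signal; fail if the PID is gone or not ours.
    kill -TERM "$pid" 2>/dev/null || return 1
    while [ "$grace" -gt 0 ]; do
        # Signal 0 sends nothing; it only checks that the process exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$pid" 2>/dev/null               # equivalent to kill -9
}
```

For example, stop_process 12345 10 would give the (made up) PID 12345 ten seconds to exit on its own before sending it the KILL signal.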
The System Administrators
If you cannot kill a process using kill, and you think that it is taking an inappropriate share of system resources, please report it to the System Administrators in room 1140 TMCB or open a ticket at support.cs.byu.edu.
How not to have your process killed
Use nice and renice
The commands nice and renice control the priority of your processes. The higher the nice value, the lower the priority. nice is used to start a new process with the specified priority. renice is used to adjust the priority of a currently running process. A non-root user can lower the priority (increase the niceness) of their own processes, but cannot raise the priority of any process. This includes processes that the user originally niced, i.e. nicing a process cannot be undone without root access.
You can tell the priority of a process using top. The PR column (third; labeled PRI in some versions of top) is the priority of each process, and the NI column (fourth) is the nice value of each process. When run without an explicit adjustment, nice adds 10 to the nice value. On Linux, 19 is the highest possible nice value (lowest priority). For more information, see the man pages.
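As an illustration of nice and renice together, the sketch below starts a harmless placeholder command (sleep 60, standing in for a long job) at reduced priority and then reduces its priority further. It assumes a ps that accepts -o ni= and the traditional renice PRIORITY -p PID form, which sets an absolute nice value.

```shell
# Start a placeholder job 10 nice levels below this shell's own priority.
nice -n 10 sleep 60 &
pid=$!
sleep 1                                   # give the job a moment to start

# Read the job's nice value (the NI column) before and after renice.
before=$(ps -o ni= -p "$pid" | tr -d ' ')
renice 15 -p "$pid" >/dev/null            # set the nice value to 15
after=$(ps -o ni= -p "$pid" | tr -d ' ')

echo "nice value of $pid: $before -> $after"
kill "$pid"                               # clean up the placeholder job
```

Trying the reverse, for example renice 0 -p on the same process, fails for a non-root user, which is the one-way behavior described above.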
Get prior authorization
If you feel that you need to run a process that will be exceptionally resource intensive or that needs to run for an extended period of time, you need to get prior approval to run that process so that it does not get killed. Such approval must be granted through the department CSRs at the request of a sponsoring professor. In such a situation, it is probably just as easy to get permission to use the Fulton Supercomputer. The Supercomputer is much more appropriate for most resource intensive research and projects than the open lab machines.