BSD DevCenter

Discovering System Processes Part II

Sometimes you may start a process and wish to stop it before it is finished. For example, in a spurt of inspiration you might decide that you want to see the name of every file on your FreeBSD system, so you type this at your terminal:



find / -print |more

However, you soon grow tired of pressing the spacebar and decide that you really didn't want to see all of your files at this time. In other words, you want to send an interrupt signal. One way to do this is:

^C

You'll know that your INT signal (signal 2) worked because you'll get your prompt back.

Retry the same find command, but this time send signal 3 (QUIT) like so:

^\

Just before you get your prompt back, you'll see the following message:

Quit  (core dumped)

If you press Alt+F1 to return to the console, you'll see a message similar to this:

Nov 19 13:50:09 genisis /kernel: pid 806 (find), uid 1001: exited on signal 3
Nov 19 13:50:09 genisis /kernel: pid 807 (more), uid 1001: exited on signal 3 (core dumped)

And if you do a directory listing at your original terminal, you should see a file called more.core. Normally, you won't be sending a signal 3 to a process unless you're a programmer and know how to examine the resulting core file with a debugger. I included the example to show the difference between a signal 2 (INT) and a signal 3 (QUIT); you can safely delete that *.core file.
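If you'd like to check which name goes with a signal number, kill can translate it for you. Here is a small sketch, assuming a POSIX-style sh whose kill accepts -l followed by a number (the csh built-in may behave differently); the variable names sig2 and sig3 are only for illustration:

```shell
#!/bin/sh
# Translate the signal numbers used so far into their names.
# Assumption: a POSIX-style shell whose kill supports "-l number".
sig2=$(kill -l 2)    # the signal sent by ^C
sig3=$(kill -l 3)    # the signal sent by ^\
echo "signal 2 is $sig2, signal 3 is $sig3"
```

Running kill -l with no number lists every signal name your system knows about.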

Interprocess communication isn't much different from any other type of communication. You or another process can send a signal requesting a desired result, but it is up to the process receiving the signal to decide what it wants to do with that signal. Remember that processes are simply running programs; most programs use something called a "signal handler" to decide how and when to respond to signals. Usually if you send some type of termination signal, the signal handler will try to gracefully close all the files that process has opened, to prevent data loss, before the process itself exits. Sometimes, the signal handler will decide to just ignore your signal and will refuse to terminate the process.
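To make the idea of a signal handler a bit more concrete, here is a minimal sketch of one written as an sh script, using the shell's trap command. The worker function, the scratch directory, and the file names are all made up for this illustration; a real program would save its own data instead of writing a one-line log:

```shell
#!/bin/sh
# A worker that installs a handler for TERM so it can tidy up before exiting.
# The worker and its file names are hypothetical, for illustration only.
scratch=$(mktemp -d) || exit 1

worker() {
    # The "signal handler": record the clean shutdown, then exit.
    trap 'echo "cleaned up" > "$scratch/log"; exit 0' TERM
    echo ready > "$scratch/ready"
    # Sleep in short slices so the handler runs soon after the signal arrives.
    while :; do sleep 1; done
}

worker &
pid=$!

# Wait until the worker has installed its handler.
while [ ! -f "$scratch/ready" ]; do sleep 1; done

kill -TERM "$pid"      # the same signal a plain "kill PID" sends
wait "$pid"
status=$?
result=$(cat "$scratch/log")
rm -rf "$scratch"
echo "worker said: $result (exit status $status)"
```

Because the worker caught the TERM signal itself, it gets to decide how to exit; here it finishes its cleanup and leaves with an exit status of 0.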

However, some signals can't be ignored; for example, signal 9 (KILL) and signal 17 (STOP). Let's say that you wish to stop a process you've started, so you used ps and grep to find the PID of the process, used kill to send a TERM signal, then repeated your ps and grep to ensure it worked like so:

ps | grep processname

kill PID

ps | grep processname

However, the second grep still shows that PID, meaning your TERM signal was ignored by that process. Either one of these commands should fix it:

kill -9 PID

or

kill -KILL PID

If you now repeat your grep command, you should just have your prompt echoed back at you, meaning that PID was indeed terminated.

You may ask, "Why not just always send a signal 9 if it can't be ignored?" Signal 9 does indeed "kill" a process, but it doesn't give it time to gracefully save all of its work first, meaning that you may lose some data. It's better to try sending another type of terminating signal first, and save signal 9 for those processes that stubbornly refuse to terminate. Also, remember that as a regular user you will only be able to send signals to processes that are owned by you. The superuser can send a signal to any process.
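The "polite first, forceful second" approach can be sketched as a small sh function. This is only an illustration under my own assumptions: the name stop_process and its grace period are hypothetical, and the function uses signal 0, which delivers nothing and only reports whether the process can still be signalled:

```shell
#!/bin/sh
# Ask a process to stop with TERM, and use KILL only if it is still there.
# stop_process and its grace period are hypothetical names for this sketch.
stop_process() {
    target=$1
    grace=${2:-3}                                   # seconds to wait after TERM
    kill -TERM "$target" 2>/dev/null || return 1    # no such process, or not ours
    while [ "$grace" -gt 0 ]; do
        kill -0 "$target" 2>/dev/null || return 0   # gone: TERM was enough
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$target" 2>/dev/null                # KILL can't be caught or ignored
    return 0
}

# Demonstration: a child that ignores TERM, like the stubborn process above.
( trap '' TERM; while :; do sleep 1; done ) &
stubborn=$!
sleep 1                                             # let it set up the ignore
stop_process "$stubborn" 2
wait "$stubborn"
stubborn_status=$?
echo "stubborn child ended with status $stubborn_status"
```

The shell reports a process that died from a signal as 128 plus the signal number, so a status of 137 here means the child had to be removed with signal 9. One caveat: a finished child that its parent has not yet waited for can still answer signal 0, so the function is best suited to processes you found with ps rather than ones the same script started.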

There may be times when you wish to terminate all the processes you own; this has different ramifications depending on whether you are a regular user or the superuser.

Let's demonstrate as a regular user. Log in to four different terminals and do a ps command:

ps
  PID  TT  STAT      TIME COMMAND
  316  v0  Ss     0:00.39 -csh (csh)
  957  v0  R+     0:00.00 ps
  317  v1  Is+    0:00.20 -csh (csh)
  915  v2  Is     0:00.12 -csh (csh)
  941  v2  I+     0:00.09 lynx
  942  v2  Z+     0:00.00  (lynx)
  913  v3  Is     0:00.12 -csh (csh)
  946  v3  I+     0:00.01 /bin/sh /usr/X11R6/bin/startx
  951  v3  I+     0:00.04 xinit /home/genisis/.xinitrc --
  955  v3  S      0:03.00 xfce

In this example, I've logged into terminals 0-3. I ran the ps command from the console, logged into the first terminal, started lynx on the second terminal, and started an X Window session from terminal three, which resulted in a total of 10 processes owned by myself. If I use a PID of "-1" when I invoke the kill command, I will broadcast the signal I specify to all of my processes. So, let's send a TERM signal like so:

kill -TERM -1

Then check our results with the ps command:

ps
  PID  TT  STAT      TIME COMMAND
  316  v0  Ss     0:00.41 -csh (csh)
  969  v0  R+     0:00.00 ps
  317  v1  Ss+    0:00.21 -csh (csh)
  915  v2  Is+    0:00.12 -csh (csh)
  913  v3  Is+    0:00.12 -csh (csh)

Looks like six of the original PIDs are gone (including the first ps, which had simply finished), but four processes, all of them shells, ignored our TERM signal. Let's be a bit more aggressive:

kill -KILL -1
ps
  PID  TT  STAT      TIME COMMAND
  317  v1  Ss     0:00.22 -csh (csh)
  995  v1  R+     0:00.00 ps

If you scroll through your original four terminals, you'll see the login prompt at three of them. This last command killed all processes except the process you executed the kill command from, that is, all processes except the C shell you ran the kill command in.

You'll note that if you make a typo and leave off the minus sign in front of the PID, typing:

kill -TERM 1

instead of:

kill -TERM -1

you'll receive the following error message:

1: Operation not permitted 

-1 is the special PID that represents all of your processes; 1 is the PID of the process named init. Only the superuser can kill the init process, and even the superuser should only do so when he knows exactly what he is doing.
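If you'd like to find out whether you're allowed to signal a process without actually disturbing it, you can send signal 0, which performs the permission check and delivers nothing. A minimal sketch, where can_signal is a hypothetical helper name of my own:

```shell
#!/bin/sh
# Signal 0 delivers nothing; kill only reports whether a signal could be sent.
# can_signal is a hypothetical helper name for this sketch.
can_signal() {
    kill -0 "$1" 2>/dev/null
}

if can_signal $$; then own_shell=yes; else own_shell=no; fi
if can_signal 1;  then init=yes;      else init=no;      fi
echo "this shell: $own_shell, init (PID 1): $init"
```

As a regular user you should see "no" for init, for the same reason you received the "Operation not permitted" message; as the superuser you should see "yes" for both.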

Now let's see what happens if we repeat this exercise as the superuser. First, I'll run the ps command on my test computer that is running all kinds of neat stuff: Apache, MySQL, Squid, NFS, etc.

ps -acux
USER      PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
genisis  1050  0.0  0.2   428  244  v0  R+    4:08PM   0:00.00 ps
root        1  0.0  0.2   532  304  ??  ILs   5:10AM   0:00.04 init
root        2  0.0  0.0     0    0  ??  DL    5:10AM   0:00.03 pagedaemon
root        3  0.0  0.0     0    0  ??  DL    5:10AM   0:00.00 vmdaemon
root        4  0.0  0.0     0    0  ??  DL    5:10AM   0:00.04 bufdaemon
root        5  0.0  0.0     0    0  ??  DL    5:10AM   0:02.62 syncer
root       27  0.0  2.0 70780 2540  ??  ILs   5:10AM   0:00.08 mount_mfs
root       30  0.0  0.1   208   92  ??  Is    5:10AM   0:00.00 adjkerntz
root      110  0.0  0.3   536  368  ??  Ss   10:10AM   0:00.22 dhclient
root      163  0.0  0.5   904  608  ??  Ss   10:10AM   0:00.19 syslogd
daemon    166  0.0  0.4   916  556  ??  Is   10:10AM   0:00.01 portmap
root      171  0.0  0.3   504  320  ??  Is   10:10AM   0:00.00 mountd
root      173  0.0  0.1   360  172  ??  Is   10:10AM   0:00.01 nfsd
root      175  0.0  0.1   352  164  ??  I    10:10AM   0:00.00 nfsd
root      176  0.0  0.1   352  164  ??  I    10:10AM   0:00.00 nfsd
root      177  0.0  0.1   352  164  ??  I    10:10AM   0:00.00 nfsd
root      178  0.0  0.1   352  164  ??  I    10:10AM   0:00.00 nfsd
root      181  0.0  0.5 263052  576  ??  Is   10:10AM   0:00.00 rpc.statd
root      197  0.0  0.6  1028  764  ??  Is   10:10AM   0:00.02 inetd
root      199  0.0  0.6   956  700  ??  Ss   10:10AM   0:00.19 cron
root      202  0.0  1.0  1424 1216  ??  Is   10:10AM   0:00.20 sendmail
root      227  0.0  0.4   876  488  ??  Is   10:10AM   0:00.00 moused
root      261  0.0  1.4  2068 1704  ??  Ss   10:10AM   0:00.98 httpd
root      275  0.0  0.4   620  448 con- I+   10:10AM   0:00.02 sh
root      293  0.0  0.4   624  452 con- I+   10:10AM   0:00.01 sh
mysql     303  0.0  1.4 10896 1796 con- S+   10:10AM   0:00.43 mysqld
nobody    305  0.0  4.7  6580 5928 con- S+   10:10AM   0:05.42 squid
nobody    308  0.0  1.4  2092 1704  ??  I    10:10AM   0:00.00 httpd
nobody    309  0.0  1.4  2092 1704  ??  I    10:10AM   0:00.00 httpd
nobody    310  0.0  1.4  2092 1704  ??  I    10:10AM   0:00.00 httpd
nobody    311  0.0  1.4  2092 1704  ??  I    10:10AM   0:00.00 httpd
nobody    312  0.0  1.4  2092 1704  ??  I    10:10AM   0:00.00 httpd
genisis   317  0.0  0.8  1336  960  v1  Is+  10:10AM   0:00.24 csh
root      320  0.0  0.5   920  628  v4  Is+  10:10AM   0:00.02 getty
root      321  0.0  0.5   920  628  v5  Is+  10:10AM   0:00.01 getty
root      322  0.0  0.5   920  628  v6  Is+  10:10AM   0:00.01 getty
root      323  0.0  0.5   920  628  v7  Is+  10:10AM   0:00.01 getty
nobody    324  0.0  0.3   832  348  ??  Is   10:10AM   0:00.01 unlinkd
root      992  0.0  0.5   920  628  v2  Is+   3:46PM   0:00.01 getty
root      993  0.0  0.5   920  628  v3  Is+   3:46PM   0:00.01 getty
genisis   994  0.0  0.8  1336  956  v0  Ss    3:46PM   0:00.14 csh
root        0  0.0  0.0     0    0  ??  DLs   5:10AM   0:00.02 swapper

Then I'll send the KILL signal to the special PID -1 as the superuser:

su
Password:
kill -9 -1

That command was a little scarier as it even kicked me out of the C shell I executed the kill command from. Once I logged back in, I assessed the damage like so:

ps -acux
USER      PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
genisis  1070  0.0  0.2   396  244  v0  R+    4:11PM   0:00.00 ps
root        1  0.0  0.2   532  304  ??  ILs   5:10AM   0:00.05 init
root        2  0.0  0.0     0    0  ??  DL    5:10AM   0:00.03 pagedaemon
root        3  0.0  0.0     0    0  ??  DL    5:10AM   0:00.00 vmdaemon
root        4  0.0  0.0     0    0  ??  DL    5:10AM   0:00.05 bufdaemon
root        5  0.0  0.0     0    0  ??  DL    5:10AM   0:02.65 syncer
root     1059  0.0  0.5   920  628  v3  Is+   4:10PM   0:00.01 getty
root     1060  0.0  0.5   920  628  v2  Is+   4:10PM   0:00.01 getty
root     1061  0.0  0.5   920  628  v7  Is+   4:10PM   0:00.01 getty
root     1062  0.0  0.5   920  628  v6  Is+   4:10PM   0:00.01 getty
root     1063  0.0  0.5   920  628  v5  Is+   4:10PM   0:00.01 getty
genisis  1064  0.0  0.8  1336  956  v0  Ss    4:10PM   0:00.12 csh
root     1065  0.0  0.5   920  628  v4  Is+   4:10PM   0:00.01 getty
root     1066  0.0  0.5   920  628  v1  Is+   4:10PM   0:00.01 getty
root        0  0.0  0.0     0    0  ??  DLs   5:10AM   0:00.02 swapper

When the superuser sends a signal to -1, it is sent to every process except the system processes, which is why init, swapper, and the kernel daemons still show their original PIDs and start times in the listing above. If that signal happened to be the KILL signal, you would be hearing complaints from users who happened to have a file open at the time and lost their data.

This is one of the reasons only the superuser is allowed to run the reboot and halt commands. When one of these commands is issued, a TERM signal is sent to PID -1 to give all processes a chance to save their data; this is followed by a KILL signal to ensure that any remaining processes are terminated.

In next week's article, I'd like to continue a bit more on this theme and take a closer look at init and getty.

Dru Lavigne is a network and systems administrator, IT instructor, author and international speaker. She has over a decade of experience administering and teaching Netware, Microsoft, Cisco, Checkpoint, SCO, Solaris, Linux, and BSD systems. A prolific author, she pens the popular FreeBSD Basics column for O'Reilly and is author of BSD Hacks and The Best of FreeBSD Basics.

