BSD DevCenter

FreeBSD Basics

Find: Part Two

03/14/2002

In the last article, I introduced the Unix find command. This week, I'd like to continue by demonstrating some more of the switches that are available with this handy command.

Let's continue where we left off, with this example:

find . \( -atime +7 -o -size +`expr 10 \* 1024 \* 2` \) -print

As a recap, this command looks for any files in the current directory and its subdirectories (represented by .) that have not been accessed for more than 7 days (-atime +7) or (-o) that are larger than a certain size (-size +). The escaped parentheses group the two tests so that -print applies to a match on either one; without them, -print binds only to the -size test, and files that match only -atime are found but never printed. I used the expr command to calculate the size for me. Since I was aiming for 10MB and find measures size in 512-byte blocks, I needed to calculate 10 times 1024 times 2 (as 2 times 512 is 1024).
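
One detail worth checking on a throwaway directory is how -o interacts with a trailing -print: -o binds more loosely than the implicit "and" between tests, so -print attaches only to the last alternative unless the tests are grouped with \( and \). The sketch below is only an illustration; the file names, the 2-block size threshold, and the use of mktemp, dd, and touch -a to stage the files are assumptions of mine, not part of the example above.

```shell
# Stage three files in a scratch directory (all names are hypothetical).
d=$(mktemp -d)
dd if=/dev/zero of="$d/big.bin" bs=1000 count=2 2>/dev/null   # 2000 bytes = 4 blocks
echo x > "$d/old.txt"
echo x > "$d/new.txt"
touch -a -t 200001010000 "$d/old.txt"    # pretend it was last read in 2000

# Ungrouped: -print belongs only to the -size branch.
ungrouped=$(cd "$d" && find . -type f -atime +7 -o -type f -size +2 -print | sort)

# Grouped: -print applies when either test matches.
grouped=$(cd "$d" && find . -type f \( -atime +7 -o -size +2 \) -print | sort)

echo "ungrouped:"; echo "$ungrouped"
echo "grouped:";   echo "$grouped"
rm -rf "$d"
```

The ungrouped form reports only big.bin, while the grouped form reports both big.bin and old.txt.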

Notice that I used the ` or "backquote" (usually the key at the top left of a PC keyboard, below Esc). In a Unix shell, whenever you want the output of one command used as part of another command line, put the command that produces the output between backquotes; this is known as command substitution. The shell runs the backquoted command first and replaces it with its output, so the result of my calculation was passed to the -size switch and used by the find command.

The last thing I want you to notice is that I also had to escape the two * characters in the command with the \ character. To expr, * means multiply; to the shell, however, it is a wildcard. Placing a \ before each * stops the shell from expanding it, so expr receives a literal * and knows that I want it to perform a multiplication.
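
The substitution and the escaping can be tried on their own, without find. This is a small sketch; the variable names are made up for illustration, and the $(...) and $((...)) spellings are alternatives offered by POSIX shells rather than anything used in the example above.

```shell
# Backquotes run expr first and paste its output into the assignment.
blocks=`expr 10 \* 1024 \* 2`
echo "$blocks"

# The same substitution with the $(...) spelling; single quotes protect the * here.
blocks_alt=$(expr 10 '*' 1024 '*' 2)

# POSIX shells can also do the arithmetic themselves, with no escaping needed.
blocks_sh=$((10 * 1024 * 2))
echo "$blocks_alt $blocks_sh"
```

All three variables end up holding 20480, the number of 512-byte blocks in 10MB.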

Let's try some more examples. Let's say I have a large directory structure and I wish to search for a certain pattern and remove all of the files that match this pattern. There are several ways to do this with the find command, so let's compare some of these methods.

In my home directory, I have a directory called tmp that contains a subdirectory named tst. This tst directory has a lot of files and subdirectories, and some of these files end with a .old extension. Let's start by seeing just how many files live in my tst directory:

cd ~/tmp/tst
find . -print | wc -l
   269

Notice that when the find command ran, it printed each file found on a separate line. I could then pipe that result to the word count (wc) command using the switch that counted the lines (-l). This told me that I have 269 files (including directories, since to Unix, directories are really files) in my tst directory.
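
If you want the files and the directories counted separately, the -type test can split them. The sketch below builds its own scratch tree, so the names and counts are hypothetical and have nothing to do with my tst directory; -type is also a switch this article has not covered.

```shell
# Build a tiny tree: 3 directories (including the top one) and 3 files.
d=$(mktemp -d)
mkdir -p "$d/sub/deeper"
touch "$d/a.txt" "$d/sub/b.old" "$d/sub/deeper/c.old"

# tr strips the padding that some wc implementations put before the number.
all=$(find "$d" -print | wc -l | tr -d ' ')
files=$(find "$d" -type f -print | wc -l | tr -d ' ')
dirs=$(find "$d" -type d -print | wc -l | tr -d ' ')

echo "$all entries: $files files + $dirs directories"
rm -rf "$d"
```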

Let's see how many of these files have a .old extension:

find . -name "*.old" -print | wc -l
   67

Now, how can I go about removing these *.old files? One way is to use the -exec switch and have it call the rm command like so:


find . -name "*.old" -exec rm {} \;

Once that is finished, I can repeat this command to see if there are any remaining *.old files:

find . -name "*.old" -print | wc -l
   0

This command works, but it may not always be the best way to remove a large number of files. When -exec is terminated with \;, find starts a separate rm process for every file it finds. This may not be an issue if you are only finding a small number of files on your home computer. It may be an issue if you are finding hundreds or thousands of files on a production system. Regardless, this method consumes more resources and is slower than other methods.
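
The one-process-per-file behavior is easy to observe by swapping rm for echo and counting the lines that come back. This sketch uses a scratch directory with hypothetical file names. It also shows the POSIX {} + terminator, which is not used in this article and which your version of find may or may not support; treat that part as an assumption to check against your man page.

```shell
d=$(mktemp -d)
touch "$d/1.old" "$d/2.old" "$d/3.old" "$d/4.old" "$d/5.old"

# \; runs echo once per file, so five lines come back.
per_file=$(find "$d" -name "*.old" -exec echo {} \; | wc -l | tr -d ' ')

# + hands echo as many names as fit on one command line, so one line comes back here.
batched=$(find "$d" -name "*.old" -exec echo {} + | wc -l | tr -d ' ')

echo "per-file runs: $per_file, batched runs: $batched"
rm -rf "$d"
```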

Let's look at a second way to delete these files, this time using xargs:


find . -name "*.old" -print | xargs rm

You'll note that I didn't have to include the \; string at the end of this command, as that string is used to terminate commands that are passed to -exec. By using xargs in this command, I will still remove all of the files that end in .old, but instead of creating a separate process for each file that is found, xargs starts rm as few times as it can, usually just once. As find finds each file, it prints the name on its own line. This list is piped to xargs, which gathers the names onto one command line, separated by spaces, and passes that argument list to the rm command. Because xargs splits its input on whitespace, a filename that contains a space will be misread as more than one file.
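
Filenames that contain spaces are the usual trap with this pipeline, since plain xargs splits its input on spaces as well as newlines. A common workaround, sketched here on a scratch directory with made-up file names, is to separate the names with NUL bytes instead. The -print0 and -0 switches are not part of the command above; they are extensions that FreeBSD and GNU both provide, so check your own man pages before relying on them.

```shell
d=$(mktemp -d)
touch "$d/plain.old" "$d/has space.old" "$d/keep.txt"

# NUL-separated names reach rm intact, spaces and all.
find "$d" -name "*.old" -print0 | xargs -0 rm

left_old=$(find "$d" -name "*.old" -print | wc -l | tr -d ' ')
kept=$(find "$d" -type f -print | wc -l | tr -d ' ')
echo "old files left: $left_old, files kept: $kept"
rm -rf "$d"
```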

There is actually a third way to remove these files, using the -delete switch with find:

find . -name "*.old" -delete

This command has the simplest syntax and is also the most efficient way of removing files. The -delete switch doesn't even need to start a separate process: all of the files are removed by the find process itself. It is also unaffected by the limit on the size of an argument list, because no argument list is ever built. xargs has to respect that limit, so if you are searching a deep directory structure or have very long filenames, it splits the list and runs rm more than once. If you are curious as to the actual limit, there is a sysctl value that has been set for you:

sysctl kern.argmax
kern.argmax: 65536

The 65536 represents the maximum number of bytes in the argument list, counted together with the environment, that can be passed to a new program.
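
On systems without the kern.argmax sysctl, getconf ARG_MAX is the POSIX way to ask for the same kind of limit; I am assuming here that it is installed, and the number it reports will differ from system to system. The second half of this sketch uses the -n switch to force xargs into tiny batches, purely so the batching it does on long lists becomes visible with five short arguments.

```shell
# Ask the system for its argument-list limit, in bytes.
limit=$(getconf ARG_MAX)
echo "argument list limit: $limit bytes"

# -n 2 caps each echo at two arguments, so five inputs need three runs.
batches=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo | wc -l | tr -d ' ')
echo "echo was run $batches times"
```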

Before moving on to some other switches, I should mention that you may want to verify which files find will find before removing them. In my examples, I was just removing old files in one of my test directories. If you are concerned that find may find some files you don't want deleted, run your command like this first:

find . -name "*.old" -print

This will give you a list of all the matching files. If the list looks good, use the -delete switch to remove the files as in the example mentioned above.
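
The look-first, delete-second routine can be rehearsed safely on a scratch directory before you try it on real data. Everything in this sketch (the directory from mktemp, the file names, the variable names) is hypothetical.

```shell
d=$(mktemp -d)
mkdir "$d/sub"
touch "$d/a.old" "$d/sub/b.old" "$d/sub/keep.txt"
expected="$d/sub/keep.txt"

# Step 1: look before deleting.
preview=$(find "$d" -name "*.old" -print | sort)
echo "$preview"

# Step 2: the same tests, with -delete in place of -print.
find "$d" -name "*.old" -delete

# Only keep.txt should be left.
remaining=$(find "$d" -type f -print)
echo "remaining: $remaining"
rm -rf "$d"
```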

Or, you can do the above in just one find command by using -ok like so:

find . -name "*.old" -ok rm {} \;

The -ok switch prompts for confirmation before executing the command that follows it, once for each file found. You'll note that I do have to use the rm command; I can't use the -delete switch. And, as with -exec, I have to use the {} \; syntax in order for -ok to work.
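
To watch -ok honor both a yes and a no without typing anything, the answers can be fed in on standard input. This is a sketch that assumes your find reads its answers from standard input and writes its prompts to standard error, which is what POSIX describes; the file names are made up, and which of the two files is asked about first is up to find.

```shell
d=$(mktemp -d)
touch "$d/one.old" "$d/two.old"

# Answer y to the first prompt and n to the second; the prompts themselves are discarded.
printf 'y\nn\n' | find "$d" -name "*.old" -ok rm {} \; 2>/dev/null

left=$(find "$d" -name "*.old" -print | wc -l | tr -d ' ')
echo "files left after one yes and one no: $left"
rm -rf "$d"
```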
