Linux DevCenter    
 Published on Linux DevCenter (http://www.linuxdevcenter.com/)


Unix Power Tools
Telling tar Which Files to Exclude or Include

by Jerry Peek
02/03/2000

On some systems, make creates filenames starting with a comma (,) to keep track of dependencies. Various editors create backup files whose names end with a percent sign (%) or a tilde (~). I often keep the original copy of a program with the .orig extension and old versions with a .old extension.

I usually don't want these files in my backups. There may also be some binary files that I don't want to archive but don't want to delete, either.

A solution is to use the X flag to tar. [Check your tar manual page for the F and FF options, too. - JIK ] This flag specifies that the matching argument to tar is the name of a file that lists files to exclude from the archive. Here is an example:


% find project ! -type d -print | \
egrep '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$' > Exclude
% tar cvfX project.tar Exclude project

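In this example, find lists all files in the directories but does not print the directory names themselves. (If a directory name appears in the exclude list, tar also excludes every file inside that directory.) egrep then filters that list down to the pathnames that should be left out of the archive. Here, egrep is given several regular expressions to match certain files. The expression seems complex but is simple once you understand a few special characters: the vertical bar (|) separates alternative patterns, the dollar sign ($) anchors a pattern to the end of the line, a backslash before a dot (\.) matches a literal dot, and the slash (/) is not special and matches only itself.

A breakdown of the patterns, with an example of a pathname that each one matches, is given here (the example names are illustrative only):

Pattern     Matches                                    Example
/,          Names that start with a comma              project/,depend
%$          Names that end with a percent sign         project/report%
~$          Names that end with a tilde                project/notes.txt~
\.old$      Names that end with .old                   project/parse.old
SCCS        Pathnames that contain SCCS                project/SCCS/s.main.c
/core$      Files named core                           project/core
\.o$        Names that end with .o                     project/main.o
\.orig$     Names that end with .orig                  project/parse.c.orig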
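To try the filter without touching a real source tree, the sketch below feeds a handful of made-up pathnames through the same expression. All of the pathnames are hypothetical, and grep -E is used as the current spelling of egrep; treat it as an illustration rather than part of the original recipe.

```shell
#!/bin/sh
# Hypothetical pathnames, invented purely for illustration.
sample_paths() {
    printf '%s\n' \
        'project/main.c' \
        'project/main.o' \
        'project/,depend' \
        'project/notes.txt~' \
        'project/report%' \
        'project/parse.c.orig' \
        'project/parse.old' \
        'project/SCCS/s.main.c' \
        'project/core' \
        'project/score' \
        'project/Makefile'
}

# The pattern is the one from the article; grep -E stands in for egrep.
excluded=$(sample_paths | grep -E '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$')

# Print the pathnames that would end up in the Exclude file.
printf '%s\n' "$excluded"
```

Only project/main.c, project/score, and project/Makefile fall through the filter. Note that /core$ does not catch project/score, because the slash has to come immediately before core.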

Instead of specifying which files are to be excluded, you can specify which files to archive using the -I option. As with the exclude flag, specifying a directory tells tar to include (or exclude) the entire directory. You should also note that the syntax of the -I option is different from the typical tar flag. The next example archives all C files and makefiles. It uses egrep's () grouping operators to make the $ anchor character apply to all patterns inside the parentheses:


% find project -type f -print | \
egrep '(\.[ch]|[Mm]akefile)$' > Include
% tar cvf project.tar -I Include
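The include-file option is not spelled the same way everywhere. On GNU tar, -I selects a compression program, and the option that reads a list of names from a file is -T (--files-from); bsdtar accepts -T as well. The sketch below assumes one of those two tars and builds a small throwaway tree (all names hypothetical) to show the same idea with -T.

```shell
#!/bin/sh
# Build a tiny throwaway tree; every name here is hypothetical.
work=$(mktemp -d)
mkdir -p "$work/project/src"
echo 'int main(void) { return 0; }' > "$work/project/src/main.c"
echo '/* header */'                 > "$work/project/src/main.h"
echo 'all:'                         > "$work/project/Makefile"
echo 'notes'                        > "$work/project/README"

cd "$work" || exit 1

# Same find/egrep pipeline as the article, with grep -E standing in for egrep.
find project -type f -print | grep -E '(\.[ch]|[Mm]akefile)$' > Include

# -T reads the list of files to archive from the named file.
tar -c -f project.tar -T Include

# List what actually went into the archive.
listing=$(tar -t -f project.tar)
printf '%s\n' "$listing"
```

The README is left out because it matches neither alternative inside the parentheses.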

I suggest using find to create the include or exclude file. You can edit it afterward, if you wish. One caution: extra spaces at the end of any line will cause that file to be ignored.
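If the list has been edited by hand, it may be worth stripping trailing blanks before handing it to tar. A minimal sketch, assuming a POSIX sed and using a scratch file with hypothetical entries:

```shell
#!/bin/sh
work=$(mktemp -d)

# Two hypothetical entries: the first ends in two spaces, the second in a tab.
printf 'project/main.o  \nproject/old.c.orig\t\n' > "$work/Exclude"

# Remove any whitespace at the end of each line, then replace the original.
sed 's/[[:space:]]*$//' "$work/Exclude" > "$work/Exclude.clean" &&
    mv "$work/Exclude.clean" "$work/Exclude"

cat "$work/Exclude"
```

This would also damage a filename that really does end in a space, so it is a convenience for the common case, not a general fix.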

One way to debug the output of the find command is to use /dev/null as the archive file; tar's verbose listing then shows which files would be archived, without writing anything:


% tar cvfX /dev/null Exclude project
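The same dry run can be scripted. The sketch below assumes GNU tar or bsdtar, spells the options out separately instead of bundling them, and uses a hypothetical two-file tree. Some tars print the verbose listing on standard error, so both streams are captured.

```shell
#!/bin/sh
# Hypothetical tree with one file to keep and one to exclude.
work=$(mktemp -d)
mkdir -p "$work/project"
echo 'int main(void) { return 0; }' > "$work/project/main.c"
echo 'object code'                  > "$work/project/main.o"

cd "$work" || exit 1

# Build the exclude list the same way as the article, for .o files only.
find project ! -type d -print | grep -E '\.o$' > Exclude

# Write the "archive" to /dev/null; only the verbose listing is of interest.
report=$(tar -c -v -f /dev/null -X Exclude project 2>&1)
printf '%s\n' "$report"
```

If a name you meant to exclude still shows up in the report, the entry in the Exclude file does not match what tar sees.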

Including Other Directories

There are times when you want to make an archive of several directories. You may want to archive a source directory and another directory like /usr/local. The natural, but wrong, way to do this is to use the command:


% tar cvf /dev/rmt8 project /usr/local

Note

When using tar, you must never specify a directory name starting with a slash (/). This will cause problems when you restore a directory: on many versions of tar, the files can then be restored only to that same absolute location.


The proper way to handle the incorrect example above is to use the -C flag:


% tar cvf /dev/rmt8 project -C /usr local

This will archive /usr/local/... as local/....
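To check the effect of -C without a tape drive or a real /usr/local, the following sketch builds two stand-in directories under a temporary location and writes to an ordinary file. The directory layout and file names are hypothetical, and GNU tar or bsdtar is assumed.

```shell
#!/bin/sh
# Stand-ins for a home directory and for /usr; all names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/home/project" "$work/usr/local/bin"
echo 'source' > "$work/home/project/main.c"
echo 'tool'   > "$work/usr/local/bin/tool"

cd "$work/home" || exit 1

# Archive ./project, then change to $work/usr and archive local from there.
tar -c -f "$work/both.tar" project -C "$work/usr" local

# Member names are relative: none of them starts with a slash.
members=$(tar -t -f "$work/both.tar")
printf '%s\n' "$members"
```

Because every member name is relative, the archive can be unpacked under any directory you choose.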

Type Pathnames Exactly

For the above options to work when you extract files from an archive, the pathname given in the include or exclude file must exactly match the pathname on the tape.

Here's a sample run. I'm extracting from a file named appe.tar. Of course, this example applies to tapes, too:


% tar tf appe.tar
appe
code/appendix/font_styles.c
code/appendix/xmemo.c
code/appendix/xshowbitmap.c
code/appendix/zcard.c
code/appendix/zcard.icon

Next, I create an exclude file, named exclude, that contains the lines:


code/appendix/zcard.c
code/appendix/zcard.icon

Now, I run the following tar command:


% tar xvfX appe.tar exclude
x appe, 6421 bytes, 13 tape blocks
x code/appendix/font_styles.c, 3457 bytes, 7 tape blocks
x code/appendix/xmemo.c, 10920 bytes, 22 tape blocks
x code/appendix/xshowbitmap.c, 20906 bytes, 41 tape blocks
code/appendix/zcard.c excluded
code/appendix/zcard.icon excluded
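The run above can be reproduced with a scratch archive. This sketch assumes GNU tar or bsdtar and recreates only three of the files, with placeholder contents. Those newer tars treat each line of the exclude file as a pattern, so they may be more forgiving than the exact-match rule described here; writing the names exactly as tar tf prints them works in either case.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/src/code/appendix" "$work/out"

# Placeholder files named after three members of the sample archive.
for f in xmemo.c zcard.c zcard.icon; do
    echo "contents of $f" > "$work/src/code/appendix/$f"
done
(cd "$work/src" && tar -c -f "$work/appe.tar" code)

# The exclude file uses exactly the names that "tar tf" would print.
printf '%s\n' code/appendix/zcard.c code/appendix/zcard.icon > "$work/exclude"

# Extract into an empty directory, skipping the excluded members.
cd "$work/out" || exit 1
tar -x -f "$work/appe.tar" -X "$work/exclude"

extracted=$(find code -type f -print | sort)
printf '%s\n' "$extracted"
```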

Exclude the Archive File!

If you're archiving the current directory (.) instead of starting at a subdirectory, remember to start with two pathnames in the Exclude file: the archive that tar creates and the Exclude file itself. That keeps tar from trying to archive its own output!


% cat > Exclude
./somedir.tar
./Exclude
CTRL-d
% find . -type f -print | \
egrep '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$' >>Exclude
% tar cvfX somedir.tar Exclude .

In that example, we used cat > to create the file quickly; you could use a text editor instead. Notice that the pathnames in the Exclude file start with ./; that's what the tar command expects when you tell it to archive the current directory (.). The long find/egrep command line uses the >> operator to add other pathnames to the end of the Exclude file.

Or, instead of adding the archive and exclude file's pathnames to the exclude file, you can move those two files somewhere out of the directory tree that tar will read.
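A minimal sketch of that second approach, assuming GNU tar or bsdtar: the exclude list and the archive both live in a sibling directory (here a hypothetical scratch directory), so neither needs an entry of its own.

```shell
#!/bin/sh
# Hypothetical layout: somedir is the tree to archive, scratch holds the rest.
work=$(mktemp -d)
mkdir -p "$work/somedir" "$work/scratch"
echo 'source' > "$work/somedir/main.c"
echo 'object' > "$work/somedir/main.o"

cd "$work/somedir" || exit 1

# The exclude list and the archive are written outside the tree being read.
find . -type f -print | grep -E '\.o$' > ../scratch/Exclude
tar -c -f ../scratch/somedir.tar -X ../scratch/Exclude .

members=$(tar -t -f ../scratch/somedir.tar)
printf '%s\n' "$members"
```

The pathnames in the Exclude file still start with ./ because find was started at the current directory.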



 

Copyright © 2009 O'Reilly Media, Inc.