LinuxDevCenter.com

Unix Power Tools
Hacking on Characters with tr

by Tim O'Reilly and Jerry Peek
01/27/2000

The tr command is a character translation filter, reading standard input and either deleting specific characters or substituting one character for another.

The most common use of tr is to change each character in one string to the corresponding character in a second string. (A string of consecutive ASCII characters can be represented as a hyphen-separated range.)

For example, the command:

     
$ tr 'A-Z' 'a-z' <file

will convert all uppercase characters in file to the equivalent lowercase characters. The result is printed on standard output.
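As a small illustration of the command above, the sketch below wraps it in a shell function so it can be used as a filter in a pipeline. The name to_lower is hypothetical and used only for this example.

```shell
# to_lower: hypothetical helper that wraps the tr command from the text.
# It reads standard input and writes the lowercased text to standard output.
to_lower() {
    tr 'A-Z' 'a-z'
}

printf 'Hello, WORLD\n' | to_lower
# prints: hello, world
```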

In the System V version of tr, square brackets must surround any range of characters. That is, you have to say: [a-z] instead of simply a-z . And of course, because square brackets are meaningful to the shell, you must protect them from interpretation by putting the string in quotes.

If you aren't sure which version you have, here's a test. The Berkeley version converts the input [] to A characters because it treats [ and ] as ordinary characters rather than range operators; the System V version leaves them unchanged:

% echo '[]' | tr '[a-z]' A
AA                                  Berkeley version
% echo '[]' | tr '[a-z]' A 
[]                                  System V version
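The same test can be scripted. This is only a sketch: the function name tr_flavor and the labels it prints are made up for the example, and it simply reports which of the two behaviors described above the local tr shows.

```shell
# tr_flavor: hypothetical helper that runs the bracket test from the text
# and reports how the local tr treated the [ and ] characters.
tr_flavor() {
    case "$(echo '[]' | tr '[a-z]' A)" in
        AA)   echo "Berkeley-style" ;;   # brackets were translated to A
        '[]') echo "System V-style" ;;   # brackets were left alone
        *)    echo "unknown" ;;          # some other behavior
    esac
}

tr_flavor
```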

There's one place you don't have to worry about the difference between the two versions: when you're converting one range to another range, and both ranges have the same number of characters. For example, this command works in both versions:

$ tr '[A-Z]' '[a-z]' < file

The Berkeley tr will convert a [ from the first string into the same character [ in the second string, and the same for the ] characters. The System V version uses the [] characters as range operators. In both versions, you get what you want: the range A-Z is converted to the corresponding range a-z. Again, this trick works only when both ranges have the same number of characters.
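To see the bracketed form at work, the sketch below feeds it a line that itself contains square brackets. The function name to_lower_portable is hypothetical. Under either interpretation described above, the brackets in the input should come through unchanged.

```shell
# to_lower_portable: hypothetical helper using the bracketed ranges,
# which the text says behave the same in both versions of tr.
to_lower_portable() {
    tr '[A-Z]' '[a-z]'
}

printf 'Mixed CASE [Text]\n' | to_lower_portable
# prints: mixed case [text]
```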

The System V version also has a nice feature: the syntax [a*n], where n is some digit, means that the string should consist of n repetitions of character "a." If n isn't specified, or is 0, it is taken to be some indefinitely large number. This is useful if you don't know how many characters might be included in the first string.
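Here is a minimal sketch of the repetition syntax, assuming a tr that supports it (the POSIX specification describes it, and GNU tr accepts it in the second string). The function name mask_digits and the choice of # as the fill character are illustrative only.

```shell
# mask_digits: hypothetical helper. The second string '[#*]' repeats the
# character # as many times as needed to match the ten digits in '0-9',
# so every digit is translated to #.
mask_digits() {
    tr '0-9' '[#*]'
}

printf 'card 1234 5678\n' | mask_digits
# prints: card #### ####
```

Without the repetition, the second string would have to spell out ten # characters by hand.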

Case translation like the A-Z to a-z conversion shown earlier (and the reverse) can be useful from within vi for translating a string.

You can also delete specific characters. The -d option deletes from the input each occurrence of one or more characters specified in a string (special characters should be placed within quotation marks to protect them from the shell). For instance, the following command passes to standard output the contents of file with all punctuation deleted (and is a great exercise in shell quoting):

$ tr -d ",.!?;:'"'"`' < file
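The quoting is easier to follow when broken into its two pieces: a double-quoted part that holds ,.!?;:' and a single-quoted part that holds the double quote and the backquote. The sketch below uses the same string inside a hypothetical function named strip_punct. It assumes a script or other non-interactive shell; in an interactive shell with history substitution enabled, the ! inside double quotes may need extra protection.

```shell
# strip_punct: hypothetical helper that deletes the punctuation characters
# listed in the text. The argument is two quoted strings placed side by side:
#   ",.!?;:'"   double quotes protect the apostrophe
#   '"`'        single quotes protect the double quote and the backquote
strip_punct() {
    tr -d ",.!?;:'"'"`'
}

printf '%s\n' 'Stop! Who goes there? "A friend," he said; then: silence.' | strip_punct
# prints: Stop Who goes there A friend he said then silence
```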

The -s (squeeze) option of tr reduces each run of consecutive occurrences of a character listed in the second argument to a single occurrence of that character. For example, the command:

$ tr -s " " " " <file

will print on standard output a copy of file in which multiple spaces in sequence have been replaced with a single space.
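A quick way to check the squeeze behavior is to pipe in a line with runs of spaces. The function name squeeze_spaces below is hypothetical.

```shell
# squeeze_spaces: hypothetical helper wrapping the -s command from the text.
# Each space is translated to a space, then runs of spaces are squeezed to one.
squeeze_spaces() {
    tr -s " " " "
}

printf 'too    many   spaces\n' | squeeze_spaces
# prints: too many spaces
```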

We've also found tr useful when converting documents created on other systems for use under UNIX. For example, tr can be used to change the carriage return at the end of each line in a Macintosh text file into the newline that UNIX expects. tr allows you to specify characters as octal values by preceding the value with a backslash, so the command:

$ tr '\015' '\012' < file.mac > file.unix

does the trick.
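The conversion can be tried without a real Macintosh file by generating carriage-return-terminated lines with printf. The function name mac_to_unix is hypothetical.

```shell
# mac_to_unix: hypothetical helper that translates each carriage return
# (octal 015) into a newline (octal 012).
mac_to_unix() {
    tr '\015' '\012'
}

# Two lines, each ended by a carriage return instead of a newline.
printf 'first line\rsecond line\r' | mac_to_unix
# prints:
# first line
# second line
```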

The command:

$ tr -d '\015' < pc.file

will remove the carriage return from the carriage return/newline pair that a PC file uses as a line terminator.
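As a sketch of the same idea, the hypothetical function dos_to_unix below deletes the carriage returns from sample input built with printf, leaving only the newlines.

```shell
# dos_to_unix: hypothetical helper that deletes every carriage return
# (octal 015), so each CR/LF pair becomes a plain newline.
dos_to_unix() {
    tr -d '\015'
}

# Two lines terminated PC-style with carriage return plus newline.
printf 'first line\r\nsecond line\r\n' | dos_to_unix
# prints:
# first line
# second line
```

Note that this deletes every carriage return in the input, not only those that come right before a newline.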

O'Reilly Media
© 2008, O'Reilly Media, Inc.