 Published on ONLamp.com (http://www.onlamp.com/)


Big Scary Daemons

Rotating Log Files

06/14/2001


Log files grow. That's what they're there for, after all. As a systems administrator, you need to be able to control log growth. FreeBSD provides a basic log file handler, newsyslog.

The newsyslog program handles standard log file rotation. The oldest logs are deleted. Each old log is renamed. Finally, the current log is moved and a new log file is created. The newsyslog program can also compress files, restart daemons, and in general handle all the routine tasks of shuffling files.

The cron daemon fires up newsyslog once an hour. Newsyslog reads the /etc/newsyslog.conf file and checks each log file listed there. If the conditions listed for rotating a log file are met, that log is rotated.

The /etc/newsyslog.conf file uses one line per logfile. The first entry on each line is the logfile name. This is a full path, such as /var/log/httpd-error.log.

The second entry is optional, and actually doesn't appear in the default /etc/newsyslog.conf. It lists the owner and group of the file, separated by a colon like this: root:wheel. Newsyslog can change the owner and group of old log files. By default, log files are owned by "root" and in the group "wheel". While changing the owner isn't commonly done, you might have use for this on multiuser machines.

You can choose to only change the owner, or only change the group. In this case, you must use a colon, even though nothing appears on the other side of it. For example, :www will change the group to www, while user827: will change the owner to user827.
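The colon rule is easy to see in a few lines of code. This is only a sketch of how such a field could be split, not how newsyslog itself does it; the function name parse_owner_group is made up for illustration.

```python
def parse_owner_group(field):
    """Split an 'owner:group' field; an empty side means 'leave it alone'."""
    if ":" not in field:
        raise ValueError("owner:group field must contain a colon")
    owner, group = field.split(":", 1)
    # An empty string on either side of the colon becomes None.
    return (owner or None, group or None)


print(parse_owner_group("root:wheel"))
print(parse_owner_group(":www"))
print(parse_owner_group("user827:"))
```

A None on either side stands for "don't change this half," which is why the colon has to be there even when one side is empty.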

The third field is the mode, in standard Unix three-digit notation.

Then we have a "count" field. This is the number of old log files that newsyslog keeps -- kind of. newsyslog starts counting archived log files at 0. Many computer systems start numbering at zero, but newsyslog includes 0 and goes up to the count number. With the default count setting of 5 for /var/log/messages, /var/log includes the following files:

messages
messages.0.gz
messages.1.gz
messages.2.gz
messages.3.gz
messages.4.gz
messages.5.gz

Those of you who can count will recognize that this is six backups, not five, plus the current log file! As a rule, though, it's better to have too many files than not enough. Still, if you're tight on disk space, deleting an extra log file or two might buy you some time. Similarly, some web servers host hundreds of sites on a single machine; one or two extra files times a hundred sites can eat a lot of disk space.
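To see where the extra file comes from, here is a rough Python sketch of the rotation steps: delete the oldest archive, rename the rest, move the current log, and create a new one. It is an illustration under simple assumptions (no compression, no ownership or mode changes, no "turned over" message), and the rotate function is hypothetical, not newsyslog's own code.

```python
import os


def rotate(log_path, count):
    """Rotate log_path, keeping archives numbered .0 through .count."""
    # Delete the oldest archive, if there is one.
    oldest = f"{log_path}.{count}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Rename each remaining archive up by one, highest number first.
    for n in range(count - 1, -1, -1):
        src = f"{log_path}.{n}"
        if os.path.exists(src):
            os.rename(src, f"{log_path}.{n + 1}")
    # Move the current log to .0 and create a fresh, empty log.
    os.rename(log_path, f"{log_path}.0")
    open(log_path, "w").close()
```

Run this enough times with a count of 5 and the directory settles at the current log plus archives .0 through .5, which is the six-backup layout listed above.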

Newsyslog uses the next two fields, size and time, to determine if it should rotate a log on this run. You can rotate logs at a given time, or when they reach a certain size, or both. If you use both, the log will rotate whenever either condition is met.

If either the size or time isn't important (for example, you want to rotate every day, no matter how large the file gets), use an asterisk.

The fifth field is for file size. When newsyslog runs, it compares the size listed here to the size of the file. If the file is larger than the given size in kilobytes, it is rotated.

So far, it's easy, right?

The sixth field, time, is the one that makes new administrators cry. The time field has four possible types of value: an asterisk, a number, and two different date formats.

If you don't want to rotate a log at a particular time, put an asterisk (*) here.

If you use a plain naked number, newsyslog will rotate the log after that many hours have passed. For example, if you want a log to rotate every 24 hours, but don't care exactly when this rotation happens, use "24" here.
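The "either condition" rule for size and a plain-number time can be sketched as below. The function and its parameter names are invented for this example, and it only understands an asterisk or a plain number in each field; the "larger than" size comparison simply follows the description above.

```python
def should_rotate(size_field, time_field, file_size_kb, hours_since_rotation):
    """Return True if either the size or the hourly-interval condition is met."""
    # An asterisk means "this condition doesn't matter."
    size_hit = size_field != "*" and file_size_kb > int(size_field)
    time_hit = time_field != "*" and hours_since_rotation >= int(time_field)
    return size_hit or time_hit


# A 150 KB log with a 100 KB limit rotates, no matter how recently it was rotated.
print(should_rotate("100", "*", 150, 1))
# A small log with a 24-hour interval rotates once 24 hours have passed.
print(should_rotate("*", "24", 5, 30))
```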

Any time beginning with an "@" is in ISO-8601-restricted time format. This is a standard used by newsyslog on most Unix systems, and was the time format originally used in MIT's primordial newsyslog program. It's not at all clear at first sight. Since it's a standard, FreeBSD supports it.

A full date in ISO 8601 format is 14 digits with a "T" in the middle. The first four are the year; the next two are the month; the next two are the day of the month. The letter T is inserted after the date as a sort of "decimal point," separating whole days from fractions of one. The next two digits are hours; the next are minutes; the next are seconds.

For example, the date and time "February 2, 2002, 9:15 and 8 seconds p.m.," is expressed in ISO 8601 as:

20020202T211508

You must have a T in an ISO 8601 date.

Specifying complete dates in ISO 8601 is straightforward and obvious. The confusion comes in when you don't list the whole date. You can choose to specify only fields near the T, leaving fields further away blank. Any fields you don't fill in are wildcards, and match anything.

For example, T23 matches every day of the year, and the 23rd hour of the day. If you use a newsyslog time of @T23, that log will rotate every day at 11:00 p.m. Write each field with two digits: 04T00 matches midnight of the fourth day of every month, so @04T00 will make the log rotate at that date and time.

Much like when you're working with crontab, you need to specify hours. A date like @07T will run once an hour, every hour, on the seventh of the month. After all, it matches all day long! This can be useful for debugging, but isn't generally useful.
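If the "fields near the T" idea is still fuzzy, this sketch pulls apart a restricted ISO 8601 time and reports which fields were actually given. It is a simplified illustration, not newsyslog's parser: the function name is made up, and it assumes every field is written with two digits (four for the year).

```python
def parse_restricted_iso(spec):
    """Return the fields present in a spec such as '@T23' or '@04T00'."""
    if not spec.startswith("@"):
        raise ValueError("expected a leading '@'")
    date_part, sep, time_part = spec[1:].partition("T")
    if sep != "T":
        raise ValueError("expected a 'T' separator")
    if len(date_part) not in (0, 2, 4, 8) or len(time_part) not in (0, 2, 4, 6):
        raise ValueError("unsupported field widths")
    digits = date_part + time_part
    if digits and not digits.isdecimal():
        raise ValueError("fields must be numeric")
    fields = {}
    # Date fields are read right to left, starting at the T.
    if len(date_part) >= 2:
        fields["day"] = int(date_part[-2:])
    if len(date_part) >= 4:
        fields["month"] = int(date_part[-4:-2])
    if len(date_part) == 8:
        fields["year"] = int(date_part[:4])
    # Time fields are read left to right, starting at the T.
    for name, start in (("hour", 0), ("minute", 2), ("second", 4)):
        if len(time_part) >= start + 2:
            fields[name] = int(time_part[start:start + 2])
    return fields
```

Anything missing from the returned dictionary is a field you left blank in the time specification.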

One problem with this system is that it doesn't allow you to easily designate weekly jobs. It's not at all uncommon to want to rotate a log on Mondays, for example. And specifying the last day of the month is impossible. That's where the final time format comes in.

Any time with a leading cash sign ($) is in the FreeBSD-specific month-week-day format. This works much like cron, allowing you to set particular days of the week to run a job on.

This format uses three identifiers: M (day of month), W (day of week), and D (hour of day). Each is followed by a number indicating the particular time it should be run. Hours range from 0 to 23, while weekdays range from 0 (Sunday) to 6 (Saturday). M starts with 1, and goes up to the number of days in that particular month.

For example, if you want a log to rotate every Sunday at 8 a.m., you could use a time of $W0D8. If you wanted the log to rotate on the fifth of the month at noon, you could use $M5D12.

One interesting feature of this system is that you can automatically schedule a job for the last day of the month by using the special day of the month "L". Without this, it's very difficult to do an end-of-month job without writing a script that includes how many days are in each month. If you wanted to start your month-end log file accounting two hours before the end of the month, you could use a time of $MLD22.
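Here is a small sketch of how the month-week-day format could be taken apart, including the "L" shortcut. It assumes the hour identifier is D, as documented in FreeBSD's newsyslog.conf manual page; the function names parse_dwm and resolve_day are invented for this example.

```python
import calendar
import re

# Optional W<0-6> or M<day|L>, then an optional D<hour>.
_DWM = re.compile(
    r"^\$(?:W(?P<wday>[0-6])|M(?P<mday>[0-9]{1,2}|[Ll]))?(?:D(?P<hour>[0-9]{1,2}))?$"
)


def parse_dwm(spec):
    """Return the fields present in a spec such as '$W0D8' or '$MLD22'."""
    m = _DWM.match(spec)
    if m is None or spec == "$":
        raise ValueError(f"not a month-week-day time: {spec!r}")
    fields = {}
    if m.group("wday") is not None:
        fields["weekday"] = int(m.group("wday"))  # 0 is Sunday
    mday = m.group("mday")
    if mday is not None:
        if mday in ("L", "l"):
            fields["day"] = "last"
        elif 1 <= int(mday) <= 31:
            fields["day"] = int(mday)
        else:
            raise ValueError("day of month out of range")
    hour = m.group("hour")
    if hour is not None:
        if int(hour) > 23:
            raise ValueError("hour out of range")
        fields["hour"] = int(hour)
    return fields


def resolve_day(day, year, month):
    """Turn 'last' into the real last day of the given month."""
    return calendar.monthrange(year, month)[1] if day == "last" else day
```

The resolve_day helper is the part you would otherwise have to script yourself: it looks up how many days the month has, leap years included.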

Once you've figured out exactly how to express the time you want your log to rotate, there's a flags field. This is optional for many logs, but vital for others. Newsyslog inserts a "logfile turned over" message into new log files it creates. If a log file is binary (such as /var/log/wtmp), adding this message would really screw up the file. The "B" flag tells newsyslog to not write this message.

Many log files are plain ASCII text. Compressing them saves a huge amount of room. The Z flag indicates that the old log file should be compressed with gzip.

You should only use one of these flags.

The next field is the "pidfile" path. A pidfile is a simple method of recording a program's process ID so other programs can see it. Most programs store their pidfiles under /var/run -- take a look and see what's on your system. If you list the full path to a pidfile here, newsyslog will send a signal to that program when it rotates the log. For example, the Apache web server needs to be notified when you rotate its logs. By listing its pidfile here, you can have newsyslog send a kill -1 to Apache so it will handle its part of log file rotation.

Most programs will handle log file rotation on a kill -1, or SIGHUP. Some programs need a specific signal when a log file is rotated. If you have one of these programs, you can list the exact signal number required in the last field.
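The pidfile mechanism itself is simple enough to sketch. The function below is a hypothetical illustration of the idea, not newsyslog's code: read the process ID from the file, then send it a signal, defaulting to SIGHUP.

```python
import os
import signal


def signal_from_pidfile(pidfile, signum=signal.SIGHUP):
    """Read a process ID from pidfile and send that process a signal."""
    with open(pidfile) as fh:
        # A pidfile normally holds the process ID on its first line.
        pid = int(fh.readline().strip())
    os.kill(pid, signum)
    return pid
```

Calling it with only a pidfile path does the equivalent of a kill -1; passing a second argument of 2 would send SIGINT instead.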

Let's slap this all together in a worst-case, "you-have-got-to-be-kidding" example. Assume you have a database log file that you want to rotate at 11:00 p.m. on the last day of every month. The database documentation says that you need to send the program an interrupt signal (SIGINT, or signal number 2) upon rotation. You want the archived logs to be owned by the user "dbadmin", and only viewable by that user. You need a month of logs. What's more, the logs are binary files, and need to be untouched by newsyslog. Your newsyslog.conf line would look like this:

/var/log/database     dbadmin:  600  30     *    $MLD23 B /var/run/db.pid 2

This is an extreme example; in most cases, you just slap in the file name and rotation condition and you're done.
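To check your own reading of a line, it can help to split it into named fields. The sketch below does that under a few assumptions: fields are separated by whitespace, the optional owner:group field is recognized by its colon, and a pidfile is recognized by its leading slash. The parse_entry function is made up for this example and ignores comments and blank lines.

```python
def parse_entry(line):
    """Split one newsyslog.conf-style line into named fields."""
    parts = line.split()
    entry = {"logfile": parts.pop(0)}
    # The owner:group field is optional; spot it by its colon.
    if ":" in parts[0]:
        entry["owner_group"] = parts.pop(0)
    entry["mode"], entry["count"], entry["size"], entry["when"] = parts[:4]
    rest = parts[4:]
    # Flags, pidfile, and signal number are all optional.
    if rest and not rest[0].startswith("/"):
        entry["flags"] = rest.pop(0)
    if rest:
        entry["pidfile"] = rest.pop(0)
    if rest:
        entry["signal"] = int(rest.pop(0))
    return entry


print(parse_entry("/var/log/messages 644 5 100 * Z"))
```

Fed the database line above, the same function also fills in the owner_group, pidfile, and signal keys.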

Michael W. Lucas



Copyright © 2009 O'Reilly Media, Inc.