Apache DevCenter


Log Rhythms

The Access Log

Let's see what's lurking inside that log. For the purposes of this look at a typical set of logs, I'm assuming your Apache server has been configured to use Common Log Format (CLF), the default in a fresh Apache installation. Your httpd.conf file should contain the following configuration directive:



CustomLog logs/access_log common

Look at your access log, the location of which will depend upon your layout preferences and installation method. The Apache 1.3.9 RPM installation under Red Hat 6.1 places logs in an /etc/httpd/logs directory. The source and binary installs typically use /usr/local/apache/logs/access_log. The default filename under Windows is access.log.

Let's zoom in on one fairly representative line in a log:

123.45.678.90 - - [07/Mar/2000:14:27:12 -0800] "GET /mypage.html HTTP/1.1" 200 10369

123.45.678.90

The visitor's IP address. If you particularly need the visitor's host name, read the Apache documentation on the HostNameLookups directive.

- -

The first of the two dashes is a placeholder for something called ident, a less trustworthy form of client identification. That's about all I'll say on this; for further information, see Apache's IdentityCheck directive.

The second dash is a placeholder for the user name supplied by a visitor if required to log in to gain access to a password-protected section of the web site. Say, for example, I restricted access to a private directory on my server to only myself. Upon visiting http://www.memyselfandi.net/private, I'd have to log in (say, as the user "me") to gain access to that directory's contents. Thereafter, all my requests for items in that directory are logged, replacing the dash with me.

[07/Mar/2000:14:27:12 -0800]

The date and time of the request, followed by the server's time zone as an offset from GMT.

GET /mypage.html

The visitor's request: the HTTP method (GET) and the resource asked for, in this case the mypage.html document in the web server's document root.

You'll often see requests consisting only of a slash, GET /, or composed of a directory path ending in a slash, GET /some/path/. This denotes a request for the default document within the server's document root or in that directory. So, if your DirectoryIndex is index.html, every request for / returns the document root's index.html to the visitor's browser. If no DirectoryIndex document exists in the requested directory, the browser will display either a listing of the files in that directory or a "Forbidden" message, depending on whether the Indexes option is enabled for that directory with the Options directive. The IndexOptions and FancyIndexing settings affect only how such a listing is formatted.

HTTP/1.1

The browser's request protocol, in this case HTTP, version 1.1. An older, yet still very common, version is HTTP/1.0.

200

The HTTP status code returned to the visitor's browser as part of the response. 200 signifies "OK" -- request fulfilled. A common error you might have come across in your Web travels is "404 Not Found," indicating that the request does not match anything on the server. A code of "304 Not Modified" says that the content has not changed since the browser last fetched it. In other words, you've visited before and already have the latest copy in your browser's cache, so, for efficiency's sake, the content is not resent.

10369

The number of bytes returned to the visitor, excluding headers (status codes and the like). In the case of a 304 Not Modified status (see above), this value is the usual - placeholder.
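The field-by-field breakdown above maps neatly onto a regular expression. Here is a minimal Python sketch, not part of Apache itself, that splits one Common Log Format line into named fields. The names CLF_PATTERN and parse_clf_line are my own inventions for illustration, and the date parsing assumes English month abbreviations.

```python
import re
from datetime import datetime

# One named group per Common Log Format field, in the order described above.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one CLF line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # "-" placeholders become None so callers can test for missing values.
    for key in ("ident", "user"):
        if entry[key] == "-":
            entry[key] = None
    entry["status"] = int(entry["status"])
    entry["size"] = None if entry["size"] == "-" else int(entry["size"])
    # %z parses the numeric zone offset, e.g. -0800.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    # The request is "METHOD path protocol"; a malformed one may have fewer parts.
    parts = entry["request"].split()
    entry["method"], entry["path"], entry["protocol"] = (parts + [None] * 3)[:3]
    return entry

sample = ('123.45.678.90 - - [07/Mar/2000:14:27:12 -0800] '
          '"GET /mypage.html HTTP/1.1" 200 10369')
entry = parse_clf_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["size"])
# prints: 123.45.678.90 /mypage.html 200 10369
```

Returning None for lines that do not match keeps the sketch forgiving of the occasional mangled entry; a stricter tool might prefer to raise an error instead.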

Logging in Apache (version 1.2 and later) is handled by the Apache module, mod_log_config, which enables you to customize how your logs look and work. Your httpd.conf file contains some popular log formats to get you started:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

Each log format starts out with the LogFormat directive, followed by a string of tokens that describe how each line of the log file should look, and ending with a nickname given to the format. The mod_log_config documentation has a comprehensive list of tokens and their meanings. How you want your logs displayed and into how many files you want them sorted is up to you. Some site authors separate log files into referrer and agent logs. I prefer to use the "combined" log format and keep everything in one place.
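To make the token strings a little less cryptic, here is a small Python sketch that walks a LogFormat string and reports what each token means. It covers only the tokens used in the formats shown above; the names TOKEN_MEANINGS and describe_format are hypothetical helpers of my own, not anything Apache provides.

```python
import re

# Meanings of the tokens used in the formats shown above (a subset only;
# the mod_log_config documentation lists the rest).
TOKEN_MEANINGS = {
    "%h": "remote host (IP address unless hostname lookups are on)",
    "%l": "remote logname from ident, usually -",
    "%u": "authenticated user name, or -",
    "%t": "time the request was received",
    "%r": "first line of the request",
    "%>s": "final HTTP status code",
    "%b": "bytes sent, excluding headers (- if none)",
    "%U": "requested URL path",
}

# Matches plain tokens such as %h or %>s and header tokens such as %{Referer}i.
TOKEN_RE = re.compile(r"%(?:\{([^}]+)\}i|>?[a-zA-Z])")

def describe_format(format_string):
    """Return (token, meaning) pairs for each token in a LogFormat string."""
    described = []
    for match in TOKEN_RE.finditer(format_string):
        token = match.group(0)
        header = match.group(1)
        if header is not None:
            meaning = "value of the %s request header" % header
        else:
            meaning = TOKEN_MEANINGS.get(token, "not covered in this sketch")
        described.append((token, meaning))
    return described

common = r'%h %l %u %t \"%r\" %>s %b'
for token, meaning in describe_format(common):
    print(token, "=", meaning)
```

Run against the "common" format string, this prints one line per token, in the same order the fields appear in each log entry.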

Let's say I wish to use "common" log format, but also want to keep track of who is linking to my site. I could just use "combined" format, but I don't really care what type of browser (agent) my visitor is using. Instead, I'll create a new LogFormat directive like so:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\"" commonish

Now that I've defined my preferred log format, I need to tell Apache to use this format. Using my "commonish" log format above:

CustomLog logs/commonish_log commonish

where logs/commonish_log is the path to my log file relative to my ServerRoot. You can actually skip the LogFormat directive and include your preferred log format string in place of the nickname in your CustomLog directive -- it's up to you.
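Once the "commonish" log is filling up, finding out who links to the site is a matter of tallying the last field. The following Python sketch does that under a couple of assumptions: the log lines follow the "commonish" format exactly, and no field contains an embedded double quote. The names COMMONISH_PATTERN and count_referers, and the sample lines themselves, are invented for illustration.

```python
import re
from collections import Counter

# Common Log Format fields followed by one quoted Referer field,
# matching the "commonish" LogFormat defined above.
COMMONISH_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "(?P<referer>[^"]*)"'
)

def count_referers(lines):
    """Tally Referer values, skipping the "-" logged when none was sent."""
    counts = Counter()
    for line in lines:
        match = COMMONISH_PATTERN.match(line)
        if match and match.group("referer") != "-":
            counts[match.group("referer")] += 1
    return counts

# Hypothetical log lines, invented for illustration only.
sample_lines = [
    '123.45.678.90 - - [07/Mar/2000:14:27:12 -0800] '
    '"GET /mypage.html HTTP/1.1" 200 10369 "http://www.example.com/links.html"',
    '123.45.678.90 - - [07/Mar/2000:14:27:15 -0800] '
    '"GET / HTTP/1.1" 200 2048 "-"',
    '123.45.678.90 - - [07/Mar/2000:14:28:01 -0800] '
    '"GET /mypage.html HTTP/1.0" 304 - "http://www.example.com/links.html"',
]
for referer, hits in count_referers(sample_lines).most_common():
    print(hits, referer)
# prints: 2 http://www.example.com/links.html
```

Because count_referers accepts any iterable of lines, an open file object for logs/commonish_log could be passed in place of the sample list.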

We've only just scratched the surface of log customization. For much more, be sure to read the detailed mod_log_config documentation.
