Profiling LAMP Applications with Apache's Blackbox Logs

What to Log

The directives listed below are most compatible with Apache 2.0 with the mod_logio module. Later in the article, I will discuss cross-compatibility with Apache 1.3 and other server environments.

Source IP, Time, and Request Line

%a, %t, and %r

These directives are already used in the common log file format. They are the three most obvious request metrics to track.

When logging the remote host, it is important to log the client IP address, not the hostname. To do this, use the %a directive instead of %h. Even if HostnameLookups is turned on, %a will only record the IP address. For the purposes of the Blackbox format, reverse DNS should not be trusted.

The %t directive records the time that the request started. It could be modified using a strftime format, but it would be better to keep it as is. That makes it easier to correlate lines between the Blackbox log file and the common log file.

The %r directive is the first line of text sent by the web client, which includes the request method, the full URL, and the HTTP protocol. It is possible to break up this data using individual directives. For example, you could log a URL without a query string. Again, it's better to keep the request line intact for comparison.
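
If you keep %t and %r intact in the log, you can still break them apart at analysis time. The following is a minimal Python sketch, assuming the default %t layout (for example, [10/Oct/2000:13:55:36 -0700]) and an English locale for the month abbreviation. The function names are my own, not part of Apache.

```python
from datetime import datetime
from urllib.parse import urlsplit


def parse_time(field):
    """Parse a default-format %t value such as "[10/Oct/2000:13:55:36 -0700]"."""
    return datetime.strptime(field, "[%d/%b/%Y:%H:%M:%S %z]")


def split_request_line(line):
    """Split a %r value into (method, path, query, protocol).

    Returns None for a malformed request line rather than guessing.
    """
    parts = line.split()
    if len(parts) != 3:
        return None
    method, url, protocol = parts
    target = urlsplit(url)
    return method, target.path, target.query, protocol


# Illustrative values only.
stamp = parse_time("[10/Oct/2000:13:55:36 -0700]")
request = split_request_line("GET /search?q=apache HTTP/1.1")
```

Because the raw request line stays in the log, the split happens only in the analysis script, and lines can still be matched against the common log file.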

Process ID and Thread ID

%{pid}P and %{tid}P

When the Apache server starts, it spawns off child processes to handle incoming requests. As it runs, it shuts down older processes and adds new ones. Apache can add additional child processes if it needs to keep up with a high demand. By recording the process id and thread id (if applicable), you will have a record of which child process handled an incoming client.

You can also track the number of Apache processes active at a given time and determine when a child process shut down. If you are running an application handler (mod_perl, mod_python), recording the PID makes it easier to find out which hits a child process was handling when you debug an application error.
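
As a rough illustration, the sketch below counts how many distinct child processes served hits in each minute and notes the last minute each PID appeared. It assumes you have already extracted (minute, pid) pairs from the log in log order. The sample data and function names are made up, and the count only covers processes that actually handled a request, not idle ones.

```python
from collections import defaultdict

# Hypothetical (minute, pid) pairs already extracted from log lines.
hits = [
    ("13:55", 2301), ("13:55", 2302), ("13:55", 2301),
    ("13:56", 2302), ("13:56", 2310),
]


def processes_per_minute(entries):
    """Count the distinct PIDs that served at least one hit in each minute."""
    active = defaultdict(set)
    for minute, pid in entries:
        active[minute].add(pid)
    return {minute: len(pids) for minute, pids in active.items()}


def last_seen(entries):
    """Map each PID to the last minute it appeared (entries in log order)."""
    seen = {}
    for minute, pid in entries:
        seen[pid] = minute
    return seen


per_minute = processes_per_minute(hits)
final_hit = last_seen(hits)
```

A PID that stops appearing in last_seen output while others continue is a hint that the child process was shut down around that time.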

Connection Status

%X

The connection status directive reports the state of the client connection when the request was handled. It returns one of three flags: X if the client aborted the connection before completion, + if the client has indicated that it will use keep-alives (and may request additional URLs), or - if the connection will be closed after the request.

Keep-alive, also known as a persistent connection, lets a client request multiple files over the same connection. It began as an extension to HTTP 1.0 and is the default behavior in HTTP 1.1. This way a client doesn't need to go through the overhead of re-establishing a TCP connection to retrieve each new file.

For Apache 1.3, use the %c directive in place of the %X directive.
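
The three flags are easy to summarize once they are pulled out of the log. Here is a small sketch that tallies them. The flag meanings come from the description above, while the sample list and the function name are hypothetical.

```python
from collections import Counter

# Meanings of the connection status flags described above.
CONNECTION_STATUS = {
    "X": "aborted before completion",
    "+": "kept alive",
    "-": "closed after the request",
}


def tally_connection_status(flags):
    """Count how often each connection status appears; unknown flags are grouped."""
    counts = Counter()
    for flag in flags:
        counts[CONNECTION_STATUS.get(flag, "unknown")] += 1
    return counts


# Hypothetical flags extracted from five log lines.
sample_flags = ["+", "+", "-", "X", "+"]
summary = tally_connection_status(sample_flags)
```

A high share of aborted connections for one URL may be worth a closer look, since it can mean clients are giving up before the response finishes.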

Status Codes

%s and %>s

There's nothing really new about these directives, since %>s is already used in the common log file format. The CLF records the final status code, after any internal redirections take place, with %>s. For the Blackbox format, we will want to record the status code both before (%s) and after (%>s) any redirection.
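
With both values in the log, a script can flag the hits where the two codes differ. This is only a sketch: the status pairs are made-up values for illustration, and the function name is my own. Matching codes do not prove that no redirection happened, only that the status did not change.

```python
def status_changed(original_status, final_status):
    """Return True when the %s value differs from the %>s value."""
    return original_status != final_status


# Hypothetical (%s, %>s) pairs, not taken from a real log.
status_pairs = [(200, 200), (404, 404), (302, 200)]
changed = [pair for pair in status_pairs if status_changed(*pair)]
```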

Time to Serve Request

%T and %D

The common log file format cannot accurately determine the amount of time it takes to serve a file. Some parsing programs try to estimate it from the timestamps on hits from the same source, but this is very unreliable, especially if the hits are being made in parallel.

These two directives will give you the exact metrics you need. The %T directive will report the time in seconds it took to handle the request while the %D directive will report the same time in microseconds.

Apache 1.3 does not support the %D directive.
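
To show why the microsecond value is worth having, the sketch below works with a few hypothetical %D values. It assumes %T reports whole seconds, so a sub-second request would show up as 0 there. The helper names and the one-second threshold are my own choices.

```python
from statistics import mean

MICROSECONDS_PER_SECOND = 1_000_000


def to_seconds(micros):
    """Convert a %D value (microseconds) to fractional seconds."""
    return micros / MICROSECONDS_PER_SECOND


def whole_seconds(micros):
    """Approximate what %T would show, assuming it drops the fraction."""
    return micros // MICROSECONDS_PER_SECOND


# Hypothetical %D values from four requests.
durations_us = [1200, 850, 2_500_000, 430]

slow = [d for d in durations_us if to_seconds(d) >= 1.0]
average_ms = mean(durations_us) / 1000
```

Three of the four sample requests would be indistinguishable at one-second resolution, which is the main argument for logging %D where it is available.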

Bytes Sent and Received

%I, %O, and %B

Apache 2.0 includes the optional mod_logio module, which can report how many bytes of traffic a client sent and how many bytes the server returned.

The %b directive does a good job, but it only reports the bytes returned in the requested object, excluding the bytes from the HTTP headers. The header traffic is usually small, but you may want to record it to get a better idea of what the outgoing TCP traffic for a given interface looks like. Recording the incoming bytes is helpful when your users are uploading files with the PUT or POST methods.

Use %I to record incoming bytes, %O to record outgoing bytes, and %B to record outgoing content bytes. In cases where no content is returned, the %B directive returns a zero, whereas %b returns a dash. Since we're dealing with integer values, it's better to use %B.
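
The difference between %b and %B matters as soon as a script does arithmetic on the values. A short sketch, with hypothetical numbers and function names, that normalizes the content byte count and estimates the header overhead as the outgoing bytes minus the content bytes:

```python
def content_bytes(field):
    """Convert a %b or %B value to an int; %b logs "-" when no content is sent."""
    return 0 if field == "-" else int(field)


def header_overhead(bytes_out, body_bytes):
    """Estimate header traffic as %O (all outgoing bytes) minus %B (content bytes)."""
    return bytes_out - body_bytes


# Hypothetical values for one request: %O = 2580, %B = 2326.
overhead = header_overhead(2580, content_bytes("2326"))
empty = content_bytes("-")
```

With %B in the log the dash check is never needed, which is one less special case for the parser.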
