ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)

O'Reilly Book Excerpts: Apache Cookbook

Cooking with Apache, Part 2

by Rich Bowen and Ken Coar

Editor's note: Last month, we published our first batch of recipes from the recently released Apache Cookbook. This week, we've excerpted three more samples. Find out how to make part of your web site available via SSL, how to place a CGI program in a directory that contains non-CGI documents, and how to redirect a 404 ("not found") page to another page (such as the front page of the site) in these latest samplings.

Recipe 7.4: Serving a Portion of Your Site via SSL

Problem

You want to have a certain portion of your site available via SSL exclusively.

Solution

This is done by making changes to your httpd.conf file.

For Apache 1.3, add a line such as the following:

Redirect /secure/ https://secure.domain.com/secure/

For Apache 2.0:

<Directory /www/secure>
    SSLRequireSSL
</Directory>

Or, with mod_rewrite:

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^/(.*) https://%{SERVER_NAME}/$1 [R,L]

Discussion

It is perhaps best to think of your site's normal pages and its SSL-protected pages as being handled by two separate servers, rather than one. While they may point to the same content, they run on different ports, are configured differently, and, most importantly, the browser considers them to be completely separate servers. So you should too.

Don't think of enabling SSL for a particular directory; rather, you should think of it as redirecting requests for one directory to another.

Note that the Redirect directive preserves path information, which means that if a request is made for /secure/something.html, then the redirect will be to https://secure.domain.com/secure/something.html.

Be careful where you put this directive. Make sure that you only put it in the HTTP (non-SSL) virtual host declaration. Putting it in the global section of the config file may cause looping, as the new URL will match the Redirect requirement and get redirected itself.

Finally, note that if you want the entire site to be available only via SSL, you can accomplish this by simply redirecting all URLs, rather than a particular directory:

Redirect / https://secure.domain.com/

Again, be sure to put that inside the non-SSL virtual host declaration.

You will see various solutions proposed for this situation using RedirectMatch or various RewriteRule directives. There are special cases where this is necessary, but in most cases, the simple solution offered here works just fine.
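As a rough sketch of one such special case, suppose only requests for .cgi files under /secure/ should be sent to the SSL host. RedirectMatch takes a regular expression rather than a URL prefix, so it can express that. The .cgi restriction and the host name here are illustrative assumptions, not part of the recipe:

```apache
# Goes inside the non-SSL virtual host only, for the same looping reason.
# Hypothetical case: redirect just the .cgi files under /secure/.
RedirectMatch ^/secure/(.*\.cgi)$ https://secure.example.com/secure/$1
```

Unlike Redirect, RedirectMatch does not append the rest of the path automatically; the parenthesized group and $1 carry it across.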

It is important to understand that this Redirect must appear only in the non-SSL virtual host; otherwise, the Redirect will loop. This implies that you do in fact have the HTTP (non-SSL) site set up as a virtual host. If you do not, you may need to set it up as one for this recipe to work.

Thus, the entire setup might look something like this:

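# Bind the name-based (non-SSL) hosts to port 80 only, so that they
# do not also claim requests arriving on the SSL port.
NameVirtualHost *:80

<VirtualHost *:80>
    ServerName regular.example.com
    DocumentRoot /www/docs

    # Only here, in the non-SSL host, to avoid a redirect loop.
    Redirect /secure/ https://secure.example.com/secure/
</VirtualHost>

<VirtualHost _default_:443>
    SSLEngine On
    SSLCertificateFile /www/conf/ssl/ssl.crt
    SSLCertificateKeyFile /www/conf/ssl/ssl.key

    ServerName secure.example.com
    DocumentRoot /www/docs
</VirtualHost>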

This is, of course, an oversimplified example, meant only to illustrate that the Redirect must appear only in the non-SSL virtual host to avoid a redirection loop.

The other two solutions are perhaps more straightforward, although they each have a small additional requirement for use.

The second solution listed, using SSLRequireSSL, will work only if you are using Apache 2.0. It is a directive added specifically to address this need. Placing the SSLRequireSSL directive in a particular <Directory> section ensures that non-SSL accesses to that directory are not permitted. Note that it does not redirect the client: a request made over plain HTTP is simply refused with a 403 (Forbidden) response.
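Because SSLRequireSSL refuses plain-HTTP requests rather than sending them elsewhere, one possible arrangement is to keep the Redirect in the non-SSL virtual host and add SSLRequireSSL as a backstop. This is a sketch only; it assumes mod_ssl is loaded and that the protected files live in /www/docs/secure, a path chosen for illustration:

```apache
<VirtualHost *:80>
    ServerName regular.example.com
    DocumentRoot /www/docs

    # Friendly path: plain-HTTP visitors are sent to the SSL host.
    Redirect /secure/ https://secure.example.com/secure/
</VirtualHost>

# Backstop: if the Redirect above is ever removed,
# plain-HTTP requests for this directory are refused.
# The path /www/docs/secure is an assumption for illustration.
<Directory /www/docs/secure>
    SSLRequireSSL
</Directory>
```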

The third solution, using the RewriteCond and RewriteRule directives, requires that you have mod_rewrite installed and enabled. The RewriteCond directive checks whether the client is already using SSL, and the RewriteRule is applied only if it is not. In that case, the request is redirected to the same content, but over HTTPS instead of HTTP.
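The rule shown in the Solution redirects every URL on the host. A variation limited to the /secure/ portion might look like the following sketch. It assumes the directives sit in the main server or virtual host configuration, where the pattern is matched against a path that begins with a slash:

```apache
RewriteEngine On
# Only act when the connection is not already using SSL.
RewriteCond %{HTTPS} !=on
# Redirect /secure/... to the same path over HTTPS; other URLs are untouched.
RewriteRule ^/secure/(.*) https://%{SERVER_NAME}/secure/$1 [R,L]
```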

Recipe 8.2: Enabling CGI Scripts in Non-ScriptAliased Directories

Problem

You want to put a CGI program in a directory that contains non-CGI documents.

Solution

Use AddHandler to map the CGI handler to the particular files that you want to be executed:

<Directory "/foo">
    Options +ExecCGI
    AddHandler cgi-script .cgi .py .pl
</Directory>

Discussion

Enabling CGI execution via the ScriptAlias directive is preferred, for a number of reasons, over permitting CGI execution in arbitrary document directories. The primary reason is security auditing. It is much easier to audit your CGI programs if you know where they are, and storing them all in a single directory ensures that.
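For comparison, the ScriptAlias approach usually amounts to a single line. The filesystem path below is an assumed example:

```apache
# Everything under /www/cgi-bin/ is treated as a CGI program
# and is reachable at URLs beginning with /cgi-bin/.
ScriptAlias /cgi-bin/ /www/cgi-bin/
```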

However, there are cases where it is desirable to have this functionality. For example, you may want to keep several files together in one directory — some of them static documents, and some of them scripts — because they are part of a single application.

Using the AddHandler directive maps certain file extensions to the cgi-script handler so they can be executed as CGI programs. In the case of the aforementioned example, programs with a .cgi, .py, or .pl file extension will be treated as CGI programs, while all other documents in the directory will be served up with their usual MIME type.
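If matching by extension is too broad, a narrower sketch is to name the individual program instead. The file name report.cgi is hypothetical:

```apache
<Directory "/foo">
    Options +ExecCGI
    # Only this one file is run as a CGI program.
    <Files "report.cgi">
        SetHandler cgi-script
    </Files>
</Directory>
```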

Note that the +ExecCGI argument is provided to the Options directive, rather than the ExecCGI argument — that is, with the + sign rather than without. Using the + sign adds this option to any others already in place, whereas using the option without the + sign will replace the existing list of options. You should use the argument without the + sign if you intend to have only CGI programs in the directory, and with the + sign if you intend to also serve non-CGI documents out of the same directory.
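The difference between the two forms can be sketched as follows. The directory names and the starting set of options are assumptions made for the example:

```apache
<Directory "/www/docs">
    Options Indexes FollowSymLinks
</Directory>

<Directory "/www/docs/app">
    # Merged with the parent: Indexes FollowSymLinks ExecCGI
    Options +ExecCGI
</Directory>

<Directory "/www/docs/scripts">
    # Replaces the parent's list: ExecCGI only
    Options ExecCGI
</Directory>
```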

Recipe 9.5: Redirecting Invalid URLs to Some Other Page

Problem

You want all "not found" pages to go to some other page instead, such as the front page of the site, so that there is no loss of continuity on bad URLs.

Solution

Use the ErrorDocument directive to catch 404 (Not Found) errors:

ErrorDocument 404 /index.html
DirectoryIndex index.html /path/to/notfound.html

Discussion

The recipe given here will cause all 404 errors (every time someone requests an invalid URL) to return the content of /index.html, providing the user with the front page of your web site, so that even invalid URLs still get valid content. The DirectoryIndex line covers a related case: a request for a directory that has no index.html is answered with /path/to/notfound.html. Presumably, users accessing an invalid URL on your web site will get a page that helps them find the information that they were looking for.

On the other hand, this behavior may confuse the user who believes she knows exactly where the URL should take her. Make sure that the page that you provide as the global error document does in fact help people find things on your site, and does not merely confuse or disorient them. You may, as shown in the example, return them to the front page of the site. From there they should be able to find what they were looking for.

When users get good content from bad URLs, they will never fix their bookmarks and will continue to use a bogus URL long after it has become invalid. You will continue to get 404 errors in your log file for these URLs, and the user will never be aware that they are using an invalid URL. If, on the other hand, you actually return an error document, they will immediately be aware that the URL they are using is invalid and will update their bookmarks to the new URL when they find it.

Note that, even though a valid document is being returned, a status code of 404 is still returned to the client. This means that if you use a tool to validate the links on your web site, you will still get accurate results, provided the tool checks the status code rather than looking for error messages in the content.
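Whether the 404 status survives depends on how the ErrorDocument target is written. As a sketch (the page name /notfound.html and the host name are assumptions):

```apache
# Local URL path: the page is served and the client still sees status 404.
ErrorDocument 404 /notfound.html

# Full URL: Apache sends a redirect instead, so the client sees a
# redirect status rather than 404. Shown commented out for contrast.
# ErrorDocument 404 http://www.example.com/notfound.html
```

Link checkers that rely on the status code work with the first form but not with the second.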

Check back here next month when we run the final batch of samples from the book. Recipes will cover how to require logins for proxied content, how to optimize symbolic links, and how to solve the "Trailing Slash" problem.

Rich Bowen is a member of the Apache Software Foundation, working primarily on the documentation for the Apache Web Server. DrBacchus, Rich's handle on IRC, can be found on the web at www.drbacchus.com/journal.

Ken Coar is a member of the Apache Software Foundation, the body that oversees Apache development.


Copyright © 2009 O'Reilly Media, Inc.