ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


PHP Forms

The Universal Web Form Processor

12/29/2000

One recent evening the Phanatic and a few friends were splitting an extra large pizza and a few six packs of Jolt. The Phanatic was pontificating as usual about good program design and problem generalization. One of the PHP apostles, probably egged on by a Jolt buzz, challenged the Phanatic to "practice what you preach" and design a program that solved a class of problems rather than a single task. Another of the assembled brethren suggested the Phanatic use the NuSphere suite he had been extolling recently. The gauntlet was thrown down.

The Phanatic, like any good programmer, can't refuse a good challenge. The only thing more motivating than a challenge is the statement, "it's impossible." After a little discussion, the huddled mass yearning to program proclaimed that the task would be a form-processing script that "does everything."

Before getting into the generic form script, let's take a look at the Phanatic's new favorite product.

NuSphere MySQL

The product making the Phanatic so excited is NuSphere MySQL. It's an integrated multi-platform distribution of Apache, Perl, PHP, and MySQL. Each element of the distribution package is the pick of the litter in its respective open source application category.

NuSphere distributions are available for Linux, Unix, and Windows platforms. If you have a lot of patience and a high-speed connection, the package can be downloaded for free. The CD ROM boxed edition is $79 and includes a hard copy of the massive MySQL Reference and the O'Reilly pocket guides for Apache, Perl 5, and PHP 4.

The Windows downloadable version of NuSphere's MySQL weighs in at about 24 megs without source code and 50 megs with source code for all products. Unzipping the package generates over 4,500 files and consumes 65 megs of disk space! Although this sounds like a lot of bits and bytes (especially to a guy whose first hard drive was 32 megs), let's put it into perspective. The NuSphere distribution includes the four major web-related offerings in the open source world. Collectively they consume less disk space than that big software company's word processor.

For good measure, the installation includes the complete online documentation for all four products and, additionally, a graphical administration tool for managing MySQL. Here's the best part: within two minutes of unpacking the zipped download, you can be testing the MySQL, Perl, and PHP on an Apache server running on your PC. If you have ever struggled with the installation and configuration of just one of these products, you will think the NuSphere installation is magic.

NuSphere installation

The first step in installing the NuSphere suite is unzipping the downloaded zip file. I suggest you create a directory C:\NuSphere and extract the files into it. The extraction process will create two directories under C:\NuSphere: C:\Nusphere\apache and C:\NuSphere\NuSphereMySQL-n.nn.n-Win32, where n.nn.n (something like 1.13.2) is the MySQL version information. Next, run the file setup.exe from the directory:

C:\NuSphere\NuSphereMySQL-n.nn.n-Win32

A browser window will open with the salutation, "Welcome to the NuSphere MySQL installation CD." Click on the Install button. The next window offers the option for a Quick or Custom installation.

Unless you are highly experienced, take the Quick installation option. The default path for the Quick install is C:\nusphere. Click on the Quick install button accepting the default path. MySQL, Apache, Sample Website, PHP, phpMyAdmin, and Perl are installed in turn. Next, follow the "Click here to continue" link. The next screen has a link to register and a link to do a quick start. After registering, click on "Quick Start."

A new window will open with the default home page. Check out the various links to test your installation. The startup page also has links to full documentation for MySQL, Perl, PHP, Apache, and the phpMyAdmin applications which reside on your PC. This is truly an impressive package.

After installation

What you do next is a function of your background and experience with Apache and the other included packages. Let's discuss paths for a moment. At the server level it's important to recognize the difference between actual and virtual paths. When you point your browser to a URL like http://www.xyz.com/index.html, there is a lot going on under the surface. Where does index.html actually live? The server configuration file contains an entry associating an actual starting path with the xyz.com domain portion of the URL. It might be something as simple as

/home/xyz/htdocs

although the path is frequently more complicated.

To answer our own question, index.html lives at /home/xyz/htdocs/index.html. The virtual web root is therefore /home/xyz/htdocs. If we were to create a directory named tutorials under htdocs and also create a file named forms.html in that directory, the URL would be

http://www.xyz.com/tutorials/forms.html

Since Apache is now running on our PC, as opposed to running on an ISP's machine, there is no domain name. When we want to access the Apache server running on our local machine, we use localhost in a URL where we would normally use a domain name.

If we enter a URL like

http://localhost

what is the file we are actually viewing? During the NuSphere installation we specified an application root directory of C:\nusphere. The installation creates an apache subdirectory and a directory of htdocs under apache. The virtual root of this system, without changing the Apache configuration, is therefore

c:\nusphere\apache\htdocs

The Apache configuration file also indicates a default start page of index.html or index.php, therefore,

http://localhost

displays C:\nusphere\apache\htdocs\index.html.
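
The URL-to-file mapping just described can be sketched in a few lines of PHP. This is only an illustration of the idea, not a description of how Apache does it: the function name ResolveLocalPath is hypothetical, index.html is assumed to be the only default start page, and a real server also deals with aliases, access rules, and encoded characters.

```php
<?php
# Hypothetical helper: map a URL onto a file beneath the virtual root.
function ResolveLocalPath($DocumentRoot, $Url) {
  # Keep only the path portion of the URL, e.g. "/php/test-form.php"
  $UrlPath = parse_url($Url, PHP_URL_PATH);
  if ($UrlPath === null || $UrlPath === false || $UrlPath === "") {
    $UrlPath = "/";   # No path given, so the root itself was requested
  }
  # Turn URL slashes into Windows backslashes and append to the root
  $LocalPath = rtrim($DocumentRoot, "\\/") . str_replace("/", "\\", $UrlPath);
  # A path ending in a separator names a directory; assume index.html
  if (substr($LocalPath, -1) == "\\") {
    $LocalPath .= "index.html";
  }
  return $LocalPath;
}

$Root = "C:\\nusphere\\apache\\htdocs";
echo ResolveLocalPath($Root, "http://localhost"), "\n";
echo ResolveLocalPath($Root, "http://localhost/php/test-form.php"), "\n";
```

The first call yields the start page under htdocs; the second yields a file in a php subdirectory of htdocs.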

If you have a single web site, you can use C:\nusphere\apache\htdocs\ as your web site's starting directory. Under htdocs, build a directory tree that mirrors your actual site's directory structure. If you have multiple web sites you will probably want to create a directory under htdocs for each of your domains. You could also do some Apache configuration to deal with these issues, but I would not recommend you go that route unless you have experience with Apache configuration.

A few last points. The domain localhost is a virtual domain; there is no directory named localhost. If you are going to use the htdocs directory as your testbed, rename index.html, because it's the NuSphere suite starting page.

Apache and MySQL must be started before you can use them. When you start them depends upon your work habits. If you do web site development almost every time you log on, you may want to have the servers started when you boot up. If so, create an item in your Start Up group. If not, create a shortcut on your desktop. The following will start all the required applications in one shot.

C:\nusphere\launch.exe /batch=c:\nusphere\start-nusphere.dat

After starting NuSphere we can use the various applications for developing the form-to-mail PHP script. Create the PHP files as you normally would. Let's assume you create a file test-form.php. Instead of uploading your files to your friendly ISP, create a directory under C:\nusphere\apache\htdocs called php or whatever directory name you prefer. You can place your files directly in the htdocs directory, but it will quickly start to get disorganized.

Let's assume you have tucked away test-form.php as C:\nusphere\apache\htdocs\php\test-form.php. Place the following URL in your favorite Web browser:

http://localhost/php/test-form.php

It's great not to have to keep uploading every minor revision to your ISP for testing.

Forms R Us

The Phanatic installed the NuSphere package on an AMD Athlon 700 MHz system with 128 megs of RAM. The OS is Windows 98 SE.

Well, we've had our little software diversion, the pizza is finished, and there is no more Jolt, so it looks like there are no excuses left for the Phanatic not getting back to why we're here: PHP. Let's get to work on the e-mail form handler.

A do-it-all form-processing script is a big task. Let's start by conceptualizing the characteristics of this beast. The script should, based upon the form designer's choice,

  1. display a nicely formatted output of the form's variables;
  2. email a nicely formatted output of the form's variables to a designated recipient;
  3. email a nicely formatted receipt to the form's originator;
  4. redirect to a designated document;
  5. display a "thank you" message;
  6. validate designated fields for non-null, email address, zip code, and numeric data;
  7. allow user selection of cosmetic properties;
  8. allow users maximum latitude in the selection of form names; and
  9. include selected environment variables in output display.

Problems

Any program worth its creation time must overcome a series of problems, hopefully using elegant solutions. Repeat slowly after me, "elegance is simplicity!" Recall the three desirable program characteristics from a previous tome: effectiveness, maintainability, and efficiency. Any non-trivial problem will have multiple solutions. Program elegance is therefore finding the best solution, not simply one that works. Let's look at obstacles this script will have to overcome.

  1. Detect if the form was submitted using the GET or POST method.
  2. Delineate between HIDDEN control variables and unknown user variables.
  3. Allow only email requests from designated domains, for security reasons.
  4. Deal with form elements having multiple values.
  5. Validate selected fields such as e-mail addresses.

The Phanatic will present some code snippets to solve each of these problems. Since there is a lot of work, and space is limited, the next episode concludes with a complete script ready for some prime time form evaluation.
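
Validation proper waits for the next episode, but to make the field types from the wish list concrete, here is a rough sketch of what such checks might look like. Everything in it is an assumption rather than the Phanatic's eventual code: the function name IsValidField, the kind labels, and the choice of a US-style ZIP pattern are all hypothetical, and filter_var requires a much later PHP release than the one this column was written against.

```php
<?php
# Hypothetical validator. $Kind is one of the assumed labels
# "nonnull", "email", "zip", or "numeric".
function IsValidField($Kind, $Value) {
  $Value = trim($Value);
  switch ($Kind) {
    case "nonnull":
      # Anything other than an empty (or all-blank) string passes
      return $Value !== "";
    case "email":
      # Let PHP's built-in filter judge the address syntax
      return filter_var($Value, FILTER_VALIDATE_EMAIL) !== false;
    case "zip":
      # Assumes US ZIP or ZIP+4, e.g. 08540 or 08540-1234
      return preg_match('/^\d{5}(-\d{4})?$/', $Value) === 1;
    case "numeric":
      return is_numeric($Value);
    default:
      # Unknown kinds fail rather than silently pass
      return false;
  }
}

var_dump(IsValidField("email", "urb@usats.com"));
var_dump(IsValidField("zip", "0854"));
```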

Did you get or post me?

Extracting form variables when programming in Perl is a real task. The variables can be extracted fairly painlessly using the CGI.pm module, but there is a lot of work going on under the hood. In PHP form variables just spring to life. If a form has an input statement

<input type="text" name="FirstName">

we can simply use $FirstName in our script. However, this wonderful shortcut is of no value if we don't know the contents of the name clauses from the submitted form. In a generalized form handler, we must deal with $FirstName even though we don't know the name of the variable.

Fortunately, PHP has an associative array containing all the name/value pairs contained in a form submission. Actually PHP has two arrays, $HTTP_POST_VARS and $HTTP_GET_VARS. As you might expect, the first is populated with a form submission using the post method while the latter is populated with get method submissions.

The $REQUEST_METHOD environment variable normally contains a value of either GET or POST if the script was called from a form. However, some older servers may not set this variable, so we'll detect which associative array is loaded to determine the form's submission method.

The function GetFormData performs two tasks. It returns the value Post or Get in its first parameter, and it returns an associative array containing the submitted data, including hidden fields, in its second. Once the method has been determined, this scheme frees the rest of the script from any additional get/post consideration.

function GetFormData(&$Method,&$FormVariables) {
  # Determine if the form used the post or get method and return in $Method
  # Return the form's variables as an associative array containing the
  # set of Name - Value pairs.

  global $HTTP_POST_VARS, $HTTP_GET_VARS; 
  # POST or GET method used when submitting the form?
  $Method = (isset($HTTP_POST_VARS)) ? "Post" : "Get";
  # Load the $FormVariables associative array from appropriate array
  $FormVariables = ($Method == "Post") ?
       $HTTP_POST_VARS : # Post Method Used
       $HTTP_GET_VARS;   # Get Method Used
} # End of function GetFormData

The two parameters are passed by reference; note the & before the $ used for both variables. The function call would use the template:

GetFormData($Method, $FormVariables);
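
A note for readers on later PHP releases: the $HTTP_POST_VARS and $HTTP_GET_VARS arrays were eventually dropped from the language in favor of the $_POST and $_GET superglobals, with the request method available as $_SERVER['REQUEST_METHOD']. The following is a sketch of the same idea in that style, not a replacement for the Phanatic's function. The name GetFormDataModern and the choice to pass the three arrays in as parameters (which makes the function easy to try without a web server) are assumptions made for illustration.

```php
<?php
# Hypothetical modern counterpart of GetFormData.
function GetFormDataModern($Server, $Post, $Get, &$Method, &$FormVariables) {
  # Prefer the request method reported by the server, if there is one
  $Reported = isset($Server['REQUEST_METHOD'])
            ? strtoupper($Server['REQUEST_METHOD'])
            : "";
  # Fall back on "which array is loaded" when nothing was reported
  $IsPost = ($Reported == "POST") || ($Reported == "" && count($Post) > 0);
  $Method = $IsPost ? "Post" : "Get";
  # Load $FormVariables from the appropriate array
  $FormVariables = $IsPost ? $Post : $Get;
}

# Typical call inside a page, using the superglobals
GetFormDataModern($_SERVER, $_POST, $_GET, $Method, $FormVariables);
```

As in the original, the rest of the script only ever looks at $Method and $FormVariables.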

You will also notice the Phanatic's penchant for the ternary conditional format.

$ReceivingVariable = (Condition) ? TrueAssignment : FalseAssignment;

A form element by any other name

Displaying form values presents a three-fold problem:

We'll cover these in turn.

Let's initially explore the first two problems. An associative array is a set of key/value pairs. The key portion of the $FormVariables array represents the form component's name clause. Conversely, the value portion represents what the user inputs or the value clause in a hidden field.

As demonstrated in the Dump GLOBALS script, PHP has a foreach array traversal construct. It can be used for associative or indexed arrays. Our form mailer needs a function to take an associative array as input, traverse the array while formatting its contents for display, and finally return the formatted HTML. The following is the code snippet for dumping $FormVariables, which is an image of either the $HTTP_GET_VARS or $HTTP_POST_VARS array. The same function can be employed to display other associative arrays, such as the environment values.

Both of the function's required parameters are passed by reference. The first parameter is the associative array to be displayed while the second parameter returns the formatted HTML.

function DisplayArrayVariables(&$FormVariables,&$HTMLVariables) {
  $HTMLVariables = "";
  foreach ($FormVariables as $Name=>$Value) {
    $HTMLVariables .= "<tr><td align=\"right\"><b>$Name:&nbsp;</b></td>\n";
    if (gettype($Value) == "array") {
      # Multi-valued element: build a comma-separated list of its values
      $ArrayComponents = "";
      foreach ($Value as $ArrayElement) {
        $ArrayComponents .= "$ArrayElement, ";
      } # End of foreach ($Value as $ArrayElement)
      # Drop the trailing comma and space
      $Value = substr($ArrayComponents,0,strlen($ArrayComponents)-2);
    } # End of if (gettype($Value) == "array")
    $HTMLVariables .= "<td><b>$Value</b></td></tr>\n";
  } # End of foreach ($FormVariables as $Name=>$Value)
  $HTMLVariables = "<table>\n$HTMLVariables\n</table>\n";
} # End of function DisplayArrayVariables
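
For comparison, here is a compact variation on the same traversal. It is a sketch, not the function used in the column's scripts: the name FormatArrayAsTable is hypothetical, it returns the HTML instead of using a reference parameter, it lets implode build the comma-separated list, and it adds one thing the original does not do, namely escaping the names and values with htmlspecialchars so that markup typed into a form field is displayed rather than rendered.

```php
<?php
# Hypothetical variant of DisplayArrayVariables that returns the HTML.
function FormatArrayAsTable($Variables) {
  $Rows = "";
  foreach ($Variables as $Name => $Value) {
    # Multi-valued elements (checkbox groups, multiple selects) are arrays
    if (is_array($Value)) {
      $Value = implode(", ", $Value);
    }
    # Escape both parts so user-typed markup is shown literally
    $Name  = htmlspecialchars((string) $Name);
    $Value = htmlspecialchars((string) $Value);
    $Rows .= "<tr><td align=\"right\"><b>$Name:&nbsp;</b></td>";
    $Rows .= "<td><b>$Value</b></td></tr>\n";
  }
  return "<table>\n$Rows</table>\n";
}

$Sample = array(
  "FirstName" => "Urb",
  "Toppings"  => array("Cheese", "Peppers"),
);
echo FormatArrayAsTable($Sample);
```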

Hidden fields

The soul of a generic form handler is the script's ability to perform a variety of optional tasks as selected by the designer of the form. The various options and parameters are passed to the script as hidden fields. Quite simply, a hidden field has a predetermined value assigned by the form's designer, but the end-user of the form never sees the hidden fields unless they look at the document's source code.

The form handler has only one required hidden field: the e-mail address of the designated recipient receiving the formatted output. The HTML code might be

<input type="hidden" name="Recipient" value="urb@usats.com">

The hidden fields must appear between the <form> and </form> tags. As you might suspect, once we're in the script, the variable $Recipient contains the value urb@usats.com.

Once they arrive in the script, the name/value pairs from any hidden fields are indistinguishable from user-supplied data. We need a method to determine which of the form's supplied fields are control variables and which are user variables. To distinguish between the two types, let's call them system variables and user variables. System variables and values are created by the form designer. User values are those entered into the form by the user.

Separating the system and user data is a three-part process. First, select a group of names that can only be used for system variables. Second, build an associative array containing name (key) parts representing possible system variable names. Additionally, default values for those keys, if any, are then initialized. The Phanatic has his own naming conventions as you can see from the scripts. He prefers running variable names together and starting each name portion with a capital letter, something like FirstName. However, even the Phanatic is willing to recognize that others may have different naming preferences. Use whatever convention you like, just be consistent. How about this range of name values:

In other words, FirstName, FIRSTNAME, firstname, first name, First-Name, First Name, and First_Name should all be recognized as the same system variable name, namely firstname. Here is a snippet from the StartUp function:

$SystemVariables = array(
  # The action variable determines the script's role.
  #  "M" mail results to recipient
  #  "T" test form by displaying form, system, and environment variables
  #  "A" mail results to recipient and acknowledgment to submitter

  "action"=>"M",
  "allownamealias"=>true,               # Allow name aliasing
  "recipient"=>"",                      # Email recipient - only required hidden field
  "subject"=>"Form Submission",         # Email subject
  # Cosmetic properties
  "bgcolor"=>"WHITE",                   # Background color
  # ... the excerpt ends here; the remaining entries are not shown
);

To accomplish the required naming flexibility, the keys of the associative array are all specified as lower case with spaces, dashes, and underscores squeezed out. Now on to the second part, separating the system and user variables.

To peek ahead a little, the allownamealias field is set to either true or false indicating whether name aliases are to be allowed. Maybe you don't want FirstName and first-name to be the same field.

In an earlier function we placed the form's output into an associative array called $FormVariables. What we want to do now is traverse the $FormVariables array and transfer any system variables into the $SystemVariables array. In addition, we want to delete the system variable information from the $FormVariables array. Since we're now experts at traversing an associative array, let's jump in.

foreach ($FormVariables as $Name=>$Value)

Remember, the $Name value of a system variable can be based on several different conventions. First, we have to convert the name to conform to the keys loaded in the StartUp function. Then we test the converted name by seeing if it is a key in the $SystemVariables array. The next two statements perform the conversion. The first uses a Perl-compatible regular expression to replace blanks, dashes, and underscores with an empty string. The second converts the result to all lower case.

# Replace blanks, dashes, and underscores with nothing.
$TestKey = preg_replace("/( |-|_)/","",$Name);
$TestKey = strtolower($TestKey);
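
To see the conversion at work, the two statements can be wrapped in a small function and run against the aliases listed earlier. The function name NormalizeFieldName is hypothetical, introduced only for this demonstration, and the pattern is written as a character class, which matches the same three characters as the alternation above.

```php
<?php
# Hypothetical wrapper around the two conversion statements.
function NormalizeFieldName($Name) {
  # Squeeze out blanks, underscores, and dashes
  $Key = preg_replace("/[ _-]/", "", $Name);
  # Fold what is left to lower case
  return strtolower($Key);
}

$Aliases = array("FirstName", "FIRSTNAME", "first name", "First-Name", "First_Name");
foreach ($Aliases as $Alias) {
  echo $Alias, " => ", NormalizeFieldName($Alias), "\n";
}
```

Every alias in the list comes out as firstname, which is the form used for the keys of $SystemVariables.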

The remainder of the function then determines if $TestKey (the converted name) resides in the $SystemVariables array. If so, it inserts the passed value into the array, overwriting any default value, and removes the name/value pair from the $FormVariables array.

if (isset($SystemVariables[$TestKey])) { # Is it a system variable?
  $SystemVariables[$TestKey] = $Value;   # Use its value if yes
  unset($FormVariables[$Name]);          # Remove it from $FormVariables
} # End of if (isset($SystemVariables[$TestKey]))
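
Putting the pieces of this section together, the separation step might be assembled along the following lines. This is a sketch under stated assumptions, not the code in process-form.php: the function name SeparateSystemVariables is hypothetical, the allownamealias switch is ignored, the loop walks a list of keys so that removing entries cannot disturb the traversal, and array_key_exists is used in place of isset so that a default of null would still count as a system variable.

```php
<?php
# Hypothetical assembly of the traversal, conversion, and transfer steps.
function SeparateSystemVariables(&$FormVariables, &$SystemVariables) {
  foreach (array_keys($FormVariables) as $Name) {
    # Convert the submitted name to the lower-case, squeezed key form
    $TestKey = strtolower(preg_replace("/[ _-]/", "", $Name));
    if (array_key_exists($TestKey, $SystemVariables)) { # System variable?
      $SystemVariables[$TestKey] = $FormVariables[$Name]; # Override default
      unset($FormVariables[$Name]);                       # Remove from user data
    }
  }
}

# Sample data standing in for StartUp defaults and a submitted form
$SystemVariables = array(
  "action"    => "M",
  "recipient" => "",
  "subject"   => "Form Submission",
);
$FormVariables = array(
  "Recipient"  => "urb@usats.com",
  "SUBJECT"    => "Pizza order",
  "First Name" => "Urb",
);

SeparateSystemVariables($FormVariables, $SystemVariables);
print_r($SystemVariables);
print_r($FormVariables);
```

After the call, only the user's First Name entry is left in $FormVariables, while recipient and subject carry the submitted values and action keeps its default.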

Summary

We're well on our way with our generic form process script but, alas, we're running out of space. Let's wrap it up for now and add the bells and whistles next time. We'll also add some field validation in the next episode.

There are two scripts for you to try. The first, demo-form.php, is a test form with some of the field building routines from a previous article. The second, process-form.php, has the complete code to do all the things discussed in this article. In addition, it dumps some additional useful information.

As usual, please let me know what types of things you would like to see in these articles.

Urb LeJeune is a 25-year programming veteran with over 10 years of Internet experience thrown in for good measure.



Copyright © 2009 O'Reilly Media, Inc.