 Published on ONLamp.com (http://www.onlamp.com/)


Testing Web Apps Effectively with twill

by Michele Simionato
11/03/2005

You have just finished your beautiful web application, with lots of pages, links, forms, and buttons; you have spent weeks making sure that everything works fine, that it handles the special cases correctly, that the user cannot crash your system no matter what she does.

Now you are happy and ready to ship, but at the last minute the customer asks for a change. You have the time to apply the change, but you lack the time--and the will--to go through another testing ordeal. You ship anyway, hoping that your last little fix didn't break some other part of the application. The result is that a hidden bug shows up on the first day of use.

If you recognize yourself in this situation, then this article is for you. If not, I am sure you will still find something interesting in the topics covered below.

To Test or Not to Test

Let me begin with a brief recollection of how I became interested in testing and what I have learned in the last couple of years.

I have been aware of the importance of testing from the beginning, and I have heard about automatic testing for years. However, having heard about automatic testing is not the same as doing automatic testing, nor is it the same as doing automatic testing well. It takes some time and experience to get into the testing mood, as well as to be able to challenge some widespread misconceptions.

For instance, when I began studying test-driven methods, I had picked up two wrong ideas: that unit tests were the only way to do automatic testing, and that everything, including the user interface, had to be tested.

After some experience, I quickly realized that unit tests were not the only tool, nor were they the best tool to test my application effectively. (I am an early adopter and supporter of doctests. See, for instance, my talk at the ACCU conference.) To overcome the second misconception, I needed some help.

The help came from an XP seminar I attended last year, where I actually asked the question, "How do I test the user interface of a web application so that when the user clicks on a given page, she gets the expected result?"

The answer was, "You don't. Why do you want to test that your browser is working?"

The case for not testing everything

The answer made me rethink many things. Obviously I was well aware from the beginning that full test coverage is a myth, but I still thought a programmer should try to test as much as he can.

This isn't the right approach. Instead, it is important to discriminate among the infinite number of things that could be tested and to focus on the things that are your responsibility.

If your customer wants feature x, you must be sure feature x is there. If, in order to get feature x, you need to rely on features x1, x2 ... xn, you don't need to test all of them. Test only the feature that you need to implement. Don't test that the browser is working--it's not your job.

For instance, in the case of a web application, you can interact with it indirectly via the HTTP protocol, or directly via the internal API. If you check that when the user clicks on button b the application calls method m and displays result r, you are testing both your application and the correctness of the HTTP protocol implementation, in both the browser and the server. This is way too much. You may rely on the HTTP protocol and test just the API: check that calling method m returns the right result r.

Of course, a similar view can apply to GUIs. In the same vein, you must test that the interface to the database you wrote is working, but you don't need to test that the database itself is working--it's not your responsibility.

The basic point is to separate the indirect testing of the user interface--via the HTTP protocol--from the testing of the inner API. To this aim, it's important to write your application in such a way that you can test the logic independently of the user interface. Working in this way, you have the additional bonus of being able to change the user interface later, without having to change a single test for the logic part.
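To make this concrete, here is a minimal sketch of what such a separation might look like; the module and function names are hypothetical and do not come from any real framework. The logic lives in a plain function with its own doctest, while the web layer only converts HTTP parameters into a call to that function.

# logic.py (hypothetical module): pure application logic, no HTTP involved
def compute_total(prices, discount=0.0):
    """Return the total cost of an order, applying an optional discount.

    >>> compute_total([10.0, 5.0], discount=0.5)
    7.5
    """
    return sum(prices) * (1.0 - discount)

# weblayer.py (hypothetical module): a thin adapter over the logic
def order_total_view(form):
    # 'form' is the parsed query string, e.g. as returned by cgi.parse_qs:
    # {'price': ['10.0', '5.0'], 'discount': ['0.5']}
    prices = [float(p) for p in form.get('price', [])]
    discount = float(form.get('discount', ['0'])[0])
    return str(compute_total(prices, discount))

The tests for compute_total never touch HTTP; if you later replace the web layer (or the whole user interface), they keep running unchanged.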

The problem is that typically a customer will give his specifications in terms of the user interface. He will tell you, "There must be a page where the user will enter her order, then she will enter her credit card number, and then the system must send a confirmation email ..."

This kind of specification is a very high-level test--a functional test--that you must convert to a low-level test. For example, you may have unit tests telling you that the ordered item has been registered in the database, that the send_confirmation_email method has been called, and so on.
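Such a low-level test might look more or less like the following sketch; process_order, FakeDatabase, and FakeMailer are hypothetical stand-ins for your own order-processing code and its collaborators.

import unittest

def process_order(item, card, database, mailer):
    "Hypothetical application logic: register the order, then send a confirmation."
    order_id = database.register(item, card)
    mailer.send_confirmation_email("user@example.com", order_id)
    return order_id

class FakeDatabase:
    "Records the orders in memory instead of hitting a real database."
    def __init__(self):
        self.orders = []
    def register(self, item, card):
        self.orders.append((item, card))
        return len(self.orders)  # the order id

class FakeMailer:
    "Records the confirmation emails instead of sending them."
    def __init__(self):
        self.sent = []
    def send_confirmation_email(self, address, order_id):
        self.sent.append((address, order_id))

class TestOrderProcessing(unittest.TestCase):
    def test_order_is_registered_and_confirmed(self):
        db, mailer = FakeDatabase(), FakeMailer()
        order_id = process_order("book", "4111-....", db, mailer)
        self.assertEqual(len(db.orders), 1)  # the ordered item was registered
        self.assertEqual(mailer.sent, [("user@example.com", order_id)])  # the email was "sent"

if __name__ == '__main__':
    unittest.main()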

The conversion requires some thinking and practice, and it is an art more than a science. Actually, I think that the art of testing is not in how to test but rather in what to test. The best advice and answer to "How do I test a web application?" is probably "Make a priority list of the things you would like to test, and test as little as possible".

For instance, never test the details of the implementation. If you make this mistake (as I did at the beginning), your tests will get in your way at refactoring time, having exactly the opposite of the intended effect. Generally speaking, some good advice is: don't spend time testing third-party software; don't waste time testing code whose API is likely to change; and keep the UI tests separate from the application logic tests.

Ideally, you should be able to determine the minimal set of tests you need to make your customer happy, and then restrict yourself to those tests.

The case for testing everything

The previous advice is nice and reasonable, especially in an ideal world where third-party software is bug free and everything is configured correctly. Unfortunately, the real world is a bit different.

For instance, you must know whether your application fails on some buggy browser, or whether it cannot work in specific circumstances with some database. Also, you may have a nice and comprehensive test suite that runs flawlessly on your development machine, but the application may not work correctly when installed on a different machine, because the database could be installed improperly, or the mail server settings could be incorrect, or the internet connection could be down. In the same vein, if you really want to be sure that when the user--with a specific browser in a specific environment--clicks on that button she gets that result, you have to emulate exactly that situation.

It looks as if this advice takes you back to square one, where you need to test everything. Hopefully you've learned something in the process: whereas in principle you would like to test everything, in practice you can effectively prioritize your tests, focusing on some more than on others, and splitting them into categories to run separately at different times.

You definitely need to test that the application is working as intended when deployed on a different machine. From the failures of these installation tests, you may also infer what is wrong and correct the problem. Keep these installation tests--those of the environment where your software is running--decoupled from the unit tests checking the application logic. If you're sure that the logic is right, then you are also sure that the problems are in the environment, and you can focus your debugging skills on the right area.
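As an illustration, here is a rough sketch of such an installation test; the host names are made up, and it only checks that the services the application depends on are reachable on their standard ports.

import socket

def check_service(host, port, name):
    "Try to open a TCP connection to the given service and report the outcome."
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)
    try:
        try:
            s.connect((host, port))
            print "%s: OK" % name
        except socket.error, e:
            print "%s: NOT REACHABLE (%s)" % (name, e)
    finally:
        s.close()

# hypothetical deployment machines, standard PostgreSQL and SMTP ports
check_service("db.internal.example.com", 5432, "database")
check_service("mail.internal.example.com", 25, "mail server")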

In any case, you need to have both high-level tests (functional, integration, and installation tests) and low-level tests (unit tests, doctests). High-level tests include those of the user interface. In particular, you need a test to make sure that if a user clicks on x he gets y, so you are sure that the internet connection, the web server, the database, the mail server, your application, and the browser all work nicely together. Be careful, however, not to focus too much on these global kinds of tests. You don't need to write thousands of these high-level tests if you already have many specific low-level tests checking that the logic and the various components of your application are working.

How to Test the User Interface

Having structured your application properly, you need a smaller number of user interface tests, but you still need at least a few. How do you write those tests, then?

There are two possibilities: the hard way and the easy way.

The hard way is just doing everything by hand, using your favorite programming language's web libraries to perform GET and POST requests and verify the results. The easy way is to leverage tools built by others. Of course, internally these tools work just by calling the low-level libraries, so it is convenient to say a couple of words about the hard way, just to understand what is going on in case the high-level tools give you some problem. Moreover, there is always the possibility that you need something more customized, and knowledge of the low-level libraries can be valuable.

The interaction between the user and a web application passes through the HTTP protocol, so it is perfectly possible to simulate the action of a user clicking on a browser just by sending to the server an equivalent HTTP request (ignoring the existence of JavaScript for the moment).

Any modern programming language has libraries to interact with the HTTP protocol; here, I will give examples in Python, since it is a common and readable language for web programming. Python's urllib libraries manage the interaction with the Web. There are two of them: urllib, which works in the absence of authentication, and urllib2, which can also manage cookie-based authentication. A complete discussion of these two libraries would take a long time, but explaining the basics is pretty simple. I will give just a couple of recipes based on urllib2, the newer and more powerful of the two.

The support for cookies in Python 2.4 has improved (essentially by including the third-party ClientCookie library), so you may not be aware of the trick I am going to explain, even if you have used the urllib libraries in the past. So, don't skip the next two sections. ;)

Recipe 1: How to send GET and POST requests

Suppose you want to access a site that does not require authentication. Making a GET request is pretty easy: at the interpreter prompt, just type

>>> from urllib2 import urlopen
>>> page = urlopen("http://www.example.com")

Now you have a filelike object that contains the HTML code of the page http://www.example.com/:

>>> for line in page: print line,
<HTML>
<HEAD>
  <TITLE>Example Web Page</TITLE>
</HEAD> 
<body>  
<p>You have reached this web page by typing "example.com",
"example.net",
  or "example.org" into your web browser.</p>
<p>These domain names are reserved for use in documentation and are not available 
  for registration. See <a href="http://www.rfc-editor.org/rfc/rfc2606.txt">RFC 
  2606</a>, Section 3.</p>
</BODY>
</HTML>

If you try to access a nonexistent page or your internet connection is down, you will get an urllib2.URLError instead. Incidentally, this is why the urllib2.urlopen function is better than the older urllib.urlopen, which would just silently retrieve a page containing the error message.

You can easily imagine how to use urlopen to check your web application. For instance, you could retrieve a page, extract all the links, and verify that they refer to existing pages; or verify that the retrieved page contains the right information, for instance by matching it with a regular expression. In practice, urlopen (possibly coupled with a third-party HTML parsing tool, such as Beautiful Soup) gives you all the fine-grained control you may wish for.
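For example, here is a rough sketch of a link checker built on urlopen and a regular expression; it is good enough for simple pages, although a real HTML parser such as Beautiful Soup would be more robust.

import re
from urllib2 import urlopen, URLError

link_re = re.compile(r'href="(http[^"]+)"', re.IGNORECASE)

def check_links(url):
    "Extract the absolute links from a page and report the ones that cannot be opened."
    html = urlopen(url).read()
    for link in link_re.findall(html):
        try:
            urlopen(link)
        except URLError, e:
            print "BROKEN: %s (%s)" % (link, e)
        else:
            print "OK: %s" % link

check_links("http://www.example.com")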

Moreover, urlopen lets you make a POST request: just pass the query string as the second argument to urlopen. As an example, I will make a POST to http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets, a page containing the example form that comes with Quixote, a nice, small, Pythonic web framework.

>>> page = urlopen("http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets",
...        "name=MICHELE&password=SECRET&time=1118766328.56")
>>> print page.read()
<html>
<head><title>Quixote Widget Demo</title></head>
<body>
<h3>You entered the following values:</h3>
<table>
  <tr><th align="left">name</th><td>MICHELE</td></tr>
  <tr><th align="left">password</th><td>SECRET</td></tr>
  <tr><th align="left">confirmation</th><td>False</td></tr>
  <tr><th align="left">eye colour</th><td><i>nothing</i></td></tr>
  <tr><th align="left">pizza size</th><td><i>nothing</i></td></tr>
  <tr><th align="left">pizza toppings</th><td><i>nothing</i></td></tr>
</table>
<p>It took you 163.0 sec to fill out and submit the form</p>
</body>
</html>

Now page will contain the result of your POST. Notice that I had to explicitly pass a value for time, which is a hidden widget in the form.

That was easy, wasn't it?

Recipe 2: Managing authentication

If the site requires authentication, things are slightly more complicated, but not much--at least if you have Python 2.4 installed. In order to manage cookie-based authentication procedures, you need to import a few utilities from urllib2:

>>> from urllib2 import build_opener, HTTPCookieProcessor, Request

Notice that HTTPCookieProcessor is new in Python 2.4: if you have an older version of Python you need a third-party library such as ClientCookie.

build_opener and HTTPCookieProcessor create an opener object that can manage the cookies sent by the web server:

>>> opener = build_opener(HTTPCookieProcessor)

The opener object has an open method that can be used to retrieve the web page corresponding to a given request. The request itself is encapsulated in a Request object, which is built from the URL address, the query string, and some HTTP header information. In order to generate the query string, it is pretty convenient to use the urlencode function defined in urllib (not in urllib2):

>>> from urllib import urlencode

urlencode generates the query string from a dictionary or a list of pairs, taking care of the quoting and escaping rules the HTTP protocol requires. For instance:

>>> urlencode(dict(user="MICHELE", password="SECRET"))
'password=SECRET&user=MICHELE'

Notice that the order is not preserved when you use a dictionary (quite obviously), but this is usually not an issue. Now define a helper function:

>>> def urlopen2(url, data=None, user_agent='urlopen2'):
...     """Can be used to retrieve cookie-enabled Web pages (when 'data' is
...     None) and to post Web forms (when 'data' is a list, tuple or dictionary
...     containing the parameters of the form).
...     """
...     if hasattr(data, "__iter__"):  # a dict, list or tuple, but not a string
...         data = urlencode(data)
...     headers = {'User-Agent' : user_agent}
...     return opener.open(Request(url, data, headers))

With urlopen2, you can POST your form in just one line. On the other hand, if the page you are posting to does not contain a form, you will get an HTTPError:

>>> urlopen2("http://www.example.com", dict(user="MICHELE", password="SECRET"))
Traceback (most recent call last):
  ...  
HTTPError: HTTP Error 405: Method Not Allowed

If you just need to perform a GET, simply leave out the second argument to urlopen2. You can even fake a browser by passing a convenient user agent string, such as Mozilla or Internet Explorer. This is pretty useful if you want to make sure that your application works with different browsers.
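For instance, still using the urlopen2 helper defined above, a plain GET pretending to come from a Mozilla browser looks like this (the user agent string is just an example):

>>> page = urlopen2("http://www.example.com",
...                 user_agent="Mozilla/5.0 (compatible; my-test-suite)")
>>> "Example Web Page" in page.read()
True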

Using these two recipes, it is not that difficult to write your own web testing framework. Still, you may be better off by leveraging the work of somebody else.

Testing the Easy Way: twill

I am a big fan of mini languages--small languages written to perform a specific task. (See, for instance, my O'Reilly article on the graph-generation language dot.) I was very happy when I discovered that there is a nice little language expressly designed to test web applications. Actually, there are two implementations of it: Titus Brown's twill and Cory Dodt's Python Browser Poseur (PBP).

PBP came first, but twill seems to be developing faster. At the time of this writing, twill is still pretty young (I am using version 0.7.1), but it already works pretty well in most situations. Both PBP and twill are based on tools by John J. Lee such as mechanize (inspired by Perl's WWW::Mechanize), ClientForm, and ClientCookie. Twill also uses Paul McGuire's pyparsing. However, you don't need to install these libraries; twill includes them as zipped libraries (leveraging the new Python 2.3 zipimport module). As a consequence, twill installation is absolutely obvious and painless, being nothing more than the usual python setup.py install.

The simplest way to use twill is interactively from the command line. Here's a simple session example:

$ twill-sh
-= Welcome to twill! =-

current page:  *empty page*
                 
>> go http://www.example.com
==> at http://www.example.com

>> show
<HTML>
<HEAD>
  <TITLE>Example Web Page</TITLE>
</HEAD>
<body>
<p>You have reached this web page by typing "example.com",
"example.net",
  or "example.org" into your web browser.</p>
<p>These domain names are reserved for use in documentation and are not available
  for registration. See <a href="http://www.rfc-editor.org/rfc/rfc2606.txt">RFC
  2606</a>, Section 3.</p>
</BODY>
</HTML>

Twill recognizes a few intuitive commands, such as go, show, find, notfind, echo, code, back, reload, agent, follow, and a few others. The example shows how to access a particular HTML page and display its content.

The find command matches the page against a regular expression; thus

>> find "Example Web Page"

is a test asserting that the current page contains what you expect. Similarly, the notfind command asserts that the current page does not match the given regular expression.

The other twill commands are pretty obvious: echo <message> prints a message on standard output; code <http_error_code> checks that you are getting the right HTTP error code (200 if everything is alright); back allows you to go back to the previously visited page; reload reloads the current page; agent <user-agent> lets you change the current user agent, thus faking different browsers; follow <regex> finds the first matching link on the page and visits it.

To see a full list of the commands, type help at the prompt; EOF or Ctrl-D allows you to exit.
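For instance, a short interactive session exercising a few of these commands against the example.com page could look like the following sketch (only the output of go is shown here; the exact echo of the other commands depends on the twill version):

>> go http://www.example.com
==> at http://www.example.com

>> code 200
>> find "Example Web Page"
>> notfind "under construction"
>> go http://www.rfc-editor.org/rfc/rfc2606.txt
==> at http://www.rfc-editor.org/rfc/rfc2606.txt

>> back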

Once you have tested your application interactively, it is pretty easy to cut and paste your twill session and convert it to a twill script. Then you can run your twill script in a batch process:

$ twill-sh mytests.twill

As you may imagine, you can put more than one script in the command line and test many of them at the same time. Because twill is written in Python, you can control it from Python entirely and you can even extend its command set just by adding new commands in the commands.py module.

At the moment, twill is pretty young, and it lacks the capability to convert scripts into unit tests automatically, so that you could easily run entire suites of regression tests. However, it is not that difficult to implement that capability yourself, and it is likely that twill will gain good integration with unittest and doctest in the future.
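Until that happens, here is a do-it-yourself sketch: it wraps every twill script found in a hypothetical tests/ directory in a unit test that runs it through twill-sh, assuming (check this for your version) that twill-sh exits with a nonzero status when a script fails.

import glob, unittest
from subprocess import call  # the subprocess module is new in Python 2.4

class TestTwillScripts(unittest.TestCase):
    pass

def make_test(script):
    def test(self):
        # assumption: twill-sh returns a nonzero exit status on failure
        self.assertEqual(call(["twill-sh", script]), 0, "failure in %s" % script)
    return test

# generate one test method per twill script in the tests/ directory
for script in glob.glob("tests/*.twill"):
    name = "test_" + script.replace("/", "_").replace(".", "_")
    setattr(TestTwillScripts, name, make_test(script))

if __name__ == "__main__":
    unittest.main()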

Retrieving and submitting web forms

Twill is especially good at retrieving and submitting web forms. The form-related features rely on the commands showforms, formvalue (abbreviated fv), formclear, and submit, all of which are pretty straightforward.

showforms shows the forms contained in a web page. For example, try:

>> go http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets
>> showforms
Form #1
## __Name______ __Type___ __ID________ __Value__________________
   name         text      (None)
   password     password  (None)
   confirm      checkbox  (None)       [] of ['yes']
   colour       radio     (None)       [] of ['green', 'blue', 'brown', 'ot ...
   size         select    (None)       ['Medium (10")'] of ['Tiny (4")', 'S ...
   toppings     select    (None)       ['cheese'] of ['cheese', 'pepperoni' ...
   time         hidden    (None)       1118768019.17
1               submit    (None)       Submit
current page: http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets

Notice that twill does a good job of emulating a browser, so it fills in the hidden time widget automatically; with urlopen, I had to pass it explicitly.

Unnamed forms get an ordinal number to use as a form ID in the formvalue command, which fills a field of the specified form with a given value. You can give many formvalue commands in succession; if you are a lazy typist, you can also use fv as an alias for formvalue:

>> fv 1 name MICHELES
current page: http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets
>> fv 1 password SECRET
current page: http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets

formclear resets all the fields in a form, and submit lets you press a Submit button, thus submitting the form:

>> submit 1
current page: http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets

A simple show will convince you that twill has submitted the form. The best way to understand how it works is just to experiment on your own. The base distribution contains a few examples you can play with.
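To give an idea of a complete script, here is a sketch that fills in and submits the Quixote demo form and then checks the response; the text passed to find is taken from the output shown earlier in this article.

# demo_form.twill -- fill in and submit the Quixote demo form
go http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/widgets
code 200
fv 1 name MICHELE
fv 1 password SECRET
submit 1
code 200
find "You entered the following values"
find "MICHELE"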

Enlarging the Horizon

In this article I have shown two easy ways to test your web application: by hand with the urllib libraries, or with a simple tool such as twill. There is much more under the sun: there are many sophisticated web testing frameworks out there, including enterprise-oriented ones, with lots of features and steep learning curves. Here, on purpose, I have decided to start small and to discuss the topic from a do-it-yourself angle, because often the simplest things work best. Sometimes you don't need the sophistication; sometimes your preferred testing framework lacks the feature you wish for; and sometimes it is just buggy. If you need something more sophisticated, a great source for everything testing-related is Grig Gheorghiu's blog.

A new framework that is especially interesting is Selenium, which is also useful for testing Plone applications. Selenium is really spectacular, because it is based on JavaScript and it really tests your browser, clicking on links, submitting forms, and opening pop-up windows, all in real time. It completely emulates the user experience at the highest possible level. It also gives you all kinds of bells and whistles, eye candy, and colored HTML output (which you may or may not like, but which will surely impress your customer if you are demonstrating that the application conforms to the specifications). I cannot do justice to Selenium in a few lines; perhaps I should write an entire article on it when I find the time. For the moment, I make no promises, and I refer you to the available documentation.

Michele Simionato is employed by Partecs, an open source company headquartered in Rome. He is actively developing web applications in the Zope/Plone framework.


