ONLamp.com

Getting Loopy with Python and Perl

Aside from the lack of assignment in the loop condition, Python's while loops work almost identically to their Perl counterparts:



Perl:

   
    $done = 0;
    while (!$done) {
        $input = getInput();
        if (defined($input)) {
            process($input);
        } else {
            $done = 1;
        }
    }

Python:


    done = False
    while not done:
        input = getInput()
        if input is not None:
            process(input)
        else:
            done = True

Note that False is a new built-in value in Python 2.2.1 (in Python 2.3, boolean operations will return True/False instead of 1/0). In general, any empty or zero value is treated as false: False, None, 0, "", (), [], {}, and class instances whose __nonzero__() or __len__() method returns 0.
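The truth-value rules above can be checked with a short sketch. It is written for a current Python 3 interpreter, where the `__nonzero__()` hook is spelled `__bool__()`; the `EmptyBox` and `Disabled` classes are hypothetical names made up for illustration.

```python
class EmptyBox:
    """Hypothetical container: treated as false because __len__ returns 0."""
    def __len__(self):
        return 0


class Disabled:
    """Hypothetical flag object: treated as false via __bool__."""
    def __bool__(self):
        return False

    # Python 2 looks for __nonzero__ instead, so alias it for older interpreters.
    __nonzero__ = __bool__


falsy_values = [False, None, 0, "", (), [], {}, EmptyBox(), Disabled()]
# Non-empty values are true even when their contents look "empty".
truthy_values = [True, 1, "0", (0,), [None], {"k": 0}, object()]

all_falsy = [not value for value in falsy_values]
all_truthy = [bool(value) for value in truthy_values]
```

Both lists should come out as all `True`, which also shows that a string such as `"0"` counts as true in Python because it is not empty.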

Note also that in Perl, if getInput() starts returning a hash or array instead of a scalar, the while loop needs to be modified, whereas the Python loop will continue to work fine with a dict or a list. That's because in Python, everything is done with reference semantics (called "binding" in Python because one does not access references directly); one can get the same effect in Perl by explicitly using references.
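As a rough illustration of that point, the sketch below runs the same loop over a source that hands back a string, a list, and a dict in turn. `make_source()` and `drain()` are hypothetical helpers standing in for the article's `getInput()` and `process()`; they are assumptions, not part of the original example.

```python
def make_source(items):
    """Hypothetical stand-in for getInput(): returns each item, then None."""
    pending = list(items)

    def get_input():
        if pending:
            return pending.pop(0)
        return None

    return get_input


def drain(get_input):
    """Run the article's while loop and collect every value it receives."""
    seen = []
    done = False
    while not done:
        value = get_input()
        if value is not None:
            seen.append(value)  # stands in for process(value)
        else:
            done = True
    return seen


# The loop body is unchanged whether the source returns a string, list, or dict.
results = drain(make_source(["scalar", ["a", "list"], {"a": "dict"}, [], {}]))
```

Because the test is `is not None` rather than a plain truth test, the empty list and empty dict at the end are still processed instead of ending the loop early.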

In addition to iterators, Python 2.2 added generators. Generators are functions that return an iterator. It may help to think of them as something like resumable closures. Here's a subset implementation of grep:


    from __future__ import generators
    import sys
    import re

    def grep(seq, regex):
        regex = re.compile(regex)
        for line in seq:
            if regex.search(line):
                yield line

The "yield" keyword turns an ordinary function into a generator; calling the generator returns an iterator that wraps the function's code. Each time a yield executes (returning a result just like "return" does), the generator is paused, but its stack frame (including all local variables) is retained, pending another call to the iterator's next() method. The iterator does not execute any of the generator's code until the first call to next().
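The pause-and-resume behavior can be observed directly. The following is a minimal sketch for a current Python 3 interpreter, where the built-in `next(it)` replaces the `it.next()` method used in Python 2.2; `countdown()` and its `log` parameter are hypothetical and exist only to record when the body runs.

```python
def countdown(start, log):
    """Hypothetical generator that records when its body actually runs."""
    log.append("started")
    current = start  # local state kept alive between next() calls
    while current > 0:
        yield current
        log.append("resumed after %d" % current)
        current -= 1
    log.append("finished")


log = []
it = countdown(2, log)  # creates the iterator; no body code has run yet
log_before_first_next = list(log)

first = next(it)   # runs the body up to the first yield
second = next(it)  # resumes just after that yield, locals intact

try:
    next(it)
    exhausted = False
except StopIteration:  # raised once the function body returns
    exhausted = True
```

The log stays empty until the first `next()` call, and `current` keeps its value across calls because the stack frame is preserved while the generator is paused.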

Returning to this specific example, calling grep() returns an iterator. Each call to the iterator's next() method returns one match. This happens implicitly in a for loop:


    regex = r"\s+\w+\s+"
    for line in grep(sys.stdin, regex):
        sys.stdout.write(line)
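To make the implicit calls visible, here is a sketch of roughly what the for loop does on your behalf, assuming a current Python 3 interpreter (built-in `next()` rather than the `.next()` method). It repeats the grep generator so it can run on its own, and uses a small hypothetical list of lines in place of sys.stdin.

```python
import re


def grep(seq, regex):
    regex = re.compile(regex)
    for line in seq:
        if regex.search(line):
            yield line


# Hypothetical sample input standing in for sys.stdin.
lines = ["one word here\n", "nospace\n", "a b c\n"]

# Roughly what "for line in grep(...)" does behind the scenes.
matches = []
iterator = grep(lines, r"\s+\w+\s+")
while True:
    try:
        line = next(iterator)  # each call returns one matching line
    except StopIteration:      # signals that the generator is finished
        break
    matches.append(line)

# The ordinary for-loop form, collected into a list for comparison.
same_matches = [line for line in grep(lines, r"\s+\w+\s+")]
```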

The advantage of all this is that it's simple, clean, and efficient. You can write straightforward code--but it doesn't need to hog memory by creating an entire list before returning. Generators can be pipelined; imagine recreating other Unix utilities, such as uniq. Even if the source is a list, at least there's no need to create temporary lists.
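As one possible take on the pipelining idea, the sketch below chains grep into a simplified uniq. The `uniq()` generator is an assumption about how such a utility might look (it only drops lines identical to the one immediately before, with no options), and the sample lines are made up.

```python
import re


def grep(seq, regex):
    regex = re.compile(regex)
    for line in seq:
        if regex.search(line):
            yield line


def uniq(seq):
    """Sketch of Unix uniq: skip lines equal to the one just before them."""
    previous = object()  # sentinel that never equals a real line
    for line in seq:
        if line != previous:
            yield line
        previous = line


# Hypothetical sample input; any iterable of lines (such as a file) would do.
lines = [
    "error: disk\n",
    "error: disk\n",
    "ok\n",
    "error: net\n",
    "error: disk\n",
]

# Each stage pulls one line at a time from the stage before it,
# so no intermediate list is built until list() collects the final result.
pipeline = uniq(grep(lines, r"^error"))
output = list(pipeline)
```

Like the real uniq, this only collapses adjacent duplicates, so the final "error: disk" line survives.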

The "from __future__" statement is needed because "yield" is a new keyword in Python 2.2 and must be enabled explicitly. When Python 2.3 is released, "from __future__ import generators" will no longer be required, but keeping it won't harm anything (so the same code runs under Python 2.2 or any later version).

For more information about converting Perl code to Python, see the Python/Perl Phrasebook. (The phrasebook is seven years old and out of date, but still quite useful.)

Special thanks to Cathy Mullican (menolly@spy.net) for refreshing my badly outdated Perl knowledge.

Aahz has been programming in Python for more than three years and enjoys teaching people how to use Python.




