ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


Untwisting Python Network Programming

by Kendrew Lau
08/10/2006

Networking is an essential task in software applications nowadays. Many programming languages have support for network programming to various extents. While the core libraries of most languages allow low-level socket programming, other libraries and third-party extensions often facilitate higher-level Internet protocols. For example, Java has a standard API to access sockets and send emails (via the URL class), but other common Internet protocols are available through external libraries such as JavaMail, JTelnet, and JTA. Perl has a native Unix-style interface to sockets and convenient core modules such as IO::Socket, Net::POP3, Net::SMTP, and Net::FTP. To access Telnet programmatically in Perl, the CPAN module Net::Telnet is a good option.

Python is an exception: it has very good built-in support for both sockets and various Internet protocols, including POP3, SMTP, FTP, Telnet, and Gopher. The Python core distribution contains many networking modules, such as socket, poplib, ftplib, smtplib, telnetlib, and gopherlib. Being components of a high-level programming language, these modules encapsulate the complexity of the underlying protocols and are very convenient to use. Twisted is another powerful networking framework which, unlike the core networking modules, takes an asynchronous approach to network programming.

This article introduces basic client-side networking using both the core Python modules and the Twisted framework. As examples, I will show how to send, receive, and delete emails, and how to conduct Telnet sessions. I have written two functionally equivalent programs, one using the core modules (mail-core.py) and another using Twisted (mail-twisted.py); both start, stop, and interact with a server to process emails. These programs work with any standards-compliant SMTP and POP3 servers for sending and retrieving emails. Starting and stopping the server are specific to the Apache James mail server, which I chose as a local testing server because it is easy to install and can be shut down through a Telnet session.

Sending Mails with smtplib

The core module smtplib provides an SMTP class that encapsulates the interactions with an SMTP server to send emails. Essentially, you create an SMTP instance with the address of the server specified, invoke sendmail to send the mail(s), and finally close the SMTP connection with the quit method:

def sendMail(host, addr, to, subject, content):
    import smtplib
    from email.MIMEText import MIMEText

    print "Sending mail from", addr, "to", to, "...",
    server = smtplib.SMTP(host)
    msg    = MIMEText(content)

    msg["Subject"] = subject
    msg["From"]    = addr
    msg["To"]      = to

    server.sendmail(addr, [to], msg.as_string())
    server.quit()
    print "done."

The sendmail method takes the sender's address, the recipients' addresses in a list, and the content of the message. Because the message content should be in MIME format, use the convenient class MIMEText (from email.MIMEText) to create a text message. Create a MIMEText with the message body, set the subject, from, and to headers with the dictionary-like syntax, and convert the message to a string with as_string when passing it to sendmail.

This testing server accepts and relays emails from anyone, which is the default configuration of the Apache James server. Although that is fine for a local testing server, most, if not all, SMTP servers on the Internet mandate certain security measures to fight spam. To use an SMTP object with a server that requires authentication, invoke the login(username, password) method before sending any message. This method keeps silent on success and raises an exception when the authentication fails.
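As a rough sketch of the authenticated case, the function below logs in before sending. It is written for current Python 3 (where MIMEText lives in email.mime.text) rather than the Python 2 of the listing above. The helper names build_message and send_authenticated, the host name, and the addresses are all made up for illustration, and a real server may demand more than a plain login.

```python
import smtplib
from email.mime.text import MIMEText

def build_message(addr, to, subject, content):
    """Build a plain-text MIME message with the usual headers."""
    msg = MIMEText(content)
    msg["Subject"] = subject
    msg["From"] = addr
    msg["To"] = to
    return msg

def send_authenticated(host, username, password, msg):
    """Log in, send one message, and close the session (hypothetical helper)."""
    server = smtplib.SMTP(host)
    try:
        # login() raises smtplib.SMTPAuthenticationError if the server
        # rejects the credentials, so sendmail is never reached in that case.
        server.login(username, password)
        server.sendmail(msg["From"], [msg["To"]], msg.as_string())
    finally:
        server.quit()

# Building the message needs no network access.
msg = build_message("alice@example.com", "bob@example.com", "Hello", "Test body")

# Sending needs a reachable server, so the call is left commented out:
# send_authenticated("mail.example.com", "alice", "secret", msg)
```

Wrapping the call to send_authenticated in a try/except for smtplib.SMTPAuthenticationError is one way to report a bad password without a traceback.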

Retrieving Emails with poplib

Retrieving emails is inherently more complex than sending: it involves identification of the user, getting the number of messages, and retrieving or deleting the messages. The POP3 class in the poplib module provides methods to do this:

def display(host, user, passwd, deletion = 0):
    import poplib, email, string
    pop3 = poplib.POP3(host)
    pop3.user(user)
    pop3.pass_(passwd)

    num = len(pop3.list()[1])
    print user, "has", num, "messages"

    format = "%-3s %-15s %s"

    if num > 0:
        if deletion:
            print "Deleting", num, "messages",
            for i in range(1, num+1):
                pop3.dele(i)
                print ".",
            print " done."
        else:
            print format % ("Num", "From", "Subject")
            for i in range(1, num+1):
                text = string.join(pop3.top(i, 1)[1], "\n")
                msg = email.message_from_string(text)
                print format % (i, msg["From"], msg["Subject"])
    pop3.quit()

As with the SMTP class, you create a POP3 instance by specifying the mail server, and you terminate the POP3 session with the quit method. Log in to the POP3 account with the methods user and pass_, passing the user name and password respectively.

To get the number of messages on the server, you may use either the stat method or the list method. The stat method returns a tuple of two values: the number of messages and the total mailbox size. The list method, used in the example, returns the message list in the following form:

(response, ['mesg_num octets', ...], octets)

Here the second value is a list of strings, each stating the number and size of a message in the mailbox. The number of messages is the size of this list of strings.
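To make the shape of that tuple concrete, here is a small sketch that unpacks a list-style result into a message count and a table of sizes. It targets Python 3, where poplib returns the entries as bytes. The function name parse_list_response and the sample response are invented for the example.

```python
def parse_list_response(result):
    """Turn a tuple shaped like the result of POP3.list() into (count, sizes).

    sizes maps each message number to its size in octets.
    """
    response, listings, octets = result
    sizes = {}
    for entry in listings:
        # Each entry looks like b"1 1205": message number, then size.
        number, size = entry.split()[:2]
        sizes[int(number)] = int(size)
    # The number of messages is simply the length of the list.
    return len(listings), sizes

# A made-up response in the form described above.
sample = (b"+OK 2 messages", [b"1 1205", b"2 340"], 16)
count, sizes = parse_list_response(sample)
```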

Retrieve a message wholly with the method retr(message_index) or partially with top(message_index, number_of_lines). Both methods take a 1-based message index and return the message in the following form:

(response, ['line', ...], octets)

Typically, the second value in the result is the most interesting: it is a list of the lines of the message. The example program needs only the header information, so it builds a string containing all the headers plus one line of body text with string.join(pop3.top(i, 1)[1], "\n"), then uses email.message_from_string to parse that string into a MIME message object. From the MIME message, you can fetch the email subject, sender address, and receiver address(es) via the standard dictionary syntax.
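The header-parsing step can be tried without a mail server. The sketch below, written for Python 3, feeds a hand-made tuple in the same form as a top result to the email package. The function name summarize and the sample message are assumptions for illustration only.

```python
import email

def summarize(top_result):
    """Return (sender, subject) from a tuple shaped like the result of POP3.top()."""
    response, lines, octets = top_result
    # Python 3's poplib returns each line as bytes without a line ending,
    # so join the lines before handing them to the parser.
    raw = b"\n".join(lines)
    msg = email.message_from_bytes(raw)
    return msg["From"], msg["Subject"]

# A made-up result: headers, a blank separator line, one line of body text.
sample = (
    b"+OK",
    [
        b"From: alice@example.com",
        b"To: bob@example.com",
        b"Subject: Test mail",
        b"",
        b"First line of the body",
    ],
    90,
)
sender, subject = summarize(sample)
```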

To delete a message in the mailbox, use the method dele(message_index). It sets the deletion flag of the specified message and then the server actually does the deletion when you close the POP3 session. If you have a program calling dele on a message but the message persists, check that the program actually invokes quit.
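The mark-then-quit sequence is easy to check offline with a stand-in object. In this Python 3 sketch, delete_all expects anything that offers the stat, dele, and quit methods of poplib.POP3. FakePOP3 is a hypothetical stub that only records the calls it receives; it is not part of poplib.

```python
def delete_all(pop3):
    """Mark every message for deletion, then end the session with quit()."""
    count, _size = pop3.stat()
    for i in range(1, count + 1):   # POP3 message numbers start at 1
        pop3.dele(i)                # only sets the deletion flag
    # The server removes the flagged messages when the session is closed.
    pop3.quit()
    return count

class FakePOP3:
    """Hypothetical stand-in for poplib.POP3 so the sketch runs offline."""

    def __init__(self, count):
        self.count = count
        self.calls = []

    def stat(self):
        return (self.count, 0)

    def dele(self, which):
        self.calls.append(("dele", which))

    def quit(self):
        self.calls.append(("quit",))

fake = FakePOP3(3)
deleted = delete_all(fake)
```

The recorded calls end with quit, which is the step that is easy to forget in a real program.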

Conducting Telnet with telnetlib

The Apache James mail server comes with a batch file to start it. As a convenient option, the example program can invoke the batch file via a call to os.system.

def start():
    import os
    print "Starting server...",
    os.system("c:/apps/bin/james.bat")
    print "done."

def stop(host):
    import telnetlib
    print "Stopping server...",

    telnet = telnetlib.Telnet(host, 4555)
    telnet.read_until("Login id:")
    telnet.write("root\n")
    telnet.read_until("Password:")
    telnet.write("root\n")
    telnet.read_until("Welcome")
    telnet.write("shutdown\n")
    telnet.close()

    print "done."

Shutting down the server, on the other hand, requires a Telnet session. Manually, you can do it through a Telnet client program connected to port 4555 of the server. Enter root as the user name, root as the password, and then the command shutdown. To shut down the server programmatically in Python, use the class Telnet from the core module telnetlib.

To establish a Telnet session, create a Telnet object with the Telnet server address and port number specified. To interact with the Telnet server, interleave calls to the methods read_until and write. The read_until method reads data from the server until it receives a specified string, while the write method sends a string to the server. Note that the Apache James mail server expects each line from the Telnet client to end with a newline, so the example appends "\n" to the end of the parameter to write. The Telnet session ends with the invocation of the method close on the Telnet object.
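The read-until/write rhythm is not specific to telnetlib. The Python 3 sketch below reproduces it over a plain socket, partly because telnetlib was removed from the standard library in Python 3.13. The class name Conversation is invented here, the code does no Telnet option negotiation, and a local socket pair stands in for the connection to the server, so treat it as an illustration of the pattern rather than a replacement for the module.

```python
import socket

class Conversation:
    """Minimal read-until/write helper over a connected socket."""

    def __init__(self, sock):
        self.sock = sock
        self.buffer = b""        # bytes received but not yet consumed

    def read_until(self, marker):
        """Block until marker arrives; return the data up to and including it."""
        while marker not in self.buffer:
            chunk = self.sock.recv(1024)
            if not chunk:        # peer closed the connection
                break
            self.buffer += chunk
        head, sep, tail = self.buffer.partition(marker)
        self.buffer = tail       # keep anything that arrived after the marker
        return head + sep

    def write(self, data):
        self.sock.sendall(data)

# Offline demonstration: one end plays the server, the other the client.
client_sock, server_sock = socket.socketpair()
server_sock.sendall(b"Login id:")

client = Conversation(client_sock)
prompt = client.read_until(b"Login id:")
client.write(b"root\n")          # trailing newline, as the James server expects
reply = server_sock.recv(1024)

client_sock.close()
server_sock.close()
```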

The Twisted Framework

Beyond the core Python modules, Twisted is a networking framework in a different style. As demonstrated in the previous examples, Python's core networking modules use a procedural approach. To perform a task, your code must invoke a method that holds the thread of execution until the task either completes successfully or fails. Such a method is straightforward to use since it is synchronous and communicates any results of execution to its caller by means of a return value. The code to perform a sequence of tasks simply invokes the appropriate methods one by one, possibly with some control structures and checking of return values.

On the other hand, the Twisted framework adopts an asynchronous approach to invocation. A method call schedules a task to run in the framework's execution thread and returns control to its caller immediately, before the task has completed. An event-driven mechanism communicates the results of the execution: the object returned by the asynchronous method call lets you register success and failure callbacks, which are invoked when the scheduled task completes successfully or fails, respectively. Performing a sequence of tasks is more complex, since you must typically define each task in a method registered as the success callback on the object returned by the previous task's method.

In Twisted, get used to receiving a Deferred object (from twisted.internet.defer) from an asynchronous method call. There is also a DeferredList object, which watches a group of Deferreds and reports when they have completed or failed. The reactor (from twisted.internet) is the engine that controls the framework's execution thread. Start it and shut it down with its run and stop methods, respectively.
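The callback registration just described can be pictured with a toy class. MiniDeferred below is a deliberately simplified stand-in written for this illustration in Python 3. It is not Twisted's Deferred, it omits most of what the real class does (for instance, callbacks added after the result arrives are never run), and its method names are only loosely modeled on Twisted's.

```python
class MiniDeferred:
    """Toy illustration of the idea behind a Deferred (not Twisted's code)."""

    def __init__(self):
        self._handlers = []      # (on_success, on_failure) pairs, in order

    def add_callbacks(self, on_success, on_failure):
        self._handlers.append((on_success, on_failure))
        return self

    def callback(self, result):
        """Called by the framework when the task succeeds."""
        self._run(result, failed=False)

    def errback(self, error):
        """Called by the framework when the task fails."""
        self._run(error, failed=True)

    def _run(self, value, failed):
        # Pass the value down the chain; an exception switches to the
        # failure side, and a handler that returns normally switches back.
        for on_success, on_failure in self._handlers:
            handler = on_failure if failed else on_success
            try:
                value = handler(value)
                failed = False
            except Exception as exc:
                value, failed = exc, True

log = []

def on_ok(count):
    log.append("got %d messages" % count)

def on_error(error):
    log.append("failed: %s" % error)

# The caller gets the object back at once and registers its callbacks...
d = MiniDeferred().add_callbacks(on_ok, on_error)
# ...and some time later the framework delivers the result.
d.callback(3)

d2 = MiniDeferred().add_callbacks(on_ok, on_error)
d2.errback(ValueError("boom"))
```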

Sending Mails the Twisted Way

The sendmail function (from twisted.mail.smtp) is the workhorse for sending mails in Twisted. It takes parameters similar to those of the sendmail method of an smtplib.SMTP object, with the SMTP host name as an additional first parameter. The return value is a Deferred object to which you can attach success and failure callbacks.

def sendMail(host, addr, to, subject, content):
    from twisted.mail.smtp import sendmail
    from email.MIMEText import MIMEText

    print "Sending mail from", addr, "to", to, "..."
    msg = MIMEText(content)

    msg["Subject"] = subject
    msg["From"]    = addr
    msg["To"]      = to

    return sendmail(host, addr, [to], msg.as_string())

def main():
    ...
    elif "s" == sys.argv[1]:
        print len(addrs), "messages to be sent."
        dlist = []

        for addr in addrs:
            toaddr = user + "@" + host
            text   = "Test mail: " + addr + " to " + toaddr
            dlist.append( sendMail(host, addr, toaddr, text, text) )

        DeferredList(dlist).addBoth(lambda _: reactor.stop())
        reactor.run()

To send more than one mail with sendmail, you don't need to attach callbacks to each of the returned Deferreds. Instead, put the Deferreds in a list and create a DeferredList object from it. The code then attaches a callback to that single DeferredList via its addBoth method. With the default settings, the DeferredList fires once every sendmail action has finished, whether it succeeded or failed. The callback simply stops Twisted's execution thread by calling reactor.stop(). Note that the tasks scheduled or registered by sendmail and addBoth do not execute until the call to reactor.run(), which starts Twisted's execution thread.

Retrieving Mails with Twisted

Programming a POP3 client to retrieve mails in the Twisted framework is more complex and takes more code. The logic to retrieve mails is all in a class that subclasses POP3Client (from twisted.mail.pop3client). Due to the event-driven nature of Twisted, it's easier to define a method for each step and to register success and failure callbacks on each step's result, to enter the next step and to handle any error, respectively.

from twisted.internet.protocol import ClientCreator
from twisted.mail.pop3client import POP3Client

class MyPOP3Client(POP3Client):

    def serverGreeting(self, msg):
        POP3Client.serverGreeting(self, msg)
        self.login(self.myuser, self.mypass).addCallbacks(
              self.do_stat, errorHandler)
              
    def do_stat(self, result):
        self.stat().addCallbacks(self.do_retrieve, errorHandler)

In the class MyPOP3Client, the first step in getting mails is the serverGreeting method, which Twisted invokes when the server greets the newly connected client. This method invokes the superclass's serverGreeting, and then logs in to the POP3 server with a user name and password. The login method returns a Deferred object, on which the code invokes addCallbacks to register the do_stat method (called upon successful login) and the errorHandler function (called on login error).

Similarly, the do_stat method invokes POP3Client's stat method to perform a POP3 STAT command, and registers do_retrieve as the next step. Because the call to stat is asynchronous, it cannot hand its results back to the caller as a return value. Instead, Twisted passes the results as an argument to the success callback registered on the Deferred that stat returns. The second parameter of do_retrieve (stats) is therefore a sequence whose first element is the number of messages in the POP3 account.

    def do_retrieve(self, stats):
        self.format       = "%-3s %-15s %s"
        self.num_messages = stats[0]
        self.cur_message  = 0

        print self.myuser, "has", self.num_messages, "messages"

        if self.num_messages > 0:
            if self.deletion:
                print "Deleting", self.num_messages, "messages",
                self.delete(0).addCallbacks(self.do_delete_msg, errorHandler)
            else:
                print self.format % ("Num", "From", "Subject")
                self.retrieve(0).addCallbacks(self.do_retrieve_msg, errorHandler)
        else:
            reactor.stop()

    def do_retrieve_msg(self, lines):
        msg = email.message_from_string("\r\n".join(lines))
        print self.format % (self.cur_message, msg["From"], msg["Subject"])
        self.cur_message += 1
        if (self.cur_message < self.num_messages):
            self.retrieve(self.cur_message).addCallbacks(
                    self.do_retrieve_msg, errorHandler)
        else:
            reactor.stop()

If there is no message in the mailbox, the code calls reactor.stop to tell Twisted to shut down. Otherwise, it invokes retrieve(0) to get the first message. Its success callback, do_retrieve_msg, handles the message by displaying its summary, and then retrieves the next message. Because the method do_retrieve_msg gets invoked for all subsequent messages, the code uses an instance variable, cur_message, to keep track of the current message number and to determine when it has handled all messages. When it has processed everything, it stops the Twisted main loop.

Because the logic of mail retrieval is similar to that of deletion, both features are in the same class, MyPOP3Client. The instance variable deletion denotes the current mode of working. You can see its initialization in __init__, along with the user name and password. More interesting is the setting of allowInsecureLogin to true, which permits login over a non-encrypted transport to a server that offers no authentication challenge.


    def __init__(self):
        self.myuser   = user
        self.mypass   = passwd
        self.deletion = deletion
        self.allowInsecureLogin = True
        
    def do_delete_msg(self, str):
        print ".",
        self.cur_message += 1
        if (self.cur_message < self.num_messages):
            self.delete(self.cur_message).addCallbacks(
                    self.do_delete_msg, errorHandler)
        else:
            print " done."
            q = self.quit()
            q.addCallbacks(lambda _: reactor.stop(), errorHandler)

To delete a mail, call the delete method of class POP3Client. Similar to the core poplib module, this method just marks the mail for deletion, and the actual deletion occurs when you send the POP3 command QUIT to the server, as with the quit method. Finally, Twisted's execution thread stops when the quitting action completes.

pop3 = ClientCreator(reactor, MyPOP3Client)
d    = pop3.connectTCP(host, 110)
reactor.run()

With the desired mail handling implemented in class MyPOP3Client, you can launch the client. As its name suggests, the class ClientCreator (from twisted.internet.protocol) provides a convenient way to start a communication client. This code passes the reactor and the MyPOP3Client class to create a ClientCreator, and begins the mail retrieval by calling connectTCP with the specified server and port number. Twisted's execution loop then kicks off with reactor.run().

Invoking do_retrieve_msg repeatedly for all messages is tedious and verbose compared to the DeferredList mechanism, which keeps track of several actions and is notified when all of them complete, as in the case of sending mails. However, collecting the Deferreds from multiple calls to POP3Client's retrieve in a DeferredList simply does not work in Twisted (versions 2.2.0 and 2.4.0): the success callback never gets invoked (see mail-twisted.py).

Doing Telnet with Twisted

Twisted can power a Telnet client in a way similar to, but simpler than, the POP3 client. The Telnet conversation logic goes in a subclass of Telnet (from twisted.conch.telnet).

def stop(host):
    from twisted.internet.protocol import ClientCreator
    from twisted.conch.telnet import Telnet

    class MyTelnet(Telnet):
        def dataReceived(self, data):
            if "Login id:" in data:
                self._write("root\n")
            elif "Password:" in data:
                self._write("root\n")
            elif "Welcome" in data:
                d = self._write("shutdown\n")

        def connectionLost(self, reason):
            reactor.stop()
            print "done."

    mytelnet = ClientCreator(reactor, MyTelnet)
    d = mytelnet.connectTCP(host, 4555)
    reactor.run()

The class MyTelnet overrides two methods of class Telnet. The first method, dataReceived, is called when data arrives at the client. It checks the data received and calls the _write method to send the user name, the password, or the shutdown command to the server, as appropriate. The second method is connectionLost, which Twisted calls when the server closes the Telnet session. In that case, the program simply terminates the Twisted execution loop. The Telnet client starts by using the ClientCreator class to connect to port 4555 of the James mail server.

When to Be Twisted?

The two functionally equivalent programs, one using Python core modules and the other using the Twisted framework, differ significantly from each other in programming style and in the amount of code. So when should you use each of the two options?

For basic programs such as the command-line client of this example, the Python core networking modules are more desirable due to their simplicity and performance advantages. However, many real-world networking programs are far more complex, and there Twisted's asynchronous programming model is more effective. For example, BitTorrent, the popular peer-to-peer file sharing client that performs massively parallel downloading of data chunks from different sources, uses Twisted. Twisted also works well in programs with a graphical user interface (GUI), because its asynchronous nature fits more seamlessly with the event-driven programming models of modern GUI frameworks. In fact, Twisted has integration with popular GUI frameworks including PyGTK, Qt, Tkinter, WxPython, and Win32.

The other area where Twisted shines is in server programming. A typical network server uses multithreading so that it can handle multiple clients concurrently. The asynchronous mechanism of Twisted relieves server programs of creating and managing those threads. In addition, Twisted provides several protocols on which to build new networking services, enabling rapid development of complex servers. One such project is Quotient, which adopts Twisted to build a multiprotocol messaging server that supports a variety of protocols and services including SMTP, POP3, IMAP, webmail, and SIP.

Kendrew Lau is a consultant in Hong Kong, with a focus on Java, Linux, and other OSS technologies.



Copyright © 2009 O'Reilly Media, Inc.