Python DevCenter

Interactive Debugging in Python

I started the debugging session over by quitting with the (q)uit command, and restarted the debugger by kicking off the script again:

(Pdb) q
{'1': 2, '0': 1, '3': 4, '2': 3, '5': 6, '4': 5, '7': 8, '6': 7}
****************************************
****************************************
line>> 1 2 3 4
{'1': 2, '0': 1, '3': 4, '2': 3, '5': 6, '4': 5, '7': 8, '6': 7}
****************************************
****************************************
line>> 1 2 3 4
{'1': 2, '0': 1, '3': 4, '2': 3}
****************************************
jmjones@bean:~/debugger $ python example_debugger_pm.py example_debugger.data
****************************************
line>> 1 2 3 4 5 6 7 8 9 10
{'1': 2, '0': 1, '3': 4, '2': 3, '5': 6, '4': 5, '7': 8, '6': 7, '9': \
    10, '8': 9}
****************************************
****************************************
line>> 1 2 3 4 5 6 7 8 9 10
> /home/jmjones/debugger/example_debugger_pm.py(17)walk_string()
-> self.tmp_dict[key] = int(l[i])
(Pdb)

Now, I could give a command to show the exception at this line:

(Pdb) !for e in sys.exc_info(): print "EXCEPTION", e
EXCEPTION exceptions.ValueError
EXCEPTION invalid literal for int(): 9
EXCEPTION <traceback object at 0x402ef0f4>

This is the same command as before, except that I prefixed it with a ! to tell the debugger that this is Python code to evaluate rather than a debugger command. Because the debugger knows it is Python code, it doesn't go looking for a do_for() method and generate an exception.

The line self.tmp_dict[key] = int(l[i]) raised a ValueError exception because it could not convert "9" to an int? That is really weird. Sometimes, things aren't exactly what they seem, though. What exactly was the input? Take a look:

(Pdb) !print l[i]
9

That looks pretty normal to me. What happens when I ask for the value of l[i] without feeding it to print?

(Pdb) !l[i]
'9\x089'

The mystery is pretty much over. The input data contained an unprintable character that hid itself when printed. I did the same thing with some_string (which was a single input line from the data file):

(Pdb) !print some_string
1 2 3 4 5 6 7 8 9 10

This looked pretty normal as well. Here it is when I don't print it:

(Pdb) !some_string
'1 2 3 4 5 6 7 8 9\x089 10'

The \x08 character is \b, a backspace, so when the code prints out the input line, it prints 1 2 3 4 5 6 7 8 9, backspaces over the 9, and then prints out 9 10. If you ask the interpreter what the value of the input line is, it shows you the string value--including the hex value of unprintable characters.
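The same effect can be reproduced outside the debugger. Here is a minimal sketch, written for Python 3 rather than the Python 2 used in this article, that compares print with repr on a string containing a backspace. The name suspicious is mine, chosen for illustration only.

```python
# A string with an embedded backspace, like the corrupt input line above.
some_string = "1 2 3 4 5 6 7 8 9\b9 10"

# print() sends the raw characters to the terminal, which acts on the
# backspace, so the line usually looks clean on screen.
print(some_string)

# repr() is what the interpreter shows for a bare expression:
# unprintable characters appear as escapes such as \x08.
print(repr(some_string))

# Splitting on whitespace and keeping the non-digit tokens shows
# which token carries the hidden character.
tokens = some_string.split()
suspicious = [t for t in tokens if not t.isdigit()]
print([repr(t) for t in suspicious])
```

When a value looks right but behaves wrong, looking at its repr is usually a quicker check than printing it.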

The ValueError exception is entirely expected in this situation. Here is the result, at a debugger prompt, of trying to get the integer value of the same kind of string:

(Pdb) int("9\b9")
*** ValueError: invalid literal for int(): 9
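For readers who want to try this without a debugger session, the sketch below does the same conversion in a plain script and inspects the exception with sys.exc_info(), the same call used at the (Pdb) prompt earlier. It targets Python 3, so the exact error message differs from the Python 2 output shown in this article, and describe_failure is a hypothetical helper name, not part of the example script.

```python
import sys

def describe_failure(text):
    """Try int(text). Return None on success, or a short summary on failure."""
    try:
        int(text)
    except ValueError:
        # sys.exc_info() returns the (type, value, traceback) triple
        # for the exception currently being handled.
        exc_type, exc_value, exc_tb = sys.exc_info()
        return exc_type.__name__, repr(text)
    return None

print(describe_failure("9"))       # a clean digit converts without error
print(describe_failure("9\b9"))    # the embedded backspace makes int() fail
```

Returning repr(text) rather than the text itself keeps the hidden character visible in whatever log or message the summary ends up in.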

The problem with the example code boils down to a couple of things:

  • It handles exceptions at too high a level.
  • It doesn't clean up properly when the code hits an exception.

The first item (improper exception handling) caused the code to return a dictionary built from only part of the corrupt input line. The exception fired on the ninth value, after keys 0 through 7 had already been stored, so the result held only those keys.

The second item (improper cleanup) caused the code to use the dictionary that existed from the corrupt line (line 2 from the input data file) for the following line (line 3 from the input data file). That is why the first 1 2 3 4 line contained dictionary keys of 0 through 7 rather than 0 through 3. Interestingly, if the code handled the exception of converting a string to an int properly, cleaning up would not have been an issue.
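The two problems above can be reproduced in a few lines. The sketch below is a hypothetical reconstruction written for Python 3; LeakyConverter is my own stand-in and is not the article's original script. It assumes only what the output shows: the exception is caught one level above the loop, and the temporary dictionary is reset only on success.

```python
class LeakyConverter:
    """Hypothetical stand-in for the buggy class, for illustration only."""

    def __init__(self):
        self.tmp_dict = {}

    def walk_string(self, some_string):
        for i, token in enumerate(some_string.split()):
            self.tmp_dict[str(i)] = int(token)  # may raise ValueError midway
        result = self.tmp_dict
        self.tmp_dict = {}                      # reset happens only on success
        return result

    def get_number_dict(self, some_string):
        try:
            return self.walk_string(some_string)
        except Exception:
            return self.tmp_dict                # partial result, and no reset

converter = LeakyConverter()
# dict(...) takes a snapshot, because the leaked dictionary is mutated later.
first = dict(converter.get_number_dict("1 2 3 4 5 6 7 8 9\b9 10"))
second = dict(converter.get_number_dict("1 2 3 4"))
third = dict(converter.get_number_dict("1 2 3 4"))

print(sorted(first))   # keys 0 through 7: the corrupt line stops early
print(sorted(second))  # keys 0 through 7 again: stale state from the bad line
print(sorted(third))   # keys 0 through 3: state was reset after a success
```

The second call inherits the leftover keys 4 through 7, which matches the symptom in the debugging session: a four-value line that reports eight keys.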

Here is a better version of the code, which handles the conversion exception where it occurs and cleans up after itself in the case of a catastrophic error:

#!/usr/bin/env python

import pdb
import string
import sys

class ConvertToDict:
    def __init__(self):
        self.tmp_dict = {}
        self.return_dict = {}
    def walk_string(self, some_string):
        '''walk given text string and return a dictionary. 
        Maintain state in instance attributes in case we hit an exception'''
        l = string.split(some_string)
        for i in range(len(l)):
            key = str(i)
            try:
                self.tmp_dict[key] = int(l[i])
            except ValueError:
                self.tmp_dict[key] = None
        return_dict = self.tmp_dict
        self.return_dict = self.tmp_dict
        self.reset()
        return return_dict
    def reset(self):
        '''clean up'''
        self.tmp_dict = {}
        self.return_dict = {}
    def get_number_dict(self, some_string):
        '''do slightly better exception handling here'''
        try:
            return self.walk_string(some_string)
        except Exception:
            # if we hit an exception, we can rely on tmp_dict
            # being a backup to the point of the exception
            return_dict = self.tmp_dict
            self.reset()
            return return_dict

def main():
    ctd = ConvertToDict()
    for line in file(sys.argv[1]):
        line = line.strip()
        print "*" * 40
        print "line>>", line
        print ctd.get_number_dict(line)
        print "*" * 40
    
if __name__ == "__main__":
    main()

The output from running it is:

jmjones@bean:~/debugger $ python example_debugger_fixed.py example_debugger.data
****************************************
line>> 1 2 3 4 5 6 7 8 9 10
{'1': 2, '0': 1, '3': 4, '2': 3, '5': 6, '4': 5, '7': 8, '6': 7, '9': \
    10, '8': 9}
****************************************
****************************************
line>> 1 2 3 4 5 6 7 8 9 10
{'1': 2, '0': 1, '3': 4, '2': 3, '5': 6, '4': 5, '7': 8, '6': 7, '9': \
    10, '8': None}
****************************************
****************************************
line>> 1 2 3 4
{'1': 2, '0': 1, '3': 4, '2': 3}
****************************************
****************************************
line>> 1 2 3 4
{'1': 2, '0': 1, '3': 4, '2': 3}
****************************************

That looks much better. If the script cannot convert a string to an integer, it puts None in the dictionary.
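As noted earlier, handling the conversion error properly removes the cleanup problem altogether. The sketch below is one way to show that in Python 3, under my own assumptions: to_number_dict is a hypothetical function, not part of the article's script, and it keeps its dictionary in a local variable so that no state survives between calls.

```python
def to_number_dict(line):
    """Map token positions (as strings) to ints, or None for bad tokens."""
    result = {}  # local state: nothing can leak into the next call
    for i, token in enumerate(line.split()):
        try:
            result[str(i)] = int(token)
        except ValueError:
            # Mark the unparseable value instead of abandoning the line.
            result[str(i)] = None
    return result

print(to_number_dict("1 2 3 4 5 6 7 8 9\b9 10"))
print(to_number_dict("1 2 3 4"))
```

The class-based version above keeps the partial result in an instance attribute as a backup in case of an unexpected exception. This variant gives that up in exchange for having nothing to reset.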

Conclusion

The Python debugger is an indispensable tool when you have a problem that is eluding other efforts to root it out. It is not an everyday tool, but when you need it, you need it.

Jeremy Jones is a software engineer who works for Predictix. His weapon of choice is Python.

