[Tutor] Unravelling tricky map/filtering code [wordnet.py:Word.__init__.extractVerbFrames()]

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Sun Dec 15 14:42:01 2002


Hi everyone,


For fun, I started browsing through the bug reports in PyWordnet:

    http://sourceforge.net/tracker/?atid=390151&group_id=27422&func=browse

just to see if there was anything interesting in there.  Something caught
my eye:

http://sourceforge.net/tracker/index.php?func=detail&aid=448189&group_id=27422&atid=390151



This looked fairly understandable; the bug that they're mentioning
involves the fact that int() doesn't do automatic base conversion unless
we give an explicit base:

###
>>> int("0e")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for int(): 0e
>>> int("0e", 16)
14
###

so the corrected code explicitely sets 16 as the base for converting
certain numbers, so that the hexidecimal numbers are treated properly.


However, the error report also mentioned one ugly expression that reminded
me too much of a Lisp expression gone wild.  I'm paste it here so that we
can see what slightly obfuscated Python code might look like:

###
def extractVerbFrames(index, vfTuples):
    return tuple(map(lambda t:string.atoi(t[1]),
                     filter(lambda t,i=index:
                                string.atoi(t[2], 16) in (0, i),
                            vfTuples)))
###

Perhaps I'm being too harsh.  *grin* In the original code, this was all
laid out on a single line, so it looked much worse before I indented this.


I do think the code can be improved for readability, and thought it might
be interesting to bring it up on Tutor, and see if we can massage this
into something that's fairly easy to read.


The original code is meant to be compatible with older versions of Python
1.52. The above uses the functional builtins map() and filter() in
combination, as well as the deprecated string.atoi().  Unlike us, the
authors of PyWordnet don't have the freedom to use the latest wiz-bang
features in Python.


If we had the luxury of doing this using those wiz-bang features in
Python, extractVerbFrames() might look like this:

###
def extractVerbFrames(index, vfTuples):
    return tuple([int(t[1]) for t in vfTuples
                  if int(t[2], 16) in (0, index)])
###

There are a few things that we've done here: we've substituted
string.atoi() --- the old string-to-integer function --- with the simpler
int() function.

Also, we've subsituted the combination map/filter with a list
comprehension.  List comprehensions are specifically designed to handle
this map/filter situation, so if the authors had the freedom to use it,
this would probably be a good way to go.



But if we had to stick with compatibility, we can still break down that
nested map/filter into separate statements, by giving names to each
original subexpression.

###
def extractVerbFrames(index, vfTuples):
    filtered_tuples = filter(lambda t, i=index:
                                 string.atoi(t[2], 16) in (0,i),
                             vfTuples)
    frames = map(lambda t: string.atoi(t[1]), filtered_tuples)
    return tuple(frames)
###

so that there's less nesting involved.  Part of me is wavering on this,
since it's merely a asthetic judgement call: I just felt that the
readability of that code was suffering because it was trying to do
everything in one single statement; I'd rather have it take up a few more
lines vertically.

That part about 'i=index' still remains because older versions of Python
did not have nested scopes: without nested-scopes, those lambdas wouldn't
be able to refer to the 'index' parameter.  Old code had to fake nested
scopes by binding those variables to default parameters --- it was an ugly
kludge, but it worked.



For someone who doesn't know about lambdas or map or filter, this might
still look a little intimidating.  We can avoid the functionals altogether
and use a for loop:

###
def extractVerbFrames(index, vfTuples):
    vFrames = []
    for t in vfTuples:
        if string.atoi(t[2], 16) in (0, i):
            vFrames.append(string.atoi[t[1]])
    return tuple(vFrames)
###

and although this isn't "functional" in flavor, I'd actually prefer this
over the map/filter for this particular case, partially because the
complexity of those two lambda expressions was a bit high.



I hope this helps!