The Perils of PyContract (and Generators)

Nick Daly nick.m.daly at gmail.com
Wed Aug 5 01:06:45 EDT 2009


So, just in case anybody else runs into this strange, strange
happening, I thought I might as well document it somewhere Google can
find it.  Contracts for Python[0] and generators don't necessarily play
well together.  This mail comes in three parts: first, the example code
that didn't work at all; second, a more in-depth view of the situation
while you mull over the solution; and third, the answer.

The trouble with this example is how little there is to it: even with
the (2 lines of) boilerplate code required to make PyContract work, the
entire example is only about a dozen lines long.  The expressions after
the "post:" directive are evaluated every time the function returns, and
if any one of them is false, PyContract raises an exception, preventing
the calling code from acting on bad data:

    import os

    def find_files(name, directory="."):
        """Finds files with the sub-string name in the name.

        post:
            forall(__return__, lambda filename: name in filename)

        """
        for root, dirs, files in os.walk(directory):
            for the_file in files:
                if name in the_file:
                    yield the_file

    import contract
    contract.checkmod(__name__)

That's it.  We're just walking the directory and yielding the next
matching item each time the generator is advanced.  However, if we try
executing this simple function in ipy (Interactive Python, not Iron
Python), nothing works as you'd expect:

In a directory containing 4 files:

    ["one fish", "two fish", "red fish", "blue fish"]

    >>> find_files("fish")
    <generator object at 0x...>

    >>> z = find_files("fish")
    >>> z.next()
    StopIteration: ...


Apparently our generator object is already empty by the time it's
returned.  After adding a print statement right before the yield, we see:

    >>> z = find_files("o")
    "one fish"
    "two fish"
    <generator object at 0x...>

    >>> z.next()
    StopIteration: ...

Why are they printing during the function?  Why is everything printing
before the function even returns??  Has my python croaked?  (How actors
and serpents could both behave like amphibians is beyond me)

What makes this so confusing is that when the yield statement is
replaced with a return statement that hands back a list, everything
works exactly as you might expect.  It's perfect.  Unit tests don't
fail, doctests are happy, and dependent code works exactly as
advertised.  When you turn it back into a generator though, well,
getting an empty generator back for everything isn't helpful.

Getting irritated at it, I eventually just decided to comment out and
remove as many lines as possible (11) without actually breaking the
example, at which point it started working...  What?

The problem actually lies in the contract.  Generally, PyContract
shouldn't affect the return values or in any way modify the code, and
it doesn't, as long as the function returns a list of values (the way
the code had in fact originally been written).  However, the contract
mentioned above is quite wrong for a generator.  PyContract's forall
function checks every value in the returned sequence, and it operates
on the actual return value, not a copy.  A list survives that check
untouched, but a generator can only be iterated once.  Thus, when the
forall function verifies all the values, it pulls every value out of
the generator, emptying it before it's returned to the caller.
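
For anyone who wants to see the mechanism without installing
PyContract, here's a minimal sketch in plain Python.  The names
forall_like, fish_names and has_fish are made up for illustration;
forall_like is only my stand-in for a forall-style check, not
PyContract's actual code.

```python
def forall_like(values, predicate):
    """Hypothetical stand-in for a forall-style check (not PyContract's code)."""
    for value in values:
        if not predicate(value):
            return False
    return True


def fish_names():
    """Hypothetical generator standing in for find_files()."""
    for filename in ["one fish", "two fish", "red fish", "blue fish"]:
        yield filename


def has_fish(filename):
    return "fish" in filename


# With a list, the check reads the values and leaves them in place.
as_list = list(fish_names())
list_ok = forall_like(as_list, has_fish)
list_left = len(as_list)

# With a generator, the same check pulls every value out of it.
as_gen = fish_names()
gen_ok = forall_like(as_gen, has_fish)
gen_left = len(list(as_gen))
```

Both checks pass, but the list still holds its four names afterwards
while the generator has nothing left to give, which is the same
StopIteration seen in the session above.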

Correcting the above example involves doing nothing more than
simplifying the contract:

    post:
        name in __return__

So, in conclusion, generators and PyContract's forall() function don't
mix, and PyContract doesn't operate off of a copy of your parameters
unless you explicitly tell it so (I don't think it ever operates off a
copy of your return value).
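
If you still want every yielded value checked, one option (a sketch
only, and nothing to do with PyContract's API) is to do the check
lazily, as each value goes by, instead of up front.  Keep in mind that
in plain Python anything that iterates over the generator object
advances it, membership tests with "in" included, so the check has to
live inside the iteration.  The helper names checked and find_names
below are hypothetical, and find_names filters a plain list of names
rather than walking a directory so the example stays self-contained.

```python
def checked(values, predicate):
    """Hypothetical helper: yield each value only after it passes predicate."""
    for value in values:
        if not predicate(value):
            raise ValueError("postcondition failed for %r" % (value,))
        yield value


def find_names(name, candidates):
    """Hypothetical stand-in for find_files() that filters a list of names."""
    matches = (c for c in candidates if name in c)
    # Wrap the generator instead of iterating over it here.
    return checked(matches, lambda filename: name in filename)


z = find_names("fish", ["one fish", "two fish", "red fish", "blue fish"])
first = next(z)       # the builtin next() works on Python 2.6+ and 3
remaining = list(z)   # the other values are still there for the caller
```

The trade-off is that a bad value is only reported when the caller
reaches it, not when the function returns.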

Nick

0: http://www.wayforward.net/pycontract/


