doctest problem with null byte

Tim Peters tim.one at comcast.net
Fri Jan 26 01:33:56 EST 2007


[Stuart D. Gathman]
> I am trying to create a doctest test case for the following:
>
> def quote_value(s):
>     """Quote the value for a key-value pair in Received-SPF header
>     field if needed.  No quoting needed for a dot-atom value.
>
>     >>> quote_value(r'abc\def')
>     '"abc\\\\\\\\def"'
>     >>> quote_value('abc..def')
>     '"abc..def"'
>     >>> quote_value('')
>     '""'
>     >>> quote_value('-all\x00')
>     '"-all\\\\\\\\x00"'
>     ...
> 
> However, doctest does *not* like the null byte in the example (yes,
> this happens with real life input):
> **********************************************************************
> File "/home/stuart/pyspf/spf.py", line 1453, in spf.quote_value
> Failed example:
>     quote_value('-all')
> Exception raised:
>     Traceback (most recent call last):
>       File "/var/tmp/python2.4-2.4.4c1-root/usr/lib/python2.4/doctest.py", line 1248, in __run
>         compileflags, 1) in test.globs
>     TypeError: compile() expected string without null bytes
> **********************************************************************
> 
> How can I construct a test case for this?

As the docs say, doctest examples are parsed /twice/:  once when Python 
compiles the module and creates string objects, and again when doctest 
extracts and passes example substrings to the compile() function (which 
you can see in your traceback).

The first time, the \x00 in the

     quote_value('-all\x00')

portion of your docstring got changed into an honest-to-goodness NUL 
byte.  Passing that to compile() can't work.
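
You can watch that first pass happen outside of doctest. This is only a 
sketch for a current Python 3, where the exception type is not the 
TypeError shown in the 2.4 traceback above (so several types are caught); 
try_compile is a made-up helper name, not part of doctest:

```python
# Non-raw literal: Python turns \x00 into a real NUL character.
plain = "ord('\x00')"
# Raw literal: the four characters \, x, 0, 0 survive untouched.
raw = r"ord('\x00')"

def try_compile(source):
    """Return 'ok' if compile() accepts source, else the exception's name.

    Hypothetical helper for illustration only.
    """
    try:
        compile(source, "<example>", "eval")
        return "ok"
    except (TypeError, ValueError, SyntaxError) as exc:
        # Which type is raised depends on the Python version.
        return type(exc).__name__

print(len(plain), len(raw))
print(try_compile(raw))
print(try_compile(plain))
```

The raw version compiles; the other one is rejected because compile() is 
handed an embedded NUL.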

To alleviate that, and the "leaning toothpick syndrome" in your other 
examples, a simple approach is to make your docstring a raw string 
(stick the letter 'r' in front of the opening triple-quote).

For example, this works fine:

def f():
    r"""
    >>> len("\\")
    1
    >>> ord("\x00")
    0
    """

Because it's a raw string, Python does /not/ change the

    \x00

portion into a NUL byte when it compiles the module.  Instead the 
docstring continues to contain the 4-character substring:

    \x00

and that's what compile() needs to see.

To perform the same test without using a raw string requires slamming in 
more backslashes (or other ad-hoc escaping tricks):

def g():
    """
    >>> len("\\\\")
    1
    >>> ord("\\x00")
    0
    """


