[Python-Dev] Are undocumented exceptions considered bugs?

Nick Coghlan ncoghlan at gmail.com
Sat Mar 23 16:21:53 CET 2013


On Sat, Mar 23, 2013 at 4:05 AM, Stefan Bucur <stefan.bucur at gmail.com> wrote:
> Hi,
>
> I'm not sure this is the right place to ask this question, but I thought I'd
> give it a shot since it also concerns the Python standard library.

It's the right place to ask :)

> I'm writing an automated test case generation tool for Python programs that
> explores all possible execution paths through a program. When applying this
> tool to the urllib package in Python 2.7.3, it discovered input strings for
> which the urllib.urlopen(url) call would raise a TypeError.

That sounds like a really interesting tool.

> For instance:
>
> urllib.urlopen('\x00\x00\x00')
>
> [...]
>   File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 86, in urlopen
>     return opener.open(url)
>   File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 207, in open
>     return getattr(self, name)(url)
>   File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 462, in open_file
>     return self.open_local_file(url)
>   File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 474, in open_local_file
>     stats = os.stat(localname)
> TypeError: must be encoded string without NULL bytes, not str
>
> The urllib documentation only mentions that IOError is raised when the
> connection cannot be established. Since the input passed is a
> string (and not some other type), is the TypeError considered a bug (either
> in the documentation, or in the implementation)?

The general answer is that there are certain exceptions that usually
aren't documented because almost all code can trigger them if you pass
the right kind of invalid argument. For example, almost any API can
emit TypeError or AttributeError if you pass an instance of the wrong
type, and many can emit ValueError, IndexError or KeyError if you pass
an incorrect value. Other errors like SyntaxError, ImportError,
NameError and UnboundLocalError usually indicate bugs or environmental
configuration issues, so are also typically omitted when documenting
the possible exceptions for particular APIs.
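
As a rough illustration of the point above, the sketch below provokes
several of these exceptions from ordinary builtins. The helper name
provoke() and the sample calls are made up for this example; they are not
taken from any standard library documentation.

```python
def provoke(func, *args):
    """Call func(*args) and return the name of the exception raised, or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


# Each entry passes an invalid argument to an everyday operation.
results = {
    "len(5)": provoke(len, 5),                      # wrong type
    "int('abc')": provoke(int, "abc"),              # right type, bad value
    "[1, 2][5]": provoke([1, 2].__getitem__, 5),    # index out of range
    "{}['k']": provoke({}.__getitem__, "k"),        # missing key
    "None.upper()": provoke(lambda: None.upper()),  # missing attribute
}

for call, exc_name in sorted(results.items()):
    print("%-14s raises %s" % (call, exc_name))
```

The calls raise TypeError, ValueError, IndexError, KeyError and
AttributeError respectively, and none of those operations needs to list
them individually for the behaviour to be expected.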

In this specific case, the error message is
confusing-but-not-really-wrong, due to the "two-types-in-one" nature
of Python 2.x strings - 8-bit strings are used as both text sequences
(generally not containing NUL characters) and also as arbitrary binary
data, including encoded text (quite likely to contain NUL bytes).

I think a bug report for this would be appropriate, with the aim of
making that error message less confusing (though it's a fairly obscure
case).
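
One way a less confusing message could look is sketched below. This is
only an assumption about what a clearer check might do, not what urllib or
os.stat actually does; stat_checked() is a hypothetical helper, and it
assumes the path is a text string rather than bytes.

```python
import os


def stat_checked(path):
    """Hypothetical helper: reject NUL characters with an explicit message
    before handing the path to os.stat()."""
    if "\x00" in path:
        raise ValueError("path must not contain NUL characters: %r" % (path,))
    return os.stat(path)


try:
    stat_checked("\x00\x00\x00")
except ValueError as exc:
    # The message now names the actual problem instead of the string type.
    message = str(exc)
    print(message)
```

A caller still sees an exception for the invalid input, but the message
says what is wrong with the value instead of complaining about its type.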

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

