Thoughts on using isinstance

Thu Jan 25 13:08:34 EST 2007

Bruno Desthuilliers  <bdesth.quelquechose at free.quelquepart.fr> wrote:
> Matthew Woodcraft wrote:

>> Adding the validation code can make your code more readable, in that
>> it can be clearer to the readers what kind of values are being
>> handled.

> This is better expressed in the docstring. And if it's in the
> docstring, you can't be blamed for misuse.

I certainly agree that the description of the function's requirements on
its parameters is best placed in the docstring.

This is another place where the "don't validate, just try running the
code anyway" approach can cause problems: what should you put in the
docstring?

I don't think anyone would like to be fully explicit about the
requirements: you'd end up having to write things like "A string, or at
least anything that's iterable and hashable and whose elements are
single character strings, or at least objects which have an upper()
method which ...".

So in practice you end up writing "a string", and leave the rest of the
'contract' implicit. But that can lead to difficulties if people working
on the code have different ideas of what that implicit contract is -- is
it "a string, or anything else which works with the current
implementation", or perhaps "you may pass something other than a string
so long as you take responsibility for making it support all the
necessary operations, even if the implementation changes", or is there
some project-wide convention about how much like a string such things
have to be?

I think this kind of vagueness can work well within a lump of code which
is maintained as a piece, but it's good to divide up programs into
components with more carefully documented interfaces. And it's at that
level that I think doing explicit parameter validation can be helpful.

>> If you validate, you can raise an exception from the start of your
>> function with a fairly explicit message. If you don't validate,
>> you're likely to end up with an exception whose message is something
>> like 'iteration over non-sequence', and it might be raised from some
>> function nested several levels deeper in.

> And what is the stack backtrace for, actually ?

I'm not sure that you intended that as a serious question, but I'll
answer it anyway.

In an ideal world, the stack backtrace is there to help me work with
code that I'm maintaining. It isn't there to help me grub around in the
source of someone else's code which is giving me an unhelpful error
message. Just as, in an ideal world, I should be able to determine how
to correctly use someone else's code by reading its documentation rather
than its source.

I think this is a 'quality of implementation' issue. When you start
using Python you pretty rapidly pick up the idea that a message like
'len() of unsized object' from (say) a standard library function
probably just means that you didn't pass the value you intended to; but
that doesn't mean it's a good error message. These things do add up to
make the daily business of programming less efficient.
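
To make the difference concrete, here is a minimal sketch. The function
names are made up for illustration, and the exact wording of the
interpreter's own message varies between Python versions. The unchecked
version fails inside a helper with a generic message; the checked
version fails at the entry point and names the function and the
offending type.

```python
def _count_words(lines):
    # Helper one level down; it silently assumes `lines` is iterable.
    return sum(len(line.split()) for line in lines)


def summarise_unchecked(lines):
    """Return the number of words in lines (a list of strings)."""
    return _count_words(lines)


def summarise_checked(lines):
    """Return the number of words in lines (a list of strings)."""
    # Explicit check at the boundary, with a message aimed at the caller.
    if not isinstance(lines, (list, tuple)):
        raise TypeError(
            "summarise_checked() expects a list of strings, got %s"
            % type(lines).__name__
        )
    return _count_words(lines)


def error_message(func, arg):
    """Call func(arg) and return the TypeError text, or None if none."""
    try:
        func(arg)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # The first message comes from the helper, the second from the check.
    print(error_message(summarise_unchecked, 42))
    print(error_message(summarise_checked, 42))
```

Whether to accept only list and tuple here, or any iterable that is not
a string, is exactly the kind of contract decision the docstring has to
settle.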

>> The latter can be harder for the user of your function to debug (in
>> particular, it may not be easy to see that the problem was an invalid
>> parameter to your function rather than a bug in your function itself,
>> or corrupt data elsewhere in the system).

> docstrings and unit-tests should make it clear.

I don't see that either of those things removes the issues I described.

> Now if one wants to have to declare everything three times and write
> layers and layers of adapters and wrappers, well, he knows where to
> find Java !-)

Right. But in Python there is a middle ground between 'writing layers
and layers of adapters and wrappers' and 'never validate anything': put
explicit checks in the particular functions where they're likely to do
the most good.

For example, it's often helpful to explicitly validate if you're going
to store the parameters away and do the actual work with them later on.
Consider what happens if you pass garbage to urllib2.install_opener():
you'll get an obscure error message later on from a urlopen() call,
which will be rather less convenient to investigate than an error from
install_opener() would have been.
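
A rough sketch of the "store now, use later" case follows. Fetcher and
EchoOpener are hypothetical classes invented for this example; this is
not how urllib2 itself is written. The check is made when the value is
stored, so a bad value is reported against the call that supplied it
rather than against some later, unrelated call.

```python
class Fetcher:
    """Hypothetical component that stores an opener and uses it later."""

    def __init__(self):
        self._opener = None

    def install_opener(self, opener):
        # Validate at the point of storage: the traceback then points at
        # whoever passed the bad value, not at a later fetch() call.
        if not callable(getattr(opener, "open", None)):
            raise TypeError(
                "install_opener() needs an object with an open() method, "
                "got %r" % (opener,)
            )
        self._opener = opener

    def fetch(self, url):
        # By the time we get here the opener is known to have open().
        return self._opener.open(url)


class EchoOpener:
    """Hypothetical stand-in opener that does no network access."""

    def open(self, url):
        return "opened " + url


if __name__ == "__main__":
    fetcher = Fetcher()
    fetcher.install_opener(EchoOpener())
    print(fetcher.fetch("http://example.invalid/"))
    try:
        fetcher.install_opener("garbage")
    except TypeError as exc:
        print(exc)
```

This sketch checks for the one method it needs rather than using
isinstance; either way the point is that the failure happens at
install time.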

>> This might well lead to your program apparently completing
>> successfully but giving the wrong result (which is usually the kind
>> of error you most want to avoid).

> Compared to what C or C++ can do to your system, this is still a
> pretty minor bug - and probably one of the most likely to be detected
> very early

I disagree. What C or C++ will do, very often, is produce a segmentation
fault. That may well turn out to be hard to debug, but it's considerably
more likely to be detected early than a successful exit status with
incorrect output.
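
As a small illustration of the "completes successfully with the wrong
result" case (the function names are hypothetical): a function that
expects a list of strings will happily accept a single string, because
a string is itself an iterable of one-character strings.

```python
def format_recipients(names):
    """Return a comma-separated line built from names (a list of strings)."""
    return ", ".join(names)


def format_recipients_checked(names):
    """As format_recipients(), but reject a bare string."""
    # A lone string is the most likely mistake here, and it would not
    # raise on its own, so it is the case worth checking for explicitly.
    if isinstance(names, str):
        raise TypeError("names must be a list of strings, not a single string")
    return ", ".join(names)


if __name__ == "__main__":
    print(format_recipients(["alice", "bob"]))  # alice, bob
    print(format_recipients("bob"))             # b, o, b  (no error raised)
    try:
        format_recipients_checked("bob")
    except TypeError as exc:
        print(exc)
```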

-M-