[Tutor] Test Question
Dave Angel
davea at davea.name
Mon Jul 1 13:01:11 CEST 2013
On 07/01/2013 05:58 AM, John Steedman wrote:
> Good morning all,
>
> A question that I am unsure about. I THINK I have the basics, but I am not
> sure and remain curious.
>
> 1. What does this mean?
> >>> if my_object in my_sequence:
> ...
We can be sure what 'if' and 'in' mean, but not the other two items.
They're just names, and the behavior of the test will depend on the
types of the objects the names are bound to. By calling it my_sequence,
you're implying that the object is not only a collection, but an ordered
one. So if we trust the names, this will iterate through the sequence,
testing each item in the sequence against my_object for "==" and stop
when a match is found. If one is found the if clause will execute, and
if the sequence is exhausted without finding one, the else clause (or
equivalent) will execute.
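
For a built-in list, that search can be sketched roughly as the loop
below. This is only an illustration: the helper name contains_like_list
is made up here, and the real check is done in C. Note that the built-in
containers treat an identical object as a match before they try "==".

```python
def contains_like_list(my_sequence, my_object):
    """Rough pure-Python equivalent of `my_object in my_sequence` for a list."""
    for item in my_sequence:
        # Built-in containers accept an identical object as a match
        # before falling back to ==.
        if item is my_object or item == my_object:
            return True   # stop at the first match
    return False          # sequence exhausted without a match


words = ["spam", "eggs", "ham"]
found = contains_like_list(words, "eggs")      # same answer as "eggs" in words
missing = contains_like_list(words, "toast")   # same answer as "toast" in words
```
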
>
> 2. What can go wrong with this? What should a code review pick up on?
The main thing that can go wrong is that the objects might not match the
names used. For example, if the name my_sequence is bound to an int,
you'll get a runtime exception.
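
A minimal sketch of that failure, using made-up values. The exception is
a TypeError, because an int is neither a container nor iterable. The
exact wording of the message varies between Python versions.

```python
my_object = 3
my_sequence = 42   # an int, despite what the name suggests

try:
    if my_object in my_sequence:
        print("found")
except TypeError as exc:
    # Keep the message so it can be inspected or logged.
    error_message = str(exc)
    print("TypeError:", error_message)
```
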
Second, if the items look similar (e.g. floating point, but not limited
to that) but aren't actually equal, you could get a surprise. For
example, if my_object is a byte string and one of the items in
my_sequence is a unicode string representing exactly the same thing,
Python 2 will frequently consider them the same, and Python 3 will know
they're different.
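
Both surprises can be shown in a few lines under Python 3. The values
are invented for illustration, and math.isclose is just one possible way
to do an approximate comparison, not something the original test does.

```python
import math

my_sequence = ["spam", "eggs"]            # text (str) items

# A bytes object never equals a str in Python 3.
bytes_found = b"spam" in my_sequence
# Decoding first makes the comparison str against str.
text_found = b"spam".decode("ascii") in my_sequence

values = [0.3, 0.5]
# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
sum_found = (0.1 + 0.2) in values
# Approximate membership test with a tolerance instead of ==.
close_found = any(math.isclose(0.1 + 0.2, v) for v in values)
```

Here bytes_found and sum_found are False, while text_found and
close_found are True.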
Third, if my_object is something that doesn't equal anything else, such
as a floating point NaN, the test can fail to match even when the
sequence appears to contain it. Two NaNs are not equal, and a NaN is not
even equal to itself.
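
A short sketch of the NaN case. One wrinkle worth knowing: because the
built-in containers check identity before "==", the very same NaN object
is still found in a list, while a different NaN object is not. The
math.isnan test at the end is one possible workaround, offered as a
suggestion only.

```python
import math

nan = float("nan")

equal_to_itself = nan == nan              # NaN never compares equal
same_object_found = nan in [nan]          # identity check matches first
other_nan_found = float("nan") in [nan]   # a distinct NaN object: no match

# Explicit test that does not rely on == or identity.
nan_present = any(math.isnan(x) for x in [1.0, nan])
```

Here equal_to_itself and other_nan_found are False, while
same_object_found and nan_present are True.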
By a different definition of 'wrong': if the sequence is quite large and
all its items are hashable, it may have been faster to pass a dict
instead of a sequence. And if my_sequence is actually a dict, the test
will probably be faster, but the name is a misleading one.
>
> I believe that "my_sequence" might be either a container class or a
> sequence type. An effective __hash__ function would be required for each
> "my_object".
"in" doesn't care if there's a __hash__ function. It just cares if the
collection has a __contains__() method. If the collection is a dict,
the dict will enforce whatever constraints it needs. If the collection
is a list, no hash is needed, but the __contains__() method will probably
be slower. An arbitrary sequence may not have a __contains__() method
at all, and in that case I believe 'in' will fall back to iterating
over it.
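
These points can be checked with two toy classes (WithContains and
IterOnly are names invented for this sketch). The first supplies
__contains__(); the second only supplies __iter__(), and 'in' still
works by iterating. The last lines show that a list accepts unhashable
items while a dict rejects them.

```python
class WithContains:
    """Collection that answers 'in' through __contains__()."""
    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, obj):
        return obj in self._items


class IterOnly:
    """No __contains__(); 'in' falls back to iterating."""
    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


via_contains = 3 in WithContains([1, 2, 3])
via_iteration = 3 in IterOnly([1, 2, 3])
not_there = 4 in IterOnly([1, 2, 3])

# Lists are unhashable, yet a list of lists can still be searched.
list_in_list = [1, 2] in [[1, 2], [3]]

# A dict hashes the object being looked up, so an unhashable one fails.
try:
    [1, 2] in {(1, 2): "x"}
    dict_error = None
except TypeError as exc:
    dict_error = exc
```
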
> I THINK you'd need to avoid using floating point variables
> that might round incorrectly.
>
One of the issues already covered.
> Are there other issues?
>
Those are all I can think of off the top of my head.
--
DaveA