Python Mystery Theatre -- Episode 3: Extend this

Wed Jul 23 10:49:20 EDT 2003

I tried these out myself after making my guesses. Here's what I found:

Raymond Hettinger wrote:
> [snip]
> 
> This time, the extra credit is for picking-out the three
> that surfaced as errors in my own, real code over
> the past few years (the example context below may be
> different, but the type of error occurred in real code).
> 
> [snip]
> 
> ACT I ---------------------
> def real(z):
>     return hasattr(z, 'real') and z.real or z
> def imag(z):
>     return hasattr(z, 'imag') and z.imag or 0
> for z in 3+4j, 5, 0+6j, 7.0, 8+0j, 9L:
>     print real(z), imag(z)

I spotted the error in this one (I've been bitten by this more than 
once) if z.real or z.imag is false, I get z or 0. That results is two 
problems:

For 0+6j, the answer is 6j and 6.0, definitely not what was expected.
For 8+0j, the answer is 8.0 and 0, not 8.0 and 0.0. This is close to 
what was expected and, if a bug at all, would be a rather subtle one.

I would certainly expect that Raymond's been bitten by this one. I would 
guess that every experienced Python programmers been bitten by some form 
of this one.

> ACT II ---------------------
> uniq = {}
> for k in (10, 'ten', 10+0j, u'ten', 'Ten', 't'+'en', 10.0):
>     uniq(k) = 1
> print len(uniq)

I had to experiment with this one. I guessed 3, which turned out to be 
right. I was a little surprised that 'ten' and u'ten' hash the same. 
However, after thinking about that, I decided I was happy with that result.

I was more surprised that 10 and 10.0 came out the same. One of these 
days I'll take the time to look up the hash algorithm for floats to see 
how this was achieved.

> ACT III ---------------------
> s = 'abracadabra'
> for i in [3, 2, 1, 0]:
>         print s[i:], s[-i:]

Got this one the first time. -0 == 0, so s[-0:] == s[0:] == s. I'll bet 
that this is one that burned Raymond.

> ACT IV ---------------------
> pets = list('cat ')
> pets += 'dog '
> pets.extend('fish ')
> print pets + 'python'

I guessed ['c', 'a', 't', ' ', 'd', 'o', 'g', ' ', 'f', 'i', 's', 'h', ' 
', 'p', 'y', 't', 'h', 'o', 'n']. I tried it to discover that the print 
statement at the end throws. I decided that not concatenating strings to 
  lists doesn't bother me since I would believe that this is an error 
more often than not and pets + list('python') would hopefully make the 
intent clear. On the other hand, if I saw pets = list('cat') in code I 
was reviewing I suspect that I would want to assure myself that ['c', 
'a', 't'] was what was intended and not ['cat'].

I also much bothered by pets += 'dog ' working but pets + 'python' not. 
What's the justification for the checking in one place and not the other?

> INTERMISSION (with output included, oh yeah! ------------
> 
>>>>import operator
>>>>1 - reduce(operator.add, [1e-7]* 10**7)
> 
> 2.4983004554002264e-010

Round off error. Welcome to the world of floating point arithmetic!

> ACT V ---------------------
> class Bin:
>     numitems = 0
>     contents = []
> 
>     def add(self, item):
>         self.numitems += 1
>         self.contents += [item]
> 
> laundry = Bin()
> groceries = Bin()
> 
> laundry.add('shirt')
> groceries.add('bananas')
> groceries.add('nuts')
> laundry.add('socks')
> groceries.add('pretzels')
> 
> print laundry.numitems, laundry.contents
> print groceries.numitems, groceries.contents

Learned something on this one. I caught that, as defined, numitems and 
contents are attributes of the class Bin, rather than instances of Bin. 
(The fix is to add the missing __init__ method.) I predicted that the 
result of both print statements would be:

5 ['shirt', 'bananas', 'nuts', 'socks', 'pretzels']

When I tried it I got:

2 ['shirt', 'bananas', 'nuts', 'socks', 'pretzels']
3 ['shirt', 'bananas', 'nuts', 'socks', 'pretzels']

After thinking about it (and reviewing the Reference Manual), I realized 
that self.contents += items does an in place update of Bin.contents, but 
the augmented assignment self.numitems += 1, is the same as 
self.numitems = self.numitems+1 (since there is no in place update for 
integers or immutable objects in general), which has the result of 
creating the attribute numitems on the laundry and groceries instances.

> ACT VI -----------------------------------------
> print "What is the technical name for this algorithm or transformation?"
> a = range(0, 100, 10)
> import random
> random.shuffle(a)
> print "Input:", a
> swaps = 0
> for i in range(len(a)):
>     for j in range(i+1, len(a)):
>     if a[i] > a[j]:
>         i, j, a[i], a[j], swaps = j, i, a[j], a[i], swaps+1
> print "Output:", a
> print "Workload;", swaps

I went through a number of different answers on this one. First, there 
was the indentation error at if a[i] > a[j]. I decided that was probably 
a typo on Raymond's part.

The assignment i,j, ... = j,i, ... . Kept me guessing. Why do that?  At 
first I though it had no effect. (Note: that the corresponding 
assignment in a C program which read:

     for(i = 0; i < len(a); i++)
         for(j = i+1; j < len(a); j++) ...

would require a lot more thought as to the likely result.

At this point I convinced myself that this was an interchange sort gone 
wrong and would just produce garbage. When I tried it, I discovered that 
  the correct technical name for this program is the identity 
transformation. That is, nothing happened.

Reexamining the code, I realized that flipping the two indices has the 
effect of assigning the two array elements back to themselves, resulting 
in no effect.

About this point in time, I also realized that the sort should work if 
you take away the spurious assignments to i and j. That is,

a[i], a[j], swaps = a[j], a[i], swaps

will actually sort the list correctly. The correct technical name for 
this is "sort by successive minima". This is a slightly different sort 
from an interchange sort or a bubble sort, even though all three of 
these sorts are O(n**2).

What's the real lesson here? I have two suggestions:

1. Watch out for side effects when doing compound assignments.

2. Never try to write your own sort when a.sort() is available.

----------------

OK, which three did Raymond personnally stumble over. I'll guess I, III, 
and VI. Why VI? Well, because V, while subtle in some ways, doesn't look 
to me like one you'd run into until you knew what you were doing, at 
which point: why this missing __init__ method. VI looks like something 
you might do by accident while trying to write a sort routine.

   Chris