sorted unique elements from a list; using 2.3 features

Andrew Dalke adalke at mindspring.com
Thu Jan 9 17:07:27 EST 2003


Peter Abel wrote:
> Im using Python 2.2.2 and without the 2.3. - features the
> following seems to work fine:
>  
>>>>input=[5, 3, 1, 2, 5, 4, 3, 4, 1, 1, 5, 4, 5, 1, 4, 3, 2, 2, 4, 1]
>>>>output=dict( zip(input ,range(len(input)) ) ).keys()
>>>>output
> [1, 2, 3, 4, 5]
> 
> What Im wondering about is the fact that in several examples
> even without the [sort ...] I always got a sorted output.

My guess is it's because the hash function used for integers is
the integer itself, so the hash values are 1, 2, 3, 4.  These then
get put into slot 1, 2, 3, and 4, which are then visited in that
order to get the values.

Put a -1 in the list.  You'll get

[1, 2, 3, 4, 5, -1]

I think the initial hash size is about 8 elements

 >>> for i in range(10000):
...     assert dict(zip(range(i, i+5), range(i, i+5))).keys() == 
range(i, i+5)
...
Traceback (most recent call last):
   File "<stdin>", line 2, in ?
AssertionError
 >>> i
4
 >>>

 >>> for i in range(10000):
...     assert dict(zip(range(i, i+2), range(i, i+2))).keys() == 
range(i, i+2)
...
Traceback (most recent call last):
   File "<stdin>", line 2, in ?
AssertionError
 >>> i
7
 >>>

that is,
 >>> dict(zip((7, 9), (0, 0))).keys()
[9, 7]
 >>> dict(zip((7, 8), (0, 0))).keys()
[8, 7]
 >>> dict(zip((6, 7), (0, 0))).keys()
[6, 7]
 >>>


BTW, this would have been a bit faster and used less memory

 >>> output=dict( zip(input , (0,)* len(input)) ).keys()
            |-----------------------------------|
                same as dict.from_keys(input, 0) in Python 2.3

					Andrew


					Andrew





More information about the Python-list mailing list