save dictionary to a file without brackets.

Andrew Cooper amc96 at cam.ac.uk
Thu Aug 9 18:54:20 EDT 2012


On 09/08/2012 23:26, Dave Angel wrote:
> On 08/09/2012 06:03 PM, Andrew Cooper wrote:
>> On 09/08/2012 22:34, Roman Vashkevich wrote:
>>> Actually, they are different.
>>> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
>>> Dict uses hashing to get a value from the dict and this is why it's O(1).
>>>
>> Sligtly off topic, but looking up a value in a dictionary is actually
>> O(n) for all other entries in the dict which suffer a hash collision
>> with the searched entry.
>>
>> True, a sensible choice of hash function will reduce n to 1 in common
>> cases, but it becomes an important consideration for larger datasets.
>>
>> ~Andrew
> 
> I'm glad you're wrong for CPython's dictionaries.  The only time the
> lookup would degenerate to O[n] would be if the hash table had only one
> slot.  CPython sensibly increases the hash table size when it becomes
> too small for efficiency.
> 
> 
> Where have you seen dictionaries so poorly implemented?
> 

Different n, which I should have made more clear.  I was using it for
consistency with O() notation.  My statement was O(n) where n is the
number of hash collisions.

The choice of hash algorithm (or several depending on the
implementation) should specifically be chosen to reduce collisions to
aid in efficient space utilisation and lookup times, but any
implementation must allow for collisions.  There are certainly runtime
methods of improving efficiency using amortized operations.

As for poor implementations,

class Foo(object):

    ...

    def __hash__(self):
        return 0

I seriously found that in some older code I had the misfortune of
reading.  It didn't remain in that state for long.

~Andrew



More information about the Python-list mailing list