Strategy for determining difference between 2 very large dictionaries

Terry Reedy tjreedy at udel.edu
Wed Dec 24 18:38:12 EST 2008


Marc 'BlackJack' Rintsch wrote:
> On Wed, 24 Dec 2008 03:23:00 -0500, python wrote:

>> collection, I don't see the advantage of using an iterator or a list.
>> I'm sure I'm missing a subtle point here :)
> 
> `keys()` creates a list in memory, `iterkeys()` does not.  With
> ``set(dict.keys())`` there is a point in time where the dictionary, the 
> list, and the set co-exist in memory.  With ``set(dict.iterkeys())`` only 
> the set and the dictionary exist in memory.

If you can, consider using 3.0, in which d.keys() is a set-like view of 
the keys of d.  d.items() is set-like as well (when the values are 
hashable); d.values() is also a view, but not set-like, since values 
need not be unique or hashable.  The time and space to create such a 
view is O(1), I believe, since views are just alternate read-only 
interfaces to the internal dict storage.  There is not even a separate 
set until you do something like intersect d1.keys() with d2.keys() or 
compute d1.keys() - d2.keys().  I think this is an under-appreciated 
feature of 3.0 which should be really useful with large dicts.
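
Something like the following is what I have in mind.  It is only a 
rough sketch for 3.x: the two small dicts and all the variable names 
are made up for illustration, and with really large dicts the result 
sets will of course take memory in proportion to their own size.

```python
# Hypothetical sample data; stand-ins for the two very large dicts.
d1 = {"a": 1, "b": 2, "c": 3}
d2 = {"b": 2, "c": 30, "d": 4}

# Key views support set operators directly; no intermediate list or
# set of all keys is built, only the result set.
only_in_d1 = d1.keys() - d2.keys()   # keys removed in d2
only_in_d2 = d2.keys() - d1.keys()   # keys added in d2
common = d1.keys() & d2.keys()       # keys present in both

# Among the shared keys, find those whose values differ.
changed = {k for k in common if d1[k] != d2[k]}

# A view is live: it reflects later changes to its dict.
d1_keys = d1.keys()
d1["z"] = 0
z_visible = "z" in d1_keys
```

Only the results of the set operations are real sets; the views 
themselves stay thin wrappers around the dicts.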

tjr



