Optimisation Hints (dict processing and strings)

Daniel Dittmar daniel.dittmar at sap.corp
Tue Mar 29 10:03:04 EST 2005


OPQ wrote:
> for (1): 
> 
> >>> longone = longone + char  # where len(char) == 1
> 
> I know that string concatenation is time consuming, but a small test
> with timeit seems to show that packing and creating an array for those 2
> elements is equally time consuming

- use cStringIO instead: write each char to the buffer and call 
getvalue () once at the end
- or append all chars to a list and do "".join (listvar) once at the end
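
A minimal sketch of both suggestions, assuming a modern Python where 
io.StringIO takes the place of cStringIO. The function names are 
made up for illustration, and which variant is faster on your data is 
something to measure with timeit rather than assume:

```python
import io


def build_with_join(chars):
    """Collect the pieces in a list and join them once at the end."""
    parts = []
    for ch in chars:
        parts.append(ch)
    return "".join(parts)


def build_with_stringio(chars):
    """Write the pieces to an in-memory buffer and read it back once."""
    buf = io.StringIO()
    for ch in chars:
        buf.write(ch)
    return buf.getvalue()
```

Both avoid creating a new, longer string object on every iteration, 
which is what longone = longone + char may do.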

> 
> for (2):
> for k in hash.keys()[:]: # Note : There may be a lot of keys here
>    if len(hash[k])<2:
>       del hash[k]
> 
> 
> Here again, I think the hash.keys duplication can be time *and* memory
> consuming. But still better than (I suppose - not timed yet)
> hash=dict([(k,v) for (k,v) in hash.items() if len(v)>1])

- Check whether iterating over items is faster than iterating over 
keys and looking up each value
- use the dict.iter* methods to avoid building a list in memory. You 
shouldn't delete entries while iterating this way, as changing the 
dict's size breaks the iterator. Collect the keys to delete first:

for key in [k for (k, v) in hash.iteritems () if len (v) < 2]:
     del hash [key]

This of course builds a list of keys to delete, which could also be large.

- also: hash.keys()[:] is not necessary, hash.keys () is already a copy
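
A rough sketch of the two ways to prune the dict, written for a 
modern Python where items () replaces iteritems () and already 
iterates without building a list. Note that in Python 3 keys () is a 
view and not a copy, so the keys to delete still have to be collected 
first. The names prune_in_place, prune_by_rebuild and min_len are 
assumptions for illustration only:

```python
def prune_in_place(table, min_len=2):
    """Delete every entry whose value is shorter than min_len."""
    # Collect the keys first: deleting while iterating over the dict
    # raises "RuntimeError: dictionary changed size during iteration".
    doomed = [k for k, v in table.items() if len(v) < min_len]
    for k in doomed:
        del table[k]
    return table


def prune_by_rebuild(table, min_len=2):
    """Return a new dict holding only the entries that are long enough."""
    return {k: v for k, v in table.items() if len(v) >= min_len}
```

The in-place version only stores the keys being removed, so it is 
probably the cheaper one when few entries are dropped. The rebuild 
briefly holds two dicts, but may be the better choice when most 
entries go away.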

Daniel


