which datastructure for fast sorted insert?

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Sun May 25 12:05:31 EDT 2008


En Sun, 25 May 2008 03:37:00 -0300, <notnorwegian at yahoo.se> escribió:

> im writing a webcrawler.
> after visiting a new site i want to store it in alphabetical order.
>
> so obv i want fast insert. i want to delete duplicates too.
>
> which datastructure is best for this?

Use a list, and the bisect module to keep it sorted:

py> import bisect
py> def insert_nodupes(slist, item):
...   i = bisect.bisect_left(slist, item)
...   if i>=len(slist) or slist[i] != item:
...     slist.insert(i, item)
...
py> items = []
py> insert_nodupes(items, 'link to some site')
py> insert_nodupes(items, 'another link')
py> insert_nodupes(items, 'third site')
py> insert_nodupes(items, 'another link')
py> items
['another link', 'link to some site', 'third site']

-- 
Gabriel Genellina




More information about the Python-list mailing list