Binary search tree

Michel Albert exhuma at gmail.com
Mon Nov 12 03:06:57 EST 2007


On Nov 9, 11:45 pm, Bruno Desthuilliers
<bdesth.quelquech... at free.quelquepart.fr> wrote:
> maxim.no... at gmail.com a écrit :
>
> > Hi,
>
> > I have to get list of URLs one by one and to find the URLs that I have
> > more than one time(can't be more than twice).
>
> > I thought to put them into binary search tree, this way they'll be
> > sorted and I'll be able to check if the URL already exist.
>
> What about a set ?
>
> s = set()
> for url in urls:
>    if url in s:
>      print "already have ", url
>    else:
>      set.add(url)

Interesting. For this case I usually used dicts. As in:

d = {}
for url in urls:
   if url in d:
      print "already have ", url
   else:
      d[url] = 1

Now, I can see that this method has some superfluous data (the `1`
that is assigned to the dict). So I suppose this is less memory
efficient. But is this slower then? As both implementations use hashes
of the URL to access the data. Just asking out of curiosity ;)




More information about the Python-list mailing list