Newbie: list question, speed

Werner Hoch werner.ho at gmx.de
Sun May 6 01:38:41 EDT 2001


Uwe Hoffmann wrote:
> Werner Hoch wrote:
> > I wrote a little program that parses two text files together.
> > The result should be a list with only unique entries.
> > 
> > So I wrote this to check if the entry already exists:
> > --------
> > if ergfield.count(ergline) == 0:
> >       ergfield.append(ergline)
> > ---------
> > execution time is about 45 seconds
> > 
> > and then I tried a second statement, which is about twice as fast as the first one:
> > ------------
> > try:
> >       ergfield.index(ergline)
> > except ValueError:
> >       ergfield.append(ergline)
> > ------------
> > execution time is about 21 seconds
> > 
> > I don't like the second solution because it uses exception handling like
> > an if statement!
> > Are there better ways to do this?
> 
> not sure if this is what you want but use a dictionary instead
> 
> earlier:
> ergDict = {}
> 
> if not ergDict.has_key(ergLine):
> 	ergDict[ergLine] = 1
> else:
> 	ergDict[ergLine] += 1
> 
> 
> later ergDict.keys() holds the same entries as your ergfield
> (though not necessarily in the same order),
> and ergDict.values() (or ergDict.items() with key and value)
> contains the number of occurrences of each entry

Looks great, I will keep it in mind if I need the number of duplicates.
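
For reference, the counting loop quoted above could be sketched in current Python roughly as follows. This is only an illustration: it assumes a Python version that ships collections.Counter, and the sample list stands in for the parsed lines, which are not shown in this thread.

```python
from collections import Counter

# Hypothetical sample data standing in for the parsed lines.
lines = ["b", "a", "b", "c", "a", "b"]

# Counter maps each distinct line to the number of times it occurs.
counts = Counter(lines)

# The keys play the role of ergfield: one entry per distinct line.
ergfield = list(counts)

# Entries that occur more than once, with their occurrence counts.
duplicates = {line: n for line, n in counts.items() if n > 1}
```

The keys come back in first-seen order only on Python versions where dictionaries preserve insertion order; on older versions the order is arbitrary.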
> 
> this is only faster if your files contain many different lines

1st file has 53000 lines
2nd file has 44000 lines
and the result ergfield has 4500 entries
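
With these sizes, every one of the roughly 97000 input lines is checked against a list that grows to 4500 entries. list.count() always walks the whole list, while list.index() stops at the first match, which would be consistent with the second version being about twice as fast. A hash-based lookup avoids the scan altogether. The sketch below puts the three variants side by side; the function names and the sample data are made up for illustration, and no timings are claimed.

```python
def dedupe_count(lines):
    """First variant from this thread: count() scans the whole list each time."""
    result = []
    for line in lines:
        if result.count(line) == 0:
            result.append(line)
    return result


def dedupe_index(lines):
    """Second variant: index() stops at the first match, or raises ValueError."""
    result = []
    for line in lines:
        try:
            result.index(line)
        except ValueError:
            result.append(line)
    return result


def dedupe_set(lines):
    """Hash-based variant: membership in a set does not scan the entries."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result


# Hypothetical input: 1000 lines, only 50 of them distinct.
lines = ["line %d" % (i % 50) for i in range(1000)]
unique = dedupe_set(lines)
```

All three functions return the same list in first-seen order; they differ only in how much work each membership check costs.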

> > BTW: how can I convert an integer to a string?
>
> str(number)
> or 
> "%i" % (number,)

It's a shame that this is not in my Python book.
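
For completeness, a small sketch of the conversions mentioned above. The variable names are only for illustration, and the f-string line assumes Python 3.6 or later.

```python
number = 42

a = str(number)          # built-in conversion
b = "%i" % (number,)     # printf-style formatting with a one-element tuple
c = "%d" % number        # %d and %i are equivalent for integers
d = f"{number}"          # f-string, assumes Python 3.6+

# A string and an integer cannot be concatenated directly,
# so the integer is converted first.
message = "entries: " + str(number)
```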

Thanks
Werner
-- 
werner.ho at gmx.de


