A little advice please? (Convert my boss to Python)
Alex Martelli
aleax at aleax.it
Tue Apr 16 06:30:48 EDT 2002
Paul Rubin wrote:
> "Duncan Smith" <buzzard at urubu.freeserve.co.uk> writes:
>> So what I'm looking for is speed, and some advice so that I don't end up
>> trying too many alternatives.
>
> If you have to do something like that over and over for zillions of
> huge files, you're best off writing in C and tuning carefully.
Not necessarily. Python dictionaries are pretty amazing. Duplicating
their functionality and speed is not just a question of "tuning
carefully".
> Regarding duplicates, maybe you can just sort the file with an
> external sort utility, so the duplicates will all be next to each
> other. Then you don't have to mess with dicts. I didn't examine your
> code closely enough to figure out if that makes sense, so maybe it
> doesn't.
Sorting is O(N log N). Inserting N entries in a dictionary can be
pretty close to O(N), since each insertion is darn close to
amortized O(1). Therefore, it's anything but obvious that sorting
should be a performance win.
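To make the comparison concrete, here is a minimal sketch of the two
approaches to dropping duplicate lines. The function names are made up
for illustration, the in-memory sorted() call stands in for an external
sort utility, and none of this is taken from Duncan's code. It assumes
a recent Python 3, where dicts keep insertion order.

```python
from itertools import groupby


def unique_via_dict(lines):
    """Drop duplicates with one dictionary insertion per line (roughly O(N) overall)."""
    seen = {}
    for line in lines:
        # Amortized O(1) insertion; a repeated line just overwrites its own key.
        seen[line] = None
    # Keys come back in first-seen order (guaranteed from Python 3.7 on).
    return list(seen)


def unique_via_sort(lines):
    """Drop duplicates by sorting first (O(N log N)) so equal lines end up adjacent."""
    # groupby only merges adjacent equal items, hence the sort.
    return [key for key, _group in groupby(sorted(lines))]
```

Which one is faster on a given data set is something to measure, for
example with the timeit module, rather than assume.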
Alex