[Tutor] sorting a 2 gb file
John Purser
johnp at milwaukielumber.com
Tue Jan 25 16:21:46 CET 2005
I'll just add a "me too" to Alan's advice. I had a similarly sized project,
only it was binary data in an ISAM file instead of flat ASCII. I tried
several "pure" Python methods and all of them took forever. Finally I used
Python to read the source data, modify it, and load it into a MySQL
database. Then I pulled the data back out via Python and wrote it to a new
ISAM file. The whole thing took longer to code that way, but it scaled MUCH
better and was much quicker in the end.
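
A rough sketch of that load-then-query pattern, for a flat ASCII file rather
than ISAM. It is not the code from the project above: sqlite3 (in the
standard library) stands in for MySQL, and the function name, the table
layout, and the choice of sort key (first whitespace-separated field,
case-insensitive) are all assumptions made for illustration. For a file of
several gigabytes you would pass an on-disk db_path instead of the in-memory
default.

```python
import sqlite3


def sort_file_via_db(src_path, dst_path, db_path=":memory:", batch_size=10000):
    """Load lines into a table, then write them back out in sorted order."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS records (sort_key TEXT, line TEXT)"
        )
        conn.execute("DELETE FROM records")  # start clean if the db is reused

        batch = []
        with open(src_path, "r", encoding="ascii", errors="replace") as src:
            for line in src:  # streams the file; never holds it all in memory
                line = line.rstrip("\n")
                fields = line.split()
                # Assumed key: first field, lowercased. Replace this with
                # whatever fields define ordering in the real data.
                key = fields[0].lower() if fields else ""
                batch.append((key, line))
                if len(batch) >= batch_size:
                    conn.executemany("INSERT INTO records VALUES (?, ?)", batch)
                    batch = []
        if batch:
            conn.executemany("INSERT INTO records VALUES (?, ?)", batch)
        conn.commit()

        # Let the database do the sorting; rowid keeps ties in input order.
        query = "SELECT line FROM records ORDER BY sort_key, rowid"
        with open(dst_path, "w", encoding="ascii", errors="replace") as dst:
            for (line,) in conn.execute(query):
                dst.write(line + "\n")
    finally:
        conn.close()
```

Inserting in batches keeps memory use flat, and the ORDER BY pushes the
sort into the database, which can spill to disk when the data is larger
than RAM.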
John Purser
-----Original Message-----
From: tutor-bounces at python.org [mailto:tutor-bounces at python.org] On Behalf
Of Alan Gauld
Sent: Tuesday, January 25, 2005 05:09
To: Scott Melnyk; tutor at python.org
Subject: Re: [Tutor] sorting a 2 gb file
> My data set the below is taken from is over 2.4 gb so speed and memory
> considerations come into play.
To be honest, if this were my problem, I'd probably dump all the data
into a database and use SQL to extract what I needed. That's a much
more effective tool for this kind of thing.
You can do it with Python, but I think we need a better understanding
of the problem: for example, what the various fields represent, and how
much of a comparison (i.e. which fields, case sensitivity, etc.) leads
to "equality".
Alan G.
_______________________________________________
Tutor maillist - Tutor at python.org
http://mail.python.org/mailman/listinfo/tutor