Sorting Large File (Code/Performance)

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Sat Jan 26 13:21:10 EST 2008


On Fri, 25 Jan 2008 17:50:17 -0200, Paul Rubin
<"http://phr.cx"@NOSPAM.invalid> wrote:

> Nicko <usenet at nicko.org> writes:
>>     # The next line is order O(n) in the number of chunks
>>     (line, fileindex) = min(mergechunks)
>
> You should use the heapq module to make this operation O(log n) instead.
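
A minimal sketch of what the heapq suggestion could look like, assuming
each chunk is an iterable that yields already-sorted lines. The function
name merge_sorted_chunks is hypothetical and not from Nicko's code; only
the (line, fileindex) pairing is taken from the quoted snippet. Python 2.6
and later also provide heapq.merge, which does much the same thing.

```python
import heapq

def merge_sorted_chunks(chunks):
    """Yield lines from several individually sorted iterables, in order."""
    iterators = [iter(chunk) for chunk in chunks]
    heap = []
    for fileindex, iterator in enumerate(iterators):
        first = next(iterator, None)
        if first is not None:          # skip empty chunks
            heap.append((first, fileindex))
    heapq.heapify(heap)                # O(k) for k chunks

    while heap:
        # The smallest pending line is always at heap[0].
        line, fileindex = heap[0]
        yield line
        following = next(iterators[fileindex], None)
        if following is None:
            heapq.heappop(heap)        # this chunk is exhausted
        else:
            # Swap in the chunk's next line in O(log k).
            heapq.heapreplace(heap, (following, fileindex))
```

Each output line then costs O(log k) heap work instead of the O(k) scan
that min() performs over the k chunks.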

Or forget about Python and use the Windows sort command. It has been
around since the MS-DOS days, so there is no need to download and install
any other packages, and the documentation at
http://technet.microsoft.com/en-us/library/bb491004.aspx says:

Limits on file size:
   The sort command has no limit on file size.

Better, since the OP only intends to extract lines starting with "zz", use
the findstr command:
findstr /l /b "zz" filename.ext
would do the job.
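
For comparison, the same filter is a single streaming pass if one stays in
Python. This is only a sketch: the function name lines_starting_with and
its parameters are made up for illustration. Like findstr with /l /b, it
does a literal, case-sensitive match at the beginning of each line, and it
never holds more than one line in memory, so no sorting is needed.

```python
def lines_starting_with(path, prefix="zz"):
    """Yield the lines of the file at `path` that begin with `prefix`."""
    with open(path) as handle:
        for line in handle:            # reads one line at a time
            if line.startswith(prefix):
                yield line
```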

Why make things more complicated than that?

-- 
Gabriel Genellina



