[Tutor] Logfile Manipulation

ALAN GAULD alan.gauld at btinternet.com
Mon Nov 9 16:23:13 CET 2009



> I can sort the biggest logfile (800M) using unix sort in about 1.5
> mins on my workstation.  That's not really fast enough, with
> potentially 12 other files....

You won't beat sort with Python.
You have to be realistic: these are very big files!

Python should be faster overall, but for specific tasks the Unix
tools written in C will be faster.

But if you are merging multiple files into one, then sorting
them before processing will probably help. However, if you expect
to be pruning out more lines than you keep, it might be easier just
to throw all the data you want into a single file and then sort that
at the end. It all depends on the data.
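
As a rough sketch of the two approaches (not tested on files anywhere
near this size), something like the following might do. The function
names and the keep() predicate are made up for illustration. It also
assumes that every line ends with a newline and starts with a
timestamp that sorts correctly as plain text, which may not be true
of your logs.

```python
import heapq


def merge_sorted_logs(paths, out_path):
    """Merge files that are each ALREADY sorted into one sorted file.

    heapq.merge reads lazily, so only one line per input file is held
    in memory at a time. Assumes plain string order is the order wanted.
    """
    files = [open(p) for p in paths]
    try:
        with open(out_path, "w") as out:
            out.writelines(heapq.merge(*files))
    finally:
        for f in files:
            f.close()


def filter_then_sort(paths, out_path, keep):
    """Keep only the wanted lines from every file, then sort once.

    keep is a hypothetical predicate: it takes a line and returns True
    if the line should survive. The kept lines are held in memory, so
    this only makes sense if most lines are thrown away.
    """
    wanted = []
    for p in paths:
        with open(p) as f:
            wanted.extend(line for line in f if keep(line))
    wanted.sort()
    with open(out_path, "w") as out:
        out.writelines(wanted)
```

The first function suits files that have already been sorted (for
example with unix sort); the second suits the case where you prune
far more lines than you keep.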

HTH,

Alan G