Sorting Apache Log Files

Lenny Self lenny at squiggie.com
Mon Jun 18 21:50:41 EDT 2001


Thanks for your help.  This is what I ended up doing.  It seems to work
quite nicely and seems fast enough, although I'm not sure how fast it's
going to be with 20MB of logs :)

#!/usr/bin/python

# Read the whole log into a list of lines
log_file = open("d:/work/access.log", "r")
lines = log_file.readlines()
log_file.close()

def datestamp(line):
    # Pull out the Apache date stamp: the text between "[" and the last "]"
    return line[line.find("[") + 1:line.rfind("]")]

# Sort on the date stamp text (a plain string comparison)
lines.sort(key=datestamp)

# Write the sorted list to a new file
out_file = open("d:/work/newfile.txt", "w")
out_file.writelines(lines)
out_file.close()
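
One thing to watch: the script above compares the date stamps as plain
text, so "01/Jul/2001" sorts ahead of "18/Jun/2001" even though it is the
later date. A rough sketch of sorting on a parsed timestamp instead is
below. It assumes the usual Apache stamp layout (for example
18/Jun/2001:21:50:41 -0400), English month abbreviations, and no other "]"
later in the line. The helper name apache_time and the sample lines are
made up for illustration.

```python
from datetime import datetime

def apache_time(line):
    """Return the stamp between "[" and the last "]" as a datetime."""
    stamp = line[line.find("[") + 1:line.rfind("]")]
    # Assumed layout: 18/Jun/2001:21:50:41 -0400
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

# Hypothetical log lines, deliberately out of order
sample = [
    '1.2.3.4 - - [18/Jun/2001:21:50:41 -0400] "GET /b HTTP/1.0" 200 10\n',
    '1.2.3.4 - - [01/Jul/2001:08:00:00 -0400] "GET /c HTTP/1.0" 200 10\n',
    '1.2.3.4 - - [02/Jan/2001:12:30:00 -0500] "GET /a HTTP/1.0" 200 10\n',
]

# Sort chronologically rather than alphabetically
sample.sort(key=apache_time)
```

Parsing every line costs more than a string comparison, but the key
function is called only once per line, so it should still be reasonable
for a 20MB log.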

Thanks.

    -- Lenny


"Sheila King" <sheila at spamcop.net> wrote in message
news:td3titkav6amrrfjimkjkf4kngp7u4ahpg at 4ax.com...
> On 18 Jun 2001 15:55:53 -0700, lenny.self at qsent.com (Lenny) wrote in
> comp.lang.python in article
> <b1aa9ab6.0106181455.681ef924 at posting.google.com>:
>
> : I was planning on loading each of the log files
> :into a list and then sorting the list.  Unfortunately, I am unaware of
> :how to do that when the value I wish to search on isn't at the
> :beginning of the line.  I need to search on Apache's date string.
>
> How about this? Create a list of tuples, where the tuple is:
>
> (datestamp, full_line)
>
> So, as you put each line from the log into the list, grab the datestamp
> from the line, make a tuple and then sort the list on the first element
> of each tuple?
>
> --
> Sheila King
> http://www.thinkspot.net/sheila/
> http://www.k12groups.org/
>
>
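
For anyone reading this later, the (datestamp, full_line) idea quoted
above could be sketched roughly as follows. This is only one way to do it:
the function names and sample lines are invented here, and the date stamp
is parsed into a datetime (assuming the usual Apache layout and English
month names) so that the tuples sort chronologically rather than
alphabetically.

```python
from datetime import datetime

def datestamp(line):
    """Parse the text between "[" and the last "]" (assumed Apache layout)."""
    stamp = line[line.find("[") + 1:line.rfind("]")]
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

def merge_logs(*logs):
    """Merge several lists of log lines into one list sorted by date."""
    # Build (datestamp, full_line) tuples from every log
    pairs = [(datestamp(line), line) for log in logs for line in log]
    # Tuples compare on their first element first, i.e. the datestamp
    pairs.sort()
    # Throw the datestamps away again and keep the original lines
    return [line for (_, line) in pairs]

# Hypothetical contents of two log files
log_a = ['h - - [18/Jun/2001:21:50:41 -0400] "GET /two HTTP/1.0" 200 1\n']
log_b = [
    'h - - [19/Jun/2001:00:00:01 -0400] "GET /three HTTP/1.0" 200 1\n',
    'h - - [17/Jun/2001:09:15:00 -0400] "GET /one HTTP/1.0" 200 1\n',
]
merged = merge_logs(log_a, log_b)
```

Since each line is carried along unchanged in its tuple, the merged list
can be written straight back out with writelines().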




