Parse a log file

kaklis at gmail.com
Mon Jan 18 17:46:26 EST 2010


On Jan 18, 11:56 pm, Tim Chase <python.l... at tim.thechases.com> wrote:
> kak... at gmail.com wrote:
> > I want to parse a log file with, for example, the following
> > format:
> > TIMESTAMP                    Operation  FileName     Bytes
> > 12/Jan/2010:16:04:59 +0200   EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:05:05 +0200   DELETE     sample3.3gp  37151
>
> > How can I count the operations for a month (e.g. a total of 40
> > operations: 30 EXISTS, 10 DELETE)?
>
> It can be done pretty easily with a regexp to parse the relevant
> bits:
>
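>    import re
>    # Capture month, year, and operation; skip the day, time, and UTC offset.
>    r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
>    stats = {}
>    with open('log.txt') as f:
>      for line in f:
>        m = r.match(line)
>        if m:
>          # Key is (month, year, operation), e.g. ('Jan', '2010', 'EXISTS').
>          stats[m.groups()] = stats.get(m.groups(), 0) + 1
>    print(stats)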
>
> This prints out
>
>    {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}
>
> With the resulting data structure, you can manipulate it to do
> coarser-grained aggregates such as the total operations, or remap
> month-name abbreviations into integers so they could be sorted
> for output.
>
> -tkc
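
The coarser-grained aggregates and the month remapping suggested above
could look roughly like the sketch below. It is not part of Tim's reply.
It assumes the stats dict has the (month, year, operation) keys printed
above and that month names are English three-letter abbreviations; the
names MONTHS, monthly_totals, and report are made up for illustration.

```python
from collections import Counter

# Assumed input: the dict printed above, keyed by (month, year, operation).
stats = {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}

# Map English month abbreviations to integers so months sort chronologically.
MONTHS = {name: number for number, name in enumerate(
    'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec'.split(), start=1)}


def monthly_totals(stats):
    """Collapse (month, year, op) counts into (year, month_number) totals."""
    totals = Counter()
    for (month, year, _op), count in stats.items():
        totals[(int(year), MONTHS[month])] += count
    return totals


def report(stats):
    """Return one summary line per month, in chronological order."""
    totals = monthly_totals(stats)
    lines = []
    for year, month in sorted(totals):
        # Per-operation counts for this month, sorted by operation name.
        per_op = sorted(
            (op, n) for (m, y, op), n in stats.items()
            if int(y) == year and MONTHS[m] == month)
        detail = ', '.join(f'{op} {n}' for op, n in per_op)
        lines.append(
            f'{year}-{month:02d}: total {totals[(year, month)]} ({detail})')
    return lines


for line in report(stats):
    print(line)
```

Sorting on (year, month_number) tuples keeps the months in calendar
order, which sorting on the abbreviations themselves would not.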

Thank you both so much

Antonis


