Parsing logfile with multi-line loglines, separated by timestamp?

Victor Hooi victorhooi at gmail.com
Tue Jun 30 11:24:56 EDT 2015


Hi,

I'm trying to parse iostat -xt output using Python. The quirk with iostat is that the output for each second runs over multiple lines. For example:

06/30/2015 03:09:17 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.03    0.00    0.03    0.00    0.00   99.94

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvdap1            0.00     0.04    0.02    0.07     0.30     3.28    81.37     0.00   29.83    2.74   38.30   0.47   0.00
xvdb              0.00     0.00    0.00    0.00     0.00     0.00    11.62     0.00    0.23    0.19    2.13   0.16   0.00
xvdf              0.00     0.00    0.00    0.00     0.00     0.00    10.29     0.00    0.41    0.41    0.73   0.38   0.00
xvdg              0.00     0.00    0.00    0.00     0.00     0.00     9.12     0.00    0.36    0.35    1.20   0.34   0.00
xvdh              0.00     0.00    0.00    0.00     0.00     0.00    33.35     0.00    1.39    0.41    8.91   0.39   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00    11.66     0.00    0.46    0.46    0.00   0.37   0.00

06/30/2015 03:09:18 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.50    0.00    0.00   99.50

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvdap1            0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdb              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdf              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdg              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdh              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

06/30/2015 03:09:19 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.50    0.00    0.00   99.50

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvdap1            0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdb              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdf              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdg              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdh              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

Essentially I need to parse the output in "chunks", where each chunk is separated by a timestamp.

I was looking at itertools.groupby(), but it doesn't seem to do quite what I want here: it seems meant for grouping consecutive lines that share a common key, or some property that a key function can check for.
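For what it's worth, groupby() can probably be coaxed into this if the key function remembers the most recent timestamp it has seen, so that every following line inherits that timestamp as its key. The sketch below is only one way to do that. The names TIMESTAMP_RE, TimestampKey and group_by_timestamp are made up for illustration, and the regex assumes every timestamp looks exactly like the ones in the sample above. One caveat: two consecutive reports carrying an identical timestamp would be merged into a single group.

```python
import itertools
import re

# Assumed timestamp shape, taken from the sample: "06/30/2015 03:09:17 PM"
TIMESTAMP_RE = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M$")


class TimestampKey:
    """Stateful key function: returns the last timestamp line seen so far."""

    def __init__(self):
        self.current = None

    def __call__(self, line):
        stripped = line.strip()
        if TIMESTAMP_RE.match(stripped):
            self.current = stripped
        return self.current


def group_by_timestamp(lines):
    """Yield (timestamp_string, body_lines) for each chunk."""
    for stamp, group in itertools.groupby(lines, key=TimestampKey()):
        if stamp is None:
            continue  # anything before the first timestamp is skipped
        # Drop blank lines and the timestamp line itself from the body.
        body = [ln.rstrip("\n") for ln in group
                if ln.strip() and ln.strip() != stamp]
        yield stamp, body
```

Calling group_by_timestamp(f) on an open file should then give one (timestamp, lines) pair per report.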

Another thought was something like:

    for line in f:
        if line.count("/") == 2 and line.count(":") == 2:
            current_time = datetime.strptime(line.strip(), '%m/%d/%y %H:%M:%S')
        while line.count("/") != 2 and line.count(":") != 2:
            print(line)
            continue

But that didn't quite seem to work.
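Looking at it again, there seem to be two likely reasons. The format string does not match the sample timestamps: %y expects a two-digit year, %H is the 24-hour clock, and nothing consumes the trailing AM/PM. The inner while loop also never reads a new line, so once it is entered it repeats forever. A minimal sketch of the same idea as a generator follows. The names TIMESTAMP_FORMAT, parse_timestamp and iter_chunks are hypothetical, the format is inferred only from the sample above, and %p is assumed to run under a locale that uses AM/PM.

```python
from datetime import datetime

# Assumed from the sample: four-digit year, 12-hour clock, AM/PM suffix.
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_timestamp(line):
    """Return a datetime if the line is a timestamp line, otherwise None."""
    try:
        return datetime.strptime(line.strip(), TIMESTAMP_FORMAT)
    except ValueError:
        return None


def iter_chunks(lines):
    """Yield (timestamp, body_lines) for each timestamp-delimited chunk."""
    current_time = None
    chunk = []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None:
            # A new timestamp closes the previous chunk, if there was one.
            if current_time is not None:
                yield current_time, chunk
            current_time, chunk = stamp, []
        elif current_time is not None and line.strip():
            # Non-blank line belonging to the current chunk.
            chunk.append(line.rstrip("\n"))
    # Emit the final chunk, which has no following timestamp to close it.
    if current_time is not None:
        yield current_time, chunk
```

With this, "for when, rows in iter_chunks(f):" would hand back each report as a datetime plus its lines, which can then be split on whitespace as needed.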

Is there a Pythonic way of parsing the above iostat output and breaking it into chunks split by the timestamp?

Cheers,
Victor


