[Tutor] Introduction - log exercise

Antonio de la Fuente toni at muybien.org
Tue Nov 17 23:38:26 CET 2009


* bob gailer <bgailer at gmail.com> [2009-11-17 15:26:20 -0500]:

> Antonio de la Fuente wrote:
> >Hi everybody,
> >
> >This is my first post here. I have started learning Python and I am new to
> >programming; I have done a little bash scripting, but not much. Thank you for the
> >kind support and help that you provide on this list.
> >
> >This is my problem: I've got a log file that is filling up very quickly. The
> >log file is made of blocks separated by blank lines. Some blocks contain a
> >line "foo"; I want to discard those blocks and write the remaining ones to a
> >new log file, which will drastically reduce the size of the log.
> >
> >The log file is gzipped, so I am going to use the gzip module, and I am going to pass
> >the log file as an argument, so the sys module is required as well.
> >
> >I will read lines from the file with a 'for' loop, and then check them for
> >'foo' matches with a 'while' loop. If a line matches, I (somehow) re-initialise the
> >list; if it does not match, I append the line to the list. When I
> >get to a blank line (end of block), I write myList to an external file and start
> >again with the next line.
> >
> >I am stuck on defining 'blank line', and I can't manage to get through the while
> >loop; I would really appreciate any hint here.
> >I don't expect the solution, as I think this is a great exercise to get my feet wet
> >with Python, but if anyone thinks that this is the wrong way of solving the
> >problem, please let me know.
> >
> >
> >#!/usr/bin/python
> >
> >import sys
> >import gzip
> >
> >myList = []
> >
> ># For now I am not bothering with the argument part, as I am testing with a
> ># test log file
> >#fileIn = gzip.open(sys.argv[1])
> >
> >fileIn = gzip.open('big_log_file.gz', 'r')
> >fileOut = open('outputFile', 'a')
> >
> >for line in fileIn:
> >    while line != 'blank_line':
> >        if line == 'foo':
> >            Somehow re-initialise myList
> >            break
> >        else:
> >            myList.append(line)
> >    fileOut.writelines(myList)
> 
> Observations:
> 0 - The other responses did not understand your desire to drop any
> paragraph containing 'foo'.

Yes, paragraph == block, that's it

> 1 - The while loop will run forever, as it keeps processing the same line.

Because of the tabs in the line with foo?!
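
For the record, a minimal sketch of why observation 1 holds whatever the
tabs are doing: nothing inside the 'while' body assigns a new value to
`line`, so the condition can never change. The sample lines and the `cap`
parameter are made up for illustration; the cap exists only so that the
demonstration terminates.

```python
# Hypothetical sample data, not the real log.
lines = ["Tue Nov 17 16:11:47 GMT 2009\n", "\tfoo\n", "\n"]


def count_while_passes(line, cap=1000):
    """Run the original while-loop shape, giving up after `cap` passes."""
    collected = []
    passes = 0
    while line != 'blank_line':      # `line` is never reassigned below
        passes += 1
        if passes >= cap:            # safety cap; the original loop has none
            break
        if line == 'foo':
            break
        collected.append(line)       # the same line, appended again and again
    return passes


def count_if_checks(lines):
    """Use an `if` inside the `for` loop: each line is examined exactly once."""
    checks = 0
    for line in lines:
        checks += 1
        if line.strip() == 'foo':
            pass                     # a real script would flag the block here
    return checks
```

With the `while` replaced by an `if`, the `for` loop is what moves on to the
next line, so the body runs once per line.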

> 2 - In your sample log file the line with 'foo' starts with a tab.
> line == 'foo' will always be false.

So I need first to get rid of those tabs, right? I can do that with
line.strip(), but then I need the same formatting for the fileOut.
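
A small sketch of how the formatting can survive, assuming the 'foo' line
carries only whitespace around the word: strip() returns a new string and
leaves `line` itself untouched, so the stripped copy can be used for the
comparison while the original line goes into the list. The sample lines and
the helper name `is_foo` are made up for illustration.

```python
def is_foo(line):
    """True when the line is 'foo' once surrounding whitespace is ignored."""
    return line.strip() == 'foo'


raw = "\tfoo\n"          # hypothetical log line: tab, text, newline
other = "\tbar baz\n"    # hypothetical line that should be kept

paragraph = []
for line in (raw, other):
    if not is_foo(line):
        paragraph.append(line)   # the untouched line, tab and newline intact
```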

> 3 - Is the first line in the file Tue Nov 17 16:11:47 GMT 2009 or blank?

First line is Tue Nov 17 16:11:47 GMT 2009

> 4 - Is the last line blank?

The last line is blank.

> 
> Better logic:
> 
I would never have thought of solving the problem this way. Interesting.

> # open files: fileIn (the gzipped log) and outFile (the filtered copy)
> paragraph = []
> keep = True
> for line in fileIn:
>  if line.isspace(): # end of paragraph 

Aha! finding the blank line
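
A quick check of what isspace() reports, as a sketch with made-up sample
strings. A line read from a file still carries its newline, so a "blank"
line is really '\n' (possibly with spaces or tabs before it), and
isspace() is true for all of those. It is false for the empty string,
which does not matter here because iterating over a file never yields one.

```python
samples = {
    "newline only": "\n",
    "spaces and newline": "   \n",
    "tab and newline": "\t\n",
    "empty string": "",
    "line with text": "\tfoo\n",
}

# Map each description to the result of str.isspace() on its sample.
results = {name: text.isspace() for name, text in samples.items()}
```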

>    if keep:
>      outFile.writelines(paragraph)
>    paragraph = []

This is what I called re-initialising the list.

>    keep = True
>  else:
>    if keep:
>      if line.strip() == 'foo':  # compare without the leading tab and trailing newline
>        keep = False
>      else:
>        paragraph.append(line)
> # anticipating last line not blank, write last paragraph
> if keep:
>   outFile.writelines(paragraph)
> 
> # use shutil to rename
> 
Thank you.
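
Putting the pieces together, here is one possible sketch of the logic above,
not a definitive version. It assumes Python 3 (text-mode gzip.open), and the
function names `filter_blocks` and `filter_gzip_log` are made up. Two things
are my own additions to the quoted outline: the blank separator line is
written after each kept block so the output keeps its block structure, and
"use shutil to rename" is read as moving the filtered copy over the original
file.

```python
import gzip
import os
import shutil
import tempfile


def filter_blocks(lines, marker='foo'):
    """Yield the lines of every block that has no line equal to `marker`.

    Blocks are runs of lines separated by whitespace-only lines.
    """
    paragraph = []
    keep = True
    for line in lines:
        if line.isspace():              # end of paragraph
            if keep and paragraph:
                for kept in paragraph:
                    yield kept
                yield line              # assumption: keep the blank separator
            paragraph = []              # re-initialise for the next block
            keep = True
        elif keep:
            if line.strip() == marker:  # ignore the tab and the newline
                keep = False
            else:
                paragraph.append(line)
    if keep and paragraph:              # last block with no blank line after it
        for kept in paragraph:
            yield kept


def filter_gzip_log(path, marker='foo'):
    """Rewrite the gzipped log at `path` without the blocks containing `marker`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix='.gz', dir=directory)
    os.close(fd)
    with gzip.open(path, 'rt') as file_in, gzip.open(tmp_path, 'wt') as file_out:
        file_out.writelines(filter_blocks(file_in, marker))
    shutil.move(tmp_path, path)         # assumption: replace the original file
```

Keeping the block logic in a generator that takes any iterable of lines
means it can be tried on a plain list of strings before it is pointed at the
real log.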

> 
> -- 
> Bob Gailer
> Chapel Hill NC
> 919-636-4239

-- 
-----------------------------
Antonio de la Fuente Martínez
E-mail: toni at muybien.org
-----------------------------
