[Tutor] Help with lists

D-Man dsh8290@rit.edu
Tue, 13 Mar 2001 15:16:17 -0500


On Tue, Mar 13, 2001 at 12:18:07AM -0700, VanL wrote:
| Hello,
| 
| I'm stuck on what is probably a simple problem, but I can't figure
| out how to solve it.
| 
| I am reading through some log files, and testing for some error
| conditions.  The log files are formatted this way:
| (begin)
| 
| Here is some header text, explaining the warning. This is several
| lines long.  Each header has something that is unique.
| 
| Here is an error message.
| Here is error message number 2.
| Here is error message number 3.
| Here is error message n.
| 
| Here is the beginning of the next header....
| (end)
| 
| Notice the blank lines before and after the error block.
| 
| Now I am opening these files up and feeding them to this function:
| 
| def filter(logfile):
|         thisfile = logfile.readlines()

I assume that 'logfile' is a file object.  After the above line, the
current position in the file is the end.  Any further reads will
return nothing (an empty string or an empty list) unless seek() is
used to reposition the cursor.
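
A small sketch of that behaviour, in case it helps.  io.StringIO is
only my stand-in for a real file opened with open(), and the sample
text is made up:

```python
import io

# io.StringIO stands in for a real log file opened with open().
logfile = io.StringIO(u"header line\n\nerror one\nerror two\n\n")

first = logfile.readlines()    # reads everything; the cursor is now at the end
second = logfile.readlines()   # nothing left to read, so this is []

logfile.seek(0)                # rewind to the start of the file
third = logfile.readlines()    # the full contents again

print(len(first))
print(second)
print(third == first)
```

The second readlines() does not raise anything; it just quietly hands
back an empty list.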

| 
|         # Test for trigger strings -- these indicate the presence of
|         # a specific warning block.  The number after the string
|         # is the number of lines from the trigger line to the
|         # first blank line (before the block of error messages)
| 
|         test1 = ( 'Here is some header text, explaining the warning', 2 )
|         test2 = ( 'some other trigger text in header two', 3 )

What are the integer constants in the tuple representing?  Are they
correct?

| 
|         for line in thisfile:
| 
|                 # Get the results....
|                 result1 = string.find(line, test1[0])
|                 result2 = string.find(line, test2[0])
| 
|                 # For each positive result, parse the error block to
|                 # see what really went wrong.
| 
|                 if (result1 != -1):
|                         blockindex1 = (thisfile.index(line) + test1[1])
|                         if (find_bad_errors(logfile, blockindex1)):

Now you are giving the file object 'logfile' to find_bad_errors(), but
the cursor is at the end.

|                             return (0)
| 
|                 if (result2 != -1):
|                         blockindex2 = (thisfile.index(line) + test2[1])
|                         if (find_bad_errors(logfile, blockindex1)):
                                                                 ^
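This should probably be a 2 (blockindex2).  As written, a match on
test2 passes blockindex1, which is either stale or not yet defined.

The same problem as above applies here too: the file cursor is still
at the end.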

|                             return 0
|         else:
|                 return 1
| 
| 
| def find_bad_errors(logfile, start):
| 
|         print "In function find_bad_errors"
| 
|         # start denotes the index of the blank line at the top of
|         # the error block.  I want to search through each line,
|         # beginning at the index of the top of the error block.
|         # If I find another blank line without encountering a really
|         # bad error, I'm done.
| 
|         block = logfile.readlines()

This will return [], the empty list, since 'logfile' is at the end.

|         blockstart = start
|         for line in (block[(blockstart + 1):]):

The outer parens are unnecessary for delimiting the expression.
Without them the expression is  block[(blockstart + 1):]

You are slicing an empty list from (blockstart + 1) to the end.  This
gives the empty list back.  So the loop is (pseudo code)

for line in nothing :
    do something

or in other words

never :
    do something
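
The same thing as something you can actually run.  The value of
blockstart here is arbitrary, picked only for illustration:

```python
block = []                      # what readlines() returns at end of file
blockstart = 7                  # arbitrary index, for illustration only

tail = block[blockstart + 1:]   # slicing past the end is not an error

visited = 0
for line in tail:
    visited += 1                # never runs: there is nothing to iterate over

print(tail)
print(visited)
```

No exception is raised anywhere, which is why the failure is silent.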


| Now, here are my problems:
| 
| 1. I never seem to be going into the for loop. The top print
| statement ("In function find_bad_errors") always prints but the
| second ("In for loop") never does.

As explained above, you are iterating over an empty list, which means
the loop body never executes.

I would suggest handing find_bad_errors only the list of strings it
needs.  Do the slicing in the caller, which already has the list.
Having the function read through the file again would be much slower,
and it would duplicate the memory used, since there would be two
copies of the file in memory.  Most importantly, though, it makes the
code harder to understand: the function doesn't really need to read
the file, it only needs the list of strings from it.
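
Here is a rough sketch of what I mean, not a drop-in replacement.
The name check_log, the bad_markers parameter, the "FATAL" marker and
the sample lines are all my inventions, since I haven't seen your
real tests.  I also use enumerate() for the position, because
thisfile.index(line) returns the first matching line and so gives the
wrong index when a line is repeated in the file:

```python
def find_bad_errors(block_lines, bad_markers=("FATAL",)):
    """Return 1 if the error block contains a really bad error, else 0.

    block_lines begins just after the blank line that opens the block;
    scanning stops at the next blank line.  The markers are hypothetical.
    """
    for line in block_lines:
        if not line.strip():          # blank line: end of this error block
            return 0
        for marker in bad_markers:
            if marker in line:
                return 1
    return 0


def check_log(lines, trigger, offset):
    """Return 0 if a triggered block holds a bad error, else 1."""
    for i, line in enumerate(lines):
        if trigger in line:
            blank = i + offset        # index of the blank line above the block
            if find_bad_errors(lines[blank + 1:]):
                return 0
    return 1


# Made-up sample in the layout you described.
sample = [
    "Here is some header text, explaining the warning. This is several\n",
    "lines long.  Each header has something that is unique.\n",
    "\n",
    "Here is an error message.\n",
    "FATAL: hypothetical really bad error\n",
    "\n",
    "Here is the beginning of the next header....\n",
]

result = check_log(sample, "Here is some header text", 2)
print(result)
```

The caller reads the file once with readlines() and passes the list
along; find_bad_errors never touches the file object at all.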

| 2. If I rewrite the code so that it is forced to go through the
| tests, I always fail because each line is getting split up into
| 1-character strings.  For example, you see that I test for
| "notbad1".  I never find it, though, because all I see when I print
| is
| "
| n
| o
| t
| b
| a
| d
| 1
| 
| "

print "f"
print "o"
print "o"

will give

f
o
o

for output.  If instead I write

print "f",
print "o",
print "o",

I will get

f o o

for output.

print puts a newline after whatever it prints by default.  If you put
a comma at the end of the statement, it doesn't print the newline, but
it still puts a space between items, trying to be extra helpful.  If
you don't want either behaviour, build up a string in a local variable
and print it when the string is complete.
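
A minimal sketch of the build-it-up approach.  The list of characters
is just example data, and the single-argument print(...) form is used
only so that the snippet does not depend on the print statement:

```python
pieces = []
for ch in ["f", "o", "o"]:
    pieces.append(ch)       # collect the parts instead of printing each one

word = "".join(pieces)      # glue them together with no separator
print(word)
```

That prints the three characters together on one line, with no spaces
and a single newline at the end.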


I don't know how you got the second situation since you haven't shared
your rewrite that gets into the loop.  Try fixing the first problem,
and the second may go away. 
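
One guess, and it is only a guess since I haven't seen the rewrite:
looping over a single string gives you one character at a time, while
looping over a list of strings gives you whole lines.  If a single
line (rather than the list of lines) ended up being iterated over,
you would see exactly that one-character output.  The variable names
below are made up for the demonstration:

```python
line = "notbad1\n"

# Iterating over a string yields its characters one by one.
from_string = [item for item in line]

# Iterating over a list of strings yields each whole string.
from_list = [item for item in [line]]

print(from_string)
print(from_list)
```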

HTH,
-D