Not able to read blank lines and spaces on a small text file

Carlos Ribeiro carribeiro at gmail.com
Mon Sep 13 11:30:20 EDT 2004


Ruben

I hope you don't mind what I'm going to say. Your current solution is
a bit confusing, and there are better idioms in Python to solve your
problem. It seems that you either tried to port a program written in
another language such as C, or wrote this one with a C-like mindset.
There are several reasons why we here like Python, but writing C-like
code is not one of them. That said, I have to warn you that I'm not a
Python expert (there are a few terrific ones around here) and that my
opinion here is given as non-authoritative advice. Be warned :-)

To loop over the lines in a text file, use the following snippet:

input_file = open("C:\\work\\readlines.txt", "r")
for line in input_file.readlines():
    print("[", line, "]")
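
A closely related form, given here only as a minimal sketch, iterates over the file object directly instead of calling readlines(), and uses a with block so the file gets closed automatically. The helper name bracket_lines and the temporary sample file are made up for illustration, and the syntax assumes Python 3. Only the trailing newline is removed, so blank lines and leading spaces survive.

```python
import os
import tempfile


def bracket_lines(path):
    """Return each line of the file wrapped in brackets, blank lines included."""
    wrapped = []
    # The with block closes the file even if an error occurs.
    with open(path, "r") as input_file:
        # Iterating over the file object yields one line at a time.
        for line in input_file:
            # Remove only the trailing newline so spaces are preserved.
            wrapped.append("[" + line.rstrip("\n") + "]")
    return wrapped


# Hypothetical sample data written to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\n\n  indented line\n")
    sample_path = tmp.name

result = bracket_lines(sample_path)
os.remove(sample_path)
print(result)
```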

There is no need to do a loop like you did. The loop above will check
all conditions - EOF, empty files, and so on. Now, in order to process
your lines, you need to write something like a state machine. It's
easier done than said. You just have to read line by line, checking
what you have read, and building the complete record as you go. Try
this -- it's heavily commented, but it's very short:

input_file = open("C:\\work\\readlines.txt", "r")

for line in input_file.readlines():
    # line may still have the \n line-ending marker -- trim it.
    # strip() also removes any extraneous blank space. It's
    # not actually mandatory, but helps a little bit if you
    # need to print the line and analyze it.
    line = line.strip()

    # we'll use the split method here because it's simpler.
    # You can also use regular expressions here, but they're
    # slightly more difficult to read the first time. Let's keep
    # it simple. maxsplit is a keyword parameter that tells
    # split to stop after finding the first splitting
    # position.
    try:
        field_name, field_value = line.split(maxsplit=1)
    except ValueError:
        # if it can't properly split the line in two, it's
        # either an invalid record or a blank line. Just
        # skip it and continue
        continue

    if field_name == "OrgID:":
        record_id = field_value
    if field_name == "OrgName:":
        record_value = field_value
        # assuming that this is the last value read,
        # print the whole record
        print(record_id, "-", record_value)

input_file.close()

The result is:

Joe S. Smith - Smith Foundation
Ronald K.Jones - Jones Foundation
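
The comments in the loop mention regular expressions as an alternative to split. For comparison, here is a rough sketch of what that could look like. The names FIELD_PATTERN and split_field are hypothetical, and the pattern assumes that a field name is a single run of non-blank characters followed by whitespace and a value.

```python
import re

# One non-blank token (the field name), whitespace, then the rest of the line.
FIELD_PATTERN = re.compile(r"^(\S+)\s+(.*)$")


def split_field(line):
    """Return (field_name, field_value), or None if the line does not match."""
    match = FIELD_PATTERN.match(line.strip())
    if match is None:
        # Blank line, or a line with a name but no value.
        return None
    return match.group(1), match.group(2)


pair = split_field("OrgID: Joe S. Smith\n")
```

Blank lines and lines with only one token return None, which plays the same role as skipping the line when split cannot produce two parts.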

Please note that I purposefully avoided defining new classes here or
using other common Python constructs. The solution could be much more
elegantly written than this, but I still hope to have helped you.
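
As one possible illustration of that remark, and not the only way to do it, the same state machine can be wrapped in a generator that yields complete records. The function name iter_records is made up, the sample lines are an assumed reconstruction of the input (the original file is not shown), and the syntax assumes Python 3. It keeps the assumption that OrgName: is the last field of a record, but unlike the loop above it skips an OrgName: line that has no preceding OrgID:.

```python
def iter_records(lines):
    """Yield (org_id, org_name) pairs from an iterable of text lines."""
    record_id = None
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if len(parts) != 2:
            # Blank or malformed line: skip it.
            continue
        field_name, field_value = parts
        if field_name == "OrgID:":
            record_id = field_value
        elif field_name == "OrgName:" and record_id is not None:
            # OrgName: is assumed to close the record.
            yield record_id, field_value
            record_id = None


# Assumed sample input, including a blank line and a spaces-only line.
sample_lines = [
    "OrgID: Joe S. Smith\n",
    "\n",
    "OrgName: Smith Foundation\n",
    "   \n",
    "OrgID: Ronald K.Jones\n",
    "OrgName: Jones Foundation\n",
]
records = list(iter_records(sample_lines))
for org_id, org_name in records:
    print(org_id, "-", org_name)
```

Because the function takes any iterable of lines, an open file object can be passed in directly in place of the sample list.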

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro at gmail.com
mail: carribeiro at yahoo.com


