read lines

Bruno Desthuilliers bruno.42.desthuilliers at wtf.websiteburo.oops.com
Tue Dec 4 08:48:37 EST 2007


Chris wrote:
> On Dec 4, 2:14 pm, Horacius ReX <horacius.... at gmail.com> wrote:
>> Hi, I have a text file like this;
>>
>> 1 -33.453579
>> 2 -148.487125
>> 3 -195.067172
>> 4 -115.958374
>> 5 -100.597841
>> 6 -121.566441
>> 7 -121.025381
>> 8 -132.103507
>> 9 -108.939327
>> 10 -97.046703
>> 11 -52.866534
>> 12 -48.432623
>> 13 -112.790419
>> 14 -98.516975
>> 15 -98.724436
>>
>> So I want to write a program in python that reads each line and
>> detects which numbers of the second column are the maximum and the
>> minimum.
>>
(snip)
> 
> You're not guaranteed to have that 2 or even 1 element after
> splitting.  If the line is empty or has 1 space you need to handle
> it.  Also is there really a need for regex for a simple string split ?
> 
> import sys
> 
> infile = open(sys.argv[1], 'r')
> min, max = 0, 0

# shadowing the builtin min and max functions may not be such
# a good idea !-)
# Also, you may want to use a sentinel value here instead:
   mini, maxi = None, None

> for each_line in infile.readlines():

# You don't need to read the whole file into memory:
# the file object knows how to iterate over lines.
# Also, you may want to track line numbers so you can
# warn about an incorrect line, cf. below

for linenum, line in enumerate(infile):

>     if each_line.strip():

# you're needlessly calling line.strip() twice...
   line = line.strip()
   if line:

>         tmp = each_line.strip().split()

           tmp = line.split()

>         try:
>             b = tmp[1]
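# Notice that here, b is a string, not a number. The sample values
# are floats, so convert with float() rather than int().
           try:
               b = float(tmp[1])
>         except (IndexError, TypeError), e:

# float() raises ValueError, not TypeError, on a malformed number
           except (IndexError, ValueError), e: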

# you may want to warn about incorrect/unexpected format here
# (writing to sys.stderr, since stdout is for normal outputs)
               print >> sys.stderr, \
                 "incorrect line format line %s ('%s') : %s" \
                 % (linenum, line, e)
>             continue


>         if b < min: min = b
>         if b > max: max = b

# If the first test succeeds, doing the second is useless.
# Also, take the sentinel value into account: the first valid number
# is both the minimum and the maximum so far. The identity test
# against None should not be too costly. If it was, it's simple to
# optimize it out of the for loop.

           if mini is None:
             mini = maxi = b
           elif b < mini:
             mini = b
           elif b > maxi:
             maxi = b


# closing the file might be a good idea too, at least for any
# serious app
infile.close()
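
Put together, with the quoting removed, the loop discussed above might look
like the sketch below. The name find_min_max and the choice to return a
(mini, maxi) tuple are mine, not something from the thread. It takes any
iterable of lines so it can be tried without a file, and it avoids the print
statement and uses "except ... as e", so it assumes Python 2.6 or later and
should also run under Python 3.

```python
import sys

def find_min_max(lines):
    """Return (mini, maxi) of the second column, or (None, None)."""
    mini = maxi = None  # sentinel: no valid number seen yet
    for linenum, line in enumerate(lines):
        line = line.strip()
        if not line:
            continue  # silently skip blank lines
        try:
            b = float(line.split()[1])
        except (IndexError, ValueError) as e:
            # warn on stderr, stdout is for normal output
            sys.stderr.write("incorrect line format line %s (%r): %s\n"
                             % (linenum + 1, line, e))
            continue
        if mini is None:
            # first valid number: both minimum and maximum so far
            mini = maxi = b
        elif b < mini:
            mini = b
        elif b > maxi:
            maxi = b
    return mini, maxi
```

With a real file you would pass the open file object, something like
find_min_max(infile), and close it afterwards.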


Now there are also these two builtin functions min and max, and the 
itertools tee() function...

import sys
from itertools import tee

def extract_numbers(iterable):
   for linenum, line in enumerate(iterable):
     try:
       yield float(line.strip().split()[1])
     except (IndexError, ValueError), e:
       print >> sys.stderr, e
       continue

# please add proper error handling around here
infile = open(sys.argv[1])
lines1, lines2 = tee(infile)
print min(extract_numbers(lines1)), max(extract_numbers(lines2))
infile.close()
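
One caveat about tee() here: min() consumes lines1 completely before max()
reads anything from lines2, so tee() ends up buffering every line of the file
anyway. When the numbers fit in memory, parsing once into a list is simpler.
A rough sketch, where parse_numbers is a name made up for the example and bad
lines are skipped silently to keep it short:

```python
def parse_numbers(lines):
    """Yield the second column of each well-formed line as a float."""
    for line in lines:
        try:
            yield float(line.split()[1])
        except (IndexError, ValueError):
            continue  # blank or malformed line

# a few lines from the original post, plus a blank one
sample = ["1 -33.453579\n", "2 -148.487125\n", "\n", "3 -195.067172\n"]
numbers = list(parse_numbers(sample))
if numbers:  # min() and max() raise ValueError on an empty sequence
    print("%s %s" % (min(numbers), max(numbers)))
```

The list costs about as much memory as tee() would in this case, and the file
is only parsed once.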


HTH


