Newbie question about file input

Grant Edwards grante at visi.com
Mon Aug 16 12:40:57 EDT 2004


On 2004-08-16, Aaron Deskins <ndeskins at ecn.purdue.edu> wrote:

> I'm trying to make a simple python script that will read a
> text file with a bunch of chess games and tell me how many
> games there are.

$ grep '^\[Event' test.pgn | wc -l

;)

> #! /usr/bin/env python
> import string
> import sys
> zf=open('test.pgn','r')
> # games is number of games
> games = 0
> while 1:
>   line = zf.readline()
>   if line == '':
>     break
>   ls = line.split()
>   print ls[0]
>   if ls[0] == '[Event':
>    games+=1
> zf.close()
> print games
>
>
> I'm having problems when the script reads a blank line from the pgn 
> file. I get the following error message:
>    IndexError: list index out of range
> The problem is that ls[0] does not exist when a blank line is read. What 
> would be the best way of fixing this?

Ignore the blank lines by doing something like this before you
split them:

  line = line.strip()
  if not line:
      continue

Or by checking how many words were found after you split the
line:

  ls = line.split()
  if len(ls) == 0:
      continue  
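To see why either check works, here is a small sketch of the second one run against a few made-up lines instead of a real file. It is written for Python 3 (print is a function there), and the sample lines and the name first_words are only for illustration. The point is that split() on a blank or whitespace-only line returns an empty list, so there is no ls[0] to look at.

```python
# Hypothetical sample lines standing in for the contents of a .pgn file.
lines = ['[Event "A"]\n', '\n', '   \n', '1. e4 e5\n']

first_words = []
for line in lines:
    ls = line.split()   # a blank or whitespace-only line splits to []
    if not ls:          # same test as len(ls) == 0
        continue
    first_words.append(ls[0])

print(first_words)
```

The two blank lines are skipped, so only the first words of the non-blank lines end up in the list.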

Perhaps something like this (just to be a smartass, I'm going
to condense your file open/readline()/if-break construct into
the nice new file-as-iterator usage):
      
    numgames = 0
    for line in file('test.pgn','r'):
        ls = line.split()
        if len(ls) == 0:
            continue
        if ls[0] == '[Event':
            numgames += 1
    print numgames        

Or better yet, forget split() and use the startswith() string
method:

    games = 0
    for line in file('test.pgn','r'):
        if line.startswith('[Event'):
            games += 1
    print games        

If whitespace is allowed at the beginning of the line, then we
should also strip() the line:

    numgames = 0
    for line in file('test.pgn','r'):
        if line.strip().startswith('[Event'):
            numgames += 1
    print numgames

An argument can be made that you're better off explicitly
opening/closing files, but that would add more lines that don't
really have anything to do with the algorithm we're playing with.
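For what it's worth, here is a rough sketch of what the explicit version might look like. It assumes Python 3, where the with statement closes the file for you when the block exits, even on an error. The function name count_games and the temporary sample file are made up for the demonstration and are not part of the original script.

```python
import os
import tempfile

def count_games(path):
    """Count lines that start with '[Event', ignoring leading whitespace."""
    games = 0
    # The with statement closes the file when the block exits,
    # even if an exception is raised while reading.
    with open(path, 'r') as pgn:
        for line in pgn:
            if line.strip().startswith('[Event'):
                games += 1
    return games

# Hypothetical sample data, written to a temporary file for the demo.
sample = '[Event "A"]\n\n1. e4 e5\n\n  [Event "B"]\n[Site "?"]\n'
fd, path = tempfile.mkstemp(suffix='.pgn')
with os.fdopen(fd, 'w') as out:
    out.write(sample)
try:
    total = count_games(path)
    print(total)
finally:
    os.remove(path)   # clean up the temporary file
```

The counting loop is the same as before. The only extra lines are the ones that manage the file.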

If you want to be particularly obtuse we can rely on the fact
that True evaluates to 1 and False evaluates to 0, and just
sum up the boolean values returned by .startswith().  That only
takes one line (not counting the "import operator"):

 print reduce(operator.add,[l.startswith('[Event') for l in file('test.pgn','r')])

The 5-line version is probably slightly easier to understand at
a glance.
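As an aside, the same boolean-summing trick can be sketched in Python 3, where reduce has moved into functools and the built-in sum() does the job directly. The function names and sample lines below are made up for illustration. The sketch also passes 0 as the starting value to reduce, so an empty input gives 0 rather than an error.

```python
import operator
from functools import reduce

def count_with_sum(lines):
    # bool is a subclass of int, so True adds 1 and False adds 0.
    return sum(line.startswith('[Event') for line in lines)

def count_with_reduce(lines):
    # Same idea as the one-liner above; the trailing 0 is the initial
    # value, so an empty input yields 0 instead of raising TypeError.
    return reduce(operator.add,
                  [line.startswith('[Event') for line in lines], 0)

# Hypothetical sample lines standing in for an open file object.
sample = ['[Event "A"]\n', '\n', '1. e4 e5\n', '[Event "B"]\n']
print(count_with_sum(sample))
print(count_with_reduce(sample))
```

Both functions accept any iterable of lines, so an open file object can be passed in place of the list.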

-- 
Grant Edwards                   grante             Yow!  Hello? Enema
                                  at               Bondage? I'm calling
                               visi.com            because I want to be happy,
                                                   I guess...


