file rewind?

Andy Gimblett gimbo at ftech.net
Fri Apr 5 09:43:26 EST 2002


> when I run the module from the command line by itself, it counts the lines
> alright, but does not count the characters.  do I need to "rewind" the file
> reference after the first function call to countLines()?

As has been suggested, you could use seek to rewind to the start of
the file.  But why read the file twice when you could read it once?
You could rewrite countLines and countChars so they accept a list of
strings instead of a file object, then do this:

def test(name):
    file = open(name, "r")
    filelines = file.readlines()   # read the file once, as a list of lines
    file.close()
    countLines(filelines)
    countChars(filelines)

However, you can do away with countLines and countChars entirely:

import string

def wc(name):
    file = open(name, "r")
    data = file.read()             # the whole file as one string
    file.close()
    chars = len(data)
    lines = string.count(data, "\n")
    return (chars, lines)

The standard library is a wonderful thing - explore it and use it!
I'm pretty sure string.count is actually implemented in C, in which
case I promise you it'll be faster than countLines().  End result:
faster, shorter code without reinventing the wheel.
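
If you want to check the speed claim on your own machine, something
like the following sketch could be used.  countLinesLoop is a made-up
pure-Python counter standing in for a hand-written countLines, and the
sketch uses the data.count() string method, which does the same job as
string.count.

```python
import timeit

def countLinesLoop(data):
    # Hypothetical hand-written counter: one Python-level step per character.
    n = 0
    for ch in data:
        if ch == "\n":
            n = n + 1
    return n

data = "some text\n" * 10000

# Both approaches should agree on the answer before we compare speed.
assert countLinesLoop(data) == data.count("\n")

# Time ten runs of each approach over the same string.
t_loop = timeit.timeit(lambda: countLinesLoop(data), number=10)
t_count = timeit.timeit(lambda: data.count("\n"), number=10)
print("loop: %.4fs  count: %.4fs" % (t_loop, t_count))
```

The actual timings will vary from machine to machine, so none are
quoted here.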

One thing, though: if the file could be very large, you probably don't
want to read it all into memory at once, in which case you should read
bite-sized chunks instead.  The trade-off is code complexity vs memory
usage, but it's still readily comprehensible:

import string
import sys

def wc(filename, chunksize=1024):
    (chars, lines) = (0, 0)
    file = open(filename, "r")
    while 1:
        data = file.read(chunksize)
        if not data:
            break
        chars = chars + len(data)
        lines = lines + string.count(data, "\n")
    file.close()
    return (chars, lines)

if __name__ == '__main__':
    filename = sys.argv[1]
    (chars, lines) = wc(filename)
    print "%s has %s characters and %s lines" % (filename, chars, lines)
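
For anyone reading this on a newer Python, where the print statement
and string.count are no longer available, the same chunked approach
might be sketched as below.  The with block and the throwaway test file
are my additions for illustration, not part of the code above.

```python
import os
import tempfile

def wc(filename, chunksize=1024):
    chars, lines = 0, 0
    with open(filename, "r") as f:      # closed automatically on exit
        while True:
            data = f.read(chunksize)    # at most chunksize characters
            if not data:                # empty string means end of file
                break
            chars += len(data)
            lines += data.count("\n")
    return (chars, lines)

# Try it on a throwaway file, with a tiny chunk size to force several reads.
fd, path = tempfile.mkstemp()
os.write(fd, b"one\ntwo\nthree\n")
os.close(fd)
print(wc(path, chunksize=4))
os.remove(path)
```

Note that, as in the original, this counts newline characters, so a
final line with no trailing newline is not included in the line count.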

HTH,

Andy

-- 
Andy Gimblett - Programmer - Frontier Internet Services Limited
Tel: 029 20 820 044 Fax: 029 20 820 035 http://www.frontier.net.uk/
Statements made are at all times subject to Frontier's Terms and
Conditions of Business, which are available upon request.




