Code block literals

Alex Martelli aleaxit at yahoo.com
Fri Oct 10 17:35:18 EDT 2003


Vis Mike wrote:

> "Lulu of the Lotus-Eaters" <mertz at gnosis.cx> wrote in message
   ...
>>     for line in file('input.txt').readlines():
>>         do_something_with(line)
>>
>>     for byte in file('input.txt').read():
>>         do_something_with(byte)
>>
>> Of course, both of those slurp in the whole thing at once.  Lazy lines
>> are 'fp.xreadlines()', but there is no standard lazy bytes.
> 
> xreadlines()? What kind of naming convention is that: :)

An obsolete one (to go with 'xrange').  For about 3 years now, the
correct Python spelling has been just "for line in file('input.txt'):" .
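
A minimal sketch of that spelling, written for Python 3, where the 'file'
builtin no longer exists and 'open' takes its place.  The io.StringIO object
is only a stand-in for a real file (an assumption, so the snippet needs
nothing on disk); a file returned by open() should iterate the same way.

```python
import io

# io.StringIO stands in for an open text file (an assumption made so the
# example is self-contained); open('input.txt') would be used in practice.
fp = io.StringIO("first\nsecond\nthird\n")

# Iterating the file object itself hands back one line per step, instead of
# building the complete list that readlines() returns.
lines_seen = []
for line in fp:
    lines_seen.append(line.rstrip("\n"))

print(lines_seen)
```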


>> A method 'fp.xread()' might be useful, actually.  And taking a good idea
>> from Dave Benjamin up-thread, so might 'fp.xreadwords()'.  Of course, if

I think that using methods for such things is not a particularly good idea.

A generator that takes a sequence (typically an iterator) of strings and
yields the individual bytes or words as its items is more general:

def eachbyte(seq):
    for s in seq:
        for c in s:
            yield c

def eachword(seq):
    for s in seq:
        for w in s.split():
            yield w

and now you can loop "for b in eachbyte(file("input.txt")):" etc -- AND you
have also gained the ability to loop per-byte or per-word on any other
sequence of strings.  Actually eachbyte is much more general than its
name suggests: it flattens any sequence of iterables by one level, so feed
it e.g. a list of files, and it will yield the lines of each file -- one
after the other -- as a single sequence.
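
To make both points concrete, here is a small sketch that repeats the two
generators and runs them on a plain list of strings and on a list of
file-like objects.  It assumes Python 3, so text strings yield characters
rather than bytes, and the io.StringIO objects are hypothetical stand-ins
for real open files.

```python
import io

def eachbyte(seq):
    # Flatten one level: yield every item of every element of seq.
    for s in seq:
        for c in s:
            yield c

def eachword(seq):
    # Yield the whitespace-separated words of every string in seq.
    for s in seq:
        for w in s.split():
            yield w

# Any sequence of strings works, not just a file.
lines = ["ab cd\n", "ef\n"]
words = list(eachword(lines))
chars = list(eachbyte(lines))

# Fed a list of file-like objects, eachbyte yields their lines in order,
# because iterating each "file" produces lines.
files = [io.StringIO("one\ntwo\n"), io.StringIO("three\n")]
all_lines = list(eachbyte(files))
```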

OTOH, eachbyte is NOT particularly good for arbitrary binary files: it
still reads the file a line at a time, so if there happen to be no \n bytes
at convenient points it may suck in much more memory than needed.  Besides,
the typical need on arbitrary binary files is to loop on blocks of N bytes
for some N -- N==1 is a rather special case.  So one might prefer:

def eachblock(afile, N):
    while 1:
        block = afile.read(N)
        if not block: break
        yield block

or variations thereon.
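
One such variation, as a sketch only: the two-argument form of the builtin
iter() calls a function repeatedly until it returns a sentinel value.  The
name eachblock_iter is made up for this illustration, the code assumes
Python 3 and a file opened in binary mode (hence the b"" sentinel), and
io.BytesIO stands in for a real file.

```python
import io
from functools import partial

def eachblock_iter(afile, N):
    # iter(callable, sentinel) keeps calling afile.read(N) and stops as soon
    # as it returns b"", which is what a binary file gives at end of file.
    return iter(partial(afile.read, N), b"")

# io.BytesIO is a stand-in for open('input.bin', 'rb').
data = io.BytesIO(b"abcdefgh")
blocks = list(eachblock_iter(data, 3))
```

The last block may be shorter than N, exactly as with the explicit loop.
For a file opened in text mode the sentinel would have to be "" instead.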


Alex




