Read a record instead of a line from a file

Andrew Dalke dalke at dalkescientific.com
Fri Aug 24 06:55:46 EDT 2001


YMK wrote:
>If I know the "Record Separator" of a flat file, how do I set to read
>one record at a time ?

Here's something I just tried out using the 'yield' statement that is
new in Python 2.2.  (2.2 is currently in alpha release.)  Warning: this
is my first generator, and I haven't fully tested it.

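from __future__ import generators

def SepReader(infile, sep = "\n\n"):
    # Read the file in 10000-character chunks and yield one record
    # at a time, where records are delimited by 'sep'.
    text = infile.read(10000)
    if not text:
        return
    while 1:
        fields = text.split(sep)
        # Everything but the last field is a complete record.
        for field in fields[:-1]:
            yield field
        # The last field may be incomplete; keep it and read more.
        text = fields[-1]
        new_text = infile.read(10000)
        if not new_text:
            # End of file: whatever is left is the final record.
            yield text
            break
        text += new_text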

It's used like this:

for record in SepReader(open("fortunes"), "%\n"):
    print record
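
For anyone on a current interpreter, here is a rough Python 3 sketch of
the same generator, run against an in-memory file.  The names sep_reader
and chunk_size are made up for this illustration and are not part of the
code above; the chunk size is set very small only so that the separator
straddles a chunk boundary and the carry-over logic gets exercised.

```python
import io

def sep_reader(infile, sep="\n\n", chunk_size=10000):
    """Yield records from infile, split on sep (hypothetical Python 3 port)."""
    text = infile.read(chunk_size)
    if not text:
        return
    while True:
        fields = text.split(sep)
        # Every field except the last is a complete record.
        for field in fields[:-1]:
            yield field
        # The last field may be a partial record or end in a partial
        # separator, so carry it over and append the next chunk.
        text = fields[-1]
        new_text = infile.read(chunk_size)
        if not new_text:
            # End of input: whatever remains is the final record.
            yield text
            break
        text += new_text

# Assumed sample data in fortune-file style, with "%\n" as the separator.
sample = io.StringIO("first\n%\nsecond\n%\nthird\n%\n")
records = list(sep_reader(sample, "%\n", chunk_size=4))
for record in records:
    print(repr(record))
```

Note that when the input ends with the separator, as it does here, the
last record yielded is an empty string.  Callers who don't want it can
skip empty records in the loop.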

If you want something that's really high speed, but uses the
mxTextTools C extension, you can try my Martel parser, which
is part of biopython.org.  The specific record readers are in
http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Martel/RecordReader.py?cvsroot=biopython

                    Andrew
                    dalke at dalkescientific.com
