RS in Files: Novel Thoughts (well, not that much)

Moshe Zadka moshez at math.huji.ac.il
Sun Oct 17 15:04:23 EDT 1999


I'm reading Design Patterns, and I was struck by the decorator pattern:
it is exactly what's needed to implement record separators in files. So
I sat down, and in 15 minutes I had a (slow, highly non-optimized) version
of a record-separated file object. Here it is, for anyone who wants to
implement it.

Note to the optimizer: a true masochist can do everything here in a
C-level object, which would cache _file.read to minimize attribute lookups,
and might even optimize further if the underlying object is a real Python
file object.
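
Short of writing C, the caching half of that idea can be approximated in
pure Python. The sketch below binds the wrapped file's read method once in
__init__, so the loop does not repeat the attribute lookup for every chunk.
The class name CachedReadRecordFile and the buf_size parameter are made up
for this illustration, and no speedup has been measured.

```python
import io

class CachedReadRecordFile:
    """Hypothetical variant: caches the wrapped file's bound read method."""

    def __init__(self, file, separator='\n', buf_size=4096):
        self._read = file.read      # looked up once instead of once per chunk
        self._buf = ''
        self._separator = separator
        self._buf_size = buf_size

    def readline(self):
        read, sep = self._read, self._separator   # local names for the loop
        while True:
            n = self._buf.find(sep)
            if n >= 0:
                end = n + len(sep)
                ret, self._buf = self._buf[:end], self._buf[end:]
                return ret
            chunk = read(self._buf_size)
            if not chunk:
                # End of file: hand back whatever is left (possibly '').
                ret, self._buf = self._buf, ''
                return ret
            self._buf += chunk

f = CachedReadRecordFile(io.StringIO('a;b;c'), separator=';')
print(repr(f.readline()))   # 'a;'
```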

Note to the regression tester: due to its length, the regression test
has been snipped. Sorry.

Note to the generalizer: this class can trivially be changed to
support regular expressions (REs) as record separators, giving it
functionality Perl doesn't have yet. Hint hint.
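
One way the RE generalization might look, as a rough sketch in current
Python: the readline loop of the class below, with the plain substring
search swapped for a compiled pattern. The name RegexRecordFile is
hypothetical. It assumes the pattern never matches the empty string, and
it uses a simple heuristic for chunk boundaries: a match that touches the
end of the buffer is not trusted until more data (or end of file) arrives.
That covers greedy patterns such as ';+' but is not a general guarantee
for patterns with alternation or lookahead.

```python
import io
import re

class RegexRecordFile:
    """Hypothetical sketch: the record separator is a regular expression."""

    def __init__(self, file, pattern=r'\n', buf_size=4096):
        self._file = file
        self._buf = ''
        self._pattern = re.compile(pattern)
        self._buf_size = buf_size

    def readline(self):
        while True:
            m = self._pattern.search(self._buf)
            found = m is not None and m.end() > m.start()
            # Trust a match only if it ends before the end of the buffer;
            # one that touches the end might grow once more data arrives.
            if found and m.end() < len(self._buf):
                break
            chunk = self._file.read(self._buf_size)
            if not chunk:
                if found:
                    break           # end of file: the match is final
                ret, self._buf = self._buf, ''
                return ret          # no separator left: return the rest
            self._buf += chunk
        ret, self._buf = self._buf[:m.end()], self._buf[m.end():]
        return ret

f = RegexRecordFile(io.StringIO('a;;b;c'), r';+')
print(repr(f.readline()))   # 'a;;'
```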

----------------- begin code ------------------------
class FieldSeparatorFile:

        def __getattr__(self, name):
                # Anything not defined here is delegated to the wrapped file.
                return getattr(self._file, name)

        def __init__(self, file, field_separator='\n'):
                self._file=file
                self._buf=''
                self._separator=field_separator
                self._buf_size=4096

        def set_separator(self, field_separator):
                self._separator=field_separator

        def read(self, size=-1):
                # Data already buffered by readline is served first.
                if size<0:
                        ret=self._buf+self._file.read()
                        self._buf=''
                        return ret
                if size<len(self._buf):
                        ret=self._buf[:size]
                        self._buf=self._buf[size:]
                        return ret
                ret=self._buf+self._file.read(size-len(self._buf))
                self._buf=''
                return ret

        def readline(self):
                # Fill the buffer chunk by chunk until it holds a separator.
                while True:
                        n=self._buf.find(self._separator)
                        if n>=0:
                                break
                        read=self._file.read(self._buf_size)
                        if not read:
                                # End of file: return what is left (maybe '').
                                ret=self._buf
                                self._buf=''
                                return ret
                        self._buf=self._buf+read
                # The returned record includes its trailing separator.
                ret=self._buf[:n+len(self._separator)]
                self._buf=self._buf[n+len(self._separator):]
                return ret
------------------- end code ----------------------
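
For code that only loops over records and never needs a full file object,
the same buffering loop can also be written as a generator. This is a
sketch in current Python under that assumption; iter_records is a
hypothetical name, and unlike the class above it does not delegate other
file methods to the wrapped object.

```python
import io

def iter_records(file, separator='\n', buf_size=4096):
    """Yield records from file; each keeps its trailing separator."""
    buf = ''
    while True:
        chunk = file.read(buf_size)
        if not chunk:
            break
        buf += chunk
        # Emit every complete record currently in the buffer.
        while True:
            n = buf.find(separator)
            if n < 0:
                break
            end = n + len(separator)
            yield buf[:end]
            buf = buf[end:]
    if buf:
        yield buf    # trailing data with no separator after it

print(list(iter_records(io.StringIO('a|b|c'), separator='|')))
# prints ['a|', 'b|', 'c']
```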

--
Moshe Zadka <mzadka at geocities.com>. 
INTERNET: Learn what you know.
Share what you don't.




