Record separator for readlines()

Bengt Richter bokr at oz.net
Sat Sep 3 00:31:14 EDT 2005


On Fri, 2 Sep 2005 22:10:18 -0500, jepler at unpythonic.net wrote:

>I think you still have to roll your own.
>
>Here's a start:
>	def ireadlines(f, s='\n', bs=4096):
>	    if not s: raise ValueError, "separator must not be empty"
>	    r = []
>	    while 1:
>		b = f.read(bs)
>		if not b: break
>		ofs = 0
>		while 1:
>		    next = b.find(s, ofs)
>		    if next == -1: break
>		    next += len(s)
>		    yield ''.join(r) + b[ofs:next]
>		    del r[:]
>		    ofs = next
>		r.append(b[ofs:])
>	    yield ''.join(r)
>
What if len(s)>1 and read(bs) reads a partial s?

I posted a file splitter some time back which, unless I goofed, handles that
(still not tested beyond the shown examples, so caveat utor(??) ;-)

    http://groups.google.com/group/comp.lang.python/msg/e333f8b2e2fcdc49

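I thought I might be missing something, but with a block size of 4 the first
'xx' straddles two reads and is missed, while a block size of 5 happens to work: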

 >>> def ireadlines(f, s='\n', bs=4096):
 ...     if not s: raise ValueError, "separator must not be empty"
 ...     r = []
 ...     while 1:
 ...         b = f.read(bs)
 ...         if not b: break
 ...         ofs = 0
 ...         while 1:
 ...             next = b.find(s, ofs)
 ...             if next == -1: break
 ...             next += len(s)
 ...             yield ''.join(r) + b[ofs:next]
 ...             del r[:]
 ...             ofs = next
 ...         r.append(b[ofs:])
 ...     yield ''.join(r)
 ...
 >>> from StringIO import StringIO as SIO
 >>> f = SIO('123xx678xxCxx_and so forth')
 >>> for s in ireadlines(f,'xx',4): print repr(s),
 ...
 '123xx678xx' 'Cxx_and so forth'
 >>> for s in ireadlines(f,'xx',5): print repr(s),
 ...
 ''
oops (f was still at EOF from the previous loop, so rewind first)
 >>> f.seek(0)
 >>> for s in ireadlines(f,'xx',5): print repr(s),
 ...
 '123xx' '678xx' 'Cxx' '_and so forth'
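
One way to cope with a separator that straddles two reads is to carry the
unconsumed tail of each block forward and restart the search slightly before
the old end of the buffer. The following is only a minimal sketch under some
assumptions, and it is not the splitter from the post linked above: the name
ireadlines_carry is made up for illustration, it is written in Python 3 syntax
(io.StringIO rather than the StringIO module), and unlike the quoted generator
it does not yield a trailing empty string when the input ends on a separator.

```python
import io


def ireadlines_carry(f, s='\n', bs=4096):
    """Yield records from f, each ending with s (except possibly the last).

    Hypothetical variant of ireadlines: the unconsumed tail is kept in
    buf, so a separator split across two read() calls is still found.
    """
    if not s:
        raise ValueError("separator must not be empty")
    buf = ''
    while True:
        b = f.read(bs)
        if not b:
            break
        # Everything in buf was already searched, so a new match must end
        # inside b; it can start at most len(s) - 1 characters before b.
        start = max(0, len(buf) - len(s) + 1)
        buf += b
        ofs = 0
        while True:
            nxt = buf.find(s, start)
            if nxt == -1:
                break
            nxt += len(s)
            yield buf[ofs:nxt]
            ofs = start = nxt
        # Keep only the part that has not been yielded yet.
        buf = buf[ofs:]
    if buf:
        yield buf


f = io.StringIO('123xx678xxCxx_and so forth')
print(list(ireadlines_carry(f, 'xx', 4)))
# ['123xx', '678xx', 'Cxx', '_and so forth']
```

With this version the block size should no longer affect the result, at the
cost of holding one whole record plus one block in memory at a time.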

Regards,
Bengt Richter


