Canonical way of dealing with null-separated lines?
John Machin
sjmachin at lexicon.net
Thu Feb 24 15:56:49 EST 2005
On Thu, 24 Feb 2005 11:53:32 -0500, Christopher De Vries
<devries at idolstarastronomer.com> wrote:
>On Wed, Feb 23, 2005 at 10:54:50PM -0500, Douglas Alan wrote:
>> Is there a canonical way of iterating over the lines of a file that
>> are null-separated rather than newline-separated?
>
>I'm not sure if there is a canonical method, but I would recommend using a
>generator to get something like this, where 'f' is a file object:
>
>def readnullsep(f):
>    # Need a place to put potential pieces of a null separated string
>    # across buffer boundaries
>    retain = []
>
>    while True:
>        instr = f.read(2048)
>        if len(instr)==0:
>            # End of file
>            break
>
>        # Split over nulls
>        splitstr = instr.split('\0')
>
>        # Combine with anything left over from previous read
>        retain.append(splitstr[0])
>        splitstr[0] = ''.join(retain)
>
>        # Keep last piece for next loop and yield the rest
>        retain = [splitstr[-1]]
>        for element in splitstr[:-1]:
(1) Inefficient (copies all but the last element of splitstr)
>            yield element
>
>    # yield anything left over
>    yield retain[0]
(2) Dies when the input file is empty.
(3) As noted by the OP, can return a spurious empty line at the end.
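Points (2) and (3) both follow from how split() and the retain list behave. A small sketch of the two cases in isolation (the variable names are only for illustration and are not taken from the quoted code):

```python
# Illustrative only: why points (2) and (3) arise.

# (3) Splitting terminated data leaves a trailing empty piece. The quoted
# generator stores that piece in retain and yields it after the loop.
pieces = 'a\0b\0'.split('\0')
print(pieces)            # ['a', 'b', '']

# (2) With an empty file the loop body never runs, so retain is still []
# and indexing it raises IndexError.
retain = []
error = None
try:
    retain[0]
except IndexError as exc:
    error = exc
    print('IndexError:', exc)
```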
Try this:
!def readweird(f, line_end='\0', bufsiz=8192):
!    retain = ''
!    while True:
!        instr = f.read(bufsiz)
!        if not instr:
!            # End of file
!            break
!        splitstr = instr.split(line_end)
!        if splitstr[-1]:
!            # last piece not terminated
!            if retain:
!                splitstr[0] = retain + splitstr[0]
!            retain = splitstr.pop()
!        else:
!            if retain:
!                splitstr[0] = retain + splitstr[0]
!                retain = ''
!            del splitstr[-1]
!        for element in splitstr:
!            yield element
!    if retain:
!        yield retain
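For anyone wanting to check the edge cases, here is a rough sketch of the same idea for a file opened in binary mode under Python 3, with io.BytesIO standing in for a real file. The name iter_records and the bytes handling are assumptions for illustration, not part of the code above. It joins the leftover fragment to each new chunk before splitting, which is simpler but copies more when a single record spans many reads.

```python
import io

def iter_records(f, sep=b'\0', bufsiz=8192):
    """Yield sep-terminated records from a binary file object f (hypothetical helper)."""
    retain = b''
    while True:
        chunk = f.read(bufsiz)
        if not chunk:
            # End of file
            break
        pieces = (retain + chunk).split(sep)
        # The last piece is either an unterminated fragment or b''.
        retain = pieces.pop()
        for piece in pieces:
            yield piece
    if retain:
        # Final record had no terminator
        yield retain

# Empty input, terminated input, unterminated input, and a tiny buffer
print(list(iter_records(io.BytesIO(b''))))
print(list(iter_records(io.BytesIO(b'a\0b\0'))))
print(list(iter_records(io.BytesIO(b'a\0b'))))
print(list(iter_records(io.BytesIO(b'a\0\0b\0'), bufsiz=1)))
```

Because the fragment is rejoined before the split, a multi-byte separator that straddles a buffer boundary is still found in this variant.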
Cheers,
John