Canonical way of dealing with null-separated lines?
John Machin
sjmachin at lexicon.net
Mon Feb 28 21:02:18 EST 2005
Douglas Alan wrote:
> I wrote:
>
> > Oops, I just realized that my previously definitive version did not
> > handle multi-character newlines. So here is a new definitive
> > version. Oog, now my brain hurts:
>
> I dunno what I was thinking. That version sucked! Here's a version
> that's actually comprehensible, a fraction of the size, and works in
> all cases. (I think.)
>
> def fileLineIter(inputFile, newline='\n', leaveNewline=False,
>                  readSize=8192):
>     """Like the normal file iter but you can set what string indicates newline.
>
>     The newline string can be arbitrarily long; it need not be restricted to a
>     single character. You can also set the read size and control whether or not
>     the newline string is left on the end of the iterated lines. Setting
>     newline to '\0' is particularly good for use with an input file created with
>     something like "os.popen('find -print0')".
>     """
>     outputLineEnd = ("", newline)[leaveNewline]
>     partialLine = ''
>     while True:
>         charsJustRead = inputFile.read(readSize)
>         if not charsJustRead: break
>         lines = (partialLine + charsJustRead).split(newline)
The above line is prepending a short string to what will typically be a
whole buffer full. There's gotta be a better way to do it. Perhaps you
might like to refer back to CdV's solution which was prepending the
residue to the first element of the split() result.
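
For illustration only, here is a minimal sketch of that "prepend the residue to the first piece" idea. It is not CdV's actual code, and the names iter_records, sep and read_size are made up for this example. It assumes a single-character separator such as '\0'; a multi-character separator could straddle two reads and would need extra handling that this sketch does not attempt.

```python
import io


def iter_records(input_file, sep='\0', read_size=8192):
    """Yield sep-separated records from input_file (hypothetical helper).

    Assumption: sep is exactly one character, so it can never be split
    across two consecutive reads.
    """
    if len(sep) != 1:
        raise ValueError("this sketch only handles a one-character separator")
    residue = ''
    while True:
        chunk = input_file.read(read_size)
        if not chunk:
            break
        pieces = chunk.split(sep)
        # Glue the short leftover onto the first piece only, rather than
        # building residue + chunk and splitting the combined string.
        pieces[0] = residue + pieces[0]
        # The last piece may be incomplete; hold it back for the next read.
        residue = pieces.pop()
        for piece in pieces:
            yield piece
    if residue:
        yield residue


demo = io.StringIO("a\0bc\0d")
print(list(iter_records(demo, read_size=2)))  # ['a', 'bc', 'd']
```

The tiny read_size in the demo is only there to force records to span several reads.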
>         partialLine = lines.pop()
>         for line in lines: yield line + outputLineEnd
In the case of leaveNewline being false, you are concatenating an empty
string. IMHO, to quote Jon Bentley, one should "do nothing gracefully".
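
One way to read that advice, as a rough sketch: decide once whether anything needs appending, and in the "nothing" case pass the strings through untouched. The helper name with_line_ends and its parameters are invented for this example and are not part of the quoted function.

```python
def with_line_ends(lines, newline='\n', leave_newline=False):
    """Yield each line, re-attaching newline only when asked to.

    Hypothetical helper: lines is any iterable of strings that have
    already had their separator stripped.
    """
    if not leave_newline:
        # Nothing to add, so do nothing: no concatenation at all.
        for line in lines:
            yield line
    else:
        for line in lines:
            yield line + newline
```

For example, list(with_line_ends(['a', 'b'], newline='\0', leave_newline=True)) gives ['a\0', 'b\0'], while the default call hands back 'a' and 'b' unchanged.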
>     if partialLine: yield partialLine
>
> |>oug