[New-bugs-announce] [issue17083] can't specify newline string for readline for binary files

Bryant report at bugs.python.org
Wed Jan 30 19:32:05 CET 2013


New submission from Bryant:

When a file is opened in binary mode in Python 3, the newline parameter cannot be set. While this kind of makes sense, readline() can still be called on binary files. That would be great for my usage, but I believe it behaves like universal newlines mode, so that any \r, \n, or \r\n ends the line.

The data I'm working with is mixed ASCII/binary, with lines terminated by \r\n. I can't read a line with readline() (even though the concept of a line exists in my file) because some of the binary data contains \r or \n bytes that aren't line endings in this context.
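
A minimal sketch of the problem, using made-up sample bytes and io.BytesIO as a stand-in for a real file opened in binary mode (both are assumptions for illustration, not my actual data):

```python
import io

# Made-up sample: three records terminated by b"\r\n". The middle record
# carries a raw 0x0A byte as binary payload, not as a line ending.
data = b"HEADER\r\n\x00\n\x01\r\nTAIL\r\n"

stream = io.BytesIO(data)
pieces = []
while True:
    piece = stream.readline()
    if not piece:  # empty bytes object signals end of stream
        break
    pieces.append(piece)

for piece in pieces:
    print(piece)
```

The middle record comes back in two pieces, because readline() stops at the 0x0A byte inside the payload rather than at the \r\n terminator.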

The issue here is that if the newline string can't be specified, readline() is of little use on binary data, which often uses custom EOL strings. So would it be reasonable to add newline parameter support to binary files? If not, then shouldn't readline() raise an exception when used on binary files?

I don't know if it's helpful here, but I've written a binary_readline() function supporting arbitrary EOL strings:

def binary_readline(file, newline=b'\r\n'):
    # Read one byte at a time until the data read so far ends with the
    # newline string. The returned line includes the newline string.
    line = bytearray()
    while True:
        x = file.read(1)
        if not x:
            # End of file: return what was read, or None if nothing was read.
            if len(line) == 0:
                return None
            else:
                return line
        line += x
        # Checking the tail of the buffer after every byte avoids reading
        # past the terminator and handles partial matches such as \r\r\n.
        if line.endswith(newline):
            return line
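
Reading one byte per call is slow on large files. As an alternative sketch, the same idea can be done with chunked reads and a buffer. The name iter_records and its chunk_size parameter are hypothetical, not part of any existing API, and io.BytesIO again stands in for a real binary file:

```python
import io


def iter_records(file, newline=b"\r\n", chunk_size=4096):
    """Yield records (terminator included) from a binary file object."""
    if not newline:
        raise ValueError("newline must be a non-empty bytes object")
    buffer = bytearray()
    while True:
        chunk = file.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        start = 0
        while True:
            end = buffer.find(newline, start)
            if end == -1:
                break
            end += len(newline)
            yield bytes(buffer[start:end])
            start = end
        # Keep only the unterminated tail; a terminator split across two
        # chunks is found once the next chunk has been appended.
        del buffer[:start]
    if buffer:
        # Trailing data without a terminator is yielded as a final record.
        yield bytes(buffer)


# Made-up sample data; a tiny chunk size forces terminators to straddle chunks.
sample = io.BytesIO(b"HEADER\r\n\x00\n\x01\r\nTAIL")
records = list(iter_records(sample, chunk_size=3))
print(records)
```

Unlike binary_readline(), this reads ahead of the current record, so it only suits cases where nothing else reads from the same file object in between.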

----------
components: Library (Lib)
messages: 180984
nosy: susurrus
priority: normal
severity: normal
status: open
title: can't specify newline string for readline for binary files
type: behavior
versions: Python 3.3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17083>
_______________________________________

