when does newlines get set in universal newlines mode?

Chris Angelico rosuav at gmail.com
Mon May 4 08:13:31 EDT 2015


On Mon, May 4, 2015 at 10:01 PM, Peter Otten <__peter__ at web.de> wrote:
> I tried:
>
>>>> with open("tmp.txt", "wb") as f: f.write("alpha\r\nbeta\rgamma\n")
> ...
>>>> f = open("tmp.txt", "rU")
>>>> f.newlines
>>>> f.readline()
> 'alpha\n'
>>>> f.newlines
> # expected: '\r\n'
>>>> f.readline()
> 'beta\n'
>>>> f.newlines
> '\r\n' # expected: ('\r', '\r\n')
>>>> f.readline()
> 'gamma\n'
>>>> f.newlines
> ('\r', '\n', '\r\n')
>
> I believe this is a bug.

I'm not sure it is, actually. Imagine the text is coming in one
character at a time (e.g. from a pipe), and the reader has seen
"alpha\r". It knows that this is a line, so it emits it; but until the
next character is read, it can't know whether the line ending is \r or
\r\n. What should it do? Read another character, which might block?
Put "\r" into .newlines, which might be wrong? Once it sees the \n, it
knows that the ending was \r\n (or rather, it assumes that files do not
have lines of text terminated by \r followed by blank lines terminated
by \n - because that would be stupid).
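
The ambiguity can be reproduced by hand. This is only a sketch, and it
assumes Python 3's io.IncrementalNewlineDecoder rather than the file
object in the quoted session (which appears to be Python 2 and differs
in detail). The chunk boundaries are chosen for illustration. In this
decoder a trailing \r is held back instead of being emitted, so
.newlines stays None until the next chunk settles the question:

```python
import io

# decoder=None: the input is already str, so no byte decoding happens.
# translate=True: every line ending is mapped to "\n", as in universal
# newlines mode.
dec = io.IncrementalNewlineDecoder(None, True)

# Hypothetical chunking: each of the first two chunks ends in a bare "\r",
# which is exactly the case where the ending cannot yet be classified.
seen = []
for chunk in ["alpha\r", "\nbeta\r", "gamma\n"]:
    text = dec.decode(chunk)
    seen.append((text, dec.newlines))

for text, newlines in seen:
    print(repr(text), repr(newlines))
```

After the first chunk nothing has been recorded; '\r\n' only shows up
once the "\n" has arrived, and '\r' only once "gamma" proves that the
second "\r" stood alone.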

It may be worth documenting this limitation, but it's not something
that can easily be fixed without removing support for \r newlines -
although that might be an option, given that Macs running classic
(pre-OS X) Mac OS, the main source of \r-only files, are basically
history now.

ChrisA


