[Python-Dev] [Python-3000] Universal newlines support in Python 3.0

Paul Moore p.f.moore at gmail.com
Sun Aug 12 18:58:44 CEST 2007


On 11/08/07, Guido van Rossum <guido at python.org> wrote:
> On 8/11/07, Tony Lownds <tony at pagedna.com> wrote:
> > Is this ok: when newline='\r\n' or newline='\r' is passed, only that
> > string is used to determine the end of lines. No translation to '\n'
> > is done.
>
> I *think* it would be more useful if it always returned lines ending
> in \n (not \r\n or \r). Wouldn't it? Although this is not how it
> currently behaves; when you set newline='\r\n', it returns the \r\n
> unchanged, so it would make sense to do this too when newline='\r'.
> Caveat user I guess.

Neither this wording nor the PEP is clear to me, but I'm
assuming/hoping that there will be a way to spell the current
behaviour for universal newlines on input[1], namely that files can
have *either* bare \n, *or* the combination \r\n, to delimit lines.
Whichever is used (I have no need for mixed-style files) gets
translated to \n so that the program sees the same data regardless.

[1] ... at least the bit I care about :-)

This behaviour is immensely useful for uniform treatment of Windows
text files, which are an inconsistent mess of \n-only and \r\n
conventions.

Specifically, I'm looking to replicate this behaviour:

>xxd crlf
0000000: 610d 0a62 0d0a                           a..b..

>xxd lf
0000000: 610a 620a                                a.b.

>python
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> open('crlf').read()
'a\nb\n'
>>> open('lf').read()
'a\nb\n'
>>>

As demonstrated, this is the default in Python 2.5. I'd hope it was so
in 3.0 as well.
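
For reference, here is a minimal sketch of how the same check could be written, assuming the `newline` parameter semantics documented for the built-in `open()` in released Python 3 versions, where `newline=None` (the default) translates line endings to \n on input. The helper names `read_text_universal` and `write_bytes` are made up for illustration, and the two temporary files stand in for the `crlf` and `lf` files shown above.

```python
import os
import tempfile


def read_text_universal(path):
    """Read a text file, asking for line endings to be translated to '\\n'."""
    # newline=None is the default; it is spelled out here only to make
    # the request for universal-newline translation explicit.
    with open(path, "r", newline=None, encoding="ascii") as f:
        return f.read()


def write_bytes(directory, name, data):
    """Write raw bytes so that no newline translation happens on output."""
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(data)
    return path


with tempfile.TemporaryDirectory() as d:
    # Same byte content as the xxd dumps of the two sample files.
    crlf_path = write_bytes(d, "crlf", b"a\r\nb\r\n")
    lf_path = write_bytes(d, "lf", b"a\nb\n")

    crlf_text = read_text_universal(crlf_path)
    lf_text = read_text_universal(lf_path)

    print(repr(crlf_text))
    print(repr(lf_text))
```

Note that under this mode a bare \r is also treated as a line ending and translated, which is slightly more than the \n and \r\n handling asked for here.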

Sorry I can't test this for myself - I don't have the time/toolset to
build my own Py3k on Windows...

Paul.

