[Python-Dev] New lines, carriage returns, and Windows

Guido van Rossum guido at python.org
Thu Sep 27 00:04:26 CEST 2007


On 9/26/07, Dino Viehland <dinov at exchange.microsoft.com> wrote:
> We ran into an interesting user-reported issue with IronPython and the way Python writes to files, and I thought I'd get python-dev's opinion.
>
> When writing a string that contains \r\n in text mode, both CPython and IronPython write \r\r\n, because the default write mode replaces \n with \r\n.  This works great as long as you stay entirely within the Python world: because Python uses \n for everything internally, you'll never end up writing out a \r\n that gets transformed into \r\r\n.  But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n.  Ultimately we see odd behavior when round-tripping the contents of a multi-line text box through a file.
>
> So today users have to be aware of the fact that Python internally always uses \n.  They also need to be aware of any APIs they call that might return a string with an embedded \r\n, and transform such strings back into the Python form.
>
> It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n.  Ultimately that creates a file whose line endings aren't good on any platform.  On the other hand, it could also be argued that Python defines newlines as \n and there should be no deviation from that.  Deviating could be considered a slippery slope: first the file object deals with it, next the standard libraries, etc.  Finally, this might break some apps, and if we changed IronPython to behave differently we could introduce incompatibilities, which we don't want.
>
> So I'm curious: Is there a reason this behavior is useful that I'm missing?

No, it is simply the way Microsoft's C stdio library works. :-(

A simple workaround would be to apply s.replace('\r', '') before
writing anything, of course.
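
A minimal sketch of the problem and the workaround, for illustration only.
To make it reproducible on any platform, it emulates Windows text mode with
io.TextIOWrapper(newline="\r\n") over an in-memory buffer instead of a real
file; the helper name write_windows_text is hypothetical. It also shows a
narrower variant, s.replace('\r\n', '\n'), which is an assumption about what
some callers may prefer, since it keeps a lone \r that the simpler workaround
would drop.

```python
import io


def write_windows_text(s):
    # Hypothetical helper: encode s as a Windows text-mode file would,
    # translating every "\n" written into "\r\n".
    raw = io.BytesIO()
    f = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    f.write(s)
    f.flush()
    return raw.getvalue()


text = "one\r\ntwo\n"  # e.g. a string handed back by native or .NET code

# Unmodified: the "\n" inside "\r\n" is translated again.
doubled = write_windows_text(text)

# The workaround from this message: drop every "\r" first.
stripped = write_windows_text(text.replace("\r", ""))

# Narrower variant (assumption): only collapse "\r\n" pairs.
normalized = write_windows_text(text.replace("\r\n", "\n"))

print(doubled)     # b'one\r\r\ntwo\r\n'
print(stripped)    # b'one\r\ntwo\r\n'
print(normalized)  # b'one\r\ntwo\r\n'
```

The two variants differ only for strings containing a \r that is not
followed by \n: 'a\rb\n'.replace('\r', '') gives 'ab\n', while
'a\rb\n'.replace('\r\n', '\n') leaves the string unchanged.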

> Would there be a possibility of (or objections to) leaving \r\n untransformed in the Py3k timeframe?  Or should we just tell our users to open files in binary mode? :)

Py3k supports a number of different ways of working with newlines for
text files, but not (yet) one that leaves \r\n alone while translating
a lone \n into \r\n. It may not be too late to change that though.
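
One way to approximate the requested behavior by hand, sketched under the
assumption of the newline argument as it exists in released Python 3, where
newline="" disables translation on write. The helper names to_crlf and
write_preserving_crlf are hypothetical, and this is not a built-in mode:
the text is translated explicitly with a regular expression that only
matches a \n not already preceded by \r.

```python
import io
import re

# Matches a "\n" that is not already part of a "\r\n" pair.
_LONE_LF = re.compile(r"(?<!\r)\n")


def to_crlf(s):
    # Hypothetical helper: turn lone "\n" into "\r\n", leave "\r\n" alone.
    return _LONE_LF.sub("\r\n", s)


def write_preserving_crlf(s):
    # newline="" tells the text layer not to translate anything on write,
    # so the bytes are exactly what to_crlf() produced.
    raw = io.BytesIO()
    f = io.TextIOWrapper(raw, encoding="ascii", newline="")
    f.write(to_crlf(s))
    f.flush()
    return raw.getvalue()


print(write_preserving_crlf("one\r\ntwo\n"))  # b'one\r\ntwo\r\n'
```

The same newline="" argument can be passed to open() for a real file; the
in-memory buffer is used here only to keep the example self-contained.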

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

