\r\n or \n notepad editor end line ???
John Machin
sjmachin at lexicon.net
Mon Jun 13 20:26:18 EDT 2005
Steven D'Aprano wrote:
> On Mon, 13 Jun 2005 11:53:25 +0200, Fredrik Lundh wrote:
>
>
>><ajikoe at gmail.com> wrote:
>>
>>
>>>It means in windows we should use 'wb' to write and 'rb' to read ?
>>>Am I right?
>>
>>no.
>>
>>you should use "wb" to write *binary* files, and "rb" to read *binary*
>>files.
>>
>>if you're working with *text* files (that is, files that contain lines of text
>>separated by line separators), you should use "w" and "r" instead, and
>>treat a single "\n" as the line separator.
>
>
> I get nervous when I read instructions like this. It sounds too much like
> voodoo: "Do this, because it works, never mind how or under what
> circumstances, just obey or the Things From The Dungeon Dimensions will
> suck out your brain!!!"
>
> Sorry Fredrik :-)
>
Many people don't appear to want to know why; they only want a solution
to what they perceive to be their current problem.
> When you read a Windows text file using "r" mode, what happens to the \r
> immediately before the newline?
The thing to which you refer is not a "newline". It is an ASCII LF
character. The CR and the LF together are the physical representation
(in a Windows text file) of the logical "newline" concept.
Internally, LF is used (irrespective of platform) to represent that concept.
> Do you have to handle it yourself?
No.
> Or will
> Python cleverly suppress it so you don't have to worry about it?
Suppressed: no, it's a transformation from a physical line termination
representation to a logical one. Cleverly: matter of opinion. By Python:
In general, no -- the transformation is handled by the underlying C
run-time library.
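
A minimal sketch of that physical-to-logical transformation, assuming Python 3 (which postdates this thread; there the translation is done by Python's own io layer rather than the C run-time, and it applies on reading on every platform). The temporary file and the variable names are only illustrative.

```python
import os
import tempfile

# Create a file with Windows-style CRLF line endings, byte for byte.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"first line\r\nsecond line\r\n")

# Binary mode: the bytes come back untouched, CR included.
with open(path, "rb") as f:
    raw = f.read()

# Text mode: each CR LF pair arrives as a single "\n".
with open(path, "r", encoding="ascii") as f:
    text = f.read()

os.remove(path)

print(repr(raw))
print(repr(text))
```

The point is that code reading in text mode only ever has to look for "\n"; the CR never reaches it.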
>
> And when you write a text file under Python using "w" mode, will the
> people who come along afterwards to edit the file in Notepad curse your
> name?
If they do, it will not be because something other than CRLF was written
as the line terminator.
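
A small sketch of the write side, again assuming Python 3. Writing "\n" in text mode produces the platform's own terminator (os.linesep), so on Windows the bytes on disk are CR LF. The second half uses the newline="\r\n" argument to force CRLF on any platform, which is one way to produce a Notepad-friendly file from a non-Windows box. File and variable names are illustrative only.

```python
import os
import tempfile

content = "line one\nline two\n"

# Plain text mode: "\n" is written as the current platform's terminator.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="ascii") as f:
    f.write(content)
with open(path, "rb") as f:
    platform_bytes = f.read()

# newline="\r\n": "\n" is written as CR LF regardless of platform.
with open(path, "w", encoding="ascii", newline="\r\n") as f:
    f.write(content)
with open(path, "rb") as f:
    crlf_bytes = f.read()

os.remove(path)

print(repr(platform_bytes))
print(repr(crlf_bytes))
```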
> Notepad expects \r\n EOL characters, and gets cranky if the \r is
> missing.
AFAIR, it performs well enough for a text editor presented with a file
consisting of one long unterminated line with occasional embedded
meaningless-to-the-editor control characters. You can scroll it, edit
it, write it out again ... any crankiness is likely to be between the
keyboard and the chair :-)
>
> How does this behaviour differ from "universal newlines"?
>
Ordinary behaviour in text mode:
Win: \r\n -> newline -> \r\n
Mac OS before version 10 (classic Mac OS): \r -> newline -> \r
other box: \n -> newline -> \n
Note: TFM does appear a little light on in this area. I suppose not all
users of Python have acquired this knowledge by osmosis through decades
of usage of C on multiple platforms :-)
"Universal newlines":
On *any* box: \r\n or \n or \r (even a mixture) -> \n on reading
On writing, behaviour is "ordinary" i.e. the line terminator is what is
expected by the current platform
"Universal newlines" (if used) solves problems like where an other-boxer
FTPs a Windows text file in binary mode and then posts laments about all
those ^M things on the vi screen and :1,$s/^M//g doesn't work :-)
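
A hedged sketch of the reading rule above, assuming Python 3, where universal newlines is the default for text-mode reads (newline=None) and newline="" switches the translation off. An in-memory byte buffer stands in for a file with mixed line endings; the variable names are made up for the example.

```python
import io

# One buffer containing all three conventions.
mixed = b"unix\nwindows\r\nold mac\rlast\n"

# newline=None (the default): \r\n, \r and \n all come back as "\n".
translated = io.TextIOWrapper(
    io.BytesIO(mixed), encoding="ascii", newline=None
).read()

# newline="": no translation, the original terminators are preserved.
untouched = io.TextIOWrapper(
    io.BytesIO(mixed), encoding="ascii", newline=""
).read()

print(repr(translated))
print(repr(untouched))
```

The untranslated form is what the other-boxer in the FTP story ends up looking at: every \r is still there.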
HTH,
John