Reading in cooked mode (was Re: Python MSI not installing, log file showing name of a Vietnamese communist revolutionary)

Mark H Harris harrismh777 at gmail.com
Sun Mar 23 23:48:49 EDT 2014


On 3/23/14 10:17 PM, Chris Angelico wrote:
> Newline style IS relevant. You're saying that this will copy a file perfectly:
>
> out = open("out", "w")
> for line in open("in"):
>      out.write(line)
>
> but it wouldn't if the iteration and write stripped and recreated
> newlines? Incorrect, because this version will collapse \r\n into \n.
> It's still a *text file copy*. (And yes, I know about 'with'. Shut
> up.) It's idempotent, not byte-for-byte perfect.

Which was my point in the first place about newline standards. We all 
know why it's important to collapse \r\n into \n, but why, in a 
general way, would this be the universally desired end? (rhetorical) Your 
example is one good case: the copy is not byte-for-byte perfect. 
Another might be controller code (maybe ancient) where the \r is 
'required' and collapsing it to \n won't work on the device (tty or 
other).
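
For the cases where the \r has to survive, a minimal sketch (assuming 
Python 3; the helper names copy_bytes and roundtrip are made up for 
illustration) is to copy in binary mode, so no newline translation 
happens on either the read or the write side:

```python
import os
import tempfile


def copy_bytes(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in binary mode; line endings are never touched."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fin.read(chunk_size), b""):
            fout.write(chunk)


def roundtrip(data):
    """Write data to a temp file, copy it, and return the copied bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in")
        dst = os.path.join(tmp, "out")
        with open(src, "wb") as f:
            f.write(data)
        copy_bytes(src, dst)
        with open(dst, "rb") as f:
            return f.read()


original = b"line one\r\nline two\r\n"
copied = roundtrip(original)
```

Here copied compares equal to original, \r included, which is what a 
device that 'requires' the carriage return would need.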

There does need to be a text file standard for the case where what is 
desired is a file of "lines". Iterating over the file object should 
return the "lines" on any platform, without the user being required to 
strip off the line-end delimiter (newline, \n, U+000A). The delimiter 
does not matter.
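
As a rough sketch of that idea (assuming Python 3; iter_lines is a 
hypothetical helper, not something the file object provides), a small 
generator can hand back the "lines" with whatever delimiter was present 
already stripped:

```python
import io


def iter_lines(fileobj):
    """Yield each line of fileobj without its trailing \\r and/or \\n."""
    for line in fileobj:
        # rstrip("\r\n") removes any trailing run of \r and \n characters,
        # so it copes with \n and \r\n endings alike.
        yield line.rstrip("\r\n")


# io.StringIO does no newline translation by default, so the \r\n below
# reaches the generator intact.
sample = io.StringIO("first\r\nsecond\nthird")
lines = list(iter_lines(sample))
```

The caller then sees only the content of each line, whichever 
platform's delimiter the file happened to use.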

What Python has done by collapsing \r\n into \n is to hide the real 
problem (nonstandard delimiters between platforms) and, in the process, 
actually 'remove' possibly important information (\r). {lossy}
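
To make the lossy part concrete, here is a small sketch (assuming 
Python 3, where open() accepts a newline argument; read_lines and demo 
are illustrative names). The default text mode translates \r\n to \n on 
read, while newline="" still splits into lines but leaves the endings 
alone:

```python
import os
import tempfile


def read_lines(path, newline=None):
    """Return the file's lines as a list, using the given newline mode."""
    with open(path, "r", newline=newline, encoding="ascii") as f:
        return list(f)


def demo():
    """Read the same CRLF file with and without newline translation."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "crlf.txt")
        # Write raw bytes so the \r\n endings are on disk on any platform.
        with open(path, "wb") as f:
            f.write(b"one\r\ntwo\r\n")
        translated = read_lines(path)              # default: \r\n becomes \n
        preserved = read_lines(path, newline="")   # endings left untouched
    return translated, preserved


translated, preserved = demo()
```

With the default, the \r is gone by the time the program sees the line; 
with newline="" it is still there.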

We don't really use real tty devices any longer which require one code 
to bring the print head carriage back (\r) and one code to index the 
paper platen (\n). Screen I/O doesn't work that way any longer either. 
It's time to standardize the newline and/or text file line-end delimiters.

marcus


