file.read problem

wscrsurfdude mark at holmes.nl
Fri Feb 17 04:21:11 EST 2006


>if it's a binary file, open with mode "rb".
You are right about opening it in "rb" mode (that was a mistake in my
first post), but even when I do this on Windows, a 0x0D is put in front
of every 0x0A. I found an explanation of why it works on Linux; it is
quoted below in this post.

What I take from it is that on Windows a 0x0D is put in front of every
0x0A as part of the line ending, so I have to get rid of these. But if
my original file already contains the binary data 0x0D0A, that 0x0D is
deleted as well. Does anyone have an idea?
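
A minimal sketch of one way around this, assuming (the original code is not
shown) that the extra 0x0D appears because the file is written, not only
read, in text mode. With "wb" and "rb" on both sides, Python does no newline
translation at all, so there is nothing to strip afterwards and a genuine
0x0D0A in the data survives. The file name and sample bytes are hypothetical.

```python
import os
import tempfile

# Hypothetical sample data: a lone 0x0A and a genuine 0x0D 0x0A pair.
payload = bytes([0x01, 0x0A, 0x02, 0x0D, 0x0A, 0x03])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")  # hypothetical file name

    # "wb": bytes are written as-is; no 0x0D is inserted before 0x0A.
    with open(path, "wb") as f:
        f.write(payload)

    # "rb": bytes are read back as-is; no newline translation happens.
    with open(path, "rb") as f:
        data = f.read()

# The round trip preserves every byte, on Windows as well as on Linux.
print(data == payload)
print(data.hex())
```

If the file reaches Windows by some other route (for example an FTP transfer
in ASCII mode), the same idea applies there: transfer it as binary.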

############################################
The whole subject of newlines and text files is a murky area of
non-standard implementation by different operating systems. These
differences have their roots in the early days of data communications
and the control of mechanical teleprinters. Basically there are 3
different ways to indicate a new line:

 1. Carriage Return (CR) character ('\r')
 2. Line Feed (LF) character ('\n')
 3. CR/LF pair ('\r\n')

All three techniques are used in different operating systems. MS DOS
(and therefore Windows) uses method 3. Unix (including Linux) uses
method 2. Apple in its original MacOS used method 1, but now uses
method 2 since MacOS X is really a variant of Unix.
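
As a small illustration of the three conventions, the sketch below
terminates the same three lines in each style and splits them with
str.splitlines(), which recognises all three terminators. The sample strings
and labels are made up for the example.

```python
# The same three lines, terminated in each of the three styles.
samples = {
    "CR    (original MacOS)": "one\rtwo\rthree\r",
    "LF    (Unix, Linux, MacOS X)": "one\ntwo\nthree\n",
    "CR/LF (MS DOS, Windows)": "one\r\ntwo\r\nthree\r\n",
}

for name, text in samples.items():
    # repr() makes the otherwise invisible control characters visible.
    print(name, repr(text), text.splitlines())
```

All three inputs split into the same list of three words; only the bytes
between them differ.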

So how can the poor programmer cope with this multiplicity of line
endings? In many languages she just has to do lots of tests and take
different action per OS. In more modern languages, including Python,
the language provides facilities for dealing with the mess for you. In
the case of Python the assistance comes in the form of the os module
which defines a variable called linesep which is set to whatever the
newline character is on the current operating system. This makes adding
newlines easy, and rstrip() takes account of the OS when it does its
work of removing them, so really the simple way to stay sane, so far as
newlines are concerned is: always use rstrip() to remove newlines from
lines read from a file and always add os.linesep to strings being
written to a file.
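
A minimal sketch of the strip-then-write routine in current Python 3, under
two stated assumptions. First, rstrip("\r\n") is used so that any trailing
mix of CR and LF is removed whichever OS the code runs on. Second, a file
opened in text mode already translates each "\n" written into os.linesep, so
the sketch writes "\n" and lets Python do the conversion (adding os.linesep
yourself is only appropriate for files opened in binary mode). The file name
and sample lines are hypothetical.

```python
import os
import tempfile

# Hypothetical input lines with mixed terminators.
lines = ["first line\r\n", "second line\n", "third line\r"]

# Remove any trailing run of CR and LF characters, independent of the OS.
stripped = [line.rstrip("\r\n") for line in lines]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")  # hypothetical file name

    # Text mode: each "\n" written is translated to os.linesep on disk.
    with open(path, "w", encoding="ascii") as f:
        for line in stripped:
            f.write(line + "\n")

    # Binary mode shows what actually landed on disk.
    with open(path, "rb") as f:
        raw = f.read()

# What we expect on disk: every line followed by the native terminator.
expected = "".join(line + os.linesep for line in stripped).encode("ascii")
print(stripped)
print(raw == expected)
```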

That still leaves the awkward situation where a file is created on one
OS and then processed on another, incompatible, OS and sadly, there
isn't much we can do about that except to compare the end of the line
with os.linesep to determine what the difference is.
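
The comparison described above can be sketched as follows. The helper name
detect_newline is hypothetical, not part of any library; it works on bytes
read in binary mode, so no translation has happened before the check.

```python
import os


def detect_newline(line: bytes) -> bytes:
    """Return the terminator at the end of `line`, or b"" if there is none."""
    # Test the two-byte CR/LF pair first so it is not mistaken for a bare LF.
    for ending in (b"\r\n", b"\n", b"\r"):
        if line.endswith(ending):
            return ending
    return b""


# The terminator of the OS this code is running on, as bytes.
native = os.linesep.encode("ascii")

for raw in (b"dos line\r\n", b"unix line\n", b"old mac line\r", b"no ending"):
    ending = detect_newline(raw)
    kind = "native" if ending == native else "foreign or missing"
    print(repr(raw), repr(ending), kind)
```

When the content is text rather than binary data, reading the file in Python
3's default text mode is often enough, since it converts all three endings
to "\n" on input.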
######################################



