BUG? Re: stripping cr/lf, lf, cr

David Bolen db3l at fitlinxx.com
Tue Apr 3 15:01:04 EDT 2001


"deadmeat" <root@[127.0.0.1]> writes:

> os.linesep is \015\012 on Windows.

Which is literally correct for that platform.  Note, however, that
depending on how you open a file, library routines (typically the
platform's C RTL, not Python itself) may translate this physical EOL
indicator into a single NL internally.  That's the difference between
a text open and a binary open under Windows, for example.

> I seem to have found the real problem: readline()/readlines() returns the
> string(s) with only \012 on it, not both or none. I guess it's because
> DOS/Win use two bytes not one like Unix/Mac for EOL, making EOL a tad harder
> since it requires peeking ahead a byte to see if it's \012 or not.

The translation is actually occurring within the C library.  It also
occurs on the way out, so a NL (\012) written to a text file is
expanded to CRLF (\015\012).
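
As a rough illustration of the outbound translation, here is a minimal
sketch in present-day Python 3 (an assumption; it is not the Python of
this thread, and there the translation is done by Python's own I/O
layer rather than the C RTL).  The newline="\r\n" argument is used to
mimic a Windows text-mode write so the result is the same on any
platform, and the file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Assumption: newline="\r\n" makes Python 3 expand each "\n" the way a
# Windows text-mode write would, regardless of the host platform.
with open(path, "w", newline="\r\n") as f:
    f.write("first\nsecond\n")      # the program only ever writes "\n"

# Read the file back in binary mode to see the bytes actually stored.
with open(path, "rb") as f:
    raw_written = f.read()

print(repr(raw_written))
```

Each "\n" written by the program should show up as the two bytes
\015\012 in the stored data.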

If you open the file in binary mode (no translation; add "b" to the
mode option, as in "file = open('filename','rb')"), then you'll find
the actual \015\012 at the end of each string.
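
To make the difference concrete, here is a small sketch that reads the
same CRLF file both ways.  It assumes present-day Python 3, where text
mode translates CRLF to "\n" on every platform by default; the file
name and contents are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "crlf.txt")

# Store Windows-style line endings byte for byte.
with open(path, "wb") as f:
    f.write(b"alpha\r\nbeta\r\n")

# Text mode: each CRLF arrives as a single "\n".
with open(path, "r") as f:
    text_lines = f.readlines()

# Binary mode: no translation, so \015\012 is still on each line.
with open(path, "rb") as f:
    binary_lines = f.readlines()

print(text_lines)
print(binary_lines)
```

The text-mode strings end in "\n" only, which matches what
readline()/readlines() returned in the original report, while the
binary-mode strings keep both bytes.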

One advantage to this is that if you do work with text files, leaving
the open in text mode means that you can portably use "\n" within your
code as a line ending for both reading and writing.
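
A minimal round-trip sketch of that idea, assuming present-day Python 3
and a hypothetical file name: the code only ever mentions "\n", and
the text-mode open deals with whatever the platform stores on disk.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "portable.txt")

lines_out = ["red", "green", "blue"]

# Text-mode write: "\n" becomes the platform's physical line ending.
with open(path, "w") as f:
    for item in lines_out:
        f.write(item + "\n")

# Text-mode read: the physical line ending comes back as "\n",
# so stripping "\n" is enough on any platform.
with open(path) as f:
    lines_in = [line.rstrip("\n") for line in f]

print(lines_in)
```

The same source should give back the original strings whether the file
on disk uses LF or CRLF.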

(Having just written that, I realize I'm not positive this is true on
the Mac platform, so if someone could chime in on that it'd be
appreciated.  It's definitely portable between Windows and Unix.)


--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/


