Strange behaviour with os.linesep

Jason Swails jason.swails at gmail.com
Tue Jul 23 08:39:59 EDT 2013


On Tue, Jul 23, 2013 at 7:42 AM, Vincent Vande Vyvre <
vincent.vandevyvre at swing.be> wrote:

> On Windows, a script whose line endings are the system line separator
> opens double-spaced in Eric4, Notepad++ or Gedit, but is displayed
> correctly in MS Notepad.
>
> Example with this code:
> ----------------------------------------------
> # -*- coding: utf-8 -*-
>
> import os
> L_SEP = os.linesep
>
> def write():
>     strings = ['# -*- coding: utf-8 -*-\n',
>                 'import os\n',
>                 'import sys\n']
>     with open('writetest.py', 'w') as outf:
>         for s in strings:
>             outf.write(s.replace('\n', L_SEP))
>

I must ask why you are building strings that end in a newline only to
replace that newline later with os.linesep.  This seems convoluted compared
to doing something like

def write():
    strings = ['#-*- coding: utf-8 -*-', 'import os', 'import sys']
    with open('writetest.py', 'w') as outf:
        for s in strings:
            outf.write(s)
            outf.write(L_SEP)

Or something equivalent.
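
One such equivalent, sketched here for Python 3, is to write a plain '\n'
and let text mode supply the platform ending.  As far as I know, a file
opened with mode 'w' translates each '\n' to os.linesep on write, which is
why the Python documentation advises against writing os.linesep yourself in
text mode.  The function name write_header and its path argument are only
for illustration.

```python
def write_header(path):
    # Lines are kept without any line ending; one is added at write time.
    strings = ['# -*- coding: utf-8 -*-', 'import os', 'import sys']
    with open(path, 'w') as outf:
        for s in strings:
            # Text mode turns this '\n' into the platform's line ending.
            outf.write(s + '\n')
    return strings
```

Reading the file back in text mode should give the same three lines on any
platform.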

If, however, the source strings come from a file you've created somewhere
(and are loaded by reading in that file line by line), then I can see a
problem.  DOS line endings are a carriage return followed by a newline
('\r\n'), whereas standard UNIX files use just a newline ('\n').  Therefore,
if you are using the code:

s.replace('\n', L_SEP)

on Windows, with a Windows-generated file, then what you are likely doing
is converting the sequence '\r\n' into '\r\r\n', which is not what you
want.  I can imagine some text editors interpreting that as two line
endings (since there are 2 \r's).  Indeed, when I execute the code:

>>> l = open('test.txt', 'w')
>>> l.write('This is the first line\r\r\n')
>>> l.write('This is the second\r\r\n')
>>> l.close()

on UNIX and open the resulting file in gedit, it is double-spaced, but if I
just dump it to the screen using 'cat', it is single-spaced.
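
The effect is easy to reproduce without touching a file.  The snippet below
is only an illustration: it applies the same replace() to a line that
already ends in '\r\n', then uses str.splitlines(), which treats a lone
'\r' as a line break, as a stand-in for how a lenient editor might count
lines.  WINDOWS_SEP is a name I made up so the example runs on any platform.

```python
WINDOWS_SEP = '\r\n'  # the value of os.linesep on Windows, hard-coded here

line = 'import os\r\n'                     # a line that already has a DOS ending
doubled = line.replace('\n', WINDOWS_SEP)  # the '\n' inside '\r\n' is replaced again
print(repr(doubled))                       # 'import os\r\r\n'

text = doubled * 2
print(text.splitlines())                   # ['import os', '', 'import os', '']
```

The empty strings in the second result are the extra "blank lines" that show
up as double spacing.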

If you want to make your code a bit more cross-platform, you should strip
any trailing line-ending characters from the strings before you write
them.  So something like this:

with open('writetest.py', 'w') as outf:
    for s in strings:
        outf.write(s.rstrip('\r\n'))
        outf.write(L_SEP)
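
Here is a slightly fuller sketch of that idea for Python 3.  It assumes you
want exact control over the characters written, so it opens the file with
newline='' to switch off text-mode newline translation; without that,
writing '\r\n' on Windows would, as far as I know, come out as '\r\r\n'.
The helper name write_lines and its parameters are made up for this example.

```python
import os

def write_lines(path, lines, sep=os.linesep):
    """Write each line followed by exactly one `sep`, whatever ending it had."""
    # newline='' disables newline translation, so `sep` is written verbatim.
    with open(path, 'w', newline='') as outf:
        for s in lines:
            # rstrip('\r\n') removes any trailing run of '\r' and '\n' characters.
            outf.write(s.rstrip('\r\n'))
            outf.write(sep)
```

Called as write_lines('writetest.py', ['import os\r\n', 'import sys\n']),
this should give every line the same ending regardless of where the input
came from.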

Hope this helps,
Jason