From garrytre at bigpond.com Wed Feb 1 09:04:23 2012
From: garrytre at bigpond.com (Garry Trethewey)
Date: Wed, 01 Feb 2012 18:34:23 +1030
Subject: [sapug] linux/win difference
Message-ID: <4F28F207.7070805@bigpond.com>
Hi all
I've got this irritating problem that I can't imagine can exist. I'm
doing a small conversion to a gpx file that is produced by a Windows
program & gets used by another Windows program.
this code:-
LF_CR = chr(13) + chr(10)
oldStrWin = '' + LF_CR + ''
newStrWin = '' + LF_CR + '' + LF_CR * 2 + '' + LF_CR + ''
outStr = inStr.replace(oldStrWin, newStrWin)
- works as I'd expect in linux, but in Win Vista or Win 7 it doesn't.
So result in win vista & 7 :-
Results in linux :-
I have no idea what other info is relevant, but I'm happy to provide it.
thanks in anticipation
------------------------------------
Garry Trethewey
------------------------------------
From dt-sapug at handcraftedcomputers.com.au Wed Feb 1 11:20:41 2012
From: dt-sapug at handcraftedcomputers.com.au (Daryl Tester)
Date: Wed, 01 Feb 2012 20:50:41 +1030
Subject: [sapug] linux/win difference
In-Reply-To: <4F28F207.7070805@bigpond.com>
References: <4F28F207.7070805@bigpond.com>
Message-ID: <4F2911F9.6020706@handcraftedcomputers.com.au>
(* blows dust off mailing list *)
On 01/02/12 18:34, Garry Trethewey wrote:
> LF_CR = chr(13) + chr(10)
That's actually CR_LF (chr(13) is carriage return, chr(10) is linefeed). Not
that it affects anything, I'm just being semantically picky - feel free to
ignore this (although it may affect below).
> outStr = inStr.replace(oldStrWin, newStrWin)
> I have no idea what other info is relevant, but I'm happy to provide it.
Can you print the contents of the failing-to-be-converted inStr (preferably
with "print repr(inStr)")? I guess it's "obviously" not matching in some
fashion.
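A minimal sketch of why repr() helps here. The sample strings are made up for illustration and are not taken from Garry's file. Two strings that look the same when printed can still differ in their line endings, and repr() shows the difference:

```python
# Hypothetical sample data; the real inStr comes from the gpx file.
with_crlf = 'first line\r\nsecond line'
with_lf = 'first line\nsecond line'

# repr() shows control characters as escape sequences.
print(repr(with_crlf))
print(repr(with_lf))

# A search string containing CR LF only matches the first form.
crlf = chr(13) + chr(10)
print(crlf in with_crlf)
print(crlf in with_lf)
```

If the search string holds CR LF but the data holds only LF, str.replace() finds nothing and returns the input unchanged.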
Cheers.
--
Regards,
Daryl Tester
Handcrafted Computers Pty. Ltd.
From twegener at fastmail.fm Wed Feb 1 12:25:08 2012
From: twegener at fastmail.fm (Tim Wegener)
Date: Wed, 01 Feb 2012 21:55:08 +1030
Subject: [sapug] linux/win difference
In-Reply-To: <4F28F207.7070805@bigpond.com>
References: <4F28F207.7070805@bigpond.com>
Message-ID: <4F292114.3030605@fastmail.fm>
Hi Garry,
On 01/02/12 18:34, Garry Trethewey wrote:
> I've got this irritating problem that I can't imagine can exist. I'm
> doing a small conversion to a gpx file that is produced by a Windows
> program & gets used by another Windows program.
This is almost certainly because you are opening the file in text mode
(the default).
I.e. open('blah.txt') is equivalent to open('blah.txt', 'r') and
open('blah.txt', 'rt').
Compare with open('blah.txt', 'rb'), which is binary mode, and
open('blah.txt', 'rU'), which is universal newline mode.
On Windows, text mode converts each CR-LF line ending to a single LF ('\n')
when reading and converts '\n' back to CR-LF when writing, whereas binary
mode leaves things exactly as they are.
On Linux, text mode behaves like binary mode, i.e. no magic.
With universal newline mode all the various platform newline combos are
normalised to '\n' when reading.
For a file opened in universal newline mode, the file object attribute
somefile.newlines gives the line ending encountered (or a tuple of the
distinct line endings encountered if there are different types present).
See also 'pydoc file' and 'pydoc open'.
Here are some examples from the REPL to make it clear:
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.chdir('c:\\Python27')
>>> open('NEWS.txt', 'r').read(60)
"Python News\n+++++++++++\n\n\nWhat's New in Python 2.7.2?\n======"
>>> open('NEWS.txt', 'rt').read(60)
"Python News\n+++++++++++\n\n\nWhat's New in Python 2.7.2?\n======"
>>> open('NEWS.txt', 'rb').read(60)
"Python News\r\n+++++++++++\r\n\r\n\r\nWhat's New in Python 2.7.2?\r\n="
>>> f = open('NEWS.txt', 'rU')
>>> f.read(60)
"Python News\n+++++++++++\n\n\nWhat's New in Python 2.7.2?\n======"
>>> f.newlines
'\r\n'
(YMMV with Python 3, it seems to use universal newline mode by
default according to the docs, but I haven't tried it out. See 'pydoc3
open' for further details if necessary.)
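A minimal sketch of the same comparison that does not depend on the platform. It assumes io.open and its newline parameter (available from Python 2.6, and the same function as the built-in open in Python 3) rather than the 'rU' mode used above. The temporary file and its contents are made up for illustration:

```python
import io
import os
import tempfile

# Write a small file with Windows (CR LF) line endings, byte for byte.
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, 'wb') as f:
    f.write(b'one\r\ntwo\r\n')

# Binary mode: the bytes come back exactly as stored.
with io.open(path, 'rb') as f:
    raw = f.read()

# Text mode with newline=None: universal newlines, every ending becomes '\n'.
with io.open(path, 'r', newline=None) as f:
    translated = f.read()
    seen = f.newlines  # the line ending(s) actually found in the file

# Text mode with newline='': endings are recognised but not translated.
with io.open(path, 'r', newline='') as f:
    untranslated = f.read()

os.remove(path)
print(repr(raw))
print(repr(translated))
print(repr(seen))
print(repr(untranslated))
```

Only the newline=None read loses the CR characters, so a search string built from chr(13) + chr(10) would match the other two results but not that one.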
The official documentation for this could do with some work. You asked a
good question.
For your situation you probably want to do something like:
old_html = '\n'
new_html = old_html + ' \n\n\n'
input_file = open(filename, 'rU')
updated_html = input_file.read().replace(old_html, new_html)
Then depending on your needs, either:
# Use line ending style from current platform.
updated_html = updated_html.replace('\n', os.linesep)
...or...
# Force Windows line ending style.
updated_html = updated_html.replace('\n', '\r\n')
BTW, the above example was performed running Python for Windows under Wine:
$ wine msiexec /i ~/Downloads/python-2.7.2.msi
$ wine .wine/drive_c/Python27/python.exe
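A minimal sketch of the "force Windows line ending style" option, written so the bytes on disk come out the same on every platform. The helper name to_crlf and the sample text are hypothetical, and the file is written through io.open with newline='' so that no further translation happens on output:

```python
import io
import os
import tempfile

def to_crlf(text):
    """Return text with every line ending forced to CR LF."""
    # Normalise to '\n' first so existing CR LF pairs are not doubled up.
    normalised = text.replace('\r\n', '\n').replace('\r', '\n')
    return normalised.replace('\n', '\r\n')

# Hypothetical input with mixed line endings.
updated = to_crlf(u'a\nb\r\nc\n')

fd, path = tempfile.mkstemp()
os.close(fd)

# newline='' switches off newline translation when writing.
with io.open(path, 'w', newline='', encoding='utf-8') as f:
    f.write(updated)

# Read the raw bytes back to check what actually landed on disk.
with io.open(path, 'rb') as f:
    on_disk = f.read()

os.remove(path)
print(repr(on_disk))
```

The reason for turning translation off: writing a string that already contains '\r\n' through a translating text-mode file on Windows expands each '\n' again, giving '\r\r\n'.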
HTH,
Tim
From garrytre at bigpond.com Thu Feb 2 01:33:33 2012
From: garrytre at bigpond.com (Garry Trethewey)
Date: Thu, 02 Feb 2012 11:03:33 +1030
Subject: [sapug] linux/win difference
In-Reply-To: <4F28F207.7070805@bigpond.com>
References: <4F28F207.7070805@bigpond.com>
Message-ID: <4F29D9DD.30104@bigpond.com>
On 01/02/12 18:34, Garry Trethewey wrote:
>
> this code:-
>
> LF_CR = chr(13) + chr(10)
> oldStrWin = '' + LF_CR + ''
> newStrWin = '' + LF_CR + '' + LF_CR * 2 + '' + LF_CR + ''
> outStr = inStr.replace(oldStrWin, newStrWin)
>
> - works as I'd expect in linux, but in Win Vista or Win 7 it doesn't.
>
> So result in win vista & 7 :-
>
>
>
> Results in linux :-
>
>
>
>
>
Thanks both for that.
New code works :-
if osName == 'posix':
    NL = chr(13) + chr(10)
if osName == 'nt':
    NL = '\n' # or chr(10)
# d'oh!
# Had lots of trouble with win CR_LF **not** being CR_LF
# see emails
# Tim Wegener linux/20120201-2155
# Daryl Tester linux/20120201_2050
# for more options
oldStrWin = '' + NL + ''
newStrWin = '' + NL + '' + NL *2 + '' + NL + ''
outStr = inStr.replace(oldStrWin, newStrWin)
"It" (is that win or python?) converts chr(13) + chr(10) to chr(10) ie
\n as it reads, then converts back to chr(13) + chr(10) as it writes. So
without "print repr(inStr)" I'd never have seen the prob.
Yep, file opened in text mode. inFile = open(inFileName, "r")
The discussion of all the opening modes is something I never thought I'd
have a use for, only ever using txt files. But now I'm moving into
windows, I'll keep it for probs I never had before.
Thanks again
------------------------------------
Garry Trethewey
------------------------------------