From garrytre at bigpond.com Wed Feb 1 09:04:23 2012
From: garrytre at bigpond.com (Garry Trethewey)
Date: Wed, 01 Feb 2012 18:34:23 +1030
Subject: [sapug] linux/win difference
Message-ID: <4F28F207.7070805@bigpond.com>

Hi all

I've got this irritating problem that I can't imagine can exist. I'm
doing a small conversion to a gpx file that is produced by a Windows
program & gets used by another Windows program.

this code:-

LF_CR = chr(13) + chr(10)
oldStrWin = '' + LF_CR + ''
newStrWin = '' + LF_CR + '' + LF_CR *2 + '' + LF_CR + ''
outStr = inStr.replace(oldStrWin, newStrWin)

- works as I'd expect in linux, but in Win Vista or Win 7 it doesn't.

So result in win vista & 7 :-

Results in linux :-

I have no idea what other info is relevant, but I'm happy to provide it.

thanks in anticipation

------------------------------------
Garry Trethewey
------------------------------------

From dt-sapug at handcraftedcomputers.com.au Wed Feb 1 11:20:41 2012
From: dt-sapug at handcraftedcomputers.com.au (Daryl Tester)
Date: Wed, 01 Feb 2012 20:50:41 +1030
Subject: [sapug] linux/win difference
In-Reply-To: <4F28F207.7070805@bigpond.com>
References: <4F28F207.7070805@bigpond.com>
Message-ID: <4F2911F9.6020706@handcraftedcomputers.com.au>

(* blows dust off mailing list *)

On 01/02/12 18:34, Garry Trethewey wrote:

> LF_CR = chr(13) + chr(10)

That's actually CR_LF (chr(13) is carriage return, chr(10) is linefeed).
Not that it affects anything, I'm just being semantically picky - feel
free to ignore this (although it may affect below).

> outStr = inStr.replace(oldStrWin, newStrWin)

> I have no idea what other info is relevant, but I'm happy to provide it.

Can you print the contents of the failing-to-be-converted inStr
(preferably with "print repr(inStr)")? I guess it's "obviously" not
matching in some fashion.

Cheers.

--
Regards,
Daryl Tester
Handcrafted Computers Pty. Ltd.
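
A small sketch of why the repr() suggestion helps, written for Python 3 (the thread itself uses Python 2's print statement). The sample strings are made up for illustration and are not taken from the GPX file, and the text-mode newline translation on Windows is only simulated here with str.replace() rather than by reading a real file.

```python
# Hypothetical sample data; not taken from the GPX file in the thread.
CR_LF = chr(13) + chr(10)  # '\r\n': carriage return, then linefeed

raw_text = "line one" + CR_LF + "line two" + CR_LF

# Simulate what text mode hands back on Windows: each '\r\n' arrives as '\n'.
translated = raw_text.replace(CR_LF, "\n")

# A search string built with an explicit CR LF pair, as in the original post.
needle = "one" + CR_LF + "line"

# repr() makes the line endings visible, which a plain print usually does not.
print(repr(raw_text))
print(repr(translated))

# The needle matches the untranslated text but not the translated text,
# so a str.replace() using it would silently do nothing on the latter.
print(needle in raw_text)    # True
print(needle in translated)  # False
```

If the repr() output of the string read from the file shows only '\n' where '\r\n' was expected, the file was most likely read with newline translation switched on.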
From twegener at fastmail.fm Wed Feb 1 12:25:08 2012
From: twegener at fastmail.fm (Tim Wegener)
Date: Wed, 01 Feb 2012 21:55:08 +1030
Subject: [sapug] linux/win difference
In-Reply-To: <4F28F207.7070805@bigpond.com>
References: <4F28F207.7070805@bigpond.com>
Message-ID: <4F292114.3030605@fastmail.fm>

Hi Garry,

On 01/02/12 18:34, Garry Trethewey wrote:
> I've got this irritating problem that I can't imagine can exist. I'm
> doing a small conversion to a gpx file that is produced by a Windows
> program & gets used by another Windows program.

This is almost certainly because you are opening the file in text mode
(the default).

I.e.

    open('blah.txt')

is equivalent to

    open('blah.txt', 'r')

and

    open('blah.txt', 'rt')

Compare with

    open('blah.txt', 'rb')

which is binary mode, and

    open('blah.txt', 'rU')

which is universal newline mode.

On Windows, text mode converts all new lines (cr-lf) to new lines (lf),
whereas binary mode leaves things exactly as they are. On Linux it always
behaves like binary mode, i.e. no magic. With universal newline mode all
the various platform new line combos are normalised to '\n'. The file
object attribute somefile.newlines gives the newline character
encountered (or a tuple of distinct line endings encountered if there
are different types present).

See also 'pydoc file' and 'pydoc open'.

Here's some examples from the REPL to make it clear:

Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.chdir('c:\\Python27')
>>> open('NEWS.txt', 'r').read(60)
"Python News\n+++++++++++\n\n\nWhat's New in Python 2.7.2?\n======"
>>> open('NEWS.txt', 'rt').read(60)
"Python News\n+++++++++++\n\n\nWhat's New in Python 2.7.2?\n======"
>>> open('NEWS.txt', 'rb').read(60)
"Python News\r\n+++++++++++\r\n\r\n\r\nWhat's New in Python 2.7.2?\r\n="
>>> f = open('NEWS.txt', 'rU')
>>> f.read(60)
"Python News\n+++++++++++\n\n\nWhat's New in Python 2.7.2?\n======"
>>> f.newlines
'\r\n'

(YMMV with Python 3, it seems to use universal newline mode by default
according to the docs, but I haven't tried it out. See 'pydoc3 open' for
further details if necessary.)

The official documentation for this could do with some work. You asked a
good question.

For your situation you probably want to do something like:

old_html = '\n'
new_html = old_html + ' \n\n\n'
input_file = open(filename, 'rU')
updated_html = input_file.read().replace(old_html, new_html)

Then depending on your needs, either:

# Use line ending style from current platform.
updated_html = updated_html.replace('\n', os.linesep)

...or...

# Force Windows line ending style.
updated_html = updated_html.replace('\n', '\r\n')

BTW, the above example was performed running Python for Windows under
Wine:

$ wine msiexec /i ~/Downloads/python-2.7.2.msi
$ wine .wine/drive_c/Python27/python.exe

HTH,
Tim

>
> this code:-
>
> LF_CR = chr(13) + chr(10)
> oldStrWin = '' + LF_CR + ''
> newStrWin = '' + LF_CR + '' + LF_CR *2 + '' +
> LF_CR + ''
> outStr = inStr.replace(oldStrWin, newStrWin)
>
> - works as I'd expect in linux, but in Win Vista or Win 7 it doesn't.
>
> So result in win vista & 7 :-
>
>
>
> Results in linux :-
>
>
>
>

From garrytre at bigpond.com Thu Feb 2 01:33:33 2012
From: garrytre at bigpond.com (Garry Trethewey)
Date: Thu, 02 Feb 2012 11:03:33 +1030
Subject: [sapug] linux/win difference
In-Reply-To: <4F28F207.7070805@bigpond.com>
References: <4F28F207.7070805@bigpond.com>
Message-ID: <4F29D9DD.30104@bigpond.com>

On 01/02/12 18:34, Garry Trethewey wrote:
>
> this code:-
>
> LF_CR = chr(13) + chr(10)
> oldStrWin = '' + LF_CR + ''
> newStrWin = '' + LF_CR + '' + LF_CR *2 + '' + LF_CR
> + ''
> outStr = inStr.replace(oldStrWin, newStrWin)
>
> - works as I'd expect in linux, but in Win Vista or Win 7 it doesn't.
>
> So result in win vista & 7 :-
>
>
>
> Results in linux :-
>
>
>
>

Thanks both for that.

New code works :-

if osName == 'posix':
    NL = chr(13) + chr(10)
if osName == 'nt':
    NL = '\n' # or chr(10)
# d'oh!
# Had lots of trouble with win CR_LF **not** being CR_LF
# see emails
# Tim Wegener linux/20120201-2155
# Daryl Tester linux/20120201_2050
# for more options
oldStrWin = '' + NL + ''
newStrWin = '' + NL + '' + NL *2 + '' + NL + ''
outStr = inStr.replace(oldStrWin, newStrWin)

"It" (is that win or python?) converts chr(13) + chr(10) to chr(10) ie \n
as it reads, then converts back to chr(13) + chr(10) as it writes.

So without "print repr(inStr)" I'd never have seen the prob.

Yep, file opened in text mode.

inFile = open(inFileName, "r")

The discussion of all the opening modes is something I never thought I'd
have a use for, only ever using txt files. But now I'm moving into
windows, I'll keep it for probs I never had before.

Thanks again

------------------------------------
Garry Trethewey
------------------------------------
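
For anyone finding this thread later, here is a minimal sketch of one way to avoid the per-platform NL switch, assuming Python 3 (the thread uses Python 2, where opening with 'rb' and 'wb' plays a similar role). Passing newline="" to open() turns off newline translation on both read and write, so CR LF pairs in the file stay as they are on Windows and Linux alike. The helper name replace_preserving_newlines and the tags <a>, <b> and <x> are hypothetical placeholders; the real GPX tags did not survive in the archived messages.

```python
import os
import tempfile

CR_LF = "\r\n"  # chr(13) + chr(10)


def replace_preserving_newlines(path, old, new):
    """Replace old with new in the file at path, leaving line endings untouched.

    Hypothetical helper for illustration only.
    """
    # newline="" disables newline translation when reading in Python 3.
    with open(path, "r", newline="", encoding="utf-8") as in_file:
        text = in_file.read()
    updated = text.replace(old, new)
    # newline="" also disables translation when writing.
    with open(path, "w", newline="", encoding="utf-8") as out_file:
        out_file.write(updated)
    return updated


# Placeholder tags; the actual GPX tags are not known from the thread.
old_str = "<a>" + CR_LF + "<b>"
new_str = "<a>" + CR_LF + "<x>" + CR_LF * 2 + "</x>" + CR_LF + "<b>"

# Build a small demo file with Windows line endings, written as raw bytes.
fd, demo_path = tempfile.mkstemp(suffix=".gpx")
os.close(fd)
with open(demo_path, "wb") as demo_file:
    demo_file.write(b"<a>\r\n<b>\r\n")

result = replace_preserving_newlines(demo_path, old_str, new_str)

# Read the bytes back to check what actually landed on disk.
with open(demo_path, "rb") as demo_file:
    raw = demo_file.read()
os.remove(demo_path)

print(repr(raw))
```

With this approach the same search string works on both platforms, at the cost of having to handle '\r\n' explicitly everywhere in the program rather than the usual '\n'.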