Try this

John Machin sjmachin at lexicon.net
Sun Sep 16 18:28:01 EDT 2007


On Sep 17, 7:54 am, "mensana... at aol.com" <mensana... at aol.com> wrote:
> On Sep 16, 2:22 pm, Steve Holden <st... at holdenweb.com> wrote:
>
>
>
> > mensana... at aol.com wrote:
> > > On Sep 16, 1:10 pm, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:
> > >> On Sun, 16 Sep 2007 01:46:34 -0700, GeorgeRXZ <george... at gmail.com>
> > >> declaimed the following in comp.lang.python:
>
> > > >>> Then open Notepad, type the following sentence, save the
> > > >>> file, and close Notepad. Now reopen the file and you will find
> > > >>> that Notepad was not able to save the following line of text.
> > >>> Well you are speed
> > >>> This occurs not only with above sentence but any sentence that has
> > >>> 4 3 3 5 (sequence of characters: Well=4 you=3 are=3 speed=5)
> > >>         I tried. I also opened the saved file in SciTE...
> > >> And the text WAS there...
>
> > > >>         It is Notepad that cannot properly render what it
> > > >> itself saved.
>
> > > C:\Documents and Settings\mensanator\My Documents>type huh.txt
> > > Well you are speed
>
> > > Yes, file was saved correctly.
> > > But reopening it shows 9 unprintable characters.
> > > If I copy those to a new file (huh1.txt):
>
> > > C:\Documents and Settings\mensanator\My Documents>type huh1.txt
> > > ?????????
>
> > > > But wait...the new file is 20 bytes, not 9 characters.
>
> > > 09/16/2007  01:44 PM                18 huh.txt
> > > 09/16/2007  01:54 PM                20 huh1.txt
>
> > > C:\Documents and Settings\mensanator\My Documents>dump huh.txt
> > > huh.txt:
> > > 00000000  5765 6c6c 2079 6f75 2061 7265 2073 7065 Well you are spe
> > > 00000010  6564                                    ed
>
> > > Here's what it's actually doing:
>
> > > C:\Documents and Settings\mensanator\My Documents>dump huh1.txt
> > > huh1.txt:
> > > 00000000  fffe 5765 6c6c 2079 6f75 2061 7265 2073 .~Well you are s
> > > 00000010  7065 6564                               peed
>
> > One word: Unicode.
>
> > The "open" and "save" dialogs allow you to specify an encoding.
>
> And the encoding specified was ANSI.
>
> > If you
> > specify Unicode then you will get what you see above.
>
> And if you specify ANSI _before_ you click the file name,
> the specification switches to Unicode and has to then
> be manually switched back to ANSI.
>
> > If you specify ANSI
> > you will get the text you entered.
>
> It's still a bug in the "open" dialog.

It's more like a bug/feature in its encoding detector. I can get it to
switch to Unicode only if there's an even number of characters AND the
line is NOT terminated by CRLF -- add/remove one alpha character, or
hit the enter key at the end of the line, and it won't detect it as
Unicode when you open it again.
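
To tie this back to Python: a minimal sketch of what the misreading
amounts to, assuming the "Unicode" in question is little-endian UTF-16
(which is what the FF FE bytes in the dump above suggest). It does not
model whatever heuristic Notepad actually uses; it only shows that 18
ASCII bytes decode cleanly as 9 UTF-16 characters, and that an odd byte
count cannot be read that way at all. The helper name misread_as_utf16
is made up for the illustration.

```python
# Sketch only: decode the ASCII bytes the way a "this is Unicode" guess would.
ascii_bytes = b"Well you are speed"


def misread_as_utf16(data: bytes) -> str:
    """Interpret raw bytes as little-endian UTF-16 code units."""
    return data.decode("utf-16-le")


misread = misread_as_utf16(ascii_bytes)

# 18 bytes pair up into 9 sixteen-bit code units.
print(len(ascii_bytes), len(misread))

# Each pair of ASCII letters becomes one code point, low byte first.
print([hex(ord(ch)) for ch in misread])

# With one more character the byte count is odd, so the bytes can no
# longer be split into 16-bit units and the decode fails.
try:
    misread_as_utf16(ascii_bytes + b"s")
    odd_length_ok = True
except UnicodeDecodeError:
    odd_length_ok = False

print(odd_length_ok)
```

The first code unit comes from the bytes 57 65 ("We"), read low byte
first as 0x6557, which is why the reopened file shows 9 unfamiliar
characters instead of the sentence.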

You only get the BOM (the bytes FF FE) if you are silly enough to save
it while it's open in Unicode mode.
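
A rough sketch of that save step, again assuming little-endian UTF-16:
take the text as misread, write it back out as UTF-16 with a BOM in
front, and you get the 20-byte huh1.txt from the dump above. The
variable names are made up for the illustration; codecs.BOM_UTF16_LE is
the standard-library constant for the two BOM bytes.

```python
import codecs

# The 18 bytes that were originally saved as ANSI.
original = b"Well you are speed"

# What the editor holds after opening those bytes as UTF-16-LE.
as_opened = original.decode("utf-16-le")

# Saving in Unicode mode: BOM first, then the same code units.
resaved = codecs.BOM_UTF16_LE + as_opened.encode("utf-16-le")

# 20 bytes in total, starting with ff fe.
print(len(resaved), resaved[:2].hex())

# Apart from the BOM, the bytes on disk have not changed.
print(resaved[2:] == original)
```

So the extra two bytes are the whole difference between the 18-byte and
20-byte files; the text bytes themselves are untouched.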

>
>
>
> > By the way, this has precisely what to do with Python?
>
> I've been known to use Notepad to create Python
> source code.

Your source code would have to be trivially short to trigger the
strange behaviour.




