Using more than 7-bit ASCII on Windows.

Paul Moore paul.moore at uk.origin-it.com
Mon Oct 30 10:04:51 EST 2000


On Mon, 30 Oct 2000 15:08:21 +0100, Paul Moore
<paul.moore at uk.origin-it.com> wrote:
>That sounds very likely - £ in latin-1 is different from £ in the
>DOS-codepage - a fact I had forgotten, which has probably caused all
>of this confusion.

OK, this proves something...

In a console window
-------------------

>>> print unicode("£","latin1").encode("cp437")
£
>>> unicode("£","cp437")
u'\243'
>>> unicode("£","latin1")
u'\234'
>>> print "%o" % ord("£")
234
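
For comparison, here is a minimal sketch of the same console checks in
present-day Python 3. This is an assumption: the session above is
Python 2.0, where unicode() and the print statement behave differently.
The name POUND is illustrative only. It assumes the console hands Python
the single cp437 byte for the pound sign.

```python
# Python 3 sketch; the transcript above is Python 2.0.
POUND = "\u00a3"                      # the pound sign, U+00A3

typed = POUND.encode("cp437")         # the byte a cp437 console would send
print("%o" % typed[0])                # octal value of that byte

as_cp437 = typed.decode("cp437")      # decoded with the right codepage
as_latin1 = typed.decode("latin-1")   # decoded with the wrong one

# Octal code points, for comparison with u'\243' and u'\234' above.
print("%o %o" % (ord(as_cp437), ord(as_latin1)))
```

Only the cp437 decode returns U+00A3; the latin-1 decode leaves the byte
value unchanged as a code point.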

In PythonWin
------------

>>> print unicode("£","latin1").encode("cp437")
Ã?Å?
>>> unicode("£","cp437")
u'\u252C\372'
>>> unicode("£","latin1")
u'\302\243'
>>> print "%o" % ord("£")
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
TypeError: expected a character, length-2 string found
>>> print map(ord, unicode("£","latin1").encode("cp437"))
[194, 156]

... where the output on the first line displays as
capital-A-with-a-circumflex followed by lowercase-oe. Actually, what I
*typed* was "£" - the cut-and-paste has added the capital-A-hat in the
quotes! (PS: given that it may not make it through Usenet, what is
displayed in my newsreader is capital-A-with-two-dots, comma,
capital-A-with-a-circle, double-straightline-opening-quotes.)

And worse, if I type ord("£") (those exact keypresses) I get the
length-2 string error. Which patently isn't what I typed, but is
obviously what Python sees...

Now, the console is comprehensible, but PythonWin is - to put it
politely - odd. Actually, the length-2 string stuff implies that
something is converting to UTF-8 somewhere, somehow...
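
The UTF-8 inference can be checked with a short sketch, again in
Python 3 rather than the Python 2.0 used above (an assumption, as are
the variable names). If the pound sign reaches Python as UTF-8, it is
two bytes long, and decoding those two bytes as cp437 or latin-1 gives
the same code points as the PythonWin session.

```python
# Python 3 sketch; assumes the editor delivers the pound sign as UTF-8.
POUND = "\u00a3"

utf8_bytes = POUND.encode("utf-8")
print(len(utf8_bytes), list(utf8_bytes))        # two bytes, not one

seen_as_cp437 = utf8_bytes.decode("cp437")      # each byte read as cp437
seen_as_latin1 = utf8_bytes.decode("latin-1")   # each byte read as latin-1

# Octal code points, for comparison with the PythonWin output above.
print(["%o" % ord(c) for c in seen_as_cp437])
print(["%o" % ord(c) for c in seen_as_latin1])
```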

Is PythonWin trying to use UTF-8 internally? If so, it looks like
you're doing one too many encodes, or some other nasty but easy-to-make
error...
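
As an illustration of what "one too many encodes" could look like, the
Python 3 sketch below encodes the pound sign as UTF-8, misreads the
result as latin-1, and encodes it again. Whether PythonWin actually does
this is not established here; the sketch only shows the general failure
mode, and all names in it are hypothetical.

```python
# Python 3 sketch of a hypothetical double encode; not PythonWin's code.
POUND = "\u00a3"

once = POUND.encode("utf-8")        # correct UTF-8: two bytes
misread = once.decode("latin-1")    # bytes wrongly taken as latin-1 text
twice = misread.encode("utf-8")     # encoded a second time: four bytes

print(len(POUND), len(misread), len(twice))
print(repr(misread))                # capital-A-circumflex, then the pound sign
```

The misread string has the capital-A-circumflex in front of the pound
sign, which is the same kind of extra character that appeared in the
pasted text.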

Paul.

PS When I tried to send this message, I got a comment from the
newsreader that it couldn't be sent in Latin-1, because of
non-displayable characters. I've sent latin-1 anyway, as it's more
likely to be at least visible on other screens. But it just goes to
show that what is going on here is pretty nasty to get right...



