editing in Unicode

Neil Hodgson neilh at scintilla.org
Thu Sep 7 07:26:43 EDT 2000


Bertilo Wennergren:
> Is there no way of avoiding this additional step, getting Python to always
> automatically treat all strings as UTF-8 encoded Unicode strings? If I need
> a lot of Unicode text strings it's a big bother to always have to explicitly
> convert each and every one of them. A possible source of bugs, I'd say...

   This is still a topic of debate, with some people wanting a global or
per-file default encoding. M.A. Lemburg wants syntax like:

MAL> declare encoding = "latin-1"
MAL> x = u"This text will be interpreted as Latin-1 and stored as Unicode"
MAL>
MAL> declare encoding = "ascii"
MAL> y = u"This is supposed to be ASCII, but contains äöü Umlauts - error !"
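
   To illustrate what the proposed declaration is meant to achieve, here is a
minimal sketch using explicit decoding. It assumes a Python 3 interpreter
rather than the 2.0 release under discussion, and the sample bytes and variable
names are made up for illustration; it is not MAL's proposed syntax.

```python
# Bytes as they might appear in a source file saved as Latin-1:
# 0xE4, 0xF6, 0xFC are the Latin-1 codes for a-umlaut, o-umlaut, u-umlaut.
raw = b"\xe4\xf6\xfc"

# Interpreting the bytes as Latin-1 gives the intended Unicode text.
x = raw.decode("latin-1")

# Interpreting the same bytes as ASCII fails, which is the error
# the second half of the proposal is supposed to raise.
try:
    y = raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

   The point of a declared encoding would be to have this decoding applied to
the literals in a file without writing it out each time.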

> If I get this right the following simpler version ought to work:
>
> msg=unicode("@#&")
>
> Right? That I could live with.

   Yes, that is supposed to work, but it does not yet work for me.
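
   Until the default is settled, naming the encoding at the point of conversion
avoids any dependence on it. A rough sketch, again assuming Python 3 (where
`str(data, encoding)` plays the role of the conversion call) and using a
hypothetical sample string that is not from this thread:

```python
# UTF-8 bytes for a two-character sample: U+0109 (c with circumflex) then "u".
data = b"\xc4\x89u"

# Name the encoding explicitly so the result does not depend on any default.
msg = str(data, "utf-8")

# Three bytes decode to two characters.
char_count = len(msg)
```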

> msg = u'@#&'

   I would like that to work too but whether it will and how it will are
still being discussed.

> Great idea. I think my Unicode editor (UniRed) can already do this (or can
> be made to do it with minimal fiddling).

   That is good. I've only been able to find the
http://www.esperanto.mv.ru/UniRed/UTF8/ page, which is in a language
(Esperanto?) that I don't understand.

   Neil





