Python's 8-bit cleanness deprecated?

John Roth johnroth at ameritech.net
Sat Feb 8 09:47:37 EST 2003


"Kirill Simonov" <kirill_simonov at mail.ru> wrote in message
news:mailman.1044655519.19088.python-list at python.org...
* John Roth <johnroth at ameritech.net>:
>
> After thinking about this for a few days, it suddenly occurred to me
> that there may be a very obscure method in this madness. That is, by
> restricting Python source to 7-bit ASCII unless otherwise declared,
> it opens the way to migrate to UTF-8 input. This, in turn, would
> solve most of the character set problems in one fell swoop.
>

Why do you think that UTF-8 is a panacea?

For example, my little script

    print "Привет!"

will become

    print u"Привет!".encode('koi8-r')

if I am forced to use UTF-8 for my source code. I don't see any
advantage here.
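
For readers following along, here is a rough sketch of what is at stake in the script above. It uses Python 3 syntax rather than the 2.x of this thread, and the variable names are made up for illustration. It only shows that the same greeting is a different byte sequence in KOI8-R and in UTF-8, and what a Latin-1 reader makes of the KOI8-R bytes.

```python
# Sketch in Python 3 syntax; the thread itself is about Python 2.x.
text = "Привет!"  # the greeting from the script above, as Unicode text

koi8_bytes = text.encode("koi8-r")  # one byte per Cyrillic letter
utf8_bytes = text.encode("utf-8")   # two bytes per Cyrillic letter

# Misreading the KOI8-R bytes as Latin-1 gives accented Latin garbage.
mojibake = koi8_bytes.decode("latin-1")

print(len(koi8_bytes), koi8_bytes)
print(len(utf8_bytes), utf8_bytes)
print(mojibake)
```

So a terminal that expects KOI8-R needs the first byte sequence, whatever encoding the source file itself was saved in.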

Actually, it won't. Since UTF-8 includes all characters in Unicode,
the entire Unicode framework that was invented for 2.0 is suddenly
irrelevant. It will be possible to use any character from the Unicode
standard anywhere it does not make a syntactic difference, such
as in literals and comments.

It would make no difference whatsoever to your script. (And I
note with pleasure, and a little bemusement, that my system
actually displayed your Cyrillic properly; at least I assume it's
correct!)
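
A hedged illustration of the declared-encoding idea, again in Python 3 syntax and not anything from the 2.x interpreter under discussion: the same tiny module is compiled once from UTF-8 bytes and once from KOI8-R bytes, each with its own coding line. The `# -*- coding: ... -*-` form is the PEP 263 convention; the module text and the `load` helper are hypothetical.

```python
# Hypothetical module text; both the comment and the literal hold Cyrillic.
TEMPLATE = (
    "# -*- coding: {enc} -*-\n"
    "# комментарий\n"
    'greeting = "Привет!"\n'
)

def load(encoding):
    """Encode the module text, then let compile() decode it via the coding line."""
    source_bytes = TEMPLATE.format(enc=encoding).encode(encoding)
    namespace = {}
    exec(compile(source_bytes, "<example>", "exec"), namespace)
    return namespace["greeting"]

from_utf8 = load("utf-8")
from_koi8 = load("koi8-r")
print(from_utf8 == from_koi8)
```

Under these assumptions the literal comes out as the same text either way; only the bytes on disk differ.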

John Roth

--
xi






