[Python-Dev] #pragmas in Python source code

Guido van Rossum guido@python.org
Tue, 18 Apr 2000 06:35:33 -0400


> The idea is to make life a little easier for programmers
> whose native script is not easily writable using ASCII, e.g.
> the whole Asian world.
> 
> While originally only the encoding used within the quotes of
> u"..." was targeted (on the i18n sig), there has now been
> some discussion on this list about whether to move forward
> in a whole new direction: that of allowing whole Python scripts
> to be encoded in many different encodings. The compiler will
> then convert the scripts first to Unicode and then to 8-bit
> strings as needed.
> 
> Using this technique, which was introduced by Fredrik Lundh,
> we could in fact have Python scripts which are encoded in
> UTF-16 (two bytes per character) or other more obscure
> encodings. The Python interpreter would only see Unicode
> and Latin-1.

Wouldn't it make more sense to have the Python compiler *always* see
UTF-8 and to use a simple preprocessor to deal with encodings?
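
A minimal sketch of what such a preprocessor might look like, assuming
the source encoding is supplied by the caller (how it would be declared,
e.g. through a pragma, is left open here). The function name to_utf8 and
the sample script are hypothetical, for illustration only:

```python
def to_utf8(source_bytes, declared_encoding):
    """Transcode raw source bytes from the declared encoding to UTF-8.

    Hypothetical helper: the compiler proper would then only ever
    be handed the UTF-8 result.
    """
    # Decode to Unicode first, then re-encode as UTF-8.
    text = source_bytes.decode(declared_encoding)
    return text.encode("utf-8")


# A made-up one-line script, stored once as UTF-16 and once as Latin-1.
script = 'print("caf\u00e9")\n'
from_utf16 = to_utf8(script.encode("utf-16"), "utf-16")
from_latin1 = to_utf8(script.encode("latin-1"), "latin-1")

# Both inputs end up as the same UTF-8 bytes.
same_bytes = from_utf16 == from_latin1
```

The point of the sketch is only that the transcoding step can sit in
front of the compiler, so the compiler itself needs to understand a
single encoding.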

(Disclaimer: there are about 300 unread python-dev messages in my
inbox still.)

--Guido van Rossum (home page: http://www.python.org/~guido/)