[Python-Dev] #pragmas in Python source code

M.-A. Lemburg mal@lemburg.com
Wed, 12 Apr 2000 10:17:02 +0200


There is currently a discussion on i18n about how to write Python
source code in different encodings. The (experimental)
solution so far has been to add a command line switch to
Python which tells the compiler which encoding to expect
for u"...strings..." ("...8-bit strings..." will still be used
as is -- it's the user's responsibility to use the right
encoding; the Unicode implementation will still assume them
to be UTF-8 encoded in automatic conversions).

In the end, a #pragma should be usable to tell the compiler
which encoding to use for decoding the u"..." strings.

What we need now is a good proposal for handling these
#pragmas... does anyone have experience with these? Any
ideas?

Here's a simple strawman for the syntax:

# pragma key: value

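import re

# Raw strings keep the backslashes in \s intact for the regex engine.
parser = re.compile(
    r'^#\s*pragma\s+'                 # comment marker and the word "pragma"
    r'([a-zA-Z_][a-zA-Z0-9_]*):\s*'   # key: an identifier followed by a colon
    r'(.+)'                           # value: the rest of the line
)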

For the encoding this would be something like:

# pragma encoding: unicode-escape
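
As a quick sanity check of the strawman, here is a minimal sketch that
applies the pattern to a single line. The helper name parse_pragma is
hypothetical and only serves the illustration; stripping trailing
whitespace from the value is my assumption, not part of the proposal.

```python
import re

# Strawman pattern from this message, written with raw strings.
PRAGMA = re.compile(r'^#\s*pragma\s+([a-zA-Z_][a-zA-Z0-9_]*):\s*(.+)')

def parse_pragma(line):
    """Return (key, value) for a pragma line, or None for anything else."""
    m = PRAGMA.match(line)
    if m is None:
        return None
    # Assumption: trailing whitespace in the value is not significant.
    return m.group(1), m.group(2).strip()

print(parse_pragma("# pragma encoding: unicode-escape"))
print(parse_pragma("# an ordinary comment"))
```

The first call yields the pair ('encoding', 'unicode-escape'); the
second returns None, so ordinary comments are left alone.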

The compiler would scan these pragma defs, add them to an
internal temporary dictionary and use them for all subsequent
code it finds during the compilation process. The dictionary
would have to stay around until the original compile() call has
completed (spanning recursive calls).
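
To make the scanning idea more concrete, here is a rough sketch in
plain Python. It is not how the compiler would do it; scan_pragmas and
the sample source (including the latin-1 value) are made up for the
example. The point is only that one dictionary is updated as pragmas
are seen and that each later line is processed with whatever is
current at that point.

```python
import re

PRAGMA = re.compile(r'^#\s*pragma\s+([a-zA-Z_][a-zA-Z0-9_]*):\s*(.+)')

def scan_pragmas(source, pragmas=None):
    """Yield (line_number, line, active_pragmas) for each line of source.

    `pragmas` is the shared dictionary. Passing the same dict into a
    nested scan mimics state that survives recursive compile steps.
    """
    if pragmas is None:
        pragmas = {}
    for number, line in enumerate(source.splitlines(), 1):
        m = PRAGMA.match(line)
        if m:
            # A later pragma with the same key overrides the earlier one.
            pragmas[m.group(1)] = m.group(2).strip()
        # Hand out a snapshot so callers see the state as of this line.
        yield number, line, dict(pragmas)

# Hypothetical input: two pragmas, each governing the lines after it.
source = '''\
# pragma encoding: latin-1
a = u"first"
# pragma encoding: unicode-escape
b = u"second"
'''

for number, line, active in scan_pragmas(source):
    print(number, active.get("encoding"), line)
```

Whether a pragma should apply from its own line onwards or to the
whole file is left open here; the sketch simply takes the
"all subsequent code" reading.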

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/