[Python-Dev] PEP 263 - Defining Python Source Code Encodings

Martin v. Loewis martin@v.loewis.de
14 Jul 2002 18:29:20 +0200


"M.-A. Lemburg" <mal@lemburg.com> writes:

> > Can you elaborate what you think the difference is? I believe the PEP
> > is silent on this specific aspect,
> 
> It does mention this as part of phase 2.

All I can find is

<quote>
The builtin compile() API will be enhanced to accept Unicode as input.
</quote>

That leaves open the question of what the compile function *does*
beyond merely accepting Unicode strings; the natural reading is that
it tries to compile them, just as it would a byte string.

The unspecified aspect is the treatment of byte string literals within
the Unicode source string. The current compiler treats them "as-is";
that is clearly not an option here. The reasonable options are:

1. convert to byte string using "ascii" encoding,
2. convert to byte string using "utf-8" encoding,
3. convert to byte string using system default encoding,
4. convert to byte string using encoding declared inside the code
   string. If that route is taken, the question is what happens
   if no encoding declaration is found.
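
To make the difference between the four options concrete, here is a
minimal sketch in modern Python 3. It only shows which bytes each
option would produce for the text of a literal; it is not how the
compiler does or should implement any of them. The names CODING_RE,
declared_encoding and encode_literal are hypothetical, the pattern is
modeled on the one given in PEP 263, and raising LookupError for a
missing declaration is a placeholder for the open question in option 4.

```python
import re
import sys

# Pattern modeled on the encoding-declaration regex in PEP 263.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(source):
    """Return the encoding declared in the first two lines, or None."""
    for line in source.splitlines()[:2]:
        if line.lstrip().startswith("#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1)
    return None


def encode_literal(text, option, source=""):
    """Encode the text of a string literal under options 1 to 4."""
    if option == 1:
        return text.encode("ascii")  # fails on any non-ASCII character
    if option == 2:
        return text.encode("utf-8")
    if option == 3:
        # System default encoding; its value depends on the interpreter.
        return text.encode(sys.getdefaultencoding())
    if option == 4:
        encoding = declared_encoding(source)
        if encoding is None:
            # Assumption: the behaviour here is exactly what is unspecified.
            raise LookupError("no encoding declaration found")
        return text.encode(encoding)
    raise ValueError("option must be 1, 2, 3 or 4")


source = "# -*- coding: latin-1 -*-\ns = '\u00e9'\n"
utf8_bytes = encode_literal("\u00e9", 2)
declared_bytes = encode_literal("\u00e9", 4, source)
```

With the same literal, option 2 and option 4 already disagree whenever
the declared encoding is not UTF-8, and option 1 rejects the literal
outright, which is why the choice needs to be written down.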

> No need for this. The PEP already mentions it.

Can you please quote the precise words in the text of the PEP that
answer the question of which of the four options above is taken?

Regards,
Martin