PEP 263 comments

Sat Mar 2 03:36:04 EST 2002

"Jason Orendorff" <jason at jorendorff.com> writes:

> Unfortunately it's all impossible.

Unfortunately, yes. Let me explain how this will work under PEP 263.

>  * All my existing Python code should continue to run.

It will, but you might get a DeprecationWarning; eventually,
your code might stop being accepted.

>  * I shouldn't have to understand what Unicode is, if all I want
>    is to bang out a quick script to say "hello world" in my native
>    language.

You won't need to understand what Unicode is. You will need to
understand what encodings are, though; for example, you should know
that "Grüß Gott" requires an encoding such as Latin-1. With that
knowledge, you can either change the system default encoding, or put
an encoding declaration in your file.
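
To make the encoding point concrete, here is a minimal sketch. It
assumes a modern Python 3 interpreter (not the 2.x series under
discussion) and the "# -*- coding: ... -*-" comment form of the
declaration; the names text, source and namespace are illustrative
only.

```python
# "Gruess Gott" with u-umlaut and sharp s, written with escapes so this
# example itself stays pure ASCII.
text = "Gr\u00fc\u00df Gott"

latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # the two non-ASCII letters take two bytes each

# A tiny source file as a Latin-1 editor would save it, with an
# encoding declaration on the first line.
source = b'# -*- coding: latin-1 -*-\ngreeting = "' + latin1_bytes + b'"\n'

# compile() accepts bytes and honours the declaration when decoding.
namespace = {}
exec(compile(source, "<hello>", "exec"), namespace)
```

The same nine characters are stored as different byte sequences under
the two encodings, which is why the interpreter has to be told which
one the file uses.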

>  * I should be able to send Python files to other people in
>    other countries, and they should run fine there too.

That will work for the encoding declaration case (either with the
Unicode marker, or the UTF-8 signature). If people have changed the
system default encoding, this property may not hold.
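
As a rough illustration of why a declared file travels well: the
encoding can be read off the file itself, independent of the
recipient's system settings. The sketch below assumes a modern
Python 3, whose standard tokenize module exposes this detection; the
helper name declared_encoding and the sample byte strings are made up
for the example.

```python
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the encoding Python's tokenizer infers for these source bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

# Latin-1 file carrying a coding comment on its first line.
with_marker = b'# -*- coding: latin-1 -*-\nprint("Gr\xfc\xdf Gott")\n'

# UTF-8 file that starts with the three-byte UTF-8 signature (BOM).
with_signature = b'\xef\xbb\xbfprint("Gr\xc3\xbc\xc3\x9f Gott")\n'

marker_encoding = declared_encoding(with_marker)
signature_encoding = declared_encoding(with_signature)
```

Both results come from the bytes of the file alone, so they are the
same on every machine. A file that relies on a changed system default
carries no such information.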

>  * I should be able to use 'print' on strings and unicode strings
>    and get sensible output (I'll know it when I see it <wink>).

Whether that works depends on the terminal. In IDLE, it currently
works. For plain strings, it will also normally work, unless the file's
encoding differs from the system encoding (or, rather, the user's
terminal encoding). For Unicode strings, I'll write a PEP describing
the use of Unicode at system interfaces, which targets this issue.
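
The dependence on the terminal can be sketched as follows. This
assumes a modern Python 3; printable_on is a hypothetical helper, and
cp437 merely stands in for "some terminal encoding other than the
file's".

```python
def printable_on(text, terminal_encoding):
    """Report whether every character of text exists in terminal_encoding."""
    try:
        text.encode(terminal_encoding)
    except UnicodeEncodeError:
        return False
    return True

greeting = "Gr\u00fc\u00df Gott"

# Bytes as stored in a Latin-1 source file (a "plain string").
file_bytes = greeting.encode("latin-1")

# What a terminal sees if it interprets those bytes as cp437 instead.
misread = file_bytes.decode("cp437")
```

An ASCII terminal cannot represent the greeting at all, a Latin-1
terminal can, and a terminal using a different 8-bit encoding shows
the right number of characters but the wrong ones.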

>  * Comments shouldn't affect the meaning of code.

They won't in phase 1 of the PEP (although you will get a warning if
you haven't declared an encoding and have left the system encoding at
ASCII). In phase 2, you'll get a warning if the comment is not
well-formed under the encoding.

>  * Random binary garbage in comments should be ignored, just like
>    it is today.

That property may go away in phase 2. E.g. in a UTF-8 encoded file,
Python will eventually verify that the comments actually are UTF-8.
Some people consider this a good thing, as editors can't really
round-trip files with different encodings at different offsets.
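
The kind of check meant here can be approximated like this. It is a
sketch only, assuming a modern Python 3; well_formed and the sample
byte strings are hypothetical, and the real check happens inside the
parser rather than in user code.

```python
def well_formed(source_bytes, encoding):
    """Report whether the whole file, comments included, decodes cleanly."""
    try:
        source_bytes.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True

# A comment containing a properly encoded UTF-8 character.
clean = b"# caf\xc3\xa9\nx = 1\n"

# A comment containing bytes that are not valid UTF-8.
garbage = b"# \xff\xfe leftover bytes\nx = 1\n"

clean_ok = well_formed(clean, "utf-8")
garbage_as_utf8 = well_formed(garbage, "utf-8")
garbage_as_latin1 = well_formed(garbage, "latin-1")
```

Under Latin-1 every byte value is a valid character, so the garbage
passes unnoticed; under UTF-8 it is rejected even though it sits in a
comment.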

Regards,
Martin