[Python-Dev] directive statement (PEP 244)

Guido van Rossum guido@digicool.com
Mon, 16 Jul 2001 15:46:07 -0400


> Hmm, I guess you have something like this in mind...
> 
> 1. read the file
> 2. decode it into Unicode assuming some fixed per-file encoding
> 3. tokenize the Unicode content
> 4. compile it, creating Unicode objects from the given Unicode data
>    and creating string objects from the Unicode literal data
>    by first reencoding the Unicode data into 8-bit string data
> 
> To make this backwards compatible, the implementation would have to
> assume Latin-1 as the original file encoding if not given (otherwise,
> binary data currently stored in 8-bit strings wouldn't make the
> roundtrip).
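
The roundtrip concern in the quoted text can be checked directly. The following is a minimal sketch in present-day Python, not anything proposed in this thread; the helper name roundtrips is made up for illustration. It shows that Latin-1 is the only one of the three encodings tried here that carries arbitrary 8-bit data through a decode/re-encode cycle unchanged.

```python
# Every possible byte value, standing in for binary data stored in an
# 8-bit string literal.
raw = bytes(range(256))


def roundtrips(data, encoding):
    """Return True if data survives decoding followed by re-encoding.

    Hypothetical helper, used only for this illustration.
    """
    try:
        return data.decode(encoding).encode(encoding) == data
    except UnicodeDecodeError:
        # The encoding has no meaning for at least one byte in data.
        return False


# Latin-1 maps each byte 0..255 to the code point with the same number,
# so nothing is lost. ASCII and UTF-8 both reject the lone byte 0x80.
print(roundtrips(raw, "latin-1"))  # True
print(roundtrips(raw, "ascii"))    # False
print(roundtrips(raw, "utf-8"))    # False
```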

To be compatible with the current default encoding, I would use ASCII
as the default encoding and issue an error if any non-ASCII characters
are found.  One should always use hex or octal escapes to enter binary
data in literals!
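
A rough sketch of that rule, written in present-day Python and assuming a made-up helper named check_source (it is not part of any real tokenizer or compiler API): source bytes are decoded as ASCII unless another encoding is given, and a stray 8-bit byte is reported as an error. Binary data written with hex escapes passes the check because the source text itself stays pure ASCII.

```python
def check_source(data, encoding="ascii"):
    """Decode source bytes, reporting undecodable bytes as a SyntaxError.

    Hypothetical helper for illustration only.
    """
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        bad = data[exc.start]
        raise SyntaxError(
            f"non-{encoding} byte {bad:#04x} at offset {exc.start}"
        ) from exc


# Binary data entered with hex escapes: the source file is pure ASCII.
escaped_source = b'data = b"\\xe9\\x00\\xff"\n'
namespace = {}
exec(check_source(escaped_source), namespace)
print(namespace["data"] == bytes([0xE9, 0x00, 0xFF]))  # True

# The same byte typed raw into the literal is rejected under the default.
raw_source = b'data = "\xe9"\n'
try:
    check_source(raw_source)
except SyntaxError as err:
    print("rejected:", err)
```

Passing encoding="latin-1" to the same helper accepts the second file, which corresponds to the declared per-file encoding discussed in the quoted text.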

--Guido van Rossum (home page: http://www.python.org/~guido/)