[Python-Dev] directive statement (PEP 244)
Guido van Rossum
guido@digicool.com
Mon, 16 Jul 2001 15:46:07 -0400
> Hmm, I guess you have something like this in mind...
>
> 1. read the file
> 2. decode it into Unicode assuming some fixed per-file encoding
> 3. tokenize the Unicode content
> 4. compile it, creating Unicode objects from the given Unicode data
> and creating string objects from the Unicode literal data
> by first reencoding the Unicode data into 8-bit string data
>
> To make this backwards compatible, the implementation would have to
> assume Latin-1 as the original file encoding if not given (otherwise,
> binary data currently stored in 8-bit strings wouldn't make the
> roundtrip).
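
As an illustration of the round-trip concern in the quoted text, here is a minimal sketch in present-day Python (not the interpreter under discussion in this thread). It assumes only the standard codecs, and the variable names are made up for the example. Latin-1 maps each of the 256 byte values to the code point with the same number, so arbitrary binary data survives decoding and re-encoding, while a stricter codec such as UTF-8 refuses some of those bytes.

```python
# Sketch of the round-trip concern: bytes -> Unicode -> bytes.
raw = bytes(range(256))  # every possible 8-bit value

# Latin-1 maps each byte to the code point with the same number,
# so decoding and re-encoding gives back the original bytes.
roundtrip = raw.decode("latin-1").encode("latin-1")

# UTF-8 is stricter: some byte sequences are not valid input at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```
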
To be compatible with the current default encoding, I would use ASCII
as the default encoding and issue an error if any non-ASCII characters
are found. One should always use hex/oct escapes to enter binary data
in literals!
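
A rough sketch of what such a rule could look like, written in present-day Python and not taken from any actual implementation. The helper name decode_source and its error message are hypothetical. It treats ASCII as the default when no encoding is given, reports the first offending byte, and shows that a hex escape keeps the source file pure ASCII while still denoting the byte 0xE9.

```python
def decode_source(data, encoding=None):
    """Decode raw source bytes, defaulting to strict ASCII.

    Hypothetical helper for illustration only.
    """
    if encoding is None:
        encoding = "ascii"
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        # Count newlines before the bad byte to get a 1-based line number.
        line = data.count(b"\n", 0, exc.start) + 1
        raise SyntaxError(
            "non-%s byte 0x%02x on line %d; declare an encoding or use an escape"
            % (encoding, data[exc.start], line)
        ) from exc


# Pure-ASCII source text: the literal spells the byte 0xE9 as a hex escape.
escaped = b's = "\\xe9"\n'

# The same literal with the raw byte embedded in the file.
raw = b's = "\xe9"\n'
```

With no encoding given, decode_source(escaped) succeeds and decode_source(raw) raises SyntaxError. Passing "latin-1" explicitly accepts the raw byte.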
--Guido van Rossum (home page: http://www.python.org/~guido/)