[I18n-sig] Re: Strawman Proposal: Encoding Declaration V2
Paul Prescod
paulp@ActiveState.com
Sun, 11 Feb 2001 14:31:12 -0800
Fredrik Lundh wrote:
>
> > A source file with an encoding declaration must only use non-ASCII bytes
> > in places that can legally support Unicode characters. In Python 2.x the
> > only place is within a Unicode literal
>
> make that "in a string literal".
Yes, I think you're right. If a person needs to get at a Latin-1
character in a string literal they should be able to do so using
> if an encoding directive is present, the *entire* file should be
> assumed to use that encoding. this applies to comments, 8-bit
> string literals, and 16-bit string literals.
I've backed off somewhat on having the file be pre-decoded in the short
term. My major conceptual problem is that if we decode to Unicode-escaped
ASCII or something similar, we mess up the column numbers, so syntax
errors will no longer point at the right place. We might really need to
have a Unicode-aware parser before we can do this...
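
To make the column-number concern concrete, here is a minimal sketch. It is
written in present-day Python purely for illustration, not against the 2.x
parser under discussion. The helper escape_non_ascii is hypothetical and
stands in for whatever pre-decoding step might be used; the sample line and
its deliberate syntax error are made up.

```python
def escape_non_ascii(text):
    """Hypothetical pre-decoding step: turn each non-ASCII character
    into a six-character \\uXXXX escape so the result is pure ASCII."""
    return "".join(
        ch if ord(ch) < 128 else "\\u%04x" % ord(ch)
        for ch in text
    )


# A Latin-1 encoded source line. The trailing '$' is a deliberate
# syntax error whose column we want to report correctly.
raw = b's = u"caf\xe9" + $'

original = raw.decode("latin-1")      # what the author sees in the editor
escaped = escape_non_ascii(original)  # what an ASCII-only parser would see

col_original = original.index("$")    # column in the author's file
col_escaped = escaped.index("$")      # column after pre-decoding

print(original)
print(escaped)
print("column shift:", col_escaped - col_original)
```

Each escaped character pushes everything after it on the same line five
columns to the right, so a parser working on the escaped text would report
the error at a column that does not match what the author sees, unless it
also keeps a mapping back to the original offsets.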
Paul Prescod