[Python-3000] PEP: Supporting Non-ASCII Identifiers

"Martin v. Löwis" martin at v.loewis.de
Tue Jun 5 19:15:35 CEST 2007


Jim Jewett schrieb:
> On 6/5/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> > Always normalizing would have the advantage of simplicity (no
>> > matter what the encoding, the result is the same), and I think
>> > that is the real path of least surprise if you sum over all
>> > surprises.
> 
>> I'd like to repeat that this is out of scope of this PEP, though.
>> This PEP doesn't, and shouldn't, specify how string literals get
>> from source to execution.
> 
> I see that as a gray area.

Please read the PEP title again. What is unclear about
"Supporting Non-ASCII Identifiers"?

> Unicode does say pretty clearly that (at least) canonical equivalents
> must be treated the same.

Chapter and verse, please?

> In theory, this could be done only to identifiers, but then it needs
> to be done inline for getattr.

Why is that? The caller of getattr would need to apply normalization in
case the input isn't known to be normalized.
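[A minimal sketch of the getattr point under discussion. Python's parser
normalizes identifiers, but a string passed to getattr is used as-is, so
the caller must normalize it; the attribute name and value here are
invented for illustration.]

```python
import unicodedata

class C:
    pass

# Store an attribute under the composed (NFC) form of "é" (U+00E9).
setattr(C, "\u00e9", 42)

# A decomposed lookup key ("e" + U+0301 combining acute) misses,
# even though it is canonically equivalent to the stored name.
decomposed = "e\u0301"
assert not hasattr(C, decomposed)

# Normalizing the key before the lookup makes it succeed.
assert getattr(C, unicodedata.normalize("NFC", decomposed)) == 42
```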

> Since we don't want the results of (str1 == str2) to change based on
> context, I think string equality also needs to look at canonicalized
> (though probably not compatibility) forms.  This in turn means that
> hashing a unicode string should first canonicalize it.  (I believe
> that is a change from 2.x.)
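[An illustration of the equality behavior Jim describes: canonically
equivalent strings compare unequal in Python unless both sides are
normalized first. NFC is shown; NFD would work equally well.]

```python
import unicodedata

composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# Canonically equivalent text, yet unequal as Python strings.
assert composed != decomposed

# Normalizing both operands to the same form restores equality.
assert (unicodedata.normalize("NFC", composed)
        == unicodedata.normalize("NFC", decomposed))
```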

And you think this is still within the scope of the PEP?

Please, if you want that to happen, write your own PEP.

Regards,
Martin
