[Python-3000] Support for PEP 3131

Tue Jun 12 02:22:04 CEST 2007

Martin v. Löwis wrote:
>>> Indeed, PEP 3131 gives a predictable identifier character set.
>>> Adding per-site options to change the set of allowable characters
>>> makes it less predictable.
>>>
>> True. However, this will only matter if you distribute code with non-ASCII
>> identifiers to the wider public.
> 
> No - it will matter for any kind of distribution, not just to the "wider
> public". If I move code to the next machine it may stop working, 
>
If that machine is controlled by you (or your sysadmin), you should be able to
reconfigure Python the way you like. However, I have to agree that this is
suboptimal.

> or if I upgrade to the next Python version, assuming the default is
> to restrict identifiers.
> 
That would only happen if the default changes to a stricter rule. If we start
with ASCII only, the default can hardly become any stricter, so this is unlikely
to ever happen!

>> The real question is: transparent *to whom*? Transparent to the developer
>> himself when he rereads his own code (which I value as a developer), or
>> transparent to the user of the program when he tries to fix a bug (which I value
>> as a user of open-source software)? Non-ASCII identifiers are marginally better
>> for the first case, but can be dramatically worse for the second one. Clearly,
>> there is a tradeoff.
> 
> Why do you say that? Non-ASCII identifiers significantly improve the
> readability of code to speakers of the natural language from which
> the identifiers are drawn. With ASCII identifiers, the reader needs
> to understand the English words, or recognize the transliteration.
> With non-ASCII identifiers, the intended meaning of the class or
> function becomes immediately apparent, in the way identifiers have
> always been self-documentation for English-speaking people.
> 
My problem is then: what happens if the reader does not speak the same language
as the author of the code? Right now, if I come across Python code written in a
language I don't speak, I can still try to make sense of it. Sure, I may have to
do without the comments; sure, I may not understand what the identifier names
mean. But I can still follow the flow of instructions and try to figure out what
happens. With non-ASCII identifiers, I cannot do that because I cannot tell the
identifiers apart.

>>>> That is what makes these strengths so important.  I hope this
>>>> helps you understand why these concerns can't and shouldn't be
>>>> brushed off as "paranoia" -- this really has to do with the
>>>> core values of the language.
>>> It just seems that the concerns don't directly follow from
>>> the principles. Something else has to be added to make that
>>> conclusion. It may not be paranoia (i.e. excessive anxiety),
>>> but there surely is some fear, no?
>>>
>> That argument is not really honest :-) Every risk can be estimated optimistically
>> or pessimistically. In both cases, there is some degree of irrationality.
> 
> Still, what is the risk being estimated? Is it that somebody
> maliciously tries to provide patches that use look-alike
> characters? I honestly don't know what risks you see.
> 
Well, I have not followed the discussion about security risks closely.
However, I see a much simpler risk: the risk that I come across code that
is technically open source, but that I can't even debug when I need to because
I cannot make sense of it. This would reduce the usefulness of such code, and
cause fragmentation in the community.

Cheers,
BC