[Python-ideas] allow `lambda' to be spelled λ

Rustom Mody rustompmody at gmail.com
Tue Jul 19 10:40:42 EDT 2016


 
On Tuesday, July 19, 2016 at 7:41:38 PM UTC+5:30, Neil Girdhar wrote:
>
> On Tue, Jul 19, 2016 at 8:18 AM Rustom Mody  wrote:
>
>>
>> On Tuesday, July 19, 2016 at 5:06:17 PM UTC+5:30, Neil Girdhar wrote:
>>>
>>>
>>> On Tue, Jul 19, 2016 at 7:21 AM Steven D'Aprano wrote:
>>>
>>>> On Mon, Jul 18, 2016 at 10:29:34PM -0700, Rustom Mody wrote:
>>>>
>>>> > IOW
>>>> > 1. The lexer is internally (evidently from the error message) so
>>>> > ASCII-oriented that any “unicode-junk” just defaults out to identifiers
>>>> > (presumably comments are dealt with earlier) and then if that lexing action
>>>> > fails it mistakenly pinpoints a wrong *identifier* rather than just an
>>>> > impermissible character like python 2
>>>>
>>>> You seem to be jumping to a rather large conclusion here. Even if you
>>>> are right that the lexer considers all otherwise-unexpected characters
>>>> to be part of an identifier, why is that a problem?
>>>>
>>>
>>> It's a problem because those characters could never be part of an 
>>> identifier.  So it seems like a bug.
>>>
>>
>> An armchair-design solution would say: We should give the most 
>> appropriate answer for every possible unicode character category
>> This would need to take all the Unicode character-categories and Python 
>> lexical-categories and 'cross-product' them — a humongous task to little 
>> advantage
>>
>
> I don't see why this is a "humongous task".  Anyway, your solution boils 
> down to the simplest fix in the lexer which is to block some characters 
> from matching any category, does it not?
>

Block? Not sure what you mean… Nothing should change (in the simplest 
solution at least) apart from better error messages.
My suggested solution involved this:
Currently the lexer (basically an automaton) reveals which state it's in 
when it throws an error involving "identifier".
Suggested change: 

if in_ident_state:
    if current_char is allowable as ident_char:
        continue as before
    elif current_char is ASCII:
        usual error
    else:
        throw error eliding the "in_ident state"
else:
    as is...
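
The branching above can be sketched in runnable Python. This is only an
illustration, not CPython's tokenizer: the function name
classify_bad_char, its parameters, and the message strings are all
hypothetical, and str.isidentifier() stands in for the lexer's own test of
what is allowable in an identifier.

```python
import unicodedata


def classify_bad_char(ch, in_ident_state):
    """Return an error message for `ch`, or None if the lexer may continue.

    Hypothetical helper: it mirrors the pseudocode above and is not how
    CPython's tokenizer is actually written.
    """
    if not in_ident_state:
        return "invalid syntax"  # "as is": other states are left untouched
    if ("a" + ch).isidentifier():
        return None  # allowable identifier character: continue as before
    if ord(ch) < 128:
        return "invalid character in identifier"  # the usual error
    # Non-ASCII and not allowed in identifiers: do not mention identifiers.
    name = unicodedata.name(ch, "unnamed character")
    return "invalid character U+%04X (%s)" % (ord(ch), name)
```

With this sketch a curly quote inside an identifier is reported as a bad
character in its own right, while λ is still accepted as an ordinary
identifier character.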

BTW, after my last post I tried some things and found other unsatisfactory 
(to me) behavior in this area; to wit:

 >>> x = 0o19
  File "<stdin>", line 1
    x = 0o19
           ^
SyntaxError: invalid syntax

Of course the 9 cannot appear in an octal constant, but "SyntaxError"??
That seems a little over-general.
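
The same behavior can be checked from a script rather than the REPL. A
minimal sketch, assuming only the built-in compile(); the helper name
octal_error_type is made up for illustration, and the exact message text
differs between Python versions, so only the exception type is inspected.

```python
def octal_error_type(source="x = 0o19"):
    """Compile `source` and return the name of the exception raised, if any."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        # The bad octal digit surfaces as a plain SyntaxError, the same
        # type used for any other malformed statement.
        return type(exc).__name__
    return None
```

A valid literal such as 0o17 compiles cleanly, so the helper returns None
for it.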

My preferred fix:
make LexicalError a sub-exception (subclass) of SyntaxError.

The rest should follow for both cases.
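
A rough sketch of what that could look like. LexicalError does not exist
in Python today; both the class and the check_octal_literal helper below
are hypothetical, written only to show that existing code catching
SyntaxError would keep working.

```python
class LexicalError(SyntaxError):
    """Hypothetical exception for errors found while tokenizing."""


def check_octal_literal(text):
    """Raise LexicalError if `text` starts with 0o but has a non-octal digit."""
    if text[:2].lower() != "0o":
        return  # not an octal literal; nothing to check in this sketch
    for ch in text[2:]:
        if ch not in "01234567_":
            raise LexicalError(
                "invalid digit %r in octal literal %r" % (ch, text)
            )


try:
    check_octal_literal("0o19")
except SyntaxError as exc:  # old handlers still catch the new subclass
    print(type(exc).__name__, exc)
```

Because LexicalError derives from SyntaxError, the distinction is
available to those who want it (teachers, say) without breaking anyone
who does not.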

Disclaimer: I am a teacher, and having a LexicalError category makes it nice 
to explain some core concepts.
However, I understand there are obviously other more pressing priorities 
than making Python superlative as a CS-teaching language :-) 