[Python-ideas] allow `lambda' to be spelled λ

Rustom Mody rustompmody at gmail.com
Tue Jul 19 10:40:42 EDT 2016

On Tuesday, July 19, 2016 at 7:41:38 PM UTC+5:30, Neil Girdhar wrote:
> On Tue, Jul 19, 2016 at 8:18 AM Rustom Mody wrote:
>> On Tuesday, July 19, 2016 at 5:06:17 PM UTC+5:30, Neil Girdhar wrote:
>>> On Tue, Jul 19, 2016 at 7:21 AM Steven D'Aprano wrote:
>>>> On Mon, Jul 18, 2016 at 10:29:34PM -0700, Rustom Mody wrote:
>>>> > IOW
>>>> > 1. The lexer is internally (evidently from the error message) so
>>>> > ASCII-oriented that any “unicode-junk” just defaults out to identifiers
>>>> > (presumably comments are dealt with earlier) and then if that lexing action
>>>> > fails it mistakenly pinpoints a wrong *identifier* rather than just an
>>>> > impermissible character like python 2
>>>> You seem to be jumping to a rather large conclusion here. Even if you
>>>> are right that the lexer considers all otherwise-unexpected characters
>>>> to be part of an identifier, why is that a problem?
>>> It's a problem because those characters could never be part of an 
>>> identifier.  So it seems like a bug.
>> An armchair-design solution would say: We should give the most 
>> appropriate answer for every possible unicode character category
>> This would need to take all the Unicode character-categories and Python 
>> lexical-categories and 'cross-product' them — a humongous task to little 
>> advantage
> I don't see why this is a "humongous task".  Anyway, your solution boils 
> down to the simplest fix in the lexer which is to block some characters 
> from matching any category, does it not?

Block? Not sure what you mean… Nothing should change (in the simplest 
solution at least) apart from better error messages.
My suggested solution involved this:
Currently the lexer (basically an automaton) reveals which state it's in 
when it throws an error mentioning "identifier".
Suggested change: 

if in_ident_state:
  if current_char is allowable as ident_char:
     continue as before
  elif current_char is ASCII:
     Usual error
  else:
     throw error eliding the "in_ident state"
else:
  as is...
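
A rough Python rendering of the pseudocode above, purely as an illustration. This is not how CPython's tokenizer is written; the function name `classify_char` and the returned labels are made up for this sketch, and `str.isidentifier()` is used as a stand-in for the lexer's own "allowable as ident_char" test.

```python
import unicodedata


def classify_char(ch, in_ident_state):
    """Return a label describing how a hypothetical lexer would treat `ch`."""
    if not in_ident_state:
        # Outside the identifier state nothing changes.
        return "as is"
    # "a" + ch is a valid identifier only if ch may continue an identifier.
    if ("a" + ch).isidentifier():
        return "continue"
    if ord(ch) < 128:
        # ASCII character: keep whatever error the lexer reports today.
        return "usual error"
    # Non-ASCII character that can never be part of an identifier:
    # report the character itself and say nothing about identifiers.
    name = unicodedata.name(ch, "unnamed character")
    return "invalid character {!r} ({})".format(ch, name)


print(classify_char("λ", True))
print(classify_char("$", True))
print(classify_char("“", True))
```

The only behavioural change is in the last branch: a character such as “ gets reported by name rather than as part of an "invalid identifier".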

BTW after last post I tried some things and found other unsatisfactory (to 
me) behavior in this area; to wit:

 >>> x = 0o19
  File "<stdin>", line 1
    x = 0o19
SyntaxError: invalid syntax

Of course the 9 cannot appear in an octal constant, but "SyntaxError"??
Seems a little overgeneral.

My preferred fix:
Make LexicalError a sub-exception (subclass) of SyntaxError.

The rest should follow for both cases above.
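
To make the idea concrete, here is a minimal sketch of what such a subclass could look like. `LexicalError` does not exist in Python; it and the helper `check_octal_literal` are hypothetical names used only for illustration, and the helper handles only plain `0o` literals (no underscores).

```python
class LexicalError(SyntaxError):
    """Hypothetical: the input could not even be split into valid tokens."""


def check_octal_literal(text):
    """Return the value of a literal like '0o17'; raise LexicalError on a bad digit."""
    if not text.lower().startswith("0o"):
        raise ValueError("not an octal literal: {!r}".format(text))
    for pos, ch in enumerate(text[2:], start=2):
        if ch not in "01234567":
            raise LexicalError(
                "invalid digit {!r} in octal literal {!r} at offset {}".format(
                    ch, text, pos
                )
            )
    return int(text, 8)


try:
    check_octal_literal("0o19")
except SyntaxError as exc:
    # Existing handlers for SyntaxError still catch the more specific error.
    print(type(exc).__name__, exc)
```

Because it subclasses SyntaxError, code that already catches SyntaxError would keep working, while error messages (and teaching material) could tell lexical problems apart from grammatical ones.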

Disclaimer: I am a teacher, and having a LexicalError category makes it nice 
to explain some core concepts.
However, I understand there are obviously other more pressing priorities 
than making Python superlative as a CS-teaching language :-)