Why is there no natural syntax for accessing attributes with names not being valid identifiers?

Antoon Pardon antoon.pardon at rece.vub.ac.be
Wed Dec 4 14:57:11 EST 2013


On 04-12-13 14:02, rusi wrote:
> On Wednesday, December 4, 2013 6:02:18 PM UTC+5:30, Antoon Pardon wrote:
>> On 04-12-13 13:01, rusi wrote:
>>> On Wednesday, December 4, 2013 3:59:06 PM UTC+5:30, Antoon Pardon wrote:
>>>> On 04-12-13 11:09, rusi wrote:
>>>>> I used the spaces case to indicate the limit of chaos.
>>>>> Other characters (that already have uses) are just as problematic.
>>>>
>>>> I don't agree with the latter. As it is now python can make the
>>>> distinction between
>>>>
>>>> from A import B    and     fromAimportB.
>>>>
>>>> I see no a priori reason why this should be limited to letters. A
>>>> language designer might choose to allow a bigger set of characters
>>>> in identifiers like '-', '+' and others. In that case a-b would be
>>>> an identifier and a - b would be the operation. Just as in python
>>>> fromAimportB is an identifier and from A import B is an import
>>>> statement.
>>>
>>> I'm not sure what you are saying.
>>> Sure, a language designer can design a language differently from Python.
>>> I mentioned Lisp. Cobol is another that behaves exactly as you describe.
>>>
>>> My point is that when you do (something like) that, you will need to change the
>>> lexical and grammatical structure of the language.  And this will make 
>>> for rather far-reaching changes ALL OVER the language not just in what-follows-dot.
>>
>> No, you don't need to change the lexical and grammatical structure of
>> the language. Changing the characters allowed in identifiers is not a
>> change in lexical structure. The only difference in lexical structuring
>> would be that '-', '>=' and other similar symbols would have to be
>> treated like keywords such as 'from', 'as' etc., instead of being
>> recognizable just by being present.
> 
> Well I am mystified…
> Consider the string a-b in a program text.
> A Cobol or Lisp system sees this as one identifier.
> Python, C (and most modern languages) see this as ident, operator, ident.
> 
> As I understand it this IS the lexical structure of the language and the lexer
> is the part that implements this:
> - in cobol/lisp keeping it as one
> - in python/C breaking it into 3
> 
> Maybe you understand in some other way the phrase "lexical structure"?

Yes, I do. The fact that a certain string is tokenized differently is,
IMO, not enough to conclude that the language has a different lexical
structure. It only means that the values allowed within the structure are
different. What I see here is that some languages have a different alphabet
from which identifiers may be built.
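
To make the current behaviour concrete, here is a small sketch using the
standard-library tokenize module (Python 3). The helper name lex is only
for illustration. Note that tokenize reports keywords such as 'from' and
'import' as ordinary NAME tokens; they only become special later on.

```python
import io
import tokenize

# Token types that only describe layout; they are dropped for readability.
_LAYOUT = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
           tokenize.INDENT, tokenize.DEDENT}

def lex(source):
    """Return (token type name, text) pairs for a piece of Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tokenize.tok_name[t.type], t.string)
            for t in tokens if t.type not in _LAYOUT]

print(lex("a-b"))              # NAME, OP, NAME
print(lex("fromAimportB"))     # a single NAME
print(lex("from A import B"))  # four NAME tokens
```

In all three cases the same two token kinds (NAME and OP) are involved;
only the text attached to them differs.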

>> And the grammatical structure of the language wouldn't change at all.
>> Sure a-b would now be an identifier and not an operation but that is
>> of no concern for the parser.
> 
> About grammar maybe what you are saying will hold: presumably if the token-set
> is the same, one could keep the same grammar, with the differences being 
> entirely inter-lexeme ones.

And the question is: if the token-set is the same, how then is the lexical
structure different, rather than just the possible values associated with
the tokens?
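
As a rough sketch of what I mean (a toy lexer, not how Python or any real
language implements it; toy_lex, OPERATORS and extra_ident_chars are
made-up names), the identifier alphabet can be a parameter while the set
of token kinds stays fixed at NAME and OP. With the wider alphabet, an
operator is recognised the way a keyword is: only when it stands alone.

```python
OPERATORS = (">=", "-", "+")  # longest first, so ">=" is tried before others

def toy_lex(source, extra_ident_chars=""):
    """Split source into (kind, text) pairs; kind is 'NAME' or 'OP'."""
    def is_ident(ch):
        return ch.isalnum() or ch == "_" or ch in extra_ident_chars

    tokens, i = [], 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif is_ident(ch):
            # Take the maximal run of identifier characters.
            j = i
            while j < len(source) and is_ident(source[j]):
                j += 1
            word = source[i:j]
            # A run that is exactly an operator is treated like a keyword.
            tokens.append(("OP" if word in OPERATORS else "NAME", word))
            i = j
        else:
            # Narrow alphabet: operators are recognised just by being present.
            for op in OPERATORS:
                if source.startswith(op, i):
                    tokens.append(("OP", op))
                    i += len(op)
                    break
            else:
                raise ValueError("unexpected character %r" % ch)
    return tokens

wide = "-+>="
print(toy_lex("a-b"))          # [('NAME', 'a'), ('OP', '-'), ('NAME', 'b')]
print(toy_lex("a-b", wide))    # [('NAME', 'a-b')]
print(toy_lex("a - b", wide))  # [('NAME', 'a'), ('OP', '-'), ('NAME', 'b')]
```

A parser consuming these pairs sees the same kinds of tokens either way,
which is why I would say only the alphabet has changed.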

-- 
Antoon Pardon



