Why is there no natural syntax for accessing attributes with names not being valid identifiers?

Wed Dec 4 08:02:37 EST 2013

On Wednesday, December 4, 2013 6:02:18 PM UTC+5:30, Antoon Pardon wrote:
> Op 04-12-13 13:01, rusi schreef:
> > On Wednesday, December 4, 2013 3:59:06 PM UTC+5:30, Antoon Pardon wrote:
> >> Op 04-12-13 11:09, rusi schreef:
> >>> I used the spaces case to indicate the limit of chaos.
> >>> Other characters (that already have uses) are just as problematic.
> >>
> >> I don't agree with the latter. As it is now python can make the
> >> distinction between
> >>
> >> from A import B    and     fromAimportB.
> >>
> >> I see no a priori reason why this should be limited to letters. A
> >> language designer might choose to allow a bigger set of characters
> >> in identifiers like '-', '+' and others. In that case a-b would be
> >> an identifier and a - b would be the operation. Just as in python
> >> fromAimportB is an identifier and from A import B is an import
> >> statement.
> > 
> > I'm not sure what you are saying.
> > Sure a language designer can design a language differently from python.
> > I mentioned lisp. Cobol is another behaving exactly as you describe.
> > 
> > My point is that when you do (something like) that, you will need to change the
> > lexical and grammatical structure of the language.  And this will make
> > for rather far-reaching changes ALL OVER the language, not just in what-follows-dot.
>
> No, you don't need to change the lexical and grammatical structure of
> the language. Changing the characters allowed in identifiers is not a
> change in lexical structure. The only difference in lexical structuring
> would be that '-', '>=' and other similar symbols would have to be
> treated like keywords such as 'from', 'as' etc. instead of being
> recognizable by just being present.

Well I am mystified…
Consider the string a-b in a program text.
A Cobol or Lisp system sees this as one identifier.
Python, C (and most modern languages) see this as three tokens: identifier, operator, identifier.

As I understand it, this IS the lexical structure of the language, and the lexer
is the part that implements it:
- in Cobol/Lisp, keeping it as one token
- in Python/C, breaking it into 3
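
To make the two behaviours concrete, here is a rough sketch. The first function uses Python's own standard-library tokenize module; the second is a made-up toy lexer (TOY_TOKEN and toy_tokens are hypothetical names, not any real Cobol or Lisp implementation) that simply allows '-' inside an identifier.

```python
import io
import re
import tokenize


def python_tokens(source):
    """Return (token type name, text) pairs from Python's own tokenizer."""
    # Drop bookkeeping tokens so only the "visible" lexemes remain.
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in skip
    ]


# Hypothetical Cobol/Lisp-style rule: '-' may appear inside an identifier.
# Any other non-space character is returned as a one-character token.
TOY_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*|\S")


def toy_tokens(source):
    """Split source into lexemes using the toy rule above."""
    return TOY_TOKEN.findall(source)


print(python_tokens("a-b"))  # three tokens: NAME, OP, NAME
print(toy_tokens("a-b"))     # one token: the identifier a-b
print(toy_tokens("a - b"))   # three tokens, because the spaces separate them
```

With the toy rule, whitespace becomes significant around '-', which is the same situation as fromAimportB versus from A import B.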

Maybe you understand in some other way the phrase "lexical structure"?

> And the grammatical structure of the language wouldn't change at all.
> Sure a-b would now be an identifier and not an operation but that is
> of no concern for the parser.

About the grammar, maybe what you are saying holds: presumably, if the token set
is the same, one could keep the same grammar, with the differences being
entirely inter-lexeme ones.
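
A small sketch of that idea, under the assumption that both lexers emit the same two token kinds (NAME and MINUS). Everything here (PYTHON_LIKE, LISP_LIKE, lex, parse) is invented for illustration and is not how any real Python or Lisp front end is written. The parser is shared; only the identifier pattern differs.

```python
import re

# Two hypothetical lexers with the same token kinds: NAME and MINUS.
# The only difference is whether '-' is allowed inside a NAME.
PYTHON_LIKE = re.compile(r"(?P<NAME>[A-Za-z_][A-Za-z0-9_]*)|(?P<MINUS>-)")
LISP_LIKE = re.compile(r"(?P<NAME>[A-Za-z_][A-Za-z0-9_-]*)|(?P<MINUS>-)")


def lex(pattern, source):
    """Return (kind, text) pairs; unmatched characters (spaces) are skipped."""
    return [(m.lastgroup, m.group()) for m in pattern.finditer(source)]


def parse(tokens):
    """One grammar for both lexers: expr := NAME (MINUS NAME)*"""
    if not tokens or len(tokens) % 2 == 0 or tokens[0][0] != "NAME":
        raise ValueError("expected NAME (MINUS NAME)*")
    tree = tokens[0][1]
    for i in range(1, len(tokens), 2):
        (op_kind, _), (name_kind, name) = tokens[i], tokens[i + 1]
        if op_kind != "MINUS" or name_kind != "NAME":
            raise ValueError("expected MINUS NAME")
        tree = ("sub", tree, name)  # left-associative subtraction node
    return tree


print(parse(lex(PYTHON_LIKE, "a-b")))   # a subtraction node
print(parse(lex(LISP_LIKE, "a-b")))     # a single identifier
print(parse(lex(LISP_LIKE, "a - b")))   # a subtraction node again
```

The parse function never changes; what changes is where the lexer draws the boundaries between lexemes.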