Allow anything in identifiers?

Thu Mar 1 12:26:14 EST 2001

Thomas Wouters wrote:
> 
> On Thu, Mar 01, 2001 at 01:04:30PM +0100, Sverker Nilsson wrote:
> > David Porter wrote:
> 
> > > > alu.id'+' = lambda x, y...
> 
> > > It would break any code using the name 'id', likely a lot of code.
> 
> > How? Not id <string>, should likely be on the same line, may allow
> > space in between. That gives a syntax error currently, except when id
> > is some of a few special ones like r, u, etc.
> 
> If you want id"string" to be a string literal, the example above will not

No, id"string" should work exactly as an identifier!

> work. String literals create strings, not identifiers. Creating a special
> kind of string that really creates an identifier would be... Weird.

Humm, 'weird', that might be what I was referring to with 'a kind of
ugliness'. I thought of other ways, maybe some other kinds of quotes,
maybe curly brackets... id{asdf} ? Not sure if it clashes with anything
in Python. It would confuse Perl programmers though.

I'd like to note though that Ada allows strings as function names,
albeit restricted: you can define functions overloading the predefined
operators symbols like this:

function "=" (left, right: my_type) return boolean;

I don't think that was or is considered one of the more controversial
or weirdest features in Ada, on the contrary.

> 
> >   operator.__lt__ (though I am just guessing about
> >   [that] one and I  am sure it's wrong - that's a point of the
> >   argument :-)
> 
> The latter doesn't exist. But neither does operator.__cmp__, so it's not
> that big a surprise.
> 

Well sorry for that confusion, I was sloppy.

For an example that has actually led to demonstrable confusion:
operator.__inv__, which IIRC corresponds to the ~ operator.  But as
the magic method for objects, it's called __invert__. I found this some
weeks ago in Python 1.5.2 and had to work around it (quite easily,
admittedly, once I had found it out).

I have not reported it as a bug 'cause I thought it might be fixed in
versions 2...  or didn't know how to submit it and/or was lazy...
anyway I'd say it's a typical example of how having to invent
new names may lead to misunderstandings & bugs.
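
For reference, a minimal sketch of how the two spellings line up in a
current Python 3 interpreter: the hook that ~ calls is __invert__, and
the operator module offers the function as both inv and invert. The
class name Bits is made up for this illustration.

```python
import operator


class Bits:
    """Hypothetical wrapper type, used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __invert__(self):
        # Python calls this method to evaluate ~obj.
        return Bits(~self.value)


b = Bits(5)

# All three routes end up in Bits.__invert__.
via_syntax = (~b).value
via_invert = operator.invert(b).value
via_inv = operator.inv(b).value
print(via_syntax, via_invert, via_inv)
```

Whether the two names were already unified in any particular 2.x
release is not something this sketch tries to show.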

> > * Same holds for the corresponding class methods __add__ etc.
> 
> Actually, you guessed exactly right with __lt__. That's the name for the
> less-than operator 'magic method'. But even with special syntax to declare

Oh, that's actually new for me, I thought they only had the __cmp__
operator. New in version 2.. I suppose?
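
(The rich comparison methods such as __lt__ arrived with Python 2.1,
PEP 207.) A small sketch of how __lt__ behaves in a current Python 3
interpreter; the Version class is invented for the example and is not
from any library.

```python
import operator


class Version:
    """Hypothetical example type; only __lt__ is defined."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __lt__(self, other):
        # Called for self < other.
        return self.key < other.key


old, new = Version(1, 5), Version(2, 0)

less_by_syntax = old < new
less_by_function = operator.lt(old, new)
# new > old has no __gt__ to call, so Python falls back to the
# reflected operation old.__lt__(new).
greater_by_reflection = new > old
print(less_by_syntax, less_by_function, greater_by_reflection)
```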

> operator'+', you can't throw away all magic names. What to do with __cmp__,
> __init__, __del__, __setitem/attr__ ?

Nothing I suppose, if they are not renaming something obvious. If
one wants to get fancy, maybe one could think of some regexp that
matches the occurrences in a program, but it seems that could be
unnecessarily complicated.

> 
> > * Command interpreters could dispatch to functions that had obvious
> >   names corresponding to the actual commands.
> 
> > * Variables corresponding to files with fixed name, could have
> >   the obvious name.
> 
> >>> class X:
> ...     pass
> ...
> >>> x = X()
> >>> setattr(x, '%$@(*#@$+', 2)
> >>> getattr(x, '%$@(*#@$+')
> 2
> 
> I don't see a lot of people using that hack. I also don't see how using that

Maybe because they already need to come up with some other name anyway.
I.e. if you need to define a function that can't be expressed by a
lambda, then you need a name for it anyway, and then you have already
lost the advantage you might get from having it named more naturally.
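
The command-interpreter case mentioned earlier in the thread can be
approximated without new syntax by keying a dictionary on the command
text. This is only a sketch; the command names and the helper
functions are made up for illustration.

```python
def show_help():
    return "commands: help, ?, list-files"


def list_files():
    return "no files yet"


# Keys are the literal command strings, so they need not be valid
# identifiers ('list-files' and '?' are not).
commands = {
    'help': show_help,
    '?': show_help,
    'list-files': list_files,
}


def dispatch(line):
    """Look up the handler for a command line and call it."""
    name = line.strip()
    handler = commands.get(name)
    if handler is None:
        return "unknown command: " + name
    return handler()


print(dispatch("list-files"))
print(dispatch("?"))
```

The functions still need ordinary names here, which is the cost
described above.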

If you can define your function with a simple lambda, I think it can
be natural to use the operator's name when applicable. I did that
myself in making an interpreter for a Python-like language. To define
some operators I made a table:

comparison_op_table = {
    '<': lambda a, b: a.cmp_op(b, lambda c: c < 0),
    '>': lambda a, b: a.cmp_op(b, lambda c: c > 0),
    ...
    }

I defined this as a separate table at the global level. I think it
would have been more natural if I could have defined it directly in
the class where it was used, with a method for each case.  I could do
that with setattr I suppose, as you suggested, but that wouldn't be
quite as clear, would be longer to write, and would not be in
declarative form, so to say.  And if the functions didn't fit in
simple lambdas I'd have had to make named functions from them anyway.
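
A sketch of the two workarounds available without new syntax: keeping
the table inside the class body, and attaching a symbol-named
attribute with setattr. The Value class and the body of cmp_op are
assumptions made for this example; the real interpreter's versions
are not shown in this message.

```python
class Value:
    """Hypothetical stand-in for the interpreter's value type."""

    def __init__(self, n):
        self.n = n

    def cmp_op(self, other, test):
        # Three-way compare (-1, 0 or 1), then apply the test to it.
        c = (self.n > other.n) - (self.n < other.n)
        return test(c)

    # The table now lives in the class, keyed by operator symbol.
    comparison_ops = {
        '<': lambda a, b: a.cmp_op(b, lambda c: c < 0),
        '>': lambda a, b: a.cmp_op(b, lambda c: c > 0),
    }

    def compare(self, symbol, other):
        return self.comparison_ops[symbol](self, other)


# The setattr route: legal, but not declarative and longer to write.
setattr(Value, '<=', lambda a, b: a.cmp_op(b, lambda c: c <= 0))

one, two = Value(1), Value(2)
print(one.compare('<', two))
print(getattr(Value, '<=')(two, two))
```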

> kind of variable name can be any clearer. I especially don't believe in

Well for another example, I have always found writing type(()) to be
_clearer_ than types.TupleType: it's more direct, you see directly
what it is, and you don't need to remember or look up another name.
The only reason I see not to use type(()) is that it might be a bit
slower, I suppose. If the names were types.id'type(())', I think that
might be both faster and clearer... no?
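
For comparison, a short sketch of how this looks in a current Python 3
interpreter, where types.TupleType no longer exists and the built-in
name tuple is the type object itself:

```python
# type(()) evaluates to the very same object as the built-in tuple.
empty_tuple_type = type(())
same_object = empty_tuple_type is tuple

# The usual type check today uses the built-in name directly.
is_tuple = isinstance((1, 2), tuple)
is_list_a_tuple = isinstance([1, 2], tuple)

print(same_object, is_tuple, is_list_a_tuple)
```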

In general, I think the fewer names you need to invent and remember, and
the more you can depend on general constructs, the clearer the programs
would be.

> 'fixed named files': you should name your variables according to what

Well you are right about not tying general programs to file names that
can be parameters-in-some-sense to the program.

> function they perform, not what they contain -- what they contain is already
> obvious ;) I might call a template file
> '/usr/local/lib/<application group>/<application>.tmpl', but the variable
> would simply be called 'tmpl' or 'TEMPLATE' or some such.

Right, I suppose. But if I make a special short script to handle a
hard-coded file 'mother's recipes', why do I need to invent a new name
to hold the file object? I don't usually do that with imported modules,
even though they are file names in the same way; the only difference
is that their creators know they're going to be used as Python
identifiers. Mother doesn't necessarily know that.

Well I'm hesitating about file names myself. Maybe if the file system
was more integrated with the variables it would be more obvious.

Repeating a quote, taken in another context:

> function they perform, not what they contain -- what they contain is already
> obvious ;)

I'd agree at least for a variable that can contain different things
that perform similar things - one would want to name it according to
what is similar. Actually I think these new identifiers could be
mostly applicable to constants. Functions and other constants. 

Example: From the math constants: Looking in C's math.h you have
examples of how having more general identifiers would have
helped. There are names for simple constants like pi and e, but also
for 2/sqrt(pi), sqrt(2), 1/sqrt(2), with non-obvious names like
M_2_SQRTPI, M_SQRT2, M_SQRT1_2. (They may follow a pattern but it's
not THAT obvious and not general enough to allow for more complicated
expressions, lacking parentheses for example.)

From Python's math module I recall only having constants e and pi last
time I looked. I suppose more constants would be useful and maybe
it would help if we could define many of them with the 'obvious'
names so we don't have to think longer about what to call them...
and also not be tied to C's more or less arbitrary subset of
useful constants.
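
Without new syntax, something close to this can be sketched with a
dictionary whose keys are the defining expressions. The name
constants is made up for the example; the C names in the comments are
the ones from math.h mentioned above.

```python
import math

# Each key is the 'obvious' name: the expression that defines the value.
constants = {
    'pi': math.pi,
    'e': math.e,
    '2/sqrt(pi)': 2 / math.sqrt(math.pi),  # C: M_2_SQRTPI
    'sqrt(2)': math.sqrt(2),               # C: M_SQRT2
    '1/sqrt(2)': 1 / math.sqrt(2),         # C: M_SQRT1_2
}

print(constants['2/sqrt(pi)'])
print(constants['1/sqrt(2)'])
```

The lookup costs a dictionary access and the keys are plain strings,
so nothing checks that a key really matches its value.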

> 
> > One drawback I can think of myself right now is a kind of ugliness
> > by using quotes but at least they make the construct stand out so it
> > should not be so confusing to ordinary id's.
> 
> The main drawback I see is your 'strong point': I don't believe allowing
> more characters in identifiers is going to make programs easier to read or
> follow, or going to make constructs more self-explanatory. I just see added
> complexity with no obvious benefit.

Humm. Well, I am trying to understand why you are believing and seeing
those things but I haven't managed yet. If you'd like to explain it
more it would be helpful, I would appreciate it but of course it's
about your time and effort.

I'm interested in this not only as a possible addition to Python but
also more in general, I am thinking about whether to add or not to add
this feature to a small language that I am making that is based on
Python. (Part of a bigger project.)

Thanks for your comments so far!

Sverker Nilsson