user-defined operators: a very modest proposal

jepler at unpythonic.net
Tue Nov 22 17:52:10 EST 2005


If your proposal is implemented, what does this code mean?
	if [1,2]+[3,4] != [1,2,3,4]: raise TestFailed, 'list concatenation'
Since it contains ']+[' I assume it must now be parsed as a user-defined
operator, but this code currently has a meaning in Python.

(This is the first example I found, in Python 2.3's test/test_types.py, so it
is real code.)

I don't believe that Python needs user-defined operators, but let me share my
terrible proposal anyway:  Each unicode character in the class 'Sm' (Symbol,
Math) whose value is greater than 127 may be used as a user-defined operator.
The special method called depends on the ord() of the unicode character, so
that __u2044__ is called when the source code contains u'\N{FRACTION SLASH}'.
Whatever alternate syntax is adopted to allow unicode identifier characters to
be typed in pure ASCII will also apply to typing user-defined operators.  "r"
and "i" versions of the operators will of course exist, as in __ru2044__ and
__iu2044__.
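
To make the naming rule concrete, here is a minimal sketch in present-day
Python 3.  The helper name operator_method_names is made up for illustration
and is not part of any real or proposed API; it only spells out the rule
described above (category 'Sm', code point above 127, plus "r" and "i"
variants).

```python
import unicodedata


def operator_method_names(ch):
    """Return the (normal, reflected, in-place) method names for one character.

    Hypothetical helper: it only encodes the naming rule sketched in the text.
    """
    if len(ch) != 1:
        raise ValueError("expected a single character")
    # Only non-ASCII characters in category 'Sm' (Symbol, Math) qualify.
    if unicodedata.category(ch) != "Sm" or ord(ch) <= 127:
        raise ValueError("not a non-ASCII 'Sm' character: %r" % ch)
    code = "u%04x" % ord(ch)
    return ("__%s__" % code, "__r%s__" % code, "__i%s__" % code)


print(operator_method_names("\N{FRACTION SLASH}"))
# ('__u2044__', '__ru2044__', '__iu2044__')
```

Note that '+' is itself in category 'Sm', so the "greater than 127" test is
what keeps the existing ASCII operators out of the scheme.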

Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are not
simple unary or binary operators, the character u'\N{NO-BREAK SPACE}' will be
used to separate arguments.  When necessary, parentheses will be added to
remove ambiguity.  This leads naturally to expressions like
	\N{DOUBLE INTEGRAL} (y * x**2) \N{NO-BREAK SPACE} dx \N{NO-BREAK SPACE} dy
(corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy
to love, except for the small issue that many inferior editors will not clearly
display the \N{NO-BREAK SPACE} characters.
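
The call that such an expression corresponds to can be sketched without any
new syntax.  Everything here is an assumption for illustration: Integrand is
a toy stand-in for the expression (y * x**2), and apply_operator is a made-up
helper playing the role of the compiler.  It uses plain getattr on the
instance, whereas real special methods are looked up on the type.

```python
class Integrand:
    """Toy stand-in for the expression (y * x**2); purely hypothetical."""

    def __init__(self, text):
        self.text = text

    def __u222c__(self, *variables):
        # A real implementation would integrate; this only records the call.
        return "double integral of %s over %s" % (self.text, ", ".join(variables))


def apply_operator(op_char, first, *rest):
    """Dispatch an operator character to the matching method on its first operand."""
    method = getattr(first, "__u%04x__" % ord(op_char))
    return method(*rest)


result = apply_operator("\N{DOUBLE INTEGRAL}", Integrand("y * x**2"), "dx", "dy")
print(result)
# double integral of y * x**2 over dx, dy
```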

Some items on which I think I'd like to hear the community's ideas are:
    * Do we give special meaning to comparison characters like
      \N{NEITHER LESS-THAN NOR GREATER-THAN}, or let users define them in new
      ways?  We could just provide, on object,
	def __u2279__(self, other): return not self.__gt__(other) and not other.__gt__(self)
      which would in effect satisfy all users.

    * Do we immediately implement the combination of operators with nonspacing
      marks, or defer it?  If we implement it, do we allow the combination with
      pure ASCII operators, as in 
        u'\N{COMBINING LEFT RIGHT ARROW ABOVE}+'
      or treat it as a syntax error?  (BTW the method name for this would be
      __u20e1u002b__, even though it might be tempting to support __u20e1x2b__,
      __u20e1add__ and similar method names).  How and when do we normalize
      operators combined with more than one nonspacing mark?

    * Which unicode operator methods should be supported by built-in types?
      Implementing __u222a__ and __iu222a__ for sets is a no-brainer,
      obviously, but what about __iu2206__ for integers and longs?

    * Should some of the unicode mathematical symbols be reserved for literals?
      It would be greatly preferable to write \u2205 instead of the other proposed
      empty-set literal notation, {-}.  Perhaps nullary operators could be defined,
      so that writing \u2205 alone is the same as __u2205__() i.e., calling the
      nullary function, whether it is defined at the local, lexical, module, or
      built-in scope.

    * Do we support characters from the category 'So' (symbol, other)?  Not
      doing so means preventing programmers from using operators like
      u"\N{HEAVY CONCAVE-POINTED BLACK RIGHTWARDS ARROW}".  Who are we to
      make those kinds of choices for our users?
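
Returning to the first item: taking the character name NEITHER LESS-THAN NOR
GREATER-THAN literally, a default __u2279__ would be true exactly when neither
operand is greater than the other.  A minimal sketch, written as a mixin
rather than on object; the class names are made up for illustration, and
frozenset is used only because its ">" is a partial order.

```python
class NeitherLessNorGreater:
    """Hypothetical mixin supplying a default __u2279__ built on '>'."""

    def __u2279__(self, other):
        # True when neither operand is greater than the other.
        return not self > other and not other > self


class S(NeitherLessNorGreater, frozenset):
    """frozenset's '>' means 'proper superset', so some pairs are incomparable."""


print(S({1, 2}).__u2279__(S({3})))     # True: neither is a superset of the other
print(S({1}).__u2279__(S({1, 2})))     # False: the second is greater
```

Equal operands also satisfy the test, which may or may not be what a given
user wants from the operator.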

Jeff