Method Underscores?

Thu Oct 21 02:54:58 EDT 2004

Chris S. <chrisks at NOSPAM.udel.edu> wrote:

> Is there a purpose for using trailing and leading double underscores for
> built-in method names?

They indicate that a method is special (not 'built-in'): one that
Python calls implicitly under certain circumstances.
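
A minimal sketch of what "called implicitly" means. The class name Deck
and its contents are made up purely for illustration:

```python
# Hypothetical class, for illustration only.
class Deck:
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        # Python calls this when the built-in len() is applied to a Deck.
        return len(self._cards)

    def __getitem__(self, index):
        # Python calls this for the indexing syntax deck[index].
        return self._cards[index]


deck = Deck(["ace", "king", "queen"])
size = len(deck)    # implicitly calls deck.__len__()
first = deck[0]     # implicitly calls deck.__getitem__(0)
```

Client code never writes deck.__len__() itself; it writes len(deck) and
Python does the dispatching.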

So, for example, a class which happened to define a method named iter
would not start behaving strangely when 'iter' acquired a special
meaning in some future version of the language: the special meaning, if
it comes, will instead be put on __iter__ .  This has indeed happened
(in 2.2).  Otherwise, you'd have the same problem with special methods
as you do with keywords: introducing one is a _major_ undertaking since
it risks breaking backwards compatibility (built-in names do not have
that risk, because a name you define yourself simply shadows the
built-in; it may not be obvious but some reflection will show that).
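
A rough sketch of both points, with hypothetical names (Ledger, and a
user-written function that happens to be called sum).  The plain method
iter is left alone by the language, the iteration protocol only looks at
__iter__, and a user-defined global shadows a built-in of the same name
instead of clashing with it:

```python
# Hypothetical class, for illustration only.
class Ledger:
    def __init__(self, entries):
        self._entries = list(entries)

    def iter(self):
        # An ordinary method that merely happens to be named 'iter';
        # Python attaches no special meaning to it.
        return "user-defined iter, nothing special"

    def __iter__(self):
        # The iteration protocol looks for this name, not for 'iter'.
        # Inside a method body, 'iter' still refers to the built-in.
        return iter(self._entries)


ledger = Ledger([10, 20, 30])
plain = ledger.iter()                     # explicit call, ordinary method
looped = [entry for entry in ledger]      # implicit call to __iter__


# Hypothetical user code defining its own 'sum'.  The module-level
# definition shadows the built-in, so this code behaves the same whether
# or not the language also provides a built-in of that name.
def sum(values):
    total = 0
    for value in values:
        total = total + value
    return "total=%d" % total


report = sum([1, 2, 3])
```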

(( The general practice of marking some class of identifiers with
special characters to distinguish them from others is known as stropping
and was introduced in early Algol 60 implementations, to distinguish
keywords from names; the Algol standard used roman font versus italics
for the purpose, but that didn't translate well to punched cards! ))

> My impression was that underscores are supposed 
> to imply some sort of pseudo-privatization,

Leading-only underscores do.  Underscores in the middle imply nothing;
they are just one style of making an identifier from many words: some
like this_way, some like thatWay.  A trailing-only underscore is
normally used when an identifier would otherwise be a keyword, as in
'class_' or 'print_' (you need some convention for that when you're
interfacing external libraries -- ctypes, COM, CORBA, SOAP, etc, etc --
since nothing stops the external library from having defined a name
which happens to clash with a Python keyword).  Leading AND trailing
double underscores imply specialness.
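
To put the conventions side by side, here is a small sketch; the class
Account and all of its attribute names are invented for the example:

```python
# Hypothetical class, for illustration only.
class Account:
    def __init__(self, owner, class_):
        self.owner = owner
        # Trailing underscore: 'class' is a keyword, so 'class_' is used.
        self.class_ = class_
        # Leading underscore: internal by convention, not enforced.
        self._balance = 0

    def deposit_funds(self, amount):
        # Underscore in the middle: just separates words in the name.
        self._balance += amount

    def __repr__(self):
        # Leading and trailing double underscores: a special method,
        # called implicitly by repr().
        return "Account(%r, %r)" % (self.owner, self.class_)


acct = Account("alex", "savings")
acct.deposit_funds(50)
text = repr(acct)
```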

> but would using 
> myclass.len() instead of myclass.__len__() really cause Python 
> considerable harm?

If you were designing Python from scratch, the tradeoff would be:
-- unstropped specialnames are easier to read, but
-- future evolution of the language will be severely hampered (or
   else backwards compatibility will often get broken).

So it's a tradeoff, just like the choice of stropping or not for
barenames (identifiers).  Perl strops because Larry Wall decided early
on he wanted lots of easy evolution.  There's also a tradition of
stropping identifiers in scripting languages, from EXEC to the present,
sometimes under the guise of _substitution_, where an identifier is not
stropped when it is being bound but needs stropping to be used, as in sh
and tcl.  Rexx and Python deliberately reject that tradition to favour
legibility.

I think Guido got that design choice right: unstropped barenames for all
normal uses, unstropped keywords, pay the price whenever a keyword needs
to be added (that's rarely), stropped-by-convention specialnames
(they're way rarer than barenames in general, _and_ the addition of
specialnames is more frequent).

On the specific lexical-sugar issue of what punctuation characters to
use for this stropping, I pass; the double underscores on both sides are
a bit visually invasive, other choices might have been sparer, but then
I guess that part of the choice was exactly to make the specialness of
specialnames stand out starkly.  Since it's unlikely I'll soon need to
design a Python-like language and thus to decide on exactly how to strop
specialnames, it's blissfully superfluous for me to decide;-).

Alex