Method or function?

Alex Martelli aleaxit at yahoo.com
Fri Nov 3 06:57:24 EST 2000


"Dale Strickland-Clark" <dale at out-think.NOSPAMco.uk> wrote in message
news:9js40t8t9tcc0cqtg8h4lkg79kheqp6fca at 4ax.com...
    [snip]
> I was also curious about the new join() method which strikes me as
> being the wrong way around. Join should be a method of sequences not
> of the string that is placed between the elements.

I guess you and I have different viewpoints about what the
"proper" role of a method is or should be -- in Python (or
in most other OO languages, i.e., those without multiple
dispatch).  For me, it's definitely not an issue of syntactic
sugar -- it's an issue of polymorphism.  Since I can
dispatch on only one specific "argument" (the
object to which the method is applied), the crucial issue
is that the method must belong to the specific object for
which I might want to get polymorphic behaviour.

A sequence must have "a way" to supply all of its items,
one after another, with some indication of there being
no more items to supply.  That "iterator" will be used
in any context where iteration is desired; the sequence
must do the same job anyway, whether its iteration is
being used by map(), by a for statement, by join(), by
zip(), etc, etc.  So, having _one and only one_ "way"
seems to be the best architecture here.

The "added-value" of being able to supply different
sequences of items when iteration is requested
in different contexts would seem to me to be _negative_:
I cannot think of a situation in which I would like
such disuniformity.  Therefore, I'm very happy that
Python's "iteration protocol" (__getitem__ called
with progressive integers, 0 up, with IndexError
being raised to signify end-of-sequence) is uniformly
applied for any iteration need.
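
As a minimal sketch of that protocol (the Countdown class is a
hypothetical example, not something from the original post), the
code below defines only __getitem__ and lets IndexError mark the
end.  It is written for a current Python 3 interpreter, which
normally prefers __iter__ but still falls back to this older
protocol, so the same object serves for, map(), zip() and join()
alike:

```python
class Countdown:
    """Hypothetical sequence using only the __getitem__/IndexError protocol."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # Called with 0, 1, 2, ...; IndexError signals the end of the sequence.
        if index >= self.start:
            raise IndexError(index)
        return str(self.start - index)


seq = Countdown(3)

# Every consumer below drives the very same __getitem__ method.
via_for = [item for item in seq]
via_map = list(map(int, seq))
via_zip = list(zip(seq, Countdown(2)))
via_join = ", ".join(seq)
```

The sequence has one job, and none of its consumers needs to be
one of its methods.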

'join' (like 'map', 'zip', 'for', ...) needs to
iterate on a sequence (more than one sequence,
in the case of map and zip, is also possible); to
get uniformity of iteration, I'm _very_ happy that
none of these is a method of sequence objects.


Is there positive added-value in having 'join' be
a method of the _joiner_?  We've seen why I do not
think it would make sense to have it as a method
of the _joinee_ (it would not substantially help, and
it _might_ hurt, at least by suggesting possible
disuniformity in linear iteration on sequences...),
but wouldn't it be better to keep 'join' just as
a function, like 'map' and 'zip' ('for' is even a
_statement_...)?

The actual motivation for having join become a
method of the 'joiner' was, no doubt, the same as
for all of the other string methods: easing the
introduction of Unicode strings side by side
with 'ordinary' strings -- avoiding the need for
a function to test the type of its argument (for
join, the separator argument) and switch on that
(method dispatch is handier & speedier).
The polymorphism of .join and the rest between
ordinary and Unicode strings does make it easier
to write higher level code that can work in either
environment; a small but positive added-value.
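
A rough illustration of that point, under a stated assumption:
the pair discussed above is the ordinary and Unicode strings of
Python 2, while this sketch uses str and bytes in Python 3 as a
present-day analogue.  The join_all helper is a hypothetical name.
It never inspects the separator's type; method dispatch picks the
right join:

```python
def join_all(separator, items):
    """Join items with separator; the separator's own type selects the method."""
    return separator.join(items)


# Same higher-level code, two different string types.
text = join_all(", ", ["a", "b", "c"])
raw = join_all(b", ", [b"a", b"b", b"c"])
```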

But, there may be more; I _can_ envisage cases
in which join's polymorphism upon the joiner (as
opposed to the joinee) makes my life easier in
application programming.  Specifically, this
gives me the ability to have the joiner be an
instance of some suitable class, that defines
the join method to perform slightly different
but compatible tasks.


Suppose, for example, that I'm emitting data
that start as
    results = ['this', 'that', 'the other']
or
    results = ('wine', 'women', 'song')
by
def emit(results, separator=', '):
    print separator.join(results)

This gives me the output
    this, that, the other
    wine, women, song
which is passable, but not _quite_ what I
want.  Thanks to dispatch-on-separator,
the fix is easy:

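class MultiSep:
    def __init__(self, last=' and ', others=', '):
        self.last = last
        self.others = others
    def join(self, sequence):
        # With fewer than two items there is nothing for
        # the 'last' separator to separate.
        if len(sequence) < 2:
            return self.others.join(sequence)
        return self.others.join(sequence[:-1]) +\
               self.last + sequence[-1]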


Now, redefining emit, just to change its second
argument's default value:

def emit(results, separator=MultiSep()):
    print separator.join(results)

most easily gives the desired output format:

    this, that and the other
    wine, women and song

for the same existing calls to emit(results).
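
The call sites can be sketched as follows.  This is an adaptation
for a current Python 3 interpreter, not the code as posted: emit
returns the text instead of printing it, and join copies its
argument into a list so that any iterable is accepted (both are
assumptions made for the sake of a checkable example).  The point
is that a plain string and a MultiSep instance are interchangeable
as the second argument:

```python
class MultiSep:
    def __init__(self, last=" and ", others=", "):
        self.last = last
        self.others = others

    def join(self, sequence):
        items = list(sequence)  # accept any iterable, not only sliceable sequences
        if len(items) < 2:
            return self.others.join(items)
        return self.others.join(items[:-1]) + self.last + items[-1]


def emit(results, separator=MultiSep()):
    # Returns the text instead of printing it, so the result is easy to check.
    return separator.join(results)


default_style = emit(["this", "that", "the other"])
plain_style = emit(("wine", "women", "song"), "; ")
custom_style = emit(["tea", "coffee", "water"], MultiSep(last=" or "))
```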


Note that the parameters of MultiSep.__init__
are typical of such needs to tweak join(): they
are all about *the joiner* (aka 'the separator'),
NOT about *the sequence*.  The sequence does
its job (enumerating its items, period), and
the joiner does _ITS_ own -- joining those
items that get enumerated in the right way.

If join were dispatched _on the sequence_, how
could we redefine emit to use multiple separators
by default, but still be easily callable with a
different separator (joiner, I'd call it:-)
object?  I think it would have to be rather less
direct, more abstruse, with a sequence-wrapping
class, and so on.
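
One guess at what such a sequence-wrapping design might look like
(Joinable and emit_wrapped are hypothetical names, and this is
only one of several possible shapes, written for Python 3):

```python
class Joinable:
    """Hypothetical wrapper that gives a sequence its own join() method."""

    def __init__(self, items):
        self.items = list(items)

    def join(self, separator=", ", last=None):
        # The separator policy now has to travel as extra arguments.
        if last is None or len(self.items) < 2:
            return separator.join(self.items)
        return separator.join(self.items[:-1]) + last + self.items[-1]


def emit_wrapped(results, separator=", ", last=" and "):
    # Every caller's sequence must be wrapped before it can be joined.
    return Joinable(results).join(separator, last)


wrapped_default = emit_wrapped(["this", "that", "the other"])
wrapped_plain = emit_wrapped(("wine", "women", "song"), "; ", None)
```

The output is the same, but the knowledge about separators has
leaked into both the wrapper's join signature and emit's own
parameter list, and any new joining style means touching both.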


Sure, multiple dispatch would ensure the most
potential generality.  But as long as dispatch
is to be on one object only (the method's
"owner"), then it definitely seems to me that
having that object be the joiner rather than
the joinee is by far the best architecture
for the join method.


Alex





