[Python-checkins] CVS: python/dist/src/Lib string.py,1.46,1.47

bjorn bjorn at roguewave.com
Tue Feb 22 14:45:50 EST 2000


Gordon McMillan wrote:

> bjorn wrote:
>
> > I must respectfully disagree.  When doing OO design, the fundamental question
> > is "if I want to do 'foo', to what am I doing 'foo'?"...  In this case "if I
> > want to 'join', what am I 'joining'?", the answer is that you're joining a
> > sequence type.  Thus, IMHO the natural (and extensible to user types) way
> > to do it would be:
> >
> >     [1,2,3].join()
> >     (1,2,3).join(' ')
>
> string.join only works on lists of strings. If you introduce a
> special "sequence of strings" type, then join could be a
> method on that. Otherwise, you need to work out some new
> semantics.

Sorry, bad example:

    ['a', 'b', 'c'].join()

[1,2,3].join() would presumably be as ill defined as string.join([1,2,3])...
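
To make the "ill defined" case concrete, here is a minimal sketch. It assumes a Python in which strings carry a join method (the spelling under discussion in this thread) and uses current exception syntax; the variable names are only illustrative.

```python
# A sequence of strings joins cleanly; here the separator is the receiver.
words = ['a', 'b', 'c']
joined = ' '.join(words)

# A sequence of non-strings is rejected rather than silently coerced.
try:
    ' '.join([1, 2, 3])
    error = None
except TypeError as exc:
    error = exc
```

Whichever object ends up owning the method, the element type still has to be a string for the operation to mean anything.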

> > if you think of another domain, say a gui with a textarea this might become
> > clearer. Nobody would want to write:
> >
> >     "foo bar".put(textarea)
>
> If textareas had a well defined interface that strings knew
> about, that would be fine,
> [snip]

Exactly. Strings can't possibly know enough about all possible sequence types,
including user defined ones, to make this efficient.

> [snip]
> But the scope here is the existing "string.join". The fact that
> the procedural interface has the instance as the 2nd arg is a
> distraction. It is a string method, not a list method.
>
> - Gordon

Just because join happens to live in the string module does not make it a string
method (that's certainly possible, but one doesn't logically follow from the other).
The fact that the string module interface takes the separator as its second argument
is pedagogically important.  E.g. the self parameter to a method is always passed
first, and the ast/oo implementations in other procedural languages always pass the
object being acted on as the first argument.  It therefore seems natural for a Python
programmer to infer that join(seq, sep) should be read as seq.join(sep).
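
A small sketch of that inference, assuming a hypothetical StringList class (not part of any library) and a module-level join written only for this illustration:

```python
class StringList(list):
    """Hypothetical list subclass used only to illustrate the argument."""

    def join(self, sep=''):
        # self is the sequence being joined; the string type does the work.
        return sep.join(self)


def join(seq, sep=''):
    # Procedural spelling: the object acted on is the first argument,
    # exactly where self sits in the method definition above.
    return StringList.join(seq, sep)


seq = StringList(['a', 'b', 'c'])
procedural = join(seq, '-')
method = seq.join('-')
```

Both spellings give the same result, which is the reading a programmer would expect from the argument order.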

from Frederik:

-- python 1.6 will have *two* different string types.

which presumably would be convertible, so a myContainer.join implemented for regular
strings should carry over to Unicode strings (no?)

-- python 1.6 will have lots of sequence types (at
   least one more than 1.5.2!)

Which (a) is a good argument for implementing join in the sequence types, since
there are many more than *two* sequence types <wink>, and (b) suggests that we really
need a common base class for the sequence types to share common functionality
(Python is an OO language after all...)
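
One way such a shared base class might look, as a rough sketch only. JoinMixin, JoinableList and JoinableTuple are hypothetical names, and the syntax assumes a Python in which built-in types can be subclassed and strings have a join method:

```python
class JoinMixin:
    """Hypothetical common base: any iterable container gets join for free."""

    def join(self, sep=''):
        # Depends only on iteration, so each sequence type need not
        # reimplement it; the string type still performs the joining.
        return sep.join(self)


class JoinableList(JoinMixin, list):
    pass


class JoinableTuple(JoinMixin, tuple):
    pass


from_list = JoinableList(['a', 'b']).join(', ')
from_tuple = JoinableTuple(('x', 'y')).join()
```

The method is written once and every sequence type that inherits from the base picks it up.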

-- while join can be *defined* as a series of concatenations,
   *implementing* things that way doesn't work in real life.  do
   you really think that *all* sequence objects should know
   how to join all possible string types in an efficient way?  or
   is that better left to the string implementation?

Conversely, do you really think that both string classes should know how to
efficiently join all possible sequence types?

Consider:

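    class MyTree:
        def join(self, sep=''):
            lhs = rhs = ''
            if self.lhs:
                # recurse into the left subtree; the separator follows it
                lhs = self.lhs.join(sep) + sep
            if self.rhs:
                # recurse into the right subtree; the separator precedes it
                rhs = sep + self.rhs.join(sep)
            # presumably the + operator would do any necessary coercion?
            return lhs + self.data + rhs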

Sometimes implementing join as a series of concatenations does work, and is much
easier than trying to implement __getitem__.
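
A self-contained variant of the tree example that can actually be run. The constructor and the sample tree are assumptions added for illustration, and separators are placed only between elements:

```python
class MyTree:
    """Binary tree of strings; the constructor is an assumed addition."""

    def __init__(self, data, lhs=None, rhs=None):
        self.data = data
        self.lhs = lhs
        self.rhs = rhs

    def join(self, sep=''):
        # In-order traversal built purely from string concatenation;
        # no __getitem__ or other sequence protocol is needed.
        result = self.data
        if self.lhs:
            result = self.lhs.join(sep) + sep + result
        if self.rhs:
            result = result + sep + self.rhs.join(sep)
        return result


tree = MyTree('b', MyTree('a'), MyTree('c', rhs=MyTree('d')))
result = tree.join(', ')
```

Here result is 'a, b, c, d', and the tree never had to pretend to be an indexable sequence.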

at-least-until-we-get-generators'ly y'rs
-- bjorn





