I come to praise .join, not to bury it...

Steve Holden sholden at holdenweb.com
Mon Mar 5 21:14:10 EST 2001


"Huaiyu Zhu" <hzhu at users.sourceforge.net> wrote in message
news:slrn9a89dk.2rm.hzhu at rocket.knowledgetrack.com...
> It is all nice and well to say that sep.join(list) is good for polymorphism,
> except that there are several practical annoyances:
>
> 1. The word join can be either transitive or nontransitive:
>    sep joins list.
>    list joins with sep.
>
It can also be a noun. This is no more help than your assertion.

>    Might sep.glue(join) be clearer?  Well, compatibility with str.join?
>
> 2. Can we really join all kinds of lists?
>
No. I was somewhat startled to see that a UserList of integers could not be
given as an argument to join(). Relief of a kind arrived when I realised
that the same was true of lists themselves.

>>> import UserList
>>> reallist = [1, 2, 3, 4, 5]
>>> userlist = UserList.UserList(reallist)
>>> ":".join(reallist)
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
TypeError: sequence item 0: expected string, int found
>>> ":".join(userlist)
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
TypeError: sequence item 0: expected string, int found
>>>

It would seem reasonable to expect the join() method to try to coerce the
items to strings, but maybe I'm not a reliable guide to what's reasonable.
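For what it's worth, the conversion is easy enough to do by hand. A minimal sketch, assuming a reasonably modern Python; join_as_strings is a made-up name for illustration, not anything the library provides:

```python
def join_as_strings(sep, items):
    """Join arbitrary items with sep, converting each one with str() first.

    Hypothetical helper: str.join itself never converts its items.
    """
    return sep.join(str(item) for item in items)


reallist = [1, 2, 3, 4, 5]

# The plain call still refuses a list of integers.
error = None
try:
    ":".join(reallist)
except TypeError as exc:
    error = exc

# Converting explicitly gives the result one might have hoped for.
joined = join_as_strings(":", reallist)
print(joined)  # 1:2:3:4:5
```

The explicit str() keeps the conversion visible at the call site, which is presumably why join() declines to guess.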

[ ... ]
>
> 3. What's the inverse of join?  Is it split?
>
> >>> b = [str(x) for x in a]
> >>> c = "".join(b)
> >>> c
> '0123456789'
> >>> d = "".split(c)
> >>> d
> ['']

Bzzt. This is along the lines of saying that since 0*x == 0*y, x and y
must be equal. You really shouldn't argue from degenerate cases. The null
string is not a practical separator for split(), though clearly it works
for join().

>
> Oops! The other way round.  Try again
> >>> d = c.split("")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ValueError: empty separator
>
Aha! So you'll believe the interpreter, then?

> What?  Oh, in this particular case, the inverse is
> >>> d = list(c)
> >>> d
> ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
>
> For other cases this is OK:
> >>> c = ".".join(b)
> >>> c
> '0.1.2.3.4.5.6.7.8.9'
> >>> d = c.split(".")
> >>> d
> ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
>
Quite.
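To spell out when split() really does undo join(), here is a small sketch. It assumes the quoted a was range(10), which the output above suggests but the quoted text never shows:

```python
b = [str(x) for x in range(10)]  # assumption: the quoted `a` was range(10)

# A non-empty separator that appears in no component: split() undoes join().
c = ".".join(b)
round_trip_ok = c.split(".") == b

# The empty separator is the degenerate case: split("") raises ValueError,
# and list() only recovers b because every component is one character long.
c_empty = "".join(b)
empty_sep_error = None
try:
    c_empty.split("")
except ValueError as exc:
    empty_sep_error = exc
chars = list(c_empty)

# A component that contains the separator breaks the round trip.
parts = ["a.b", "c"]
lossy = ".".join(parts).split(".")
```

So split() is the inverse of join() only when the separator is non-empty and occurs in none of the components.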

> 4. So can we just change the order of split?  No.  Consider
>
> file.readlines()
> file.read().splitlines()
> file.read().split("\n")
>
> More abstractly, when we split something, we have
>    whole - sep ==> components.
> But when we join things together, we do it like
>    glue + components ==> whole,
> rather than the "more natural"
>    components + glue => whole,
> where "more natural" is defined as "most likely to be guessed by
> uninitiated minds".
>
You either have to understand the basis of a programming language's
operation, or copy other people's code in the hopes that it's well formed
and your use of it does not break it. "More natural" is an unprovable
assertion without usability testing, something which did feature in Python's
early development.
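Incidentally, the three spellings quoted in point 4 are not interchangeable. A small sketch, using io.StringIO as a stand-in for a real file (the sample text is invented for illustration):

```python
import io

text = "first\nsecond\nthird\n"  # sample data, made up for this sketch

# readlines() keeps the trailing newline on every line.
lines_readlines = io.StringIO(text).readlines()

# splitlines() drops the line endings and yields no empty final element.
lines_splitlines = io.StringIO(text).read().splitlines()

# split("\n") drops the newlines but leaves an empty string after the last one.
lines_split = io.StringIO(text).read().split("\n")
```

Only split("\n") can be exactly reversed by "\n".join(), which is one more reason to understand the operation rather than guess at it.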

Nowadays, of course, everyone simply bitches like mad on c.l.py, and Guido
ignores our rantings and introduces excrescences such as "print >>" (don't
get me started :-)

> Of course, all these points do not detract from the fact that from
> the implementer's point of view, sep.join(list) is the most natural.
>
There's the rub. We could all take the source and be our own implementer.
Few of us have the time and the stupidity^H^H^H^H^H^H^H^H^Hcourage.

> So what am I trying to say here?  I'd say that there's not much point in
> arguing, or even explaining which way is better.  What would help is to
> make things more symmetrical/smooth if possible.
>
> Huaiyu

Erm, use string.join() until it is taken away.
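And should it be taken away (later Python versions no longer have it), the components-first spelling is a two-line wrapper. A sketch only; this join is a stand-in written for illustration, mimicking string.join(words, sep) with its single-space default:

```python
def join(words, sep=" "):
    """Components first, glue second: a stand-in for the old string.join()."""
    return sep.join(words)


dotted = join(["0", "1", "2"], ".")
spaced = join(["use", "the", "function"])
```

Nothing changes underneath: the wrapper just calls sep.join(words), so the same string-items-only rule applies.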

regards
 Steve





