I come to praise .join, not to bury it...

Huaiyu Zhu hzhu at users.sourceforge.net
Mon Mar 5 18:47:38 EST 2001


It is all well and good to say that sep.join(list) is good for polymorphism,
but there are several practical annoyances:

1. The word join can be either transitive or intransitive:
   sep joins list.
   list joins with sep.
   
   Might sep.glue(list) be clearer?  But then what about compatibility with
   str.join?
   
2. Can we really join all kinds of lists?

>>> a = range(10)
>>> "".join(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sequence item 0: expected string, int found

3. What's the inverse of join?  Is it split?

>>> b = [str(x) for x in a]
>>> c = "".join(b)
>>> c
'0123456789'
>>> d = "".split(c)
>>> d
['']

Oops!  The other way round.  Try again:
>>> d = c.split("")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: empty separator

What?  Oh, in this particular case, the inverse is
>>> d = list(c)
>>> d
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

With a non-empty separator, split does work as the inverse:
>>> c = ".".join(b)
>>> c
'0.1.2.3.4.5.6.7.8.9'
>>> d = c.split(".")
>>> d
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
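
To make the limits of that inverse concrete, here is a small sketch of when
split undoes join.  The function name roundtrips is made up for illustration;
the point is only that the round trip needs a non-empty separator that occurs
in none of the components.

```python
def roundtrips(sep, parts):
    # Hypothetical check: does splitting the joined string give back
    # the original components?
    if sep == "":
        # split("") raises ValueError, so list() is the only candidate
        # inverse, and it only works for single-character components.
        return list(sep.join(parts)) == list(parts)
    return sep.join(parts).split(sep) == list(parts)

print(roundtrips(".", ["0", "1", "2"]))   # True
print(roundtrips(".", ["1.5", "2"]))      # False: "1.5" contains the separator
print(roundtrips("", ["0", "1", "2"]))    # True
print(roundtrips("", ["10", "2"]))        # False: "10" comes back as "1", "0"
```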

4. So can we just reverse the order of split, to sep.split(whole)?  No.
   Consider:

file.readlines()
file.read().splitlines()
file.read().split("\n")
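
These three do not even agree with each other.  A minimal sketch, using an
in-memory io.StringIO as a stand-in for a real file and a two-line sample
string chosen only for illustration:

```python
import io

text = "a\nb\n"

# readlines() keeps the line terminators.
by_readlines = io.StringIO(text).readlines()
# splitlines() drops them and adds no trailing empty string.
by_splitlines = text.splitlines()
# split("\n") drops them but leaves a trailing '' after the final newline.
by_split = text.split("\n")

print(by_readlines)    # ['a\n', 'b\n']
print(by_splitlines)   # ['a', 'b']
print(by_split)        # ['a', 'b', '']
```

Joining back tells the same story: "".join(by_readlines) and
"\n".join(by_split) both reproduce the text, while "\n".join(by_splitlines)
loses the final newline.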

More abstractly, when we split something, we have 
   whole - sep ==> components.
But when we join things together, we do it like
   glue + components ==> whole,
rather than the "more natural"
   components + glue ==> whole,
where "more natural" is defined as "most likely to be guessed by uninitiated
minds".
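
As a rough sketch of what the components-first spelling could look like, here
are two thin wrappers.  The names joined and split_apart are hypothetical, not
part of any library; they only reorder the arguments and delegate to the
existing string methods.

```python
def joined(components, glue=""):
    # Hypothetical wrapper: components first, glue second,
    # delegating to the existing glue.join(components).
    return glue.join(components)

def split_apart(whole, glue):
    # Mirror image: whole first, glue second, so that both
    # directions name the glue last.
    return whole.split(glue)

whole = joined(["0", "1", "2"], ".")
print(whole)                     # 0.1.2
print(split_apart(whole, "."))   # ['0', '1', '2']
```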

Of course, all these points do not detract from the fact that, from the
implementer's point of view, sep.join(list) is the most natural.

So what am I trying to say here?  I'd say that there's not much point in
arguing, or even explaining which way is better.  What would help is to
make things more symmetrical/smooth if possible.

Huaiyu


