[Python-ideas] str.split() oddness

Arnaud Delobelle arnodel at gmail.com
Sat Feb 26 22:31:06 CET 2011


On 26 February 2011 14:03, Mart Sõmermaa <mrts.pydev at gmail.com> wrote:
> IMHO, x.join(a).split(x) should be "idempotent"
> in regard to a.

Idempotent is the wrong word here.  A function f is idempotent if
f(f(x)) == f(x) for all x.  What you are stating is that given:

    f_s(x) = s.join(x)
    g_s(x) = x.split(s)

Then for all s and x, g_s(f_s(x)) == x.  If this condition is satisfied
then f_s and g_s are said to be each other's inverse (strictly, it makes
g_s a left inverse of f_s).  For this to make sense you first have to
define clearly the domain of both functions.  It seems that you
consider the following domains:

    Domain of g_s =  all strings

    Domain of f_s = all lists of strings, none of which contains s

Note that the domain of f_s is already quite complicated.  As you
point out, this can't work: since f_s([]) == f_s(['']) == '', g_s('')
can't be both [] and [''].  But if you change the domain of f_s to:

    Domain of f_s = all non-empty lists of strings, none of which contains s

Then f_s and g_s are indeed the inverse of each other.
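
To make this concrete, here is a minimal sketch in Python.  The names f
and g are just stand-ins for f_s and g_s above, with s passed as an
explicit argument; I am assuming s is a non-empty separator.

```python
def f(s, parts):
    """Stand-in for f_s: join a list of strings with separator s."""
    return s.join(parts)


def g(s, text):
    """Stand-in for g_s: split a string on separator s."""
    return text.split(s)


s = ","

# Non-empty lists whose elements do not contain s survive the round trip.
for parts in (["a", "b"], ["a"], [""], ["", ""]):
    assert g(s, f(s, parts)) == parts

# The empty list does not: join maps [] and [''] to the same string.
assert f(s, []) == f(s, [""]) == ""
assert g(s, f(s, [])) == [""]

# An element that contains s breaks the round trip as well.
assert g(s, f(s, ["a,b"])) == ["a", "b"]
```

The last two cases are exactly the ones the restricted domain rules out.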

Note also that in Ruby,

    [''].join(s).split(s) == ['']

evaluates to false.  So the problem is also present in Ruby.  Ruby
decided that ''.split(s) is [], whereas Python decided that
''.split(s) is [''].

The only solution would be to raise an exception when joining an empty
list, which I guess is not very desirable.
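
For illustration only, such a join could look roughly like the sketch
below.  strict_join is a hypothetical helper, not anything str provides,
and ValueError is just my assumption for the exception type.

```python
def strict_join(sep, parts):
    """Hypothetical join that refuses an empty list (not part of str)."""
    parts = list(parts)
    if not parts:
        raise ValueError("cannot join an empty list")
    return sep.join(parts)


# With [] excluded, split undoes strict_join, provided no element
# contains sep.
assert strict_join(",", [""]).split(",") == [""]
assert strict_join(",", ["a", "b"]).split(",") == ["a", "b"]

try:
    strict_join(",", [])
except ValueError:
    pass  # [] is rejected, so it no longer collides with ['']
```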

-- 
Arnaud


