Small inconsistency between string.split and "".split

Alex Martelli aleaxit at yahoo.com
Fri Sep 17 08:01:48 EDT 2004


Carlos Ribeiro <carribeiro at gmail.com> wrote:

> Walter,
> 
> On Tue, 14 Sep 2004 12:01:29 +0200, Walter Dörwald
> <walter at livinglogic.de> wrote:
> > Carlos Ribeiro wrote:
> > I've fixed the docstring for both unicode.split() and
> > string.split() to give a hint about the None default. Note
> > that the docstring for str.split() already *did* mention
> > the None option.
> 
> I don't know if you can do it, but isn't it easy to modify the split
> method to accept maxsplit as a keyword parameter? It would make it

Feasible, not hard, not trivial.  The problem is different...:

kallisti:~/downloads/Python-2.4a3 alex$ find . -name '*.c' | xargs cat |
grep -c 'METH_KEYWORDS'
92
kallisti:~/downloads/Python-2.4a3 alex$ find . -name '*.c' | xargs cat |
grep -c 'METH_VARARGS'
1272
kallisti:~/downloads/Python-2.4a3 alex$ find . -name '*.c' | xargs cat |
grep -c 'METH_'
2429

In other words: throughout the current C sources for Python (across all
platforms etc) there are about 2429 specifications of how various
functions (methods, of course, included) take their parameters.  Of
these, about half are METH_VARARGS (400 are METH_NOARGS, i.e. functions
and methods accepting no explicit arguments, and 739 are METH_O,
accepting just one), and less than 4% accept keyword-style arguments.
Many of those are pretty recent additions, too, and some play special
roles which you just couldn't fulfil otherwise (e.g. consider the
optional key= vs cmp= arguments that 2.4 accepts for the list.sort
method -- they are mutually exclusive...).
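
For readers without a Unix shell, the counts above can be approximated
with a short Python sketch.  The helper name count_matching_lines and its
parameters are made up for illustration; it mimics `grep -c` by counting
matching lines, so a line that combines two flags counts once per
pattern, and the numbers you get depend on the source tree you point it
at.

```python
import os


def count_matching_lines(root, needle, suffix=".c"):
    """Count lines containing `needle` in files under `root` whose
    names end in `suffix` (roughly: find | xargs cat | grep -c)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Older C sources are not always valid UTF-8; Latin-1 maps
            # every byte to a character, so decoding cannot fail.
            with open(path, encoding="latin-1") as handle:
                total += sum(1 for line in handle if needle in line)
    return total
```

Calling count_matching_lines("Python-2.4a3", "METH_KEYWORDS") on an
unpacked source tree should give a figure comparable to the grep output,
assuming the directory layout matches the one used in the post.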

Having ALL C-coded functions and methods that accept any argument accept
keyword-style arguments in particular would surely lead to a more
consistent language, once the impact of thousands of modifications to
the source stabilizes again -- a slightly bigger and slower interpreter,
no doubt, but probably only slightly.  But these thousands of changes
will require very substantial and disruptive editing -- substantial
manpower to perform them all, AND ensure they're all well tested (I
suspect the set of unit tests would have to more than double to do a
halfway decent job).  It would have to be among the major targets of a
given Python release, I suspect, and raising enthusiasm for such a job
might not be easy, even though Python would be a better language in
consequence.  Maybe it will be feasible as part of the 3.0 release,
which is slated to be incompatible anyway... remove the METH_VARARGS
altogether, breaking compatibility with all existing extensions, so
EVERY C-coded function in the future, if it takes any argument at all,
will HAVE to take them in keyword form, too.

Until it's feasible to perform such a sweeping change, justifying
changes to ONE specific method of an object which has dozens is going to
be pretty hard.  Perhaps, if someone volunteered a patch to make ALL
methods of string and unicode objects specifically accept arguments in
keyword form as well as positionally, with all the needed tests & docs,
in time for Python 2.4's first beta in a couple of weeks, it might be
accepted (if separate but similar patches also existed for methods of
other built-in types, that would help all of their acceptance chances,
IMHO).  But a patch to change ONE method out of dozens, I suspect, would
be shot down -- the slight, useful extra functionality might be judged
to not be worth the increase in inconsistency in this area (which IMHO
must, sadly, count as a wart in today's Python, sigh).


Alex


> consistent with string.split(), and as far as I'm aware, it should not
> cause any sizeable performance penalty. But the most important reason
> is that keyword parameters for often-unused options make code more
> readable; for example,
> 
>     mystring.split(maxsplit=2)
> 
> reads better than:
> 
>     mystring.split(None, 2)
> 
> That's my opinion, anyway...
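
The readable call form quoted above can be had without touching the C
sources, at the cost of one extra function call.  The following is only
a sketch: split_kw is a hypothetical helper name, not part of Python.  It
forwards its arguments positionally, so it does not depend on whether
the built-in method itself accepts keywords.

```python
def split_kw(s, sep=None, maxsplit=-1):
    """Split `s` like s.split(sep, maxsplit), but let callers pass
    sep and maxsplit by keyword."""
    # Positional forwarding: the built-in method never sees a keyword.
    # sep=None means "split on runs of whitespace"; maxsplit=-1 means
    # "no limit on the number of splits".
    return s.split(sep, maxsplit)


parts = split_kw("a b c d", maxsplit=2)
print(parts)  # ['a', 'b', 'c d']
```

This is a workaround for one method only, so it does nothing about the
wider inconsistency discussed in the reply.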


