[Python-ideas] str.split with multiple individual split characters

MRAB python at mrabarnett.plus.com
Mon Feb 28 01:36:20 CET 2011


On 28/02/2011 00:14, Andy Buckley wrote:
> Here's another str.split() suggestion, this time an extension (Pythonic,
> I think) rather than a change of semantics.
>
> There are cases where, especially in handling user input, I'd like to be
> able to treat any of a series of possible delimiters as acceptable.
> Let's say that I want commas, underscores, and hyphens to all be treated
> as delimiters (as I did in some code I was writing today). I guessed,
> based on some other Python std lib behaviours, that this might work:
>
> usertokens = userstr.split([",", "_", "-"])
>
> It doesn't work though, since the sep argument *has* to be a string. I
> think it would be nice for an extension like this to be supported,
> although I would guess a 90% probability of there being an insightful
> reason for why it's not such a great idea after all* ;-)
>
> Unlike many extensions, I don't think that the general solution to this
> is *very* quick and idiomatic in current Python. As for a compelling
> use-case... well, I'm very sympathetic to not adding functions for which
> there is no demand (I forget the relevant acronym) but this is a case
> where I suddenly found that I did have that problem to solve and that
> Python didn't have the nice built-in answer that I semi-expected it to.
> Extension of single arguments to iterables of them is quite a common
> Python design feature: one of those things where you think "ooh, this
> really is a nice, consistent, powerful language" when you find it. So I
> hope that this suggestion finds some favour.
>
> Best wishes,
> Andy
>
> [*] Such as "how do you distinguish between a string, which is iterable
> over its characters, and a list/tuple/blah of individual strings?" Well,
> that doesn't strike me as too big a technical issue, but maybe it is.

There are a number of additions which could be useful, such as
splitting on multiple separators (compare with str.startswith and
str.endswith, which already accept a tuple of strings) and stripping
leading and/or trailing /strings/ rather than sets of characters
(perhaps str.stripstr, str.lstripstr and str.rstripstr), but it does
come down to use cases.
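
For what it's worth, here is a rough sketch of how the multiple-separator
case can be handled today with the re module. The helper name split_any
is made up for illustration and is not a proposed API; the sample string
is likewise invented.

```python
import re

def split_any(text, separators):
    """Split text on any of the given separator strings.

    Hypothetical helper for illustration only, not part of str.
    """
    if not separators or "" in separators:
        raise ValueError("need at least one non-empty separator")
    # Try longer separators first so that "--" is preferred over "-".
    ordered = sorted(separators, key=len, reverse=True)
    # re.escape makes characters such as "-" or "." match literally.
    pattern = "|".join(re.escape(sep) for sep in ordered)
    return re.split(pattern, text)

userstr = "alpha,beta_gamma-delta"
usertokens = split_any(userstr, [",", "_", "-"])
print(usertokens)  # ['alpha', 'beta', 'gamma', 'delta']

# For comparison, str.startswith and str.endswith already take a tuple.
print("report.csv".endswith((".csv", ".tsv")))  # True
```

Like str.split with an explicit separator, this keeps empty strings
between adjacent separators, so "a,,b" gives ['a', '', 'b'].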

As has been pointed out previously, it's easy to keep adding stuff, but
once something is added we'll be stuck with it (virtually) forever, so
we need to be careful.

The relevant acronym, by the way, is "YAGNI" ("You Aren't Going to Need
It" or "You Ain't Gonna Need It").


