[Python-ideas] Easily remove characters from a string.
Steven D'Aprano
steve at pearwood.info
Sun Oct 23 11:37:34 EDT 2016
On Sat, Oct 22, 2016 at 03:34:23PM +0700, Simon Mark Holland wrote:
> Having researched this as heavily as I am capable with limited experience,
> I would like to suggest a Python 3 equivalent to string.translate() that
> doesn't require a table as input. Maybe in the form of str.stripall() or
> str.replaceall().
stripall() would not be appropriate: "strip" refers to removing from the
front and end of the string, not the middle, and str.strip() already
implements a "strip all" functionality:
py> '+--+*abcd+-*xyz-*+-'.strip('*+-')
'abcd+-*xyz'
But instead of a new method, why not fix translate() to be more
user-friendly? Currently, it takes two method calls to delete characters
using translate:
    table = str.maketrans('', '', '*+-.!?')
    newstring = mystring.translate(table)
That's appropriate when you have a big translation table which you are
intending to use many times, but it's a bit clunky for single, one-off
uses.
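To make the trade-off concrete, here is a rough sketch of both styles using
only the existing str.maketrans() and str.translate(). The names DELETE_PUNCT
and remove_punct are just ones I made up for illustration.

```python
# Reusable style: build the deletion table once, then apply it many times.
# The third argument to maketrans lists the characters to delete.
DELETE_PUNCT = str.maketrans('', '', '*+-.!?')

def remove_punct(text):
    """Return text with every character in '*+-.!?' deleted."""
    return text.translate(DELETE_PUNCT)

result = remove_punct('+--+*abcd+-*xyz-*+-')

# One-off style: the same two calls squeezed into a single expression.
# This is the "clunky" case: a table is built, used once and thrown away.
one_off = 'Hello, world!'.translate(str.maketrans('', '', ',!'))
```

Unlike strip(), this removes the characters from the middle of the string as
well as the ends.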
Maybe we could change the API of translate to something like this:
    def translate(self, *args):
        if len(args) == 1:
            # Same as the existing behaviour.
            table = args[0]
        elif len(args) == 3:
            table = type(self).maketrans(*args)
        else:
            raise TypeError('too many or not enough arguments')
        ...
Then we could write:
    newstring = mystring.translate('', '', '1234567890')
to delete the digits.
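If anyone wants to play with the idea today, something along these lines
should work as a str subclass. This is only a sketch: FlexStr is a
hypothetical name, nothing like it exists in the standard library, and the
real str.translate() accepts exactly one argument.

```python
class FlexStr(str):
    """Hypothetical str subclass sketching the proposed translate() API."""

    def translate(self, *args):
        if len(args) == 1:
            # Same as the existing behaviour: the caller supplies a table.
            table = args[0]
        elif len(args) == 3:
            # Build the table on the fly from (from, to, delete) strings.
            table = type(self).maketrans(*args)
        else:
            raise TypeError(
                'translate() takes 1 or 3 arguments (%d given)' % len(args))
        # Delegate to the built-in implementation; the result is a plain str.
        return str.translate(self, table)


# Delete the digits in a single call.
digits_removed = FlexStr('abc123def456').translate('', '', '1234567890')

# The one-argument form still works as before.
mapped = FlexStr('abc').translate({ord('a'): 'x'})
```

Note that this sketch follows the outline above and rejects the two-argument
form, even though str.maketrans() itself accepts two arguments.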
So we could fix this... but should we? Is this *actually* a problem that
needs fixing, or are we just adding unnecessary complexity?
> My reasoning is that while it is currently possible to easily strip()
> preceding and trailing characters, and even replace() individual characters
> from a string,
Stripping from the front and back is a very common operation; in my
experience, replacing is probably half as common, maybe even less. And
deleting is rarer still.
> My proposal is that if strip() and replace() are important enough to
> receive modules, then the arguably more common operation (in terms of
> programming tutorials, if not mainstream development) of just removing all
> instances of specified numbers, punctuation, or even letters etc from a
> list of characters should also.
I think the reason that deleting characters is common in tutorials is
that it is a simple, easy, obvious task that can be programmed by a
beginner in just a few lines. I don't think it is actually something
that people need to do very often, outside of exercises.
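For what it's worth, this is roughly the few-line version I would expect a
beginner to write. The function name remove_chars is made up for the example.

```python
def remove_chars(text, unwanted):
    """Return text with every character found in unwanted removed."""
    # Keep each character only if it is not in the unwanted set.
    return ''.join(ch for ch in text if ch not in unwanted)


no_digits = remove_chars('Hello, World! 123', '0123456789')
```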
--
Steve