[Python-ideas] More user-friendly version for string.translate()

Mikhail V mikhailwas at gmail.com
Tue Oct 25 18:32:42 EDT 2016


On 25 October 2016 at 19:10, Stephen J. Turnbull
<turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:

 > So my previous thought on it was, that there could be set of such functions:
 >
 > str.translate_keep(table) - this is current translate, namely keeps
 > non-defined chars untouched
 > str.translate_drop(table) - all the same, but dropping non-defined chars
 >
 > Probably also a pair of functions without translation:
 > str.remove(chars) - removes given chars
 > str.keep(chars) - removes all, except chars
 >
 > Motivation is that those can be optimised for speed and I suppose those
 > can work faster than re.sub().
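
For reference, the four proposed functions can be approximated today with
plain functions. This is only a sketch: the names come from the proposal
above and are not existing str methods, and the assumption here is that
"table" is a dict mapping single characters to replacement strings.

```python
# Hypothetical stand-ins for the proposed methods; none of these exist on str.

def translate_keep(text, table):
    """Translate characters found in table, keep all others untouched."""
    return text.translate(str.maketrans(table))

def translate_drop(text, table):
    """Translate characters found in table, drop all others."""
    return "".join(table[ch] for ch in text if ch in table)

def remove(text, chars):
    """Remove every character that occurs in chars."""
    # The third argument of str.maketrans lists characters to delete.
    return text.translate(str.maketrans("", "", chars))

def keep(text, chars):
    """Remove every character that does not occur in chars."""
    allowed = set(chars)
    return "".join(ch for ch in text if ch in allowed)
```

For example, remove("hello world", "lo") gives "he wrd" and
keep("hello world", "lo") gives "llool". Whether dedicated methods would
beat these or re.sub() in speed is not measured here.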

>That said, multiple methods is a valid option for the API.  Eg, Guido
>generally prefers that distinctions that can't be made on type of
>arguments (such as translate_keep vs translate_drop) be done by giving
>different names rather than a flag argument.  Do you *like* this API,
>or was this motivated primarily by the possibilities you see for
>optimization?

Certainly I like the look of distinct functions more.
It lets me parse the code visually and effectively,
so e.g. for str.remove() I would not need to look
in the docs to understand what the function does.
It has its downside of course: new names can
accidentally be similar to existing ones, and the more
names there are, the higher the probability that no good names are left.
Speed is not so important in the majority of cases, at least
for my current tasks. However, if I need to process very large
texts (it seems I will), speed will become more important.

>The width is constant for any given string.  However, I don't see at
>this point that you'll need more than the functions available in
>Python already, plus one or more wrappers to marshal the information
>your API accepts to the data that str.translate wants.
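
A wrapper of the kind described above might look like the sketch below.
The names DropMissing and make_drop_table are made up for illustration.
It relies on str.translate deleting a character when the table lookup
returns None, and on dict subclasses being able to define __missing__.

```python
class DropMissing(dict):
    """Translation table that maps every undefined code point to None."""

    def __missing__(self, key):
        # str.translate deletes characters whose table entry is None.
        return None

def make_drop_table(mapping):
    """Marshal a {char: replacement} dict into a table for str.translate."""
    # str.maketrans converts the single-character keys to ordinals.
    return DropMissing(str.maketrans(mapping))

table = make_drop_table({"l": "L"})
result = "hello".translate(table)  # only the mapped characters survive
```

With this table, str.translate itself does the dropping, so no new
method is needed for the "drop non-defined chars" behaviour.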

It is just that in some cases I need to convert strings to numpy arrays
and back, so this unicode vanity worries me a bit. But I cannot clearly
explain why exactly I need this.

 > >> but as said I don't like very much the idea and would be OK for me to
 > >> use numeric values only.
 > Yeah, I am strange. This, however, gives you a guarantee that in any
 > environment you can see and input them and save the work in ASCII.

>This is not going to be a problem if you're running Python and can
>enter the program and digits.  In any case, the API is going to have
>to be convenient for all the people who expect that they will never
>again be reduced to a hex keypad and 7-segment display

Here I will dare to make a lyrical digression again.
I may have given the impression that I am stuck in the nineties or
something. But that is not the case. In the nineties
I used the PC mostly to play Duke Nukem (yeah, great times!).
And back then I had no idea at all what efficiency
of information representation and readability meant.
Now I kind of realize it.
So I am just not one of those who believe in these
maximalist "we need over 9000 glyphs" talks.
And here is a somewhat prophetic view on this:
with the coming of the cyber era this will all be flushed away
so fast that all this diligence around unicode
could actually look funny. And a hex keypad
will not sound "retro" but "brand new".

In other words: I feel really strongly that nothing
besides standard characters should appear in source code.
If one wants to process unicode, then load it
as a resource.
So please, at least out of respect for the rationally
minded, don't make code look like a Christmas tree.
BTW, I actually use VIM to code, so I will not
see them in my code anyway.


Mikhail

