[Python-ideas] More user-friendly version for string.translate()

Mikhail V mikhailwas at gmail.com
Mon Oct 24 17:31:11 EDT 2016


On 24 October 2016 at 22:54, Chris Barker <chris.barker at noaa.gov> wrote:
> On Mon, Oct 24, 2016 at 1:30 PM, Mikhail V <mikhailwas at gmail.com> wrote:
>>
>> But how would you, with the current translate function, drop all
>> characters that are not in the table?
>
>
> that is another question altogether, and one for a different list, actually.
>
> I don't know a way to do "remove every character except these", but
> I expect there is a way to do that efficiently with Python strings.
>
> you could probably (ab)use the codecs module, though.
>
> If there really is no way to do it, then you might have a feature worth
> pursuing, but be prepared with use-cases!
>
> The only use-case I've had for that sort of thing is when I want only ASCII
> -- but I can use the ascii codec for that :-)
>
>> This for example
>> is needed for filtering out all non-standard characters from paths, etc.
>
>
> You'd usually want to replace those with something, rather than remove them
> entirely, yes?

Just a couple of use cases that I have faced in practice:
1. Imagine I perform some admin tasks in a company with very different users
who also tend to name their files as they wish. So only God knows what can
be there in the filenames. And I know, for example, that there can be Cyrillic
besides ASCII there. So I just define a table like:
{
1072: 97,
1073: 98,
1074: 99,
...
[which transliterates Cyrillic into ASCII]
...
97: 97,
98: 98,
99: 99,
...
[those chars that are OK, leave them]
}
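
To make the "drop everything that is not in the table" part concrete, here
is a minimal sketch of one way it can be done with the current
str.translate(). It relies on the fact that translate() deletes a character
when the table returns None for it. The names DropMissing and clean_name are
made up for this example, and the table holds only the six entries spelled
out above.

```python
class DropMissing(dict):
    """Translation table that deletes every character it has no entry for."""

    def __missing__(self, key):
        # str.translate() removes a character when the table yields None.
        return None


# Only the six entries from the table above; a real table would be longer.
table = DropMissing({
    1072: 97,  # Cyrillic 'а' -> 'a'
    1073: 98,  # Cyrillic 'б' -> 'b'
    1074: 99,  # Cyrillic 'в' -> 'c'
    97: 97,    # 'a' stays
    98: 98,    # 'b' stays
    99: 99,    # 'c' stays
})


def clean_name(name):
    """Return name with mapped characters translated and all others dropped."""
    return name.translate(table)
```

With a plain dict, translate() would leave unmapped characters untouched,
so the __missing__ hook is what turns the table into a whitelist.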

Then I use os.walk() and os.rename() and voila! The file system
regains its virginity in one simple script.
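
Roughly, such a script could look like the sketch below. This is only an
illustration under some assumptions: rename_tree is a made-up helper, "clean"
is any function that maps an old file name to a new one, only files (not
directories) are renamed, and names that would become empty or collide with
an existing file are simply skipped.

```python
import os


def rename_tree(root, clean):
    """Rename each file under root to clean(name); return (old, new) path pairs."""
    renamed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            new_name = clean(name)
            # Skip names that would become empty or that need no change.
            if not new_name or new_name == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new_name)
            # Do not overwrite an existing file with the same cleaned name.
            if os.path.exists(dst):
                continue
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

It would be called with the directory to clean up and a function that
applies the translation table to one file name.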

2. Say I have a multi-lingual file or whatever, and I want to filter out
some unwanted characters; I can do it similarly.
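
For the opposite case, where the unwanted characters are known in advance,
the current API already covers it: the third argument of str.maketrans()
lists characters to delete. A small sketch, where the "unwanted" string and
the strip_unwanted name are just examples:

```python
# Characters to delete; this particular set is only an example.
unwanted = "«»…"

# The third argument of str.maketrans() maps each listed character to None.
drop_table = str.maketrans("", "", unwanted)


def strip_unwanted(text):
    """Return text with every character from 'unwanted' removed."""
    return text.translate(drop_table)
```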


Mikhail

