[Python-ideas] More user-friendly version for string.translate()

Chris Barker chris.barker at noaa.gov
Mon Oct 24 14:02:00 EDT 2016


On Mon, Oct 24, 2016 at 10:50 AM, Ryan Birmingham <rainventions at gmail.com>
wrote:

> I also believe that using a text file would not be the best solution;
> using a dictionary,
>

actually, now that you mention it -- .translate() already takes a dict, so
if you want to put your translation table in a text file, you can use a
dict literal to do it:

# contents of file:

{
32: 95,
105: 64,
115: 36,
}

then use it:

s.translate(ast.literal_eval(open("trans_table.txt").read()))

now all you need is a tiny little utility function:

import ast

def translate_from_file(s, filename):
    with open(filename) as f:
        return s.translate(ast.literal_eval(f.read()))
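
For illustration, here is roughly what that table does to a string. This is
only a sketch: it holds the dict literal in a string instead of reading
trans_table.txt, and the sample input "this is" is made up.

```python
import ast

# Same text as the file contents above, held in a string for the demo.
table_text = "{32: 95, 105: 64, 115: 36}"

# literal_eval only accepts Python literal syntax, unlike eval().
table = ast.literal_eval(table_text)

# 32 -> 95 is " " -> "_", 105 -> 64 is "i" -> "@", 115 -> 36 is "s" -> "$".
result = "this is".translate(table)
print(result)  # th@$_@$
```

The keys are ordinals, and the values may be ordinals, strings, or None
(None deletes the character).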


:-)

-Chris


>
>
>
> other data structure, or anonymous function would make more sense than
> having a specially formatted file.
>
> On Oct 24, 2016 13:45, "Chris Barker" <chris.barker at noaa.gov> wrote:
>
>> my thought on this:
>>
>> If you need translate() you probably can write the code to parse a text
>> file, and then you can use whatever format you want.
>>
>> This seems a very special case to build into the stdlib.
>>
>> -CHB
>>
>>
>>
>>
>> On Mon, Oct 24, 2016 at 10:39 AM, Mikhail V <mikhailwas at gmail.com> wrote:
>>
>>> Hello all,
>>>
>>> I would be happy to see a somewhat more general and user-friendly
>>> version of the string.translate function.
>>> It could work this way:
>>> string.newtranslate(file_with_table, Drop=True, Dec=True)
>>>
>>> So the parameters:
>>>
>>> 1. "file_with_table": a text file with a table in the following format:
>>>
>>> #[In]    [Out]
>>>
>>> 97    {65}
>>> 98    {66}
>>> 99    {67}
>>> 100    {}
>>> ...
>>> 110    {110}
>>>
>>>
>>> Notes:
>>> All values are decimal or hex (to switch between parsing formats, use
>>> the Dec parameter).
>>> As it turned out from my last discussion, the majority prefers hex notation,
>>> so I am not in the mainstream with my decimal notation here, but both
>>> should be supported.
>>> An empty [Out] value {} means that the character will be deleted.
>>>
>>> 2. "Drop = True": this sets the default behavior for those values
>>> which are NOT in the table.
>>>
>>> For Drop=True: all values not defined in the table are set to [Out] = {}
>>> and are deleted.
>>>
>>> For Drop=False: all values not defined in the table are set to [Out] = [In],
>>> so they remain as is.
>>>
>>> 3. "Dec = True": parsing format, decimal or hex. I use decimal everywhere.
>>>
>>> Further thoughts: for 8-bit strings this should be simple to implement,
>>> I think. For 16-bit strings, of course,
>>> there is the issue of memory usage for lookup tables, but the gurus could
>>> probably optimise it.
>>> E.g. at the parsing stage it is not necessary to build the lookup
>>> table for the whole 16-bit range;
>>> it only needs to cover values up to the largest ordinal present in the
>>> table file.
>>>
>>> About the format of the table file: I suppose many users would also want
>>> to define characters directly. I am not sure
>>> if it is really needed, but if so, additional brackets or an escape char
>>> could be used, like this for example:
>>>
>>> a    {A}
>>> \98    {\66}
>>> \99    {\67}
>>>
>>> but as I said, I don't like the idea very much, and it would be OK for me
>>> to use numeric values only.
>>>
>>> That is approximately how I see it.
>>> Feel free to share thoughts or criticise.
>>>
>>>
>>> Mikhail
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>
>>
>>
>> --
>>
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R            (206) 526-6959   voice
>> 7600 Sand Point Way NE   (206) 526-6329   fax
>> Seattle, WA  98115       (206) 526-6317   main reception
>>
>> Chris.Barker at noaa.gov
>>
>>
>
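
The table format and the Drop/Dec switches from the quoted proposal can be
sketched on top of the existing dict-based .translate(). This is only a rough
illustration, not a proposed stdlib API: parse_table and DropMissing are
made-up names, newtranslate here is a plain function that takes the table text
rather than a file name, and the parser assumes every data line looks like
"97    {65}".

```python
def parse_table(text, dec=True):
    """Parse lines like '97    {65}' into a dict usable by str.translate()."""
    base = 10 if dec else 16
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '#[In] [Out]' header
        src, dst = line.split(None, 1)
        dst = dst.strip().strip("{}").strip()
        # An empty {} becomes None, which str.translate() treats as "delete".
        table[int(src, base)] = int(dst, base) if dst else None
    return table


class DropMissing(dict):
    """Dict that maps every unlisted ordinal to None (the Drop=True case)."""

    def __missing__(self, key):
        return None


def newtranslate(s, table_text, drop=True, dec=True):
    # Hypothetical stand-in for the proposed method; not an existing function.
    table = parse_table(table_text, dec)
    if drop:
        table = DropMissing(table)
    return s.translate(table)


sample = """\
#[In]    [Out]
97    {65}
98    {66}
100    {}
"""

print(newtranslate("abcd", sample, drop=True))   # AB
print(newtranslate("abcd", sample, drop=False))  # ABc
```

With drop=False the plain dict raises KeyError for unlisted ordinals, and
.translate() leaves those characters untouched, so no full lookup table over
the whole code range is needed.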

