hard_decoding
Peter Maas
peter at somewhere.com
Thu Feb 10 08:04:31 EST 2005
Tamas Hegedus wrote:
> Do you have a convenient, easy way to remove special characters from
> u'strings'?
>
> Replacing:
> ÀÁÂÃÄÅ => A
> èéêë => e
> etc.
> 'L0xe1szl0xf3' => Laszlo
> or something like that:
> 'L\xc3\xa1szl\xc3\xb3' => Laszlo
>>> ord(u'ë')
235
>>> ord(u'e')
101
>>> cmap = {235:101}
>>> u'hello'.translate(cmap)
u'hello'
>>> u'hëllo'.translate(cmap)
u'hello'
The inconvenient part is to generate cmap. I suggest you write a
helper class genmap for this:
>>> g = genmap()
>>> g.add(u'ÀÁÂÃÄÅ', u'A')
>>> g.add(u'àáâãäå', u'a')
>>> g.add(u'èéêë', u'e')
>>> g.add(u'òóôõö', u'o')
>>> u'László'.translate(g.cmap())
u'Laszlo'
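One way genmap might look is sketched below. Only the names genmap, add and cmap come from the session above; the internals (a plain dict keyed by code point, and the _map attribute) are my assumption, not a tested design. The sketch is written for Python 3, where every str is Unicode and the u prefix is optional.

```python
class genmap(object):
    """Collects character replacements for use with str.translate().

    Hypothetical helper: the internals are an assumption, only the
    names genmap/add/cmap come from the post.
    """

    def __init__(self):
        # Maps source code point (int) -> replacement code point (int).
        self._map = {}

    def add(self, chars, replacement):
        """Map every character in chars to the single character replacement."""
        target = ord(replacement)
        for ch in chars:
            self._map[ord(ch)] = target

    def cmap(self):
        """Return a copy of the mapping, suitable for translate()."""
        return dict(self._map)


g = genmap()
g.add(u'ÀÁÂÃÄÅ', u'A')
g.add(u'àáâãäå', u'a')
g.add(u'èéêë', u'e')
g.add(u'òóôõö', u'o')

result = u'László'.translate(g.cmap())
print(result)
```

Characters that were never passed to add() are left unchanged by translate(), so the table has to list every accented letter you expect to meet.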
--
-------------------------------------------------------------------
Peter Maas, M+R Infosysteme, D-52070 Aachen, Tel +49-241-93878-0
E-mail 'cGV0ZXIubWFhc0BtcGx1c3IuZGU=\n'.decode('base64')
-------------------------------------------------------------------