Least-lossy string.encode to us-ascii?

Ethan Furman ethan at stoneleaf.us
Thu Sep 13 18:29:40 EDT 2012


[sorry for the direct reply, Tim]

Tim Chase wrote:
> I've got a bunch of text in Portuguese and to transmit them, need to
> have them in us-ascii (7-bit).  I'd like to keep as much information
> as possible, just stripping accents, cedillas, tildes, etc.  So
> "serviço móvil" becomes "servico movil".  Is there anything stock
> that I've missed?  I can do mystring.encode('us-ascii', 'replace')
> but that doesn't keep as much information as I'd hope.

I haven't yet used it myself, but I've heard good things about
http://pypi.python.org/pypi/Unidecode/
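
As far as I know, its entry point is a single function,
unidecode.unidecode(), which takes a unicode string and returns an ASCII
approximation; check the package docs before relying on that.

If you'd rather stay in the stdlib, here is a minimal sketch of the usual
decompose-then-drop approach.  The helper name strip_accents is just
something I made up for illustration, and this assumes Python 3 (or
unicode strings on Python 2):

```python
import unicodedata


def strip_accents(text):
    """Return an ASCII-only approximation of text (hypothetical helper)."""
    # NFKD splits an accented character into its base letter followed by
    # combining marks, e.g. "ç" becomes "c" + U+0327 COMBINING CEDILLA.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" then drops the combining marks, along
    # with any other character that has no ASCII form.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_accents("serviço móvil"))  # prints: servico movil
```

The catch is that anything without a decomposition (ß or ø, for
example) is silently dropped rather than transliterated, which is the
sort of case a table-driven library like Unidecode is meant to cover.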

~Ethan~



