convert unicode characters to visibly similar ascii characters
John Machin
sjmachin at lexicon.net
Tue Jul 1 20:29:52 EDT 2008
On Jul 2, 9:55 am, Jim <jim.heffe... at gmail.com> wrote:
> Peter Bulychev wrote:
> > I want to convert unicode character into ascii one.
>
> You have to make some arbitrary choices of what to translate. Based
> on some materials on effbot's site, and a recipe, I made
> ftp://alan.smcvt.edu/hefferon/unicode2ascii.py
> which has at least some of what you are looking for.
> $ grep HYPHEN unicode2ascii.py
> u'\N{SOFT HYPHEN}':u'-',
> u'\N{HYPHEN}':u'-',
> u'\N{NON-BREAKING HYPHEN}':u'-',
> u'\N{SOFT HYPHEN}': '-',
> No doubt I have some terrible gaffes and some things missing.
> Corrections appreciated.
Comments on the above grep output:
1. You have SOFT HYPHEN twice, mapping it once to u'-' and once to '-'.
2. The idea of a soft hyphen is as a hint to a hyphenator about where
to insert a hyphen if one is necessary and the hyphenator is suspected
of acting cluelessly without the hint. IMHO, asciification should
substitute u'', not u'-'.
3. Read PEP 8 and put a space after each colon: s/:/: /
Cheers,
John