Replace accented chars with unaccented ones

Michael Hudson mwh at python.net
Tue Mar 16 06:15:30 EST 2004


Jeff Epler <jepler at unpythonic.net> writes:

> You have two options.  First, convert the string to Unicode and use code
> like the following:
> 
>     replacements = [(u'\xe9', 'e'), ...]
>     def remove_accents(u):
>         for a, b in replacements:
>             u = u.replace(a, b)
>         return u
> 

There must be some more high-powered way of doing this... something
like:

import unicodedata

def remove_accent1(c):
    # NFD puts the base character first, followed by any combining marks.
    return unicodedata.normalize('NFD', c)[0]

def remove_accents(s):
    return u''.join(map(remove_accent1, s))

?
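
A possible variant on the same idea, as a rough sketch only: normalize
the whole string once and then drop the combining marks, rather than
normalizing one character at a time.  The name strip_accents is made up
for illustration; unicodedata.normalize and unicodedata.combining are the
only library calls used.

```python
import unicodedata

def strip_accents(s):
    # Hypothetical helper name, not from the original post.
    # Decompose the whole string once (NFD), so each accented character
    # becomes a base character followed by its combining marks.
    decomposed = unicodedata.normalize('NFD', s)
    # unicodedata.combining() returns a nonzero class for combining marks;
    # keep only the characters whose combining class is 0.
    return u''.join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_accents(u'caf\xe9'))
```

Either way, characters that have no canonical decomposition (u'\xf8',
for instance) come through unchanged, so a replacement table like the one
quoted above may still be needed for those.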

Cheers,
mwh

-- 
  We've had a lot of problems going from glibc 2.0 to glibc 2.1.
  People claim binary compatibility.  Except for functions they
  don't like.                       -- Peter Van Eynde, comp.lang.lisp


