Normalize a Polish L

Roberto Bonvallet rbonvall at gmail.com
Tue Oct 16 12:51:47 EDT 2007


On Oct 15, 6:57 pm, John Machin <sjmac... at lexicon.net> wrote:
> To "asciify" such text, you need to build a look-up table that suits
> your purpose. unicodedata.decomposition() is (accidentally) useful in
> providing *some* of the entries for such a table.

This is the only approach that can actually work, because every
language has different conventions on how to represent text without
diacritics.

For example, in Spanish, "ü" (u with umlaut) should be represented as
"u", but in German, it should be represented as "ue".

    pingüino -> pinguino
    Frühstück -> Fruehstueck
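
A minimal sketch of such a per-language look-up table in Python 3 might look like this. The names LANGUAGE_TABLES and asciify are hypothetical, the tables hold only a few illustrative entries, and the fallback for characters missing from a table (decompose, then drop anything that is not ASCII) is my assumption rather than an established convention:

```python
import unicodedata

# Hypothetical per-language overrides; only a few illustrative entries.
LANGUAGE_TABLES = {
    "es": {"ü": "u", "Ü": "U"},
    "de": {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"},
    # "ł" has no Unicode decomposition, so it needs an explicit entry.
    "pl": {"ł": "l", "Ł": "L"},
}

def asciify(text, lang):
    """Replace non-ASCII characters using the table for `lang`."""
    table = LANGUAGE_TABLES.get(lang, {})
    # Compose first so that "u" + combining diaeresis matches the "ü" key.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Assumed fallback: decompose and keep only the ASCII parts.
            decomposed = unicodedata.normalize("NFKD", ch)
            out.append("".join(c for c in decomposed if ord(c) < 128))
    return "".join(out)

print(asciify("pingüino", "es"))
print(asciify("Frühstück", "de"))
```

The same character maps to different output depending on the language code passed in, which is the whole point: the table, not the Unicode database, carries the convention.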

I wish web applications (e.g. blogs) took these conventions into
account when creating URLs from the title of an article.
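
As a rough illustration of what that could look like, here is a sketch of a URL slug function that takes the language's replacement table as an argument. The name slugify, its signature, and the choice of "-" as separator are all assumptions for the example, not taken from any particular blog engine:

```python
import re
import unicodedata

def slugify(title, replacements):
    """Build a URL slug from `title`; `replacements` is a language-specific
    dict mapping single characters to their ASCII spelling (hypothetical)."""
    # Compose first so precomposed keys such as "ü" match.
    text = unicodedata.normalize("NFC", title)
    text = "".join(replacements.get(ch, ch) for ch in text)
    # Assumed fallback for anything not in the table: strip diacritics.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return text.strip("-")

print(slugify("Frühstück in München", {"ü": "ue", "Ü": "Ue"}))
print(slugify("El pingüino", {}))
```

With a German table the first title should come out with "ue" for each "ü", while the Spanish title, given an empty table, falls through to plain diacritic stripping.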
--
Roberto Bonvallet



