Convert on uppercase unaccentent unicode character

John Machin sjmachin at lexicon.net
Wed Oct 3 20:23:52 EDT 2007


On Oct 4, 7:06 am, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
> Steve Holden <st... at holdenweb.com> wrote:
> >> No, that will uppercase the string, but it doesn't (and shouldn't)
> >> strip the accents:
>
> > I can agree that it doesn't (though I am taking your word for it), but
> > a French person will definitely feel it's doing the wrong thing. Upper
> > case letters aren't accented in written French.
>
> I didn't know that, and I'm not sure I believe it: but then the French
> tend to have conventions honoured more in the breach than the observance. I
> just hit a few French websites, and the first one that I found which had
> any capital letters that might be accented had four accented capital
> letters on its front page (two capitalized words and two words in block
> capitals).

The usual rationale for such treatment of accented characters is for
fuzzy matching:
    if upshiftedunaccented(text1) == upshiftedunaccented(text2):
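
For illustration only, here is one way such a helper might be written. The post names upshiftedunaccented but gives no implementation, so the body below is an assumption: a minimal sketch in Python 3 that uses the standard unicodedata module to decompose each character, drop the combining marks, and then uppercase what is left.

```python
import unicodedata


def upshiftedunaccented(text):
    """Return text uppercased with combining accents removed.

    Hypothetical implementation; the original post only names the function.
    """
    # NFD splits precomposed letters into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for base characters and non-zero for marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


# Example data (assumed, not from the thread): two spellings of one word.
text1 = "\u00e9l\u00e8ve"  # "eleve" written with e-acute and e-grave
text2 = "ELEVE"
if upshiftedunaccented(text1) == upshiftedunaccented(text2):
    print("fuzzy match")
```

This only handles accents that Unicode represents as combining marks after decomposition. Letters with no canonical decomposition, such as "ø", keep their shape and are only uppercased, so a real fuzzy matcher would probably need an extra mapping table for those.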




