Upper/lowercase regex matching in unicode

Jason Stitt jason at pengale.com
Wed Oct 19 18:23:16 EDT 2005


What's the best way to match uppercase or lowercase characters with a
regular expression in a Unicode-aware way? Obviously [A-Z] and [a-z]
aren't going to cut it. I thought there were character classes of the
form [[:upper:]] or similar syntax, but I can't find them in the docs.
Maybe I'm getting it mixed up with Perl regexen.

The upper() and lower() methods do work on accented characters in a
Unicode string, so there has to be some recognition of Unicode case
in there somewhere.
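
To make the question concrete, here is a rough sketch of one possible
workaround, not a claim about what the re module offers natively. It
assumes Python 3 (where str is Unicode) and uses the standard
unicodedata module to build character classes from the general
categories "Lu" (uppercase letter) and "Ll" (lowercase letter). The
names build_category_class, UPPER and LOWER are made up for this
illustration.

```python
import re
import sys
import unicodedata


def build_category_class(category):
    """Build a regex character class listing every code point whose
    Unicode general category equals `category` (e.g. "Lu" or "Ll")."""
    chars = [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == category
    ]
    # re.escape keeps the class safe if a category ever contains
    # characters that are special inside [...].
    return "[" + "".join(re.escape(c) for c in chars) + "]"


# Scanning the whole code point range is slow-ish, so do it once.
UPPER = re.compile(build_category_class("Lu"))
LOWER = re.compile(build_category_class("Ll"))

# str.upper() already knows about accented characters:
print("\u00e9".upper())                      # prints É (U+00C9)

# The generated classes match accented letters as well:
print(UPPER.findall("\u00c9a \u00d1b z"))    # ['É', 'Ñ']
print(LOWER.findall("\u00c9a \u00d1b z"))    # ['a', 'b', 'z']
```

The classes are large, since they enumerate every matching code point,
but they only have to be built once. If a third-party dependency is
acceptable, the separate "regex" package on PyPI accepts property
escapes such as \p{Lu} directly, which avoids building the class by
hand.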

Thanks,

Jason


