Python and Cyrillic characters in regular expression

Fredrik Lundh fredrik at pythonware.com
Thu Sep 4 13:53:32 EDT 2008


phasma wrote:

> Hi, I'm trying to extract all alphabetic characters from a string.
> 
> reg = re.compile(r'(?u)([\w\s]+)', re.UNICODE)
> buf = reg.match(string)
> 
> But it doesn't work. If the string starts with a Cyrillic character,
> everything works fine. But if the string starts with a Latin character,
> match returns only the Latin characters.

can you provide a few sample strings that show this behaviour?
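
for example, a short script along these lines would make the behaviour easy to reproduce. this is only a sketch: it assumes Python 3 (the original post predates it), the sample strings and the helper name leading_run are made up for illustration, and the bytes case at the end is just one guess at a possible cause, not a diagnosis.

```python
import re

# Hypothetical sample strings; the original post did not include any.
samples = ["Привет world", "hello мир"]

# In Python 3, str patterns are Unicode-aware by default, so re.UNICODE
# and the inline (?u) flag are not needed here.
pattern = re.compile(r"[\w\s]+")


def leading_run(text):
    """Return the leading run of word/whitespace characters, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


text_results = [leading_run(s) for s in samples]

# Same test on UTF-8 encoded bytes: in a bytes pattern, \w only matches
# ASCII word characters, so the match stops at the first Cyrillic byte.
bytes_pattern = re.compile(rb"[\w\s]+")
bm = bytes_pattern.match("hello мир".encode("utf-8"))
bytes_result = bm.group(0) if bm else None

for s, r in zip(samples, text_results):
    print(repr(s), "->", repr(r))
print(repr(bytes_result))
```

note that \w also matches digits and the underscore, so this pattern is broader than "alphabetic characters" alone.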

</F>



