problems with regex in Japanese?
Martin von Loewis
loewis at informatik.hu-berlin.de
Sat Aug 11 06:03:16 EDT 2001
Joe Strout <joe at strout.net> writes:
> > python no longer uses pcre, the pcre based regexp module
> > was replaced by a new unicode-aware implementation called sre (written
> > by Fredrik Lundh). sre is much faster too...
>
> Wow, I didn't know that. Where can I find out more about sre?
In Python 2.x, the re module is really sre, not pcre. I recommend that
you do not match against UTF-8 byte strings directly; instead, convert
them to Unicode objects and pass those into your regular expressions.
Please read the re module documentation for details; take particular
notice of the UNICODE flag, which determines whether character
properties (for escapes such as \w, \d and \s) come from the Unicode
character database.
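To illustrate, here is a minimal sketch of that approach. The sample
bytes and the variable names are only assumptions chosen for the
example; they do not come from the original question. It decodes the
UTF-8 data first, then matches with the UNICODE flag set:

```python
import re

# Illustrative sample data: the UTF-8 encoding of a three-character
# Japanese word (U+65E5 U+672C U+8A9E).
utf8_bytes = b"\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"

# Matching on the raw bytes: without the UNICODE flag, \w only knows
# about ASCII letters, digits and underscore, so nothing matches here.
byte_match = re.compile(b"\\w+").match(utf8_bytes)

# Convert to a Unicode object before matching.
text = utf8_bytes.decode("utf-8")

# With re.UNICODE, \w takes its character properties from the Unicode
# character database, so the Japanese characters count as word characters.
word = re.compile(u"\\w+", re.UNICODE)
unicode_match = word.match(text)
matched_length = len(unicode_match.group(0))
```

On the raw bytes the pattern finds nothing, while the decoded text
matches as three characters. (In later Python versions where strings
are Unicode by default, the flag is already implied for string
patterns, so passing it explicitly should be harmless.)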
Regards,
Martin