Identifying Unicode punctuation characters with Python regex
Shiao
multiseed at gmail.com
Fri Nov 14 05:23:08 EST 2008
Hello,
I'm trying to build a regex in Python to identify punctuation
characters in all languages. Some regex implementations support an
extended syntax, \p{P}, that does just that. As far as I know, Python's
re module doesn't. Any ideas for a possible alternative?
Apart from manually including the punctuation character range for each
and every language, I don't see how this can be done.
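One possible direction, offered only as a rough sketch: \p{P} corresponds to
the Unicode general categories that start with "P" (Pc, Pd, Ps, Pe, Pi, Pf,
Po), and the standard unicodedata module reports that category for any
character. Assuming its data is an acceptable stand-in for \p{P}, the
character class could be generated instead of typed by hand. The names
build_punctuation_pattern and PUNCT are made up for illustration, and the
code is written for Python 3.

```python
import re
import sys
import unicodedata


def build_punctuation_pattern():
    """Compile a character class of every code point in a 'P*' category."""
    # Assumption: general category "P*" from unicodedata approximates \p{P}.
    chars = [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)).startswith("P")
    ]
    # re.escape keeps characters such as ] - \ literal inside the class.
    return re.compile("[" + "".join(re.escape(c) for c in chars) + "]")


# Hypothetical module-level pattern, built once and reused.
PUNCT = build_punctuation_pattern()
```

With this, PUNCT.findall(text) would return the punctuation characters in
text. Symbols such as "$" or "+" fall under the "S" categories, so they are
not matched. The exact set depends on the Unicode version bundled with the
interpreter.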
Thanks in advance for any suggestions.
John