regular expressions.

Fri Aug 8 08:58:12 EDT 2008

Atul. wrote:

> The same file when I use with the following does not work.
> 
> import re
> vowel = r'[u"\u093e"u"\u093f"u"\u0940"u"\u0941"u"\u0942"u"\u0943"u"\u0944"u"\u0945"u"\u0946"u"\u0947"u"\u0948"u"\u0949"u"\u094a"u"\u094b"u"\u094c"]'
> print re.findall(vowel, u"\u092f\u093e\u0902\u091a\u094d\u092f\u093e",
> re.UNICODE)
> 
> 
> 
> atul@atul-desktop:~/Work/work/programs$ python fourth.py
> []
> atul@atul-desktop:~/Work/work/programs$
> 
> 
> is this the way to use Unicode in REs?

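No. Inside the raw string literal, each u"..." is just literal text in the
pattern, not a Unicode character, so the character class never contains the
vowel signs. Put the escapes directly in a single unicode string instead.
The regex becomes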

# untested
vowel = u'[\u093e\u093f\u0940\u0941\u0942\u0943\u0944\u0945\u0946\u0947\u0948\u0949\u094a\u094b\u094c]'
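
A minimal sketch for checking the corrected pattern against the sample word
from the question. The names `word` and `matches` are illustrative and do not
appear in the original post. The range form is an assumption of convenience:
it relies on the fifteen listed code points (U+093E through U+094C) being
contiguous, which they are.

```python
import re

# Character class covering U+093E..U+094C, the same fifteen code points
# listed one by one in the pattern above (written here as a range).
vowel = re.compile(u'[\u093e-\u094c]', re.UNICODE)

# Sample word from the question (hypothetical variable name).
word = u'\u092f\u093e\u0902\u091a\u094d\u092f\u093e'

# findall returns every vowel sign found, in order of appearance.
matches = vowel.findall(word)
print(len(matches))
```

The sample word contains U+093E twice, so two matches are expected. U+0902 and
U+094D fall outside the class and are skipped.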

Peter