unicode "em space" in regex

Xah Lee xah at xahlee.org
Sun Apr 17 09:24:26 EDT 2005


Thanks. Is it true that any Unicode character can also be used literally
inside a regex?

e.g.
re.search(ur' +',mystring,re.U)

I tested this case and apparently I can. But does that hold for every
Unicode character, including the more esoteric ones such as other
non-printing characters and combining forms?
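
A minimal sketch of the kind of check I have in mind, written in Python 3 syntax as an assumption (under Python 2.4 the strings would need the u prefix). The sample strings and variable names are made up for illustration. The special characters are spelled as escapes in the source so they stay visible, but the pattern string that reaches the re module contains the actual characters:

```python
import re

EM_SPACE = "\u2003"         # U+2003 EM SPACE
COMBINING_ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT

# The compiled pattern holds the em space character itself, followed by "+".
em_pattern = re.compile(EM_SPACE + "+")
sample = "a" + EM_SPACE * 2 + "b"   # hypothetical test string
em_match = em_pattern.search(sample)

# A combining character embedded in the pattern is matched as an ordinary
# code point following the "e".
comb_pattern = re.compile("e" + COMBINING_ACUTE)
comb_match = comb_pattern.search("cafe" + COMBINING_ACUTE)

# The precomposed form U+00E9 is a different code point sequence, so the
# same pattern does not find it; re does no Unicode normalization here.
precomposed_match = comb_pattern.search("caf\u00e9")
```

This only covers two characters, so it does not settle the general question.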

----
Related...:

The official python doc:
 http://python.org/doc/2.4.1/lib/module-re.html
says:

"Regular expression pattern strings may not contain null bytes, but can
specify the null byte using the \number notation."

What is meant by "null bytes" here? Unprintable characters? Is the
"\number" meant to be decimal? And in what encoding?
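
A small sketch for probing this, assuming "null byte" means the character with code 0 (NUL) and assuming Python 3 syntax. The sample text is hypothetical. The patterns are raw strings, so they contain a backslash followed by digits rather than an actual NUL, and the re module interprets the escape itself:

```python
import re

text = "a\x00b"  # hypothetical sample with a NUL character at index 1

# Backslash followed by 0: read by re as an octal escape for code 0.
octal_match = re.search(r"\0", text)

# Three octal digits are likewise read as a character code, not a group number.
octal3_match = re.search(r"\000", text)

# The hexadecimal escape names the same character.
hex_match = re.search(r"\x00", text)
```

If all three find the NUL at index 1, that would suggest the "\number" form is read as octal in this case rather than decimal.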

 Xah
 xah at xahlee.org
 http://xahlee.org/



