python regex character group matches
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Wed Sep 17 10:55:35 EDT 2008
On Wed, 17 Sep 2008 15:56:31 +0200, Fredrik Lundh wrote:
> Assuming that you want to find runs of \uXXXX escapes, simply use
> non-capturing parentheses:
>
> pat = re.compile(u"(?:\\\u[0-9A-F]{4})")
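That doesn't work for me. In a unicode literal, the \u that follows the
escaped backslash is itself parsed as the start of a \uXXXX escape, so the
literal fails before re.compile is ever called: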
>>> pat = re.compile(u"(?:\\\u[0-9A-F]{4})")
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position
5-7: truncated \uXXXX escape
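One way to avoid the truncated escape, sketched here on the assumption that
the pattern should stay a non-raw unicode literal, is to double the
backslashes so that no bare \u is left in the source text. The names
pattern_text and found are only illustrative.

```python
import re

# Four backslashes in an ordinary (non-raw) literal become two in the
# pattern text, which the regex engine reads as one literal backslash.
pattern_text = u"(?:\\\\u[0-9A-F]{4})"
pat = re.compile(pattern_text)

# The subject contains a literal backslash followed by "u1234".
found = pat.search(u"abcd\\u1234efg").group(0)
print(found)
```

This matches a single escape. Adding + after the group would extend it to a
run of consecutive escapes.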
Assuming that the OP is searching byte strings, I came up with this:
>>> pat = re.compile('(\\\u[0-9A-F]{4})+')
>>> pat.search('abcd\\u1234\\uAA99\\u0BC4efg').group(0)
'\\u1234\\uAA99\\u0BC4'
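As a rough comparison of the two suggestions, the sketch below puts the
capturing and the non-capturing group side by side, using raw strings so the
backslashes need no extra doubling. The variable names are made up for the
example. The difference shows up in group(1) and findall, not in group(0).

```python
import re

text = 'abcd\\u1234\\uAA99\\u0BC4efg'

# Raw strings: the regex engine sees \\u, a literal backslash then "u".
capturing = re.compile(r'(\\u[0-9A-F]{4})+')
non_capturing = re.compile(r'(?:\\u[0-9A-F]{4})+')

m = capturing.search(text)
whole_run = m.group(0)      # the entire run of escapes
last_escape = m.group(1)    # a repeated group keeps only its last repetition

# With no capturing group, findall returns the whole matches.
runs = non_capturing.findall(text)
print(whole_run, last_escape, runs)
```

If only the full run is wanted, either pattern works with group(0). The
non-capturing form is the safer choice with findall, which otherwise returns
the contents of the group instead of the whole match.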
--
Steven