How to escape # hash character in regex match strings

David Shapiro David.Shapiro at sas.com
Wed Jun 10 12:42:44 EDT 2009


Maybe using a Unicode equivalent of # would do the trick.
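
For what it's worth, a quick sketch suggests the # needs no special treatment in an ordinary pattern; as far as I know it only starts a comment when the re.VERBOSE flag is set, and there a backslash or a character class keeps it literal. The variable names below are mine, purely for illustration:

```python
import re

text = "123#abc456"

# In a normal pattern, '#' is just a literal character.
plain = re.search(r"\d+#[a-z]+", text).group()

# The hex escape \x23 is another way to spell the same character.
hex_escaped = re.search(r"\d+\x23[a-z]+", text).group()

# With re.VERBOSE, whitespace is ignored and a bare '#' starts a comment,
# so the hash has to be written as \# (or [#]) to be matched literally.
verbose = re.search(r"\d+ \# [a-z]+  # trailing comment", text, re.VERBOSE).group()

print(plain, hex_escaped, verbose)
```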

-----Original Message-----
From: python-list-bounces+david.shapiro=sas.com at python.org [mailto:python-list-bounces+david.shapiro=sas.com at python.org] On Behalf Of Peter Otten
Sent: Wednesday, June 10, 2009 11:32 AM
To: python-list at python.org
Subject: Re: How to escape # hash character in regex match strings

504crank at gmail.com wrote:

> I've encountered a problem with my RegEx learning curve -- how to
> escape hash characters # in strings being matched, e.g.:
> 
>>>> string = re.escape('123#abc456')
>>>> match = re.match('\d+', string)
>>>> print match
> 
> <_sre.SRE_Match object at 0x00A6A800>
>>>> print match.group()
> 
> 123
> 
> The correct result should be:
> 
> 123456

>>> "".join(re.findall("\d+", "123#abc456"))
'123456'
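
To spell out why the original attempt stops at 123: re.match() only matches at the start of the string, and \d+ ends at the first non-digit, so the hash is not the culprit. A minimal sketch, with variable names chosen only for illustration; the re.sub() line is an alternative I am adding, not something from the original question:

```python
import re

text = "123#abc456"

# re.match anchors at the start of the string, so it only sees
# the first run of digits and stops at '#'.
first_run = re.match(r"\d+", text).group()

# re.findall returns every non-overlapping run of digits.
all_runs = re.findall(r"\d+", text)
joined = "".join(all_runs)

# Alternative: delete every non-digit character instead.
stripped = re.sub(r"\D", "", text)

print(first_run, all_runs, joined, stripped)
```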

> I've tried to escape the hash symbol in the match string without
> result.
> 
> Any ideas? Is the answer something I overlooked in my lurching Python
> schooling?

re.escape() is used to build the regex from a string that may contain 
characters that have a special meaning in regular expressions but that you 
want to treat as literals. You can for example search for r"C:\dir" with 

>>> re.compile(re.escape(r"C:\dir")).findall(r"C:\dir C:7ir")
['C:\\dir']

Without escaping you'd get

>>> re.compile(r"C:\dir").findall(r"C:\dir C:7ir")
['C:7ir']
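
The same comparison as a small script, in case it is easier to experiment with. find_literal() is a hypothetical helper made up for this sketch, not part of the re module:

```python
import re

def find_literal(needle, haystack):
    """Return every occurrence of needle in haystack, treating needle as plain text."""
    pattern = re.compile(re.escape(needle))
    return pattern.findall(haystack)

haystack = r"C:\dir C:7ir"

# Escaped: the backslash and the 'd' are matched literally.
literal_hits = find_literal(r"C:\dir", haystack)

# Unescaped: \d means "any digit", so the pattern matches C:7ir instead.
regex_hits = re.findall(r"C:\dir", haystack)

print(literal_hits)
print(regex_hits)
```

Note that re.escape() belongs on the pattern side; applying it to the string being searched, as in the original question, only inserts backslashes into the data.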

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list