RegEx issues

Scott David Daniels Scott.Daniels at Acm.Org
Sat Jan 24 13:59:44 EST 2009


Sean Brown wrote:
> I have the following string ...:  "td[ct] = [[ ... ]];\r\n"
> The ... (representing text in the string) is what I'm extracting ....
> So I think the regex \[\[(.*)\]\]; should do it.
> The problem is it appears that python is escaping the \ in the regex
> because I see this:
>>>> reg = '\[\[(.*)\]\];'
>>>> reg
> '\\[\\[(.*)\\]\\];'
> Now to me looks like it would match the string - \[\[ ... \]\];
> ...

OK, you already have a good answer as to what is happening.
I'll mention that raw strings were put in the language exactly for
regex work.  They are useful any time you need to use the backslash
character (\) within a string (but not as the final character).
For example:
     len(r'\a\b\c\d\e\f\g\h') == 16 and len('\a\b\c\d\e\f\g\h') == 13
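
To see where the three missing characters go, here is a small sketch.
The second literal is my own respelling of '\a\b\c\d\e\f\g\h' with the
unrecognized escapes (\c, \d, \e, \g, \h) written as doubled backslashes.
It has the same value, but avoids the warning that newer Python versions
emit for unrecognized escapes.  The names raw_text and cooked_text are
just for illustration.

```python
# Raw literal: every backslash is kept, so 8 pairs -> 16 characters.
raw_text = r'\a\b\c\d\e\f\g\h'

# Normal literal: \a, \b and \f are recognized escapes and collapse to a
# single character each (BEL, backspace, form feed).  The other five are
# not recognized, so the backslash survives; they are doubled here to say
# so explicitly.
cooked_text = '\a\b\\c\\d\\e\f\\g\\h'

print(len(raw_text))      # 16
print(len(cooked_text))   # 13 = 16 - 3 collapsed escapes
print(repr(cooked_text))  # repr shows the control characters as escapes
```

So only the escapes Python actually knows about change the string; the
rest pass through with their backslash intact.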

If you get in the habit of typing regex strings as r'...' or r"...",
and examining the patterns with print(somestring), you'll ease your life.
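
As a rough sketch of that habit applied to the pattern from the question:
the sample line below is made up (the original elides the real payload
with "..."), so treat " 1, 2, 3 " as a placeholder.  The doubled
backslashes seen at the interactive prompt are only how repr displays
the string; print shows what the regex engine actually receives.

```python
import re

# Hypothetical input shaped like the string in the question; the payload
# between the double brackets is invented for this example.
line = "td[ct] = [[ 1, 2, 3 ]];\r\n"

# Raw string: the backslashes go to the re module untouched.
pattern = r'\[\[(.*)\]\];'

print(repr(pattern))  # what the >>> prompt shows: backslashes doubled
print(pattern)        # the actual characters in the pattern

match = re.search(pattern, line)
payload = match.group(1) if match else None
print(repr(payload))
```

If the text between the brackets can itself contain "]];", a non-greedy
group (.*?) may be the better choice; that depends on the real data.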

--Scott David Daniels
Scott.Daniels at Acm.Org


