Why this regex for string literals can't handle escaped quotes?.... '"(\\.|[^"])*"'

Peter Otten __peter__ at web.de
Thu Aug 9 13:03:29 EDT 2018


cseberino at gmail.com wrote:

> Why can't this regex for string literals
> handle escaped quotes?.... '"(\\.|[^"])*"'
> 
> See this...
> 
> >>> string_re = '"(\\.|[^"])*"'
> 
> >>> re.match(string_re, '"aaaa"')
> <_sre.SRE_Match object; span=(0, 6), match='"aaaa"'>
> 
> >>> re.match(string_re, '"aa\"aa"')
> <_sre.SRE_Match object; span=(0, 4), match='"aa"'>
> 
> How can I make the last match be the entire string '"aa\"aa"' ?
> 
> cs

You did not escape the backslashes in the string literals, so Python 
processes them before the re module ever sees the strings:

>>> '"aa\"aa"' == '"aa"aa"'
True
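
To see what is going on, it may help to print what Python actually hands to 
re. A small sketch (the name `subject` is mine, not from the original post):

```python
import re

# Non-raw literals, exactly as typed in the question.
string_re = '"(\\.|[^"])*"'
subject = '"aa\"aa"'

# Python has already processed the escapes by the time re sees the strings.
print(string_re)     # the pattern re receives: \. is now an escaped dot
print(subject)       # the text being searched: no backslash is left
print(len(subject))  # 7

m = re.match(string_re, subject)
print(m.group())     # the match stops at the second quote
```

So the regex never had a backslash-plus-any-character alternative, and the 
search string never contained a backslash to begin with.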

Once you fix this for both the regex and the search string, you get the 
expected match:

>>> re.match('"(\\\\.|[^"])*"', '"aa\\"aa"')
<_sre.SRE_Match object; span=(0, 8), match='"aa\\"aa"'>

Raw strings make this a bit easier to read:

>>> re.match(r'"(\\.|[^"])*"', r'"aa\"aa"')
<_sre.SRE_Match object; span=(0, 8), match='"aa\\"aa"'>
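
If you want to convince yourself that the two spellings are the same thing, 
a quick check along these lines should do (variable names are only for 
illustration):

```python
import re

# The same pattern written with doubled backslashes and as a raw string.
escaped_pattern = '"(\\\\.|[^"])*"'
raw_pattern = r'"(\\.|[^"])*"'
print(escaped_pattern == raw_pattern)  # True

# The raw subject keeps its backslash: " a a \ " a a "
subject = r'"aa\"aa"'
print(len(subject))                    # 8

m = re.match(raw_pattern, subject)
print(m.span())                        # (0, 8)
```

The repr of the match shows '"aa\\"aa"' because repr doubles the backslash; 
the matched text itself contains just one.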




