Bug? concatenate a number to a backreference: re.sub(r'(zzz:)xxx', r'\1'+str(4444), somevar)

abdulet abdulet at gmail.com
Fri Oct 23 08:27:21 EDT 2009


On 23 oct, 13:54, Peter Otten <__pete... at web.de> wrote:
> abdulet wrote:
> > Is this normal? I want to concatenate a number to a
> > backreference in a regular expression. I'm working on a multiprocess
> > script, so at first I thought the error was in the multiprocess
> > logic, but what a surprise! After some time debugging I arrived
> > at this conclusion:
>
> > import re
> > aa = "zzz:xxx"
> > re.sub(r'(zzz:).*',r'\1'+str(3333),aa)
> > '[33'
>
> If you perform the addition you get r"\13333". How should the regular
> expression engine interpret that? As the backreference to group 1, 13, ...
> or 13333? It picks something completely different, "[33", because "\133" is
> the octal escape sequence for "[":
>
> >>> chr(0o133)
> '['
>
> You can avoid the ambiguity with
>
> extra = str(number)
> extra = re.escape(extra)
> re.sub(expr, r"\g<1>" + extra, text)
>
> The re.escape() step is not necessary here, but a good idea in the general
> case when extra is an arbitrary string.
>
> Peter
Aha! Nice, thanks. I hadn't seen that part of the re module
documentation, and it was right in front of my eyes :(( As always, it's
something silly, haha. Thanks again, and yes, it's a good idea to escape
the variable ;)
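
For the archive, here is a minimal sketch of the behaviour discussed
above, assuming Python 3. The names number and append_number are only
illustrative and are not from the original script. It shows the plain
concatenation being read as an octal escape, the \g<1> form that removes
the ambiguity, and a replacement function as a third option: a
function's return value is inserted literally, so the extra text needs
no escaping at all.

```python
import re

text = "zzz:xxx"
number = 3333  # illustrative value, as in the original report

# Plain concatenation builds the template r"\13333".
# "\133" is parsed as an octal escape for "[", followed by a literal "33".
naive = re.sub(r"(zzz:).*", r"\1" + str(number), text)

# \g<1> marks the end of the group reference explicitly,
# so the digits that follow are treated as literal text.
explicit = re.sub(r"(zzz:).*", r"\g<1>" + str(number), text)


def append_number(match):
    """Return group 1 followed by the number; no template parsing happens here."""
    return match.group(1) + str(number)


# A callable replacement receives the match object and returns the final text.
via_function = re.sub(r"(zzz:).*", append_number, text)

print(repr(naive))         # '[33'
print(repr(explicit))      # 'zzz:3333'
print(repr(via_function))  # 'zzz:3333'
```

The function form is a bit more verbose, but it may be the simpler
choice when the appended text is an arbitrary string.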

cheers


