[Python-Dev] Behavior of matching backreferences

Gustavo Niemeyer niemeyer@conectiva.com
Mon, 24 Jun 2002 00:04:58 -0300


> > I still think it should, because otherwise the "^(a)?b\1$" can never be
> > used, and this expression will become "^((a)?)b\1$" if more than one
> > character is needed.
> 
> Is that a real concern?  I mean that in the sense of whether you have an
> actual application requiring that some multi-character bracketing string
> either does or doesn't appear on both ends of a thing, and typing another
> set of parens is a burden.  Both parts of that seem strained.

No, it isn't, both because there is a way to implement this, as Barry
and you have shown, and because *I* know it doesn't work as I'd
expect. :-))
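
For the record, here is a minimal sketch of the behavior being discussed,
assuming the standard re module. The variable names are only for
illustration; the two patterns are the ones quoted above. A backreference
to a group that took no part in the match fails, while the extra set of
parens makes group 1 match an empty string instead.

```python
import re

# Pattern from the thread: group 1 is optional, so for the input "b"
# it does not participate in the match at all.
optional_group = re.compile(r"^(a)?b\1$")

# Workaround from the thread: the outer group always participates,
# matching either "a" or the empty string.
wrapped_group = re.compile(r"^((a)?)b\1$")

for text in ("aba", "b"):
    print(text,
          optional_group.match(text) is not None,
          wrapped_group.match(text) is not None)
```

With the optional group, "aba" matches but "b" does not; with the wrapped
group, both strings match.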

Indeed, I found it while implementing another feature which, in my
opinion, is really useful and can't be easily achieved. But that's
something for another thread, another day.

[...]
> ?  Your example is hiding in there, on the "third iteration of the outer
> loop".  The official POSIX interpretation was that it should match just the
> first 6 characters, and not the trailing #,
> 
>     because in a third iteration of the outer subexpression, . would match
>     nothing (as distinct from matching a null string) and hence \2 would
>     match nothing.
[...]

Thanks for giving me a strong and detailed reason. I understand that
small issues can end up in endless discussions and different
implementations. I'm happy that the POSIX people thought about that
before me <2.0 wink>.

> > Could you please reject the patch at SF?
> 
> I'm not sure which one you mean, so on your authority I'm going to reject
> all patches at SF.  Whew!  This makes our job much easier <wink>.

That's good! You'll get back the time you wasted on me. ;-))

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]