sre \Z bug or feature?

Tue Jan 2 13:20:54 EST 2001

On Tue, 2 Jan 2001, Tim Peters wrote:

> You may want to add this test case to the bug "New re breaks on some '*?'
> matches" opened yesterday:
> 
> http://sourceforge.net/bugs/?func=detailbug&bug_id=127259&group_id=5470

Done. Thanks for the reference.

> > ... any hints how to go around this bug without updating Python
> > from CVS after it is fixed?
> 
> It's unclear what you're trying to accomplish.  Tell us in words what it is
> you're trying to match, and I'm sure we can find an equivalent regexp that
> doesn't use *?.  As is, your regexp should match any string whatsoever that
> ends with a } followed by optional whitespace, and set groupdict('rest') to
> an empty string.  It can never set 'rest' to anything other than an empty
> string.  If that's really what you intended, then
> 
>     r'(?s).*}\s*\Z(?P<rest>)'
> 
> is a simpler and faster way to accomplish that.  But I doubt that's what you
> intended.

Here is a complete re.match command that I use in my application:

m=re.match(r'\A(?ms)\s*@\s*(?P<name>\w+)\s*{\s*(?P<body>.*?)\s*}\s*\Z',item)

and it should match an item in a .bib file (a BibTeX input file).
Each item is expected to have the following form:

@name{body}

where `name' is a purely alphabetic string and `body' may contain arbitrary
characters, including `{', `}', and `\n'. Between the parts `@', `name', `{',
`body', and `}' there may be any number of whitespace characters,
including newlines. Note also that `item' contains exactly one such BibTeX
item block (the other parts of my application ensure this).
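
For illustration, here is a minimal sketch of what the pattern is meant to
do on a made-up entry. It assumes a current Python `re' module, where inline
flags may no longer appear in the middle of a pattern, so the flags are
passed as an argument and the braces are escaped. The sample entry and the
helper name parse_item are hypothetical, not part of my application.

```python
import re

# Same structure as the pattern above; (?ms) is replaced by explicit flags.
ITEM_RE = re.compile(
    r'\A\s*@\s*(?P<name>\w+)\s*\{\s*(?P<body>.*?)\s*\}\s*\Z',
    re.MULTILINE | re.DOTALL,
)

def parse_item(item):
    """Return (name, body) for a single @name{body} block, or None."""
    m = ITEM_RE.match(item)
    if m is None:
        return None
    return m.group('name'), m.group('body')

# A made-up entry with nested braces and newlines inside the body.
sample = "@article{ key,\n  title = {A {B}ook},\n  year = 2001\n}\n"
name, body = parse_item(sample)
print(name)
print(repr(body))
```

The non-greedy `.*?' is forced to grow until the final `}' because only that
brace is followed by nothing but whitespace up to `\Z', so nested braces end
up inside `body'.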

So, do you think that the pattern I use is somehow invalid or inefficient?

In order to get my application to work under Python 2.0, I now use

import pre as re

which works perfectly.
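
A hedged sketch of a more defensive form of this workaround, assuming the
code may also run on an installation that does not ship the `pre' module, in
which case it falls back to the standard `re':

```python
# Prefer the old engine when it is available; otherwise use the regular module.
try:
    import pre as re
except ImportError:
    import re

# The rest of the code keeps using the name `re' unchanged.
m = re.match(r'a+\Z', 'aaa')
```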

Thanks,
	Pearu