shortest match regexp operator anyone?
Harald Kirsch
kirschh at lionbioscience.com
Thu Jul 12 03:14:34 EDT 2001
"Steve Holden" <sholden at holdenweb.com> writes:
> "Harald Kirsch" <kirschh at lionbioscience.com> wrote in ...
> >
> > SHORT STORY:
> > Does anyone know of a regular expression library which has an operator
> > that forces a subexpression to insist on its shortest match, even if
> > that ruins the overall match?
[snip]
> Had you thought about using lookahead assertions, which don't actually match
> anything, but fail unless the specified pattern is (or, for a negative
> lookahead assertion, is not) present? Combined with non-greedy matching this
> might get you where you want to be.
No, that does not solve it. Friedl's book has an example similar to
(.*?)(?=<A>)<A>B
but that matches "xx<A>x<A>B", i.e. the match contains an <A> in the
part covered by ".*?". Again, I cannot force "(.*?)(?=<A>)<A>" to
insist on its shortest match rather than give it up for the sake of
an overall match.
I tried other combinations, e.g. "(.(?!<A>))*?<A>", but none of them
really works.
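For illustration, here is a minimal sketch of the behaviour described
above, using Python's re module. The variable names are made up for
this example. The last pattern, with the negative lookahead placed
before the dot, is a commonly used workaround and not a real
"shortest match" operator; it only covers the case where the
terminator is a fixed string such as <A>.

```python
import re

text = "xx<A>x<A>B"

# Pattern from the post: the lazy group keeps growing until the
# overall match succeeds, so it ends up swallowing the first <A>.
m = re.match(r"(.*?)(?=<A>)<A>B", text)
lazy_group = m.group(1)

# Second attempt from the post: the lookahead sits after the dot, so
# the character directly in front of <A> can never be consumed.
late_check = re.match(r"(.(?!<A>))*?<A>", "xx<A>")

# Workaround (assumption, not from the post): test the lookahead
# before each character, so the group can never run across an <A>.
tempered = re.compile(r"((?:(?!<A>).)*)<A>B")
rejected = tempered.match(text)      # first <A> is not followed by B
accepted = tempered.match("xx<A>B")  # first <A> is followed by B

print(repr(lazy_group))
print(late_check)
print(rejected)
print(repr(accepted.group(1)))
```

With re.match anchoring the attempt at the start of the string, the
last pattern refuses "xx<A>x<A>B" instead of stretching the group over
the first <A>, which is the effect asked for in this one special case.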
Advocacy: The `shortest match' operator is really missing from regexp
languages.
Harald Kirsch
--
----------------+------------------------------------------------------
Harald Kirsch | kirschh at lionbioscience.com | "How old is the epsilon?"
LION bioscience | +49 6221 4038 172 | -- Paul Erdös
*** Please do not send me copies of your posts. ***