[Tutor] regular expression question

Marek Spociński
Tue Apr 28 11:26:49 CEST 2009


On 28 April 2009 at 11:16, Andre Engels <andreengels at gmail.com> wrote:
> 2009/4/28 Marek Spociński at go2.pl,Poland :
> >> Hello,
> >>
> >> The following code returns 'abc123abc45abc789jk'. How do I revise the pattern so
> >> that the return value will be 'abc789jk'? In other words, I want to find the
> >> pattern 'abc' that is closest to 'jk'. Here the string '123', '45' and '789' are
> >> just examples. They are actually quite different in the string that I'm working
> >> with.
> >>
> >> import re
> >> s = 'abc123abc45abc789jk'
> >> p = r'abc.+jk'
> >> lst = re.findall(p, s)
> >> print lst[0]
> >
> > I suggest using r'abc.+?jk' instead.
> >
> > The additional ? makes the preceding '.+' non-greedy, so instead of matching as long a string as it can, it matches as short a string as possible.
> 
> That was my first idea too, but it does not work in this case,
> because Python will still try to _start_ the match as early as
> possible. To use .+? one would have to reverse the string, then apply
> the reversed regular expression to the result, which looks like a rather
> roundabout way of doing things.

I don't have access to Python right now, so I cannot test my ideas,
and I don't want to give you the wrong idea either.
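
For reference, the two points made in the quoted text can be sketched roughly
as follows. This is only an illustration under stated assumptions: it assumes
the start and end markers are literal strings ('abc' and 'jk' here), and the
helper name closest_match is made up for this example, not something from the
re module. Reversing a pattern this way only works for simple literal markers;
a more complicated pattern would have to be reversed by hand.

```python
import re

s = 'abc123abc45abc789jk'

# Greedy and non-greedy both start at the first 'abc', because the regex
# engine always tries the leftmost possible starting position first.
greedy = re.findall(r'abc.+jk', s)[0]
non_greedy = re.findall(r'abc.+?jk', s)[0]
print(greedy)      # abc123abc45abc789jk
print(non_greedy)  # abc123abc45abc789jk (same result)


def closest_match(text, start='abc', end='jk'):
    """Hypothetical helper: reverse the text, match the reversed markers
    non-greedily, then reverse the match back.

    Assumes start and end are literal strings, not regular expressions.
    Returns None when nothing matches.
    """
    # In the reversed text the end marker comes first, so a non-greedy
    # match stops at the nearest (reversed) start marker.
    reversed_pattern = re.escape(end[::-1]) + r'.+?' + re.escape(start[::-1])
    m = re.search(reversed_pattern, text[::-1])
    return m.group(0)[::-1] if m else None


print(closest_match(s))  # abc789jk
```

The non-greedy quantifier only shortens the match on the right-hand side, which
is why the reversal is needed to shorten it on the left.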

