Does my RE rock, or suck?
Gustavo Niemeyer
niemeyer at conectiva.com
Fri Jul 9 15:53:24 EDT 2004
[...]
> In my example I compile the RE:
>
> '^(Mary|John( Smith( (Sr|Jr)|)|)|Bob)'
>
> If the RE compiler is smart, searching should involve scanning
> the longest prefix that could still match, plus looking at just
This won't happen right now. Minimum and maximum lenght is currently
only checked for the whole pattern, not for every branch.
> one character for each branch-point along the way. (If it uses
> a jump-table at decision points, it should be truly linear.)
OTOH, there's some intelligence for this specific case. The first
character is checked even before "entering" the branch (IOW,
preparing the enviornment for checking the branch), so your
pattern should match really fast.
It will be something close to:
(1) Entering a branch, save current state as necessary.
(2) Check first character, does it match?
(3) If yes, get into the branch, saving state as necessary, and
checking other characters, subpatterns, etc.
(4) No, jump to next alternative, go back to (2).
(5) Yes, leave the current branch with success, restoring states
as necessary.
--
Gustavo Niemeyer
http://niemeyer.net
More information about the Python-list
mailing list