[Python-ideas] Updated PEP 428 (pathlib)

Devin Jeanpierre jeanpierreda at gmail.com
Thu Mar 7 04:48:29 CET 2013


On Wed, Mar 6, 2013 at 9:26 PM, MRAB <python at mrabarnett.plus.com> wrote:
>> It's not something I've ever used, but it doesn't look that difficult
>> compared to regex if all it has is "*", "?", "[...]" and "[!...]".
>>
> Here's a very simple, all-Python, implementation I've just cooked up:
--snip--

Because positions is never culled of duplicate states, this suffers
the exact same problem.

In this case, fnmatch('a'*50, '*a*'*50) returns in 6500 years instead of 200.

If you want to solve it, you should either affix it with a memoization
cache, or use Thompson's algorithm instead of backtracking search.

Since this isn't recursive, memoization is a bit annoying, though, so
instead I modified it below to use Thompson's algorithm on NFA (with
no error checking though):

def fnmatch(name, pattern):
    positions = {0}

    for char in name:
        new_positions = set()
        for pattern_pos in positions:
            if pattern_pos >= len(pattern):
                continue
            pattern_char = pattern[pattern_pos]
            if pattern_char == '*':
                if pattern_pos == len(pattern) - 1:
                    return True

                new_positions.update([pattern_pos, pattern_pos + 1])
            elif pattern_char == '?':
                new_positions.add(pattern_pos + 1)
            elif pattern[pattern_pos] == '[':
                negative = pattern[pattern_pos + 1] == "!"
                pattern_pos += 2 if negative else 1

                close_pos = pattern.index(']', pattern_pos)
                if (char in pattern[pattern_pos : close_pos]) != negative:
                    new_positions.add(close_pos + 1)
            elif char == pattern_char:
                new_positions.add(pattern_pos + 1)
        positions = new_positions

    return len(pattern) in positions

Backseatingly yours,
-- Devin



More information about the Python-ideas mailing list