regex question

John Machin sjmachin at lexicon.net
Fri Aug 4 17:55:34 EDT 2006


Slawomir Nowaczyk wrote:
> On Thu, 03 Aug 2006 22:10:55 +0100
> Gabriel Murray <gabriel.murray at gmail.com> wrote:
>
> #> Hello, I'm looking for a regular expression ....
>
>   Some people, when confronted with a problem, think "I know, I'll
>   use regular expressions." Now they have two problems.
>                                    -- Jamie Zawinski
>
> Therefore:
>
> def test(data):
>     format, index = 'abcd', 0
>     for c in data:
>         i = format.index(c)
>         if i > index+1:
>             return False
>         index = i
>     return index==format.index('d')
>
> Could be made faster if format was made a dictionary or if one wanted
> to compare characters directly. Writing (and profiling) left as an
> exercise for a reader.

Premature optimisation ....

#>>> test('bcd')
True
#>>>

Here's an implementation using a general purpose graph path checking
"algorithm" (it's not really so fancy as to be called that).
8<---
from string import ascii_lowercase as alphabet

def build_maze(size):
    possible = {}
    for i in range(size):
        c = alphabet[i]
        possible[c] = set(alphabet[:min(i+2, size)])
    return possible, set(alphabet[0]), set(alphabet[size-1])

def valid_path(graph, entries, exits, path):
    """
    Check a proposed path through a maze.
    'graph' is a mapping {node: set of allowable next nodes}.
    'entries' is a set of valid start nodes.
    'exits' is a set of valid exit nodes.
    'path' is an iterable which produces nodes.
    A node label can be any hashable, even None.
    Returns True if 'path' is valid.
    """
    choices = entries
    pathlen = 0
    for node in path:
        pathlen += 1
        if node not in choices:
            return False
        if node not in graph:
            return False
            # aliter: raise ValueError("Node %r not in graph" % node)
        choices = graph[node]
    return pathlen and node in exits

tests = [
    ('abcd', True),
    ('aaaaaaaaaaabbbbbccc', False),
    ('aabbccaabbccabcdddcabababbbccccdddd', True),
    ('aabbccaabbccabcabababbbccccddddabcd', True),
    ('aaaaabbbbbccccaaaaadddd', False),
    ('aabbccaabbccacabababbbccccdddd', False),
    ('abccccdaaaabbbbccccd', True),
    ('abcdcd', True),
    ('aabbbaabbcccbbbcccddd', True),
    ('aabbccaabbccabcabababbbccccdddd', True),
    ('abccccdccccd', True),
    ('aabcabcd', True),
    ('bcd', False),
    ('abc', False),
    ('', False),
    ('a', False),
    ('d', False),
    ]

def checkit(maze, tests=tests):
    for test, expected in tests:
        gr, inset, outset = maze
        matched = valid_path(gr, inset, outset, test)
        if matched == expected:
            print "PASSED: %s with %s" % (test, expected)
        else:
            print "FAILED: %s with %s" % (test, expected)

maze4 = build_maze(4)
print maze4
checkit(maze4)
8<---

Cheers,
John




More information about the Python-list mailing list