RegEx for matching brackets

Derek Martin code at pizzashack.org
Sat May 3 00:48:49 EDT 2008


On Fri, May 02, 2008 at 03:51:16PM -0700, NevilleDNZ wrote:
> Thanx for the link to these parsers. ANTLR looks interesting.
> Yoyo: http://www-users.cs.york.ac.uk/~fisher/software/yoyovwg/readme
> 
> I figured out a way to do it in python.
[...]
> 
> def check_open_close(str):
>   try:
>     eval("".join({"{":"[","}":"],"}[c] for c in re.findall( "([{}])|(?:
> [^{}]+)", str) if c))

Ouchie...  It may be short, but if I had to maintain this code, and I
saw this, your name would be used in sentences with lots of curse
words. ;-)  That's one hard-to-read line of code...  Also this may or
may not do what you want it to do -- I think it doesn't...  

This problem is the classical example of when to use a stack (Last-In,
First-Out) data structure.  If all you want to do is make sure that
the line has the same number of opens and closes in a line, your code
does that.  But if you actually want to verify the syntax (i.e. make
sure that there are the same number of open brackets as close
brackets, AND make sure that they occur in the correct order, opens
first, closes last, AND that the closes come in the same (reverse)
order as the opens), your code does not do that.

I changed the tests in your code (specifically the brackets, and
nothing else) to demonstrate this:

DETECTED:   { a test  BAD
DETECTED:   { a test } OK
# This should fail, because the closing ] comes before the open [
MISSED:   { a test ] [ a test } BAD
DETECTED:   { a test } { this { a test } is a test } OK
# this should fail, for the above reason, and because the order is wrong
MISSED:   { a test  { this { a test ] is a test } missing close [}} BAD
DETECTED:   { a test  { this { a test ] is a test } missing close } BAD
# this should also fail for both reasons
MISSED:   { a test  ] this { a test } is a test } missing close [ BAD
DETECTED:   a test } { this { a test } is a test } BAD
DETECTED:   { a test } this { a test } is a test } BAD

It doesn't detect the brackets in the right order (opens before
closes), nor does it detect that they occur in the correct sequence.

Clever code isn't always so clever...  I think there's something to be
said for writing code that's a little bit more lengthy, but easier to
understand.  This version is only a few lines longer than yours (in
total, not counting the comments), but it is a bit clearer and easier
to follow.  Note that I added angle brackets and mixed the bracket
types in the tests.  I also didn't use your referee...  In my code,
the test simply succeeds if the brackets match, and fails if they are
unbalanced or out of order.

#!/usr/bin/python

# define the matching pairs
bm = { '}': '{', 
       ']': '[', 
       ')': '(', 
       '>': '<' }

def bracket_balance(str):
    # initialize the stack to an empty list
    blist = []
    for c in str:
        # If c is an open bracket of any type, place on stack
        if c in bm.values():
            blist.append(c)
	# If it is a close bracket, pull one off the stack and
	# see if they are matching open-close pairs.  If the stack
	# is empty, there was no matching open.  Return false in that
	# case, or if they don't match.
        if c in bm.keys():
            try:
                foo = blist.pop()
            except IndexError: 
                return False
            if foo != bm[c]:
                return False
    # End of the line: if we still have brackets on the stack, we
    # didn't have enough close brackets.  Return false.
    if blist != []: 
        return False
    # If we got here, the stack is empty, and there are no brackets
    # left unmatched.  we're good!
    return True

tests="""
  { this is a test  BAD
  < this is a test > OK
  { this is a test } { this is a test } OK
  { this is a test } [ this { this is a test } is a test ] OK
  { this is a test  { this { this is a test } is a test } missing close BAD
""".splitlines()[1:]

for test in tests:
    print "Testing %s:" % test
    if bracket_balance(test):
        print "-> OK" 
    else:
        print "-> FAILED"

Testing with your original set of tests:

$ ./brackets.py 
Testing   { this is a test  BAD:
-> FAILED
Testing   < this is a test > OK:
-> OK
Testing   { this is a test } { this is a test } OK:
-> OK
Testing   { this is a test } [ this { this is a test } is a test ] OK:
-> OK
Testing   { this is a test  { this { this is a test } is a test } missing close BAD:
-> FAILED

Testing with my modified set of tests:

$ ./brackets.py 
Testing   { a test  BAD:
-> FAILED
Testing   { a test } OK:
-> OK
Testing   { a test ] [ a test } BAD:
-> FAILED
Testing   { a test } { this { a test } is a test } OK:
-> OK
Testing   { a test  { this { a test ] is a test } missing close [}} BAD:
-> FAILED
Testing   { a test  { this { a test ] is a test } missing close } BAD:
-> FAILED
Testing   { a test  ] this { a test } is a test } missing close [ BAD:
-> FAILED
Testing   a test } { this { a test } is a test } BAD:
-> FAILED
Testing   { a test } this { a test } is a test } BAD:
-> FAILED

In all cases, this code correctly identifies when the brackets are out
of order, or unbalanced.

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20080503/d8f90d8c/attachment-0001.sig>


More information about the Python-list mailing list