Using Regular Expressions to Parse SQL

ct60 at aol.com ct60 at aol.com
Tue Feb 5 09:31:11 EST 2008


Hello again -

I do not seem to be able to get a handle on non-greedy pattern
matching.

I am trying to parse the following - note that there are no line
breaks in the string:

" FROM ((qry_Scores_Lookup1 INNER JOIN CSS_Rpt1 ON
(qry_Scores_Lookup1.desc = CSS_Rpt1.desc) AND
(qry_Scores_Lookup1.lastcdu = CSS_Rpt1.lastcdu))"

What I would like to do is be able to split on the "AND" and the "="
to come up with the following:
qry_Scores_Lookup1.desc
CSS_Rpt1.desc
qry_Scores_Lookup1.lastcdu
CSS_Rpt1.lastcdu

The following is one of my many attempts to do this:

import re

s= " FROM ((qry_Scores_Lookup1 INNER JOIN CSS_Rpt1 ON
(qry_Scores_Lookup1.desc = CSS_Rpt1.desc) AND
(qry_Scores_Lookup1.lastcdu = CSS_Rpt1.lastcdu))"

pat = " FROM .+ (?:INNER|LEFT|RIGHT) JOIN .+ ON (?:AND)*?((.+)=(.+))"

m = re.match(pat, s)

if m is None:
    print "No Match"
else:
    for mat in m.groups():
        print mat

My pattern does not even come close.

Any help would be greatly appreciated.  My goal is to analyse a large
number of SQL querys to try to identify the join field and see where
indexing might make sense.

While I am mostly interested in understanding regular expressions, I
would also be interested in knowing about any Python SQL parsers out
there.

Thanks in advance.

Chris (ct60 at aol.com)





More information about the Python-list mailing list