[XML-SIG] SQL -> XML ?

Michael Sanborn michael@graphion.com
Wed, 10 Feb 1999 10:16:26 -0800


A month ago, I had been working on an XML project as a way of learning
Python. It was quite rewarding, in a way, but now that I'm returning to
it, I think that I need advice on my approach.

At my work, we get in database files that need to be converted into
directories. There are many variations on the theme, but a typical
directory structure would be (in XML syntax):

<physicians>
  <county>Alameda County
    <city>Alameda
      <specialty>Dermatology
        <type>Groups
          <provider>
            <name>Affiliates In Dermatology Med Grp</name>
            <address>2241 Central Ave</address>
            <phone>(510) 523-9866</phone>
            <member>Graham, Richard S.</member>
            <member>Hilger Jr., Leslie G.</member>
            <member>Paslin, David A.</member>
          </provider>
        </type>
      </specialty>
    </city>
  </county>
</physicians>

These are expressed as consecutive one-to-many relationships in a set of
database tables. What I've been trying to do is to read the tables using
SQL statements in Python and then translate the data into XML using
PyDOM. My first approach was to make one big SQL statement like (pause
for breath):

SELECT COUNTY.COUNTYNAME, PPOPROV.CITY, SPECALTY.SPECIALTYH,
PPOPROV.AGREEMENTT, PPOPROV.PROVIDERNU, PPOPROV.NAME, PPOPROV.ADDRESS,
PPOPROV.PHONE, MEMBER.NAME FROM MEMBER RIGHT JOIN (SPECALTY INNER JOIN
(COUNTY INNER JOIN PPOPROV ON COUNTY.COUNTYCODE = PPOPROV.COUNTYCODE) ON
SPECALTY.SEQNUM = PPOPROV.SPECIALTYH) ON MEMBER.PROVIDERNU =
PPOPROV.PROVIDERNU ORDER BY COUNTY.COUNTYNAME, PPOPROV.CITY,
SPECALTY.SPECIALTYH, PPOPROV.AGREEMENTT, PPOPROV.NAME,
PPOPROV.PROVIDERNU, MEMBER.NAME;

and then made a Reader object that recursively directed Builder.py to
create DOM nodes, weaving up and down the tree as the various levels of
grouped headings changed. (Sorry if this is unclear; I'm new to trees,
and am not certain what the standard programming terminology is for what
I'm doing. A minuscule amount of Design Pattern knowledge is a dangerous
thing. :-})

It worked, but it was very slow, and when I tried to adapt it to varying
structures, I found it to be very brittle as well. So I thought of
breaking down the SQL statements by level, like this (to just use the
two topmost levels):

from xml.dom.builder import Builder
from calldll import odbc
from calldll import odbc_installer

builder = Builder()
installer = odbc_installer
installer.config_data_source(installer.ODBC_ADD_DSN, 'Microsoft FoxPro
Driver (*.dbf)', 'DSN=myData;DBQ=.;')
environment = odbc.environment()
connection = environment.connection()
connection.connect('myData')
builder.startElement('physician')
record_set = connection.query('SELECT DISTINCT COUNTY.COUNTYNAME
    FROM COUNTY, PPOPROV
    WHERE COUNTY.COUNTYCODE = PPOPROV.COUNTYCODE
    ORDER BY COUNTYNAME;')
record_set = record_set[1:]
for i in record_set:
        builder.startElement('county')
        builder.text(i[0])
        record_set1 = connection.query('SELECT DISTINCT PPOPROV.CITY
    FROM COUNTY, PPOPROV
    WHERE COUNTY.COUNTYCODE = PPOPROV.COUNTYCODE AND
COUNTY.COUNTYNAME=\'' + i[0] + '\'
    ORDER BY PPOPROV.CITY;')
        record_set1 = record_set1[1:]
        for j in record_set1:
                builder.startElement('city')
                builder.text(j[0])
                builder.endElement('city')
        builder.endElement('county')
builder.endElement('physician')
print builder.document.toxml()


But this is already quite slow, and I've hardly scratched the surface.

So, can anyone provide me with a little guidance here? Should I be
looking at a SAX approach, rather than DOM? Does the fact that I'm using
the ODBC library in calldll matter, rather than, say, mxODBC? Is there
some standard approach to this problem? Should I just go back to South
Dakota?

Any help would be very much appreciated.

Thanks,

Michael Sanborn
Graphion