[XML-SIG] Re: [DB-SIG] XML Databases and Python

Bob Kline bkline@rksystems.com
Fri, 11 Jan 2002 10:29:02 -0500 (EST)


On Fri, 11 Jan 2002, TORZEC Nicolas thesard FTRD/DIH/LAN wrote:

> Dear all,
> I am currently involved in a project where I am using Python and
> where I manage (create /update /delete /select /use) many XML files.
>
> - My original idea was to create my own XML file managing system, but
> creating such a system will take a long time.
> - My second idea was to use a lightweight relational database, but it's not
> the easiest and the most efficient way to manipulate such semi-structured
> data
> - My third idea was to use a lightweight object-oriented database
> - My fourth idea was to use a lightweight native XML database
>
>
>
> What do you think about these solutions ?
> According to your Python/XML/Database experience, what is the most
> appropriate database type for this kind of job ?
> Do you know lightweight relational, object-oriented or XML oriented
> databases compatible with Python ? (preference is given to open source
> projects)

Hi, Nicolas.

The choice we made for the project I'm currently working on for one of
my clients was to use a relational database for the underlying storage
mechanism.  This allowed us to use the relational tables for managing
and querying the document and system metadata.  We store the XML in a
single column of the master document table.  We have tables for such
things as user accounts, groups, and permissions, versioning of the
documents, audit trails, inter-document link tracking, and element
values for which XML Query is supported.  We looked at the other
solutions, which may be well suited to other projects, but concluded
that for our purposes, at the time the architectural decisions were
being made (back in 2000), this was the most appropriate.  In particular,
we looked at the off-the-shelf SGML/XML repository management systems.
Weighing the amount of customization that would need to be done anyway,
the uncertainties introduced by dependencies on third parties in a
rapidly evolving market, and the performance penalties introduced by
the approaches used in at least some of the commercial solutions, we
concluded that the customer would be better off if we built that piece
ourselves.
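
A minimal sketch of this kind of layout, not the actual schema of the
project described above: the whole XML document goes into one column of a
master table, and selected element values are copied into a side table so
they can be queried with plain SQL.  The table names (document,
query_term), the column names, and the helper functions are all
hypothetical, and sqlite3 is used only because it ships with Python; any
RDBMS with a DB-API driver could stand in for it.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one master table holding the XML text, plus a
# side table of (path, value) pairs extracted from each document.
SCHEMA = """
CREATE TABLE document (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    xml   TEXT NOT NULL
);
CREATE TABLE query_term (
    doc_id INTEGER NOT NULL REFERENCES document(id),
    path   TEXT NOT NULL,
    value  TEXT NOT NULL
);
CREATE INDEX query_term_idx ON query_term (path, value);
"""


def open_repository(path=":memory:"):
    """Create a fresh repository database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def _walk(elem, prefix=""):
    """Yield (path, text) for every element that has non-blank text."""
    path = prefix + "/" + elem.tag
    text = (elem.text or "").strip()
    if text:
        yield path, text
    for child in elem:
        yield from _walk(child, path)


def add_document(conn, title, xml_text):
    """Store the XML in a single column and index its element values."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    with conn:  # commit both inserts together, or roll both back
        cur = conn.execute(
            "INSERT INTO document (title, xml) VALUES (?, ?)",
            (title, xml_text),
        )
        doc_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO query_term (doc_id, path, value) VALUES (?, ?, ?)",
            [(doc_id, p, v) for p, v in _walk(root)],
        )
    return doc_id


def find_documents(conn, path, value):
    """Return (id, title) of documents whose element at path equals value."""
    cur = conn.execute(
        "SELECT DISTINCT d.id, d.title FROM document d "
        "JOIN query_term q ON q.doc_id = d.id "
        "WHERE q.path = ? AND q.value = ? ORDER BY d.id",
        (path, value),
    )
    return cur.fetchall()
```

With something like this in place, a call such as
find_documents(conn, "/report/author", "Kline") is answered from the
indexed side table without parsing any stored XML.  Tables for users,
permissions, versions, and audit trails would sit alongside these in the
same database.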

On the other hand, we evaluated and selected one of the commercially
available XML editing packages (XMetaL from SoftQuad) for use as our
primary front end (this didn't eliminate the need for heavy
customization on the client end, though).

The fact that we are using a standard DBMS underneath the repository
means that we can take advantage of the rich set of established tools for
working with relational tables.  We use Python heavily for much of our
reporting (the parts that aren't handled well by XSLT) and for our web-based
administrative tools.  It's hard to find a good RDBMS
with which Python is not compatible.
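
As an illustration of Python reporting over relational metadata, the
following sketch runs an aggregate query through the standard DB-API
interface and formats the result as plain text.  The audit_trail table,
its columns, the sample rows, and the function names are hypothetical;
sqlite3 again stands in for whichever RDBMS and driver are actually used.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with a hypothetical audit_trail table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_trail (doc_id INTEGER, user_name TEXT, action TEXT)"
    )
    conn.executemany(
        "INSERT INTO audit_trail VALUES (?, ?, ?)",
        [(1, "alice", "create"), (1, "bob", "update"), (2, "alice", "create")],
    )
    conn.commit()
    return conn


def actions_per_user(conn):
    """Count audit-trail entries per user, ordered by user name."""
    cur = conn.cursor()
    cur.execute(
        "SELECT user_name, COUNT(*) FROM audit_trail "
        "GROUP BY user_name ORDER BY user_name"
    )
    return cur.fetchall()


def format_report(rows):
    """Render the (user, count) rows as a fixed-width text table."""
    lines = ["%-12s %s" % ("User", "Actions")]
    for user, count in rows:
        lines.append("%-12s %d" % (user, count))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(actions_per_user(demo_connection())))
```

Because the query goes through the DB-API cursor interface, moving it to
another driver should mostly mean changing the connect call, although the
parameter placeholder style can differ between drivers.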

As I say, this approach may not be the best for every XML repository,
and we might not have made the same decisions after the commercial
repository software has had a chance to mature further, but it's working
well for us.

Hope this is useful.

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com