ZODB performance (was Re: low-end persistence strategies?)

Wed Feb 16 11:24:46 EST 2005

> Chris (or anyone else), could you comment on ZODB's performance? I've
> Googled around a bit and haven't been able to find anything concrete, so
> I'm really curious to know how ZODB does with a few hundred thousand
> objects.

> Specifically, what level of complexity do your ZODB queries/searches have?
> Any idea on how purely ad hoc searches perform? Obviously it will be
> affected by the nature of the objects, but any insight into ZODB's
> performance on large data sets would be helpful. What's the general ratio
> of reads to writes in your application?

This is a somewhat weak point of ZODB. ZODB simply lets you store arbitrary
object graphs. No indices are created to access them, and there is no query
language either. You can of course create indices yourself and store them
just like any other object. But you have to hand-tailor them to the objects
you use and write the querying code yourself; there is no 4GL like SQL
available.
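
To give an idea of what "creating indices yourself" means, here is a
minimal sketch. It does not use ZODB at all: plain dicts stand in for the
persistent containers, and Person, PeopleStore and by_city are names made
up for illustration only.

```python
# Plain dicts stand in for persistent containers; every name here is a
# hypothetical example, not part of ZODB.

class Person:
    def __init__(self, name, city):
        self.name = name
        self.city = city


class PeopleStore:
    def __init__(self):
        self.people = {}    # primary storage: name -> Person
        self.by_city = {}   # hand-made index: city -> set of names

    def add(self, person):
        self.people[person.name] = person
        self.by_city.setdefault(person.city, set()).add(person.name)

    def move(self, name, new_city):
        # Every write path has to keep the index in sync by hand.
        person = self.people[name]
        self.by_city[person.city].discard(name)
        person.city = new_city
        self.by_city.setdefault(new_city, set()).add(name)

    def find_by_city(self, city):
        # Index lookup instead of scanning all people.
        return sorted(self.by_city.get(city, ()))


store = PeopleStore()
store.add(Person("anna", "Berlin"))
store.add(Person("bob", "Hamburg"))
store.add(Person("carl", "Berlin"))
store.move("carl", "Hamburg")
print(store.find_by_city("Hamburg"))  # ['bob', 'carl']
```

The point is the move() method: nothing updates the index for you, so each
place that changes an indexed attribute has to do it explicitly.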

Of course writing queries as simple predicates evaluated against your whole
object graph is straightforward - but unoptimized.
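
A rough sketch of such a predicate query, again without ZODB itself: walk()
and query() are hypothetical helpers, and the root dict only imitates a
database root. Every reachable object is visited, which is exactly why this
is unoptimized.

```python
def walk(obj, seen=None):
    """Yield every object reachable from obj via dicts, sequences and attributes."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # guard against cycles and shared objects
        return
    seen.add(id(obj))
    yield obj
    if isinstance(obj, dict):
        children = list(obj.values())
    elif isinstance(obj, (list, tuple, set)):
        children = list(obj)
    elif hasattr(obj, "__dict__"):
        children = list(vars(obj).values())
    else:
        children = []
    for child in children:
        yield from walk(child, seen)


def query(root, predicate):
    # Full scan: the predicate is evaluated against every reachable object.
    return [obj for obj in walk(root) if predicate(obj)]


class Order:
    def __init__(self, customer, total):
        self.customer = customer
        self.total = total


root = {"orders": [Order("anna", 120), Order("bob", 40), Order("carl", 300)]}
big = query(root, lambda o: isinstance(o, Order) and o.total > 100)
print([o.customer for o in big])  # ['anna', 'carl']
```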

The retrieval of the objects themselves is very fast - I didn't compare it
to an RDBMS, but as there is no networking involved it should be faster. And
of course no joins are needed.

So in the end, if you always run the same kinds of queries and only vary
their parameters, and you create appropriate indices and hand-written
"execution plans" for them, things are nice.
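
What I mean by a hand-written "execution plan" is roughly the following.
This is only a sketch with plain dicts standing in for stored objects;
find_orders, by_customer and the sample data are invented for the example.

```python
def find_orders(store, customer, min_total):
    """Parametrized query: orders of one customer with at least min_total."""
    # Step 1 of the "plan": use the customer index to get a small candidate set.
    candidate_ids = store["by_customer"].get(customer, ())
    # Step 2: apply the remaining condition as a plain predicate.
    orders = store["orders"]
    return sorted(oid for oid in candidate_ids
                  if orders[oid]["total"] >= min_total)


store = {
    "orders": {
        1: {"customer": "anna", "total": 120},
        2: {"customer": "bob", "total": 40},
        3: {"customer": "anna", "total": 15},
        4: {"customer": "anna", "total": 300},
    },
    # Hand-maintained index: customer -> list of order ids.
    "by_customer": {"anna": [1, 3, 4], "bob": [2]},
}

print(find_orders(store, "anna", 100))  # [1, 4]
```

The order of the two steps is fixed in the code. An RDBMS would choose it
for you; here you decide it once per kind of query.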

But I want to stress another point that can cause trouble when using zodb
and that I didn't mention in replies to Paul so far, as he explicitly
didn't want to use an rdbms:

For RDBMSes, a well-defined textual representation of the entities stored
in the db is available. So while you have to put some effort into creating
an OR-mapping (if you want to deal with objects), and that mapping will most
likely evolve over time, migrating the underlying data is usually pretty
straightforward, and even tool support is available. Basically, you're only
dealing with CSV data that can be easily manipulated and stored back.

ZODB, on the other hand, is way easier to code for - but the hard times
begin once you have a rolled-out application with a bunch of objects inside
ZODB that have to be migrated to newer versions and possibly to changed
object graph layouts. This made me create elaborate YAML/XML serializations
to allow for imports and exports and for use with XSLT, and I'm currently
investigating a switch to Postgres.
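
The export/import idea looks roughly like this. It is a simplified sketch,
not my actual code: JSON from the standard library stands in for YAML/XML,
the transformation is plain Python instead of XSLT, and CustomerV1 and
CustomerV2 are made-up classes for an old and a new object layout.

```python
import json


class CustomerV1:
    """Old layout: full name in a single field."""
    def __init__(self, name):
        self.name = name


class CustomerV2:
    """New layout: name split into two fields."""
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name


def export_customers(customers):
    # Dump plain data only, so the text does not depend on any class layout.
    return json.dumps([{"name": c.name} for c in customers], indent=2)


def migrate(text):
    # Transform the textual representation from the old to the new schema.
    migrated = []
    for rec in json.loads(text):
        first, _, last = rec["name"].partition(" ")
        migrated.append({"first_name": first, "last_name": last})
    return json.dumps(migrated, indent=2)


def import_customers(text):
    return [CustomerV2(**rec) for rec in json.loads(text)]


old = [CustomerV1("Ada Lovelace"), CustomerV1("Guido van Rossum")]
new = import_customers(migrate(export_customers(old)))
print([(c.first_name, c.last_name) for c in new])
# [('Ada', 'Lovelace'), ('Guido', 'van Rossum')]
```

With an RDBMS the export and import steps come for free; with an object
database you end up writing and maintaining them per class.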

This point is important, and future developments of mine will take that into
consideration more than they did so far.

-- 
Regards,

Diez B. Roggisch