Jeremy Hylton : weblog : 2003-04-18
Friday, April 18, 2003
One component of ZODB is ZEO, a client-server storage layer that allows many clients to share a single database. Client-server support is a critical feature. Even if you only expect a single client during normal operation, it is useful to connect a second client for debugging.
The current ZEO system (2.0) works well enough, but it is fairly complex and difficult to extend. If I had free rein to start over, there are a bunch of interesting possibilities to consider.
The current ZEO implementation uses version two of a homegrown RPC protocol. One serious problem with the current scheme is that it uses cPickle to serialize objects before sending them over the wire. cPickle is very efficient for serializing small objects. Since all the logic of serialization is in C, it's cheap to construct the pickle. But pickle does badly for large string objects -- like the objects returned from load(). The pickler creates a copy of the string. So to return a very large object from ZEO, we need to load the object into memory, then copy it, then send it out over asyncore. Sending it via asyncore will almost certainly break it up into little pieces. The client has to do the same thing on unpickling. This can lead to excess memory usage or outright failures (MemoryErrors) when fragmentation makes it impossible to allocate a big enough buffer.
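To make the copying problem concrete, here is a minimal sketch of one alternative: send a large payload as a length header followed by slices of the original buffer, so the sender never builds a second full-size copy. This is not how ZEO works. The names iter_frames, read_payload and CHUNK_SIZE are made up for illustration, and the sketch is written for a modern Python 3 rather than the cPickle-era interpreter.

```python
import io
import struct

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, not a ZEO constant


def iter_frames(data, chunk_size=CHUNK_SIZE):
    """Yield an 8-byte length header, then slices of data.

    memoryview slices refer to the original buffer, so no second
    full-size copy of the payload is created on the sending side.
    """
    view = memoryview(data)
    yield struct.pack(">Q", len(view))
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


def read_payload(stream):
    """Read one length-prefixed payload from a binary file-like stream.

    The destination buffer is allocated once and filled in place.
    """
    header = stream.read(8)
    if len(header) != 8:
        raise EOFError("stream ended before the length header was complete")
    (size,) = struct.unpack(">Q", header)
    buf = bytearray(size)
    view = memoryview(buf)
    received = 0
    while received < size:
        n = stream.readinto(view[received:])
        if not n:
            raise EOFError("stream ended before payload was complete")
        received += n
    return buf


# Usage: an in-memory stream stands in for the network connection.
payload = b"x" * 200_000
wire = io.BytesIO()
for frame in iter_frames(payload):
    wire.write(frame)
wire.seek(0)
received_payload = read_payload(wire)
```

The receiver still needs one buffer as large as the object, so this only removes the extra copy. It does not solve the fragmentation problem by itself.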
Most of the ZEO protocol involves asynchronous methods. The store() method, for example, is asynchronous. This ends up complicating the storage API, because the storage must return a new serial number to the client. The asynchrony is important, because it improves performance. An interesting middle ground would be to extend the RPC layer with support for promises.
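A rough sketch of what promises in the RPC layer might look like: every call returns immediately with an object that is filled in when the reply arrives, so store() can stay asynchronous and still hand back a serial number. PromiseRPCClient and its methods are hypothetical names, not part of ZEO. concurrent.futures.Future from the modern standard library stands in for a promise, although it is normally created by an executor rather than by hand.

```python
from concurrent.futures import Future


class PromiseRPCClient:
    """Toy RPC client: each call returns a promise for the eventual reply."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}  # message id -> unresolved promise

    def call(self, method, *args):
        """Register a call and return (message id, promise) without blocking."""
        msgid = self._next_id
        self._next_id += 1
        promise = Future()
        self._pending[msgid] = promise
        # A real client would serialize (msgid, method, args) and send it here.
        return msgid, promise

    def handle_reply(self, msgid, result):
        """Invoked by the network loop when the server's reply arrives."""
        self._pending.pop(msgid).set_result(result)


# Usage: store() returns at once; the serial number shows up later.
client = PromiseRPCClient()
msgid, serial_promise = client.call("store", "oid-1", b"object data")
pending_before_reply = not serial_promise.done()
client.handle_reply(msgid, "serial-0001")  # simulated server reply
new_serial = serial_promise.result()
```

The caller only blocks at the point where it actually needs the serial number, which is the middle ground between fully synchronous and fire-and-forget calls.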
Given the existing problems and the paucity of time available to work on it, a better strategy is probably to use an existing solution for the communications layer. The two most promising candidates are probably Twisted's perspective broker and omniORB for Python.
The client cache is critical for good ZEO performance. The cache keeps copies of recently used objects on disk, so that subsequent accesses don't require a trip to the server.
Guido developed a trace-driven simulator for the ZEO cache that led to a number of improvements. The current cache keeps two data files. Every time an object is loaded, it is saved in the current data file. When the current data file fills up, the cache deletes the other data file and starts with a new, empty current file. It's a very simple scheme (perhaps too simple), but it works well enough in practice if the cache is very large. Guido's simulator also showed that a buddy cache would work well.
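The two-file scheme is easy to model in memory. The sketch below uses two dicts in place of the data files and is only an illustration of the rotation logic described above. TwoFileCache and file_limit are made-up names. Copying an object forward when it is found in the older file is my reading of "every time an object is loaded, it is saved in the current data file."

```python
class TwoFileCache:
    """In-memory model of a cache that rotates between two data files."""

    def __init__(self, file_limit):
        self.file_limit = file_limit  # maximum bytes per "file" (assumed knob)
        self.current = {}             # stands in for the current data file
        self.old = {}                 # stands in for the other data file
        self.current_size = 0

    def store(self, oid, data):
        """Append a record to the current file, rotating first if it is full."""
        if self.current_size + len(data) > self.file_limit:
            # The previous "other" file is dropped, the full current file
            # takes its place, and a new empty current file is started.
            self.old = self.current
            self.current = {}
            self.current_size = 0
        # Records are only appended, so storing the same oid twice uses
        # space twice, as it would in an append-only file.
        self.current[oid] = data
        self.current_size += len(data)

    def load(self, oid):
        """Return cached data, or None on a miss."""
        if oid in self.current:
            return self.current[oid]
        if oid in self.old:
            data = self.old[oid]
            self.store(oid, data)  # copy forward so it survives the next rotation
            return data
        return None


# Usage: five-byte objects with a ten-byte limit per file.
cache = TwoFileCache(file_limit=10)
for oid in "abc":
    cache.store(oid, b"12345")
a_in_old_file = "a" in cache.old and "c" in cache.current
cache.store("d", b"12345")
cache.store("e", b"12345")  # second rotation discards "a" and "b"
```

Objects that are not touched between two rotations simply disappear, which is why the scheme only behaves well when the files are large.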
Prefetching could improve the performance of the client cache. It's also possible to use a page-based cache system to achieve some benefits over the object-based cache we've got now.
If we can anticipate client access patterns, we can fetch objects before they are needed. The prefetched objects could be delivered in the background while the application is running. Or the server could send a group of objects in response to a request for one.
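As a sketch of the second option, where the server answers a request for one object with a group of objects, consider the toy model below. Everything in it is hypothetical: PrefetchingServer, PrefetchingClient, load_group and the related hint table are illustration names, and the hint table simply assumes that someone already knows which objects tend to be accessed together.

```python
class PrefetchingServer:
    """Toy server that returns the requested object plus likely companions."""

    def __init__(self, objects, related):
        self.objects = objects  # oid -> data
        self.related = related  # oid -> oids likely to be needed next (assumed hints)

    def load_group(self, oid):
        """Return a dict with the requested object and any hinted companions."""
        group = {oid: self.objects[oid]}
        for other in self.related.get(oid, ()):
            if other in self.objects:
                group[other] = self.objects[other]
        return group


class PrefetchingClient:
    """Toy client that caches every object the server sends back."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.round_trips = 0

    def load(self, oid):
        if oid not in self.cache:
            self.round_trips += 1
            self.cache.update(self.server.load_group(oid))
        return self.cache[oid]


# Usage: loading the root also delivers its children in the same reply.
server = PrefetchingServer(
    objects={"root": b"r", "child1": b"c1", "child2": b"c2"},
    related={"root": ["child1", "child2"]},
)
client = PrefetchingClient(server)
client.load("root")
child_data = client.load("child1")  # served from the cache, no round trip
```

Whether this pays off depends entirely on how good the hints are; bad hints just waste bandwidth and cache space.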
One need that the Chandler project identified for its ZODB integration was the ability to exploit its server's bulk fetch API, for example by prefetching all the objects in a query's result set.
Mark Day's thesis suggests that a small amount of structural caching could be a good thing, where structural caching means that the structure of application objects is exploited to inform caching. We would need to revise ZEO to allow clients to receive objects asynchronously.