Async client for PostgreSQL?

Laszlo Nagy gandalf at shopzeus.com
Sat Sep 1 15:28:34 EDT 2012


> Hi
>
> does running on tornado imply that you would not consider twisted 
> http://twistedmatrix.com ?
>
> If not, twisted has exactly this capability hiding long running 
> queries on whatever db's behind deferToThread().
All right, I was reading its documentation:

http://twistedmatrix.com/documents/10.1.0/api/twisted.internet.threads.deferToThread.html

It doesn't say much about it: "Run a function in a thread and 
return the result as a Deferred."

Run a function, but in what thread? Does it create a new thread for every 
invocation? In that case, I don't want to use it. My example case: 10% 
of 100 requests/second deal with a database. But that does not mean that 
one db-related request makes only a single db API call. They will 
almost always make more: start transaction, parse and open query, fetch 
with cursor, close query, open another query etc., then commit 
transaction. That is about 8 API calls for a quick fetch + update 
(usually under 100 msec, but it might be blocked by another transaction 
for a while...). So we are talking about at least 80 database API calls 
per second. It would be insane to start a new thread for each 
invocation. And wrapping these API calls into a single closure function 
is not useful either, because that function would not be able to safely 
access the state that is stored in the main thread, unless you protect 
it with locks. But that is the whole point of an async I/O server: to 
avoid slow locks, expensive threads and context switching.

Maybe deferToThread uses a thread pool? But the documentation doesn't 
say much about it. (Am I reading the wrong documentation?) BTW I could 
try a version that uses a thread pool.

It is sad, by the way. We have async I/O servers for Python that can be 
used for a large number of clients, but most external modules/extensions 
do not support their I/O loops, including the extension modules of the 
most popular databases. So yes, you can use Twisted or tornadoweb as 
long as you do not need to call *some* API functions that are blocking. 
(By *some* I mean: far fewer blocking than non-blocking ones, but still 
quite a few.) We also have synchronous Python servers, but we cannot get 
rid of the GIL, and Python threads are expensive and slow, so they 
cannot be used for a large number of clients. And finally, we have 
messaging services/IPC like zeromq. They are probably the most 
expensive, but they scale very well. But you need more money to operate 
the underlying hardware. I'm starting to think that I did not get a 
quick answer because my use case (100 clients) falls into the 
"heavyweight" category, and the solution is to invest more in the 
hardware. :-)

Thanks,

    Laszlo



