Psycopg2 pool clarification

Israel Brewster israel at ravnalaska.net
Thu Jun 8 12:47:59 EDT 2017


-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------




> On Jun 7, 2017, at 10:31 PM, dieter <dieter at handshake.de> wrote:
> 
> israel <israel at ravnalaska.net> writes:
>> On 2017-06-06 22:53, dieter wrote:
>> ...
>> As such, using psycopg2's pool is essentially
>> worthless for me (plenty of use for it, i'm sure, just not for me/my
>> use case).
> 
> Could you not simply adjust the value for the "min" parameter?
> If you want at least "n" open connections, then set "min" to "n".

Well, sure, if I didn't care about wasting resources (which, I guess many people don't). I could set "n" to some magic number that would always give "enough" connections, such that my application never has to open additional connections, then adjust that number every few months as usage changes. In fact, now that I know how the logic of the pool works, that's exactly what I'm doing until I am confident that my caching replacement is solid. 

Of course, in order to avoid having to open/close a bunch of connections during the times when it is most critical - that is, when the server is under heavy load - I have to set that number arbitrarily high. Furthermore, that means that much of the time many, if not most, of those connections would be idle. Each connection uses a certain amount of RAM on the server, not to mention using up limited connection slots, so now I've got to think about if my server is sized properly to be able to handle that load not just occasionally, but constantly - when reducing server load by reducing the frequency of connections being opened/closed was the goal in the first place. So all I've done is trade dynamic load for static load - increasing performance at the cost of resources, rather than more intelligently using the available resources. All-in-all, not the best solution, though it does work. Maybe if load was fairly constant it would make more sense though. So like I said *my* use case, which is a number of web apps with varying loads, loads that also vary from day-to-day and hour-to-hour.

On the other hand, a pool that caches connections using the logic I laid out in my original post would avoid the issue. Under heavy load, it could open additional connections as needed - a performance penalty for the first few users over the min threshold, but only the first few, rather than all the users over a certain threshold ("n"). Those connections would then remain available for the duration of the load, so it doesn't need to open/close numerous connections. Then, during periods of lighter load, the unused connections can drop off, freeing up server resources for other uses. A well-written pool could even do something like see that the available connection pool is running low, and open a few more connections in the background, thus completely avoiding the connection overhead on requests while never having more than a few "extra" connections at any given time. Even if you left of the expiration logic, it would still be an improvement, because while unused connections wouldn't drop, the "n" open connections could scale up dynamically until you have "enough" connections, without having to figure out and hard-code that "magic number" of open connections.

Why wouldn't I want something like that? It's not like its hard to code - took me about an hour and a half to get to a working prototype yesterday. Still need to write tests and add some polish, but it works. Perhaps, though, the common thought is just "throw more hardware at it and keep a lot of connections open at all time?" Maybe I was raised to conservatively, or the company I work for is too poor.... :-D

> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list




More information about the Python-list mailing list