My Object pool isnt working

Fri Apr 4 13:20:56 EST 2003

[sam]
> I need to pool some database connections so i thought I'd write a
> object pool. I thought I'd use a design I've used in Java before -
> having a locked and unlocked hashtable and keeping references of the
> connections in two states (i.e. allocated and unallocated). The
> problem is that the total number of connections always seems to rise.
> Also the total number of connections is less than the sum of the
> lengths of the hashtables (dictionaries) which shouldn't happen. I
> believe the problems are connected. As I said I've used this design
> before in Java and its been fine. The only difference I can think of
> (barring shoddy implementation, which I can't rule out) is that
> Hashtables in Java are thread safe, but I'm not sure if the built in
> dict type in python is.

It is.

> I've decided to use jonpy's dbpooling module
> (http://sourceforge.net/projects/jonpy) instead of writing my own now,
> but I'd still like to know whats going on. I have tested on python
> 1.52 and python 2.2 (both on win2k) and get the same result. All
> thoughts appriciated

"both on win2k" was the crucial clue:  you're indexing a dict by timestamp,
but time.time() on all flavors of Windows updates only about 18.2 times per
second:  it's extremely common on Windows for successive calls to
time.time() to return the same value.

>>> import time
>>> for i in range(10):
...     print time.time()
...
1049479681.84
1049479681.84
1049479681.84
1049479681.84
1049479681.84
1049479681.84
1049479681.84
1049479681.84
1049479681.84
1049479681.84
>>>

> Regards
>
> Sam Owen
>
> <code>
> from Threading import *

This is no module named Threading in Python (although there is a module
named "threading").  Is this the actual code you ran?  Do you have your own
module named Threading.py?

> import time
> import random
> import sys
>
> class Object:
>     def sayHello(self, caller):
>         print "hello", self, caller
>         time.sleep(int(random.random() * 5))

Note that it's not *necessary* to do the int() business here; Python's
time.sleep() can sleep for fractional seconds.

> class ObjectPool:
>     LOCKED= {}
>     UNLOCKED= {}
>     sema_connect = Semaphore(1)
>     sema_return = Semaphore(1)
>     expiretime = 60 #30 mins

The comment here doesn't make sense, of course.

>     totalConns=0
>     numAllocated = 0
>     numReturned = 0

The intent of all these assignments is unclear.  Do you intend that all
instances of ObjectPool share these values?  That would be surprising.  If
it's your intent that each instance of ObjectPool have its own set of these
variables, then these assignments should be in an ObjectPool __init__
method, a la

        def __init__(self):
            self.LOCKED = {}
            # etc

You create only one ObjectPool instance in this code, so the difference
isn't apparent.

>     def getConnection(self):
>         try:
>             self.sema_connect.acquire()
>             if len(self.UNLOCKED):

More idiomatic would be

              if self.UNLOCKED:

However, the "if" statement here doesn't appear to serve a purpose; if
UNLOCKED is empty, the following "for" won't enter the body (.keys() will
return an empty list).

>                 for ts in self.UNLOCKED.keys() :
>                     if ts + self.expiretime < time.time():
>                         print "deleting ..."
>                         self.totalConns = self.totalConns-1
>                         del self.UNLOCKED[ts]
>                     else:
>                         print "allocating ..."
>                         self.numAllocated = self.numAllocated+1
>                         o = self.UNLOCKED[ts]
>                         self.LOCKED[o]=1
>                         del self.UNLOCKED[ts]
>                         return o
>             print "creating ..."
>             self.numAllocated = self.numAllocated+1
>             self.totalConns = self.totalConns+1
>             o = Object()
>             self.LOCKED[o] = 1
>             return o
>         finally:
>             self.sema_connect.release()
>
>     def returnConnection(self, con):
>         try:
>             self.sema_return.acquire()
>             if self.LOCKED.has_key(con):
>                 print "returning"
>                 self.numReturned = self.numReturned +1
>                 self.UNLOCKED[time.time()]=con

This is where you're getting killed on Windows:  whenever time.time()
repeats a value t previously seen, you're simply losing the connection
object that *used* to be stored at UNLOCKED[t].  time.clock() has much finer
resolution on Windows than time.time() (>1MHz vs 18.2Hz), so you could use
time.clock() instead.  If you feel you need to do a portable timestamp-based
gimmick, you'll need to implement an object that passes out unique
timestamps.  Alternatively, you could use a dict mapping a timestamp to a
list of connection objects.  Simple hack:  replace the last line above with:

    t = time.time()
    # Break ties (2.2 syntax used here).
    while t in self.UNLOCKED:
        t += random.random() / 10.0  # no more than 0.1 second away
    self.UNLOCKED[t] = con

> ...