Newbie queue question

Jure Erznožnik jure.erznoznik at gmail.com
Thu Jun 18 17:25:26 EDT 2009


Thanks for the suggestions.
I've been looking at the source code of threading support objects and
I saw that non-blocking requests in queues use events, while blocking
requests just use InterlockedExchange.
So plain old put/get is much faster and I've managed to confirm this
today with further testing.

Sorry about the semicolon, I just can't seem to shake it with my Pascal
& C++ background :)

Currently, I've managed to get the code to this stage:

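    import threading
    import time
    import Queue
    # Assumption: Dbf comes from the dbfpy package (import not shown originally).
    from dbfpy.dbf import Dbf

    class mt(threading.Thread):

        # Class-level queue, shared by every instance of mt.
        q = Queue.Queue()
        def run(self):
            dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
            for i1 in xrange(len(dbf1)):
                self.q.put(dbf1[i1])
            dbf1.close()
            del dbf1
            # Sentinel: tells the consumer there are no more records.
            self.q.put(None)

    t = mt()
    t.start()
    time.sleep(22)  # let the reader finish before draining the queue
    rec = 1
    while rec is not None:
        rec = t.q.get()

    del t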

Note the time.sleep(22). It takes about 22 seconds to read the DBF
with the 200K records (71MB). It's entirely in cache, yes.

So, if I put this sleep in there, the whole procedure finishes in 22
seconds with 100% CPU (core) usage. That is almost as fast as the
single-threaded procedure, so there is very little overhead.
When I remove the sleep, the procedure finishes in 30 seconds with
~80% CPU (core) usage.
So the threading overhead only happens when I actually cause thread
interaction.
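
For anyone who wants to reproduce the comparison without the DBF file,
here is a minimal sketch. It is not the code measured above: the records
are plain integers, the names producer and run_once are made up for the
illustration, and it is written for Python 3 (where the Queue module is
named queue). Joining the producer before reading stands in for the
time.sleep(22) case; skipping the join makes both threads use the queue
at the same time.

```python
import queue      # this module is named Queue in Python 2
import threading
import time

SENTINEL = None   # marks the end of the data, as in the code above


def producer(q, n):
    """Put n dummy records on the queue, then the sentinel."""
    for i in range(n):
        q.put(i)
    q.put(SENTINEL)


def run_once(n, wait_for_producer):
    """Run one producer/consumer pass; return (seconds, records received)."""
    q = queue.Queue()
    t = threading.Thread(target=producer, args=(q, n))
    start = time.time()
    t.start()
    if wait_for_producer:
        # Stand-in for time.sleep(22): the producer finishes before we read.
        t.join()
    count = 0
    while q.get() is not SENTINEL:
        count += 1
    t.join()
    return time.time() - start, count


if __name__ == "__main__":
    for wait in (True, False):
        elapsed, count = run_once(200000, wait)
        print("wait_for_producer=%s: %d records in %.2f s"
              % (wait, count, elapsed))
```

The absolute numbers will differ from the DBF case, since no file
parsing is involved; only the difference between the two runs is of
interest.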

This never happened to me before. Usually (C, Pascal) there was some
threading overhead, but I could always measure it in tenths of a
percent. In this case it's 50% and I'm pretty sure InterlockedExchange
is the fastest thing there can be.

My example currently really is a dummy one. It doesn't do much, only
the reading thread is implemented, but that will change with time.
Reading the data source is one task, I will proceed with calculations
and with a rendering engine, both of which will be pretty CPU
intensive as well.

I'd like to at least make the reading part behave like I want it to
before I proceed. It's clear to me I don't understand Python's
threading concepts yet.

I'd still appreciate further advice on what to do to make this sample
work with less overhead.
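
One thing that might be worth trying, sketched here under the unverified
assumption that the cost is per queue operation rather than per record:
put records on the queue in batches, so the two threads interact far less
often. The names read_in_batches, consume and BATCH_SIZE are hypothetical,
the batch size is a guess to be tuned by measurement, and the code is
written for Python 3 (queue instead of Queue).

```python
import queue      # this module is named Queue in Python 2
import threading

BATCH_SIZE = 1000  # hypothetical value; tune by measurement


def read_in_batches(records, q, batch_size=BATCH_SIZE):
    """Producer: put lists of records on q, then None as the sentinel."""
    batch = []
    for rec in records:
        batch.append(rec)
        if len(batch) >= batch_size:
            q.put(batch)
            batch = []
    if batch:
        q.put(batch)  # final, possibly short, batch
    q.put(None)


def consume(q):
    """Consumer: yield individual records until the sentinel arrives."""
    while True:
        batch = q.get()
        if batch is None:
            return
        for rec in batch:
            yield rec


if __name__ == "__main__":
    q = queue.Queue()
    records = range(200000)  # stand-in for the DBF records
    t = threading.Thread(target=read_in_batches, args=(records, q))
    t.start()
    total = sum(1 for _ in consume(q))
    t.join()
    print(total)
```

With 200K records and a batch size of 1000 this does about 200 put/get
pairs instead of 200,000. Whether that actually recovers the lost time
would have to be measured.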


