threading - race condition?
skunkwerk
skunkwerk at gmail.com
Mon May 12 15:29:16 EDT 2008
On May 11, 1:55 pm, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:
> On Sun, 11 May 2008 09:16:25 -0700 (PDT),skunkwerk
> <skunkw... at gmail.com> declaimed the following in comp.lang.python:
>
>
>
> > the only issue i have now is that it takes a long time for 100 threads
> > to initialize that connection (>5 minutes) - and as i'm doing this on
> > a webserver any time i update the code i have to restart all those
> > threads, which i'm doing right now in a for loop. is there any way I
> > can keep the thread stuff separate from the rest of the code for this
> > file, yet allow access? It wouldn't help having a .pyc or using
> > psyco, correct, as the time is being spent in the runtime? something
> > along the lines of 'start a new thread every minute until you get to a
> > 100' without blocking the execution of the rest of the code in that
> > file? or maybe any time i need to do a search, start a new thread if
> > the #threads is <100?
>
> Is this running as part of the server process, or as a client
> accessing the server?
>
> Alternative question: Have you tried measuring the performance using
> /fewer/ threads... 25 or less? I believe I'd mentioned prior that you
> seem to have a lot of overhead code for what may be a short query.
>
> If the .get_item() code is doing a full sequence of: connect to
> database; format&submit query; fetch results; disconnect from
> database... I'd recommend putting the connect/disconnect outside of the
> thread while loop (though you may then need to put sentinel values into
> the feed queue -- one per thread -- so they can cleanly exit and
> disconnect rather than relying on daemonization for exit).
>
> thread:
>     dbcon = ...
>     while True:
>         query = Q.get()
>         if query == SENTINEL: break
>         result = get_item(dbcon, query)
>         ...
>     dbcon.close()
>
> Third alternative: Find some way to combine the database queries.
> Rather than 100 threads each doing a single lookup (from your code, it
> appears that only 1 result is expected per search term), run 10 threads
> each looking up 10 items at once...
>
> thread:
>     dbcon = ...
>     terminate = False
>     while not terminate:
>         terms = []
>         while len(terms) < 10:
>             try:
>                 # block for the first term, then take whatever is ready
>                 query = Q.get() if not terms else Q.get_nowait()
>             except queue.Empty:    # Queue.Empty on Python 2
>                 break
>             if query == SENTINEL:
>                 terminate = True
>                 break
>             terms.append(query)
>         if terms:
>             results = get_item(dbcon, terms)
>             #however you are returning items; match the query term to the
>             #key item in the list of returned data?
>     dbcon.close()
>
> where the final select statement looks something like:
>
> SQL = """select key, title, scraped from ***
> where key in ( %s )""" % ", ".join("?" for x in terms)
> #assumes database adapter uses ? for placeholder
> dbcur.execute(SQL, terms)
> --
> Wulfraed Dennis Lee Bieber KD6MOG
> wlfr... at ix.netcom.com wulfr... at bestiaria.com
> HTTP://wlfraed.home.netcom.com/
> (Bestiaria Support Staff: web-a... at bestiaria.com)
> HTTP://www.bestiaria.com/
thanks again Dennis,
i chose 100 threads so i could do 10 simultaneous searches (where
each search contains 10 terms, using 10 threads). the .get_item()
code is not doing the database connection; rather, the connection is
opened during the initialization of each thread. so basically once a
thread starts, the database connection is persistent and .get_item()
queries are very fast. this is running as a server process (using
django).
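
on the slow startup: if the connection is opened in a Thread
subclass's __init__, that code runs in the thread that creates the
object, so the for loop waits for each connection in turn. below is a
minimal sketch of the alternative, where each worker connects inside
its own thread body and exits on a sentinel as suggested above. it is
only an illustration under assumptions: connect and get_item are
hypothetical stand-ins for the real database setup and lookup, and
start_workers / stop_workers are names made up for this example.

```python
import queue
import threading

SENTINEL = object()  # unique marker telling a worker to exit


def worker(connect, get_item, tasks, results):
    # The slow connect() runs here, inside the worker thread,
    # so the code that starts the threads is not blocked by it.
    dbcon = connect()  # hypothetical connection factory
    try:
        while True:
            query = tasks.get()
            if query is SENTINEL:
                break
            # get_item is a hypothetical lookup using the open connection
            results.put((query, get_item(dbcon, query)))
    finally:
        dbcon.close()


def start_workers(connect, get_item, count):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker,
                         args=(connect, get_item, tasks, results))
        for _ in range(count)
    ]
    for t in threads:
        t.start()  # returns at once; connecting happens in the background
    return tasks, results, threads


def stop_workers(tasks, threads):
    for _ in threads:
        tasks.put(SENTINEL)  # one sentinel per thread
    for t in threads:
        t.join()
```

queries put on the queue before the workers have finished connecting
just wait there until a worker is ready, so the rest of the file can
carry on in the meantime.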
cheers