Use a Thread to reload a Module?

Carl Banks pavlovevidence at gmail.com
Fri Dec 22 23:02:31 EST 2006


Gregory Piñero wrote:
> Hi Python Experts,
>
> I hope I can explain this right.  I'll try.
>
> Background:
> I have a module that I leave running in a server role.  It has a
> module which has data in it that can change.  So every nth time a
> function in the server gets called, I want to reload the module so it
> has the freshest data.  But there's a lot of data so it takes 5-10
> seconds to do a reload.
>
> My question is:
> Would I be able to launch a separate thread to reload the module while
> the server does its work?  Hopefully it would be using the old module
> data right up until the thread was finished reloading.
>
> Thanks in advance,
>
> Greg
>
> Here's some pseudo code that might illustrate what I'm talking about:
>
> import lotsa_data
>
> def serve():
>     reload(lotsa_data) #can this launch a thread so "do_stuff" runs right away?
>     do_stuff(lotsa_data.data)
>
> while 1:
>     listen_for_requests()

Using a thread for this purpose is no problem.  Using a module: yep,
that's a problem.  (I'd say using a module in this way, to update data,
is very far from best practice, but its convenience justifies simple
uses.  You are going beyond simple now, though.)

Not knowing more about your program, I'd say the simplest way is:

1. exec, don't reload, your data file (with the standard warning that
exec should only be used on carefully constructed code, or to
deliberately give the user the power to input Python code--of course
the same warning applies when reloading a dynamically generated
module).

2. Store the new data somewhere (such as a Queue) waiting for a good
time to update.

3. At a convenient time, overwrite the old data in the module.
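
As a minimal sketch of these three steps in isolation (written for
Python 3, where the module is named queue and exec is a function; the
names DATA_SOURCE, pending_updates and apply_pending_update are
hypothetical, and a types.ModuleType object stands in for the real
lotsa_data module), something like this shows that the old data stays
in use until the swap:

```python
import queue
import types

# Hypothetical stand-in for the lotsa_data module, holding "old" data.
lotsa_data = types.ModuleType("lotsa_data")
lotsa_data.data = {"answer": "old"}

# Hypothetical contents of the data file; a real server would read it from disk.
DATA_SOURCE = "data = {'answer': 'new'}\nversion = 2\n"

pending_updates = queue.Queue()

def load_data(source):
    """Steps 1 and 2: exec the data source into a fresh dict and queue it."""
    namespace = {}
    exec(source, namespace)              # only for trusted, carefully constructed code
    namespace.pop("__builtins__", None)  # exec adds this entry; it is not data
    pending_updates.put(namespace)

def apply_pending_update():
    """Step 3: copy queued names onto the module. Returns True if it updated."""
    try:
        namespace = pending_updates.get_nowait()
    except queue.Empty:
        return False
    for key, value in namespace.items():
        setattr(lotsa_data, key, value)
    return True

load_data(DATA_SOURCE)
seen_before_swap = lotsa_data.data["answer"]  # module is untouched so far
updated = apply_pending_update()
seen_after_swap = lotsa_data.data["answer"]
```

Because load_data only touches its own dict and the queue, it can run
in another thread while the main thread keeps reading lotsa_data.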

I'm going to assume that your server has heretofore been
single-threaded; therefore you don't need locks or queues or semaphores
in your main code.  Here, then, is a very improvable example to
consider.  Notice that the lotsa_data module is empty.  Instead, you
call load_data() to exec the file where the data really is, and it puts
the loaded data into a queue.  The next time the main loop waits for a
new request, it checks to see if there are any data updates in the
queue, and updates the data in the lotsa_data module if so.

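import Queue
import threading
import lotsa_data ## empty!

data_update_queue = Queue.Queue()
request_count = 0
n = 100 # reload on every nth request; example value, pick your own

def serve():
    global request_count
    request_count += 1
    if request_count % n == 0:
        # reload in the background; the old data stays in use meanwhile
        threading.Thread(target=load_data).start()
    do_stuff(lotsa_data.data)

def load_data():
    d = {}
    # run the data file, collecting the names it defines in d
    execfile("/path/to/data/file", d)
    data_update_queue.put(d)

load_data() # run once in main thread to load data initially
while True:
    try:
        d = data_update_queue.get_nowait()
    except Queue.Empty:
        pass # no update waiting; keep the current data
    else:
        for key, value in d.items():
            if key != "__builtins__": # added by exec; not part of the data
                setattr(lotsa_data, key, value)
    listen_for_requests()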


There is much room for improvement.  For example, can requests come in
fast enough to spawn another load_data before the first has finished?  You
should consider trying to acquire a threading.Lock in load_data and
waiting and/or returning immediately if you can't.  Other things can go
wrong, too.  Using threads requires care.  But hopefully this'll get
you started.
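
To illustrate the "return immediately if you can't" variant, here is a
minimal sketch (again Python 3; reload_lock, slow_load, completed_loads
and the two Event objects are hypothetical names used only to make the
overlap reproducible, with slow_load standing in for the slow exec of
the data file):

```python
import threading

reload_lock = threading.Lock()
completed_loads = []          # hypothetical record of finished loads

started = threading.Event()   # set once a load has begun
release = threading.Event()   # set to let the pretend load finish

def slow_load():
    """Hypothetical stand-in for the 5-10 second exec of the data file."""
    started.set()
    release.wait()
    return {"data": "fresh"}

def load_data():
    """Run a load unless one is already in progress. Returns True if it ran."""
    if not reload_lock.acquire(blocking=False):
        return False          # another load is running; skip this one
    try:
        completed_loads.append(slow_load())
        return True
    finally:
        reload_lock.release()

results = []
first = threading.Thread(target=lambda: results.append(load_data()))
first.start()
started.wait()                # the first load now holds the lock
second_ran = load_data()      # overlapping attempt gives up at once
release.set()                 # let the first load finish
first.join()
third_ran = load_data()       # lock is free again, so this one runs
```

To wait instead of skipping, call reload_lock.acquire() with no
arguments; which behavior is right depends on how stale the data is
allowed to get.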


Carl Banks



