multiprocessing & more

Andrea Crotti andrea.crotti.0 at gmail.com
Sun Feb 13 12:34:00 EST 2011


Hi everyone, I have a few questions about my implementation, which doesn't make me totally happy.

Suppose I have a very long-running process which logs something during
its execution, and the logs end up in n different files in the same
directory.

Now in the meanwhile I want to be able to do realtime analysis in
python, so this is what I've done (simplifying):


import subprocess
# conf, network, events, res_dir and TEST_PAD come from the real code (omitted here)

def main():
    from multiprocessing import Value, Process
    is_over = Value('h', 0)
    Process(target=run, args=(conf, is_over)).start()
    # should also pass the directory with the results
    Process(target=analyze,
            args=(is_over, network, events, res_dir)).start()

def run(conf, is_over):
    sim = subprocess.Popen(TEST_PAD, shell=True, stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
    # communicate() already waits for the process to finish
    out, err = sim.communicate()
    ret = sim.returncode
    # at this point the simulation is over, independently of the result
    print "simulation over, can also stop the others"
    is_over.value = 1

def analyze(is_over, network, events, res_dir):
    ...

First of all, does it make sense to use multiprocessing and a short
Value as a boolean flag to check whether the simulation is over or not?
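
I also wondered whether a multiprocessing.Event would express the
"simulation is over" flag more directly; a minimal sketch of what I
mean, with placeholder bodies instead of the real run/analyze:

    from multiprocessing import Event, Process
    import time

    def run(conf, is_over):
        time.sleep(2)      # stand-in for the real simulation
        is_over.set()      # signal that the simulation is over

    def analyze(is_over):
        while not is_over.is_set():
            # ... poll the log files here ...
            time.sleep(0.2)

    if __name__ == '__main__':
        conf = None        # placeholder for the real configuration
        is_over = Event()
        Process(target=run, args=(conf, is_over)).start()
        Process(target=analyze, args=(is_over,)).start()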

Then the other problem is that I need to read many files, and the idea
was a sort of "tail -f", but on all of them at the same time.  Since I
have to keep track of the status for each of them I ended up with
something like this:

class LogFileReader(object):
    def __init__(self, log_file):
        self.log_file = log_file
        self.pos = 0

    def get_line(self):
        # reopen the file, resume from the last offset and remember
        # where we stopped, so the next call only sees new lines
        with open(self.log_file) as src:
            src.seek(self.pos)
            lines = src.readlines()
            self.pos = src.tell()
        return lines
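
The idea is that each call only returns the lines appended since the
previous call, roughly like this (the file name is made up):

    reader = LogFileReader("node0.out")   # made-up file name
    first_batch = reader.get_line()       # every line written so far
    # ... the simulation appends more output to node0.out ...
    second_batch = reader.get_line()      # only the newly appended lines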
                          
I'm not really sure this is the best way either; then in analyze()
I have a dictionary that keeps track of all the "readers":

    log_readers = {}
    for out in glob(res_dir + "/*.out"):
        node = FILE_OUT.match(out).group(1)
        nodeid = hw_to_nodeid(node)
        log_readers[nodeid] = LogFileReader(out)

Maybe having more separate processes would be cleaner, but since I
have to merge the data it might become a mess...
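
For the record, the single loop I have in mind is roughly this (just a
sketch, assuming is_over is the shared Value and log_readers the
dictionary built above, with the real parsing/merging left out):

    from time import sleep

    def analyze(is_over, log_readers):
        # poll every reader until the simulation process flips the flag
        while not is_over.value:
            for nodeid, reader in log_readers.items():
                for line in reader.get_line():
                    pass      # parse the line and merge it into the results
            sleep(0.2)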


As a last thing, to know when to start analyzing the data I thought about this:

        while len(listdir(res_dir)) < len(network):
            sleep(0.2)

which in theory should be correct: once there are as many files as
there are nodes in the network, everything should have been written.
BUT about one time in five I get an error telling me that one file
doesn't exist.

That means that for listdir the file is already there, but trying to
access it gives an error; how is that possible?
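
The only idea I had so far is to retry the open when it fails, along
these lines (an untested sketch, open_when_ready is just a made-up
helper name):

    import errno
    from time import sleep

    def open_when_ready(path, retries=5, delay=0.1):
        # listdir already reported the file, but opening it can still
        # fail, so retry a few times before giving up
        for _ in range(retries):
            try:
                return open(path)
            except IOError as exc:
                if exc.errno != errno.ENOENT:
                    raise
                sleep(delay)
        return open(path)      # final attempt, let the error propagate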

Thanks a lot, and sorry for the long mail.



