Persistent variable in subprocess using multiprocessing?

mheavner miheavner at gmail.com
Thu Jul 16 13:37:45 EDT 2009


I realize that a Queue would be the best way of doing this. However,
that involves transferring the huge amount of data for each call. My
hope was to transfer it once and have it remain in memory in the
subprocess across run() calls.
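
For reference, a minimal sketch of the "load once, keep in memory" idea, under the assumption that the large data can be loaded inside the subprocess itself, so that the Queue only carries the small per-iteration work items. It is written for Python 3, and load_big_data, worker and STOP are hypothetical names used for illustration, not part of CudaProcess:

```python
import multiprocessing as mp
import os

STOP = None  # sentinel value telling the worker to exit its loop


def load_big_data():
    """Hypothetical stand-in for the expensive load; returns a lookup table."""
    return {n: n * n for n in range(1000)}


def worker(tasks, results):
    """Load the data once, then serve work items until STOP arrives."""
    data = load_big_data()  # paid once per process, not once per work item
    for item in iter(tasks.get, STOP):
        results.put((os.getpid(), item, data[item]))


def main():
    tasks, results = mp.Queue(), mp.Queue()
    proc = mp.Process(target=worker, args=(tasks, results))
    proc.start()  # started exactly once, outside the main loop

    for iteration in range(3):
        tasks.put(iteration + 2)  # only the small work item crosses the Queue
        pid, item, value = results.get()
        print("iteration %d: pid %d mapped %d to %d" % (iteration, pid, item, value))

    tasks.put(STOP)  # ask the worker to finish
    proc.join()


if __name__ == "__main__":
    main()
```

The pid printed on every iteration should be the same, since one long-lived process handles all the work items and keeps its data between them.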

On Jul 16, 1:18 pm, Piet van Oostrum <p... at cs.uu.nl> wrote:
> >>>>> mheavner <miheav... at gmail.com> (m) wrote:
> >m> 'The process' refers to the subprocess. I could do as you say, load
> >m> the data structure each time, but the problem is that this takes a
> >m> considerable amount of time compared to the actual computation
> >m> with the data it contains. I'm using these processes within a loop as
> >m> follows:
> >m>          # Don't recreate processes or Queues
> >m>          pop1 = Queue()
> >m>          pop2 = Queue()
> >m>          pop_out = Queue()
> >m>          p1 = CudaProcess(0, args=(costf,pop1,pop_out))
> >m>          p2 = CudaProcess(1, args=(costf,pop2,pop_out))
> >m>          # Main loop
> >m>          for i in range(maxiter):
> >m>                  print 'ITERATION: '+str(i)
> >m>                  if log is not None:
> >m>                          l = open(log,'a')
> >m>                          l.write('Iteration: '+str(i)+'\n')
> >m>                          l.close()
> >m>                  # Split population in two
> >m>                  pop1.putmany(pop[0:len(pop)/2])
> >m>                  pop2.putmany(pop[len(pop)/2:len(pop)])
> >m>                  # Start two processes
> >m>                  if not p1.isAlive():
> >m>                          p1.start()
> >m>                          print 'started %s'%str(p1.getPid())
> >m>                  else:
> >m>                          p1.run()
>
> That won't work. p1.run() will execute the run method in the master
> process, not in the subprocess. And even if it did, your code would have a
> race condition: between the p1.isAlive() call (which must be is_alive, btw)
> and the p1.run() call, the process can have stopped.
>
> The proper way to do it is to put the work in a Queue and let the processes
> get work out of the Queue. The data structure will then remain in the
> process.
>
> >m>                  if not p2.isAlive():
> >m>                          p2.start()
> >m>                          print 'started %s'%str(p2.getPid())
> >m>                  else:
> >m>                          p2.run()
> >m>                  .
> >m>                  .
> >m>                  .
> >m> So I'd like to load that data into memory once and keep it there as long
> >m> as the process is alive (ideally when the subprocess is created,
> >m> storing some sort of pointer to it), rather than loading it each time
> >m> run is called for a process within the loop. Could be my CudaProcess
> >m> class - I'll check out what Diez suggested and post back.
>
> --
> Piet van Oostrum <p... at cs.uu.nl>
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p... at vanoostrum.org
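
The point about run() executing in the master process can be checked with a small experiment. This is only a sketch for Python 3, and PidReporter is a hypothetical Process subclass made up for the demonstration (it is not the CudaProcess class from the thread). Its run() reports the pid of whichever process executed it:

```python
import multiprocessing as mp
import os


class PidReporter(mp.Process):
    """Hypothetical subclass whose run() reports which process executed it."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def run(self):
        self.out.put(os.getpid())


def main():
    out = mp.Queue()

    direct = PidReporter(out)
    direct.run()  # plain method call: executes in the calling process
    pid_from_run = out.get()

    child = PidReporter(out)
    child.start()  # creates a subprocess, which then calls run() itself
    pid_from_start = out.get()
    child.join()

    print("parent pid:        %d" % os.getpid())
    print("pid seen by run(): %d" % pid_from_run)
    print("pid after start(): %d" % pid_from_start)


if __name__ == "__main__":
    main()
```

The first two numbers should match and the third should differ, which is why calling run() again on a live process does not send any work to that process.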



