multiprocessing and dictionaries

Piet van Oostrum piet at cs.uu.nl
Mon Jul 13 15:12:18 EDT 2009


>>>>> Bjorn Meyer <bjorn.m.meyer at gmail.com> (BM) wrote:

>BM> Here is what I have been using as a test.
>BM> This pretty much mimics what I am trying to do.
>BM> I put both threading and multiprocessing in the example which shows
>BM> the output that I am looking for.

>BM> #!/usr/bin/env python

>BM> import threading
>BM> from multiprocessing import Manager, Process

>BM> name = ('test1','test2','test3')
>BM> data1 = ('dat1','dat2','dat3')
>BM> data2 = ('datA','datB','datC')

[snip]

>BM> def multiprocess_test(name,data1,data2, mydict):
>BM>   for nam in name:
>BM>     for num in range(0,3):
>BM>       mydict.setdefault(nam, []).append(data1[num])
>BM>       mydict.setdefault(nam, []).append(data2[num])
>BM>   print 'Multiprocess test dic:',mydict

I guess what's happening is this:

mydict.setdefault(nam, []) returns a list, initially an empty list ([]).
This list gets appended to. However, what you get back from the manager's
dict proxy is a copy of that list, local to the process running
multiprocess_test, so the append is not reflected in the original list
inside the manager. Therefore all your updates get lost. You will have to
do operations directly on the dictionary itself, not on any intermediary
objects. Of course with threading the situation is different, as all
operations act on the same objects within a single process.

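This works, because each += ends with an assignment to mydict[nam], which
sends the new list back to the manager: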

def multiprocess_test(name,data1,data2, mydict):
  print name, data1, data2
  for nam in name:
    for num in range(0,3):
      mydict.setdefault(nam, [])
      mydict[nam] += [data1[num]]
      mydict[nam] += [data2[num]]
  print 'Multiprocess test dic:',mydict
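
To see the difference in isolation, here is a minimal single-process
sketch written for Python 3 (the code above is Python 2). The names
demo_proxy_update and _context are made up for this illustration, and
the preference for the "fork" start method is only an assumption to keep
the sketch short; it is not something the original test relies on.

```python
import multiprocessing as mp


def _context():
    # Assumption for this sketch: prefer "fork" where it is available,
    # otherwise fall back to the platform default start method.
    methods = mp.get_all_start_methods()
    return mp.get_context("fork") if "fork" in methods else mp.get_context()


def demo_proxy_update():
    """Return (lost, kept): the stored list after each style of update."""
    with _context().Manager() as manager:
        mydict = manager.dict()

        # setdefault() hands back a copy of the list held by the manager,
        # so this append only changes the local copy.
        mydict.setdefault("test1", []).append("dat1")
        lost = mydict["test1"]

        # Fetch a copy, extend it, then assign it back through the proxy.
        mydict["test1"] += ["dat1"]
        kept = mydict["test1"]
    return lost, kept


if __name__ == "__main__":
    lost, kept = demo_proxy_update()
    print("after append on the returned list:", lost)
    print("after assigning back to the dict: ", kept)
```

The first list stays empty, the second one contains 'dat1'.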

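If more than one process operates on the dictionary simultaneously,
beware of race conditions: mydict[nam] += [...] is a read followed by a
separate write, so two processes can overwrite each other's updates
unless the whole update is guarded by a lock.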
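
One possible way to guard the update, sketched for Python 3 with a lock
obtained from the same manager. The helper names (add_items, run_workers,
_context), the key 'test1' and the worker counts are all hypothetical,
and preferring the "fork" start method is again just an assumption of
this sketch.

```python
import multiprocessing as mp


def _context():
    # Assumption for this sketch: prefer "fork" where it is available,
    # otherwise fall back to the platform default start method.
    methods = mp.get_all_start_methods()
    return mp.get_context("fork") if "fork" in methods else mp.get_context()


def add_items(mydict, lock, key, items):
    for item in items:
        # Hold the lock across the whole read-modify-write, so no other
        # process can store a stale copy of the list in between.
        with lock:
            mydict[key] = mydict.get(key, []) + [item]


def run_workers(n_workers=4, n_items=25):
    """Start n_workers processes that each add n_items entries to one key."""
    ctx = _context()
    with ctx.Manager() as manager:
        mydict = manager.dict()
        lock = manager.Lock()
        workers = [
            ctx.Process(
                target=add_items,
                args=(mydict, lock, "test1",
                      [(w, i) for i in range(n_items)]),
            )
            for w in range(n_workers)
        ]
        for p in workers:
            p.start()
        for p in workers:
            p.join()
        # Copy the result out before the manager shuts down.
        return list(mydict["test1"])


if __name__ == "__main__":
    result = run_workers()
    print(len(result), "items stored")
```

With the lock in place every item from every worker should end up in the
list; without it, some items can go missing depending on timing.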
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


