out of memory with processing module

Brian knairb01 at yahoo.com
Mon Apr 20 12:32:17 EDT 2009


On Apr 20, 9:18 am, alessiogiovanni.bar... at gmail.com wrote:
> On 20 Apr, 17:03, Brian <knair... at yahoo.com> wrote:
>
>
>
> > I'm using the third-party "processing" module in Python 2.5, which
> > became the "multiprocessing" module in Python 2.6, to speed up the
> > execution of a computation that takes over a week to run. The code in
> > question may or may not be relevant, but here it is:
>
> >             q1, q2 = processing.Queue(), processing.Queue()
> >             p1 = processing.Process(target=_findMaxMatch,
> >                                     args=(reciprocal, user,
> >                                           clusters[1:(numClusters - 1)/2],
> >                                           questions, copy.copy(maxMatch), q1))
> >             p2 = processing.Process(target=_findMaxMatch,
> >                                     args=(reciprocal, user,
> >                                           clusters[(numClusters - 1)/2:],
> >                                           questions, copy.copy(maxMatch), q2))
> >             p1.start()
> >             p2.start()
> >             maxMatch1 = q1.get()[0]
> >             maxMatch2 = q2.get()[0]
> >             p1.join()
> >             p2.join()
> >             if maxMatch1[1] > maxMatch2[1]:
> >                 maxMatch = maxMatch1
> >             else:
> >                 maxMatch = maxMatch2
>
> > This code just splits the calculation of the cluster that best
> > matches 'user' into two for loops, each running in its own process,
> > instead of one. (It's not important what the cluster is.)
>
> > The error I get is:
>
> > [21661.903889] Out of memory: kill process 14888 (python) score 610654 or a child
> > [21661.903930] Killed process 14888 (python)
> > Traceback (most recent call last):
> > ...etc. etc. ...
>
> > Running this from tty1 rather than under GNOME on my Ubuntu Hardy
> > system let the execution get a little further.
>
> > The error was surprising because with just 1 GB of memory and a single
> > for loop I never ran into it, but with 5 GB and two processes I do. I
> > believe that in the 1 GB case there was just a lot of painfully slow
> > swapping going on that allowed the run to continue; 'processing'
> > appears to throw its hands up immediately instead.
>
> > Why does the program fail with 'processing' but not without it? Do you
> > have any ideas for resolving the problem? Thanks for your help.
>
> If your program crashes with more than one process, maybe you aren't
> handling the Queue objects properly? If you can, post the code of
> _findMaxMatch.


Thanks for your interest. Here's _findMaxMatch:

def _findMaxMatch(reciprocal, user, clusters, sources, maxMatch, queue):
    # Scan the given slice of clusters, tracking the best
    # [cluster number, match score] pair seen so far.
    for clusternumminusone, cluster in enumerate(clusters):
        clusterFirstData, clusterSecondData = cluster.getData(sources)
        aMatch = gum.calculateMatchGivenData(user.data, None, None, None,
                                             user2data=clusterSecondData)[2]
        if reciprocal:
            # Average the forward and reverse match scores.
            maxMatchB = gum.calculateMatchGivenData(clusterFirstData, None,
                                                    None, None,
                                                    user2data=user.secondUserData)[2]
            aMatch = float(aMatch + maxMatchB) / 2
        if aMatch > maxMatch[1]:
            maxMatch = [clusternumminusone + 1, aMatch]
    # Report the winner back to the parent process.
    queue.put([maxMatch])
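
In case it helps, here's a stripped-down, self-contained sketch of the same
two-process pattern, written against the stdlib multiprocessing module (same
API as processing). The _find_best function, its scoring, and the data are
just placeholders for illustration, not the real gum code:

import multiprocessing


def _find_best(items, offset, queue):
    # Scan one half of the items and report the best [index, score] pair.
    best = [None, float('-inf')]
    for i, item in enumerate(items):
        score = float(item)      # placeholder for the real match calculation
        if score > best[1]:
            best = [offset + i, score]
    queue.put([best])


if __name__ == '__main__':
    data = range(20)             # placeholder for the real clusters
    mid = len(data) // 2
    q1, q2 = multiprocessing.Queue(), multiprocessing.Queue()
    p1 = multiprocessing.Process(target=_find_best, args=(data[:mid], 0, q1))
    p2 = multiprocessing.Process(target=_find_best, args=(data[mid:], mid, q2))
    p1.start()
    p2.start()
    best1 = q1.get()[0]          # read the queues before join(), as above
    best2 = q2.get()[0]
    p1.join()
    p2.join()
    print(best1 if best1[1] > best2[1] else best2)

Calling get() before join() matters here: a worker that has put data on a
Queue may not exit until that data has been consumed.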



