Pool.map mongodb cursor

Christian mining.facts at gmail.com
Fri Jun 14 10:20:37 EDT 2013


Hi,

is it possible to avoid some memory overhead when combining a MongoDB cursor
with multiprocessing? Because of the size of the cursor, Python initially
consumes a lot of memory, even though each document can be scored
independently of the others (chunking?).

Maybe there is a better way to use multiprocessing here than
Pool.map?

score_proc_pool.map(scoring_wrapper, mongo_cursor, chunksize=10000)

Inside the scoring_wrapper I'm writing estimated scores without a return
value.


def scoring_wrapper(doc):
    ........
    profiles.update({'anyid': anyid},
                    {'$set': {'profile': value}},
                    upsert=True)


Thanks in advance
Christian
