[New-bugs-announce] [issue41722] multiprocess error on large dataset
vishal rao
report at bugs.python.org
Fri Sep 4 19:14:43 EDT 2020
New submission from vishal rao <vishalraoanizer at gmail.com>:
I am processing a large pandas DataFrame using the pathos framework, which internally uses the Python multiprocess package. I get the following error when I run the code with a large dataset. The issue doesn't occur on smaller datasets.
/opt/conda/lib/python3.7/site-packages/pathos/multiprocessing.py in map(self, f, *args, **kwds)
135 AbstractWorkerPool._AbstractWorkerPool__map(self, f, *args, **kwds)
136 _pool = self._serve()
--> 137 return _pool.map(star(f), zip(*args)) # chunksize
138 map.__doc__ = AbstractWorkerPool.map.__doc__
139 def imap(self, f, *args, **kwds):
/opt/conda/lib/python3.7/site-packages/multiprocess/pool.py in map(self, func, iterable, chunksize)
266 in a list that is returned.
267 '''
--> 268 return self._map_async(func, iterable, mapstar, chunksize).get()
269
270 def starmap(self, func, iterable, chunksize=None):
/opt/conda/lib/python3.7/site-packages/multiprocess/pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):
/opt/conda/lib/python3.7/site-packages/multiprocess/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
429 break
430 try:
--> 431 put(task)
432 except Exception as e:
433 job, idx = task[:2]
/opt/conda/lib/python3.7/site-packages/multiprocess/connection.py in send(self, obj)
207 self._check_closed()
208 self._check_writable()
--> 209 self._send_bytes(_ForkingPickler.dumps(obj))
210
211 def recv_bytes(self, maxlength=None):
/opt/conda/lib/python3.7/site-packages/multiprocess/connection.py in _send_bytes(self, buf)
394 n = len(buf)
395 # For wire compatibility with 3.2 and lower
--> 396 header = struct.pack("!i", n)
397 if n > 16384:
398 # The payload is large so Nagle's algorithm won't be triggered
error: 'i' format requires -2147483648 <= number <= 2147483647
I ran the code in debug mode and saw that the value of n was 3140852627, which exceeds the signed 32-bit maximum of 2147483647 that the "!i" format accepts.
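The failing step can be reproduced on its own, without pathos or a large DataFrame. The following is a minimal sketch using only the standard struct module. The helper name pack_length_header is hypothetical and only mimics the struct.pack("!i", n) call shown in the traceback. It is not the multiprocess implementation.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value the "!i" (signed 32-bit, network order) format can encode


def pack_length_header(n):
    """Hypothetical helper: mimics the header packing call from the traceback."""
    return struct.pack("!i", n)


n = 3140852627  # payload size in bytes observed in the debugger

try:
    pack_length_header(n)
    overflowed = False
except struct.error as exc:
    # Raised because n does not fit in a signed 32-bit integer
    overflowed = True
    print("struct.error:", exc)

print("n exceeds INT32_MAX:", n > INT32_MAX)
print("header for INT32_MAX:", pack_length_header(INT32_MAX))
```

This suggests the error depends only on the size of the pickled task (about 3.1 GB here), not on its contents, which would be consistent with smaller datasets working.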
----------
components: Library (Lib)
messages: 376415
nosy: vishalraoanizer
priority: normal
severity: normal
status: open
title: multiprocess error on large dataset
versions: Python 3.7
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41722>
_______________________________________