Cassandra multiprocessing can't pickle _thread.lock objects

dieter dieter at handshake.de
Wed Jun 22 02:13:57 EDT 2016


Daiyue Weng <daiyueweng at gmail.com> writes:
> ...
> I tried to use Cassandra and multiprocessing to insert rows (dummy data)
> concurrently based on the examples in
> ...
> self.pool = Pool(processes=process_count, initializer=self._setup,
> initargs=(session,))
>
> I am wondering how to resolve the issue.

"pickle" is used to serialize Python objects and later, usually in
a different context, recreate the object from the serialization.

Obviously, some objects are so tightly coupled to their context that
it is very difficult to recreate them faithfully elsewhere. Examples
are locks and open files.
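You can reproduce the error in the subject line directly: pickling a
plain lock object fails with a TypeError (a minimal demonstration,
independent of Cassandra):

```python
import pickle
import threading

lock = threading.Lock()  # a _thread.lock object under the hood

try:
    pickle.dumps(lock)
    pickled_ok = True
except TypeError:
    # Python 3 raises: TypeError: cannot pickle '_thread.lock' object
    pickled_ok = False
```

Any object whose attributes (directly or indirectly) contain such a
lock fails in the same way.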

Apparently, your framework uses pickle internally (to get objects
from the current process to the worker processes). As a consequence,
you must ensure that those objects contain only pickleable objects
(and, in particular, no locks).


In your code above, the lock might come from "self" or "session".

In the first case, a workaround might be to put the lock
on the class (rather than the instance) level; this way, the lock
would not be pickled.
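A sketch of that first workaround (the "Worker" class here is a made-up
stand-in, not the poster's code): pickling an instance serializes only
the instance's own attributes, so a lock stored on the class is never
touched by pickle.

```python
import pickle
import threading

class Worker:
    # Class-level lock: shared by all instances and NOT part of any
    # instance's __dict__, so pickle never tries to serialize it.
    _lock = threading.Lock()

    def __init__(self, name):
        self.name = name  # only pickleable state on the instance

w = Worker("a")
w2 = pickle.loads(pickle.dumps(w))  # succeeds; only "name" travels
```

After unpickling, "w2" simply picks up the class attribute "_lock" of
the destination process.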

The second case would be more difficult. Likely, you would need
to extract pickleable subobjects from your "session" (e.g. connection
parameters) and recreate a session object from them in the destination
process.

