Why pickling (was: Traceback when using multiprocessing)

Fri Nov 22 18:50:54 EST 2013

On Sat, Nov 23, 2013 at 3:38 AM, John Ladasky
<john_ladasky at sbcglobal.net> wrote:
> On Thursday, November 21, 2013 8:24:05 PM UTC-8, Chris Angelico wrote:
>
>> Oh, that part's easy. Let's leave the multiprocessing module out of it
>> for the moment; imagine you spin up two completely separate instances
>> of Python. Create some object in one of them; now, transfer it to the
>> other. How are you going to do it?
>
> For what definition of "completely separate"?
>
> If I have two instances of the same version of the Python interpreter running on the same hardware, and the same operating system, I expect I would just copy a block of memory from one interpreter to the other, and then write some new pointers.  That kind of data sharing has to be the most common kind.  It's also the simplest.

Okay, so you copy a block of memory. Now how are you going to
guarantee that you picked up everything that object references? Python
objects frequently reference other objects:

send_me = [1.0, 2.0, 3.0]

The block of memory for the list holds the addresses of those three
floats, but those addresses will be meaningless in the target process.
Somehow you need to package up this object and everything else it
refers to.
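
A rough sketch of what I mean, assuming CPython (where id() happens to
return the object's memory address; that's an implementation detail,
not a language guarantee). The name "shallow" is just for illustration:

```python
send_me = [1.0, 2.0, 3.0]

# A shallow copy is a new list object with its own block of memory...
shallow = list(send_me)
print(shallow is send_me)    # False

# ...but its slots refer to the very same float objects as the original.
print(all(a is b for a, b in zip(shallow, send_me)))    # True

# In CPython these ids are addresses inside *this* process only.
# Copying the list's memory elsewhere would carry these numbers along,
# but not the float objects they point at.
for item in send_me:
    print(hex(id(item)), item)
```

The addresses printed will differ from run to run, which is rather the
point: they only mean something inside the process that produced them.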

Ultimately, you need some system for turning a single object reference
(a pointer, if you like) into the entire package of information needed
to recreate that object on the other side. That's what pickling is.
It's a compact format (and there are people fighting to keep it
compact; there's a current discussion elsewhere about that) that can
be easily transferred around, which refcounted blocks of memory can't
be.
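
Here's a minimal sketch of that round trip using the standard pickle
module. Both halves run in one process here purely for brevity; in
real use the bytes would travel over a pipe or socket to the other
interpreter. The names "payload" and "received" are mine:

```python
import pickle

send_me = [1.0, 2.0, 3.0]

# Sender side: follow every reference from the list and flatten the
# whole thing into one self-contained byte string.
payload = pickle.dumps(send_me)
print(type(payload), len(payload))

# Receiver side: rebuild an equivalent list from the bytes alone.
received = pickle.loads(payload)
print(received == send_me)   # True: same contents
print(received is send_me)   # False: a different object
```

Nothing in payload is a memory address, so it doesn't matter where the
receiving interpreter decides to put the rebuilt objects.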

ChrisA