Copy-on-write when forking a Python process

John Connor john.theman.connor at gmail.com
Fri Apr 8 12:14:19 EDT 2011


Hi all,
Long time reader, first time poster.

I am wondering if anything can be done about the COW (copy-on-write)
problem when forking a Python process.  I have found several
discussions of this problem, but I have seen no proposed solutions or
workarounds.  My understanding of the problem is that an object's
reference count is stored in the "ob_refcnt" field of the PyObject
structure itself.  When a process forks, its memory is initially not
copied.  However, if any reference to an object is created or destroyed
in the child process, the write to that object's "ob_refcnt" field
forces the kernel to copy the page in which the field is located.
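
As a small illustration of why even read-only use dirties pages, the
sketch below (CPython only, using sys.getrefcount) shows that merely
binding a second name to an object changes its stored reference count.
The names Foo, obj and alias are made up for the example.

```python
import sys

class Foo:
    pass

obj = Foo()

# getrefcount reports the count as seen during the call, so only the
# difference between two readings is meaningful here.
before = sys.getrefcount(obj)

alias = obj                      # no data is modified, but ob_refcnt is written
after = sys.getrefcount(obj)

del alias                        # dropping the reference writes ob_refcnt again
restored = sys.getrefcount(obj)

print(before, after, restored)
```

In a forked child, each of those writes would land in a page shared
with the parent, which is what triggers the copy.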

My first thought was the obvious one: make the ob_refcnt field a
pointer into an array of all object refcounts stored elsewhere.
However, I do not think there is a way of doing this without adding
a lot of complexity.  So my current thinking is that it should be
possible to disable refcounting for an object.  This could be done by
adding a field to PyObject named "ob_optout".  If ob_optout is true,
then Py_INCREF and Py_DECREF will have no effect on the object:


# "refcount" is the module proposed here; it does not exist today.
from refcount import optin, optout

class Foo:
    pass

mylist = [Foo() for _ in range(10)]
optout(mylist)  # Sets ob_optout to true
for element in mylist:
    optout(element)  # Sets ob_optout to true

# Placeholder: fork, let the children read mylist, and wait for them.
Fork_and_block_while_doing_stuff(mylist)

optin(mylist)  # Sets ob_optout to false
for element in mylist:
    optin(element)  # Sets ob_optout to false
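
To make the intended semantics of the flag concrete, here is a toy
model in pure Python.  It is only a sketch: ToyObject, incref and
decref are hypothetical stand-ins for the PyObject header and the
Py_INCREF/Py_DECREF macros, not CPython code.

```python
class ToyObject:
    """Pure-Python stand-in for a PyObject header (illustration only)."""

    def __init__(self):
        self.ob_refcnt = 1       # one reference held by the creator
        self.ob_optout = False   # the proposed opt-out flag

def incref(obj):
    # Models Py_INCREF: a no-op while the object is opted out.
    if not obj.ob_optout:
        obj.ob_refcnt += 1

def decref(obj):
    # Models Py_DECREF: a no-op while the object is opted out.
    if not obj.ob_optout:
        obj.ob_refcnt -= 1

def optout(obj):
    obj.ob_optout = True

def optin(obj):
    obj.ob_optout = False

toy = ToyObject()

optout(toy)
incref(toy)
incref(toy)
decref(toy)
frozen_count = toy.ob_refcnt     # the count was never written while opted out

optin(toy)
incref(toy)
live_count = toy.ob_refcnt       # counting resumes after opting back in

print(frozen_count, live_count)
```

One thing the model makes visible: a reference that is created while
the object is opted out and is still alive after optin is never
counted, so the scheme seems to rely on the opted-out section leaving
no new references behind.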


Has anyone else looked into the COW problem?  Are there workarounds
and/or other plans to fix it?  Does the solution I am proposing sound
reasonable, or does it seem like overkill?  Does anyone foresee any
problems with it?

Thanks,
--jac


