Global Threading Lock 2 - Re: "RuntimeError: dictionary changed size during iteration"..

Mon Mar 13 09:14:58 EST 2006

Raymond Hettinger wrote:
>>Is a copy.deepcopy ( -> "cPickle.dump(copy.deepcopy(obj),f)" ) an
>>atomic operation with a guarantee to not fail?
> 
> No.  It is non-atomic.
> 
> It seems that your application design intrinsically incorporates a race
> condition -- even if deepcopying and pickling were atomic, there would
> be no guarantee whether the pickle dump occurs before or after another
> thread modifies the structure.  While that design smells of a rat, it
> may be that your apps can accept a dump of any consistent state and
> that possibly concurrent transactions may be randomly included or
> excluded without affecting the result.

Yes, it is designed that way, with a discipline that keeps the state
consistent and allows many threads without much locking; and the .dump
is an autosave/backup (=> OK at any time).

The requirement is weaker than atomicity: "does not crash" would be
enough. When the RuntimeError did hit, the file was only half written
=> a corrupt pickle.
(In that case the app relied on its automatic multi-backup strategy, so
after the crash it recovered automatically from the next backup on
UnpicklingError. But a real workaround is not possible without rewriting
dump or deepcopy. So far I retry several times on RuntimeError, but
that's not "legal Python code".)

> Python's traditional recommendation is to put all access to a resource
> in one thread and to have other threads communicate their transaction
> requests via the Queue module.  Getting results back was either done
> through other Queues or by passing data through a memory location
> unique to each thread.  The latter approach has become trivially simple
> with the advent of Py2.4's thread-local variables.

(Passing through TLS? Thread-local storage is usually used for not
passing data between threads, isn't it?)

That queue/passing-through-only-an-extra-global-var communication is
acceptable for thin thread interaction.
(I hope this extra global var is thread-safe in future Pythons :-) )
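
The queue-based pattern recommended in the quote can be sketched roughly
as follows, in current Python 3 (module `queue` rather than the `Queue`
of that time). The names `owner`, `call` and the operations "set",
"snapshot" and "stop" are made up for this illustration and come from
neither post. Only the owner thread ever touches the dict, so a snapshot
cannot collide with a concurrent modification:

```python
import queue
import threading


def owner(requests):
    """Sole thread that touches `data`; requests are (op, args, reply_queue)."""
    data = {}
    while True:
        op, args, reply = requests.get()
        result = None
        if op == "set":
            key, value = args
            data[key] = value
        elif op == "snapshot":
            # Safe to copy: no other thread can mutate `data`.
            result = dict(data)
        reply.put(result)          # always answer, so callers never hang
        if op == "stop":
            break


def call(requests, op, args=()):
    """Send one request to the owner thread and wait for its answer."""
    reply = queue.Queue(maxsize=1)  # private reply channel per request
    requests.put((op, args, reply))
    return reply.get()
```

A caller would start `owner` in one thread with a shared `queue.Queue()`
and then use `call(requests, "set", (key, value))` or
`call(requests, "snapshot")` from any other thread. The price is one
round trip through two queues per access, which is why this suits thin
thread interaction better than heavy sharing.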

But "real" thread programming should also be possible in Python, and it
is, with the usual discipline of thread programming. This RuntimeError
during iteration is the only (necessary) compromise I know of. (Maybe
Python would not even need to raise this RuntimeError if walking through
containers that change size were done smartly; but the smart approach
may cost speed, hence the compromise.)

It can usually be handled with keys() and some error catching. Key
functions like deepcopy and dump (which cannot easily be subclassed)
should fit into that "highest common factor" and not "judge" for
themselves _how_ thread programming has to be done.
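
A minimal sketch of the "keys() and some error catching" idiom, written
for current Python 3. In the Python 2 of this thread, d.keys() returned
a list and so served as the snapshot; in Python 3, list(d) plays that
role. The helper name `for_each_item` is hypothetical. Copying the keys
in a single call is commonly treated as atomic under CPython's GIL, but
that is an implementation detail and an assumption here, not a language
guarantee:

```python
def for_each_item(d, visit):
    """Walk a dict that may be mutated (by `visit` or other threads) meanwhile."""
    for key in list(d):            # snapshot of the keys, taken in one call
        try:
            value = d[key]
        except KeyError:
            # The key was deleted after the snapshot was taken; skip it.
            continue
        visit(key, value)
```

Keys added after the snapshot are simply not visited, and keys removed
in the meantime are skipped, so the walk never raises "dictionary
changed size during iteration".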

> Thinking about future directions for Python threading, I wonder if
> there is a way to expose the GIL (or simply impose a temporary
> moratorium on thread switches) so that it becomes easy to introduce
> atomicity when needed:
> 
>    gil.acquire(BLOCK=True)
>    try:
>       #do some transaction that needs to be atomic
>    finally:
>       gil.release()

That's exactly what I requested here:

<duvp5e$2rpm$1 at ulysses.news.tiscali.de>

and here:

<dv0vmn$d8$1 at ulysses.news.tiscali.de>

That "practical hammer" (a little ugly, but very practical) would make
it possible to keep big threaded code very-high-level and Pythonic, and
would keep us from putting thousands of trivial locks into the code in
the manner of a low-level language. Some OS functions, like those of
the socket module (on UNIX), do so anyway (often unwanted :-( ).
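
Python exposes no `gil` object like the one in the quoted pseudocode.
The nearest thing available in ordinary code is a single
application-wide lock, which is a different technique: it only helps if
every thread that mutates the shared structure cooperates by taking the
same lock. A rough sketch under that assumption, with the hypothetical
names `BIG_LOCK`, `shared`, `update` and `snapshot_bytes`:

```python
import pickle
import threading

# Hypothetical application-wide lock; this is NOT the interpreter's GIL.
BIG_LOCK = threading.RLock()

shared = {}


def update(key, value):
    """Every writer must go through the lock for the scheme to work."""
    with BIG_LOCK:
        shared[key] = value


def snapshot_bytes():
    """Serialize a consistent state while all cooperating writers are held off."""
    with BIG_LOCK:
        # No deepcopy needed: nothing can change `shared` during the dump.
        return pickle.dumps(shared)
```

This is one coarse lock rather than thousands of small ones, at the cost
of stalling all writers for the duration of each dump.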

In addition, Python should define its atoms of execution, and thus also
the definite sources of this (unavoidable?) RuntimeError, as explained
in the latter of the two messages.

>>Or can I only retry several times in case of RuntimeError?  (which would
>>appear to me as odd gambling; retry how often?)
> 
> 
> Since the app doesn't seem to care when the dump occurs,  it might be
> natural to put it in a while-loop that continuously retries until it
> succeeds; however, you still run the risk that other threads may never
> leave the object alone long enough to dump completely.

I allow at most 5 tries as of now. The error occurred about once in 3
months in my case, so that should solve the problem for the rest of the
universe ...
If not, there is another bug going on.
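
A bounded retry like the one described might look roughly like this in
current Python 3 (`pickle` instead of `cPickle`). The helper name
`dump_with_retry` and the `delay` parameter are assumptions made for
this sketch; only the limit of 5 tries comes from the text above. The
object is serialized to bytes before the file is opened, so a failed
attempt never leaves a half-written pickle behind:

```python
import copy
import pickle
import time


def dump_with_retry(obj, path, max_tries=5, delay=0.05):
    """Snapshot and pickle `obj`, retrying when a concurrent change interferes.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_tries + 1):
        try:
            # Everything that can hit "changed size during iteration"
            # happens here, before the file is touched.
            data = pickle.dumps(copy.deepcopy(obj))
        except RuntimeError:
            # Note: in Python 3, RecursionError is also a RuntimeError.
            if attempt == max_tries:
                raise              # give up: probably a different bug
            time.sleep(delay)      # let the other threads get on
            continue
        with open(path, "wb") as f:
            f.write(data)
        return attempt
```

As the quoted reply points out, a retry loop cannot guarantee success if
other threads never leave the object alone; the cap just turns that case
into a visible error instead of an endless loop.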

I may switch to a solution with a subclassed deepcopy without
.iteritems(). But it's a lot of work to ensure that it is really OK, and
it costs another few megs of memory and a frequent CPU peak load. So I
may keep the retry loop and probably not switch at all ...

Robert