Strange crashes

Chris Angelico rosuav at gmail.com
Thu Dec 12 01:05:37 EST 2013


On Thu, Dec 12, 2013 at 4:47 PM, Igor Korot <ikorot01 at gmail.com> wrote:
> So, when I find the culprit variable what do I do?
> Make it a part of some class? Protect it with mutex?
>
> How to solve this? And most importantly, how do _I_ verify that its solved?

From the look of the error, protecting it with a mutex won't help.

I'm guessing you have multiple threads sharing work, which I hope
implies that your work is I/O bound rather than CPU bound (in CPython,
normally only one thread will be executing Python code at a time). If
those threads are casually sharing objects, you are going to have a
LOT of problems. What you need to do is sort every object in the
system into one of three groups:

1) Global read-only objects. This includes all your builtins and the
like; these are fine and safe. It doesn't matter that the 'len'
function is shared across all your threads, because nobody's changing
anything in it.

2) Per-thread objects. At no time do these ever go into any sort of
global namespace; they're always function-local. These are safe too,
as long as you never hand them to another thread, because then no
other thread has any way to tamper with them.

3) Shared, mutable objects. These are your dangerous ones, the ones
you need to watch carefully.
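
As a rough illustration of the three groups, here is a minimal sketch.
The names (SUFFIXES, results, results_lock, worker, run) are all made
up for the example and have nothing to do with your actual code; the
lock is only there to show how group 3 objects are usually guarded.

```python
import threading

# Group 1: global and read-only. Nothing mutates this after import.
SUFFIXES = (".txt", ".log")

# Group 3: shared and mutable. Every access goes through the lock.
results = []
results_lock = threading.Lock()

def worker(names):
    # Group 2: per-thread. 'matched' is created inside the function and
    # no other thread sees it until the locked hand-off below.
    matched = [n for n in names if n.endswith(SUFFIXES)]
    with results_lock:
        results.extend(matched)

def run(batches):
    # One thread per batch; wait for all of them before reading results.
    threads = [threading.Thread(target=worker, args=(b,)) for b in batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

The point is less the lock itself than being able to say, for each
name in the program, which of the three groups it belongs to.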

Since SQLite objects can't (apparently) be shared, it makes most sense
to lock them into group 2. A thread acquires an SQLite connection,
uses it, closes it, and that's that. Nothing shared. If another thread
wants the database, it should create its own independent connection.
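
Something along these lines is what I mean, as a sketch only. The
table name, the record_visit and demo functions, and the temporary
database path are hypothetical; the point is that each thread opens,
uses and closes its own connection. (By default Python's sqlite3
module raises ProgrammingError if a connection is used from a thread
other than the one that created it, which is why sharing one tends to
end badly.)

```python
import os
import sqlite3
import tempfile
import threading

def record_visit(db_path, name):
    # The connection is created, used and closed inside this one thread.
    # timeout makes a writer wait if another connection holds the lock.
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("INSERT INTO visits (name) VALUES (?)", (name,))
    finally:
        conn.close()

def demo():
    # Hypothetical throwaway database, just for the example.
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    setup = sqlite3.connect(db_path)
    with setup:
        setup.execute("CREATE TABLE visits (name TEXT)")
    setup.close()

    threads = [
        threading.Thread(target=record_visit, args=(db_path, "t%d" % i))
        for i in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The main thread uses its own connection to check the result.
    check = sqlite3.connect(db_path)
    count = check.execute("SELECT COUNT(*) FROM visits").fetchone()[0]
    check.close()
    return count
```

Only the path to the database (an immutable string) is shared between
threads; the connection and cursor objects never leave the function
that created them.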

If your code is too complicated to be able to figure out which
category everything's in, it might be worth de-threading it
temporarily. Call each thread-function one by one - effectively, have
one thread finish before another thread starts - and then start
refactoring from there. It might help. Alternatively, you could try
switching to multiprocessing, which might highlight problems more
visibly, but that might just introduce more confusion.
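
One way to de-thread temporarily is to put a switch in front of
wherever the threads get started. This is just a sketch; run_jobs and
its 'threaded' flag are invented names, and it assumes your thread
functions can be described as (function, args) pairs.

```python
import threading

def run_jobs(jobs, threaded=True):
    """Run a list of (function, args) pairs, with or without threads."""
    if not threaded:
        # De-threaded mode: each job finishes before the next one starts.
        for func, args in jobs:
            func(*args)
        return
    threads = [threading.Thread(target=func, args=args) for func, args in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

If the crashes disappear with threaded=False, that points at something
being shared between threads, and you can flip the flag back once the
sharing has been sorted out.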

It's really a matter of code discipline. In C, the distinction would
be between stack and global/heap data; in Python, everything's stored
on the heap, but there's still the same distinction. It's worth
keeping straight which is which.

ChrisA


