[pypy-dev] Would the following shared memory model be possible?

William Leslie william.leslie.ttg at gmail.com
Thu Jul 29 09:57:58 CEST 2010


On 29 July 2010 17:40, Maciej Fijalkowski <fijall at gmail.com> wrote:
> On Thu, Jul 29, 2010 at 9:32 AM, William Leslie
> <william.leslie.ttg at gmail.com> wrote:
>> On 29 July 2010 17:27, Maciej Fijalkowski <fijall at gmail.com> wrote:
>>> On Thu, Jul 29, 2010 at 7:18 AM, William Leslie
>>> <william.leslie.ttg at gmail.com> wrote:
>>>> When an object is mutable, it must be visible to at most one thread.
>>>> This means it can participate in return values, arguments and queues,
>>>> but the sender cannot keep a reference to an object it sends, because
>>>> if the receiver mutates the object, this will need to be reflected in
>>>> the sender's thread to ensure internal consistency. Well, you could
>>>> ignore internal consistency, require explicit locking, and have it
>>>> segfault when the change to the length of your list has propagated but
>>>> not the element you have added, but that wouldn't be much fun. The
>>>> alternative, implicitly writing updates back to memory as soon as
>>>> possible and reading them out of memory every time, can be hundreds or
>>>> more times slower. So you really can't have two tasks sharing mutable
>>>> objects, ever.
>>>>
>>>> --
>>>> William Leslie
>>>
>>> Hi.
>>>
>>> Do you have any data points supporting your claim?
>>
>> About the performance of programs that involve a cache miss on every
>> memory access, or internal consistency?
>>
>
> I think I lost an implication here. Did I get you right: you claim
> that per-object locking, when threads share objects, is very
> expensive? If not, I completely misunderstood you and my question
> makes no sense; please explain. If yes, why does it mean a cache miss
> on every read/write?

I claim that there are two alternatives when one thread mutates an
object and another observes it:

0. You can give up consistency and do fine-grained locking, which is
reasonably fast but error-prone, or
1. Expect Python to handle all of this for you, effectively making no
change to the memory model. You could do this with implicit per-object
locks (a sketch follows below), which might be reasonably fast in the
absence of contention, but not when several threads are trying to use
the object.

Queues already are, in a sense, your per-object lock with a single
mutating thread, but usually one thread has acquire semantics and the
other has release semantics, and that combination actually works. The
problem is when you expect a full memory barrier.
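Concretely, with the standard library's Queue (a sketch: the
ownership-transfer discipline is purely by convention, nothing
enforces it, and the producer/consumer names are illustrative):

    import threading
    from queue import Queue   # "Queue.Queue" on Python 2

    q = Queue()

    def producer():
        msg = {"n": 0}    # private: only this thread sees it yet
        msg["n"] = 42     # mutate freely before publishing
        q.put(msg)        # release: hand the object over
        # by convention, msg is never touched again after put()

    def consumer():
        msg = q.get()     # acquire: all the producer's writes are visible
        print(msg["n"])

    t = threading.Thread(target=producer)
    t.start()
    consumer()
    t.join()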

Come to think of it, you might be right, Kevin: as long as only one
thread mutates the object, the mutating thread never /needs/ to
acquire, as it knows that it has the latest revision.
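A sketch of that single-writer observation (my own construction,
assuming an attribute store and load act as the release/acquire pair,
which holds in CPython under the GIL):

    class SingleWriterCell:
        # Only one thread ever calls write(), so the writer never
        # needs to acquire anything: it already holds the latest
        # revision. Readers load an immutable snapshot, never the
        # mutable private state.
        def __init__(self):
            self._private = []    # touched by the writer thread only
            self.snapshot = ()    # immutable tuple, safe to read anywhere

        def write(self, item):    # writer thread only
            self._private.append(item)
            self.snapshot = tuple(self._private)   # publish (release)

    def read(cell):
        return cell.snapshot      # acquire: consistent, possibly stale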

Have I missed something?

-- 
William Leslie


