[Python-ideas] easy thread-safety [was: fork]

Andrew Barnert abarnert at yahoo.com
Thu Aug 20 23:26:44 CEST 2015


On Aug 20, 2015, at 13:02, Ron Adam <ron3200 at gmail.com> wrote:
> 
>> On 08/20/2015 12:51 PM, Steven D'Aprano wrote:
>>> On Thu, Aug 20, 2015 at 12:27:55PM -0400, Ron Adam wrote:
>>> 
>>> >When a bytecode to load an object is executed such as LOAD_FAST, it gets
>>> >its reference to the object from the function's list of names in its
>>> >code object.
>> Bytecodes are implementation, not semantics: there is no part of the
>> Python language that promises that retrieving local variables will use a
>> byte code LOAD_FAST. That's not how IronPython or Jython work, and there
>> is no stability guarantee that CPython will always use LOAD_FAST.
> 
> Yes, but semantics needs a workable implementation at some point.
> 
> The semantics is to have a way to make names (for mutable objects) in outer scopes not be visible to functions defined in inner scopes.
> 
>  def foo(x):
>      """ Example of hiding a mutable object from inner scopes. """
>      mutable items
>      items = [1, 2, 3]
>      def bar(y):
>          """ can't see items here. So can't mutate it."""
>          return -y
>      return [bar(y)+x for y in items]
> 
> So foo is able to protect items from being mutated by functions defined in its scope.   We could use localonly instead of mutable, but in the context of threading mutable may be more appropriate. (Way too soon to decide what color to paint this bike.)
> 
> It may seem like it isn't needed, because you have control over what a function has access to, i.e., just don't do that.  But when you have many programmers working on large projects, things can get messy.  And this helps with that, but also helps in the case of threads.

The only case you're helping with here is the case where the race is entirely local to one function and the functions it defines--a relatively uncommon case that also gets the least messy and is the easiest to spot and debug.

Also, the "dangerous" cases are already marked today: the local function has to explicitly declare the variable nonlocal or it can't assign to it.
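
A minimal sketch of what "already marked" means, using hypothetical names (make_counter, bump, append_item are mine, not from the thread). Note that the marker only covers rebinding the name; mutating the object it refers to needs no declaration at all:

```python
def make_counter():
    count = 0
    items = []

    def bump_broken():
        # No nonlocal declaration, so this assignment makes count a local
        # of bump_broken; reading it before assignment fails at call time.
        count += 1

    def bump():
        nonlocal count  # the explicit marker for rebinding an outer name
        count += 1
        return count

    def append_item(x):
        # No declaration needed: this mutates the list object, it does
        # not rebind the name items.
        items.append(x)
        return items

    return bump_broken, bump, append_item


bump_broken, bump, append_item = make_counter()

try:
    bump_broken()
    unmarked_assignment_failed = False
except UnboundLocalError:
    unmarked_assignment_failed = True

first = bump()
second = bump()
appended = append_item("x")
```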

>>> >LOAD_FAST 0,  reads __code__.co_varnames[0]
>>> >LOAD_FAST 1,  reads __code__.co_varnames[1]
>>> >
>>> >Adding a co_mutables name list to the __code__ attribute, along with new
>>> >bytecodes to access them, it would create a way to keep private local
>>> >names without changing how the other bytecodes work.
>>> >
>>> >LOAD_MUTABLE 0,  would get the first reference in __code__.co_mutables.

As a side note, closure variables aren't accessed by LOAD_FAST and STORE_FAST from either side; that's what cellvars and freevars are for. So, your details don't actually work. But it's not hard to s/fast/cell/ and similar in your details and understand what you mean.
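
To make that concrete, here is a small sketch that inspects the code objects in CPython (foo and bar are illustrative; the attribute names are the real ones on code objects). Running dis.dis on either function shows the opcodes, whose exact names vary between CPython versions, so I only look at the name tables here:

```python
def foo(x):
    items = [1, 2, 3]   # captured by bar, so it is a cell variable in foo
    total = x           # not captured, so it is an ordinary fast local

    def bar(y):
        return y + len(items)   # items is a free variable inside bar

    return bar


bar = foo(0)

outer_cells = foo.__code__.co_cellvars    # names foo shares with inner functions
outer_locals = foo.__code__.co_varnames   # names foo accesses with *_FAST
inner_free = bar.__code__.co_freevars     # names bar gets from its closure
inner_locals = bar.__code__.co_varnames   # names bar accesses with *_FAST
```

In CPython, items shows up in co_cellvars on the outer side and co_freevars on the inner side, and in neither function's co_varnames.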

But I don't see why you couldn't just implement the "mutable" keyword to mean that the variable must live in co_varnames rather than co_cellvars or co_freevars (raising a compile-time error if that's not possible) and just continue using *_FAST on it. That would be a lot simpler to implement. It's also a lot simpler to explain: declaring a variable mutable means it can't participate in closures.

> As I said it's a partial solution.  Shared & mutable names passed as function arguments will still need to be protected

The problem here is the same as in Sven's proposal: what gets you into trouble is shared values, not shared variables, so any solution that just tries to limit shared variables is only a vanishingly tiny piece of the solution.

It doesn't do anything for mutable values passed to functions, or returned or yielded, or stored on self, or stored in any other object's attributes or in any container, or even in globals.

It also doesn't prevent you from mutating them by calling a method (including __setattr__ and __setitem__ and the various __i*__ methods as well as explicit method calls).
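
A rough sketch of the last two points, with hypothetical names (owner, worker). Even if the name items could never be seen from any other scope, the value escapes as soon as it is passed somewhere, and it can then be mutated without ever assigning to the original variable:

```python
def worker(shared):
    # None of these rebinds the caller's variable; all of them mutate
    # the list object that the caller's variable refers to.
    shared.append("from worker")   # explicit method call
    shared += ["augmented"]        # __iadd__ mutates the list in place
    shared[0] = "replaced"         # __setitem__


def owner():
    items = [1, 2, 3]   # imagine this name were declared local-only
    worker(items)       # the value is shared; the name never is
    return items


result = owner()
```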

And, even if you managed to solve all of those problems, it still wouldn't be useful, because it doesn't do anything for any case where you share a member or element of the object rather than the object itself--e.g., if I have a dict mapping sockets to Connection objects, marking the dict unshareable doesn't protect the Connection objects in any way.
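
Here is that dict example as a sketch. Connection, send and serve are made up for illustration, and plain strings stand in for the sockets. The dict itself never leaves serve, yet every object stored in it gets mutated from outside:

```python
class Connection:
    """Hypothetical per-socket state."""

    def __init__(self, peer):
        self.peer = peer
        self.bytes_sent = 0


def send(conn, nbytes):
    # Only sees one element, never the dict it came from.
    conn.bytes_sent += nbytes


def serve():
    # Imagine connections were marked unshareable; strings stand in
    # for real socket objects as keys.
    connections = {
        "sock-a": Connection("alice"),
        "sock-b": Connection("bob"),
    }
    for conn in connections.values():
        send(conn, 10)   # the element escapes even though the dict does not
    return connections


connections = serve()
sent = {key: conn.bytes_sent for key, conn in connections.items()}
```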


