[Python-Dev] thread semantics for file objects

Paul Moore p.f.moore at gmail.com
Fri Mar 18 14:16:19 CET 2005


On Fri, 18 Mar 2005 07:57:25 +0100, "Martin v. Löwis"
<martin at v.loewis.de> wrote:
> The guarantee that "we" want to make is certainly stronger: if the
> threads all read from the same file, each will get a series of "chunks".
> The guarantee is that it is possible to combine the chunks in a way to
> get the original contents of the file (i.e. not only the sum of the
> bytes is correct, but also the contents).

That would be a useful property to be able to rely on, certainly.
(Although in practical terms, probably a lot less than people would
*like* to see guaranteed :-))
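
As a rough illustration of the property Martin describes, here is a
minimal sketch. It assumes CPython 3 and a file opened in buffered
binary mode, where the buffered reader protects its internal state
with a lock. The names (read_chunks, run_demo, recombine) and the
fixed-width record layout are made up for the example; nothing here
is a documented language guarantee.

```python
import os
import tempfile
import threading

RECORD_SIZE = 4      # every record is a zero-padded 4-digit number
RECORD_COUNT = 5000  # must stay below 10000 so each record fits in 4 bytes
THREAD_COUNT = 4


def read_chunks(fileobj, out):
    """Read fixed-size chunks from the shared file object until EOF."""
    while True:
        chunk = fileobj.read(RECORD_SIZE)
        if not chunk:
            break
        out.append(chunk)


def run_demo():
    """Let several threads read from one file object; return what they saw."""
    original = b"".join(b"%04d" % i for i in range(RECORD_COUNT))
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(original)
        path = tmp.name

    per_thread = [[] for _ in range(THREAD_COUNT)]
    try:
        with open(path, "rb") as shared:
            threads = [
                threading.Thread(target=read_chunks, args=(shared, out))
                for out in per_thread
            ]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
    finally:
        os.remove(path)
    return original, per_thread


def recombine(per_thread):
    """Reassemble the chunks. Sorting only works because the records are
    zero-padded numbers; in general the threads would have to record the
    order themselves."""
    return b"".join(sorted(c for chunks in per_thread for c in chunks))


if __name__ == "__main__":
    original, per_thread = run_demo()
    print(recombine(per_thread) == original)
```

Which thread gets which chunk is still unspecified; the sketch only
checks that no chunk is lost, duplicated or torn.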

> However, I see little value adding this specific guarantee to the
> documentation when so many other aspects of thread interleaving
> are unspecified.

I'm not sure I agree. It's an improvement in the situation, so why not
add it? It may even encourage others, when thinking about threading
issues, to consider whether the documentation should guarantee
anything - and if so, to add that guarantee. Over time, the
documentation gets better at describing thread-related behaviour - and
correspondingly, people get (somewhat) more confident that where the
documentation doesn't guarantee things, it's because there is a good
reason.

> For example, if a thread reads a dictionary simultaneous to a write
> in another thread, and the read and the write deal with different
> keys, there is a guarantee that they won't affect each other. If they
> operate on the same key, the read either gets the old value, or the
> new value, but not both.

If this is a genuine guarantee, then let's document it! I asked about
precisely this issue on python-list a long while ago, and no-one could
provide me with a confident answer (I couldn't be sure myself; my head
explodes when I try to understand thread-related code). The only
confident answer I got was "you're safe if you use a lock", but taking
that position to extremes results in massive levels of unnecessary
serialisation.
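
To make the dictionary case concrete, here is a small sketch of what
the claimed behaviour would look like, assuming CPython. One thread
keeps rebinding a key, another writes a different key, and a third
reads the first key without a lock. OLD, NEW and run_demo are
hypothetical names for the example, and this demonstrates the
behaviour rather than proving it is guaranteed.

```python
import threading

OLD = ("old", "old")
NEW = ("new", "new")


def run_demo(iterations=100_000):
    """Read and write a shared dict from several threads without a lock."""
    shared = {"key": OLD, "other": 0}
    seen = set()  # only the reader thread touches this

    def writer():
        # Rebind the same key over and over.
        for i in range(iterations):
            shared["key"] = NEW if i % 2 else OLD

    def other_writer():
        # Write a different key; this should not disturb "key".
        for i in range(iterations):
            shared["other"] = i

    def reader():
        # Each read should yield the old value or the new one, whole.
        for _ in range(iterations):
            seen.add(shared["key"])

    threads = [
        threading.Thread(target=f) for f in (writer, other_writer, reader)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen, shared


if __name__ == "__main__":
    seen, shared = run_demo()
    print(seen <= {OLD, NEW}, shared["other"])
```

A compound update such as shared["other"] += 1 is a separate read
followed by a write, so it would still need a lock; the sketch covers
only single reads and single assignments.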

> Writing down all these properties does little good, IMO.

Not a huge amount of good, certainly. But no harm, and a little bit of
direct good, and also some indirect good in terms of making it clear
that the issue has been thought about. I suppose what I am saying is that
there is a practical difference between "undefined" and "unknown",
even if there isn't a theoretical one...

Of course, there's an implied requirement here to confirm any
documented guarantees in Jython, and IronPython, and PyPy, and... But
given that none of these (yet) implement the full Python 2.4 language
definition, as far as I am aware, it's probably not sensible to get
too hung up on this fact (although confirming that a guarantee doesn't
cause major implementation difficulties would be reasonable).

Paul.

