[pypy-dev] 1.4.1 threading

Mon Dec 27 07:00:30 CET 2010

(dan) random binary trees: O(log2(n)) is 7.2 for builtins, though the
constant factor for locks, etc, might make it worthwhile

(paolo) non blocking hash maps: memory barriers can be quite costly too

different options will have to be implemented and tested when we get there,
speaking on which, is there a test dict load?
does anyone have some profiling data, what portion of runtime is spent
reading and writing attributes and globals anyways?

while we are on the subject, is there a plan to provide different
levels of sparseness for different levels of name lookup?
for example globals vs builtins, first needs to be quite sparce so
that builtins show through well, second hardly, because there's
nowhere else to look if builtins don't have the name.
then the tradeoff could be more dynamic, that is frequently accessed
dicts could be e.g. more sparse and rarely accessed more compact?
(obviousely more sparse is not always better, e.g. in terms of cpu cache)
of course "frequently accessed" is not as easy as frequently ran code,
e.g. here "class A: ...; while 1: A().func()", func lookup occurs in
different objects, yet it is same logical operation.

come to think of it, is there any point in polymorphic dicts, e.g.
attribute access could be imeplemented as a near-perfect compact hash
map if attribute names change rarely, while regular dict()s are
expected to change keys often.

Anyhow, back to parallel interpretation, Is there an issue or page
that tracks what's needed for parallel interpretation?
so far  mentioned: gc, dict, c api

Btw, I think that jit is more important at the moment, but time comes
when jit juice has been mostly squeezed out ;-)

d.

On 23 December 2010 07:31, Paolo Giarrusso <p.giarrusso at gmail.com> wrote:
> On Thu, Dec 23, 2010 at 06:58, Dan Stromberg <drsalists at gmail.com> wrote:
>> On Wed, Dec 22, 2010 at 7:05 PM, Benjamin Peterson <benjamin at python.org> wrote:
>>> 2010/12/22 Dima Tisnek <dimaqq at gmail.com>:
>>>> Oh, I can't yet use alternative gc that obviates GIL then?
>>>>
>>>> Or am I totally confused and pypy still uses GIL for other reasons,
>>>> e.g. globals dict safety?
>>>
>>> All of the above.
>>
>> On the topic of parallel dict safety...  I've not played around with
>> parallel dictionaries at all, but I've heard that treaps can be
>> parallelized nicely, and they can be (and have been) set up to look
>> much like a python dictionary.  I admit, I coded one and put it at
>> http://stromberg.dnsalias.org/~strombrg/treap/ - however, the
>> implementation is not (yet?) suitable for parallel use.
>>
>> It's coded in such a way that it'll work as pure python or cython,
>> depending on which of two m4 symbols you define.  I've used the pure
>> python version on pypy.
>>
>> Treaps tend to make just about everything O(log(n)), have good average
>> performance, and are not as even performers as red-black trees (that
>> is, they give a higher standard deviation, and a better average
>> performance than red-black trees).  And they don't play that nicely
>> with locality of reference (caches) - things are intentionally
>> randomized.  But if you throw enough cores at them, they might still
>> be a win compared to a big hash table with a big lock wrapped around
>> it.
>
> That might be interesting; however, the state of the art includes many
> concurrent hash map alternatives which are way smarter than "a lock
> wrapped around". Java >=1.5 has one, by Doug Lea*. An even better (and
> more complex) one, by Cliff Click*, used in production on >500 cores,
> and presented to JavaOne, can be found by googling:
>
> nonblockinghashmap Cliff Click
>
> A C implementation can be found at:
> http://code.google.com/p/nbds/
>
> Link found through this blog post of Cliff Click:
> http://www.azulsystems.com/blog/cliff-click/2008-11-13-some-source-forge-threads-nbhm
>
> However, I don't think you can code them in nowadays Python, because
> it offers neither a Java-equivalent volatile attribute nor
> compare-and-swap nor memory barriers, and these structures can't be
> implemented with plain locks.
>
> Best regards
>
> * Doug Lea wrote the memory allocator used on Linux, is behind the 1.5
> java.util.concurrent package, and much more. Cliff Click is a
> high-performance JVM hacker.
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev