[Python-Dev] PEP 509: Add a private version to dict

Brett Cannon brett at python.org
Wed Jan 20 15:23:59 EST 2016


On Wed, 20 Jan 2016 at 10:46 Yury Selivanov <yselivanov.ml at gmail.com> wrote:

> Brett,
>
> On 2016-01-20 1:22 PM, Brett Cannon wrote:
> >
> >
> > On Wed, 20 Jan 2016 at 10:11 Yury Selivanov <yselivanov.ml at gmail.com>
> > wrote:
> >
> >     On 2016-01-18 5:43 PM, Victor Stinner wrote:
> >     > Is someone opposed to this PEP 509?
> >     >
> >     > The main complaint was the change to the public Python API,
> >     > but the PEP no longer changes the Python API.
> >     >
> >     > I'm not aware of any remaining issue on this PEP.
> >
> >     Victor,
> >
> >     I've been experimenting with the PEP to implement a per-opcode
> >     cache in the ceval loop (I'll share my progress on that in a
> >     few days).  This makes it possible to significantly speed up
> >     the LOAD_GLOBAL and LOAD_METHOD opcodes, to the point where
> >     they don't require any dict lookups at all.  Some
> >     macro-benchmarks (such as chameleon_v2) show an impressive
> >     ~10% performance boost.
> >
> >
> > Ooh, now my brain is trying to figure out the design of the cache. :)
>
> Yeah, it's tricky.  I'll need some time to draft a comprehensible
> overview.  And I want to implement a couple more optimizations and
> benchmark it better.
>
> BTW, I have some updates (an html5lib benchmark for py3, new
> benchmarks for calling C methods, and I want to port some PyPy
> benchmarks) to the benchmarks suite.  Should I just commit them,
> or should I use bugs.python.org?
>

I actually emailed speed@ to see if people were interested in finally
sitting down with all the various VM implementations at PyCon and trying to
come up with a reasonable base set of benchmarks that better reflect modern
Python usage, but I never heard back.

Anyway, an issue on bugs.python.org is probably the best place to
discuss new benchmarks before adding them (fixes and updates to
pre-existing benchmarks can just go in).
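
To make the guard Yury describes above concrete, here is a rough
pure-Python sketch of the version-check idea.  VersionedDict and
cached_load are invented for illustration only; the real cache would
live in C inside the ceval loop and read the dict's version field
directly, with no Python-level indirection:

    class VersionedDict(dict):
        """Toy stand-in for a dict carrying PEP 509's version tag; in
        CPython the tag lives in the C struct and is bumped by the
        interpreter itself on every mutation."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.version = 1

        def __setitem__(self, key, value):
            super().__setitem__(key, value)
            self.version += 1

        def __delitem__(self, key):
            super().__delitem__(key)
            self.version += 1

    def cached_load(name, ns, cache):
        """LOAD_GLOBAL-style lookup: trust the cache entry for as long
        as the dict's version tag is unchanged."""
        entry = cache.get(name)
        if entry is not None and entry[0] == ns.version:
            return entry[1]                # hit: no dict lookup at all
        value = ns[name]                   # miss: do the real lookup...
        cache[name] = (ns.version, value)  # ...then re-arm the guard
        return value

    ns = VersionedDict(bar=1)
    cache = {}
    assert cached_load('bar', ns, cache) == 1  # miss, fills the cache
    assert cached_load('bar', ns, cache) == 1  # hit, guard holds
    ns['bar'] = 2                              # mutation bumps the version
    assert cached_load('bar', ns, cache) == 2  # guard fails, cache refilled

The payoff is that the fast path degenerates to a single integer
comparison, which is what makes skipping the dict lookup worthwhile.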


>
> >
> >     I rely on your dict->ma_version to implement cache invalidation.
> >
> >     However, besides guarding against version change, I also want
> >     to guard against the dict being swapped for another dict, to
> >     avoid situations like this:
> >
> >
> >          def foo():
> >              print(bar)
> >
> >          exec(foo.__code__, {'bar': 1}, {})
> >          exec(foo.__code__, {'bar': 2}, {})
> >
> >
> >     What I propose is to add a pointer "ma_extra" (the same 64
> >     bits), which will be set to NULL for most dict instances
> >     (instead of ma_version).  "ma_extra" can then point to a
> >     struct that has a globally unique dict ID (uint64) and a
> >     version tag (uint64).  Macros like PyDict_GET_ID and
> >     PyDict_GET_VERSION could then efficiently fetch the unique
> >     ID/version of the dict for guards.
> >
> >     "ma_extra" would also make it easier for us to extend dicts
> >     in the future.
> >
> >
> > Why can't you simply use the id of the dict object as the globally
> > unique dict ID? It's already globally unique amongst all Python
> > objects, which makes it inherently unique amongst dicts.
>
> We have a freelist for dicts -- so if the dict dies, a new dict
> can be allocated in its place at the same address (giving it the
> same id()), possibly even with the same ma_version.
>

Ah, I figured it would be too simple to use something we already had.
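
That hazard is easy to demonstrate.  Reuse is a CPython implementation
detail rather than a guarantee, but on a typical build this usually
prints True:

    d = {'bar': 1}
    old_id = id(d)
    del d                    # the dict's memory goes back to the freelist
    d2 = {'bar': 2}          # a new dict, often carved from the same slot
    print(id(d2) == old_id)  # frequently True: the "unique" id was recycled

So id() is only unique amongst objects that are alive at the same time;
once an object dies, its id is up for grabs.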


>
> While the probability of such a hiccup is low, we still have
> to account for it.
>

Yep.
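
For completeness, here is a toy Python model of the proposed (unique
ID, version) pair.  TaggedDict, guard_ok, and the counter are invented
names; the real ma_extra would be two uint64 fields in a C struct
hanging off the dict:

    import itertools

    _unique_ids = itertools.count(1)  # monotonically increasing, never reused

    class TaggedDict(dict):
        """Toy analogue of ma_extra: a globally unique dict ID plus a
        version tag, both consulted by guards."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.uid = next(_unique_ids)
            self.version = 1

        def __setitem__(self, key, value):
            super().__setitem__(key, value)
            self.version += 1

    def guard_ok(d, cached_uid, cached_version):
        """Pass only if this is the *same* dict (uid matches) and it has
        not been mutated since the cache entry was written (version)."""
        return d.uid == cached_uid and d.version == cached_version

    g1 = TaggedDict(bar=1)
    snapshot = (g1.uid, g1.version)
    assert guard_ok(g1, *snapshot)
    g2 = TaggedDict(bar=2)     # a different dict: uid differs
    assert not guard_ok(g2, *snapshot)
    g1['bar'] = 3              # same dict, mutated: version differs
    assert not guard_ok(g1, *snapshot)

Unlike id(), the counter never hands out the same value twice, so a
dead dict's slot being recycled can't make a stale guard pass.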