Towards faster Python implementations - theory

Hendrik van Rooyen mail at microcorp.co.za
Thu May 10 03:54:25 EDT 2007


"John Nagle" <nagle at animats.com> wrote:


> Paul Boddie wrote:
> > On 9 May, 08:09, "Hendrik van Rooyen" <m... at microcorp.co.za> wrote:
> > 
> >>I am relatively new on this turf, and from what I have seen so far, it
> >>would not bother me at all to tie a name's type to its first use, so that
> >>the name can only be bound to objects of the same type as the type
> >>of the object that it was originally bound to.
> > 
> > 
> > But it's interesting to consider the kinds of names you could restrict
> > in this manner and what the effects would be. In Python, the only names
> > that can be considered difficult to arbitrarily modify "at a
> > distance" - in other words, from outside the same scope - are locals,
> > and even then there are things like closures and perverse
> > implementation-dependent stack hacks which can expose local namespaces
> > to modification, although any reasonable "conservative Python"
> > implementation would disallow the latter.
> 
>      Modifying "at a distance" is exactly what I'm getting at.  That's the
> killer from an optimizing compiler standpoint.  The compiler, or a
> maintenance programmer, looks at a block of code, and there doesn't seem
> to be anything unusual going on.  But, if in some other section of
> code, something does a "setattr" to mess with the first block of code,
> something unusual can be happening.  This is tough on both optimizing
> compilers and maintenance programmers.
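
To check that I follow, here is a minimal sketch of what I understand by
this. The names (Account, fee, charge, far_away_code) are made up for
illustration and do not come from any real code:

```python
class Account(object):
    # Hypothetical class, for illustration only.
    def __init__(self, balance):
        self.balance = balance

    def fee(self):
        return 1.0

    def charge(self):
        # Nothing here looks unusual: call a method, subtract the result.
        self.balance -= self.fee()
        return self.balance


def far_away_code():
    # Some unrelated section of code rebinds the method on the class at run time.
    setattr(Account, "fee", lambda self: 100.0)


acct = Account(500.0)
before = acct.charge()   # uses the fee() written in the class body
far_away_code()
after = acct.charge()    # same source line, different behaviour
```

Nothing in charge() changed, yet the second call takes off 100.0 instead
of 1.0, and neither a reader nor a compiler looking only at the class
could have known.
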
> 
>      Python has that capability mostly because it's free in an
> "everything is a dictionary" implementation.  ("When all you have
> is a hash, everything looks like a dictionary".)  But that limits
> implementation performance.  Most of the time, nobody is using
> "setattr" to mess with the internals of a function, class, or
> module from far, far away.  But the cost for that flexibility is
> being paid, unnecessarily.
> 
>      I'm suggesting that the potential for "action at a distance" somehow
> has to be made more visible.
> 
>      One option might be a class "simpleobject", from which other classes
> can inherit.  ("object" would become a subclass of "simpleobject").
> "simpleobject" classes would have the following restrictions:
> 
> - New fields and functions cannot be introduced from outside
> the class.  Every field and function name must explicitly appear
> at least once in the class definition.  Subclassing is still
> allowed.
> - Unless the class itself uses "getattr" or "setattr" on itself,
> no external code can do so.  This lets the compiler eliminate the
> object's dictionary unless the class itself needs it.
> 
> This lets the compiler see all the field names and assign them fixed slots
> in a fixed-size object representation.  Basically, this means simple objects
> have a C/C++-like internal representation, with the performance that comes
> with that representation.
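
The fixed-slot part looks a lot like what __slots__ already does for
instances. A small sketch, assuming a made-up class Point (not anything
from your proposal):

```python
class Point(object):
    # Hypothetical example class. __slots__ names every field up front,
    # so instances get fixed storage and no per-instance __dict__.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
has_dict = hasattr(p, "__dict__")   # False for a class that only uses __slots__

try:
    p.z = 3          # a new field introduced from outside the class
    rejected = False
except AttributeError:
    rejected = True
```

As far as I can tell this only covers the instances, though. The class
itself can still be changed from outside with setattr, so it is only a
part of what you describe.
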
> 
> With this, plus the "Shed Skin" restrictions, plus the array features of
> "numarray", it should be possible to get computationally intensive code
> written in Python up to C/C++ levels of performance.  Yet all the dynamic
> machinery of Python remains available if needed.
> 
> All that's necessary is not to surprise the compiler.
> 
If this is all it takes, I would even be happy to have to declare which things
could be surprising - some statement like:

x can be anything

Would that help?

It kind of inverts the logic - and states that if you want what is now
the default behaviour, you have to ask for it.
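
To show the kind of rule I have in mind, here is a rough sketch. It only
covers instance attributes, it checks at run time so it does nothing for
a compiler, and the names Strict, Reading and "anything" are my own
invention, not an existing Python feature:

```python
class Strict(object):
    # Hypothetical base class: every attribute is tied to the type of its
    # first value unless its name is listed in "anything".
    anything = ()

    def __setattr__(self, name, value):
        if name not in self.anything and name in self.__dict__:
            first_type = type(self.__dict__[name])
            if type(value) is not first_type:
                raise TypeError("%s was first bound to %s, not %s"
                                % (name, first_type.__name__,
                                   type(value).__name__))
        object.__setattr__(self, name, value)


class Reading(Strict):
    anything = ("note",)      # the "note can be anything" declaration

    def __init__(self):
        self.count = 0        # tied to int from here on
        self.note = None      # opted out, keeps today's behaviour


r = Reading()
r.count = 5                   # fine, still an int
r.note = "late"               # fine, declared as anything
r.note = 3.5                  # also fine

try:
    r.count = "five"          # a different type than the first binding
    surprised = False
except TypeError:
    surprised = True
```

Here the rebinding of count to a string is refused, while note keeps the
current anything-goes behaviour because it asked for it.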

- Hendrik



