Towards faster Python implementations - theory
Hendrik van Rooyen
mail at microcorp.co.za
Thu May 10 03:54:25 EDT 2007
"John Nagle" <nagle at animats.com> wrote:
> Paul Boddie wrote:
> > On 9 May, 08:09, "Hendrik van Rooyen" <m... at microcorp.co.za> wrote:
> >
> >>I am relatively new on this turf, and from what I have seen so far, it
> >>would not bother me at all to tie a name's type to its first use, so that
> >>the name can only be bound to objects of the same type as the type
> >>of the object that it was originally bound to.
> >
> >
> > But it's interesting to consider the kinds of names you could restrict
> > in this manner and what the effects would be. In Python, the only kind
> > of name that can be considered difficult to arbitrarily modify "at a
> > distance" - in other words, from outside the same scope - is locals,
> > and even then there are things like closures and perverse
> > implementation-dependent stack hacks which can expose local namespaces
> > to modification, although any reasonable "conservative Python"
> > implementation would disallow the latter.
>
> Modifying "at a distance" is exactly what I'm getting at. That's the
> killer from an optimizing compiler standpoint. The compiler, or a
> maintenance programmer, looks at a block of code, and there doesn't seem
> to be anything unusual going on. But, if in some other section of
> code, something does a "setattr" to mess with the first block of code,
> something unusual can be happening. This is tough on both optimizing
> compilers and maintenance programmers.
>
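To make the "setattr" point concrete, here is a minimal sketch. The class Account and the function unrelated_code are hypothetical names invented for illustration; they do not come from the thread. Nothing in withdraw() hints that fee() can change, yet a rebinding made elsewhere alters what it computes.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def fee(self):
        return 1

    def withdraw(self, amount):
        # Reads like plain arithmetic with a constant fee.
        self.balance -= amount + self.fee()
        return self.balance


def unrelated_code():
    # Somewhere far away: rebind the method on the class itself.
    setattr(Account, "fee", lambda self: 100)


a = Account(1000)
before = a.withdraw(10)   # fee() is the original method here
unrelated_code()
after = a.withdraw(10)    # same call, but fee() has been replaced
```

Because the lookup of fee happens at call time, a compiler that wants to inline it has to prove that no code anywhere performs such a rebinding.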
> Python has that capability mostly because it's free in an
> "everything is a dictionary" implementation. ("When all you have
> is a hash, everything looks like a dictionary".) But that limits
> implementation performance. Most of the time, nobody is using
> "setattr" to mess with the internals of a function, class, or
> module from far, far away. But the cost for that flexibility is
> being paid, unnecessarily.
>
> I'm suggesting that the potential for "action at a distance" somehow
> has to be made more visible.
>
> One option might be a class "simpleobject", from which other classes
> can inherit. ("object" would become a subclass of "simpleobject").
> "simpleobject" classes would have the following restrictions:
>
> - New fields and functions cannot be introduced from outside
> the class. Every field and function name must explicitly appear
> at least once in the class definition. Subclassing is still
> allowed.
> - Unless the class itself uses "getattr" or "setattr" on itself,
> no external code can do so. This lets the compiler eliminate the
> object's dictionary unless the class itself needs it.
>
> This lets the compiler see all the field names and assign them fixed slots
> in a fixed-size object representation. Basically, this means simple objects
> have a C/C++ like internal representation, with the performance that comes
> with that representation.
>
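Python's existing __slots__ mechanism already gives a rough approximation of the fixed-layout part of this idea, so a small sketch may help. This is not the proposed "simpleobject"; the class Point is a hypothetical example. It only covers instance fields: methods can still be added to the class from outside, and external setattr on a declared field still works.

```python
class Point:
    # Declaring __slots__ fixes the set of instance fields and
    # suppresses the per-instance __dict__.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Introducing a new field from outside the class is rejected.
try:
    p.z = 3
    added = True
except AttributeError:
    added = False

has_dict = hasattr(p, "__dict__")

# Rebinding a declared field from outside is still permitted.
setattr(p, "x", 10)
```

The stricter rule quoted above would also forbid that last setattr unless the class itself used it.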
> With this, plus the "Shed Skin" restrictions, plus the array features of
> "numarray", it should be possible to get computationally intensive code
> written in Python up to C/C++ levels of performance. Yet all the dynamic
> machinery of Python remains available if needed.
>
> All that's necessary is not to surprise the compiler.
>
If this is all it takes, I would even be happy to have to declare which things
could be surprising - some statement like:

    x can be anything

Would that help?
It kind of inverts the logic - and states that if you want what is now
the default behaviour, you have to ask for it.
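As a rough run-time sketch of that inversion, assuming a hypothetical helper class TypedNamespace (not an existing Python feature, and only a check at assignment time rather than anything a compiler could exploit): every name is tied to the type of its first binding unless it was declared up front as one that "can be anything".

```python
class TypedNamespace:
    """Hypothetical helper: a name keeps the type of its first binding."""

    def __init__(self, anything=()):
        # Names declared "can be anything" are exempt from the check.
        object.__setattr__(self, "_anything", frozenset(anything))
        object.__setattr__(self, "_types", {})

    def __setattr__(self, name, value):
        if name not in self._anything:
            # Record the type on first binding, then insist on it.
            first = self._types.setdefault(name, type(value))
            if type(value) is not first:
                raise TypeError(
                    "%s is tied to %s, not %s"
                    % (name, first.__name__, type(value).__name__)
                )
        object.__setattr__(self, name, value)


ns = TypedNamespace(anything=["x"])
ns.x = 1
ns.x = "now a string"    # allowed: x was declared "anything"
ns.count = 0
ns.count = 5             # allowed: still an int

try:
    ns.count = "five"    # rejected: count was first bound to an int
    rejected = False
except TypeError:
    rejected = True
```

The dynamic behaviour is still there, but only for the names that asked for it.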
- Hendrik