Shed Skin Python-to-C++ Compiler 0.0.21, Help needed

John Nagle nagle at animats.com
Sun Apr 1 16:34:49 EDT 2007


Kay Schluehr wrote:
> On Apr 1, 6:07 pm, John Nagle <n... at animats.com> wrote:
> 
>>Kay Schluehr wrote:
>>
>>>Indeed. The only serious problem from an acceptance point of view is
>>>that Mark tried to solve the more difficult problem first and got hung
>>>up on it. Instead of integrating a translator/compiler early with
>>>CPython, factoring Python module code into compilable and
>>>interpretable functions (which can be quite rudimentary at first)
>>>together with some automatically generated glue code, and *always
>>>having a running system* with monotone benefit for all Python code,
>>>he seemed to tackle an impossible task, namely translating the whole
>>>of Python to C++, and therefore created a "lesser Python".
>>
>>    Trying to incrementally convert an old interpreter into a compiler
>>is probably not going to work.
> 
> 
> I'm talking about something that is not very different from what Psyco
> does, but Psyco works at runtime and makes continuous measurements for
> deciding whether it can compile some bytecodes just-in-time or let the
> interpreter perform their execution.

    That can work.  That's how the Tamarin JIT compiler runs JavaScript
inside Mozilla.  The second time something is executed interpretively,
it's compiled.  It's a tiny JIT engine, too; it's inside the Flash
player, and it runs both JavaScript and ActionScript programs.
It might be able to run Python, with some work.

> A factorization always follows a certain pattern that preserves the
> general form and creates a specialization:
> 
> def func(x,y):
>     # algorithm
> 
> ====>
> 
> from native import func_int_int
> 
> def func(x,y):
>     if isinstance(x, int) and isinstance(y, int):
>        return func_int_int(x,y)  # wrapper of natively compiled specialized function
>     else:
>        # perform original unmodified algorithm on bytecode interpreter

     You can probably offload that decision onto the linker by creating
specializations with different type signatures and letting the C++
name resolution process throw out the ones that aren't needed at
link time.
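The quoted factorization pattern can be sketched as runnable Python.  This
is only an illustration of the dispatch idea: `func_int_int` here is an
ordinary Python function standing in for a wrapper around natively
compiled code, and the body is a made-up example algorithm, not anything
from Shed Skin or Psyco.

```python
# Sketch of the specialization pattern from the quoted post.
# In a real system, func_int_int would wrap natively compiled code;
# here a plain Python function stands in for it.

def func_int_int(x, y):
    # hypothetical compiled fast path for the (int, int) signature
    return x * x + y

def func_generic(x, y):
    # original algorithm, run by the bytecode interpreter
    return x * x + y

def func(x, y):
    # dispatch: use the specialized version when the types match,
    # fall back to the generic one otherwise
    if isinstance(x, int) and isinstance(y, int):
        return func_int_int(x, y)
    return func_generic(x, y)

print(func(2, 3))      # -> 7, via the "compiled" path
print(func(2.5, 3.0))  # -> 9.25, via the generic path
```

The point of the pattern is that the generic path is never removed, so
every program keeps running while more and more signatures get compiled.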

>>>Otherwise it
>>>wouldn't be a big deal to do what is necessary here and even extend
>>>the system with perspective on Py3K annotations or other means to ship
>>>typed Python code into the compiler.
>>
>>     Shed Skin may be demonstrating that "annotations" are unnecessary
>>cruft and need not be added to Python.  Automatic type inference
>>may be sufficient to get good performance.
> 
> 
> You still dream of this, don't you? Type inference in dynamic languages
> doesn't scale. It didn't scale in twenty years of research on
> Smalltalk and it doesn't in Python.

    I'll have to ask some of the Smalltalk people from the PARC era
about that one.

> However there is no no-go theorem
> that prevents ambitious newbies to type theory from wasting their time
> and efforts.

    Type inference analysis of Python indicates that types really don't
change all that much.  See

http://www.python.org/workshops/2000-01/proceedings/papers/aycock/aycock.html

Only a small percentage of Python variables ever experience a type change.
So type inference can work well on real Python code.
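The underlying observation is easy to check mechanically.  Here is a toy
sketch of my own (not Aycock's tool, and only handling direct literal
assignments) that walks a function's source with the `ast` module and
reports variables whose assigned literal type ever changes:

```python
import ast

# Toy monomorphism check: record the type of every literal assigned to
# each variable and report variables whose assigned type changes.
# Purely illustrative; a real inferencer propagates types through
# expressions, calls, and control flow.

SOURCE = """
n = 0
n = 5            # still an int
name = "shedskin"
flag = True
flag = "oops"    # the one variable whose type changes
"""

def literal_type(node):
    # Only direct literal assignments are handled; anything else is unknown.
    if isinstance(node, ast.Constant):
        return type(node.value).__name__
    return None

def changed_variables(source):
    seen = {}        # variable name -> first observed literal type
    changed = set()  # variables assigned literals of two different types
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            t = literal_type(node.value)
            if t is None:
                continue
            name = node.targets[0].id
            if name in seen and seen[name] != t:
                changed.add(name)
            seen.setdefault(name, t)
    return changed

print(changed_variables(SOURCE))  # -> {'flag'}
```

On real code, runs of this kind of analysis find that the `changed` set
stays small, which is what makes whole-program inference tractable.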

    The PyPy developers don't see type annotations as a win.  See Carl
Friedrich Bolz's comments in

http://www.velocityreviews.com/forums/t494368-p3-pypy-10-jit-compilers-for-free-and-more.html

where he writes:

"Also, I fail to see how type annotations can have a huge speed-advantage
versus what our JIT and Psyco are doing."

>>The Py3K annotation model is to some extent a repeat of the old
>>Visual Basic model.  Visual Basic started as an interpreter with one
>>default type, which is now called Variant, and later added the usual types,
>>Integer, String, Boolean, etc., which were then manually declared.
>>That's where Py3K is going.
>
> This has nothing to do with VB and it has not even much to do
> with what existed before in language design.

    Type annotations, advisory or otherwise, aren't novel.  They
were tried in some LISP variants.  Take a look at this
experimental work on Self, too.

      http://www.cs.ucla.edu/~palsberg/paper/spe95.pdf

    Visual Basic started out more or less declaration-free, and
gradually backed into having declarations.  VB kept a "Variant"
type, which can hold anything and serves as the implicit type.
Stripped of the Python jargon, that's what's proposed for Py3K.
Just because it has a new name doesn't mean it's new.
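For reference, the annotation syntax accepted for Py3K (PEP 3107) is
purely advisory: annotations are stored on the function object and
nothing in the interpreter enforces them, so an unannotated parameter
behaves exactly like the implicit "anything" type.

```python
# PEP 3107 annotations: metadata attached to a function, not type checks.

def func(x: int, y: int) -> int:
    # nothing checks these annotations at runtime
    return x + y

# annotations are just stored in a dict on the function object
print(func.__annotations__)  # -> {'x': <class 'int'>, 'y': <class 'int'>, 'return': <class 'int'>}

# no type error: the annotations do not constrain the call
print(func("a", "b"))        # -> ab
```

Whether that advisory information ever feeds a compiler is left entirely
to third-party tools, which is exactly the question under debate here.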

    It's common for languages to start out untyped and "simple",
then slowly become more typed as the limits of the untyped
model are reached.

    Another thing that can go wrong with a language: if you get too hung
up on providing ultimate flexibility in the type and object system,
too much of the language design and machinery is devoted to features
that are very seldom used.  C++ took that wrong turn a few years ago,
when the language designers became carried away with their template
mechanism, to the exclusion of fixing the real problems that drive their
user base to Java or C#.

    Python, the language, is in good shape.  It's the limitations
of the CPython implementation that are holding it back.  It looks
like at least two projects are on track to go beyond the
limitations of that implementation.  This is good.

				John Nagle



More information about the Python-list mailing list