[summerofcode] High-level python optimizer.

Vova physics at list.ru
Fri Jun 10 16:29:56 CEST 2005


On Friday 10 June 2005 17:39, Mark Dufour wrote:
> Hi Vova,
>
> You're right, conversion to C++ cannot work for many Python programs.
> But how many programs are written in C++, as opposed to Python?
But how many C++ programmers do not like syntax of C++ and wants something 
different ? Python and C++ are different. Using python-to-c++ compiler one is 
allowed only to use the intersection of python and c++ features.

> -^ My 
> aim is to be able to use Python as a 'high-level C++', so that you
> both get a nice syntax, as well as very efficient code (of course
> programmers need to realize what kind of compiler they are dealing
> with, as always..)
>
> I am just worried that if you plan on doing a _global_ dataflow
> analysis, it will be difficult to remain precise, when all sort of
> dynamic things happen in the code.
What do you mean by "precise" ? Of course, maybe it will not guess all 
possible information about program, but it will never make incorrect 
assumptions by design. It will never use proof-by-contradiction when full 
information is unavailable.

> I'm not sure, but I think the 
> people of PyPy are already working heavily on the kinds of 
> optimizations as you are talking about, but applying it during
> run-time
Some kinds of this optimization simply can't be make at run-time. Moreover 
making optimization at runtime takes execution time too.
My approach is different.

> , so they can be somewhat precise for dynamic code - ie work 
> for any Python program (Armin? :)) I guess it would be cool if you
> could join their efforts, maybe by submitting a proposal for
> implementing a certain feature?
Cooperation is good. Implementing my project I want to learn more about the 
subject and test some of my assumptions. Of course I will be glad if some of 
my results will be used in PyPy or some other projects.
Anyway thanks for link.

> As for my type inference code, it will probably not be very stable for
> at least a few more months. Of course, instead of generating C++ code,
> it could be used for other purposes.
I think discussion is more important than code exchange at least for now while 
I just have no ready code to share.

> Hm, if you would like to get a 
> 'good' overview of the theoretical work done on type inference for
> dynamic languages, you might check out my survey at
> http://kascade.org/optimizing_python.pdf.
Thanks. It is quite informal.

> I spent a lot of time going 
> through the literature..


-- 
            Best regards,
              Vladimir

>
> Mark.
>
> On 6/10/05, Vova <physics at list.ru> wrote:
> > Hello !
> > Again, just to clarify. My program will do global analysis of the source
> > building data-flow graph. This graph makes possible to predict types and
> > values of some variables and check functions for side-effects (depending
> > on argument types). Using this information optimizer can make some good
> > optimizations. Being well implemented it gives the following guarantees:
> > 1. This will work for ANY python program.
> > 2. Optimized program will always work the same as the original program.
> > Optimizer will touch only parts that it can optimize. By supplying
> > optimizer with some user-created information results can be improved, but
> > this information is not required.
> >
> > The aims of this optimizer is to:
> > 1. Allow programmers not to sacrifice readability and architecture of the
> > code in favor of performance.
> > 2. Allow programmers to use some nice language features (functional
> > programming, for example) not thinking that this features are
> > inefficient.. 3. Allow improving old programs using new language features
> > (for example using generators instead of lists).
> > 4. Discover more about program-transformation algorithms.
> >
> > Compiling code to C++ does not give the same guarantees. It will work
> > only for static programs, only for subset of python. How many real-world
> > programs are in this subset ? But of course seep-up of such compilation
> > will be greater. I think the aim of such compiler is to allow using
> > static subset of python syntax to write fast programs (am I wrong?).
> >
> > Our projects is just different ones, however some correlations of course
> > present. May be we can even help each other in overlapping areas.
> >
> > --
> >           Best regards,
> >             Vladimir
> >
> > On Friday 10 June 2005 13:44, Mark Dufour wrote:
> > > Hi Vova,
> > >
> > > It's very hard to prove anything about a program without a precise
> > > type analysis. As Armin mentioned, implementing this alone, for
> > > Python, represents quite an effort. But it is the most important thing
> > > for optimizing pure Python: you can't specify types in Python (without
> > > messing up your source code, or using Pyrex), but you can optimize
> > > other things by hand. In my case, it's even more important, because
> > > conversion to C++ essentially gives you most of the optimizations you
> > > mention for free.  A really high-level optimization that remains in
> > > this case is for example trying to allocate things on the stack, or
> > > even statically preallocate them (because you do this manually in
> > > C++). Other examples of high-level optimizations imo are for example
> > > optimization of bounds checking, optimizing memory layout in many
> > > ways, and removing 'intermediate lists' as in functional language
> > > compilers.
> > >
> > > But, I don't think a global approach can work well for even moderately
> > > dynamic programs, as it's very hard to determine precisely what is
> > > going on. (Consider just the use of reflection - it's formally
> > > undecidable what kind of attribute names are involved, even without
> > > user input.) A global approach then, can imo best consider a subset of
> > > Python, and translate code into a language such as C++ (as it operates
> > > at the same abstraction level as Python, and you get many
> > > optimizations for free.) I really like the elegance of such a
> > > solution: you can reuse lots of work done in for example GCC, and you
> > > get the ultimate in speed for relatively 'static' programs - which
> > > many people, or at least myself, mostly write. I love the syntax of
> > > Python, but I don't really use for example dynamic typing for most of
> > > the time (it's just convenient to be able to leave out type
> > > declarations :D)
> > >
> > > On the other hand, a non-global approach I think cannot do many
> > > high-level optimizations, as it doesn't have enough information or
> > > (when done during run-time) does not have the time to calculate the
> > > right information. A non-global, run-time approach does seem like the
> > > best long-term solution. With Moore's law and all, it can optimize
> > > most code to an acceptable level, and still be able to work with the
> > > full Python semantics.
> > >
> > > So, I am left wondering a bit about the approach you would like to
> > > take. If you take the global route, then why not convert to C++. But
> > > that would be sort of the same thing as Starkiller and Shedskin (my
> > > compiler) aim to do. If you take the non-global route, then real
> > > high-level optimizations seem out of the question, or at least very
> > > complicated? Maybe it would be best to look into the things Armin Rigo
> > > and the others are doing with PyPy, perhaps find something you can
> > > contribute there?
> > >
> > > Finally, I have no idea whether the source code for Starkiller is
> > > available, or if so, if it's readable. The website seems gone, and I
> > > don't know what happened. Well, I know my source code is available,
> > > but not very readable -^ (just mail me, and I'll send you this 6000
> > > line can of worms) - I'll be probably still modifying it heavily for
> > > the foreseeable future..
> > >
> > >
> > > Mark.


More information about the summerofcode mailing list