From tjreedy at udel.edu  Fri Sep 24 21:06:19 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 24 Sep 2004 15:06:19 -0400
Subject: [pypy-dev] List dormant?
Message-ID: <cj1r7e$t4m$1@sea.gmane.org>

Accessing the PyPy list via gmane, I have not seen anything for at least a 
couple of weeks.
 Is this a reflection of true dormancy?
 (If so, I will presume it is just temporary.)

Terry J. Reedy


From hpk at trillke.net  Fri Sep 24 21:25:17 2004
From: hpk at trillke.net (holger krekel)
Date: Fri, 24 Sep 2004 21:25:17 +0200
Subject: [pypy-dev] List dormant?
In-Reply-To: <cj1r7e$t4m$1@sea.gmane.org>
References: <cj1r7e$t4m$1@sea.gmane.org>
Message-ID: <20040924192517.GI19356@solar.trillke.net>

Hi Terry, 

[Terry Reedy Fri, Sep 24, 2004 at 03:06:19PM -0400]
> Accessing the PyPy list via gmane, I have not seen anything for at least a 
> couple of weeks.
>  Is this a reflection of true dormancy?
>  (If so, I will presume it is just temporary.)

Indeed, the last pypy-dev mails are from end of August as far
as i see.  There have been some commits and off-list
communication going on, though. Also, we had some discussions
on our pypy-funding list regarding EU funding. It seems
possible now that PyPy will get funded starting from 1st of
November which would clearly allow many of us to work more on
pypy. 

However, Armin has been making quite some progress on 
a direct C-backend (not based on our Pyrex-approach) 
lately which reflects on the pypy-svn list. I am sure
he is happy to answer questions or talk about the state
of how this is going if someone asks him.  Hum, hey 
Armin, what is the current state of the C backend? :-) 

Moreover, it's not unlikely that the next PyPy sprint will
take place in Vilnius, Lithuania, with the help of the POV
("Programmers of Vilnius") people.  The projected date 
is around 15th of November till 21st of November 2004. 

Btw, we are always very interested in places and possibilities
to do coding sprints (not necessarily in Europe!).  Most of
the current code base has been developed in sprints so far and
they are fun events where one learns a lot. 

When all goes well i hope that we will have monthly update
reports and I guess the key point will be to really put out
releases for people to play with.  

cheers, 

    holger


From lac at strakt.com  Sat Sep 25 08:47:52 2004
From: lac at strakt.com (Laura Creighton)
Date: Sat, 25 Sep 2004 08:47:52 +0200
Subject: [pypy-dev] List dormant? 
In-Reply-To: Message from "Terry Reedy" <tjreedy@udel.edu> 
	of "Fri, 24 Sep 2004 15:06:19 EDT." <cj1r7e$t4m$1@sea.gmane.org> 
References: <cj1r7e$t4m$1@sea.gmane.org> 
Message-ID: <200409250647.i8P6lqEr028976@ratthing-b246.strakt.com>

In a message of Fri, 24 Sep 2004 15:06:19 EDT, "Terry Reedy" writes:
>Accessing the PyPy list via gmane, I have not seen anything for at least a
>couple of weeks.
> Is this a reflection of true dormancy?
> (If so, I will presume it is just temporary.)
>
>Terry J. Reedy
>

The action has been in pypy-funding, pypy-sprint and in pypy-checkins.
Our Dublin Sprint fell through.  But we are organising another one.  Want
to come to Vilnius?

Laura


From arigo at tunes.org  Sat Sep 25 12:57:53 2004
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 25 Sep 2004 11:57:53 +0100
Subject: [pypy-dev] List dormant?
In-Reply-To: <20040924192517.GI19356@solar.trillke.net>
References: <cj1r7e$t4m$1@sea.gmane.org>
	<20040924192517.GI19356@solar.trillke.net>
Message-ID: <20040925105753.GA30020@vicky.ecs.soton.ac.uk>

Hi Terry, hi Holger, hi everyone else,

On Fri, Sep 24, 2004 at 09:25:17PM +0200, holger krekel wrote:
> he is happy to answer questions or talk about the state
> of how this is going if someone asks him.  Hum, hey 
> Armin, what is the current state of the C backend? :-) 

Hum, it is a piece of experimental code that has grown quite large.  Here is a
large e-mail to explain it and why I'd rather like it to be smaller.

It produces a C extension module.  The functions' bodies are, literally, basic
blocks in the C sense, with labels, and jumps to each other, following
directly the control flow graph's structure.  Additionally, the end of each
function contains code that decrefs the variables that need to be decrefed in
case of error; each operation that can fail will, in case of failure, jump to
some label there.

The kinds of objects supported are enumerated in genc_repr.py: ints, generic
PyObjects, tuples, classes, instances, lists, function pointers, method
pointers.  It all works reasonably well.  For example, instances can have both
instance and (read-only) class attributes.  Methods are just a particular case
of class attributes, as in regular Python.  (Class attributes are difficult to
do in Pyrex; it was one of the motivations to switch to C.)

But at the same time I am not fully satisfied with the C backend.  I have both
small and big concerns.

The biggest concern is about its overall structure.  It is really like a
classical compiler, taking graph blocks, analysing the operations inside,
using the annotations computed previously as a guide; then it produces
lower-level operations with explicit conversions, and finally a separate pass
turns these into real C code.

Here is where all this occurs:

* typer.py          turns the block's SpaceOperations into low-level ops
* genc_typeset.py   defines which low-level ops exist
* genc_op.py        classes to write the C code for each low-level op
* genc_repr.py      maps block's Variables to C types
* genc.py           calls all other modules; writes the C module
* classtyper.py     maps user-defined classes to structs

Additionally, genc.h is inserted verbatim at the beginning of the C module; it
contains macros to do the common operations.  These macro definitions are also
parsed (!) by genc_typeset.py, so that typer.py can know about their
existence.

In some sense it is fragile: it depends on the annotations being generated
exactly as expected.  The (earlier) annotation phase analyses the
SpaceOperations and somehow promises that there is a way to do the given
operation and guarantee that it produces some results; e.g. an 'add' when
applied to two SomeInteger()s gives another SomeInteger().  But then
genc_typeset.py or genc.h must make sure that it actually implements an
operation with that signature.  There is some duplication there.

For example the backend currently expects that a list object is never
converted, i.e. if an object is created as a "list of ints" then it will
remain a "list of ints" for its whole life.  The annotations currently
produced have this property, but it's kind of accidental.  For the C backend
itself it is an implicit external assumption.  We could devise by hand
annotations that crash the C backend although they are a priori reasonable.

Well, another concern is that typer.py is quite confusing and was difficult to
get right.  I'm not too sure that the code that generates the Py_DECREF()
after an error will decref all the correct variables in all cases.

Something I don't even dare speak about too much is the way a Variable in the
flow graphs maps to potentially a list of unrelated C variables (or fields in
a struct).  For example, a Variable annotated as a SomeTuple() of two integers
will become two distinct C variables.  And Variables that are sufficiently
constant become zero C variables!  This is a nice idea but it makes typer.py,
genc_typeset.py and genc_op.py all the more obscure.

Of course, I'm always thinking about reorganizing it all...  The latest such
idea was inspired by Seo who, for the Lisp backend, put in transform.py some
code that actually modifies the control flow graph before it is passed to the
backend.  He replaces the 'newlist' and 'mul' operations corresponding to an
expression '[a] * b' with a single custom operation, 'alloc_and_set'.  After a
lot of unsuccessful efforts in implementing 'list += list' in the C backend I
remembered this idea and now transform.py turns a list-based 'inplace_add'
into a whole new bunch of control flow blocks:

#    a = inplace_add(b, c)
# becomes the following graph:
#
#  clen = len(c)
#  growlist(b, clen)     # ensure there is enough space for clen new items
#        |
#        |  (pass all variables to next block, plus i=0)
#        V
#  ,--> z = lt(i, clen)
#  |    exitswitch(z):
#  |     |          |        False
#  |     | True     `------------------>  ...sequel...
#  |     V
#  |    x = getitem(c, i)
#  |    fastappend(b, x)
#  |    i1 = add(i, 1)
#  |     |
#  `-----'  (pass all variables, with i=i1)

So now the backend only has to worry about the two simple operations
'growlist' and 'fastappend'.  The latter is an append where we can assume that
there is already enough preallocated space.
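
The loop built by the graph transformation can be read as ordinary Python.
What follows is only a minimal sketch, assuming a toy RList class with explicit
spare capacity.  RList and the bodies of the helpers are hypothetical and are
not taken from the PyPy sources; only the operation names 'growlist' and
'fastappend' come from the description above.

```python
class RList:
    """Toy list with explicit spare capacity (hypothetical, for illustration)."""

    def __init__(self, items=()):
        self.items = list(items)        # backing storage, may hold unused slots
        self.length = len(self.items)   # number of slots actually in use


def growlist(lst, extra):
    """Ensure the backing storage has room for `extra` more items."""
    needed = lst.length + extra
    if needed > len(lst.items):
        lst.items.extend([None] * (needed - len(lst.items)))


def fastappend(lst, x):
    """Append, assuming growlist() has already reserved the space."""
    lst.items[lst.length] = x
    lst.length += 1


def inplace_add(b, c):
    """Mirror of the rewritten graph for 'a = inplace_add(b, c)'."""
    clen = c.length
    growlist(b, clen)
    i = 0
    while i < clen:              # z = lt(i, clen); exitswitch(z)
        x = c.items[i]           # x = getitem(c, i)
        fastappend(b, x)
        i = i + 1                # i1 = add(i, 1)
    return b                     # ...sequel...
```

The backend in this sketch never sees a list concatenation, only a length
lookup, one reservation and a loop of cheap appends.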

The graph transformation code itself is quite verbose, but with more developed
utility routines it could be made simpler.  The point is that it looks like a
good idea to perform as many optimizations and transformations on the flow
graph itself before passing it to the backend.  (With hindsight it's obvious.)  
So I'm now thinking about how more of the typer.py mess could be moved there.

One extreme idea would be to say that the flow graph should be transformed
much more, step by step, until it eventually contains only operations that
have an obvious direct C equivalent.  This would make the C backend much
simpler again (and also make simple non-C backends fun and quick to
implement).  This would include typing the operations: the 'add' would be
reserved for 'add two PyObject*'; another 'add_i' would add two integers.  
Conversion operations would be inserted in the flow graph as needed.

So typing would be more tightly coupled with the annotation phase, which I
think is a good idea.  Essentially, the same code that says that adding two
SomeIntegers() produces a SomeInteger() would say that to do so the correct
operation is 'add_i'.  And the code that says that by default 'add' produces a
SomeObject() would say that this requires the two inputs to be converted to
PyObject*.
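
The coupling between annotation and typing can be sketched as a single lookup
table.  This is a hedged illustration only: the table layout, the function
type_operation() and the way conversions are reported are assumptions made for
the example, not existing PyPy code.  The names SomeInteger, SomeObject, 'add'
and 'add_i' are the ones used in the message.

```python
class SomeObject:
    """Annotation for 'any Python object' (a PyObject* at the C level)."""


class SomeInteger(SomeObject):
    """Annotation for a plain integer."""


# (operation, argument annotation classes) -> (typed operation, result class).
# The same entry answers both questions: what the result annotation is,
# and which low-level operation implements it.
TYPED_OPS = {
    ("add", (SomeInteger, SomeInteger)): ("add_i", SomeInteger),
}


def type_operation(opname, arg_annotations):
    """Return (operation to emit, indexes of args to convert, result annotation)."""
    key = (opname, tuple(type(a) for a in arg_annotations))
    if key in TYPED_OPS:
        typed_op, result_cls = TYPED_OPS[key]
        return typed_op, [], result_cls()
    # Default case: keep the generic operation, which works on PyObject*,
    # and record which arguments need a conversion to PyObject* first.
    to_convert = [i for i, a in enumerate(arg_annotations)
                  if type(a) is not SomeObject]
    return opname, to_convert, SomeObject()
```

For example, type_operation("add", [SomeInteger(), SomeObject()]) keeps the
generic 'add' and reports that argument 0 must be converted.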


That's all for now...


A bientot,

Armin


From lac at strakt.com  Sat Sep 25 18:09:48 2004
From: lac at strakt.com (Laura Creighton)
Date: Sat, 25 Sep 2004 18:09:48 +0200
Subject: [pypy-dev] after discussion with Armin about RPython
Message-ID: <200409251609.i8PG9m2b030018@ratthing-b246.strakt.com>

which cleared up some misconceptions I had,  I got to wonder.

We started out deciding to define RPython.  That bogged down. So we
decided to make it be 'the minimal set of Python that we need for things
to work' defined as 'what we have when we are done writing the translation layer'.
This _is_ the pragmatic approach.  But I now wonder if we might benefit from
trying to define it more formally again.  It might give us  (me at any rate)
a better idea of exactly what obvious direct C equivalents we need.
But perhaps the rest of you can already see this without any formal definition ....

Just a thought,
Laura


From arigo at tunes.org  Sat Sep 25 20:48:19 2004
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 25 Sep 2004 19:48:19 +0100
Subject: [pypy-dev] after discussion with Armin about RPython
In-Reply-To: <200409251609.i8PG9m2b030018@ratthing-b246.strakt.com>
References: <200409251609.i8PG9m2b030018@ratthing-b246.strakt.com>
Message-ID: <20040925184819.GA29258@vicky.ecs.soton.ac.uk>

Hi Laura,

On Sat, Sep 25, 2004 at 06:09:48PM +0200, Laura Creighton wrote:
> trying to define it more formally again.  It might give us  (me at any rate)
> a better idea of exactly what obvious direct C equivalents we need.

The guidelines in svn/pypy/trunk/doc/objspace/restrictedpy.txt are still
almost up-to-date.  (I just mentioned dictionaries, which we decided to allow
with string keys during the last sprint.)  We might bite the bullet and write
down a more formally complete and less hand-wavy definition, though.


Armin


From tjreedy at udel.edu  Sun Sep 26 01:51:10 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 25 Sep 2004 19:51:10 -0400
Subject: [pypy-dev] Re: List dormant?
References: <cj1r7e$t4m$1@sea.gmane.org>
	<200409250647.i8P6lqEr028976@ratthing-b246.strakt.com>
Message-ID: <cj509h$qks$1@sea.gmane.org>


"Laura Creighton" <lac at strakt.com> wrote in message
> Want to come to Vilnius?

Someday, say within the next 5 years.
If you ever want to do a sprint in Delaware USA, let me know.

Terry 


From lac at strakt.com  Sun Sep 26 07:59:20 2004
From: lac at strakt.com (Laura Creighton)
Date: Sun, 26 Sep 2004 07:59:20 +0200
Subject: [pypy-dev] http://projects.edgewall.com/qunittest/
Message-ID: <200409260559.i8Q5xKGZ031793@ratthing-b246.strakt.com>

I wonder how hard that would be to integrate with our unittest framework?

Laura


From lac at strakt.com  Sun Sep 26 08:55:58 2004
From: lac at strakt.com (Laura Creighton)
Date: Sun, 26 Sep 2004 08:55:58 +0200
Subject: [pypy-dev] Re: List dormant? 
In-Reply-To: Message from "Terry Reedy" <tjreedy@udel.edu> 
	of "Sat, 25 Sep 2004 19:51:10 EDT." <cj509h$qks$1@sea.gmane.org> 
References: <cj1r7e$t4m$1@sea.gmane.org>
	<200409250647.i8P6lqEr028976@ratthing-b246.strakt.com>
	<cj509h$qks$1@sea.gmane.org> 
Message-ID: <200409260655.i8Q6twLh031985@ratthing-b246.strakt.com>

In a message of Sat, 25 Sep 2004 19:51:10 EDT, "Terry Reedy" writes:
>
>"Laura Creighton" <lac at strakt.com> wrote in message
>> Want to come to Vilnius?
>
>Someday, say within the next 5 years.
>If you ever want to do a sprint in Delaware USA, let me know.
>
>Terry 

That sounds like fun.  How much lead-time for set-up do you need?

Laura


From ianb at colorstudy.com  Sun Sep 26 09:42:43 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 26 Sep 2004 02:42:43 -0500
Subject: [pypy-dev] utest, development and discussion
Message-ID: <415672F3.9060808@colorstudy.com>

I'm interested in using utest for a project of mine, where the tests 
have gotten a bit out of control -- utest won't control them, but as 
long as I'm revisiting everything, I figured I might move to a test 
system I liked.  Also, I want to make adding tests more accessible for 
other contributors.

First question: I'm not being dumb if I convert all my tests to utest, 
am I?  Not that utest is really a framework like unittest... but if I 
spend lots of time fiddling with test code, it would be a shame if utest 
went into disrepair, or was rewritten in a radically different way. 
Maybe that's not too big an issue, because utest doesn't really have an 
API, but since utest isn't used much outside of pypy (that I know of) I 
worry.

Anyway, I've started working with it some.  I've added one small feature 
(dropping into pdb when an exception occurs), and there's sure to be 
some more, particularly documentation.  Where should I send patches? 
Where should discussion occur?  And maybe a website?

Thanks.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org


From hpk at trillke.net  Sun Sep 26 09:50:05 2004
From: hpk at trillke.net (holger krekel)
Date: Sun, 26 Sep 2004 09:50:05 +0200
Subject: [pypy-dev] http://projects.edgewall.com/qunittest/
In-Reply-To: <200409260559.i8Q5xKGZ031793@ratthing-b246.strakt.com>
References: <200409260559.i8Q5xKGZ031793@ratthing-b246.strakt.com>
Message-ID: <20040926075005.GY19356@solar.trillke.net>

Hi Laura, 

[Laura Creighton Sun, Sep 26, 2004 at 07:59:20AM +0200]
> I wonder how hard that would be to integrate with our unittest framework?

Look into the source and tell us :-) 

Judging from looking at 'trac' which is also hosted at edgewall i guess
it shouldn't be hard.  

Btw, the 'std' stuff [*] i showed at EuroPython which contains
the unittest framework is soon to be renamed/refactored to the
root name 'py' and the testing part will be 'py.test'.  When
i'll get to the point of integrating all this into pypy then
i'll ask you for help with the nice renaming tool you have
written ... 

And while we are at it, does anyone have real experience with
"bicycle repair man"? I would like to try it for a
renaming/refactoring session and am interested in any
experiences. A refactoring tool supporting refactoring/renaming
would probably be of help to PyPy. 

cheers, 

    holger


[*] http://codespeak.net/svn/user/hpk/talks/std-talk.txt


From lac at strakt.com  Sun Sep 26 10:18:01 2004
From: lac at strakt.com (Laura Creighton)
Date: Sun, 26 Sep 2004 10:18:01 +0200
Subject: [pypy-dev] bug in
	http://codespeak.net/moin/pypy/moin.cgi/FrontPage?action=show
Message-ID: <200409260818.i8Q8I1mD032184@ratthing-b246.strakt.com>

If you click on the 'documentation' link you get a traceback!  ooops.  I will
look at this later unless somebody beats me to it.

Laura


From lac at strakt.com  Sun Sep 26 10:21:52 2004
From: lac at strakt.com (Laura Creighton)
Date: Sun, 26 Sep 2004 10:21:52 +0200
Subject: [pypy-dev] utest, development and discussion 
In-Reply-To: Message from Ian Bicking <ianb@colorstudy.com> 
	of "Sun, 26 Sep 2004 02:42:43 CDT." <415672F3.9060808@colorstudy.com> 
References: <415672F3.9060808@colorstudy.com> 
Message-ID: <200409260821.i8Q8LqOU032224@ratthing-b246.strakt.com>

In a message of Sun, 26 Sep 2004 02:42:43 CDT, Ian Bicking writes:
>I'm interested in using utest for a project of mine, where the tests 
>have gotten a bit out of control -- utest won't control them, but as 
>long as I'm revisiting everything, I figured I might move to a test 
>system I liked.  Also, I want to make adding tests more accessible for 
>other contributors.
>
>First question: I'm not being dumb if I convert all my tests to utest, 
>am I?  Not that utest is really a framework like unittest... but if I 
>spend lots of time fiddling with test code, it would be a shame if utest 
>went into disrepair, or was rewritten in a radically different way. 
>Maybe that's not too big an issue, because utest doesn't really have an 
>API, but since utest isn't used much outside of pypy (that I know of) I 
>worry.

The biggest way to get rid of that worry is to have more people like you
using it.  But as far as I know you will be the first person outside of
pypy to do so.  I think this would be _great_.

Holger is the one who is actually working on the utest code.  I don't think
he has any radical changes planned, but I will let him speak for himself.
If you are converting things wholesale, you might be interested in
src/pypy/tool/utestconvert.py -- from the pypy svn repository, a script I
wrote that does this automatically.  It has only been run on pypy, as far
as I know.  Warning!  The script goes off and converts 'assert raises'
to 'raises' as it lives in the repository.  I don't think that 'raises'
is currently ready for production -- Holger? -- so you will want to comment
out that line of translations.

Also, this tool makes no attempt to understand when it is in a comment,
so if you have a comment such as:

"""
blah blah and to test do:
     self.assertEquals(X, Y)
blah blah blah
"""

Then that assert will be happily converted.

The following one:
"""
blah blah and to test you cannot do:
     self.assertEquals(X, Y) blah blah overflow exception blah blah unexpected result
blah blah blah
"""

will _not_ get converted, because python's expr will not be able to find something
parseable after self.assertEquals, and in that case it just writes out exactly what
it saw.
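
The behaviour described above (convert when the call parses, otherwise write
the line back untouched) can be sketched in a few lines.  This is NOT the code
of utestconvert.py: the function name convert_line(), the regular expression
and the use of the modern 'ast' module (Python 3.8 or later) are assumptions
made for illustration, and only the single case self.assertEquals(X, Y) is
handled.

```python
import ast
import re

# Leading indentation, then the call, then everything up to the end of line.
PREFIX = re.compile(r"^(\s*)self\.assertEquals(\(.*)$")


def convert_line(line):
    """Turn 'self.assertEquals(X, Y)' into 'assert X == Y' when it parses."""
    m = PREFIX.match(line)
    if m is None:
        return line
    indent, rest = m.groups()
    source = "f" + rest.rstrip()      # parse it as a call to a dummy name
    try:
        call = ast.parse(source, mode="eval").body
    except SyntaxError:
        return line                   # not parseable: write out what we saw
    if not isinstance(call, ast.Call) or len(call.args) != 2 or call.keywords:
        return line
    left, right = (ast.get_source_segment(source, arg) for arg in call.args)
    return "%sassert %s == %s" % (indent, left, right)
```

Like the tool described here, the sketch has no idea whether the line sits
inside a comment or a docstring; it only looks at the text of the line.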

The upshot is that there are _two_ unittest files for utestconvert.py:
pypy/trunk/src/pypy/tool/test/test_utestconvert.py and
pypy/trunk/src/pypy/tool/test/test_utestconvert2.py

test_utestconvert2.py uses the standard python unittest framework.  test_utestconvert.py
is the same set of tests, written utest style, but I had to do that by hand because
this is the canonical file that cannot be converted by the tool itself -- it
cheerfully changes 'what you want to change' into 'what you want it changed into'.

>Anyway, I've started working with it some.  I've added one small feature 
>(dropping into pdb when an exception occurs), and there's sure to be 
>some more, particularly documentation.  Where should I send patches? 
>Where should discussion occur?  And maybe a website?

Discussion belongs here.  Patches too unless you want to get a project login,
a process that Holger handles.  There isn't a separate part of the pypy wiki -
or the website for utest.  Probably that should change.  And documentation
belongs here: http://codespeak.net/pypy/index.cgi?doc  which we generate
out of files in pypy/trunk/doc  .  You write your docs in ReST and a daemon
comes along and makes html out of them for you.

>Thanks.
>
>-- 
>Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

Oh no, thank _you_.

Laura


From hpk at trillke.net  Sun Sep 26 10:23:34 2004
From: hpk at trillke.net (holger krekel)
Date: Sun, 26 Sep 2004 10:23:34 +0200
Subject: [pypy-dev] utest, development and discussion
In-Reply-To: <415672F3.9060808@colorstudy.com>
References: <415672F3.9060808@colorstudy.com>
Message-ID: <20040926082334.GZ19356@solar.trillke.net>

Hello Ian,

[Ian Bicking Sun, Sep 26, 2004 at 02:42:43AM -0500]
> I'm interested in using utest for a project of mine, where the tests 
> have gotten a bit out of control -- utest won't control them, but as 
> long as I'm revisiting everything, I figured I might move to a test 
> system I liked.  Also, I want to make adding tests more accessible for 
> other contributors.

hey, nice! 
 
> First question: I'm not being dumb if I convert all my tests to utest, 
> am I?  Not that utest is really a framework like unittest... but if I 
> spend lots of time fiddling with test code, it would be a shame if utest 
> went into disrepair, or was rewritten in a radically different way. 
> Maybe that's not too big an issue, because utest doesn't really have an 
> API, but since utest isn't used much outside of pypy (that I know of) I 
> worry.

It isn't even used much inside pypy right now but in some of my 
and maybe Armin's own projects mostly. Actually Laura
has written a tool for conversion from unittest.py style tests 
and once the naming/API is getting towards finalization it is
going to be applied for PyPy.  

So (i just wrote in that other posting) the biggest change
is some pending renaming.  Other than that you shouldn't 
expect any big changes to utest.  The configuration file 
(currently "utest.conf") will very likely be reworked, though ... 

> Anyway, I've started working with it some.  I've added one small feature 
> (dropping into pdb when an exception occurs), and there's sure to be 
> some more, particularly documentation.  Where should I send patches? 
> Where should discussion occur?  And maybe a website?

I have just created the 'py-dev at codespeak.net' mailing list
and will soon outline there the planned release and any
changes along with some policies for code contributions.  I
would be happy if you (and others who are interested) join and
offer your patches and opinions. 

    http://codespeak.net/mailman/listinfo/py-dev

Moreover, you can get write-access to the codespeak repository
including the 'py' and 'pypy' part.  Just drop me a note with 
username and ssh-public key. Experienced and known programmers 
in the community will usually just get such access if they want. 
And everyone hosting their projects there should know about this 
and accept this policy. 

Later in October we hopefully have the new hardware and software 
codespeak setup ready which should include 'trac' which is then to be 
used for "the py lib"'s webpages. Sorry that there isn't anything
there, yet.  

cheers, 

    holger


From hpk at trillke.net  Sun Sep 26 10:26:10 2004
From: hpk at trillke.net (holger krekel)
Date: Sun, 26 Sep 2004 10:26:10 +0200
Subject: [pypy-dev] bug in
	http://codespeak.net/moin/pypy/moin.cgi/FrontPage?action=show
In-Reply-To: <200409260818.i8Q8I1mD032184@ratthing-b246.strakt.com>
References: <200409260818.i8Q8I1mD032184@ratthing-b246.strakt.com>
Message-ID: <20040926082610.GA19356@solar.trillke.net>

[Laura Creighton Sun, Sep 26, 2004 at 10:18:01AM +0200]
> If you click on the 'documentation' link you get a traceback!  ooops.  I will
> look at this later unless somebody beats me to it.

fixed, but please don't spam the list with this but send a mail
to pypywww at codespeak.net or to me personally. 

thanks, 

    holger


From lac at strakt.com  Sun Sep 26 10:28:22 2004
From: lac at strakt.com (Laura Creighton)
Date: Sun, 26 Sep 2004 10:28:22 +0200
Subject: [pypy-dev] http://projects.edgewall.com/qunittest/ 
In-Reply-To: Message from hpk@trillke.net (holger krekel) of "Sun,
	26 Sep 2004 09:50:05 +0200." <20040926075005.GY19356@solar.trillke.net>
References: <200409260559.i8Q5xKGZ031793@ratthing-b246.strakt.com>
	<20040926075005.GY19356@solar.trillke.net> 
Message-ID: <200409260828.i8Q8SMg6032258@ratthing-b246.strakt.com>

In a message of Sun, 26 Sep 2004 09:50:05 +0200, holger krekel writes:
>Hi Laura, 
>
>[Laura Creighton Sun, Sep 26, 2004 at 07:59:20AM +0200]
>> I wonder how hard that would be to integrate with our unittest framework?
>
>Look into the source and tell us :-) 

Ok, will do.

>Judging from looking at 'trac' which is also hosted at edgewall i guess
>it shouldn't be hard.  
>
>Btw, the 'std' stuff [*] i showed at EuroPython which contains
>the unittest framework is soon to be renamed/refactored to the
>root name 'py' and the testing part will be 'py.test'.  When
>i'll get to the point of integrating all this into pypy then
>i'll ask you for help with the nice renaming tool you have
>written ... 

Ooops, I just said the wrong thing to Ian Bicking, then ... 

What is the time frame for that?  next week?
Maybe we should hack on it when I am in Berlin?

>And while we are at it, does anyone have real experience with
>"bicycle repair man"? I would like to try it for a
>renaming/refactoring session and am interested in any
>experiences. A refactoring tool supporting refactoring/renaming
>would probably be of help to PyPy. 

Shae Errison, whom I have cc'd this reply to knows about this and has real
experience.  I got it up and running, played with it for 2 hours, thought
'this was neat' and then never did anything more ....

Laura

>
>cheers, 
>
>    holger
>
>
>[*] http://codespeak.net/svn/user/hpk/talks/std-talk.txt


From arigo at tunes.org  Mon Sep 27 18:10:24 2004
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 27 Sep 2004 17:10:24 +0100
Subject: [pypy-dev] Rethinking genc with graph rewriting
Message-ID: <20040927161024.GA3371@vicky.ecs.soton.ac.uk>

Hi,

I thought a bit more about how to generate the low-level translation of
RPython code.  It might indeed be possible to do it by rewriting the flow
graph until it contains mostly lower-level operations, and then translating it
to C code straightforwardly, if we insert an intermediate optimization phase
between these two phases.  Here are some more details.

I'm not sure in which order to present the ideas.  I apologize for the lengthy
e-mail, it's still all a bit too early to write down as a formal ReST
documentation...

===Motivation===

See for example how W_ListObject is implemented in the stdobjspace (which is
RPython, too).  It has got a size in 'ob_size' and a list of items in 'ob_item'.  
The naive implementation of a list in RPython is with a PyListObject*, i.e. as
a pointer to a structure containing (1) the length and (2) another pointer to
the actual items.  If we do it this way, then W_ListObject will end up being
implemented badly:

struct W_BadListObject {
    ...header and refcount...
    int ob_size;                      // for the ob_size field
    PySomeKindOfListObject* ob_item;  // for the ob_item field
};
struct PySomeKindOfListObject {
    int len;
    PyObject** items;
};

So it takes two indirections from a W_ListObject to its array of items,
although it only takes one in CPython's own lists.  Why?  Because the
'ob_item' list inside W_ListObjects could, maybe, escape the W_ListObject
instances and outlive them, or be modified from somewhere else.  If you think
about it this way then the extra indirection is needed.  But by looking
closely at the source of listobject.py (or its flow graph version), an
automatic analysis can figure out that this particular 'ob_item' field never
escapes the control of W_ListObject.  Thus we don't need the first indirection
in this case.  The PySomeKindOfListObject can be *inlined* into the structure
implementing W_ListObject:

struct W_ListObject {
    ...type and refcount headers...
    int ob_size;
    PySomeKindOfListObject ob_item;   // not a pointer any more!
};

Now, it looks a lot like CPython's PyListObject structure, with 'ob_item.len'
playing the role of the 'allocated' field of CPython.

I believe that this kind of "structure-inlining" optimization is important.  
It is something that cannot easily be done in the current genc_* files.  
That's why I started thinking along the lines I will describe below in more
detail.  I like this idea because it also works with simpler types like
integers, not just lists:  we don't have to do all the type-juggling in genc_*
any more; instead, we consider integers as PyIntObject*, and by the same
inlining mechanisms replace them with PyIntObject only -- even better,
PyIntObject without the type and refcount headers.  What remains?  The ob_ival
field only.  In other words by inlining a PyIntObject* declaration we get a
structure with only one C "long".  (From there it is easy to actually get rid
of the struct and replace it with its single field.)


===In more detail===

Let's say we start with a flow graph.  For this example, let's consider the
flow graph generated for the function 'def f(x): return x+1'.


block1(x):
  add(x, 1) -> y
  goto return_block(y)


The annotation code as it is today will infer that y is a SomeInteger() if we
tell it that x is a SomeInteger().  Moreover the '1' is also a SomeInteger()  
with the attribute 'const' set to 1.

What I propose is that we add a set of rewrite rules, which could be freely
applied.  The rules say essentially that whenever we see a given operation
with the given annotations, it can be rewritten into one (or possibly several)  
other operations.  For example, a long rule could be:


method_extend(lst1: SomeList, lst2: SomeList) -> result: SomeList
-----------------------------------------------------------------
  len(lst2) -> c
  growlist(lst1, c)
  goto loop[i=0]
loop:
  lt(i, c) -> cond
  switch cond:
    case False: goto sequel
    case True:  goto body
body:
  getitem(lst2, i) -> x
  fastappend(lst1, x)
  add(i, 1) -> i1
  goto loop[i=i1]
sequel:


Maybe rules should be put in a file not in Python syntax, and parsed by the
rule-applying code.  I'm not sure if the above pseudo-syntax would do, but
maybe something better along these lines would work.  Anyway, there would also
be simpler rules like:


add(int1: SomeInteger, int2: SomeInteger) -> result: SomeInteger
----------------------------------------------------------------
add_i(int1.ob_ival, int2.ob_ival) -> result.ob_ival


Note the reference to the field ob_ival of PyIntObjects.  The above line says
that the operation 'add' (which is PyNumber_Add()), although fine for any kind
of objects, could be optimized if we knew that all three involved objects
follow the "structure" of SomeInteger.  In this case we can perform it using
the new operation 'add_i' on the ob_ival fields of the structures.  Once more
the intended meaning of the rule is: the operation above the line is perfectly
valid on its own -- it would produce a call to PyNumber_Add() in C -- but for
efficiency it can be replaced with the operation below the line.
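
A rule of this kind can be sketched as a function that either returns
replacement operations or declines.  The sketch below is an assumption-laden
illustration, not proposed PyPy code: the Op tuple, the representation of
annotations as a dict of strings and the names rule_add_int() and
rewrite_block() are all hypothetical.

```python
from collections import namedtuple

# A flow-graph operation: name, argument variable names, result variable name.
Op = namedtuple("Op", "name args result")


def rule_add_int(op, annotations):
    """add(SomeInteger, SomeInteger) -> SomeInteger  ==>  add_i on ob_ival."""
    if op.name != "add":
        return None
    involved = list(op.args) + [op.result]
    if not all(annotations.get(v) == "SomeInteger" for v in involved):
        return None
    return [Op("add_i",
               tuple(a + ".ob_ival" for a in op.args),
               op.result + ".ob_ival")]


RULES = [rule_add_int]


def rewrite_block(ops, annotations):
    """Apply the first matching rule to each operation of a block."""
    out = []
    for op in ops:
        for rule in RULES:
            replacement = rule(op, annotations)
            if replacement is not None:
                out.extend(replacement)
                break
        else:
            out.append(op)   # no rule matched: the generic operation stays valid
    return out
```

An operation that no rule matches is kept as it is, which reflects the point
that the operation above the line remains valid on its own.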

The idea is thus to start from the basic C code generator that just writes
PyNumber_Add() and similar calls for any operation, and gradually move to
"inlined" operations like add_i which correspond just to "+" in C.  If we only
do that, the C code would still manipulate heap-built objects only, i.e.  
PyIntObjects;  we would just have replaced a call to PyNumber_Add() with
inlined code like:

   result = ...create the PyIntObject...;
   ((PyIntObject*) result)->ob_ival =
       ((PyIntObject*) int1)->ob_ival +
       ((PyIntObject*) int2)->ob_ival;

The idea is then to use structure inlining to detect and get rid of the
PyIntObject (and other heap structures).  This requires some kind of
whole-program analysis.  This analysis would be done as an intermediate phase,
between the rule-rewriting and the C generation phase.  It would look at how
each field of each object is used, and based on this deduce how each structure
is best implemented.  What I'm thinking about is something like this:

* look if an object's structure is mutable or not, i.e. if we ever write to
  its fields.  (This should distinguish initialization from later rewrites.)

* look if an object is shared or not: if references to an object don't escape
  too far into various parts of the program then we don't need to refcount it.
  More precisely, we might be able to assign to the object a "parent" object,
  a container which is guaranteed to exist at least as long as its child.
  Thus the "parent" has got the only reference (no refcount needed) and others
  can borrow it.  The parent can be the frame of a function, too, for objects
  that never outlive the function in which they were created.

The point is that both non-shared and shared but immutable objects can be
inlined into their parent, so that we can get rid of the heap allocation and
the access indirection.  For example, as PyIntObjects are all immutable, they
can all be inlined: any "PyObject*" field or variable known to point to a
PyIntObject can be replaced with a headerless PyIntObject structure in-place,
or even directly with its single "long" field.
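
The two checks above can be sketched in Python as follows.  This is only a toy
under loud assumptions: the program is reduced to a flat trace of events, the
event kinds ("new", "initfield", "setfield", "store") and the classify()
function are made up for the illustration, and "shared" is approximated as
"referenced from more than one place".  A real whole-program analysis would
have to work on the flow graphs instead.

```python
def classify(trace):
    """trace: list of (kind, obj, extra) events, in program order."""
    mutable = set()
    holders = {}   # obj -> set of places holding a reference to it
    for kind, obj, extra in trace:
        if kind == "new":
            holders.setdefault(obj, set())
        elif kind == "setfield":      # a write after initialization
            mutable.add(obj)
        elif kind == "store":         # a reference to obj is kept in 'extra'
            holders.setdefault(obj, set()).add(extra)
        # "initfield" events are initialization and do not make obj mutable
    result = {}
    for obj, places in holders.items():
        shared = len(places) > 1
        result[obj] = {
            "mutable": obj in mutable,
            "shared": shared,
            # non-shared objects, and shared but immutable ones, can be inlined
            "inlinable": (not shared) or (obj not in mutable),
        }
    return result

trace = [
    ("new", "int1", None), ("initfield", "int1", "ob_ival"),
    ("store", "int1", "frame_f"), ("store", "int1", "lst"),
    ("new", "lst", None), ("setfield", "lst", "items"),
    ("store", "lst", "frame_f"), ("store", "lst", "global_g"),
    ("new", "tmp", None), ("setfield", "tmp", "x"),
    ("store", "tmp", "frame_f"),
]
info = classify(trace)
for name in ("int1", "lst", "tmp"):
    print(name, info[name])
```

Here "int1" is shared but immutable and "tmp" is mutable but has a single
holder, so both qualify for inlining; only "lst", which is both shared and
mutable, must stay a separate heap object.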

For the above example we'd get code like this:

   result.ob_ival = int1.ob_ival + int2.ob_ival;

Or after inlining the single field:

   result__ob_ival = int1__ob_ival + int2__ob_ival;

Apart from the long variable names (ugh! __ob_ival after each RPython variable
containing an integer :-), this is good!


Sorry for the long messages, I hope I could get some of my motivations
through.

Armin


From arigo at tunes.org  Mon Sep 27 19:06:39 2004
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 27 Sep 2004 18:06:39 +0100
Subject: [pypy-dev] Rethinking genc with graph rewriting
In-Reply-To: <20040927161024.GA3371@vicky.ecs.soton.ac.uk>
References: <20040927161024.GA3371@vicky.ecs.soton.ac.uk>
Message-ID: <20040927170639.GA12735@vicky.ecs.soton.ac.uk>

PS.

I realize my previous e-mail carried a high risk of confusion.  In there,
W_ListObject was used as an example of RPython code; it was just an example,
unrelated to the fact that the example discusses how to implement lists like
the W_ListObject.ob_item field.  I could have chosen W_TupleObject or
W_DictObject as the example, and it would still have been about how to
implement their fields that are RPython lists (e.g. W_TupleObject.wrappeditems
or W_DictObject.data).

In the particular example of W_ListObject, we would like (it is our goal) that
the W_ListObject class be translated into a C structure with three fields that
look like "size", "allocated" and "items_ptr".  We want this because it is the
best C-ish version of the W_ListObject class, among less efficient variants
incurring more indirections; CPython uses this most efficient variant too in
its C structure called PyListObject.

This is not to be confused with lists as understood by RPython.  The field
W_ListObject.ob_item is such a list.  They are marked with a SomeList()
annotation.  The RPython-to-C translator needs to know precisely what such
lists are, and how to turn them into C code.  We decided earlier that such
lists would be translated as some sort of simple C array with no
over-allocation, so that they look like a pointer to a structure with a
"length" and an "items_ptr" field.  This, the translator knows.  It is so with
any usage of lists in RPython code, not just in W_ListObject.

So the goal in the example was how to have the translator automatically turn a
class like W_ListObject into a C structure that would have fields that look
like "size", "allocated" and "items_ptr", and doing so using the knowledge
that W_ListObject has two fields, ob_size and ob_item, with annotations
SomeInteger() and SomeList() respectively.
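
As a rough illustration, here is a small Python sketch of one way such a
flattening could be computed.  It assumes (this is only one reading of the
above) that the structure of the RPython list is inlined into its parent, so
that ob_size plays the role of "size", the list's length plays the role of
"allocated", and the list's item pointer becomes "items_ptr".  The LAYOUTS
table, the flatten() function, the "__" naming and the C types are all made up
for the example.

```python
# Hypothetical description of how each annotation is laid out in C:
# a list of (sub-field name, C type); "" means "the field itself".
LAYOUTS = {
    "SomeInteger": [("", "long")],
    "SomeList":    [("length", "long"), ("items_ptr", "PyObject**")],
}

def flatten(class_fields):
    """Inline the layout of every field into the parent structure."""
    flat = []
    for name, annotation in class_fields:
        for sub, ctype in LAYOUTS[annotation]:
            flat.append((name + ("__" + sub if sub else ""), ctype))
    return flat

# The two fields of W_ListObject with their annotations, as described above.
w_list_fields = [("ob_size", "SomeInteger"), ("ob_item", "SomeList")]
flat_fields = flatten(w_list_fields)
for name, ctype in flat_fields:
    print("%s %s;" % (ctype, name))
```

The result has three C-level fields, one per piece of information, with no
extra pointer to a separate list structure.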


A bientôt,

Armin.


From arigo at tunes.org  Mon Sep 27 21:25:41 2004
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 27 Sep 2004 20:25:41 +0100
Subject: [pypy-dev] Python 2.4a3
Message-ID: <20040927192541.GA26837@vicky.ecs.soton.ac.uk>

Hello,

Just out of interest (I haven't investigated yet), py.py loads in Python 2.4a3
but 'import dis' apparently sends it into an infinite loop.  It also prints
'faking <type 'module'>' just after 'import dis', which it doesn't do with
Python 2.3.3.

Funny.


Armin


From arigo at tunes.org  Wed Sep 29 22:13:35 2004
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 29 Sep 2004 21:13:35 +0100
Subject: [pypy-dev] Python 2.4a3
In-Reply-To: <20040927192541.GA26837@vicky.ecs.soton.ac.uk>
References: <20040927192541.GA26837@vicky.ecs.soton.ac.uk>
Message-ID: <20040929201335.GA29008@vicky.ecs.soton.ac.uk>

Hi,

On Mon, Sep 27, 2004 at 08:25:41PM +0100, Armin Rigo wrote:
> Just out of interest (I haven't investigated yet), py.py loads in Python 2.4a3
> but 'import dis' apparently sends it into an infinite loop.  It also prints
> 'faking <type 'module'>' just after 'import dis', which it doesn't do with
> Python 2.3.3.

This is due to opcode.py, which in 2.4 uses string formatting to build opcode
names, while in 2.3 it uses concatenation.  From the diff:

< for op in range(256): opname[op] = '<' + `op` + '>'
---
> for op in range(256): opname[op] = '<%r>' % (op,)

As it happens, string formatting is *really* *slow* in PyPy now.  Every one
takes about 1 second!  So importing opcode.py takes several minutes.
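
For reference, a small Python sketch comparing the two variants.  It checks
that both build the same names and times them with timeit on whatever
interpreter runs it (the timings say nothing about PyPy itself).  Backticks
are Python 2 syntax for repr(), so repr() is used here to keep the snippet
runnable on Python 3 as well; the function names are made up for the example.

```python
import timeit

def by_concatenation():
    # 2.3 style: '<' + `op` + '>', with backticks spelled as repr()
    return ['<' + repr(op) + '>' for op in range(256)]

def by_formatting():
    # 2.4 style: string formatting
    return ['<%r>' % (op,) for op in range(256)]

same = by_concatenation() == by_formatting()
print(same)

t_concat = timeit.timeit(by_concatenation, number=1000)
t_format = timeit.timeit(by_formatting, number=1000)
print("concatenation: %.4fs  formatting: %.4fs" % (t_concat, t_format))

# Back-of-the-envelope estimate from the figure quoted above:
# 256 formatting operations at roughly 1 second each.
estimated_minutes = 256 * 1.0 / 60
print("about %.1f minutes" % estimated_minutes)
```

At about one second per formatting operation, the 256 iterations of that loop
alone account for a bit more than four minutes.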

Quoting Michael, "time to make string formatting faster".


Armin


From hpk at trillke.net  Wed Sep 29 22:21:25 2004
From: hpk at trillke.net (holger krekel)
Date: Wed, 29 Sep 2004 22:21:25 +0200
Subject: [pypy-dev] Python 2.4a3
In-Reply-To: <20040929201335.GA29008@vicky.ecs.soton.ac.uk>
References: <20040927192541.GA26837@vicky.ecs.soton.ac.uk>
	<20040929201335.GA29008@vicky.ecs.soton.ac.uk>
Message-ID: <20040929202125.GF19356@solar.trillke.net>

[Armin Rigo Wed, Sep 29, 2004 at 09:13:35PM +0100]
> < for op in range(256): opname[op] = '<' + `op` + '>'
> ---
> > for op in range(256): opname[op] = '<%r>' % (op,)

is there a deeper reason for this change, btw? 

> As it happens, string formatting is *really* *slow* in PyPy now.  Every one
> takes about 1 second!  So importing opcode.py takes several minutes.
> 
> Quoting Michael, "time to make string formatting faster".

which probably means to reimplement it at interpreter level ...
which might be cumbersome but then it might be straightforward ... 

    holger


From ianb at colorstudy.com  Thu Sep 30 02:46:45 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 29 Sep 2004 19:46:45 -0500
Subject: [pypy-dev] utest, development and discussion
In-Reply-To: <200409260821.i8Q8LqOU032224@ratthing-b246.strakt.com>
References: <415672F3.9060808@colorstudy.com>
	<200409260821.i8Q8LqOU032224@ratthing-b246.strakt.com>
Message-ID: <415B5775.3040402@colorstudy.com>

Laura Creighton wrote:
>>Anyway, I've started working with it some.  I've added one small feature 
>>(dropping into pdb when an exception occurs), and there's sure to be 
>>some more, particularly documentation.  Where should I send patches? 
>>Where should discussion occur?  And maybe a website?
> 
> 
> Discussion belongs here.  Patches too unless you want to get a project login,
> a process that Holger handles.  There isn't a separate part of the pypy wiki -
> or the website for utest.  Probably that should change.  And documentation
> belongs here: http://codespeak.net/pypy/index.cgi?doc  which we generate
> out of files in pypy/trunk/doc  .  You write your docs in ReST and a daemon
> comes along and makes html out of them for you.

Since std (I guess to be named py) isn't under pypy, I assume 
documentation should go in std/trunk/doc?  Can it be set up that this is 
also turned into HTML?

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org


From mwh at python.net  Thu Sep 30 13:55:07 2004
From: mwh at python.net (Michael Hudson)
Date: Thu, 30 Sep 2004 12:55:07 +0100
Subject: [pypy-dev] Re: Python 2.4a3
References: <20040927192541.GA26837@vicky.ecs.soton.ac.uk>
	<20040929201335.GA29008@vicky.ecs.soton.ac.uk>
	<20040929202125.GF19356@solar.trillke.net>
Message-ID: <2mbrformxw.fsf@starship.python.net>

hpk at trillke.net (holger krekel) writes:

> [Armin Rigo Wed, Sep 29, 2004 at 09:13:35PM +0100]
>> < for op in range(256): opname[op] = '<' + `op` + '>'
>> ---
>> > for op in range(256): opname[op] = '<%r>' % (op,)
>
> is there a deeper reason for this change, btw? 
>
>> As it happens, string formatting is *really* *slow* in PyPy now.  Every one
>> takes about 1 second!  So importing opcode.py takes several minutes.
>> 
>> Quoting Michael, "time to make string formatting faster".
>
> which probably means to reimplement it at interpreter level ...
> which might be cumbersome but then it might be straightforward ... 

Or work out why it's so painfully slow currently, via hotshot or
whatever.  One could create a half-assed interpreter level
implementation by just executing the current code at interpreter level
(and probably changing a few little things, like calling space.str
instead of str).  It almost certainly wouldn't be RPython though (esp.
the floating point stuff which uses long arithmetic; the rest might be
close).
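
A minimal sketch of the profiling route, with two assumptions stated up front:
hotshot is not available in Python 3, so the standard cProfile and pstats
modules are swapped in here, and the profiled function is only a stand-in
workload (the loop from opcode.py), not PyPy's string formatting code.  To
learn anything about PyPy, the same pattern would have to be wrapped around
the interpreter running that loop.

```python
import cProfile
import io
import pstats

def format_opnames():
    # Stand-in workload: the loop from opcode.py quoted above.
    return ['<%r>' % (op,) for op in range(256)]

profiler = cProfile.Profile()
names = profiler.runcall(format_opnames)   # run under the profiler

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)   # ten most expensive entries
report = stream.getvalue()
print(report)
```

Sorting by cumulative time puts the callers that dominate the run at the top,
which is usually the quickest way to see where the time goes.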

Not a lot of fun, though.

Cheers,
mwh

-- 
  Important data should not be entrusted to Pinstripe, as it may
  eat it and make loud belching noises.
   -- from the announcement of the beta of "Pinstripe" aka. Redhat 7.0