From gward@mems-exchange.org Wed Sep 5 16:22:26 2001
From: gward@mems-exchange.org (Greg Ward)
Date: Wed, 5 Sep 2001 11:22:26 -0400
Subject: [Types-sig] ANNOUNCE: Grouch 0.1
Message-ID: <20010905112226.A2236@mems-exchange.org>

Grouch 0.1
==========

Grouch is a system for describing and enforcing a Python object
schema. That is, it provides you with a language for describing the
intended type signatures of your objects (collectively, the "object
schema"), and tools to walk an object graph, checking that every value
found matches your object schema.

In the current version, your object schema is described in specially
formatted class docstrings. (I have vague plans for a standalone
schema language, but don't really have a need for it myself, so
haven't done any work in that direction.) The gen_schema script walks
a directory tree looking for Python modules, parses any class
docstrings it finds, and writes the resulting (pickled) object schema
to a file.

The second phase is to type-check some existing data -- i.e. make sure
that every value in an object graph conforms to the object schema
extracted from your class docstrings. This is done with the
check_data script, which knows about a couple of popular Python
persistence mechanisms (just ZODB and pickle so far). If you just
want to check an object graph in memory, you'll have to write a few
lines of code to take the place of running check_data on a persistent
object store (this is not yet covered by the documentation).

[Grouch was pre-announced to the types-sig in late August, when it was
still called Oscar. The only change is the name -- there are too many
things called Oscar already in the world.]

REQUIREMENTS
------------

Grouch requires Python 2.0 or greater, with Jeremy Hylton's "compiler"
package installed. At least in Python 2.0 through 2.1.1, this package
is included in Python's source distribution, but not installed as part
of the standard library. Grouch also uses the SPARK parser framework
by John Aycock; for your convenience, a copy is included with Grouch.

AUTHOR & AVAILABILITY
---------------------

Grouch was written by Greg Ward. It includes code (lib/spark.py)
written by John Aycock, which is licensed separately. For the latest
version, visit

    http://www.mems-exchange.org/software/grouch/

--
Greg Ward - software developer          gward@mems-exchange.org
MEMS Exchange                           http://www.mems-exchange.org
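The announcement does not show what a schema-bearing docstring looks
like. As a sketch of the idea only -- the "Instance attributes"
declaration syntax below is an assumption made for illustration, not
Grouch's documented format -- a checked class might read:

    # Hypothetical illustration of a Grouch-style object schema; the
    # docstring syntax is assumed, not taken from Grouch's manual.

    class Invoice:
        """One customer invoice.

        Instance attributes:
          number   : int
          customer : string
          lines    : list of InvoiceLine
        """

    class InvoiceLine:
        """A single line item on an invoice.

        Instance attributes:
          quantity : int
          amount   : float
        """

Per the announcement, gen_schema would harvest such docstrings from
every module under a directory and pickle the collected schema;
check_data would then walk an object graph out of ZODB or a pickle and
flag, say, an Invoice whose lines list holds anything other than
InvoiceLine instances.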
From jriehl@spaceship.com Thu Sep 13 20:49:32 2001
From: jriehl@spaceship.com (Jonathan Riehl)
Date: Thu, 13 Sep 2001 14:49:32 -0500 (CDT)
Subject: [Types-sig] Re: PEP 269
In-Reply-To: <200109111019.MAA16642@pandora.informatik.hu-berlin.de>
Message-ID:

Howdy all,

I'm afraid Martin's attention to the PEP list has outed me before I
was able to post about this myself. Anyway, for those interested, I
wrote a PEP for the exposure of pgen to the Python interpreter. You
may view it at:

    http://python.sourceforge.net/peps/pep-0269.html

I am looking for comments on this PEP, and below I address some
interesting issues raised by Martin. Furthermore, I already have a
partially functioning reference implementation, and I should be
pestered to make it available shortly.

Thanks,
-Jon

On Tue, 11 Sep 2001, Martin von Loewis wrote:

> Hi Jonathan,
>
> With interest I noticed your proposal to include Pgen in the
> standard library. I'm not sure about the scope of the proposed
> change: do you view pgen as a candidate for a general-purpose parser
> toolkit, or do you "just" contemplate using it for variations of the
> Python grammar?

I am thinking of going for the low-hanging fruit first (a
Python-centric pgen module), and then adding more functionality in
later releases of Python (see below).

> If the former, I think there should already be a strategy for how
> to expose pgen to the application; the proposed API seems
> inappropriate. In particular:
>
> - how would I integrate an alternative tokenizer?
> - how could I integrate semantic actions into the parse process,
>   instead of creating the canonical AST?

The change currently proposed is somewhat constrained by the Python
2.2 release schedule, and will initially only address building parsers
that use the Python tokenizer. If the module misses the 2.2 release,
I'd like to make it more functional and provide the ability to
override the Python tokenizer. I may also add methods to export all
the data found in the DFA structure.

I am unsure what integrating semantic actions into the parse process
buys us besides lower memory overhead. In C/C++ such coupling is
needed because of the TYPEDEF/IDENTIFIER tokenization problem, but I
don't see Python and future Python-like, LL(1) languages needing such
hacks. Finally, I am inclined to enforce the separation of back-end
actions from the AST. This allows the AST to be used for a variety of
purposes, rather than only those intended by the initial parser
developer.

> Of course, these questions are less interesting if the scope is to
> parse Python: in that case, Python tokenization is fine, and
> everybody is used to getting the Python AST.

An interesting note to make here is that since the nonterminal integer
values are generated by pgen, pgen ASTs are not currently compatible
with the parser module's ASTs. Perhaps such unification may be slated
for future work. (I know Fred left room in the parser AST datatype
for identifying the grammar that generated the AST using an integer
value, but using this would be questionable in a "rapid parser
development" environment.)

> On the specific API, I think you should drop the File functions
> (parseGrammarFile, parseFile). Perhaps you can also drop the String
> functions, and provide only functions that expect file-like objects.

I am open to further discussion on this, but I would note that
filename information is used (and useful) when reporting syntax
errors. I think that the "streaming" approach to parsing is another
holdover from days when memory constraints ruled (much like binding
semantics to the parser itself).

> On the naming of the API functions: I propose to use an underscore
> style instead of the mixedCaps style, or perhaps to leave out any
> structure (parsegrammar, buildparser, parse, symbol2string,
> string2symbolmap). That would be more in line with the parser
> module.

I would like to hear more about this from the Pythonati. I am
currently following the naming conventions I use at work, which of
course is most natural for me at home. :)

> Regards,
> Martin
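Jonathan's compatibility point is easy to see with nothing but the
standard library. In the Python 2.x snippet below (stock parser,
symbol, and token modules only), every node of a parser-module tree
starts with a bare integer whose meaning comes from tables generated
out of CPython's Grammar file; a parser produced by an exposed pgen
would generate its own numbering for its own grammar, hence the
mismatch:

    import parser, symbol, token

    def shape(node):
        # Replace each node's leading integer with its symbolic name.
        # The numbers mean nothing without the lookup tables in
        # symbol.py and token.py, which are generated from CPython's
        # Grammar file.
        code = node[0]
        name = symbol.sym_name.get(code, token.tok_name.get(code, code))
        return [name] + [shape(child) for child in node[1:]
                         if type(child) is type(())]

    tup = parser.ast2tuple(parser.suite("x = 1\n"))
    print tup[0], symbol.sym_name[tup[0]]   # e.g. 257 file_input
    print shape(tup)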
From jriehl@spaceship.com Fri Sep 14 22:43:58 2001
From: jriehl@spaceship.com (Jonathan Riehl)
Date: Fri, 14 Sep 2001 16:43:58 -0500 (CDT)
Subject: [Types-sig] Re: PEP 269
In-Reply-To: <200109140932.LAA28452@pandora.informatik.hu-berlin.de>
Message-ID:

On Fri, 14 Sep 2001, Martin von Loewis wrote (in response to Samuele
Pedroni):

> > - if its purpose is to offer a framework for small-language
> >   support, there are already modules around that support that
> >   (SPARK, PLY ...), the only advantage of PEP 269 being speed
> >   wrt the pure Python solutions, because of the use of the
> >   internal CPython parser; OTOH the other solutions are more
> >   flexible...
>
> I agree. I'd like to see (or perhaps write myself) a proposal for
> adding one or two of these packages to the Python core (two are
> probably better, since there is no one-size-fits-all parser
> framework, and adding two avoids the impression that there is a
> single "blessed" parser).

I would like to note that integration with these other systems is in
the plans for my Basil project (http://wildideas.org/basil/). I just
felt that a pgen integration would be better suited to the native code
base, rather than copying the code over to another project and
building it as an extension module (which was a route being explored
by Mobius Python).

> > It should be considered that Jython does not contain a parser
> > similar to the CPython one. Because of this, Jython does not
> > offer parser module support. So implementing the PEP for Jython
> > would require writing a Java or pure Python equivalent of the
> > CPython parser.

I am all for writing a pgen implementation in pure Python. The reason
I am not going this route from the get-go is to do what is easy before
we do what is less easy. If, for example, a reference implementation
of a Python type system were to be adopted as standard, I would think
that making the new system easy to add to Jython would be a
prerequisite. Hence, we would need to develop a Jython parser that
uses the grammar from CPython.

> If the goal is to play with extensions to the Python grammar, I
> think this is less of an issue. Of course, anybody wanting to
> extend the C grammar could easily modify the Python interpreter
> itself.
>
> So I think I'm -1 on this PEP, on the basis that this is code bloat
> (i.e. new functionality used too rarely).

As stated in the PEP, one of the primary motivations for the proposal
is to allow grammar extensions to be prototyped in Python (esp.
optional static typing). I would argue that making actual changes to
CPython is much more expensive than writing a front end in Python. By
adding a pgen module to Python, I feel that we are not bloating Python
so much as we are exposing functionality already built into Python.

Thanks,
-Jon

From Samuele Pedroni Fri Sep 14 23:54:44 2001
From: Samuele Pedroni (Samuele Pedroni)
Date: Sat, 15 Sep 2001 00:54:44 +0200 (MET DST)
Subject: [Types-sig] Re: PEP 269
Message-ID: <200109142254.AAA22101@core.inf.ethz.ch>

[jriehl]
> On Fri, 14 Sep 2001, Martin von Loewis wrote (in response to Samuele
> Pedroni):
>
> > > - if its purpose is to offer a framework for small-language
> > >   support, there are already modules around that support that
> > >   (SPARK, PLY ...), the only advantage of PEP 269 being speed
> > >   wrt the pure Python solutions, because of the use of the
> > >   internal CPython parser; OTOH the other solutions are more
> > >   flexible...
> >
> > I agree.
> > I'd like to see (or perhaps write myself) a proposal for adding
> > one or two of these packages to the Python core (two are probably
> > better, since there is no one-size-fits-all parser framework, and
> > adding two avoids the impression that there is a single "blessed"
> > parser).
>
> I would like to note that integration with these other systems is
> in the plans for my Basil project (http://wildideas.org/basil/).

Which is not in the scope of the PEP.

> I just felt that a pgen integration would be better suited to the
> native code base, rather than copying the code over to another
> project and building it as an extension module (which was a route
> being explored by Mobius Python).

I see, but as you can see there are other issues that come into play:
Jython, and first of all the appropriateness of the interface.

> > > It should be considered that Jython does not contain a parser
> > > similar to the CPython one. Because of this, Jython does not
> > > offer parser module support. So implementing the PEP for Jython
> > > would require writing a Java or pure Python equivalent of the
> > > CPython parser.
>
> I am all for writing a pgen implementation in pure Python. The
> reason I am not going this route from the get-go is to do what is
> easy before we do what is less easy.

:)

> If, for example, a reference implementation of a Python type system
> were to be adopted as standard, I would think that making the new
> system easy to add to Jython would be a prerequisite. Hence, we
> would need to develop a Jython parser that uses the grammar from
> CPython.

Of course Jython has a (different and not run-time configurable)
parser that can be extended if need be, so I don't get the point.

> > If the goal is to play with extensions to the Python grammar, I
> > think this is less of an issue. Of course, anybody wanting to
> > extend the C grammar could easily modify the Python interpreter
> > itself.
> >
> > So I think I'm -1 on this PEP, on the basis that this is code
> > bloat (i.e. new functionality used too rarely).
>
> As stated in the PEP, one of the primary motivations for the
> proposal is to allow grammar extensions to be prototyped in Python
> (esp. optional static typing). I would argue that making actual
> changes to CPython is much more expensive than writing a front end
> in Python.

But you could write that using one of the SPARK, PLY ... tools. And
in any case the PEP ignores the part about how to produce the actual
code from a Python front end, and how to add any new bytecodes that
might be necessary...

> By adding a pgen module to Python, I feel that we are not bloating
> Python so much as we are exposing functionality already built into
> Python.

Yes, but how much it is worth exposing such functionality depends on
the whole picture: how do you want to concretely use the exposed
functionality? It is far from clear how you can exploit a
Python-exposed pgen and parser in order to make it as easy as possible
for a casual user to experiment with a grammar extension. In an ideal
scenario the user would install some file in his Python installation
and start Python with access to the extension. How this can work is
unanswered by the PEP.

I think that a PEP that addresses the whole problem would make more
sense and would be easier to evaluate.

regards.
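Samuele's question -- what would a casual user concretely do with an
exposed pgen? -- can at least be sketched against the PEP's proposed
interface. Nothing below exists in any shipped Python: the function
names follow the draft API mentioned in this thread (parseGrammarFile,
buildParser, the String variants, string-to-symbol maps), while the
exact signatures, the start-symbol argument, and the MyGrammar file
are assumptions made for illustration:

    import pgen  # the *proposed* module of PEP 269; not in any release

    # A copy of CPython's Grammar file with one extra production, e.g.
    # an optional ": type" annotation on function parameters.
    grammar_ast = pgen.parseGrammarFile("MyGrammar")

    # Build DFA parse tables from the grammar, as pgen does for
    # CPython itself at interpreter build time.
    tables = pgen.buildParser(grammar_ast)

    # The nonterminal numbers are generated per grammar, so look up
    # the start symbol by name (signature assumed).
    symbols = pgen.stringToSymbolMap(grammar_ast)

    # Parse a program written in the extended syntax.
    tree = pgen.parseString("def f(x: int): return x\n",
                            tables, symbols["file_input"])

Even granting all of this, tree is only a parse tree; turning it into
something CPython can execute is exactly the gap Samuele describes,
and the point Martin presses below.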
From loewis@informatik.hu-berlin.de Mon Sep 17 08:13:01 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 17 Sep 2001 09:13:01 +0200 (MEST)
Subject: [Types-sig] Re: PEP 269
In-Reply-To: (message from Jonathan Riehl on Fri, 14 Sep 2001
 16:43:58 -0500 (CDT))
References:
Message-ID: <200109170713.JAA17225@pandora.informatik.hu-berlin.de>

> As stated in the PEP, one of the primary motivations for the
> proposal is to allow grammar extensions to be prototyped in Python
> (esp. optional static typing). I would argue that making actual
> changes to CPython is much more expensive than writing a front end
> in Python. By adding a pgen module to Python, I feel that we are
> not bloating Python so much as we are exposing functionality already
> built into Python.

The potential problem is that this new module must then be supported
for a long time. People will propose extensions to it, which must be
evaluated, and every change must be reviewed carefully for
incompatibilities.

I'm not opposed to changes. However, I fail to see the value of
prototyping the grammar alone, since you'll need subsequent changes as
well, to the bytecode generation, and perhaps to evaluation. Also, I
still doubt that anybody interested in changing the grammar would be
unable to simply recompile Python.

Regards,
Martin
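One pragmatic answer to the "what about code generation?" objection,
at the prototype stage, is to sidestep bytecode work entirely:
translate the extended syntax back into plain Python source and hand
it to the built-in compile(). The sketch below does this with a
deliberately naive regular-expression rewriter (strip_annotations is a
hypothetical helper invented for this illustration; a real front end
would transform the parse tree instead):

    import re

    def strip_annotations(source):
        # Toy rewriter: drop ": <type>" annotations from parameter
        # lists, e.g. "def f(x: int):" -> "def f(x):". Hypothetical,
        # for illustration only.
        return re.sub(r'(\w+)\s*:\s*\w+([,)])', r'\1\2', source)

    def run_extended(source, filename="<extended>"):
        # Rewrite to standard Python, then use the stock compiler; no
        # new bytecodes are needed for a prototype of this shape.
        plain = strip_annotations(source)
        code = compile(plain, filename, "exec")
        namespace = {}
        exec code in namespace
        return namespace

    ns = run_extended("def f(x: int): return x + 1\n")
    print ns["f"](41)    # prints 42

This does nothing for constructs that need genuinely new runtime
semantics, which is where Martin's bytecode point stands.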