[Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

Mon Jul 25 23:54:08 CEST 2005

Hi All,

I'm a bit late coming to all this and didn't really see the benefits of 
the new format over what we already do, so I set out to contrast new and 
old to demonstrate why it wasn't *that* useful. I've since changed my 
mind and think it is great, but here is the comparison I did anyway. I'd 
be pleased to hear about all the glaring errors :-)

Here is a new example: we want to have an application that returns a 
GZip encoded "hello world" string after it has been made lowercase by 
case changer middleware taking a parameter newCase. The GZip middleware 
is an optional feature of the modules in wsgiFilters.egg and the 
CaseChanger middleware and HelloWorld application are in the helloworld.egg.

The classes look like this:

class HelloWorld:
    def __call__(self, environ, start_response):
        start_response('200 OK', [('Content-type','text/plain')])
        return ['Hello World']

class CaseChanger:
    def __init__(self, app, newCase):
        self.app = app
        self.newCase = newCase

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            if self.newCase == 'lower':
                yield chunk.lower()
            else: 
                yield chunk

class GZip:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Do clever things with headers here (omitted)
        for chunk in self.app(environ, start_response):
            # gzip() stands in for a real compression helper
            yield gzip(chunk)

The way we would write our application at the moment is as follows:

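from pkg_resources import require

# Activate each egg (checking its version) before importing from it
require('helloworld >= 0.2')
from helloworld import HelloWorld, CaseChanger

require('wsgiFilters[GZip] == 1.4.3')
from wsgiFilters import GZip

# Outermost middleware first, application innermost
pipeline = GZip(
    app=CaseChanger(
        app=HelloWorld(),
        newCase='lower',
    )
)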

With pipeline itself somehow being executed as a WSGI application.
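
To make the "somehow being executed" part concrete, here is a minimal 
sketch that calls a pipeline in-process, the way a server or gateway 
script would. The call_wsgi helper is my own invention for illustration, 
the classes are cut-down copies of the ones above, and GZip is left out 
so the output stays readable. A real server on Python 3 would expect 
byte strings rather than str in the body.

```python
class HelloWorld:
    def __call__(self, environ, start_response):
        start_response('200 OK', [('Content-type', 'text/plain')])
        return ['Hello World']


class CaseChanger:
    def __init__(self, app, newCase):
        self.app = app
        self.newCase = newCase

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower() if self.newCase == 'lower' else chunk


def call_wsgi(app, environ=None):
    """Hypothetical helper: invoke a WSGI callable once and collect the result."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    # Iterating the result is what actually runs the generator middleware
    body = list(app(environ or {}, start_response))
    return captured['status'], captured['headers'], body


pipeline = CaseChanger(app=HelloWorld(), newCase='lower')
status, headers, body = call_wsgi(pipeline)
print(status, body)
```

Note that because CaseChanger is a generator, start_response is only 
called once the body is iterated, which is why the helper reads the 
status after building the list.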

The new way is like this (correct me if I'm wrong):

The two eggs have egg-info files like these, each defining its 
"entry points":

wsgiFilters.egg:

[wsgi.middleware]
gzipper = GZip:GZip

helloworld.egg:

[wsgi.middleware]
cs = helloworld:CaseChanger

[wsgi.app]
myApp = helloworld:HelloWorld
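
As a rough illustration of what a "name = module:attr" line means, here 
is a sketch that splits such a line and imports the target. Both 
function names are made up for this example and this is not how 
pkg_resources does it internally. Since the eggs in this post aren't 
installed anywhere, the import is demonstrated against a 
standard-library module instead.

```python
import importlib


def parse_entry_point(line):
    """Split 'name = module:attr' into (name, module, attr)."""
    name, _, target = line.partition('=')
    module_name, _, attr = target.strip().partition(':')
    return name.strip(), module_name, attr


def load_entry_point_target(line):
    """Import the module named in the line and return the attribute."""
    _, module_name, attr = parse_entry_point(line)
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Parsing works on the post's own example without importing anything
print(parse_entry_point('cs = helloworld:CaseChanger'))

# Loading needs a module that exists, so a stdlib target is used here
ordered = load_entry_point_target('od = collections:OrderedDict')
print(ordered.__name__)
```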

We would then write an "import map" (below) that refers to the "entry 
points" named in each egg's "deployment descriptor". The "pipeline" is 
built in the same order as in the Python example, i.e. middleware first, 
then application.

[gzipper from wsgiFilters[GZip] == 1.4.3]
[cs from helloworld  >= 0.2 ]
newCase = 'lower'
[myApp from helloworld >= 0.2]

It is loaded by an as-yet-unwritten module which uses a factory to 
return a middleware pipeline equivalent to the one produced in the 
Python example (is this very last bit correct?)
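
Since that module doesn't exist yet, here is one guess at what such a 
factory might look like. Everything in it is an assumption: the parsing 
rules are inferred from the four-line example above, REGISTRY is a plain 
dict standing in for the real lookup through each egg's entry points, 
and the version requirements are parsed but never checked. The classes 
use byte strings so that the standard gzip module can do the 
compression.

```python
import ast
import gzip

IMPORT_MAP = """\
[gzipper from wsgiFilters[GZip] == 1.4.3]
[cs from helloworld  >= 0.2 ]
newCase = 'lower'
[myApp from helloworld >= 0.2]
"""


class HelloWorld:
    def __call__(self, environ, start_response):
        start_response('200 OK', [('Content-type', 'text/plain')])
        return [b'Hello World']


class CaseChanger:
    def __init__(self, app, newCase):
        self.app = app
        self.newCase = newCase

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower() if self.newCase == 'lower' else chunk


class GZip:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Header handling omitted, as in the post
        for chunk in self.app(environ, start_response):
            yield gzip.compress(chunk)


# Assumption: stands in for resolving names via egg entry points
REGISTRY = {'gzipper': GZip, 'cs': CaseChanger, 'myApp': HelloWorld}


def parse_import_map(text):
    """Return a list of (name, requirement, params) in file order."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith('[') and line.endswith(']'):
            name, _, requirement = line[1:-1].partition(' from ')
            sections.append((name.strip(), requirement.strip(), {}))
        else:
            # 'key = value' belongs to the most recent section header
            key, _, value = line.partition('=')
            sections[-1][2][key.strip()] = ast.literal_eval(value.strip())
    return sections


def build_pipeline(text, registry):
    """Instantiate the last section as the app, then wrap it inside out."""
    sections = parse_import_map(text)
    app_name, _, app_params = sections[-1]
    pipeline = registry[app_name](**app_params)
    for name, _, params in reversed(sections[:-1]):
        pipeline = registry[name](pipeline, **params)
    return pipeline


pipeline = build_pipeline(IMPORT_MAP, REGISTRY)
body = b''.join(pipeline({}, lambda status, headers: None))
print(gzip.decompress(body))
```

The sketch treats the last section as the application and wraps it with 
the earlier sections in reverse, so the first section in the file ends 
up outermost, matching the nesting in the hand-written Python version.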

Doing things this new way has the following advantages:
* We have specified explicitly in the setup.py of the eggs that the 
middleware and applications we are importing are actually middleware and 
an application
* It is simpler for a non-technical user.
* There are lots of other applications for the ideas being discussed

It has the following disadvantages:
* We are limited as to what we can use as variable names. Existing 
middleware would need customising to only accept basic parameters.
* We require all WSGI coders to use the egg format.
* Users can't customise the middleware in the configuration file (e.g. 
by creating a derived class), so you lose flexibility.
* If we use a Python file we can directly import and manipulate the 
pipeline (I guess you can do this anyway once your factory has returned 
the pipeline)

Both methods are the same in that:
* We have specified the order of the pipeline and the middleware and 
applications involved
* Auto-downloading and installation of middleware and applications based 
on version requirements is possible (thanks to PJE's eggs)
* We have specified which versions of modules we require.
* Both could call a script such as wsgi_CGI.py, wsgi_mod_python.py etc. 
to execute the WSGI pipeline, so both methods' files could be 
distributed as a single file and would auto-download their own 
dependencies.

Other ideas:

Is it really necessary to be able to give an entry point a name? If not, 
because we know what we want to import anyway, we can fold the 
deployment descriptor into the import map:

[GZip:GZip from wsgiFilters[GZip] == 1.4.3]

We can then simplify the deployment descriptor like this:

[wsgi.middleware]
GZip:GZip

And then remove the colons and give a fully qualified Python-style path:

[GZip.GZip from wsgiFilters[GZip] == 1.4.3]

and

[wsgi.middleware]
GZip.GZip

Is this not better? Why do you need to assign names to entry points?
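
One practical wrinkle with the all-dots form, sketched below: with a 
colon the loader knows exactly where the module name ends, whereas with 
dots it has to guess, for instance by trying progressively shorter 
module prefixes. Both resolver functions are hypothetical, written only 
to show the difference, and a standard-library target is used because 
the eggs from this post aren't installed. This doesn't settle the 
question about names, it only bears on the colon.

```python
import importlib


def resolve_colon(spec):
    """'module:attr.path' -> object; the colon marks where the module ends."""
    module_name, _, attr_path = spec.partition(':')
    obj = importlib.import_module(module_name)
    for part in attr_path.split('.'):
        obj = getattr(obj, part)
    return obj


def resolve_dotted(spec):
    """'a.b.c' -> object; try the longest module prefix first, then shorter."""
    parts = spec.split('.')
    for split_at in range(len(parts) - 1, 0, -1):
        try:
            obj = importlib.import_module('.'.join(parts[:split_at]))
        except ImportError:
            continue  # that prefix is not a module, try a shorter one
        for part in parts[split_at:]:
            obj = getattr(obj, part)
        return obj
    raise ImportError(spec)


print(resolve_colon('collections:OrderedDict.fromkeys').__name__)
print(resolve_dotted('collections.OrderedDict.fromkeys').__name__)
```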

Although writing a middleware chain is dead easy for a Python 
programmer, it isn't for the end user, and if you compare the end-user 
files from this example I know which one I'd rather explain to someone. 
So although this deployment format seemed like overkill at first, I'm 
now very much in favour. I was personally considering YAML for doing my 
own configuration using a factory, but frankly the new format is much 
cleaner and you don't need all the power of YAML anyway! Count me in!

James