[Web-SIG] Standardized configuration

Sun Jul 24 03:01:25 CEST 2005

Chris McDonough wrote:
> On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:
>>>  To do this, we use a ConfigParser-format config file named
>>>  'myapplication.conf' that looks like this::
>>>
>>>    [application:sample1]
>>>    config = sample1.conf
>>>    factory = wsgiconfig.tests.sample_components.factory1
>>>
>>>    [application:sample2]
>>>    config = sample2.conf
>>>    factory = wsgiconfig.tests.sample_components.factory2
>>>
>>>    [pipeline]
>>>    apps = sample1 sample2
>>
>>I think it's confusing to call both these applications.  I think 
>>"middleware" or "filter" would be better.  I think people understand 
>>"filter" far better, so I'm inclined to use that.  So...
> 
> 
> The reason I called them applications instead of filters is because all
> of them implement the WSGI "application" API (they all implement "a
> callable that accepts two parameters, environ and start_response").
> Some happen to be gateways/filters/middleware/whatever but at least one
> is just an application and does no delegation.  In my example above,
> "sample2" is not a filter, it is the end-point application.  "sample1"
> is a filter, but it's of course also an application too.

Well, the difference I see is that a filter accepts a next-application, 
where a plain application does not.  From the perspective of this 
configuration file, those seem very different.  In fact, it could 
actually be:

   [application:sample1]
   config = sample1.conf
   factory = ...

   ...

   [application:real_sample1]
   pipeline = printdebug_app sample1

That is, a "pipeline" simply describes a new application.  And then -- 
perhaps with a conventional name, or through some more global 
configuration -- we indicate which application we are going to serve.

Hmm... thinking about it, this seems much more general, in a very useful 
way, since anyone can plug in ways to compose applications. 
"pipeline" is just one use case for how to compose applications.
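
A minimal sketch of "a pipeline simply describes a new application", 
assuming filters are callables that take a next-application and return 
an application.  The helper compose_pipeline and the body of 
printdebug_app are hypothetical, made up for illustration; only the 
names sample1, printdebug_app and real_sample1 come from the example 
above.

```python
def compose_pipeline(filters, app):
    """Wrap `app` in `filters`, outermost filter first (hypothetical helper)."""
    for make_filter in reversed(filters):
        app = make_filter(app)
    return app


def sample1(environ, start_response):
    # End-point application: answers the request itself, no delegation.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello from sample1\n"]


def printdebug_app(next_app):
    # Filter: accepts a next-application and returns a new application.
    def wrapped(environ, start_response):
        environ.setdefault("debug.trace", []).append("printdebug_app")
        return next_app(environ, start_response)
    return wrapped


# The result is itself just a WSGI application, so it can be named,
# served, or fed into another composition.
real_sample1 = compose_pipeline([printdebug_app], sample1)
```

Because the composed object has the ordinary application signature, 
other composers (a URL mapper, say) could be plugged in the same way.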

> Would you maybe rather make it more explicit that some apps are also
> gateways, e.g.:
> 
> [application:bleeb]
> config = bleeb.conf
> factory = bleeb.factory
> 
> [filter:blaz]
> config = blaz.conf
> factory = blaz.factory
> 
> ?  I don't know that there's any way we could make use of the
> distinction between the two types in the configurator other than
> preventing people from placing an application "before" a filter in a
> pipeline through validation.  Is there something else you had in mind?

I have forgotten what the actual factory interface was, but I think it 
should be different for the two.  Well, I think it *is* different, and 
passing in a next-application of None just covers up that difference.
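
To make the difference concrete, here is a sketch of two distinct 
factory signatures.  Both signatures, the names app_factory and 
filter_factory, and the config keys are assumptions for illustration; 
the thread has not settled on an actual interface.

```python
def app_factory(config):
    """Hypothetical application factory: configuration in, WSGI app out."""
    greeting = config.get("greeting", "hello")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [greeting.encode("utf-8")]
    return app


def filter_factory(config, next_app):
    """Hypothetical filter factory: a next-application is required."""
    header = ("X-Filtered-By", config.get("name", "filter"))

    def app(environ, start_response):
        # Add one response header, then delegate to the wrapped application.
        def patched_start_response(status, headers, exc_info=None):
            return start_response(status, list(headers) + [header], exc_info)
        return next_app(environ, patched_start_response)
    return app
```

With separate signatures a loader never has to pass next_app=None, and 
it can reject a pipeline that puts an application in front of a filter.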

>>[application:sample2]
>># What is this relative to?  I hate both absolute paths and
>># paths relative to pwd equally...
>>config = sample1.conf
>>factory = wsgiconfig...
> 
> 
> This was from a doctest I wrote so I could rely on relative paths,
> sorry.  You're right.  Ummmm... we could probably use the
> environment as "defaults" for ConfigParser interpolation and set whatever
> we need before the configurator is run:
> 
> $ export APP_ROOT=/home/chrism/myapplication
> $ ./wsgi-configurator.py myapplication.conf
> 
> And in myapplication.conf:
> 
> [application:sample1]
> config = %(APP_ROOT)s/sample1.conf
> factory = myapp.sample1.factory

I hate %(APP_ROOT)s as a syntax; I think it's okay to simply say that 
the configuration loader (in some fashion) should determine the root 
(maybe with an environment variable or command line parameter).
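
One way a loader might determine the root without any %(APP_ROOT)s 
syntax in the file is sketched below.  The function names, the 
precedence order, and the fallback to the configuration file's own 
directory are all assumptions, not something proposed in this thread.

```python
import os


def determine_root(config_path, cli_root=None, environ=None):
    """Pick a root: command line, then APP_ROOT, then the config file's
    own directory (assumed precedence, for illustration only)."""
    environ = os.environ if environ is None else environ
    if cli_root:
        return cli_root
    if environ.get("APP_ROOT"):
        return environ["APP_ROOT"]
    return os.path.dirname(os.path.abspath(config_path))


def resolve(value, root):
    """Resolve a possibly relative path from the config against `root`."""
    return value if os.path.isabs(value) else os.path.join(root, value)
```

The configuration file then just says "config = sample1.conf" and the 
loader applies resolve() to it.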

Though, realistically, there might be several app roots.  Apache's root 
directory configuration (for relative paths) isn't very useful to me, in 
practice, because it's not flexible enough and doesn't allow more than 
one root.

>>But this is reasonably easy to resolve -- there's a perfectly good 
>>configuration section sitting there, waiting to be used:
>>
>>   [filter:profile]
>>   factory = paste.profilemiddleware.ProfileMiddleware
>>   # Show top 50 functions:
>>   limit = 50
>>
>>This in no way precludes 'config', which is just a special case of this 
>>general configuration.  The only real problem is a possible conflict if 
>>we wanted to add new special names to the configuration, i.e., 
>>meta-filter-configuration.
> 
> 
> I think I'd maybe rather see configuration settings for apps that don't
> require much configuration to come in as environment variables (maybe
> not necessarily in the "environ" namespace that is implied by the WSGI
> callable interface but instead in os.environ).  Envvars are
> uncontroversial, so they don't cost us any coding time, PEP time, or
> brain cycles.

Yikes!  Were you like the ZConfig holdout or something?  os.environ is 
way, way, way too inflexible.

Just the other day I was able to deploy a single application I wrote 
with two configurations in the same process, without having thought 
about that possibility ahead of time, and without doing any extra work 
or avoiding any particular shortcuts.  It worked absolutely seamlessly, 
because I wasn't using any global variables, and I had stuck to a 
convention where Paste nests configurations in a safe manner. 
os.environ is very global, very hard to work with from a UI perspective, 
and very invisible.  These configuration files should be totally 
encapsulated, and easy to nest.
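
A small sketch of why avoiding globals makes this work: each 
application instance captures its own configuration, so two instances 
with different settings can coexist in one process.  The factory name 
make_greeter and the "title" key are hypothetical.

```python
def make_greeter(config):
    # The configuration is captured per instance; nothing is read from
    # os.environ or from module-level globals.
    title = config["title"]

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [title.encode("utf-8")]
    return app


# Same code, two configurations, one process.
blog_a = make_greeter({"title": "Blog A"})
blog_b = make_greeter({"title": "Blog B"})
```

Had make_greeter read its title from os.environ, both instances would 
necessarily share a single value.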

There's a small number of places where I might be open to using 
environment variables as an *optional* way to feed information, like 
APP_ROOT (but even there I feel strongly there should be a 
configuration-file-based way to say the same thing).  For middleware 
configuration it makes no sense at all -- configuration must be 
encapsulated in the file itself (or the files that are referenced).

>>Another thing this could allow is recursive configuration, like:
>>
>>[application:urlmap]
>>factory = paste.urlmap.URLMapBuilder
>>app1 = blog
>>app1.url = /
>>app2 = statview
>>app2.url = /stats
>>app3 = cms
>>app3.host = dev.*
>>
>>[application:blog]
>>factory = leonardo.wsgifactory
>>config = myblog.conf
>>
>>[application:statview]
>>factory = statview
>>log_location = /var/logs/apache2
>>
>>[application:cms]
>>factory = proxy
>>location = http://localhost:8080
>>map = / /cms.php
>>
>>[pipeline]
>>app = urlmap
>>
>>
>>So URLMapBuilder needs the entire configuration file passed in, along 
>>with the name of the section it is building.  It then reads some keys, 
>>and builds some named applications, and creates an application that 
>>delegates based on patterns.  That's the kind of configuration file I 
>>could really use.
> 
> 
> Maybe one other (less flexible, but declaratively configurable and
> simpler to code) way to do this might be by canonizing the idea of
> "decision middleware", allowing one component in an otherwise static
> pipeline to decide which is the "next" one by executing a Python
> expression which runs in a context that exposes the WSGI environment.
> 
> [application:blog]
> factory = leonardo.wsgifactory
> config = myblog.conf
> 
> [application:statview]
> factory = statview
> 
> [application:cms]
> factory = proxy
> 
> [decision:urlmapper]
> cms = environ['PATH_INFO'].startswith('/cms')
> statview = environ['PATH_INFO'].startswith('/statview')
> blog = environ['PATH_INFO'].startswith('/blog')

Well, that's hard to imagine working.  First, you'd need a way to import 
new functions, since a large number of use cases can't be handled 
without imports (like re).  But even then, these transformations 
typically modify the environment.  For instance, if you map /cms to an 
application, you have to put /cms onto SCRIPT_NAME, and take it off of 
PATH_INFO.  This keeps URL introspection sane.
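
The SCRIPT_NAME / PATH_INFO adjustment can be sketched as below.  The 
function name shift_prefix is hypothetical; the two environ keys are 
the standard WSGI ones.

```python
def shift_prefix(environ, prefix):
    """Return a copy of `environ` with `prefix` moved from PATH_INFO onto
    SCRIPT_NAME, or None if the request path is not under `prefix`."""
    path = environ.get("PATH_INFO", "")
    # Match "/cms" and "/cms/..." but not "/cmsx".
    if path != prefix and not path.startswith(prefix + "/"):
        return None
    new_environ = dict(environ)
    new_environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
    new_environ["PATH_INFO"] = path[len(prefix):]
    return new_environ
```

For a request to /cms/page with an empty SCRIPT_NAME, the mapped 
application sees SCRIPT_NAME "/cms" and PATH_INFO "/page", so any URLs 
it reconstructs still point back to itself.  A bare boolean expression 
in a [decision:...] section has nowhere to express this step.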

But the example I gave seems just as declarative to me (more so, even), 
and not hard to implement.  It just requires that the factory get a 
reference to the full parsed configuration file.
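
A rough sketch of a factory-side helper that is handed the whole parsed 
file plus the name of its own section.  It uses the modern configparser 
module and a trimmed version of the urlmap example; the function name 
read_url_map, the longest-prefix-first ordering, and the decision to 
handle only the ".url" keys (not ".host") are assumptions made for 
illustration.

```python
import configparser

CONFIG_TEXT = """
[application:urlmap]
factory = paste.urlmap.URLMapBuilder
app1 = blog
app1.url = /
app2 = statview
app2.url = /stats

[application:blog]
factory = leonardo.wsgifactory

[application:statview]
factory = statview
"""


def read_url_map(parser, section):
    """Collect (url, target section) pairs from 'appN' / 'appN.url' keys."""
    options = parser[section]
    routes = []
    for key, target in options.items():
        if "." in key:
            continue  # 'app1.url' and friends are looked up via 'app1'
        url = options.get(key + ".url")
        if url is not None:  # skips keys such as 'factory'
            routes.append((url, "application:" + target))
    # Longest prefix first, so '/stats' is tried before '/'.
    routes.sort(key=lambda route: len(route[0]), reverse=True)
    return routes


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
routes = read_url_map(parser, "application:urlmap")
```

Because the helper sees the full parser, it can look up the referenced 
sections (application:blog, application:statview) and build those 
applications itself before wrapping them in a dispatcher.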

> [environment]
> statview.log_location = /var/logs/apache2
> cms.location = http://localhost:8080
> cms.map = / /cms.php
> 
> [pipeline]
> apps = urlmapper

> Yes.  OTOH, when a certain level of dynamism is reached, it's no
> longer possible to configure things declaratively because it becomes a
> programming task, and this proposal is (so far) just about being able to
> configure things declaratively so I think we need some sort of
> compromise.
> 
> 
>>I think this can be achieved simply by defining a standard based on the 
>>object interface, where the configuration file itself is a reference 
>>implementation (that we expect people will usually use).  Semantics from 
>>the configuration file will leak through, but it's a lot easier to deal 
>>with (for example) a system that can only support string configuration 
>>values, than a system based on concrete files in a specific format.
> 
> 
> Sorry, I can't parse that paragraph.

I mean that a standard should be in terms of what interface the 
factories must implement, and what objects they are given.  The actual 
implementation of a loader based on an INI configuration file is a 
useful reference library (and maybe the only library we need), but 
shouldn't be part of the standard.

>>> - If elements in the pipeline depend on "services" (ala
>>>   Paste-as-not-a-chain-of-middleware-components), it may be
>>>   advantageous to create a "service manager" instead of deploying
>>>   each service as middleware.  The "service manager" idea is not a
>>>   part of the deployment spec.  The service manager would itself
>>>   likely be implemented as a piece of middleware or perhaps just a
>>>   library.
>>
>>That might be best.  It's also quite possible for the factory to 
>>instantiate more middleware.
> 
> 
> Which factory?

The object referenced by the "factory" key in the configuration file.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org