[Web-SIG] PEP 444 / WSGI2 Proposal: Filters to supplement middleware.

Mon Dec 13 20:42:02 CET 2010

> That looks amazingly like the code for CherryPy Filters circa 2005. In 
> version 2 of CherryPy, "Filters" were the canonical extension method 
> (for the framework, not WSGI, but the same lessons apply). It was still 
> expensive in terms of stack allocation overhead, because you had to 
> call () each filter to see if it was "on". It would be much better to 
> find a way to write something like:
> 
>     for f in ingress_filters:
>         if f.on:
>             f(environ)

.on will need to be an @property in most cases, which is itself a 
function call: it does not avoid the stack allocation and, in fact, 
doubles the overhead for every filter that is on.  Statically disabled 
filters should simply not be added to the filter list.
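
A rough sketch of that last point, assuming a hypothetical per-filter 
configuration dict.  The names build_filter_list, add_request_id and 
log_request are made up for illustration; nothing here is part of the 
proposal.  Static on/off switches are resolved once at setup, so the 
per-request loop makes one call per active filter and never asks 
whether a filter is "on":

```python
def add_request_id(environ):
    # Hypothetical ingress filter: tags the request.
    environ['example.request_id'] = 'abc123'


def log_request(environ):
    # Hypothetical ingress filter: records the path it saw.
    environ.setdefault('example.log', []).append(environ.get('PATH_INFO', ''))


def build_filter_list(candidates, config):
    """Resolve static on/off switches once, at setup time."""
    return [f for name, f in candidates if config.get(name, False)]


def run_ingress(filters, environ):
    # One call per active filter; no per-request "is it on?" lookup.
    for f in filters:
        f(environ)


ingress_filters = build_filter_list(
    [('request_id', add_request_id), ('logging', log_request)],
    {'request_id': True, 'logging': False},  # assumed configuration
)
```

Filters that must decide per request still have to inspect environ 
themselves; this only removes the statically disabled ones.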

> It was also fiendishly difficult to get executed in the right order: if 
> you had a filter that was both ingress and egress, the natural tendency 
> for core developers and users alike was to append each to each list, 
> but this is almost never the correct order.

If something is both an ingress and egress filter, it should be 
implemented as middleware instead.  Nothing can prevent developers from 
doing bad things if they really try.  Appending to ingress and 
prepending to egress would be the "right" thing to simulate middleware 
behaviour with filters, but again, don't do that.  ;)
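
For the record, this is roughly what append-to-ingress, 
prepend-to-egress looks like.  It is only a sketch: register_pair, 
make_pair and handle are hypothetical helpers, and I am assuming an 
application that returns a (status, headers, body) tuple and egress 
filters that take and return the same three values.

```python
ingress_filters = []
egress_filters = []
calls = []  # records execution order, for demonstration only


def register_pair(ingress, egress):
    """Register a two-sided filter so that it nests like middleware."""
    ingress_filters.append(ingress)    # runs after earlier ingress filters
    egress_filters.insert(0, egress)   # runs before earlier egress filters


def make_pair(name):
    # Hypothetical filter pair that only logs when each side runs.
    def ingress(environ):
        calls.append(name + ':in')

    def egress(environ, status, headers, body):
        calls.append(name + ':out')
        return status, headers, body

    return ingress, egress


def handle(environ, application):
    for f in ingress_filters:
        f(environ)
    status, headers, body = application(environ)
    for f in egress_filters:
        status, headers, body = f(environ, status, headers, body)
    return status, headers, body


register_pair(*make_pair('outer'))
register_pair(*make_pair('inner'))
```

With this registration the pair added first wraps the pair added 
second, which is the nesting real middleware would give you.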

> But even if you solve the issue of static composition, there's still a 
> demand for programmatic composition ("if X then add Y after it"), and 
> even decomposition ("find the caching filter my framework added 
> automatically and turn it off"), and list.insert()/remove() isn't 
> stellar at that.

I have plans for (and a partial implementation of) an init.d-style 
"needs/uses/provides" declaration with automatic dependency graphing.  
WebCore, for example, adds the declarations to existing middleware 
layers to sort the middleware.
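
To give an idea of the approach, here is a minimal sketch.  It is not 
the WebCore code: the Layer class, its attribute names and sort_layers 
are assumptions made for illustration, and it leans on the standard 
library's graphlib (Python 3.9+) for the topological sort.  "needs" is 
treated as a hard requirement, "uses" as an ordering hint that only 
applies when a provider is present.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Layer:
    """Hypothetical holder for a layer's dependency declarations."""

    def __init__(self, name, provides=(), needs=(), uses=()):
        self.name = name
        self.provides = set(provides)  # features this layer makes available
        self.needs = set(needs)        # features that must be provided earlier
        self.uses = set(uses)          # features ordered earlier if present


def sort_layers(layers):
    """Return layers ordered so that providers precede their consumers."""
    providers = {}
    for layer in layers:
        for feature in layer.provides:
            providers.setdefault(feature, []).append(layer.name)

    graph = {}  # layer name -> names of layers that must run first
    for layer in layers:
        before = set()
        for feature in layer.needs:
            if feature not in providers:
                raise LookupError(
                    '%s needs %r, which nothing provides' % (layer.name, feature))
            before.update(providers[feature])
        for feature in layer.uses:
            before.update(providers.get(feature, ()))
        graph[layer.name] = before

    by_name = {layer.name: layer for layer in layers}
    # static_order() raises graphlib.CycleError on circular declarations.
    return [by_name[name] for name in TopologicalSorter(graph).static_order()]
```

Because the ordering falls out of the declarations, "add Y after X" 
becomes a matter of declaring that Y needs what X provides rather than 
hunting for the right list index.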

> Calling the filter to ask it whether it is "on" also leads filter 
> developers down the wrong path; you really don't want to have Filter A 
> trying to figure out if some other, conflicting Filter B has already 
> run (or will run soon) that demands Filter A return without executing 
> anything. You really, really want the set of filters to be both 
> statically defined and statically analyzable.

Unfortunately, most, if not all, filters need to check request and 
response headers to determine whether they are able to run.  E.g. 
compression checks environ.get('HTTP_ACCEPT_ENCODING', '').lower() for 
'gzip', and checks the response to determine if a 'Content-Encoding' 
header has already been specified.
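
Sketched out, the compression case might look like the following.  
The egress signature (environ, status, headers, body) returning a 
(status, headers, body) tuple is an assumption, as are the names 
should_gzip and gzip_egress.  The substring test on Accept-Encoding is 
deliberately naive (it ignores q-values), and a real filter would also 
want to set Vary.

```python
import gzip


def should_gzip(environ, headers):
    """Runtime capability check: can this response be gzip-compressed?"""
    accepted = environ.get('HTTP_ACCEPT_ENCODING', '').lower()
    if 'gzip' not in accepted:
        return False  # the client did not ask for gzip
    # Leave the response alone if something has already encoded it.
    return not any(name.lower() == 'content-encoding' for name, _ in headers)


def gzip_egress(environ, status, headers, body):
    # Hypothetical egress filter; passes the response through untouched
    # when the capability check fails.
    if not should_gzip(environ, headers):
        return status, headers, body
    compressed = gzip.compress(b''.join(body))
    headers = [(n, v) for n, v in headers if n.lower() != 'content-length']
    headers.append(('Content-Encoding', 'gzip'))
    headers.append(('Content-Length', str(len(compressed))))
    return status, headers, [compressed]
```

Neither check can be made ahead of time: both depend on the request 
and response actually in flight.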

> Finally, you want the execution of filters to be configurable per URI 
> and also configurable per controller. So the above should be rewritten 
> again to something like:
> 
>     for f in ingress_filters(controller):
>         if f.on(environ['path_info']):
>             f(environ)
> 
> It was for these reasons that CherryPy 3 ditched its version 2 
> "filters" and replaced them with "hooks and tools" in version 3.

This is possible by wrapping multiple applications, say, in the filter 
middleware adapter with differing filter setups, then using the 
separate wrapped applications with some form of dispatch.  You could 
also utilize filters as decorators.  This is an implementation detail 
left up to the framework utilizing WSGI2, however.  WSGI2 itself has no 
concept of "controllers".
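
A minimal sketch of what I mean, again assuming applications that 
return a (status, headers, body) tuple.  with_filters and 
dispatch_by_prefix are hypothetical names for the adapter and a 
trivial prefix dispatcher, not anything defined by the proposal:

```python
def with_filters(application, ingress=(), egress=()):
    """Hypothetical adapter: wrap one application in its own filter set."""
    def wrapped(environ):
        for f in ingress:
            f(environ)
        status, headers, body = application(environ)
        for f in egress:
            status, headers, body = f(environ, status, headers, body)
        return status, headers, body
    return wrapped


def dispatch_by_prefix(routes, default):
    """Hypothetical dispatcher: pick a wrapped application by path prefix."""
    def dispatcher(environ):
        path = environ.get('PATH_INFO', '')
        for prefix, app in routes:
            if path.startswith(prefix):
                return app(environ)
        return default(environ)
    return dispatcher


def hello(environ):
    user = environ.get('example.user', 'anonymous')
    return '200 OK', [('Content-Type', 'text/plain')], [('user=%s' % user).encode('ascii')]


def fake_auth(environ):
    # Stand-in ingress filter; a real one would inspect the request.
    environ['example.user'] = 'admin'


def add_example_header(environ, status, headers, body):
    # Stand-in egress filter.
    return status, headers + [('X-Example', 'filtered')], body


public = with_filters(hello, egress=[add_example_header])
admin = with_filters(hello, ingress=[fake_auth], egress=[add_example_header])
application = dispatch_by_prefix([('/admin', admin)], default=public)
```

Each URI space gets its own statically defined filter list, and the 
dispatcher decides which one applies.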

None of this prevents the simplified stack from being useful during 
exception handling, though.  ;)  What I was really trying to do is 
reduce the level of nesting on each request and make what used to be 
middleware more explicit in its purpose.

> You might find more insight by studying the latest cherrypy/_cptools.py

I'll give it a gander, though I firmly believe filter management (as 
middleware stack management) is the domain of a framework on top of 
WSGI2, not the domain of the protocol.

	— Alice.