From ianb at colorstudy.com  Tue Aug  2 06:28:51 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 01 Aug 2005 23:28:51 -0500
Subject: [Web-SIG] WSGI: Another level of indirection
Message-ID: <42EEF683.6040306@colorstudy.com>

Maybe a way to handle this configuration is to put in another level of 
abstraction, sad as that is.

I'm thinking configuration files could have something like PEP 263's 
encodings, except that it would be an indication of who knows how to 
build the WSGI application from the file.  So it might look like:

# -*- wsgi-build: paste.wsgi_deploy:DeploymentConfig -*-

Which would work with the experimental stuff I mentioned before.  It 
should also work with .ini files, Python source, and probably other 
configuration file syntaxes.  At some point perhaps we'll come up with a 
standard (aka default) builder, but this could remain useful despite 
that.  It also means I can go forward with this right now and still be 
future compatible.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug  2 06:46:19 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Aug 2005 00:46:19 -0400
Subject: [Web-SIG] WSGI: Another level of indirection
In-Reply-To: <42EEF683.6040306@colorstudy.com>
Message-ID: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com>

At 11:28 PM 8/1/2005 -0500, Ian Bicking wrote:
>Maybe a way to handle this configuration is to put in another level of
>abstraction, sad as that is.
>
>I'm thinking configuration files could have something like PEP 263's
>encodings, except that it would be an indication of who knows how to
>build the WSGI application from the file.  So it might look like:
>
># -*- wsgi-build: paste.wsgi_deploy:DeploymentConfig -*-
>
>Which would work with the experimental stuff I mentioned before.  It
>should also work with .ini files, Python source, and probably other
>configuration file syntaxes.  At some point perhaps we'll come up with a
>standard (aka default) builder, but this could remain useful despite
>that.

Now you're *really* scaring me.  Honestly, there's no difference between 
this proposal and saying that we'll use "#!" lines to operationally 
determine the format by specifying an interpreter for it.  There's really 
no *abstraction* taking place here.


>   It also means I can go forward with this right now and still be
>future compatible.

I can understand the desire, but I think it would be a bad idea to give 
this any kind of official standing or allow it to warp the process of 
getting to a workable deployment standard.  Better for you to develop your 
format(s) and try to make them so good that everyone is convinced they're 
worth standardizing on, knowing that if you fail, your format will be a dead 
end.  :)  That should provide you with extra motivation to make it a really 
good format for the rest of us.  ;)

I haven't had a chance to have a serious look in detail at your last format 
proposal, but hope to soon.


From ianb at colorstudy.com  Tue Aug  2 18:12:29 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 02 Aug 2005 11:12:29 -0500
Subject: [Web-SIG] WSGI: Another level of indirection
In-Reply-To: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com>
References: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com>
Message-ID: <42EF9B6D.1080906@colorstudy.com>

Phillip J. Eby wrote:
> At 11:28 PM 8/1/2005 -0500, Ian Bicking wrote:
> 
>> Maybe a way to handle this configuration is to put in another level of
>> abstraction, sad as that is.
>>
>> I'm thinking configuration files could have something like PEP 263's
>> encodings, except that it would be an indication of who knows how to
>> build the WSGI application from the file.  So it might look like:
>>
>> # -*- wsgi-build: paste.wsgi_deploy:DeploymentConfig -*-
>>
>> Which would work with the experimental stuff I mentioned before.  It
>> should also work with .ini files, Python source, and probably other
>> configuration file syntaxes.  At some point perhaps we'll come up with a
>> standard (aka default) builder, but this could remain useful despite
>> that.
> 
> 
> Now you're *really* scaring me.  Honestly, there's no difference between 
> this proposal and saying that we'll use "#!" lines to operationally 
> determine the format by specifying an interpreter for it.  There's 
> really no *abstraction* taking place here.

Well, #! is constrained to executable paths, unlike the comment.  But if 
it wasn't, sure this is just like that... but I don't see the problem 
(or the reason for such shock ;).  Deep down #! is a good feature for 
executables, and this is just the analog.

The API would go:

app = load_wsgi_app_from_file('foo.ini')

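import pkg_resources

def load_wsgi_app_from_file(filename):
    interp_spec = None
    f = open(filename)
    try:
        for line in f:
            if not line.strip():
                continue
            assert line.startswith('#'), "No interpreter found"
            if '-*-' in line:
                # e.g. "wsgi-build: paste.wsgi_deploy:DeploymentConfig"
                interp_spec = line.split('-*-')[1].strip()
                break
    finally:
        f.close()
    assert interp_spec is not None, "No interpreter found"
    # Drop the "wsgi-build:" key so only the builder spec remains
    if interp_spec.startswith('wsgi-build:'):
        interp_spec = interp_spec[len('wsgi-build:'):].strip()
    if ' ' in interp_spec:
        # "dist name": a named entry point of an installed distribution
        dist, name = interp_spec.split()
        interp = pkg_resources.load_entry_point(
            dist, 'wsgi.config_interpreter', name)
    else:
        # "module:attr": import the object directly, skipping requirements
        interp = pkg_resources.EntryPoint.parse('x=' + interp_spec).load(False)
    return interp(filename)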


>>   It also means I can go forward with this right now and still be
>> future compatible.
> 
> 
> I can understand the desire, but I think it would be a bad idea to give 
> this any kind of official standing or allow it to warp the process of 
> getting to a workable deployment standard.  Better for you to develop 
> your format(s) and try to make them so good that everyone is convinced 
> they're worth standardizing on, knowing that if you fail, your format 
> will be a dead end.  :)  That should provide you with extra motivation 
> to make it a really good format for the rest of us.  ;)

I bring this up because I'm not sure there is a One Best Way for the 
deployment.  This is also something I can apply to the deployment 
configuration I already have in Paste (not the experimental stuff, but 
the configuration files in Paste).  I think other legacy systems (and 
*every* current framework has something like this) can very possibly be 
handled the same way, requiring only the addition of one comment line to 
current configurations.  This also leaves the possibility of flattening 
the configuration some without trying to jam incompatible features in.

And load_wsgi_app_from_file, under whatever name, is a function that 
needs to exist in any spec.  Standardizing it first doesn't seem that 
strange to me.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com  Tue Aug  2 18:16:29 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 02 Aug 2005 11:16:29 -0500
Subject: [Web-SIG] WSGI: Another level of indirection
In-Reply-To: <42EF9B6D.1080906@colorstudy.com>
References: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com>
	<42EF9B6D.1080906@colorstudy.com>
Message-ID: <42EF9C5D.8090601@colorstudy.com>

Ian Bicking wrote:
> I think other legacy systems (and 
> *every* current framework has something like this) can very possibly be 
> handled the same way, requiring only the addition of one comment line to 
> current configurations.  

It occurs to me that # is a comment most places, but not in XML files, 
so some alternate way of annotating XML files is also necessary.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From jjinux at gmail.com  Thu Aug  4 02:02:42 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed, 3 Aug 2005 17:02:42 -0700
Subject: [Web-SIG] Fwd: ANN: cssutils 0.8a2 (alpha release)
In-Reply-To: <42ECF61C.808@t-online.de>
References: <42ECF61C.808@t-online.de>
Message-ID: <c41f67b905080317027478fae6@mail.gmail.com>

Hmm, I thought we *didn't* have a way to parse CSS.  I guess that's no
longer true.

-jj

---------- Forwarded message ----------
From: Christof <csad7 at t-online.de>
Date: Jul 31, 2005 9:02 AM
Subject: ANN: cssutils 0.8a2 (alpha release)
To: python-announce-list at python.org


what is it
----------
A Python package to parse and build CSS Cascading Style Sheets. Partly
implements the DOM Level 2 Stylesheets  and DOM Level 2 CSS interfaces.

The implementation uses some Python standard features like standard
lists for classes like css.CSSRuleList and is hopefully a bit easier to use.


changes since the last release
------------------------------
**MAJOR API CHANGE**
reflecting DOM Level 2 Stylesheets and DOM Level 2 CSS
see http://cthedot.de/cssutils/ for a complete list of changes,
examples, etc.


license
-------
cssutils is published under the LGPL.


download
--------
download cssutils 0.8a2 alpha - 050731 from http://cthedot.de/cssutils/

This is an alpha release so use at your own risk! Some parts will not
work as expected... Any bug report is welcome.

cssutils needs
* Python 2.3 (tested with Python 2.4.1 on Windows XP only)
* maybe PyXML (tested with PyXML 0.8.4 installed)


any comment will be appreciated, thanks
christof hoeke






-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From james at pythonweb.org  Sun Aug  7 18:23:59 2005
From: james at pythonweb.org (James Gardner)
Date: Sun, 07 Aug 2005 17:23:59 +0100
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <42EA57D2.1060902@colorstudy.com>
References: <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com>
Message-ID: <42F6359F.9000504@pythonweb.org>

Hi All,

Have there been any more developments off-list about the format of the 
config file for WSGI deployment?

I'd like to apply the entry points labeling idea to sections in the 
pipeline config file and propose the following extensions to the format. 
Here is an example to start with:

[database: connection from database == 0.6.0]
host = 'mysqldb.pythonweb.org'
user = 'foo'
password = 'bar'

[connection from database == 0.6.0]
extra-non-standard-params = 'params'

[application: testApplication from web == 0.6.0]
message = 'Hello World!'


Each section represents the configuration of one piece of middleware, as 
before. Standard configuration sections are labeled. Non-standard 
extensions to a standard section use the same deployment string but with 
no label, so in the example extra-non-standard-params = 'params' is 
considered a non-standard extension to the database configuration.

This has three advantages:

1. Standardisation between similar WSGI middleware components becomes 
easier because we could all agree to name standard database connection 
parameters as database so middleware can be more interoperable. 
Non-standard extensions can be named in a similar config section but 
without the database label so that we define an extensible base standard.

2. Configuration can be accessed in code by name, e.g. 
config.get('database'), or config.getAll('database') to get custom 
extensions too. This means that whatever version of a package you are 
using you can still refer to the correct configuration easily, and also 
use the configuration file in external scripts, e.g. to set up necessary 
database tables, without creating the full middleware chain.

3. It allows us to create a configuration hierarchy. I've written a WSGI 
framework named Bricks http://www.pythonweb.org/bricks/ and the way it 
works is to have a global config file for all applications at a site and 
then a local config file if the application needs to override global 
settings or provide extra middleware. The logic behind this is that 
things like database connections are likely to be used by all 
applications across a site and a new application you have installed from 
a third party is not going to have the correct database settings so you 
would want to use the settings defined in the global config file. Using 
the new config file format we could simply say that if a global 
configuration does not already have a named config section which appears 
in a local config file then the local configuration is added below the 
last piece of global configuration that matched (or at the end if no 
matches were found).

We can also define an extension to this basic format: always include and 
always exclude, determined by a + or - sign just before the section label, 
so that we can also override global settings in a local config file and 
provide a flexible configuration chain of as many config files as we 
like.

Here is an example to illustrate. We have an application which doesn't 
need authorisation middleware but does need a session store. It needs a 
database connection but is only capable of interacting with the one it 
specifies. It also needs some configuration of its own. It is installed 
on a site with other applications which use database, session and auth 
middleware. The site administrator wants all applications to use GZip 
encoded output.

global.wsgi:

[gzip: gzip from compression==0.1.0]

[database: connection from database== 0.6.0]
adapter = 'mysql'
database = 'test'
user = 'foo'
password = 'bar'

[auth: auth from database==0.6.0]
extra-non-standard-params = 'params'

[session: session from web==0.6.0]
params = 'interesting'

local.wsgi:

# Override any other database definitions
[+database: connection from database==0.6.0]
adapter = 'engine'
database = 'default'

# Define a session configuration to be used if no other is available
[session: session from web==0.6.0]
params = 'default'

[application: appSettings from app==0.1.0]
name = 'foo'

# The user installing the application wants to specifically
# exclude auth middleware since they know it isn't needed
[-auth]

If the global configuration file wasn't present, the application 
configuration would look like this:

[database: connection from database==0.6.0]
adapter = 'engine'
database = 'default'

[session: session from web==0.6.0]
params = 'default'

[application: appSettings from app==0.1.0]
name = 'foo'

But when it is installed on a site with the global config file it looks 
like this:

[gzip: gzip from compression==0.1.0]

[database: connection from database==0.6.0]
adapter = 'engine'
database = 'default'

[session: session from web==0.6.0]
params = 'interesting'

[application: appSettings from app==0.1.0]
name = 'foo'
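
The merge rules described above can be sketched in Python. This is only an 
illustration, not part of any agreed format: it assumes the sections have 
already been parsed into (label, spec, params) tuples, that every section 
is labeled, and the name merge_sections is made up for the example.

```python
def merge_sections(global_sections, local_sections):
    """Merge local pipeline sections into the global ones.

    Each section is a (label, spec, params) tuple.  A local label may carry
    a '+' prefix (always override) or a '-' prefix (always exclude).
    """
    merged = list(global_sections)
    excluded = set()
    last_match = None  # index of the most recent section matched or added

    for label, spec, params in local_sections:
        flag = label[0] if label.startswith(('+', '-')) else ''
        name = label.lstrip('+-')
        if flag == '-':
            excluded.add(name)
            continue
        index = next((i for i, s in enumerate(merged) if s[0] == name), None)
        if index is None:
            # Not defined globally: add below the last match, or at the end.
            index = len(merged) if last_match is None else last_match + 1
            merged.insert(index, (name, spec, params))
        elif flag == '+':
            # Defined globally, but the local file insists on its own version.
            merged[index] = (name, spec, params)
        last_match = index

    # Exclusions are applied last so they do not disturb insert positions.
    return [s for s in merged if s[0] not in excluded]
```

Fed the global.wsgi and local.wsgi sections from the example, this should 
yield gzip, database (local settings), session (global settings) and 
application, in that order, with auth dropped.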

So as you can see this allows a flexible deployment hierarchy. What do 
you think?

On a broader point I'd like to see pipeline configuration (what we are 
talking about here) quite separate from the actual deployment details 
such as where the application is going to be installed on a URL. I don't 
think there is any need to standardise the latter as long as all 
frameworks are capable of using the basic pipeline and deploying it in a 
way they see fit, otherwise you start making the main application 
configuration too framework-specific.

James

From pje at telecommunity.com  Sun Aug  7 19:16:53 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 07 Aug 2005 13:16:53 -0400
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <42F6359F.9000504@pythonweb.org>
References: <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com>
	<42EA57D2.1060902@colorstudy.com>
Message-ID: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>

At 05:23 PM 8/7/2005 +0100, James Gardner wrote:
>This has three advantages:
>
>1. Standardisation between similar WSGI middleware components becomes 
>easier because we could all agree to name standard database connection 
>parameters as database so middleware can be more interoperable. 
>Non-standard extensions can be named in a similar config section but 
>without the database label so that we define an extensible base standard.

The point isn't to have a standardized format or globally accessible 
configuration, it's to *hide* configuration so that other objects don't 
have to know about it.


>2. Configuration can be accessed in code by name eg config.get('database') 
>or config.getAll('database') to get custom extensions too. This means that 
>whatever version of a package you are using you can still refer to the 
>correct configuration easily and also use the configuration file in 
>external scripts eg. to setup necessary database tables etc without 
>creating the full middleware chain.

That doesn't require access to the data as data; the database should just 
be a service.  An example of why: PEAK uses database connection URLs like 
"postgres://foo:bar@example.com/dbname" to designate databases, so it would 
be a step back to force PEAK users to use your user/password/etc. 
configuration scheme in order to be able to interoperate.

It makes more sense, therefore, to have configuration be private to 
components unless those components *want* to share that 
configuration.  However, the way they share it might be different than the 
input.  For example, PEAK has a URL connection class that has 
user/password/etc. attributes on it, so it could certainly implement an 
interface to provide that information to components that want it.  But that 
doesn't mean that the *source* configuration was done that way.  Preserving 
a separation between interface and implementation is vital to the 
maintainability of the overall system.
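
To make the interface/implementation split concrete, here is a rough 
sketch of a service that is configured from a single URL but still offers 
user/password/etc. attributes to components that ask for them. It is not 
PEAK's actual class; DatabaseService and its attribute names are 
hypothetical, and the parsing uses the standard library's urllib.parse.

```python
from urllib.parse import urlsplit


class DatabaseService:
    """Hypothetical service: configured by URL, exposes parsed attributes."""

    def __init__(self, conn):
        parts = urlsplit(conn)
        # The source configuration is one opaque string; the attributes
        # below are the interface other components may rely on.
        self.user = parts.username
        self.password = parts.password
        self.host = parts.hostname
        self.database = parts.path.lstrip('/')


service = DatabaseService("postgres://foo:bar@example.com/dbname")
print(service.user, service.host, service.database)
```

A component that wants the user name reads service.user and never needs to 
know whether the deployment file spelled it as a URL or as separate keys.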


>3. It allows us to create a configuration hierarchy. I've written a WSGI 
>framework named Bricks http://www.pythonweb.org/bricks/ and the way it 
>works is to have a global config file for all applications at a site and 
>then a local config file if the application needs to override global 
>settings or provide extra middleware. The logic behind this is that things 
>like database connections are likely to be used by all applications across 
>a site and a new application you have installed from a third party is not 
>going to have the correct database settings so you would want to use the 
>settings defined in the global config file. Using the new config file 
>format we could simply say that if a global configuration does not already 
>have a named config section which appears in a local config file then the 
>local configuration is added below the last piece of global configuration 
>that matched (or at the end if no matches were found).

I'm -1 on exposing the data as direct configuration.  It should be opaque, 
and accessed as *services*.  Otherwise you're just reinventing the worst 
problems of Zope 2-era design.

We probably *do* need a way to declare services (like your database 
example), and a service discovery API.  We *don't* want to make deployment 
data into directly-accessible configuration.  This doesn't mean you can't 
create service objects whose whole job is to provide configuration data in 
some way, it just means that the deployment parameters themselves should be 
opaque.

The reason for this is that without encapsulation, you get spaghetti 
dependencies, and it becomes difficult to change things programmatically if 
you have no way to influence data dynamically.  This was a really big 
problem in older versions of Zope 2, that encouraged acquisition of random 
configuration properties.  There's really no point in us repeating that 
mistake.

Here's what I'd suggest as an alternative, using a slight syntax tweak:

    [sql service from somedbpackage]
    conn = "some://url"    # or you can do it the awkward way instead
    # ... etc.

So "service" or "service from" are the keywords to define a service.  For 
"service from", the first part is looked up in a wsgi.service_factories 
entry point group.  For "service", it's just imported.  Either way, the 
factory is invoked with the previous service provider to create a kind of 
"service chain".  The current head of the service chain is passed into 
middleware and application factories as the first parameter, so they can 
use it to find services.

We then define a simple API for walking the service chain and locating 
services by name or other keys.  This approach is capable of doing 
everything you've proposed, except that it doesn't provide access to the 
private configuration data of individual services.  It would be possible, 
however, to load the service chain from a deployment file without 
instantiating applications or middleware, in order to e.g. run utility 
programs.  You can still include arbitrary configuration if you want, just 
by creating a service whose job is to provide such information.

The only other piece I think we're missing is a way to handle branching, 
because our pipeline configuration is quite linear.  There's no obvious way 
to branch at the moment, except by having a way to configure a middleware 
component to refer to other pipelines.


From james at pythonweb.org  Sun Aug  7 20:33:53 2005
From: james at pythonweb.org (James Gardner)
Date: Sun, 07 Aug 2005 19:33:53 +0100
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
References: <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com>
	<42EA57D2.1060902@colorstudy.com>
	<5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
Message-ID: <42F65411.4040901@pythonweb.org>

Phillip J. Eby wrote:

> This approach is capable of doing everything you've proposed, except 
> that it doesn't provide access to the private configuration data of 
> individual services.  It would be possible, however, to load the 
> service chain from a deployment file without instantiating 
> applications or middleware, in order to e.g. run utility programs.  
> You can still include arbitrary configuration if you want, just by 
> creating a service whose job is to provide such information. 

OK, fair point and I'm perfectly happy with this.

> The point isn't to have a standardized format or globally accessible 
> configuration, it's to *hide* configuration so that other objects 
> don't have to know about it. 

What exactly are you defining as a service then? A service would have to 
have some way of providing its useful code to utilities etc as well as 
deploying middleware. In the original model each WSGI middleware 
component might rely on other ones, both the middleware component and 
middleware it relies on might need configuration. We can describe the 
whole middleware chain in a config file so that it can all be configured 
at once. I might be missing the point but is your idea of services that 
by passing them the service chain they have an opportunity to decide 
what services to load based on services they rely on and thereby bypass 
some of the configuration? Surely almost all middleware would need at 
least some configuration so you are unlikely to make the config file 
much shorter?

> The only other piece I think we're missing is a way to handle 
> branching, because our pipeline configuration is quite linear.  
> There's no obvious way to branch at the moment, except by having a way 
> to configure a middleware component to refer to other pipelines.

I don't think I've quite caught your full vision here. Using the 
services idea, my understanding is just that an application needs certain 
services to function, and also certain configuration for those services 
before it can run. Since many applications on the same site may need the 
same services configured in the same way, it is useful to be able to 
share configuration, and to do that it is helpful for a local application 
to inherit configuration from another source of components, possibly in 
the way I suggested. I don't think branching really fits into that model, 
so how are you envisaging deployments?

Cheers,

James


From pje at telecommunity.com  Sun Aug  7 21:45:07 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 07 Aug 2005 15:45:07 -0400
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <42F65411.4040901@pythonweb.org>
References: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
	<42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com>
	<42EA57D2.1060902@colorstudy.com>
	<5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com>

At 07:33 PM 8/7/2005 +0100, James Gardner wrote:
>Phillip J. Eby wrote:
>
>>This approach is capable of doing everything you've proposed, except that 
>>it doesn't provide access to the private configuration data of individual 
>>services.  It would be possible, however, to load the service chain from 
>>a deployment file without instantiating applications or middleware, in 
>>order to e.g. run utility programs.
>>You can still include arbitrary configuration if you want, just by 
>>creating a service whose job is to provide such information.
>
>OK, fair point and I'm perfectly happy with this.
>
>>The point isn't to have a standardized format or globally accessible 
>>configuration, it's to *hide* configuration so that other objects don't 
>>have to know about it.
>
>What exactly are you defining as a service then?

A component that needs to be available to one or more other components, 
based on some lookup key (like a name or an interface).


>  A service would have to have some way of providing its useful code to 
> utilities etc as well as deploying middleware.

I think maybe you're confusing something here.  I'm suggesting that there 
be a chain of service providers, and that the WSGI API to load a pipeline 
should return both a top-down middleware-to-app chain, and a bottom-up 
service-to-service chain.  Thus, a utility program could load a WSGI file 
and gain access to the service chain, ignoring the middleware.

But, I'm not saying that services are *part* of the middleware chain; 
middleware components get created with access to the service chain, but 
the services themselves are not middleware.


>  In the original model each WSGI middleware component might rely on other 
> ones, both the middleware component and middleware it relies on might 
> need configuration. We can describe the whole middleware chain in a 
> config file so that it can all be configured at once. I might be missing 
> the point but is your idea of services that by passing them the service 
> chain they have an opportunity to decide what services to load based on 
> services they rely on and thereby bypass some of the configuration?

I don't understand you.  They just get what services they need from the 
chain.  They don't "bypass" configuration they never cared about in the 
first place.


>  Surely almost all middleware would need at least some configuration so 
> you are unlikely to make the config file much shorter?

But their configuration is in the parameters that get passed to their 
factory, e.g.

      [fooware from blah]
      something1 = "feh"
      # etc.


>>The only other piece I think we're missing is a way to handle branching, 
>>because our pipeline configuration is quite linear.
>>There's no obvious way to branch at the moment, except by having a way to 
>>configure a middleware component to refer to other pipelines.
>
>I don't think I've quite caught your full vision here. Using the services 
>idea my understanding is just that an application needs certain services 
>to function and also certain configuration for those services before it 
>can run, since many applications on the same site may need the same 
>services configured in the same way it is useful to be able to share 
>configuration and to do that

In which case, there should be a mechanism for configuring things based on 
other service lookups, e.g.

     [spazware from spiz]
     fidgety = lookup("fizzit.ping")

Here 'lookup()' would mean, "search the service chain above me for a 
configuration service and return the value of 'fizzit.ping'".

My point here isn't to propose that this be the API, I'm just presenting a 
general concept.  "Wiring" of configuration by simply acquiring values from 
a global namespace doesn't work well even for applications developed 
entirely by a single person; it definitely doesn't scale to plug-and-play 
of components developed by an entire community.


>it is helpful for a local application to inherit configuration from 
>another source of components, possibly in the way I suggested. I don't 
>think branching really fits into that model so how are you envisaging 
>deployments?

The branching was for saying things like "/foo goes to pipeline A, and /bar 
goes to pipeline B".
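
As an illustration of that kind of branching, a middleware component could 
hold references to other pipelines and pick one by URL prefix. The sketch 
below assumes each pipeline is already available as a WSGI application; 
make_branch and its arguments are invented for the example and are not a 
proposed API.

```python
def make_branch(branches, default=None):
    """Return a WSGI app that routes requests by URL path prefix.

    `branches` maps a prefix such as '/foo' to a WSGI application, for
    instance the head of an already-built pipeline.
    """
    # Try longer prefixes first so '/foo/bar' wins over '/foo'.
    ordered = sorted(branches.items(), key=lambda item: len(item[0]),
                     reverse=True)

    def branch_app(environ, start_response):
        path = environ.get('PATH_INFO', '')
        for prefix, app in ordered:
            if path == prefix or path.startswith(prefix + '/'):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME so
                # the target pipeline sees paths relative to its mount point.
                environ = dict(environ)
                environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
                environ['PATH_INFO'] = path[len(prefix):]
                return app(environ, start_response)
        if default is not None:
            return default(environ, start_response)
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not Found']

    return branch_app
```

With this, make_branch({'/foo': pipeline_a, '/bar': pipeline_b}) would send 
/foo requests to pipeline A and /bar requests to pipeline B, where 
pipeline_a and pipeline_b stand for whatever the loader produced.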

It's becoming clear to me, though, that we need to *ban* the word 
"configuration" from this discussion, because it's way too overloaded, and 
everybody brings unique baggage to it.  If we don't use that word, we'll 
have to actually explain what we really mean.  :)

So, in that spirit, I will now rephrase my proposal so as not to use the word 
"configuration".

A "pipeline spec" describes how to deploy a WSGI application, optionally 
with middleware filters and services, by providing parameters to designated 
factories.  There are three kinds of factories: application, middleware, 
and service.  All three kinds are invoked with the parameters defined in 
the spec and the most-recently specified service object.  Middleware 
factories also receive the *next* middleware or application component 
defined below them in the spec.

An example middleware factory signature:

      def make_middleware(last_service, application_to_wrap, **params):

Example application and service factory signatures:

      def make_app(last_service, **params):

      def make_service(last_service, **params):

Just as the middleware-to-application links form a "downward" chain of 
responsibility for handling WSGI requests, the service-to-service links 
form an "upward" chain of responsibility for acquiring service 
components.  There needs to be a specification for how to search the chain; 
for example we could have a 'get_service(key)' method required on service 
components, and if the service doesn't recognize the key it just calls 
'last_service.get_service()'.
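
A rough sketch of how the two chains might be built from a pipeline spec 
follows. It is only one possible reading of the description above: the 
spec is assumed to be an already-parsed list of (kind, factory, params) 
entries in file order with the application last, and the names Service, 
provides and load_pipeline are hypothetical.

```python
class Service:
    """Hypothetical base class for one link in the service chain."""

    provides = ()  # keys this service answers to

    def __init__(self, last_service, **params):
        self.last_service = last_service
        self._params = params  # private to the service

    def get_service(self, key):
        if key in self.provides:
            return self
        if self.last_service is not None:
            # Not ours: pass the request up the chain.
            return self.last_service.get_service(key)
        raise LookupError(key)


def load_pipeline(spec):
    """Return (wsgi_app, last_service) for a parsed pipeline spec."""
    last_service = None
    pending = []
    # Top-down pass: create services, and remember which service was the
    # most recent one when each middleware/application entry appeared.
    for kind, factory, params in spec:
        if kind == 'service':
            last_service = factory(last_service, **params)
        else:
            pending.append((kind, factory, params, last_service))
    # Bottom-up pass: build the application, then wrap it in middleware.
    app = None
    for kind, factory, params, service in reversed(pending):
        if kind == 'application':
            app = factory(service, **params)
        else:
            app = factory(service, app, **params)
    return app, last_service
```

A utility program could call load_pipeline() and use only the second 
return value, ignoring the middleware. A fuller loader would probably 
avoid instantiating the application at all in that case.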

In some circumstances, the same value or object is needed as a factory 
parameter for more than one component.  In these cases, it would be useful 
to be able to have a way to specify shared parameters in the 
specification.  Ordinarily, these shared parameters will be defined at some 
"high level" of the overall system, such as in a server-wide pipeline spec, 
and then acquired in "low level" pipeline specs for specific areas of the 
server or individual application components.  We can thus envision a 
"shared parameter service" interface for publishing values that need to be 
used often, and an API in the pipeline spec to indicate that a parameter 
should be retrieved from the nearest shared parameter service that offers a 
value for a given name.

This approach is superior to using a common namespace for parameters, 
because the level of abstraction at which shared parameters are defined is 
more likely to be concepts like "system administrator e-mail", but that 
value might then be used for more specific component parameters like "email 
errors to" and "administrator login ID".  So, being able to say that the 
"email_errors_to" parameter for a given component should be looked up from 
"sysadmin_email" in the shared parameter service allows for parameters to 
be cleanly shared between components.
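
To illustrate, a shared parameter service and the matching spec-side 
lookup might look something like the sketch below. All names here 
(SharedParams, Shared, resolve_params, get_param) are hypothetical, and 
for brevity the chain is assumed to consist only of shared parameter 
services rather than going through a general service lookup.

```python
class SharedParams:
    """Hypothetical service publishing values for use by other components."""

    def __init__(self, last_service, **values):
        self.last_service = last_service
        self._values = values

    def get_param(self, name):
        if name in self._values:
            return self._values[name]
        if self.last_service is not None:
            # Not offered here: ask the next service up the chain.
            return self.last_service.get_param(name)
        raise KeyError(name)


class Shared:
    """Marker in a spec: fetch this value from the nearest shared service."""

    def __init__(self, name):
        self.name = name


def resolve_params(last_service, params):
    """Replace Shared markers with values looked up in the service chain."""
    return {
        key: last_service.get_param(value.name)
        if isinstance(value, Shared) else value
        for key, value in params.items()
    }


# Server-wide spec publishes the value; a component maps it to its own name.
server = SharedParams(None, sysadmin_email='root@example.com')
component_params = resolve_params(
    server, {'email_errors_to': Shared('sysadmin_email'), 'retries': 3})
print(component_params)
```

The component only ever sees its own "email_errors_to" parameter; the 
name "sysadmin_email" stays a detail of the pipeline spec.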


From ianb at colorstudy.com  Mon Aug  8 19:47:56 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 08 Aug 2005 12:47:56 -0500
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com>
References: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>	<42EA57D2.1060902@colorstudy.com>
	<42E95EC4.9040906@colorstudy.com>	<42EA57D2.1060902@colorstudy.com>	<5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
	<5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com>
Message-ID: <42F79ACC.7080706@colorstudy.com>

OK, this is starting to become a bit more clear to me...

Phillip J. Eby wrote:
>> A service would have to have some way of providing its useful code to 
>>utilities etc as well as deploying middleware.
> 
> 
> I think maybe you're confusing something here.  I'm suggesting that there 
> be a chain of service providers, and that the WSGI API to load a pipeline 
> should return both a top-down middleware-to-app chain, and a bottom-up 
> service-to-service chain.  Thus, a utility program could load a WSGI file 
> and gain access to the service chain, ignoring the middleware.
> 
> But, I'm not saying that services are *part* of the middleware chain; 
> middleware components get created with access to the middleware chain, but 
> the services themselves are not middleware.

So, thinking back to the transaction middleware I speculated about: 
http://blog.ianbicking.org/more-perfect-app-server-wsgi-transactions.html

In your model with services, I think you are suggesting some middleware 
like this will still exist.  In fact, it would look very close to the 
way it looks in that example, except that instead of putting the Manager 
in the WSGI environment, some service would create the manager, and both 
the middleware and a transaction-user would use this service to get the 
manager.  (In case it creates confusion, I think Zope uses a different 
term for the manager; maybe it is simply a "transaction", I can't 
remember now)

So for many services some middleware would still be necessary, if the 
service was able to do anything to the request.  That middleware would 
be mostly a shell.  That's fine with me -- that's how I'm writing most 
of my middleware anyway, except that the "service" part is relatively ad 
hoc, and if you use it outside of the web environment you have to wire 
up the configuration on your own.  Which isn't what I want either.

If you *don't* want a middleware for every request/response-modifying 
service, then you'd need some uber-middleware like I mentioned back in 
http://mail.python.org/pipermail/web-sig/2005-July/001532.html -- in 
addition to saving some frames in the call stack, that would probably 
make pipeline specification easier.  But maybe not a whole lot easier, 
as there's usually additional details (like ordering) that are necessary 
to specify in the context of a web request.

>>it is helpful for a local application to inherit configuration from 
>>another source of components, possibly in the way I suggested. I don't 
>>think branching really fits into that model so how are you envisaging 
>>deployments?
> 
> 
> The branching was for saying things like "/foo goes to pipeline A, and /bar 
> goes to pipeline B".

The spec I gave in "WSGI deployment: an experiment" 
(http://mail.python.org/pipermail/web-sig/2005-July/001598.html) handles 
arbitrary kinds of branching, basically by naming both applications and 
middleware filters, and allowing application factories to call back into 
the configuration file.  So pipeline is just another application 
factory, just like urlmap or other kinds of branching.

Maybe this could be handled with an application-building service since 
we're passing services around anyway.

> A "pipeline spec" describes how to deploy a WSGI application, optionally 
> with middleware filters and services, by providing parameters to designated 
> factories.  There are three kinds of factories: application, middleware, 
> and service.  All three kinds are invoked with the parameters defined in 
> the spec and the most-recently specified service object.  Middleware 
> factories also receive the *next* middleware or application component 
> defined below them in the spec.
> 
> An example middleware factory signature:
> 
>       def make_middleware(last_service, application_to_wrap, **params):

It might add to the consistency if make_middleware takes the same 
parameters as the other two factories, except it builds "middleware" (or 
"middleware filters" to make the term less enterprisy) which are 
functions that, when passed in an application, return an application 
that wraps that application.  Though I would not object to a method 
instead of just calling the factory; I think we risk a maze of function 
calls, all looking the same.

Then the higher-level operation is "build something of type foo", where 
foo is a WSGI application, a WSGI middleware filter, a service, or 
something else.
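
To make that concrete, here is a rough sketch of what I mean; `make_filter` and the header-adding behaviour are made up for illustration, and only the `(last_service, **params)` shape comes from the proposal.

```python
def make_filter(last_service, **params):
    """Hypothetical filter factory: same arguments as make_app/make_service."""
    header = (params.get("name", "X-Example"), params.get("value", "1"))

    def filter_(application):
        # The filter takes an application and returns a wrapping application.
        def wrapped(environ, start_response):
            def start(status, headers, exc_info=None):
                return start_response(status, list(headers) + [header], exc_info)
            return application(environ, start)
        return wrapped

    return filter_


def make_app(last_service, **params):
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello"]
    return app


# "Build a filter", "build an app", then apply one to the other.
pipeline = make_filter(None, name="X-Wrapped", value="yes")(make_app(None))
```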

> Example application and service factory signatures:
> 
>       def make_app(last_service, **params):
> 
>       def make_service(last_service, **params):
> 
> Just as the middleware-to-application links form a "downward" chain of 
> responsibility for handling WSGI requests, the service-to-service links 
> form an "upward" chain of responsibility for acquiring service 
> components.  There needs to be a specification for how to search the chain; 
> for example we could have a 'get_service(key)' method required on service 
> components, and if the service doesn't recognize the key it just calls 
> 'last_service.get_service()'.
> 
> In some circumstances, the same value or object is needed as a factory 
> parameter for more than one component.  In these cases, it would be useful 
> to be able to have a way to specify shared parameters in the 
> specification.  Ordinarily, these shared parameters will be defined at some 
> "high level" of the overall system, such as in a server-wide pipeline spec, 
> and then acquired in "low level" pipeline specs for specific areas of the 
> server or individual application components.  We can thus envision a 
> "shared parameter service" interface for publishing values that need to be 
> used often, and an API in the pipeline spec to indicate that a parameter 
> should be retrieved from the nearest shared parameter service that offers a 
> value for a given name.
>
> This approach is superior to using a common namespace for parameters, 
> because the level of abstraction at which shared parameters are defined is 
> more likely to be concepts like "system administrator e-mail", but that 
> value might then be used for more specific component parameters like "email 
> errors to" and "administrator login ID".  So, being able to say that the 
> "email_errors_to" parameter for a given component should be looked up from 
> "sysadmin_email" in the shared parameter service allows for parameters to 
> be cleanly shared between components.

I think this would address some of the configuration concerns I've had.  
I don't mind being very explicit in my code about how configuration is 
acquired; I just don't want to push that work onto the person doing the 
configuration, and I want sensible (and possibly derivative) defaults.

While Zope 2 gets hairy in its use of Acquisition -- essentially adding 
dynamic scoping to the core of the system -- the basic technique is not 
necessarily incorrect.  Lisps get by okay with dynamic scopes, but they 
clearly mark variables as being so typed (like *current-output-stream*).  
If we add dynamic-scope-like functionality, we just need to make sure 
it's clear where it's being used, and that it's not the default so it 
isn't used when not necessary.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug  9 03:20:33 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 08 Aug 2005 21:20:33 -0400
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <42F79ACC.7080706@colorstudy.com>
References: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com>
	<5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
	<42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com>
	<42EA57D2.1060902@colorstudy.com>
	<5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
	<5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050808151235.025bf4b8@mail.telecommunity.com>

At 12:47 PM 8/8/2005 -0500, Ian Bicking wrote:
>OK, this is starting to become a bit more clear to me...

Cool.  :)  Sometimes the best way to get rid of confusing communication is 
to make the communication even more difficult.  :)  (e.g., banning the word 
"configuration")


>In your model with services, I think you are suggesting some middleware 
>like this will still exist.  In fact, it would look very close to the way 
>it looks in that example, except that instead of putting the Manager in 
>the WSGI environment, some service would create the manager, and both the 
>middleware and a transaction-user would use this service to get the manager.

Yes.


>So for many services some middleware would still be necessary, if the 
>service was able to do anything to the request.

Well yeah, if you want to wrap an app rather than just use the service.


>   That middleware would be mostly a shell.  That's fine with me -- that's 
> how I'm writing most of my middleware anyway, except that the "service" 
> part is relatively ad hoc, and if you use it outside of the web 
> environment you have to wire up the configuration on your own.  Which 
> isn't what I want either.

Note that you can use pipeline specs to configure arbitrary service chains, 
without WSGI even being involved.  So, to a certain extent, services can 
stand on their own.  What's interesting about that (to me anyway) is that 
if there are bridges that allow PEAK or Zope services to be used as WSGI 
services, then pipelines can be used to bridge various frameworks' service 
systems - without a web application in sight.


>If you *don't* want a middleware for every request/response-modifying 
>service, then you'd need some uber-middleware like I mentioned back in 
>http://mail.python.org/pipermail/web-sig/2005-July/001532.html -- in 
>addition to saving some frames in the call stack, that would probably make 
>pipeline specification easier.  But maybe not a whole lot easier, as 
>there's usually additional details (like ordering) that are necessary to 
>specify in the context of a web request.

Well, to me, the "uber middleware" is just an object with a generic 
function for its __call__ method, that has "before", "after", and "around" 
methods registered to do stuff like transaction wrapping, error handling, 
and any other sort of middleware-ish things.  So, it's not very "uber" in 
implementation complexity from my POV to have such a thing, and it takes 
care of many of the stacking issues.
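
A rough sketch of the shape I have in mind.  It uses plain hook lists instead of a real generic function, and `HookMiddleware` and its attribute names are hypothetical.  As a simplification, the "after" hooks run when the application returns its iterable, not when the body has been fully consumed.

```python
class HookMiddleware:
    """Hypothetical single middleware with registered before/after hooks."""

    def __init__(self, application):
        self.application = application
        self.before = []  # callables taking environ
        self.after = []   # callables taking (environ, exc); exc is None on success

    def __call__(self, environ, start_response):
        for hook in self.before:
            hook(environ)
        exc = None
        try:
            return self.application(environ, start_response)
        except Exception as error:
            exc = error
            raise
        finally:
            # Unwind in reverse order, like nested middleware would.
            for hook in reversed(self.after):
                hook(environ, exc)


log = []


def app(environ, start_response):
    log.append("app")
    start_response("200 OK", [])
    return [b"ok"]


stack = HookMiddleware(app)
stack.before.append(lambda environ: log.append("begin"))
stack.after.append(lambda environ, exc: log.append("abort" if exc else "commit"))
```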


>The spec I gave in "WSGI deployment: an experiment" 
>(http://mail.python.org/pipermail/web-sig/2005-July/001598.html) handles 
>arbitrary kinds of branching, basically by naming both applications and 
>middleware filters, and allowing application factories to call back into 
>the configuration file.  So pipeline is just another application factory, 
>just like urlmap or other kinds of branching.
>
>Maybe this could be handled with an application-building service since 
>we're passing services around anyway.

Hm.  An interesting point.  I haven't yet seen a branching/alternatives 
syntax I like though.  The big problem IMO is that a branching mechanism 
requires nesting ability, whereas pipelines are "flat and happy".  :)

Unfortunately, .ini syntax rapidly breaks down when nesting begins, which 
makes me tend to think that we should have a separate "site map" file that 
maps locations and other rules to groups of pipelines.


>>A "pipeline spec" describes how to deploy a WSGI application, optionally 
>>with middleware filters and services, by providing parameters to 
>>designated factories.  There are three kinds of factories: application, 
>>middleware, and service.  All three kinds are invoked with the parameters 
>>defined in the spec and the most-recently specified service 
>>object.  Middleware factories also receive the *next* middleware or 
>>application component defined below them in the spec.
>>An example middleware factory signature:
>>       def make_middleware(last_service, application_to_wrap, **params):
>
>It might add to the consistency if make_middleware takes the same 
>parameters as the other two factories, except it builds "middleware" (or 
>"middleware filters" to make the term less enterprisy) which are functions 
>that, when passed in an application, return an application that wraps that 
>application.  Though I would not object to a method instead of just 
>calling the factory; I think we risk a maze of function calls, all looking 
>the same.

I don't see a problem with the signature being different, to be 
honest.  Making it the same implies a similarity that doesn't exist.  If we 
were to change for consistency's sake, we should instead change the 
application factory signature to match that of middleware, and use 'None' 
for the 'application_to_wrap' in that case.  Applications and middleware 
are more alike than either of them are like services.


>I think this would address some of the configuration concerns I've had.  I 
>don't mind being very explicit in my code about how configuration is 
>acquired; I just don't want to push that work onto the person doing the 
>configuration, and I want sensible (and possibly derivative) defaults.

You can do that, sure.


>While Zope 2 gets hairy in its use of Acquisition -- essentially adding 
>dynamic scoping to the core of the system -- the basic technique is not 
>necessarily incorrect.  Lisps get by okay with dynamic scopes, but they 
>clearly mark variables as being so typed (like 
>*current-output-stream*).  If we add dynamic-scope-like-functionality, we 
>just need to make sure it's clear where it's being used, and that it's not 
>the default so it isn't used when not necessary.

Right - explicit indirection or redirection avoids a lot of problems here.


From ianb at colorstudy.com  Tue Aug  9 05:17:45 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 08 Aug 2005 22:17:45 -0500
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <5.1.1.6.0.20050808151235.025bf4b8@mail.telecommunity.com>
References: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com>
	<5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
	<42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com>
	<42EA57D2.1060902@colorstudy.com>
	<5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com>
	<5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com>
	<5.1.1.6.0.20050808151235.025bf4b8@mail.telecommunity.com>
Message-ID: <42F82059.2060402@colorstudy.com>

Phillip J. Eby wrote:
> At 12:47 PM 8/8/2005 -0500, Ian Bicking wrote:
> 
>> OK, this is starting to become a bit more clear to me...
> 
> 
> Cool.  :)  Sometimes the best way to get rid of confusing communication 
> is to make the communication even more difficult.  :)  (e.g., banning 
> the word "configuration")

Well, it wasn't really that per se; after reading through the latest 
thread between you and James it became a bit clearer what you meant by 
services.  You've been a bit vague about services up until now (and I'm 
not familiar with Zope or PEAK services, so I've just been guessing at 
what you've meant).

>> So for many services some middleware would still be necessary, if the 
>> service was able to do anything to the request.
> 
> 
> Well yeah, if you want to wrap an app rather than just use the service.

I'm thinking of any service that needs to modify the request and 
response, or watch the request in some way (e.g., a transaction service 
that needs to watch for unexpected exceptions, or a session service that 
needs to add a cookie to the response when starting a new session). 
Services certainly have a much larger scope than that, but then most of 
that larger scope is workable as mere "libraries" (except for the 
configuration problem, which services do address).
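
As a rough illustration of the "service plus thin middleware shell" split for sessions: `SessionService` and `session_filter` below are hypothetical names, and the ID scheme is deliberately naive.

```python
from http.cookies import SimpleCookie


class SessionService:
    """Hypothetical service: owns session state, knows nothing about WSGI."""

    def __init__(self):
        self.sessions = {}

    def new_session(self):
        session_id = "s%d" % (len(self.sessions) + 1)
        self.sessions[session_id] = {}
        return session_id


def session_filter(service, application, cookie_name="sid"):
    """The middleware shell: only touches the request and response."""

    def wrapped(environ, start_response):
        jar = SimpleCookie(environ.get("HTTP_COOKIE", ""))
        known = cookie_name in jar and jar[cookie_name].value in service.sessions
        extra = []
        if not known:
            # New session: the response needs a cookie added.
            extra.append(("Set-Cookie", "%s=%s" % (cookie_name, service.new_session())))

        def start(status, headers, exc_info=None):
            return start_response(status, list(headers) + extra, exc_info)

        return application(environ, start)

    return wrapped
```

A non-web utility could use `SessionService` directly and never see the filter.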

>> If you *don't* want a middleware for every request/response-modifying 
>> service, then you'd need some uber-middleware like I mentioned back in 
>> http://mail.python.org/pipermail/web-sig/2005-July/001532.html -- in 
>> addition to saving some frames in the call stack, that would probably 
>> make pipeline specification easier.  But maybe not a whole lot easier, 
>> as there's usually additional details (like ordering) that are 
>> necessary to specify in the context of a web request.
> 
> 
> Well, to me, the "uber middleware" is just an object with a generic 
> function for its __call__ method, that has "before", "after", and 
> "around" methods registered to do stuff like transaction wrapping, error 
> handling, and any other sort of middleware-ish things.  So, it's not 
> very "uber" in implementation complexity from my POV to have such a 
> thing, and it takes care of many of the stacking issues.

Uber in that it doesn't have any specific purpose, and really leads into 
the direction of framework instead of library.  All the middleware to 
date are targeted at providing one bit of functionality; there's a 
certain clarity to that.  A single more powerful middleware is 
interesting; but it's also harder to imagine it being complete.

>> The spec I gave in "WSGI deployment: an experiment" 
>> (http://mail.python.org/pipermail/web-sig/2005-July/001598.html) 
>> handles arbitrary kinds of branching, basically by naming both 
>> applications and middleware filters, and allowing application 
>> factories to call back into the configuration file.  So pipeline is 
>> just another application factory, just like urlmap or other kinds of 
>> branching.
>>
>> Maybe this could be handled with an application-building service since 
>> we're passing services around anyway.
> 
> 
> Hm.  An interesting point.  I haven't yet seen a branching/alternatives 
> syntax I like though.  The big problem IMO is that a branching mechanism 
> requires nesting ability, whereas pipelines are "flat and happy".  :)

I find the use of named applications/filters to make the nesting 
reasonable.  I'm happy enough with the syntax I propose.  But then, I 
also think that there's still something to the idea of another layer of 
indirection, and configuration files that self-identify.

I really see no reason to think we can fully identify the Right Way to 
configure applications (including all meanings of "configure") here and 
now.  I'm happy with One Good Way, and future extensibility.  So I still 
think "# -*- config.loader:ref -*-" is a good idea.
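
A rough sketch of how a loader might honor such a line, using the "wsgi-build: module:attr" form from earlier in the thread.  The function names are made up, and checking only the first two lines is an assumption borrowed from PEP 263.

```python
import importlib
import re

# Matches e.g. "# -*- wsgi-build: package.module:attr -*-"
MAGIC = re.compile(r"-\*-\s*wsgi-build:\s*([\w.]+):([\w.]+)\s*-\*-")


def find_builder_ref(text, max_lines=2):
    """Return (module, attr) from a magic comment in the first lines, or None."""
    for line in text.splitlines()[:max_lines]:
        match = MAGIC.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


def load_builder(text):
    """Import and return the object the config file says can build it."""
    ref = find_builder_ref(text)
    if ref is None:
        raise ValueError("no wsgi-build line found")
    module_name, attr_path = ref
    obj = importlib.import_module(module_name)
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj
```

The file format itself stays opaque to the loader; only the named builder needs to understand it.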

> Unfortunately, .ini syntax rapidly breaks down when nesting begins, 
> which makes me tend to think that we should have a separate "site map" 
> file that maps locations and other rules to groups of pipelines.

Nesting is one way of looking at it; but then mere references work okay 
as well.  I think at any point we want to be able to say "get the thing 
from this file" instead of "get the thing from this section".  Given 
that, it doesn't seem like nesting is a good fit.  Are there any specific 
problems you have with my previous proposal?

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From jjinux at gmail.com  Fri Aug 12 12:11:25 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Fri, 12 Aug 2005 03:11:25 -0700
Subject: [Web-SIG] and now for something completely different!
Message-ID: <c41f67b90508120311563b72de@mail.gmail.com>

Hey guys,

Maybe I'm just ignorant (highly probable), but I'm really having a
hard time keeping up with the "configuration" emails, especially when
each of you is using slightly different definitions and trying to
reach slightly different goals.  Please forgive me for coming out and
stating this.

With the number of participants in the conversations, it doesn't seem
like we're making a huge amount of progress, although perhaps I should
shut up and be patient.

In the meantime, I'd like to propose that we framework authors try to
start sharing our backend session code.  Let's just create a library
like Apache::Session
<http://directory.fsf.org/webauth/misc/apache-session.html>.  As much
as possible, I think we can make it framework agnostic, relying on the
framework itself to respond to callbacks for doing things like setting
session cookies and creating a database cursor.  Just like with WSGI,
the frameworks need not change their external APIs.  Let's keep it
simple and just make it a library.

(I'm not sure the Twisted folks can participate because things on the
Twisted side are always so different, but hopefully I'm wrong.)

In any case, it's just a proposal to try to share more code.  If I can
get two other major frameworks to say they'll commit to working with
me and using/contributing to the library, I'll start the endeavor and
give them CVS commit rights.  We need not write much new code.  I'd
like to reuse code that each of us already has.  This will have the
benefit of a lot of peer review.

Perhaps this will make for a slightly better (Python Web) world :-D

Best Regards,
-jj

-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From fumanchu at amor.org  Fri Aug 12 13:14:49 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Fri, 12 Aug 2005 04:14:49 -0700
Subject: [Web-SIG] and now for something completely different!
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3772790@exchange.hqamor.amorhq.net>

Shannon -jj Behrens wrote:
> ...I'd like to propose that we framework authors try to
> start sharing our backend session code.  Let's just
> create a library like Apache::Session
> <http://directory.fsf.org/webauth/misc/apache-session.html>.
> As much as possible, I think we can make it framework
> agnostic, relying on the framework itself to respond
> to callbacks for doing things like setting session
> cookies and creating a database cursor.  Just like
> with WSGI, the frameworks need not change their
> external APIs.  Let's keep it simple and just make
> it a library.

Sounds great. Let's see what we can come up with.


Robert Brewer
CherryPy Team
fumanchu at amor.org

From ianb at colorstudy.com  Fri Aug 12 18:41:56 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 12 Aug 2005 11:41:56 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <c41f67b90508120311563b72de@mail.gmail.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>
Message-ID: <42FCD154.6010001@colorstudy.com>

Shannon -jj Behrens wrote:
> Maybe I'm just ignorant (highly probable), but I'm really having a
> hard time keeping up with the "configuration" emails, especially when
> each of you is using slightly different definitions and trying to
> reach slightly different goals.  Please forgive me for coming out and
> stating this.

No, not at all; it's not been going that fast, and I myself feel 
simultaneously over- and underwhelmed by the discussion -- it's dense, 
hard to follow, yet indecisive :-/

At this point I'm going to try to do some more formal refactoring in 
Paste of the configuration experiments I've done so far, and maybe bring 
it up again when that's more complete.  Or something; I'll keep reading 
if other people put out ideas.

> In the meantime, I'd like to propose that we framework authors try to
> start sharing our backend session code.  Let's just create a library
> like Apache::Session
> <http://directory.fsf.org/webauth/misc/apache-session.html>.  As much
> as possible, I think we can make it framework agnostic, relying on the
> framework itself to respond to callbacks for doing things like setting
> session cookies and creating a database cursor.  Just like with WSGI,
> the frameworks need not change their external APIs.  Let's keep it
> simple and just make it a library.

I think that would be useful.  Flup has a fairly decoupled session store 
(http://www.saddi.com/software/flup/ in 
http://svn.saddi.com/flup/trunk/flup/middleware/session.py).  Is there 
other current work that should be considered?  PythonWeb has a session 
module, but I don't know what its insides look like: 
http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html

Paste has one too, but it's Not Very Good ;)  I started using the flup 
session, but I got lazy and never flipped the switch to make it the 
default.  There's been some discussion about sessions in the last few 
months on the Quixote list as well.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From jjinux at gmail.com  Fri Aug 12 19:28:33 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Fri, 12 Aug 2005 10:28:33 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <42FCD154.6010001@colorstudy.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com>
Message-ID: <c41f67b9050812102825ee28eb@mail.gmail.com>

If we get CherryPy (awesome, Robert!), Quixote, and Paste onboard,
I'll consider it a huge success.

-jj

On 8/12/05, Ian Bicking <ianb at colorstudy.com> wrote:
> Shannon -jj Behrens wrote:
> > Maybe I'm just ignorant (highly probable), but I'm really having a
> > hard time keeping up with the "configuration" emails, especially when
> > each of you is using slightly different definitions and trying to
> > reach slightly different goals.  Please forgive me for coming out and
> > stating this.
> 
> No, not at all; it's not been going that fast, and I myself feel
> simultaneously over- and underwhelmed by the discussion -- it's dense,
> hard to follow, yet indecisive :-/
> 
> At this point I'm going to try to do some more formal refactoring in
> Paste of the configuration experiments I've done so far, and maybe bring
> it up again when that's more complete.  Or something; I'll keep reading
> if other people put out ideas.
> 
> > In the meantime, I'd like to propose that we framework authors try to
> > start sharing our backend session code.  Let's just create a library
> > like Apache::Session
> > <http://directory.fsf.org/webauth/misc/apache-session.html>.  As much
> > as possible, I think we can make it framework agnostic, relying on the
> > framework itself to respond to callbacks for doing things like setting
> > session cookies and creating a database cursor.  Just like with WSGI,
> > the frameworks need not change their external APIs.  Let's keep it
> > simple and just make it a library.
> 
> I think that would be useful.  Flup has a fairly decoupled session store
> (http://www.saddi.com/software/flup/ in
> http://svn.saddi.com/flup/trunk/flup/middleware/session.py).  Is there
> other current work that should be considered?  PythonWeb has a session
> module, but I don't know what its insides look like:
> http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html
> 
> Paste has one too, but it's Not Very Good ;)  I started using the flup
> session, but I got lazy and never flipped the switch to make it the
> default.  There's been some discussion about sessions in the last few
> months on the Quixote list as well.
> 
> --
> Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org
> 


-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From james at pythonweb.org  Fri Aug 12 19:40:32 2005
From: james at pythonweb.org (James Gardner)
Date: Fri, 12 Aug 2005 18:40:32 +0100
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <42FCD154.6010001@colorstudy.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com>
Message-ID: <42FCDF10.1030103@pythonweb.org>

Ian Bicking wrote:

>PythonWeb has a session 
>module, but I don't know what its insides look like: 
>http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html
>
I was going to suggest it might be worth looking at the PythonWeb 
web.session module as a basis. The version in 0.5.3 is fairly well 
developed after long discussions with Felix Schwarz on the pythonweb 
mailing list. The API is separate from the implementation so you can 
write different drivers for different storage mechanisms. I wrote a 
driver to use an SQL database engine and that driver itself uses the 
PythonWeb database module which provides an abstraction layer to work on 
multiple engines. Osvaldo Santana Neto kindly donated a file based 
driver. There is also a WSGI implementation to use the session module at:

http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/example-wsgiSession.html

The module uses the concept of a manager to manage multiple stores. The 
idea is that different applications have different stores so that their 
keys don't overwrite each other's by mistake, but all those 
applications can share the same session cookie and expire at the same 
time. You can also set the cookie properties and have the time the 
session stores expire differ from the time the cookie expires if you 
really want to. I think it makes a good starting point anyway; the docs 
are quite comprehensive and I'd also be happy to give CVS access to 
anyone who wanted it. Unfortunately I don't use any other session 
software so I don't know how well web.session compares to others. If we 
base the new session module on something else I'd also be happy to 
update the web modules and bricks to use the new session module 
(possibly as a driver) instead if it provided the same features.
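
To illustrate the manager idea only (this is not the actual web.session API; the class and method names below are invented for the example), a sketch might look like:

```python
import time


class SessionManager:
    """Hypothetical manager: one session ID, one expiry, a separate store per app."""

    def __init__(self, session_id, lifetime, now=None):
        self.session_id = session_id
        self.expires_at = (time.time() if now is None else now) + lifetime
        self._stores = {}

    def store(self, app_name):
        # Each application gets its own dict, so keys cannot collide.
        return self._stores.setdefault(app_name, {})

    def expired(self, now=None):
        return (time.time() if now is None else now) >= self.expires_at


manager = SessionManager("abc123", lifetime=3600, now=0)
manager.store("blog")["user"] = "alice"
manager.store("shop")["user"] = "bob"
```

Both applications share the session ID and expiry, but each "user" key lives in its own store.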

Sharing code is definitely a good idea, but I'd also like to agree a new 
WSGI standard because apart from end user benefits I think that will 
massively speed up the rate at which different framework authors use 
each other's code in their own projects and the more that happens the 
more things will get naturally integrated anyway.

James

P.S. I'm currently updating all the components on pythonweb.org to use 
the new Eggs format at 
http://peak.telecommunity.com/DevCenter/PythonEggs . They are a very 
exciting technology and if you are keen on experimenting with them and 
want to have a go with web.session you can test the 0.6.0 alpha of the 
web module (which includes web.session) by installing the latest version 
of setuptools and running the following command:

python easy_install.py web

If that doesn't work you'll have to use the old 0.5.3 web modules (the 
session module is actually unchanged). The eggs themselves are at 
http://www.pythonweb.org/pythonweb/release/ for those who are interested.


From mso at oz.net  Fri Aug 12 23:08:31 2005
From: mso at oz.net (mso@oz.net)
Date: Fri, 12 Aug 2005 14:08:31 -0700 (PDT)
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <42FCD154.6010001@colorstudy.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>   
	<42FCD154.6010001@colorstudy.com>
Message-ID: <33071.161.55.66.150.1123880911.squirrel@www.oz.net>

Ian Bicking wrote:
> Paste has one too, but it's Not Very Good ;)  I started using the flup
> session, but I got lazy and never flipped the switch to make it the
> default.  There's been some discussion about sessions in the last few
> months on the Quixote list as well.

session2 is at http://quixote.idyll.org/ .  It was made due to the lack of
persistent session stores in Quixote.  There's a threefold structure:

Session: Copy of Quixote's session class.  You can set attributes but not
keys.  DictSession also allows keys.  There's a .user attribute (default
None) and a .set_user(user) method, but those can be ignored.

SessionManager: Interface between the framework and store.  The
implementation is Quixote-specific, but one could probably make an
abstract superclass or WSGI class.

SessionStore: Base class of storage backends: DirectorySessionStore,
DurusSessionStore, MySQLSessionStore, PostgresSessionStore,
ShelveSessionStore.

If a Quixote application were installed in Paste and used a third-party
session manager, the session object would have to:
  - allow arbitrary attributes.
  - default .user to None.
  - have a .set_user(user) method that merely sets .user.
Otherwise people would have to modify their applications.
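
A minimal sketch of a session object that would satisfy those three points; this is an illustrative class, not code from session2 or Quixote.

```python
class Session:
    """Hypothetical third-party session meeting the three requirements."""

    user = None  # class-level default, so .user is None until set

    def __init__(self, session_id):
        self.id = session_id

    def set_user(self, user):
        # Merely sets .user, as Quixote applications expect.
        self.user = user


session = Session("abc123")
session.cart = ["book"]  # plain instances accept arbitrary attributes
```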

-- 
-- Mike Orr <mso at oz.net>


From ksenia at ksenia.nl  Fri Aug 12 23:42:52 2005
From: ksenia at ksenia.nl (Ksenia Marasanova)
Date: Fri, 12 Aug 2005 23:42:52 +0200
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <33071.161.55.66.150.1123880911.squirrel@www.oz.net>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com>
	<33071.161.55.66.150.1123880911.squirrel@www.oz.net>
Message-ID: <C22823B3-25D2-4217-BAB4-35A7592DE5DA@ksenia.nl>


On 12 Aug 2005, at 23:08, mso at oz.net wrote:
> If a Quixote application were installed in Paste and used a third-party
> session manager, the session object would have to:
>   - allow arbitrary attributes.
>   - default .user to None.
>   - have a .set_user(user) method that merely sets .user.
> Otherwise people would have to modify their applications.


Actually, I recently migrated a few old applications from Quixote 1
"native" sessions to the Flup session middleware :)
Apart from arbitrary attributes, which I don't have, this is all it took:

from flup import session
from quixote import publish

def _get_user(self):
     # self._user holds the stored user ID (see _set_user), if any
     if getattr(self, '_user', None) is not None:
         # some app-specific code to get user from db
         return user
     return None

def _set_user(self, user):
     # user is SQLObject instance, we can only store ID
     if user is None:
         self._user = None
     else:
         self._user = user.id

def set_user(self, user):
     self.user = user

session.Session.user = property(_get_user, _set_user)
session.Session.set_user = set_user

class MyPublisher(publish.Publisher):
     def start_request(self, request):
         request.session = request.environ['com.saddi.service.session'].session
         publish.Publisher.start_request(self, request)


From renesd at gmail.com  Sat Aug 13 06:24:06 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Sat, 13 Aug 2005 14:24:06 +1000
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <c41f67b90508120311563b72de@mail.gmail.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>
Message-ID: <64ddb72c05081221246c60c756@mail.gmail.com>

Ok, here's my super list of wanted session features.


Multiple reader, single writer locking.  Or MVCC would be nice :) 
Otherwise, if you use it for multiple requests at once (as with Ajax
apps), everything slows way down.

Having in the api a way to say 'I am just opening this for reading'
would be really nice.  Then backends that can implement this
functionality can implement it.  Backends that can't can implement
locking however they want and ignore the read/write options passed.
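
As a rough sketch of that idea: the caller passes a mode, and a backend
that cannot tell readers from writers just ignores it.  The names
SimpleBackend, READ, and WRITE are made up for illustration and do not
come from any existing session library.

```python
import threading

READ = "r"
WRITE = "w"


class SimpleBackend(object):
    """Hypothetical backend that cannot tell readers from writers.

    It accepts the mode argument but takes one exclusive lock either way.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def open(self, session_id, mode=WRITE):
        # mode is ignored here; a smarter backend could share READ access.
        self._lock.acquire()
        return self._data.setdefault(session_id, {})

    def close(self, session_id):
        self._lock.release()


backend = SimpleBackend()
values = backend.open("abc123", mode=READ)   # "just opening this for reading"
visits = values.get("visits", 0)
backend.close("abc123")
```

A backend with real reader/writer support would keep the same open()
signature and only change what happens inside it.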

Performance and locking for session objects are quite hard to get
right if they are to be used by lots of different people, apps, and
frameworks.

Also having a specific close() method, rather than relying on garbage
collection is important.

Lazy opening of sessions is also good.  So if it isn't touched then
don't bother opening it.
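
Lazy opening and an explicit close() could be sketched roughly as
follows.  LazySession and the load callable are hypothetical names, and
the store is stood in for by a plain function.

```python
class LazySession(object):
    """Hypothetical wrapper: the store is only touched on first access."""

    def __init__(self, session_id, load):
        self._id = session_id
        self._load = load        # callable: session_id -> dict
        self._values = None      # None means "not opened yet"

    def _open(self):
        if self._values is None:
            self._values = self._load(self._id)
        return self._values

    def __getitem__(self, key):
        return self._open()[key]

    def __setitem__(self, key, value):
        self._open()[key] = value

    def close(self):
        # Explicit close instead of relying on garbage collection.
        self._values = None


loads = []

def load(session_id):
    loads.append(session_id)
    return {"visits": 1}

session = LazySession("abc123", load)
untouched = list(loads)       # nothing has been loaded so far
count = session["visits"]     # first access triggers the load
session.close()
```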

Support for cookie-based and URL-based sessions is also very
important.  It is also important to be able to choose which method you
want to use.

Security features like IP address and referer checking can probably
be implemented separately, as can only allowing a user to have a
session on one computer.  These are optional things, but should be
possible to do with whatever the session design is.

Allowing a single browser to have multiple sessions open at once would
also be good.  This way you can avoid name clashes when mixing
applications.  Or for having separate session configurations for
different parts of your application.  Eg.  database sessions for admin
section, and memory based ones for your front end.


Cheers.


On 8/12/05, Shannon -jj Behrens <jjinux at gmail.com> wrote:
> Hey guys,
> 
> Maybe I'm just ignorant (highly probable), but I'm really having a
> hard time keeping up with the "configuration" emails, especially when
> each of you is using slightly different definitions and trying to
> reach slightly different goals.  Please forgive me for coming out and
> stating this.
> 
> With the number of participants in the conversations, it doesn't seem
> like we're making a huge amount of progress, although perhaps I should
> shut up and be patient.
> 
> In the meantime, I'd like to propose that we framework authors try to
> start sharing our backend session code.  Let's just create a library
> like Apache::Session
> <http://directory.fsf.org/webauth/misc/apache-session.html>.  As much
> as possible, I think we can make it framework agnostic, relying on the
> framework itself to respond to callbacks for doing things like setting
> session cookies and creating a database cursor.  Just like with WSGI,
> the frameworks need not change their external APIs.  Let's keep it
> simple and just make it a library.
> 
> (I'm not sure the Twisted folks can participate because things on the
> Twisted side are always so different, but hopefully I'm wrong.)
> 
> In any case, it's just a proposal to try to share more code.  If I can
> get two other major frameworks to say they'll commit to working with
> me and using/contributing to the library, I'll start the endeavor and
> give them CVS commit rights.  We need not write much new code.  I'd
> like to reuse code that each of us already has.  This will have the
> benefit of a lot of peer review.
> 
> Perhaps this will make for a slightly better (Python Web) world :-D
> 
> Best Regards,
> -jj
> 
> --
> I have decided to switch to Gmail, but messages to my Yahoo account will
> still get through.
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/renesd%40gmail.com
>

From titus at caltech.edu  Sun Aug 14 19:54:25 2005
From: titus at caltech.edu (Titus Brown)
Date: Sun, 14 Aug 2005 10:54:25 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <42FCD154.6010001@colorstudy.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com>
Message-ID: <20050814175425.GF17009@caltech.edu>

-> I think that would be useful.  Flup has a fairly decoupled session store 
-> (http://www.saddi.com/software/flup/ in 
-> http://svn.saddi.com/flup/trunk/flup/middleware/session.py).  Is there 
-> other current work that should be considered?  PythonWeb has a session 
-> module, but I don't know what its insides look like: 
-> http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html
-> 
-> Paste has one too, but it's Not Very Good ;)  I started using the flup 
-> session, but I got lazy and never flipped the switch to make it the 
-> default.  There's been some discussion about sessions in the last few 
-> months on the Quixote list as well.

I've been decoupled from Web-SIG e-mails for the last two months, but
Mike Orr and I built a simple session store for Quixote that has a
fairly simple and generic storage API:

http://cafepy.com/quixote_extras/titus/session2/session2/store/SessionStore.py

With the comments deleted, here's the core API:

class SessionStore:
	def load_session(self, id, default=None):
		pass
	
	def save_session(self, session):
		pass

	def delete_session(self, session):
		pass

	def has_session(self, id):
		return self.load_session(id, None)

The only constraint is that 'id' must be a string in order for it to
work with all of the session stores.

We have implemented stores for postgres, durus, mysql, directory/file,
and shelve persistence mechanisms.
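
To illustrate how small a backend for this API can be, here is a
dict-backed sketch.  MemorySessionStore is hypothetical and is not one of
the session2 stores; it assumes the session object carries its string id
as an .id attribute, and its has_session returns a plain boolean rather
than the loaded session.

```python
class MemorySessionStore(object):
    """Hypothetical dict-backed store following the four-method API."""

    def __init__(self):
        self._sessions = {}

    def load_session(self, id, default=None):
        return self._sessions.get(id, default)

    def save_session(self, session):
        # Assumes the session object exposes its string id as .id
        self._sessions[session.id] = session

    def delete_session(self, session):
        self._sessions.pop(session.id, None)

    def has_session(self, id):
        return self.load_session(id, None) is not None


class Session(object):
    def __init__(self, id):
        self.id = id


store = MemorySessionStore()
store.save_session(Session("abc123"))
found = store.has_session("abc123")
store.delete_session(store.load_session("abc123"))
gone = not store.has_session("abc123")
```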

cheers,
--titus

From speno at isc.upenn.edu  Sun Aug 14 22:05:26 2005
From: speno at isc.upenn.edu (John Speno)
Date: Sun, 14 Aug 2005 16:05:26 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <20050814175425.GF17009@caltech.edu>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com>
	<20050814175425.GF17009@caltech.edu>
Message-ID: <EB4DBA36-4D95-418D-A82F-2A449A49DD45@isc.upenn.edu>

Another session related wish:

A few CherryPy users have requested[1] that there be an API for
registering callbacks on sessions, with the intent that those callbacks
are invoked when a session is destroyed.  Apparently this is something
they are familiar with from the Java servlet world.


[1]http://www.cherrypy.org/ticket/250
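
One way such a callback API might look, purely as a sketch: the
CallbackSessions class and its on_destroy method are invented here and
are not CherryPy's API.

```python
class CallbackSessions(object):
    """Hypothetical registry that runs callbacks when a session is destroyed."""

    def __init__(self):
        self._sessions = {}
        self._callbacks = []

    def on_destroy(self, callback):
        self._callbacks.append(callback)

    def create(self, session_id):
        self._sessions[session_id] = {}
        return self._sessions[session_id]

    def destroy(self, session_id):
        data = self._sessions.pop(session_id)
        # Hand each callback the id and the data the session held.
        for callback in self._callbacks:
            callback(session_id, data)


destroyed = []
sessions = CallbackSessions()
sessions.on_destroy(lambda sid, data: destroyed.append((sid, data)))
sessions.create("abc123")["user"] = "john"
sessions.destroy("abc123")
```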

From jjinux at gmail.com  Mon Aug 15 19:17:41 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Mon, 15 Aug 2005 10:17:41 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <20050814175425.GF17009@caltech.edu>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu>
Message-ID: <c41f67b905081510176cfd455f@mail.gmail.com>

Heh, I'm overwhelmed by too much code and not enough direction. 
Naturally, I've got nice session code in Aquarium as well.  *Sigh*
this Python Web thing is going to be the death of me!

-jj

On 8/14/05, Titus Brown <titus at caltech.edu> wrote:
> -> I think that would be useful.  Flup has a fairly decoupled session store
> -> (http://www.saddi.com/software/flup/ in
> -> http://svn.saddi.com/flup/trunk/flup/middleware/session.py).  Is there
> -> other current work that should be considered?  PythonWeb has a session
> -> module, but I don't know what its insides look like:
> -> http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html
> ->
> -> Paste has one too, but it's Not Very Good ;)  I started using the flup
> -> session, but I got lazy and never flipped the switch to make it the
> -> default.  There's been some discussion about sessions in the last few
> -> months on the Quixote list as well.
> 
> I've been decoupled from Web-SIG e-mails for the last two months, but
> Mike Orr and I built a simple session store for Quixote that has a
> fairly simple and generic storage API:
> 
> http://cafepy.com/quixote_extras/titus/session2/session2/store/SessionStore.py
> 
> With the comments deleted, here's the core API:
> 
> class SessionStore:
>         def load_session(self, id, default=None):
>                 pass
> 
>         def save_session(self, session):
>                 pass
> 
>         def delete_session(self, session):
>                 pass
> 
>         def has_session(self, id):
>                 return self.load_session(id, None)
> 
> The only constraint is that 'id' must be a string in order for it to
> work with all of the session stores.
> 
> We have implemented stores for postgres, durus, mysql, directory/file,
> and shelve persistence mechanisms.
> 
> cheers,
> --titus


-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From chrism at plope.com  Mon Aug 15 19:25:33 2005
From: chrism at plope.com (Chris McDonough)
Date: Mon, 15 Aug 2005 13:25:33 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <c41f67b905081510176cfd455f@mail.gmail.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu>
	<c41f67b905081510176cfd455f@mail.gmail.com>
Message-ID: <1124126733.30493.17.camel@localhost.localdomain>

I've also got reams of code in Zope for sessions.

Maybe we should just wait til the next PyCon and have a consolidation
sprint.

- C


On Mon, 2005-08-15 at 10:17 -0700, Shannon -jj Behrens wrote:
> Heh, I'm overwhelmed by too much code and not enough direction. 
> Naturally, I've got nice session code in Aquarium as well.  *Sigh*
> this Python Web thing is going to be the death of me!
> 
> -jj
> 
> On 8/14/05, Titus Brown <titus at caltech.edu> wrote:
> > -> I think that would be useful.  Flup has a fairly decoupled session store
> > -> (http://www.saddi.com/software/flup/ in
> > -> http://svn.saddi.com/flup/trunk/flup/middleware/session.py).  Is there
> > -> other current work that should be considered?  PythonWeb has a session
> > -> module, but I don't know what its insides look like:
> > -> http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html
> > ->
> > -> Paste has one too, but it's Not Very Good ;)  I started using the flup
> > -> session, but I got lazy and never flipped the switch to make it the
> > -> default.  There's been some discussion about sessions in the last few
> > -> months on the Quixote list as well.
> > 
> > I've been decoupled from Web-SIG e-mails for the last two months, but
> > Mike Orr and I built a simple session store for Quixote that has a
> > fairly simple and generic storage API:
> > 
> > http://cafepy.com/quixote_extras/titus/session2/session2/store/SessionStore.py
> > 
> > With the comments deleted, here's the core API:
> > 
> > class SessionStore:
> >         def load_session(self, id, default=None):
> >                 pass
> > 
> >         def save_session(self, session):
> >                 pass
> > 
> >         def delete_session(self, session):
> >                 pass
> > 
> >         def has_session(self, id):
> >                 return self.load_session(id, None)
> > 
> > The only constraint is that 'id' must be a string in order for it to
> > work with all of the session stores.
> > 
> > We have implemented stores for postgres, durus, mysql, directory/file,
> > and shelve persistence mechanisms.
> > 
> > cheers,
> > --titus
> 
> 


From titus at caltech.edu  Mon Aug 15 19:32:45 2005
From: titus at caltech.edu (Titus Brown)
Date: Mon, 15 Aug 2005 10:32:45 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <1124126733.30493.17.camel@localhost.localdomain>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com>
	<20050814175425.GF17009@caltech.edu>
	<c41f67b905081510176cfd455f@mail.gmail.com>
	<1124126733.30493.17.camel@localhost.localdomain>
Message-ID: <20050815173245.GA19517@caltech.edu>

-> I've also got reams of code in Zope for sessions.
-> 
-> Maybe we should just wait til the next PyCon and have a consolidation
-> sprint.
-> 
-> On Mon, 2005-08-15 at 10:17 -0700, Shannon -jj Behrens wrote:
-> > Heh, I'm overwhelmed by too much code and not enough direction. 
-> > Naturally, I've got nice session code in Aquarium as well.  *Sigh*
-> > this Python Web thing is going to be the death of me!

I'd be surprised if the session *storage* code turned out to be all that
different between these frameworks.  I'm willing to change function &
class names if it means I'd be using/testing/building on other people's
work.

Session code itself is a much stickier wicket, as far as I can tell.

--titus

From ianb at colorstudy.com  Mon Aug 15 19:33:44 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 15 Aug 2005 12:33:44 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <c41f67b905081510176cfd455f@mail.gmail.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>	
	<42FCD154.6010001@colorstudy.com>
	<20050814175425.GF17009@caltech.edu>
	<c41f67b905081510176cfd455f@mail.gmail.com>
Message-ID: <4300D1F8.8090007@colorstudy.com>

Shannon -jj Behrens wrote:
> Heh, I'm overwhelmed by too much code and not enough direction. 
> Naturally, I've got nice session code in Aquarium as well.  *Sigh*
> this Python Web thing is going to be the death of me!

If everyone is reasonably comfortable with what sessions should do, can 
we just design an API and figure out the implementation later?


-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From jjinux at gmail.com  Mon Aug 15 20:22:33 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Mon, 15 Aug 2005 11:22:33 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <4300D1F8.8090007@colorstudy.com>
References: <c41f67b90508120311563b72de@mail.gmail.com>
	<42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu>
	<c41f67b905081510176cfd455f@mail.gmail.com>
	<4300D1F8.8090007@colorstudy.com>
Message-ID: <c41f67b90508151122cf0af33@mail.gmail.com>

The only thing I'm still concerned about is the locking.  I lock
access to the set of sessions when creating or deleting one, but I
don't bother locking access to a single session.  I think other people
may have more strict requirements.  I agree with Titus that we should
stick to worrying about the backend storage at this point since it's
less of a monster.
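
For reference, the locking I described looks roughly like this.  It is a
simplified sketch with a made-up SessionSet class, not the actual
Aquarium code.

```python
import threading


class SessionSet(object):
    """Hypothetical container: the lock guards membership, not contents."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = {}

    def create(self, session_id):
        with self._lock:
            return self._sessions.setdefault(session_id, {})

    def delete(self, session_id):
        with self._lock:
            self._sessions.pop(session_id, None)

    def get(self, session_id):
        # No per-session lock: concurrent writers to one session race.
        return self._sessions.get(session_id)


sessions = SessionSet()
sessions.create("abc123")["visits"] = 1
before = sessions.get("abc123")
sessions.delete("abc123")
```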

-jj

On 8/15/05, Ian Bicking <ianb at colorstudy.com> wrote:
> Shannon -jj Behrens wrote:
> > Heh, I'm overwhelmed by too much code and not enough direction.
> > Naturally, I've got nice session code in Aquarium as well.  *Sigh*
> > this Python Web thing is going to be the death of me!
> 
> If everyone is reasonably comfortable with what sessions should do, can
> we just design an API and figure out the implementation later?

-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From fumanchu at amor.org  Mon Aug 15 20:25:54 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Mon, 15 Aug 2005 11:25:54 -0700
Subject: [Web-SIG] and now for something completely different!
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>

Ian Bicking wrote:
> Shannon -jj Behrens wrote:
> > Heh, I'm overwhelmed by too much code and not enough direction. 
> > Naturally, I've got nice session code in Aquarium as well.  *Sigh*
> > this Python Web thing is going to be the death of me!
> 
> If everyone is reasonably comfortable with what sessions 
> should do, can we just design an API and figure out the
> implementation later?

That depends on where you draw the line between the two. ;) It's pretty
easy to define an "implementation-less" API that consists of: create,
read, update, delete.

The first critical implementation discussion (which affects the API)
should be around concurrency, and if multiple locking strategies need to
be supported. In flup, for example, the entire session store is locked
if the same session is requested more than once simultaneously.
Pythonweb doesn't seem to mention concurrency at all. Paste mentions
it's not supported. ;) Quixote's session2 stores have flags for
multithreading/multiprocess but seem to not actually do anything with
those flags.

The concern is not only response time, but atomicity. In the comments
for Aquarium's SessionContainer:

    "Concerning locking:  in general, a global lock (of some sort)
    should be used so that creating, deleting, reading, and writing
    sessions is serialized.  However, it is not necessary to have
    a lock for each session. If a user wishes to use two browser
    windows at the same time, the last writer wins."

That is a design decision which not all frameworks (or other consumers
of our session lib) might share. Apparently, given the current Python
session modules out there, it's common to survive without caring? I know
Mike Robinson has worked many long nights trying to make a session
module for CherryPy which can consistently pass simple hit-counter
tests. ;) Personally, I'd like to pursue an MROW solution.
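
Reading MROW as "multiple readers, one writer", the core of such a lock
can be sketched with a condition variable.  ReadWriteLock is a
hypothetical name, and this version has no fairness policy, so a steady
stream of readers could starve a writer.

```python
import threading


class ReadWriteLock(object):
    """Hypothetical multiple-reader, one-writer lock (no fairness policy)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()


lock = ReadWriteLock()
lock.acquire_read()
lock.acquire_read()          # a second reader is not blocked
readers = lock._readers
lock.release_read()
lock.release_read()
lock.acquire_write()         # the writer gets in once readers are gone
lock.release_write()
```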

It would be nice if our final product supported multiple concurrency
strategies. The decision about which strategy to use could be left to
framework authors (who would wish to begin migration by maintaining
maximum backward-compatibility), or to their users, if those options can
be described simply enough.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From mso at oz.net  Mon Aug 15 22:40:11 2005
From: mso at oz.net (mso@oz.net)
Date: Mon, 15 Aug 2005 13:40:11 -0700 (PDT)
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
Message-ID: <33026.161.55.66.150.1124138411.squirrel@www.oz.net>

Robert Brewer wrote:
> Quixote's session2 stores have flags for
> multithreading/multiprocess but seem to not actually do anything with
> those flags.

Correct, the flags are just indications to the caller.  The caller might
raise an exception if a thread-unsafe store is paired with a multithreaded
server.  There's no database locking code, although Postgres uses a
transaction for the immediate operation.

> Apparently, given the current Python
> session modules out there, it's common to survive without caring?

I haven't seen locking in any of the modules I've used, nor any particular
errors caused by this.  Is it defined what behavior the server should have
if the user has the same site opened in two tabs and clicks back and
forth?


-- 
-- Mike Orr <mso at oz.net>


From ianb at colorstudy.com  Mon Aug 15 22:46:19 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 15 Aug 2005 15:46:19 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
Message-ID: <4300FF1B.7080308@colorstudy.com>

Robert Brewer wrote:
>>If everyone is reasonably comfortable with what sessions 
>>should do, can we just design an API and figure out the
>>implementation later?
> 
> 
> That depends on where you draw the line between the two. ;) It's pretty
> easy to define an "implementation-less" API that consists of: create,
> read, update, delete.

Yes, but we're all clever enough to know that's incomplete ;)

> The first critical implementation discussion (which affects the API)
> should be around concurrency, and if multiple locking strategies need to
> be supported. In flup, for example, the entire session store is locked
> if the same session is requested more than once simultaneously.
> Pythonweb doesn't seem to mention concurrency at all. Paste mentions
> it's not supported. ;) Quixote's session2 stores have flags for
> multithreading/multiprocess but seem to not actually do anything with
> those flags.

I think it definitely is wrong to lock the session for concurrent reads 
-- that's a likely case, and can unnecessarily serialize access to 
things like images, or block a website during a long download (if that 
download uses the session, which is quite possible if the download 
requires authentication information).

> The concern is not only response time, but atomicity. In the comments
> for Aquarium's SessionContainer:
> 
>     "Concerning locking:  in general, a global lock (of some sort)
>     should be used so that creating, deleting, reading, and writing
>     sessions is serialized.  However, it is not necessary to have
>     a lock for each session. If a user wishes to use two browser
>     windows at the same time, the last writer wins."
> 
> That is a design decision which not all frameworks (or other consumers
> of our session lib) might share. Apparently, given the current Python
> session modules out there, it's common to survive without caring? I know
> Mike Robinson has worked many long nights trying to make a session
> module for CherryPy which can consistently pass simple hit-counter
> tests. ;) Personally, I'd like to pursue an MROW solution.

In practice race conditions are very uncommon.  Simultaneous requests 
from the same session are uncommon, since what few simultaneous requests 
that occur are likely to be for boring resources like images.  If you 
have an image bug on a page that also writes the session, maybe you'd 
have a problem.  I'd be okay saying "don't do that" because usually 
people don't do that, so it's not very compelling.

It's possible that Ajax techniques would make concurrency more likely, 
but I'm not sure.  One realistic case might be an upload-notification 
system, where the file is uploaded into a hidden iframe and the 
resources being submitted to could write to the session to signal when 
the upload was finished; but the user might be doing something in 
another frame at the same time.  For that case I think you could just 
not use the session (I don't think it's a good communication medium for 
stuff like that).  But with frames and multiple windows at least it's 
vaguely possible concurrent writes could happen.  OTOH conflict errors 
are the wrong answer to concurrent writes in a significant number of 
cases, where a little lossiness is preferable.
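
To make the "lossiness" concrete, here is a small single-threaded sketch
of two requests that each load, modify, and save the same session with
no locking.  The store is just a dict and the load/save helpers are
made up for the example.

```python
import copy

store = {"abc123": {"visits": 0}}

def load(session_id):
    # Each request works on its own copy of the stored session.
    return copy.deepcopy(store[session_id])

def save(session_id, values):
    store[session_id] = values

# Two requests load the same session before either one saves.
first = load("abc123")
second = load("abc123")
first["visits"] += 1
second["visits"] += 1
save("abc123", first)
save("abc123", second)       # clobbers the first write

visits = store["abc123"]["visits"]
```

Both requests counted a visit, but the stored counter only advances by
one: the last writer wins.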

Generally it becomes more complex/interesting if you have transactional 
sessions.

> It would be nice if our final product supported multiple concurrency
> strategies. The decision about which strategy to use could be left to
> framework authors (who would wish to begin migration by maintaining
> maximum backward-compatibility), or to their users, if those options can
> be described simply enough.

I'm -1 on multiple strategies, unless there's a really good reason for 
it.  I'd like to see if we can do the Best Most Complete strategy 
without making compromises or creating a too-difficult API; if so, then 
why not use that?

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From jonathan at carnageblender.com  Mon Aug 15 22:51:42 2005
From: jonathan at carnageblender.com (Jonathan Ellis)
Date: Mon, 15 Aug 2005 13:51:42 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <4300FF1B.7080308@colorstudy.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<4300FF1B.7080308@colorstudy.com>
Message-ID: <1124139102.28016.240738111@webmail.messagingengine.com>

On Mon, 15 Aug 2005 15:46:19 -0500, "Ian Bicking" <ianb at colorstudy.com>
said:
> > That is a design decision which not all frameworks (or other consumers
> > of our session lib) might share. Apparently, given the current Python
> > session modules out there, it's common to survive without caring? I know
> > Mike Robinson has worked many long nights trying to make a session
> > module for CherryPy which can consistently pass simple hit-counter
> > tests. ;) Personally, I'd like to pursue an MROW solution.
> 
> In practice race conditions are very uncommon.  Simultaneous requests 
> from the same session are uncommon, since what few simultaneous requests 
> that occur are likely to be for boring resources like images.  If you 
> have an image bug on a page that also writes the session, maybe you'd 
> have a problem.  I'd be okay saying "don't do that" because usually 
> people don't do that, so it's not very compelling.

I wouldn't be okay with non-threadsafe sessions.

-Jonathan

From ianb at colorstudy.com  Mon Aug 15 22:57:55 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 15 Aug 2005 15:57:55 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <1124139102.28016.240738111@webmail.messagingengine.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<4300FF1B.7080308@colorstudy.com>
	<1124139102.28016.240738111@webmail.messagingengine.com>
Message-ID: <430101D3.4010001@colorstudy.com>

Jonathan Ellis wrote:
> On Mon, 15 Aug 2005 15:46:19 -0500, "Ian Bicking" <ianb at colorstudy.com>
> said:
> 
>>>That is a design decision which not all frameworks (or other consumers
>>>of our session lib) might share. Apparently, given the current Python
>>>session modules out there, it's common to survive without caring? I know
>>>Mike Robinson has worked many long nights trying to make a session
>>>module for CherryPy which can consistently pass simple hit-counter
>>>tests. ;) Personally, I'd like to pursue an MROW solution.
>>
>>In practice race conditions are very uncommon.  Simultaneous requests 
>>from the same session are uncommon, since what few simultaneous requests 
>>that occur are likely to be for boring resources like images.  If you 
>>have an image bug on a page that also writes the session, maybe you'd 
>>have a problem.  I'd be okay saying "don't do that" because usually 
>>people don't do that, so it's not very compelling.
> 
> 
> I wouldn't be okay with non-threadsafe sessions.

Non-threadsafe in what manner?  Certainly they should be usable in 
threaded environments, and should never blow up or anything.  I just 
assume that.

The question is whether, if there are two concurrent writers (threaded or 
multiprocess), they should be serialized (and how), or if one of them 
simply clobbers the other.  Threads or multiprocess, it's really the 
same issue.  Except perhaps for isolation -- threads could *potentially* 
see changes in other threads, but that's not possible for multiple 
processes.  So probably they should always be isolated; not a big deal, 
but something to consider.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Mon Aug 15 23:11:23 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 15 Aug 2005 17:11:23 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <4300FF1B.7080308@colorstudy.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
Message-ID: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>

At 03:46 PM 8/15/2005 -0500, Ian Bicking wrote:
>Robert Brewer wrote:
> >>If everyone is reasonably comfortable with what sessions
> >>should do, can we just design an API and figure out the
> >>implementation later?
> >
> >
> > That depends on where you draw the line between the two. ;) It's pretty
> > easy to define an "implementation-less" API that consists of: create,
> > read, update, delete.
>
>Yes, but we're all clever enough to know that's incomplete ;)

Personally, I think the most important part of session services is just 
managing the session itself; start, begin, timeout, and getting an 
identifier in and out of the request/response.  For me, 
create/read/update/delete/persist/GC responsibility belongs entirely to the 
application.  To put it another way: I don't believe in session variables, 
only session-specific application objects.  An ecommerce application should 
have persistent carts and items and the like; the only purpose of a session 
is to find out which cart to look at.

In this way, concurrency and all the other questions being raised here are 
irrelevant.  Or at least they're irrelevant to the session management part, 
anyway.  :)

So I'd personally prefer that any session service standards distinguish 
between management of the session itself, from storage of data associated 
with the session.  The latter is just a standard object-persistence or 
object-relational problem and can easily be dealt with as such, distinct 
from session management issues like cookies vs. URLs, timeouts, ID 
generation, and so forth.  (Note that even GC of abandoned sessions is 
highly subject to business rules, and it would be crazy for us to try and 
encompass the possible rule variations within a relatively simple component 
specification.)

While it may be nice to have persistence services that are optimized for 
session-like use cases, it doesn't make a lot of sense to tightly couple 
them to session management.  Just like WSGI splits things into application 
and server, I think a session spec should split them into 
client-state-management and server-state-storage, so that we can mix and 
match from the best of both worlds.
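
As a sketch of that split: the session side only gets an identifier out
of the request (or mints a new one), and the application looks up its
own objects with it.  The get_session_id function, the cookie name, and
the carts dict are all hypothetical.

```python
from http.cookies import SimpleCookie
import secrets


def get_session_id(environ, cookie_name="session_id"):
    """Hypothetical client-state side: only finds or mints an identifier."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    if cookie_name in cookie:
        return cookie[cookie_name].value, False
    # New id; the caller is responsible for setting the cookie.
    return secrets.token_hex(16), True


# Server-state side: the application owns its data, keyed by the id.
carts = {"abc123": ["book", "pen"]}

session_id, is_new = get_session_id({"HTTP_COOKIE": "session_id=abc123"})
cart = carts.setdefault(session_id, [])
```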

Of course, I personally prefer to use whatever the application's storage is 
for my session management, so I'll probably have little reason to get 
involved in the "storage" side of the session equation.  Indeed, I'd argue 
that applications that *don't* put their session data in the application's 
main DB should have very very good reasons for doing so, and I've never 
heard a good enough reason yet.  :)  Well, there's, "my application's DB 
suxors", but that means you ought to upgrade the application DB instead if 
you can.  :)


From jonathan at carnageblender.com  Mon Aug 15 23:41:07 2005
From: jonathan at carnageblender.com (Jonathan Ellis)
Date: Mon, 15 Aug 2005 14:41:07 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <430101D3.4010001@colorstudy.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<4300FF1B.7080308@colorstudy.com>
	<1124139102.28016.240738111@webmail.messagingengine.com>
	<430101D3.4010001@colorstudy.com>
Message-ID: <1124142067.1070.240740982@webmail.messagingengine.com>

On Mon, 15 Aug 2005 15:57:55 -0500, "Ian Bicking" <ianb at colorstudy.com>
said:
> Jonathan Ellis wrote:
> > On Mon, 15 Aug 2005 15:46:19 -0500, "Ian Bicking" <ianb at colorstudy.com>
> >>In practice race conditions are very uncommon.  Simultaneous requests 
> >>from the same session are uncommon, since what few simultaneous requests 
> >>that occur are likely to be for boring resources like images.  If you 
> >>have an image bug on a page that also writes the session, maybe you'd 
> >>have a problem.  I'd be okay saying "don't do that" because usually 
> >>people don't do that, so it's not very compelling.
> > 
> > 
> > I wouldn't be okay with non-threadsafe sessions.
> 
> Non-threadsafe in what manner?  Certainly they should be usable in 
> threaded environments, and should never blow up or anything.  I just 
> assume that.
> 
> The question is whether, if there's two concurrent writers (threaded or 
> multiprocess), they should be serialized (and how), or if one of them 
> simply clobbers the other.

Well, if your goal is "usable in [concurrent] environments," you're
really talking about serializing anyway.

Consider some hypothetical API:

def session_for_user(uname):
  if not session_exists(uname):
    create_session(uname)
  return session_retrieve(uname)

Depending on how soon session_exists can tell that a session is being
created, if two requests for the same session come in close enough
together (and it's worth remembering that this could easily be the
result of a single browser hitting refresh on a very heavily loaded
machine), the second request could get either an incompletely
initialized session object, or a different session object entirely.

-Jonathan

From ianb at colorstudy.com  Tue Aug 16 00:08:12 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 15 Aug 2005 17:08:12 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
Message-ID: <4301124C.7040708@colorstudy.com>

Phillip J. Eby wrote:
> Of course, I personally prefer to use whatever the application's storage 
> is for my session management, so I'll probably have little reason to get 
> involved in the "storage" side of the session equation.  Indeed, I'd 
> argue that applications that *don't* put their session data in the 
> application's main DB should have very very good reasons for doing so, 
> and I've never heard a good enough reason yet.  :)  Well, there's, "my 
> application's DB suxors", but that means you ought to upgrade the 
> application DB instead if you can.  :)

There's useful reasons for non-application code to store things in the 
session, and the particulars of the application storage aren't really 
applicable.  For instance, with this pattern: 
http://blog.ianbicking.org/web-application-patterns-status-notification.html 
-- you put transient messages in the session.  But there's no point to 
using a fancy application session storage which means documentation and 
configuration and whatnot.  Maybe you have no impediments to throwing 
random data into your application data stores, but I do.

I think there are quite a few other use cases of this same kind, which 
implies that there should be a standard generic location to store 
session information.  Or you can ignore that and use the session ID 
only.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 16 00:52:35 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 15 Aug 2005 18:52:35 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <4301124C.7040708@colorstudy.com>
References: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>

At 05:08 PM 8/15/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Of course, I personally prefer to use whatever the application's storage 
>>is for my session management, so I'll probably have little reason to get 
>>involved in the "storage" side of the session equation.  Indeed, I'd 
>>argue that applications that *don't* put their session data in the 
>>application's main DB should have very very good reasons for doing so, 
>>and I've never heard a good enough reason yet.  :)  Well, there's, "my 
>>application's DB suxors", but that means you ought to upgrade the 
>>application DB instead if you can.  :)
>
>There's useful reasons for non-application code to store things in the 
>session, and the particulars of the application storage aren't really 
>applicable.  For instance, with this pattern: 
>http://blog.ianbicking.org/web-application-patterns-status-notification.html 
>-- you put transient messages in the session.

If I needed to do what you're doing on that page, I'd probably just put the 
message in a cookie, and reset it once it was used.  In other words, a 
session isn't necessary just to have client-specific state, especially for 
something so short-lived as that example.
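
A minimal sketch of that cookie approach, assuming WSGI-style environ and header lists; the "flash" cookie name and both function names are made up for illustration.

```python
from http.cookies import SimpleCookie
from urllib.parse import quote, unquote

FLASH_COOKIE = "flash"  # hypothetical cookie name


def set_flash(response_headers, message):
    # URL-quote the text so spaces and punctuation survive the cookie syntax.
    cookie = SimpleCookie()
    cookie[FLASH_COOKIE] = quote(message)
    cookie[FLASH_COOKIE]["path"] = "/"
    response_headers.append(
        ("Set-Cookie", cookie[FLASH_COOKIE].OutputString()))


def pop_flash(environ, response_headers):
    # Read the message once, then tell the browser to drop the cookie.
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(FLASH_COOKIE)
    if morsel is None or not morsel.value:
        return None
    expired = SimpleCookie()
    expired[FLASH_COOKIE] = ""
    expired[FLASH_COOKIE]["path"] = "/"
    expired[FLASH_COOKIE]["max-age"] = 0
    response_headers.append(
        ("Set-Cookie", expired[FLASH_COOKIE].OutputString()))
    return unquote(morsel.value)
```

Since the message travels through the browser, anything that must not be forged would also need signing.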


>   But there's no point to using a fancy application session storage which 
> means documentation and configuration and whatnot.  Maybe you have no 
> impediments to throwing random data into your application data stores, 
> but I do.

The reason I enforce this particular discipline is specifically to 
*prevent* "random data" from being added *anywhere*.  A session object that 
you can just throw any old data into is sloppy from my POV, because scaling 
most session backend systems well is a hard problem.  If you are making a 
small-scale quick-and-dirty system, okay, whatever, but in the 
megahits/month range and up, I think session variable design needs to be 
much more systematic to ensure it can be scaled.

Therefore, my philosophy is that every bit of client-specific state goes 
either in the application DB, or it goes in the browser.  Anywhere 
in-between the two is a liability from my perspective, because it 
introduces a new tier that needs to be factored into design of the app's 
transaction model, scaling and reliability plans, etc.  Ergo, there darn 
well better be a really good reason for introducing that extra tier.  (And 
you'll notice the existence of this tier produces exactly the problems

I know that it's "common wisdom" that sessions are supposed to be an 
important thing to have, especially since ASP and PHP provide them out of 
the box.  (And at least PHP lets you implement the storage however you 
like!)  But I view sessions of that kind with roughly the same disdain as I 
view Perl or Tcl's weak typing; they mask problems that I want to know 
about.  I'm well aware that I'm in the minority on this point, but that 
doesn't mean I'm not still right.  :)

(And I'm also aware that "scaling down" is important, but the rule that all 
state goes either in the browser or the application DB scales down just as 
well as it scales up.)


From mso at oz.net  Tue Aug 16 01:05:43 2005
From: mso at oz.net (mso@oz.net)
Date: Mon, 15 Aug 2005 16:05:43 -0700 (PDT)
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
Message-ID: <33293.161.55.66.150.1124147143.squirrel@www.oz.net>

Phillip J. Eby wrote:
> So I'd personally prefer that any session service standards distinguish
> between management of the session itself, from storage of data associated
> with the session.

Yes, but the web-sig needs to define both APIs, and encourage generic
implementations of both.  Otherwise every framework or every user has to
write their own storage backends.  Speaking from experience with Quixote,
which has no persistent sessions out of the box.

Also, the serialization method(s) need to be documented.  That's a
property of the storage object.  All existing ones I know of use pickle
(sometimes encapsulated by Durus or shelve), but that may not be the case
forever.  Plus there's pickle vs cPickle; I've heard the latter has
Unicode problems.

> Of course, I personally prefer to use whatever the application's storage
> is for my session management

That's what I've been doing too.  session2 is made to play nicely with
your application's database, sticking to whatever table you designate for
it.

> To put it another way: I don't believe in session variables,
> only session-specific application objects.  An ecommerce application
> should
> have persistent carts and items and the like; the only purpose of a
> session
> is to find out which cart to look at.

We already have some frameworks with dict-like sessions and others with a
standard session object.  Assuming we had a hybrid object that accepts
both, I don't know why any application *has* to have a custom session
object.  But there's no reason to arbitrarily preclude it either.

> Indeed, I'd argue
> that applications that *don't* put their session data in the application's
> main DB should have very very good reasons for doing so, and I've never
> heard a good enough reason yet.

Ian Bicking wrote:
> There's useful reasons for non-application code to store things in the
> session, and the particulars of the application storage aren't really
> applicable.  For instance, with this pattern:
> http://blog.ianbicking.org/web-application-patterns-status-notification.html
> -- you put transient messages in the session.  But there's no point to
> using a fancy application session storage which means documentation and
> configuration and whatnot.  Maybe you have no impediments to throwing
> random data into your application data stores, but I do.

I wouldn't call that example "non-application" code.  Setting a message in
the session for the subsequent request to display is very useful.  "Record
added", "Add cancelled", "logged out", etc.  I'm not sure third-party code
(middleware) should be able to add a message directly, but that may turn
out to be a significant feature of certain middleware.

-- 
-- Mike Orr <mso at oz.net>


From fumanchu at amor.org  Tue Aug 16 01:47:57 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Mon, 15 Aug 2005 16:47:57 -0700
Subject: [Web-SIG] and now for something completely different!
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727AC@exchange.hqamor.amorhq.net>

Me:
> It would be nice if our final product supported multiple
> concurrency strategies. The decision about which strategy
> to use could be left to framework authors (who would wish
> to begin migration by maintaining maximum
> backward-compatibility), or to their users, if those
> options can be described simply enough.
> 
> ...
> 
> The concern is not only response time, but atomicity. In 
> the comments for Aquarium's SessionContainer:
> 
>  "Concerning locking:  in general, a global lock (of some sort)
>  should be used so that creating, deleting, reading, and writing
>  sessions is serialized.  However, it is not necessary to have
>  a lock for each session. If a user wishes to use two browser
>  windows at the same time, the last writer wins."
> 
> That is a design decision which not all frameworks (or 
> other consumers of our session lib) might share.
> Apparently, given the current Python session modules
> out there, it's common to survive without caring?
> I know Mike Robinson has worked many long nights
> trying to make a session module for CherryPy which
> can consistently pass simple hit-counter tests. ;)
> Personally, I'd like to pursue an MROW solution.

Ian:
> In practice race conditions are very uncommon.
> Simultaneous requests from the same session are
> uncommon, since what few simultaneous requests
> that occur are likely to be for boring resources
> like images.  If you have an image bug on a page
> that also writes the session, maybe you'd have a
> problem.  I'd be okay saying "don't do that"
> because usually people don't do that, so it's
> not very compelling.

Images are only boring until they're not--a Google-style map server for
example.

> It's possible that Ajax techniques would make
> concurrency more likely, but I'm not sure.

Most definitely. As page composition swings back to a client-side-pull
model, I expect more pages to be written in a lot of javascript querying
RESTful HTTP wrappers around data stores, and much of that "pull" will
be concurrent. One GET pulls a minimal HTML page; that page includes
Javascript that then populates the page with data via multiple AJAX
requests.

> ...with frames and multiple windows
> at least it's vaguely possible concurrent writes
> could happen.  OTOH conflict errors are the wrong
> answer to concurrent writes in a significant number
> of cases, where a little lossiness is preferable.
> Generally it becomes more complex/interesting if you have 
> transactional sessions.
> 
> ...
> 
> I'm -1 on multiple strategies, unless there's a really good 
> reason for it.  I'd like to see if we can do the Best Most
> Complete strategy without making compromises or creating
> a too-difficult API; if so, then why not use that?

I'd be -1 on them too, except that I see a "really good reason":
expectations differ wildly because application needs differ wildly.
Conflict errors are the right answer in a significant number of cases.
Lossiness is unacceptable in many. If we can do "the Best Most Complete
strategy", great! But I won't hold my breath. If our common session
module meets 75% of the needs of existing frameworks, we've made no
progress whatsoever, in my mind. Let's shoot for 90%+.
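
To show that two strategies need not mean two APIs, here is a small sketch of one store with a switch between "last writer wins" and "raise a conflict", using a version counter.  Everything here (VersionedStore, ConflictError, the strict flag) is hypothetical, and a real implementation would also need a lock around save().

```python
class ConflictError(Exception):
    """Raised in strict mode when someone else saved since we loaded."""


class VersionedStore:
    def __init__(self, strict):
        self.strict = strict  # True: conflict errors; False: last writer wins
        self._records = {}    # session ID -> (version, data)

    def load(self, sid):
        version, data = self._records.get(sid, (0, {}))
        return version, dict(data)

    def save(self, sid, loaded_version, data):
        current_version, _ = self._records.get(sid, (0, {}))
        if self.strict and current_version != loaded_version:
            # The session changed after this writer read it.
            raise ConflictError(sid)
        self._records[sid] = (current_version + 1, dict(data))
```

With strict=False the second of two overlapping writers silently replaces the first; with strict=True it gets an exception and can retry or report.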


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From grosser.meister.morti at gmx.net  Tue Aug 16 03:27:26 2005
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Tue, 16 Aug 2005 03:27:26 +0200
Subject: [Web-SIG] httplib ICY support (3 lines of code)
Message-ID: <430140FE.3070808@gmx.net>

Hi.

I hope this is the right place to post this. To add ICY support to the
httplib module you just have to add 2 lines and 2 characters! ;)
ICY is a streaming protocol developed by Nullsoft (SHOUTcast and
Winamp). It's identical to HTTP/1.0, but the server sends ICY instead
of HTTP/1.0. Other differences are additional header fields, but
httplib doesn't need to care about those.

To add ICY support to httplib, simply replace at line 308 in httplib.py:
        if not version.startswith('HTTP/'):
with:
        if version == 'ICY':
            version = "HTTP/1.0"
        elif not version.startswith('HTTP/'):

Now I can write a little stream-dumping ICY proxy. ;)

	-panzi

From mike_mp at zzzcomputing.com  Tue Aug 16 17:48:36 2005
From: mike_mp at zzzcomputing.com (mike bayer)
Date: Tue, 16 Aug 2005 11:48:36 -0400 (EDT)
Subject: [Web-SIG] and now for something completely different!
Message-ID: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>

if I may throw my hat in the ring here, the session object I have built
for Myghty accomplishes the following things, which were the important
facets of a session for me:

- it is neutral of its backend storage system.  I developed a simple
"storage" API that currently has DBM, memory and plain file-based systems
and people have also been clamoring for a memcached version which is easy
enough to add.  Myghty uses this backend containment system both for its
page caching and session libraries.

- the backend storage system supplies locking which locks amongst threads
and processes; the session implementation ensures that this lock is only
against its own session ID.  I was basically going for an improvement over
mod_python's session, which locks all sessions against a single apache
global mutex, and stores everyone's session in one huge DBM file.  my
session object, when using file-based containment, always keeps every
session's information in separate files and was modeled after
Apache::Session in this regard.

- because a "read" operation also registers a "last accessed time" data
member, it's not using multiple reader/single writer style locking,
everyone is a writer.  However, since I am sensitive to iframes, ajax
calls, and dynamic image calls hitting the same session concurrently
within a request which I'd rather not slow down, I do something less than
optimal which is I open the session store and read the full thing into
memory first when it's accessed, and then immediately unlock.  This
obviously can create problems for an application that is storing huge
amounts of data in its session which is not required in full for any one
request.

Two improvements to this behavior would be to either make the "last
accessed time" be written out just once per request and then to allow
multiple readers, or to improve the containment API to supply "last
accessed time" automatically.
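
A simplified sketch of the per-session locking and "read it all, then unlock" behaviour described above.  This is not Myghty's actual code: it only locks among threads (not processes), uses a plain dict as the container, and every name in it is made up for illustration.

```python
import copy
import threading
import time


class PerSessionLocks:
    """Hands out one lock per session ID instead of one global mutex."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._locks = {}

    def lock_for(self, sid):
        with self._guard:
            return self._locks.setdefault(sid, threading.Lock())


locks = PerSessionLocks()
storage = {}  # stand-in for a DBM, file, or memcached container


def load_snapshot(sid):
    # Every access writes "last accessed", so every access takes the lock,
    # but only long enough to copy the data into memory.
    with locks.lock_for(sid):
        record = storage.setdefault(sid, {"data": {}, "last_accessed": 0.0})
        record["last_accessed"] = time.time()
        return copy.deepcopy(record["data"])
```

The deep copy is what makes the early unlock safe, and it is also why very large sessions get expensive.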

I mostly was using Apache::Session as a guide to the architectural
features I wanted to see, which include flexibility of containment and
locking systems as well as a separation between individual sessions.

- mike

From jonathan at carnageblender.com  Tue Aug 16 18:08:13 2005
From: jonathan at carnageblender.com (Jonathan Ellis)
Date: Tue, 16 Aug 2005 09:08:13 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>
References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>
Message-ID: <1124208493.11438.240803964@webmail.messagingengine.com>

On Tue, 16 Aug 2005 11:48:36 -0400 (EDT), "mike bayer"
<mike_mp at zzzcomputing.com> said:
> - because a "read" operation also registers a "last accessed time" data
> member, it's not using multiple reader/single writer style locking,
> everyone is a writer.  However, since I am sensitive to iframes, ajax
> calls, and dynamic image calls hitting the same session concurrently
> within a request which I'd rather not slow down, I do something less than
> optimal which is I open the session store and read the full thing into
> memory first when it's accessed, and then immediately unlock.  This
> obviously can create problems for an application that is storing huge
> amounts of data in its session which is not required in full for any one
> request.

I don't think read/write locking for sessions is a Must Have, either. 
It's nice if it's easy to do (which it is, in a threaded situation), but
fundamentally the session is not the right place for caching Lots Of
Stuff.  

-Jonathan

From ianb at colorstudy.com  Tue Aug 16 18:10:44 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 11:10:44 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>
References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>
Message-ID: <43021004.4090407@colorstudy.com>

mike bayer wrote:
> I mostly was using Apache::Session as a guide to the architectural
> features I wanted to see, which include flexibility of containment and
> locking systems as well as a separation between individual sessions.

Is there a good API guide to Apache::Session somewhere?

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com  Tue Aug 16 18:28:04 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 11:28:04 -0500
Subject: [Web-SIG] Session interface
Message-ID: <43021414.9080102@colorstudy.com>

I wrote a possible interface for sessions: 
http://aspn.activestate.com/ASPN/CodeDoc/Apache-mod_perl_guide/src/modules.html

It's not my most thoughtful effort, but maybe it can be a discussion 
point.  Feel free to offer completely different APIs if you think this 
one sucks.  I basically just threw in properties and methods for all the 
functionality I've thought of by reading a couple APIs and the 
discussion here, without actually thinking about how it goes together :-/

In this interface presumably you make subclasses of an abstract class to 
implement different storage backends and do some kinds of configuration.

Thinking on it more, probably a good place to start would be agreeing on 
specific terminology for the objects involved, since I've seen several 
different sets of terminology, many of which use the same words for 
different ideas:

Session:
   An instance of this represents one user/browser's session.
SessionStore:
   An instance of this represents the persistence mechanism.  This
   is a functional component, not embodying any policy.
SessionManager:
   This is a container for sessions, and uses a SessionStore.  This
   contains all the policy for loading, saving, locking, expiring
   sessions.

Does that sound good?  Note that the attached interface conflates 
SessionStore and SessionManager.  Some interfaces make an explicit 
ApplicationSession, which is contained by Session and keyed off some 
application ID; my interface implies that separation, but does not 
enforce it, and does not offer any extra functionality at that level 
(e.g., per-ApplicationSession locks or transactions).
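
To pin the terminology down, here is a toy sketch of the three roles, kept separate rather than conflated.  It is an illustration only and does not match the linked interface; the method names, the record layout, and the injectable clock are assumptions.

```python
import time


class Session(dict):
    """One user/browser's session: just data plus its ID."""

    def __init__(self, session_id, data=None):
        super().__init__(data or {})
        self.session_id = session_id


class MemorySessionStore:
    """Persistence mechanism only; no policy lives here."""

    def __init__(self):
        self._records = {}

    def load(self, session_id):
        return self._records.get(session_id)

    def save(self, session_id, record):
        self._records[session_id] = record

    def delete(self, session_id):
        self._records.pop(session_id, None)


class SessionManager:
    """Policy: loading, saving, and expiring, on top of a store."""

    def __init__(self, store, timeout=1800, clock=time.time):
        self.store = store
        self.timeout = timeout
        self.clock = clock

    def get(self, session_id):
        record = self.store.load(session_id)
        if record is not None and self.clock() - record["accessed"] > self.timeout:
            self.store.delete(session_id)  # expired: start over empty
            record = None
        return Session(session_id, record["data"] if record else {})

    def save(self, session):
        self.store.save(session.session_id,
                        {"data": dict(session), "accessed": self.clock()})
```

Locking is left out here; it would belong in SessionManager alongside the expiration policy.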


From ianb at colorstudy.com  Tue Aug 16 18:50:38 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 11:50:38 -0500
Subject: [Web-SIG] Session interface (corrected URL)
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727B3@exchange.hqamor.amorhq.net>
References: <3A81C87DC164034AA4E2DDFE11D258E37727B3@exchange.hqamor.amorhq.net>
Message-ID: <4302195E.7060704@colorstudy.com>

Robert Brewer wrote:
>>I wrote a possible interface for sessions: 
>>
> 
> http://aspn.activestate.com/ASPN/CodeDoc/Apache-mod_perl_guide/src/modul
> es.html
> 
> You wrote Apache::Session, ::DBI, ::Request, AND ::SubProcess? I must
> remember to put that in my memoirs...

Doh!  Clearly my copy-and-paste skills are lacking ;)

http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com  Tue Aug 16 18:54:45 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 11:54:45 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>
References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>
Message-ID: <43021A55.50007@colorstudy.com>

mike bayer wrote:
> - because a "read" operation also registers a "last accessed time" data
> member, it's not using multiple reader/single writer style locking,
> everyone is a writer.  However, since I am sensitive to iframes, ajax
> calls, and dynamic image calls hitting the same session concurrently
> within a request which I'd rather not slow down, I do something less than
> optimal which is I open the session store and read the full thing into
> memory first when it's accessed, and then immediately unlock.  This
> obviously can create problems for an application that is storing huge
> amounts of data in its session which is not required in full for any one
> request.

I think we can all agree that we're not expecting sessions to be primary 
storage for large objects, so we shouldn't worry too much about this.

However, as a use case for objects derived from the session, consider an 
upload form with validation.  If someone uploads a large file but has an 
invalid form, you might want to keep the file around on the server side. 
You can't put it in the form (hidden or not) because then you 
needlessly retransfer the file.  You can't leave the filename in 
the input field, because browsers don't allow that.

So in a lot of ways this is where it would be nice to put a big file in 
the session.  But you should really put it in a temporary directory and 
put the filename in the session (you could put the filename in a signed 
field in the form, but ignore that for now).  The advantage of putting 
it in the session is that the session has tracking, a timeout, etc.

So with the API I gave you might do:

import os

def delete_upload_filename(session_id):
    # Runs when a session expires: remove the temporary upload, if any.
    session = session_store.load_session_read_only(session_id)
    if 'upload_filename' in session:
        filename = session['upload_filename']
        if os.path.exists(filename):
            os.unlink(filename)

# Register the callback only after it has been defined.
session['upload_filename'] = '/tmp/foo.jpg'
session.store.expire_session_callbacks.append(delete_upload_filename)

Though there are a couple of issues.  The session store should be passed 
along with the session ID.  It should be specified that loading a 
session from this callback will not cancel its expiration.  Maybe 
per-session callbacks should be allowed; in which case the callbacks 
would have to be identifiable by a string or some pickleable value, 
since you can't pickle the functions themselves.  I suppose you could 
implement the callback as an instance with a __call__ method, which 
pickle turns into a class name plus __dict__ values.  I hate overusing 
__call__; if it has to be an instance (to be pickleable), then might as 
well give it a method name, and maybe call other methods as well.  Then 
it essentially becomes an ad hoc event system.
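
As a sketch of the "identifiable by a string" option: the session stores only a handler name plus plain arguments, which pickle without trouble, and a registry maps the name back to a function at expiration time.  The registry, the decorator, and fire_expire_callbacks are all hypothetical names, not part of the interface I posted.

```python
import os
import pickle

EXPIRE_HANDLERS = {}  # hypothetical registry: name -> callable


def expire_handler(name):
    # Decorator that registers a function under a stable string name.
    def register(func):
        EXPIRE_HANDLERS[name] = func
        return func
    return register


@expire_handler("delete_upload")
def delete_upload(session_id, filename):
    if os.path.exists(filename):
        os.unlink(filename)


def fire_expire_callbacks(session_id, pickled_callbacks):
    # pickled_callbacks is what a session would carry: pickled
    # (name, kwargs) pairs made only of strings and dicts.
    for name, kwargs in pickle.loads(pickled_callbacks):
        EXPIRE_HANDLERS[name](session_id, **kwargs)
```

A session would then keep something like pickle.dumps([("delete_upload", {"filename": "/tmp/foo.jpg"})]) alongside its data.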

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 16 18:55:16 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 12:55:16 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <43021414.9080102@colorstudy.com>
Message-ID: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>

At 11:28 AM 8/16/2005 -0500, Ian Bicking wrote:
>I wrote a possible interface for sessions:
>http://aspn.activestate.com/ASPN/CodeDoc/Apache-mod_perl_guide/src/modules.html

Um, wha?


>Session:
>    An instance of this represents one user/browser's session.
>SessionStore:
>    An instance of this represents the persistence mechanism.  This
>    is a functional component, not embodying any policy.
>SessionManager:
>    This is a container for sessions, and uses a SessionStore.  This
>    contains all the policy for loading, saving, locking, expiring
>    sessions.

Which of these is responsible for managing client-side state?  (i.e. cookie 
reading, setting, expiration, and refresh?)

Maybe this is clearer in what you actually wrote, but the link above gives 
no clue.  :)


From ianb at colorstudy.com  Tue Aug 16 19:02:12 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 12:02:12 -0500
Subject: [Web-SIG] Session interface
In-Reply-To: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
References: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
Message-ID: <43021C14.9060702@colorstudy.com>

Phillip J. Eby wrote:
>> Session:
>>    An instance of this represents one user/browser's session.
>> SessionStore:
>>    An instance of this represents the persistence mechanism.  This
>>    is a functional component, not embodying any policy.
>> SessionManager:
>>    This is a container for sessions, and uses a SessionStore.  This
>>    contains all the policy for loading, saving, locking, expiring
>>    sessions.
> 
> 
> Which of these is responsible for managing client-side state?  (i.e. 
> cookie reading, setting, expiration, and refresh?)

SessionManager is responsible for expiration.  I'm not sure what you are 
thinking of for refresh.  Updating last-accessed time?  That would be 
the SessionManager as well.  Cookies are not handled at all by these 
objects -- that's one of those boring details I think is best left to 
library users (frameworks, services, middleware), or put in another object.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com  Tue Aug 16 19:05:06 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 12:05:06 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <43021A55.50007@colorstudy.com>
References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>
	<43021A55.50007@colorstudy.com>
Message-ID: <43021CC2.3040102@colorstudy.com>

Ian Bicking wrote:
> Though there's a couple issues.  The session store should be passed 
> along with the session ID.  It should be specified that loading a 
> session from this callback will not cancel its expiration.  Maybe 
> per-session callbacks should be allowed; in which case the callbacks 
> would have to be identifiable by a string or some pickleable value, 
> since you can't pickle the functions themselves.  I suppose you could 
> implement the callback as an instance with a __call__ method, which 
> pickle turns into a class name plus __dict__ values.  I hate overusing 
> __call__; if it has to be an instance (to be pickleable), then might as 
> well give it a method name, and maybe call other methods as well.  Then 
> it essentially becomes an ad hoc event system.

A more complete event system would also let people like Phillip who 
don't want to use ad hoc storage to simply ignore that part, and use the 
session ID and events to manage data in their application storage.


-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 16 19:23:20 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 13:23:20 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <43021C14.9060702@colorstudy.com>
References: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>

At 12:02 PM 8/16/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>>Session:
>>>    An instance of this represents one user/browser's session.
>>>SessionStore:
>>>    An instance of this represents the persistence mechanism.  This
>>>    is a functional component, not embodying any policy.
>>>SessionManager:
>>>    This is a container for sessions, and uses a SessionStore.  This
>>>    contains all the policy for loading, saving, locking, expiring
>>>    sessions.
>>
>>Which of these is responsible for managing client-side state?  (i.e. 
>>cookie reading, setting, expiration, and refresh?)
>
>SessionManager is responsible for expiration.  I'm not sure what you are 
>thinking of for refresh.  Updating last-accessed time?  That would be the 
>SessionManager as well.

By refresh, I mean updating a cookie's expiration time.


>   Cookies are not handled at all by these objects -- that's one of those 
> boring details I think is best left to library users (frameworks, 
> services, middleware), or put in another object.

Wow.  Those boring details, as you call them, are the entire concept of 
"session" to me.  Now that you've posted the right interface URL, I'm 
looking at it and not seeing anything there that seems related to what I 
think of as sessions.

To me, session management is totally about managing the client-side state, 
since anything I'm storing on the server is application state and just gets 
stored the way anything else does.  Some of the client-side state concerns:

  * Triggering actions when the state information isn't available (due to 
being a new session or a client-side timeout)

  * Initial expiration vs. refresh policy

  * signed vs. unsigned data

If you handle these well, then simply storing real data in your application 
DB solves all problems, with no need for any of the objects in the 
interface you defined.

Or, to put it differently, I suppose I could wrap a pure client-side 
storage solution in the interfaces you propose, but it would be overkill, 
since concurrency would be a non-issue (among others).  It would also be 
slightly broken in that the interfaces you've written up don't deal with 
any of the *interesting* details, which (IMO) are all in the client-state 
policy areas.  (All the concurrency/scaling/cache-sharing/etc. issues of 
session stores vanish if you only have client-stored and db-stored data.)


From ianb at colorstudy.com  Tue Aug 16 19:46:49 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 12:46:49 -0500
Subject: [Web-SIG] Session interface
In-Reply-To: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
References: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
Message-ID: <43022689.9080504@colorstudy.com>

Phillip J. Eby wrote:
>> SessionManager is responsible for expiration.  I'm not sure what you 
>> are thinking of for refresh.  Updating last-accessed time?  That would 
>> be the SessionManager as well.
> 
> 
> By refresh, I mean updating a cookie's expiration time.

I'm not sure; I always try to make the cookie last longer than the 
session can.  I suppose you could store information about when the 
cookie is supposed to expire in the session itself (since you can't read 
expiration times from the cookie).  Or you could store the expiration as 
part of the cookie data; I haven't thought about doing it that way.

>>   Cookies are not handled at all by these objects -- that's one of 
>> those boring details I think is best left to library users 
>> (frameworks, services, middleware), or put in another object.
> 
> 
> Wow.  Those boring details, as you call them, are the entire concept of 
> "session" to me.  Now that you've posted the right interface URL, I'm 
> looking at it and not seeing anything there that seems related to what I 
> think of as sessions.

OK, maybe not boring, but impossible to put in a library in any useful 
way.  If you do put them in a library, all you've really created is a 
big document on possible use cases and some really boring (as in 
trivial) functions -- write_cookie_header(), cookie_header_tuple(), 
add_session_id_to_url(), read_session_id(), etc.  If it built on some 
other standard (services, middleware, etc), then maybe it would be 
useful; we have no such standard, so I don't see any useful work to be 
done there.  Instead of inventing a single-use framework to build on, or 
trying to tackle the larger framework standardization, I'd rather ignore 
the issue and assume that we obtain and save the session ID elsewhere.
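
To give a sense of how small these helpers are, here is a minimal sketch 
of two of them in present-day Python.  Only the function names come from 
the list above; the cookie name, the signatures, and the choice of 
http.cookies and urllib.parse are assumptions made for illustration.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SESSION_COOKIE = "session_id"  # hypothetical cookie / query parameter name


def read_session_id(environ):
    """Return the session ID from the request's Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(SESSION_COOKIE)
    return morsel.value if morsel is not None else None


def add_session_id_to_url(url, session_id):
    """Append the session ID as a query parameter (cookie-less fallback)."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((SESSION_COOKIE, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Neither function knows anything about storage or policy, which is the 
point: they only move an identifier in and out of the request.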

I think most of us have a clear idea of what we want a session to be, 
which includes persistence; at least, that's what all the APIs discussed 
so far have been about, and that's what "session" means in most 
frameworks.  It's not what you want, and that's fine -- I think if you 
can get a session ID and notification of events you can do what you want 
to do just fine, and ignore the rest.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 16 21:45:47 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 15:45:47 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <43022689.9080504@colorstudy.com>
References: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>

At 12:46 PM 8/16/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>>SessionManager is responsible for expiration.  I'm not sure what you are 
>>>thinking of for refresh.  Updating last-accessed time?  That would be 
>>>the SessionManager as well.
>>
>>By refresh, I mean updating a cookie's expiration time.
>
>I'm not sure; I always try to make the cookie last longer than the session 
>can.  I suppose you could store information about when the cookie is 
>supposed to expire in the session itself (since you can't read expiration 
>times from the cookie).  Or you could store the expiration as part of the 
>cookie data; I haven't thought about doing it that way.

Note that if you store state client-side (login info, for example), then 
cookie expiration is a convenient way to get the client to do your garbage 
collection.  If I want somebody's login to time out after 30 minutes of 
inactivity (or 8 hours, or whatever), the easy way to do that is to just 
set the cookie to time out, and refresh the expiration time on each hit.
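
A minimal sketch of that refresh-on-each-hit idea, assuming present-day 
Python's http.cookies module.  The function name, the 30-minute constant, 
and the cookie name in the test are hypothetical and not part of any 
proposed interface.

```python
from http.cookies import SimpleCookie

IDLE_TIMEOUT = 30 * 60  # seconds; hypothetical 30-minute inactivity window


def refreshed_cookie_header(name, value, max_age=IDLE_TIMEOUT, path="/"):
    """Build a Set-Cookie header that restarts the client-side idle timer.

    Sending this on every response means the browser drops the cookie by
    itself once max_age seconds pass without a request.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age
    cookie[name]["path"] = path
    return ("Set-Cookie", cookie[name].OutputString())
```

An application or middleware would append the returned tuple to the 
response headers on each hit; no server-side expiry bookkeeping is 
involved.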


>   If it built on some other standard (services, middleware, etc), then 
> maybe it would be useful; we have no such standard, so I don't see any 
> useful work to be done there.

I suppose you have a point there, in that I'd see such management as a 
useful place for a middleware plus a service.  But that's one reason why 
I'd like to see the services API spec'd out.  :)


>   Instead of inventing a single-use framework to build on, or trying to 
> tackle the larger framework standardization, I'd rather ignore the issue 
> and assume that we obtain and save the session ID elsewhere.
>
>I think most of us have a clear idea of what we want a session to be, 
>which includes persistence; at least, that's what all the APIs discussed 
>so far have been about, and that's what "session" means in most 
>frameworks.  It's not what you want, and that's fine -- I think if you can 
>get a session ID and notification of events you can do what you want to do 
>just fine, and ignore the rest.

Yeah, I'm just trying to point out that you keep saying "we're trying to 
solve this problem", and I say, "you know, if you do it this way, it's not 
a problem."  And then you say, "yes, but if you do it that way, then there 
are no problems for us to solve."  (i.e., it's "boring", "trivial", etc.)

At which point I say, "yes, exactly!", thinking that we now agree that it's 
silly to do things in a way that makes them into a problem.  But apparently 
you think that means we should instead spend time making problems and 
solving them, since so many other people have chosen to make their lives 
hard in that particular way.  :)

To put it another way, I see an opportunity here to educate developers 
about better ways of doing things, rather than to institutionalize wasteful 
ways of doing them.  But I realize that I'm apparently the only person who 
thinks that way, so I'll shut up now.  (At least here, anyway; in general 
I'm still going to talk about sessions being Considered Harmful from both 
the scalability and simplicity perspectives.)


From fumanchu at amor.org  Tue Aug 16 22:07:08 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 16 Aug 2005 13:07:08 -0700
Subject: [Web-SIG] Session interface
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727BF@exchange.hqamor.amorhq.net>

Phillip J. Eby wrote:
> To put it another way, I see an opportunity here to
> educate developers about better ways of doing things,

I (and some of the other CherryPy devs) agree that there are better
ways, particularly when using a persistent server process.

> ...rather than to institutionalize wasteful
> ways of doing them.

The issue for us as framework developers is that the wasteful ways are
*already* institutionalized. Education is a worthy goal, but if it takes
5 years to convince a majority of Python web developers that they don't
need sessions, we need safe and strong implementations of sessions in
the interim. I think that if we chose to ship CherryPy, for example,
without any session functionality, we'd lose the very audience we want
to educate.

> But I realize that I'm apparently the only
> person who thinks that way, so I'll shut up now.
> (At least here, anyway; in general I'm still going
> to talk about sessions being Considered Harmful from
> both the scalability and simplicity perspectives.)

Please continue talking about it! [But as you say, probably not within
this thread ;)]. I never use sessions, and am interested in
communicating the benefits of that approach to a wider audience.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From jonathan at carnageblender.com  Tue Aug 16 22:14:18 2005
From: jonathan at carnageblender.com (Jonathan Ellis)
Date: Tue, 16 Aug 2005 13:14:18 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
References: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
Message-ID: <1124223258.7911.240822848@webmail.messagingengine.com>

On Tue, 16 Aug 2005 15:45:47 -0400, "Phillip J. Eby"
<pje at telecommunity.com> said:
> >I'm not sure; I always try to make the cookie last longer than the session 
> >can.  I suppose you could store information about when the cookie is 
> >supposed to expire in the session itself (since you can't read expiration 
> >times from the cookie).  Or you could store the expiration as part of the 
> >cookie data; I haven't thought about doing it that way.

Sure, sessions are overused and abused.  Particularly among certain
classes of developers which I won't characterize here. :)

But there's a reason they're in such common use; it's a huge waste
(particularly for low-bandwidth clients) to store anything more than
absolutely necessary in a cookie that the client sends repeatedly.  Much
more efficient to send "here's my token" which the server uses to
retrieve the rest.

-Jonathan

From pje at telecommunity.com  Tue Aug 16 22:37:31 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 16:37:31 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <1124223258.7911.240822848@webmail.messagingengine.com>
References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>

At 01:14 PM 8/16/2005 -0700, Jonathan Ellis wrote:
>Sure, sessions are overused and abused.  Particularly among certain
>classes of developers which I won't characterize here. :)
>
>But there's a reason they're in such common use; it's a huge waste
>(particularly for low-bandwidth clients) to store anything more than
>absolutely necessary in a cookie that the client sends repeatedly.  Much
>more efficient to send "here's my token" which the server uses to
>retrieve the rest.

I agree; and in fact until I saw Ian's status-message example, I've never 
had need to store anything in a cookie except login credentials or an 
identifier used to find application objects like a shopping cart.

IOW, cookies are fundamentally for short strings.  However, if your session 
data consists solely of short strings, or short-lived medium-size strings 
(like a status message) then it works out nicely.

If you have session data other than short strings, then you should store it 
with your application data, since it's clearly data that's part of your 
application.  There are plenty of object-relational solutions and you can 
select your transaction/locking policies to suit your application.  You can 
then handle load balancing at the web tier without having to play 
session-affinity tricks at the load balancer.

The last time I wrote apps using a session store was in 1997, which was 
also when I wrote a session store of my own as part of a Python ASP 
emulator.  I soon realized that session stores quickly become 
persistence systems in their own right, unless you draw the line 
somewhere.  However, if you draw the line at identifiers and other short 
strings, then you can just draw the line at the client and avoid the whole 
problem.


From pje at telecommunity.com  Tue Aug 16 22:40:55 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 16:40:55 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727BF@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20050816163746.030b12c8@mail.telecommunity.com>

At 01:07 PM 8/16/2005 -0700, Robert Brewer wrote:
>Phillip J. Eby wrote:
> > To put it another way, I see an opportunity here to
> > educate developers about better ways of doing things,
>
>I (and some of the other CherryPy devs) agree that there are better
>ways, particularly when using a persistent server process.

It's nice to know I'm not the only crazy one around here.  ;)


>The issue for us as framework developers is that the wasteful ways are
>*already* institutionalized. Education is a worthy goal, but if it takes
>5 years to convince a majority of Python web developers that they don't
>need sessions, we need safe and strong implementations of sessions in
>the interim. I think that if we chose to ship CherryPy, for example,
>without any session functionality, we'd lose the very audience we want
>to educate.

So make a session store that uses cookies only, and upsell it as your new 
"RESTful session storage option, with infinite scalability".  ;)

The flip side is then getting more good relationally-backed persistence 
systems out there, to take up the complex-objects side of the equation.


From jonathan at carnageblender.com  Tue Aug 16 22:48:50 2005
From: jonathan at carnageblender.com (Jonathan Ellis)
Date: Tue, 16 Aug 2005 13:48:50 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
Message-ID: <1124225330.11574.240825141@webmail.messagingengine.com>

On Tue, 16 Aug 2005 16:37:31 -0400, "Phillip J. Eby"
<pje at telecommunity.com> said:
> At 01:14 PM 8/16/2005 -0700, Jonathan Ellis wrote:
> >But there's a reason they're in such common use; it's a huge waste
> >(particularly for low-bandwidth clients) to store anything more than
> >absolutely necessary in a cookie that the client sends repeatedly.  Much
> >more efficient to send "here's my token" which the server uses to
> >retrieve the rest.
> 
> I agree; and in fact until I saw Ian's status-message example, I've never 
> had need to store anything in a cookie except login credentials or an 
> identifier used to find application objects like a shopping cart.
> 
> IOW, cookies are fundamentally for short strings.  However, if your
> session 
> data consists solely of short strings, or short-lived medium-size strings 
> (like a status message) then it works out nicely.

Sure, but given the choice between N short strings and one, one is
better. :)

> If you have session data other than short strings, then you should store
> it 
> with your application data, since it's clearly data that's part of your 
> application.

Still, it can be good to have a simple place to store non-permanent
information.

Is the potential for abuse worth it?  Perhaps not.  I also can't think
of a time when I needed sessions in the past 5 or so years.

-Jonathan

From mike_mp at zzzcomputing.com  Tue Aug 16 23:06:57 2005
From: mike_mp at zzzcomputing.com (mike bayer)
Date: Tue, 16 Aug 2005 17:06:57 -0400 (EDT)
Subject: [Web-SIG] Session interface
In-Reply-To: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
Message-ID: <57167.66.192.34.8.1124226417.squirrel@66.192.34.8>

Phillip J. Eby said:

> I agree; and in fact until I saw Ian's status-message example, I've never
> had need to store anything in a cookie except login credentials or an
> identifier used to find application objects like a shopping cart.
>
> IOW, cookies are fundamentally for short strings.  However, if your
> session
> data consists solely of short strings, or short-lived medium-size strings
> (like a status message) then it works out nicely.
>

there are also security considerations in using only cookies without
server-side sessions.  For login tokens, if there's no corresponding
server-side token to match up that it is in fact a current login and not
something left over from a long-closed session, then some kind of clever
encryption combined with time information must be used on the client-side
token that can guarantee the login is recent and valid.

I always use server-side sessions for logins for this reason.  I also
think server-side sessions are an easy place to store user preferences and
permissioning information originally loaded from the database, as a quick
and easy way to cut down on repeated database calls per request, which is
not as cleanly represented as an extra few thousand characters sent back
and forth with every request.

all that said, my current employer uses cookie-only sessions for
scalability reasons.  might this be-all-end-all session API also have a
"client-only" implementation available?

- mike

From gtalvola at nameconnector.com  Tue Aug 16 23:08:59 2005
From: gtalvola at nameconnector.com (Geoffrey Talvola)
Date: Tue, 16 Aug 2005 17:08:59 -0400
Subject: [Web-SIG] Session interface
Message-ID: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com>

Jonathan Ellis wrote:
> Still, it can be good to have a simple place to store non-permanent
> information.

For example...

I think a good use of sessions is in remembering selections that have been
made earlier on.  For example, suppose you have a reporting application
where you allow the user to select one or more items to report on from a
list box, several filtering options in dropdowns or checkboxes, sorting and
grouping behavior, etc.  You want to remember those settings so that if the
user returns to the report selection page, their last selected settings are
pre-selected.  But, unless the user chooses to save those settings as a
"stored report", you'd like to forget the settings when the user logs out or
when they close their browser.

Also, assume that your application already has this bundle of selections in
the form of a Python object.

Isn't the cleanest, easiest, and most efficient way to handle this to simply
save the Python object in a session variable?  In some cases, such as
using Webware's in-memory sessions, this data never has to be
marshaled or leave the application server at all.

If I didn't have sessions, I think using either cookies or a back-end db
would be more work, less clean, and less efficient in this case.

- Geoff

From jonathan at carnageblender.com  Tue Aug 16 23:28:26 2005
From: jonathan at carnageblender.com (Jonathan Ellis)
Date: Tue, 16 Aug 2005 14:28:26 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <57167.66.192.34.8.1124226417.squirrel@66.192.34.8>
References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
	<57167.66.192.34.8.1124226417.squirrel@66.192.34.8>
Message-ID: <1124227706.16301.240828160@webmail.messagingengine.com>

On Tue, 16 Aug 2005 17:06:57 -0400 (EDT), "mike bayer"
<mike_mp at zzzcomputing.com> said:
> I also
> think server-side sessions are an easy place to store user preferences
> and
> permissioning information originally loaded from the database, as a quick
> and easy way to cut down on repeated database calls per request, which is
> not as cleanly represented as an extra few thousand characters sent back
> and forth with every request.

Now that's an example of when I think sessions are a poor solution.  IMO
caching objects from the database is the job for the, well, database
object cache. :)

They are similar but not identical.  For instance, while session data
typically expires after a certain amount of time, permanent data should
never expire unless invalidated by an update.

-Jonathan

From pje at telecommunity.com  Tue Aug 16 23:42:40 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 17:42:40 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconne
	ctor.com>
Message-ID: <5.1.1.6.0.20050816171536.030b1b08@mail.telecommunity.com>

At 05:08 PM 8/16/2005 -0400, Geoffrey Talvola wrote:
>Jonathan Ellis wrote:
> > Still, it can be good to have a simple place to store non-permanent
> > information.
>
>For example...
>
>I think a good use of sessions is in remembering selections that have been
>made earlier on.  For example, suppose you have a reporting application
>where you allow the user to select one or more items to report on from a
>list box, several filtering options in dropdowns or checkboxes, sorting and
>grouping behavior, etc.  You want to remember those settings so that if the
>user returns to the report selection page, their last selected settings are
>pre-selected.  But, unless the user chooses to save those settings as a
>"stored report", you'd like to forget the settings when the user logs out or
>when they close their browser.
>
>Also, assume that your application already has this bundle of selections in
>the form of a Python object.
>
>Isn't the cleanest, easiest, and most efficient way to handle this to simply
>save the Python object in a session variable?

No.  :)

I have to admit I'm probably biased by early Zope experience, where cookie 
variables are as easy to use as form variables or any other kind of 
variable.  Just set the cookies to save the options, then refer to them in 
the page.  Sweet and simple.  And if you set the cookie path to the path of 
the page, then the client doesn't have to send them on every request, only 
the ones where it makes a difference.
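
As a rough illustration of the path trick (not Zope's actual API), a 
cookie can be scoped to one page with the Path attribute.  The helper 
name and the example path are hypothetical.

```python
from http.cookies import SimpleCookie


def page_scoped_cookie(name, value, page_path):
    """Build a Set-Cookie header the browser only returns under page_path."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = page_path
    return ("Set-Cookie", cookie[name].OutputString())
```

With page_path set to something like "/reports/select", requests for 
other parts of the site do not carry the saved options at all.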


>   In some cases, such as
>using Webware's in-memory sessions, this data never has to be
>marshaled or leave the application server at all.
>
>If I didn't have sessions, I think using either cookies or a back-end db
>would be more work, less clean, and less efficient in this case.

Maybe that's a limitation of the framework?  As I said, I'm probably 
spoiled by how easily Zope merges GET/POST/cookie variables, such that form 
variables override cookies, but if the form variable isn't supplied the 
cookie is used as a default.  That one simple behavior made "smart forms" 
really easy to make in Zope and Zope-like systems.
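
The merging behavior can be approximated in a few lines.  This is a 
sketch of the idea only, assuming present-day Python's standard library; 
it is not how Zope implements it, and the function name is made up.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs


def merged_request_vars(query_string, cookie_header):
    """Merge cookie and form variables; form values override cookies."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    # Start from the cookies, which act as remembered defaults.
    merged = {name: morsel.value for name, morsel in cookie.items()}
    # Anything submitted with the form wins over the remembered value.
    for name, values in parse_qs(query_string).items():
        merged[name] = values[-1]
    return merged
```

A report page would read its options from the merged mapping and then 
write the effective values back out as cookies, so the next visit starts 
from the last selection.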


From pje at telecommunity.com  Tue Aug 16 23:51:14 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 17:51:14 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <57167.66.192.34.8.1124226417.squirrel@66.192.34.8>
References: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com>

At 05:06 PM 8/16/2005 -0400, mike bayer wrote:
>there are also security considerations in using only cookies without
>server-side sessions.  For login tokens, if there's no corresponding
>server-side token to match up that it is in fact a current login and not
>something left over from a long-closed session, then some kind of clever
>encryption combined with time information must be used on the client-side
>token that can guarantee the login is recent and valid.

That's why I listed signed vs. unsigned data as one of the concerns that 
should be part of a client-side session API design.  You don't need 
encryption, btw, you just need a signature.  Signatures are easily done by 
using a hashing algorithm and a secret key.  And by easily done, I mean a 
few lines of Python.
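
For instance, a sketch along these lines, using HMAC with SHA-256 as one 
common way to combine a hash with a secret key.  The function names, the 
"mac:value" layout, and the placeholder key are assumptions; a real 
deployment would load the key from outside the source tree.

```python
import hashlib
import hmac

SECRET_KEY = b"not-a-real-secret"  # placeholder; never hard-code a real key


def sign_value(value, key=SECRET_KEY):
    """Return "mac:value", where mac is an HMAC-SHA256 of value."""
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return mac + ":" + value


def verify_value(signed, key=SECRET_KEY):
    """Return the original value, or raise ValueError if it was altered."""
    mac, sep, value = signed.partition(":")
    expected = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not sep or not hmac.compare_digest(mac.encode("utf-8"),
                                          expected.encode("ascii")):
        raise ValueError("signature mismatch")
    return value
```

The value itself stays readable by the client; the signature only 
guarantees that the server wrote it.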

Really the only "interesting" part of managing a hash-based signature is 
where to store the key such that all the server processes can access it, 
but it isn't part of your source code.  You can do that with a file on a 
single server, but for multiple servers it's back to the DB or else you 
need a way to push out configuration to the servers.  You also need key 
rotation such that your signatures indicate which key was used to sign 
them, so that people's keys don't suddenly stop working when you update 
your key.
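
One way to sketch the rotation part: prefix each signed value with the 
ID of the key that produced it, and keep retired keys around for 
verification until their cookies have aged out.  The key ring, key IDs, 
and function names below are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key ring: key ID -> secret.  New values are signed with
# CURRENT_KEY_ID; older keys remain only so existing cookies still verify.
KEYS = {"k1": b"old-secret", "k2": b"new-secret"}
CURRENT_KEY_ID = "k2"


def sign_with_key_id(data):
    """Return "key_id:mac:data" signed with the current key."""
    mac = hmac.new(KEYS[CURRENT_KEY_ID], data.encode("utf-8"),
                   hashlib.sha256).hexdigest()
    return ":".join((CURRENT_KEY_ID, mac, data))


def verify_with_key_id(signed):
    """Verify against whichever key the value names; raise ValueError on failure."""
    key_id, mac, data = signed.split(":", 2)
    key = KEYS.get(key_id)
    if key is None:
        raise ValueError("unknown or retired key: %r" % key_id)
    expected = hmac.new(key, data.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("ascii")):
        raise ValueError("signature mismatch")
    return data
```

Retiring a key is then just a matter of deleting its entry once nothing 
signed with it can still be valid.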

OTOH, if you have a multi-server setup you probably already know about all 
these problems and have ways to deal with them.


From ianb at colorstudy.com  Wed Aug 17 00:22:50 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 17:22:50 -0500
Subject: [Web-SIG] Secret keys (was: Session interface)
In-Reply-To: <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com>
References: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
	<5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com>
Message-ID: <4302673A.6070009@colorstudy.com>

Phillip J. Eby wrote:
> Really the only "interesting" part of managing a hash-based signature is 
> where to store the key such that all the server processes can access it, 
> but it isn't part of your source code.  You can do that with a file on a 
> single server, but for multiple servers it's back to the DB or else you 
> need a way to push out configuration to the servers.  You also need key 
> rotation such that your signatures indicate which key was used to sign 
> them, so that people's keys don't suddenly stop working when you update 
> your key.

It would be nice if there was a standard way to get the "server's" 
secret key (or key(s)).  Or, maybe more abstractly, to sign and confirm 
the signature of an item, like:

   signed_data = sign(data)
   # Raises exception if there's a problem:
   data = extract_signed_data(signed_data)

At that level any key rotation can be hidden.  The mechanism is easy, 
the key management is actually not "hard", but it depends on what your 
definition of "server" is.  That would be a ripe place for 
standardization; easy to define, useful, multiple implementations 
expected.  But where do you stuff the functions?  It almost seems best 
to have server environments create or monkey patch some single module, 
since I can't really think of a reason that a single process should have 
multiple keys (except maybe in Zope, which has intraprocess security).

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com  Wed Aug 17 00:32:11 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 17:32:11 -0500
Subject: [Web-SIG] Session interface
In-Reply-To: <43021414.9080102@colorstudy.com>
References: <43021414.9080102@colorstudy.com>
Message-ID: <4302696B.6030601@colorstudy.com>

Anyone still interested in session libraries?  Putting the wisdom of 
such a thing aside, any thoughts on a library itself?

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Wed Aug 17 01:16:06 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 16 Aug 2005 19:16:06 -0400
Subject: [Web-SIG] Secret keys (was: Session interface)
In-Reply-To: <4302673A.6070009@colorstudy.com>
References: <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com>
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>
	<5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050816190849.01b26308@mail.telecommunity.com>

At 05:22 PM 8/16/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Really the only "interesting" part of managing a hash-based signature is 
>>where to store the key such that all the server processes can access it, 
>>but it isn't part of your source code.  You can do that with a file on a 
>>single server, but for multiple servers it's back to the DB or else you 
>>need a way to push out configuration to the servers.  You also need key 
>>rotation such that your signatures indicate which key was used to sign 
>>them, so that people's keys don't suddenly stop working when you update 
>>your key.
>
>It would be nice if there was a standard way to get the "server's" secret 
>key (or key(s)).  Or, maybe more abstractly, to sign and confirm the 
>signature of an item, like:
>
>   signed_data = sign(data)
>   # Raises exception if there's a problem:
>   data = extract_signed_data(signed_data)

The extraction facility should probably accept an optional timeout, too, so 
that messages older than the timeout are considered invalid.
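
A sketch of what that might look like, keeping the sign() and 
extract_signed_data() names from the example above.  Everything else is 
an assumption: the embedded issue time, the "mac:issued:data" layout, the 
placeholder key, and the now parameter, which exists only to make the 
behavior easy to test.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"not-a-real-secret"  # placeholder; supplied by the server setup


def _mac(payload):
    return hmac.new(SECRET_KEY, payload.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def sign(data, now=None):
    """Return "mac:issued:data", where issued is the signing time in seconds."""
    issued = str(int(time.time() if now is None else now))
    payload = issued + ":" + data
    return _mac(payload) + ":" + payload


def extract_signed_data(signed_data, timeout=None, now=None):
    """Return data; raise ValueError if forged, malformed, or too old."""
    try:
        mac, issued, data = signed_data.split(":", 2)
    except ValueError:
        raise ValueError("malformed signed data")
    expected = _mac(issued + ":" + data)
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("ascii")):
        raise ValueError("signature mismatch")
    # The timestamp is covered by the signature, so it can be trusted here.
    if timeout is not None:
        current = time.time() if now is None else now
        if current - int(issued) > timeout:
            raise ValueError("signed data has expired")
    return data
```

Because the issue time is signed along with the data, a client cannot 
extend its own login by editing the cookie.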


>At that level any key rotation can be hidden.  The mechanism is easy, the 
>key management is actually not "hard", but it depends on what your 
>definition of "server" is.  That would be a ripe place for 
>standardization; easy to define, useful, multiple implementations 
>expected.  But where do you stuff the functions?

In a WSGI service, as soon as we finish that spec.  :)


>   It almost seems best to have server environments create or monkey patch 
> some single module, since I can't really think of a reason that a single 
> process should have multiple keys (except maybe in Zope, which has 
> intraprocess security).

I'm not so much concerned about intraprocess security as I am with 
associating things with the right applications, and being able to use two 
independently-developed applications that depend on different key 
stores.  With WSGI, you can run discrete apps in the same server process, 
so it seems to make more sense to put that in the pipeline than a 
monkeypatched module.
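
As a rough illustration of putting the key store "in the pipeline", 
here is a sketch of WSGI middleware that hands each wrapped application 
its own signing callable.  The environ key "example.signer" and the 
class name are made up for this example; nothing here is a 
standardized service:

```python
import hashlib
import hmac


class SignerMiddleware:
    """Wrap a WSGI app and publish a signing callable in its environ."""

    # Hypothetical environ key; no such key is standardized.
    ENVIRON_KEY = "example.signer"

    def __init__(self, app, secret):
        self.app = app
        self.secret = secret

    def sign(self, data):
        return hmac.new(self.secret, data, hashlib.sha256).hexdigest()

    def __call__(self, environ, start_response):
        environ[self.ENVIRON_KEY] = self.sign
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    signature = environ[SignerMiddleware.ENVIRON_KEY](b"hello")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [signature.encode("ascii")]


# Two independently configured copies of the same app in one process.
shop = SignerMiddleware(demo_app, b"shop-secret")
blog = SignerMiddleware(demo_app, b"blog-secret")
```

Each copy sees only the signer that wraps it, so two applications with 
different key stores can share a process without a shared module.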


From chrism at plope.com  Wed Aug 17 03:42:56 2005
From: chrism at plope.com (Chris McDonough)
Date: Tue, 16 Aug 2005 21:42:56 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <5.1.1.6.0.20050816171536.030b1b08@mail.telecommunity.com>
References: <5.1.1.6.0.20050816171536.030b1b08@mail.telecommunity.com>
Message-ID: <1124242977.30493.80.camel@localhost.localdomain>

I haven't been closely following this thread and this may have already
been said but IMO sessions are most useful when the querying user is not
identified and you need a place to stash data related to that user (e.g.
a shopping cart).  They are convenient in other circumstances but rarely
necessary.  

I've never quite understood why people use server-side sessions for
authentication.  Maybe it's because they're typically so easy to use and
have been sold as "the way to maintain state" in a web application to a
lot of people.  But in reality they can be quite expensive under high
load because of their generality and there's almost always a better
way.

On Tue, 2005-08-16 at 17:42 -0400, Phillip J. Eby wrote:
> At 05:08 PM 8/16/2005 -0400, Geoffrey Talvola wrote:
> >Jonathan Ellis wrote:
> > > Still, it can be good to have a simple place to store non-permanent
> > > information.
> >
> >For example...
> >
> >I think a good use of sessions is in remembering selections that have been
> >made earlier on.  For example, suppose you have a reporting application
> >where you allow the user to select one or more items to report on from a
> >list box, several filtering options in dropdowns or checkboxes, sorting and
> >grouping behavior, etc.  You want to remember those settings so that if the
> >user returns to the report selection page, their last selected settings are
> >pre-selected.  But, unless the user chooses to save those settings as a
> >"stored report", you'd like to forget the settings when the user logs out or
> >when they close their browser.
> >
> >Also, assume that your application already has this bundle of selections in
> >the form of a Python object.
> >
> >Isn't the cleanest, easiest, and most efficient way to handle this to simply
> >save the Python object in a session variable?
> 
> No.  :)
> 
> I have to admit I'm probably biased by early Zope experience, where cookie 
> variables are as easy to use as form variables or any other kind of 
> variable.  Just set the cookies to save the options, then refer to them in 
> the page.  Sweet and simple.  And if you set the cookie path to the path of 
> the page, then the client doesn't have to send them on every request, only 
> the ones where it makes a difference.
> 
> 
> >   In some cases, for example
> >using Webware's in-memory sessions, this data never has to be
> >marshaled or leave the application server at all.
> >
> >If I didn't have sessions, I think using either cookies or a back-end db
> >would be more work, less clean, and less efficient in this case.
> 
> Maybe that's a limitation of the framework?  As I said, I'm probably 
> spoiled by how easily Zope merges GET/POST/cookie variables, such that form 
> variables override cookies, but if the form variable isn't supplied the 
> cookie is used as a default.  That one simple behavior made "smart forms" 
> really easy to make in Zope and Zope-like systems.
> 
> 
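
The form-over-cookie default that Phillip describes can be sketched 
roughly as below.  This is not Zope's implementation; the function 
names are invented for the example, and only the GET query string is 
merged to keep it short:

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs


def merged_variables(query_string, cookie_header):
    """Merge request variables; form values win over cookie values."""
    merged = {}
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    for name, morsel in cookies.items():
        merged[name] = morsel.value
    for name, values in parse_qs(query_string).items():
        merged[name] = values[-1]
    return merged


def remember(name, value, path):
    """Build a Set-Cookie header value scoped to one page's path."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    return cookie[name].OutputString()
```

Scoping the cookie to the page's path is what keeps the browser from 
sending it on unrelated requests.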


From mso at oz.net  Wed Aug 17 06:54:48 2005
From: mso at oz.net (Mike Orr)
Date: Tue, 16 Aug 2005 21:54:48 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <43021414.9080102@colorstudy.com>
References: <43021414.9080102@colorstudy.com>
Message-ID: <4302C318.9050900@oz.net>

Regarding Ian's session interface:
http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py

Ian Bicking wrote:

>Thinking on it more, probably a good place to start would be agreeing on 
>specific terminology for the objects involved, since I've seen several 
>different sets of terminology, many of which use the same words for 
>different ideas:
>
>Session:
>   An instance of this represents one user/browser's session.
>SessionStore:
>   An instance of this represents the persistence mechanism.  This
>   is a functional component, not embodying any policy.
>SessionManager:
>   This is a container for sessions, and uses a SessionStore.  This
>   contains all the policy for loading, saving, locking, expiring
>   sessions.
>  
>


At minimum, the SessionManager links the SessionStore, Session, and 
application together.  It can be generic, along with 
loading/saving/locking.  (Although we might allow the application to 
choose a locking policy.)  But expiring is very application-specific, 
and it may not be the "application" doing it but a separate cron job.  
Perhaps most applications will be happy with an "expire all sessions 
unmodified for N minutes", but some will want to inspect the metadata 
and others the content.  So maybe all the SessionManager can do is:

    .delete_session(id)   => pass message directly to SessionStore
    .iter_sessions()  =>  tuples of (id, metadata)
    .iter_sessions_with_content() => tuples of (id, metadata, content)

... where metadata includes the access time and whatever else we 
decide.  Of course, iterating the content may be disk/memory intensive.

If .delete_expired_sessions() is included, the application would have to 
subclass SessionManager rather than just using it.  That's not 
necessarily bad but a potential limitation.  Or the application could 
kludge up a policy from your methods:

    import time

    cutoff = time.time() - (60 * 60 * 4)   # four hours ago
    for sid in sm.session_ids():
        if sm.last_accessed(sid) < cutoff:
            sm.delete_session(sid)

I suppose kludgy is in the eye of the beholder.  This would not be kludgy:

    cutoff = time.time() - (60 * 60 * 4)
    for sid, metadata in sm.iter_sessions():
        if metadata.atime < cutoff:
            sm.delete_session(sid)

Curses on anybody who says, "What's the difference?"
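
To make the three methods above concrete, here is a sketch of a 
manager over an in-memory store, with the expiry policy kept outside 
the manager.  DictSessionStore, Metadata, save() and 
expire_older_than() are invented for the example and are not taken 
from Ian's interface:

```python
import time
from collections import namedtuple

# Hypothetical metadata record; the thread only requires an access time.
Metadata = namedtuple("Metadata", "atime")


class DictSessionStore:
    """In-memory stand-in for a SessionStore."""

    def __init__(self):
        self.rows = {}  # sid -> (Metadata, content)


class SessionManager:
    def __init__(self, store):
        self.store = store

    def save(self, sid, content, atime=None):
        stamp = time.time() if atime is None else atime
        self.store.rows[sid] = (Metadata(stamp), content)

    def delete_session(self, sid):
        del self.store.rows[sid]

    def iter_sessions(self):
        # Snapshot the items so callers may delete while iterating.
        for sid, (metadata, _content) in list(self.store.rows.items()):
            yield sid, metadata

    def iter_sessions_with_content(self):
        for sid, (metadata, content) in list(self.store.rows.items()):
            yield sid, metadata, content


def expire_older_than(sm, seconds, now=None):
    """Application-side policy: drop sessions idle longer than `seconds`."""
    cutoff = (time.time() if now is None else now) - seconds
    for sid, metadata in sm.iter_sessions():
        if metadata.atime < cutoff:
            sm.delete_session(sid)
```

The policy function needs only the generic methods, so an application 
or a cron job can supply its own without subclassing the manager.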

PS. Kudos for using .names_with_underscores rather than .studlyCaps.

Your other methods look all right at first glance.  We'll know when we 
port existing frameworks to it whether it's adequate.  (Or should that 
be "when we port it to existing frameworks"?  Or "when we make existing 
frameworks use it as middleware"?)  We'll also have to keep an eye on a 
usage pattern to recommend for future frameworks, and on whether this 
API has anything to do with the "sessionless" persistence patterns that 
have also been proposed. 

Interesting ideas you've had about read/write vs read-only sessions.  
I'd say let's support read-only sessions, and maybe that will encourage 
applications to use them.

Session ID cookies seem like a generic thing this class should handle, 
especially for applications that don't otherwise use cookies.  XML-RPC 
encapsulates the XML (a necessary evil); why shouldn't we encapsulate 
the cookie (another necessary evil)?
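
A sketch of what encapsulating the cookie could look like for a WSGI 
application.  The cookie name and the function name are assumptions 
made for this example:

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "session_id"  # hypothetical name, not part of any spec


def get_or_create_session_id(environ):
    """Return (sid, headers); headers is non-empty only for a new ID."""
    cookies = SimpleCookie()
    cookies.load(environ.get("HTTP_COOKIE", ""))
    if COOKIE_NAME in cookies:
        return cookies[COOKIE_NAME].value, []
    sid = secrets.token_hex(16)
    fresh = SimpleCookie()
    fresh[COOKIE_NAME] = sid
    fresh[COOKIE_NAME]["path"] = "/"
    return sid, [("Set-Cookie", fresh[COOKIE_NAME].OutputString())]
```

The application appends the returned headers to its response and never 
has to touch the cookie format itself.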

>Does that sound good?  Note that the attached interface conflates 
>SessionStore and SessionManager.  Some interfaces make an explicit 
>ApplicationSession, which is contained by Session and keyed off some 
>application ID; my interface implies that separation, but does not 
>enforce it, and does not offer any extra functionality at that level 
>(e.g., per-ApplicationSession locks or transactions).
>  
>


I'm not sure what you mean by ApplicationSession.  Perl's session object 
is a dictionary, and you can store anything in it.  Our top-level object 
has to be flexible due to grandfathering, unless we want to force 
applications to translate to/from our session object to their native 
session format.  Yet you define certain attributes/methods the Session 
must have, which legacy Sessions don't.  I guess allow the application 
to provide a subclass or compatible class, and let it worry about how to 
upgrade its native session object.

Regarding sessionless persistence, that reminds me of a disagreement I 
had with Titus in designing session2.  Quixote provides Session.user 
(default None) but doesn't define what other values it can have.  I put a 
full-fledged User object with username/group/permission info.  Titus 
puts a string name and stores everything else in his application 
database.  So his *SessionStore classes put the name in a VARCHAR column 
and didn't save the rest of the session data.  I argued that "most 
people will have a User object, and they'll expect the entire Session to 
be pickled because that's what PHP/Perl do."  He relented, so the 
current *SessionStores can be used either way.

Perhaps applications should store all session data directly, keyed by 
session ID (and perhaps "username"), rather than using pickled 
Sessions.  That would be a good idea for a parallel project.  I'm not 
sure how relevant that would be to this API except to share "cookie 
code".  This API + implementations are required in any case, both 
because "most users" will not consider Python if it doesn't have "robust 
session handling", and a common library would allow frameworks to use it 
rather than reinventing the wheel incompatibly.  This is true regardless 
of the merits of sessions.

-- Mike Orr <mso at oz.net>

From ianb at colorstudy.com  Wed Aug 17 07:31:12 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 17 Aug 2005 00:31:12 -0500
Subject: [Web-SIG] Session interface
In-Reply-To: <4302C318.9050900@oz.net>
References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net>
Message-ID: <4302CBA0.6040307@colorstudy.com>

Mike Orr wrote:
> Regarding Ian's session interface:
> http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py
> 
> Ian Bicking wrote:
> 
>> Thinking on it more, probably a good place to start would be agreeing 
>> on specific terminology for the objects involved, since I've seen 
>> several different sets of terminology, many of which use the same 
>> words for different ideas:
>>
>> Session:
>>   An instance of this represents one user/browser's session.
>> SessionStore:
>>   An instance of this represents the persistence mechanism.  This
>>   is a functional component, not embodying any policy.
>> SessionManager:
>>   This is a container for sessions, and uses a SessionStore.  This
>>   contains all the policy for loading, saving, locking, expiring
>>   sessions.
>>  
>>
> 
> 
> At minimum, the SessionManager links the SessionStore, Session, and 
> application together.  It can be generic, along with 
> loading/saving/locking.  (Although we might allow the application to 
> choose a locking policy.)  

That could be a little difficult, since multiple applications may be 
sharing a session.  But at the same time, applications that don't expect 
ConflictError are going to be pissed if you configure your system for 
optimistic locking.
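
For readers who have not met optimistic locking, a sketch of why an 
unprepared application would be surprised.  VersionedStore and 
save_with_retry() are invented here; ConflictError borrows the ZODB 
name but is defined locally:

```python
class ConflictError(Exception):
    """Raised when a session changed between load and save."""


class VersionedStore:
    """Hypothetical store keeping a version counter per session."""

    def __init__(self):
        self._rows = {}  # sid -> (version, data)

    def load(self, sid):
        version, data = self._rows.get(sid, (0, {}))
        return version, dict(data)

    def save(self, sid, expected_version, data):
        current_version, _ = self._rows.get(sid, (0, {}))
        if current_version != expected_version:
            raise ConflictError(sid)
        self._rows[sid] = (current_version + 1, dict(data))


def save_with_retry(store, sid, update, attempts=3):
    """Apply update(data) and retry when another writer got in first."""
    for _ in range(attempts):
        version, data = store.load(sid)
        update(data)
        try:
            store.save(sid, version, data)
            return data
        except ConflictError:
            continue
    raise ConflictError(sid)
```

An application written for pessimistic locking has no equivalent of 
save_with_retry(), so the first concurrent write simply fails.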

Of course, given a session ID and a session store, each application 
could have its own manager.  Possibly.  Hmm... interesting.  In that 
case each SessionManager needs an id, which is a bit annoying -- it has 
to be stable and shared, because the same SessionManager has to be 
identifiable over multiple processes.  But I hate inventing IDs all over 
the place.  I feel like I'm pulling string keys out of my ass, and if 
I'm going to pull things out of my ass I at least don't want to then put 
them into my code.  I sense UUIDs coming on :(

That said, this isn't the only place I need strings that are unique to 
an application instance.
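
One way to get IDs that are stable across processes without storing 
them anywhere is a name-based UUID.  This is only a sketch: the 
namespace and the instance names are assumptions, and it still needs a 
configured name per application instance:

```python
import uuid

# Hypothetical namespace: any fixed UUID works, as long as every
# process uses the same one.
MANAGER_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "sessions.example.org")


def manager_id(instance_name):
    """Derive the same ID in every process from a configured name."""
    return str(uuid.uuid5(MANAGER_NAMESPACE, instance_name))
```

Unlike uuid.uuid4(), uuid.uuid5() is deterministic, so two processes 
configured with the same instance name agree on the ID.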

> But expiring is very application-specific, 
> and it may not be the "application" doing it but a separate cron job.  
> Perhaps most applications will be happy with an "expire all sessions 
> unmodified for N minutes", but some will want to inspect the metadata 
> and others the content.  So maybe all the SessionManager can do is:
> 
>    .delete_session(id)   => pass message directly to SessionStore
>    .iter_sessions()  =>  tuples of (id, metadata)
>    .iter_sessions_with_content() => tuples of (id, metadata, content)

I think metadata is probably good; or lazily-loaded sessions or 
something.  The metadata is important I think, because updating metadata 
shouldn't be affected by locking and whatnot.  I think Mike mentioned a 
problem with locking and updating the timestamp contained in the session 
-- we should avoid that.

> ... where metadata includes the access time and whatever else we 
> decide.  Of course, iterating the content may be disk/memory intensive.

Sure.  We could have a callback to do filtering too, maybe with a 
default filter by expiration time.  Or event callbacks.
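
A sketch of the filter-callback idea, with expiration time as the 
default filter.  The function names are invented, and metadata is 
assumed to be a dict with an "atime" entry:

```python
import time


def expired_before(cutoff):
    """Build the default filter: match sessions last accessed before cutoff."""
    def matches(sid, metadata):
        return metadata["atime"] < cutoff
    return matches


def iter_matching(sessions, matches=None, now=None, max_age=4 * 60 * 60):
    """Yield (sid, metadata) pairs accepted by the filter callback.

    `sessions` is any iterable of (sid, metadata) pairs, where metadata
    is assumed here to be a dict with an "atime" entry.
    """
    if matches is None:
        current = time.time() if now is None else now
        matches = expired_before(current - max_age)
    for sid, metadata in sessions:
        if matches(sid, metadata):
            yield sid, metadata
```

An application that wants to inspect something other than the access 
time passes its own callable as `matches`.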

> If .delete_expired_sessions() is included, the application would have to 
> subclass SessionManager rather than just using it.  That's not 
> necessarily bad but a potential limitation.  Or the application could 
> kludge up a policy from your methods:
> 
>    cutoff = time.time() - (60 * 60 * 4)
>    for sid in sm.session_ids():
>        if sm.last_accessed(sid) < cutoff:
>            sm.delete_session(sid)
> 
> I suppose kludgy is in the eye of the beholder.  This would not be kludgy:
> 
>    cutoff = time.time() - (60 * 60 * 4)
>    for sid, metadata in sm.iter_sessions():
>        if metadata.atime < cutoff:
>            sm.delete_session(sid)
> 
> Curses on anybody who says, "What's the difference?"
> 
> PS. Kudos for using .names_with_underscores rather than .studlyCaps.
> 
> Your other methods look all right at first glance.  We'll know when we 
> port existing frameworks to it whether it's adequate.  (Or should that 
> be "when we port it to existing frameworks"?  Or "when we make existing 
> frameworks use it as middleware"?)  We'll also have to keep an eye on a 
> usage pattern to recommend for future frameworks, and on whether this 
> API has anything to do with the "sessionless" persistence patterns that 
> have also been proposed.

Acquiring or creating a session ID is outside of the scope of this 
interface, but I think that's much of what would be useful to 
sessionless users.  Or, rather, people who want application-specific 
sessions.

> Interesting ideas you've had about read/write vs read-only sessions.  
> I'd say let's support read-only sessions, and maybe that will encourage 
> applications to use them.
> 
> Session ID cookies seem like a generic thing this class should handle, 
> especially for applications that don't otherwise use cookies.  XML-RPC 
> encapsulates the XML (a necessary evil); why shouldn't we encapsulate 
> the cookie (another necessary evil)?

XML-RPC contains the XML, but it doesn't deal with the transport really. 
  And, just using XML-RPC as an example, what if you want to stuff the 
session ID inside the XML-RPC request instead of in a cookie header?

But anyway, the reason I don't want to handle this is because this would 
be much easier if building upon a Standard That Does Not Yet Exist, and 
I'd rather avoid overlapping with that standard.
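
To illustrate why locating the session ID is a separate concern, here 
are two interchangeable locators, one reading a cookie and one reading 
the XML-RPC request body.  The function names, the cookie name, and 
the convention that the ID is the first XML-RPC parameter are all 
assumptions for this sketch:

```python
import xmlrpc.client
from http.cookies import SimpleCookie


def sid_from_cookie(environ, body):
    # `body` is unused; it is accepted so both locators share a signature.
    cookies = SimpleCookie()
    cookies.load(environ.get("HTTP_COOKIE", ""))
    morsel = cookies.get("session_id")
    return morsel.value if morsel is not None else None


def sid_from_xmlrpc(environ, body):
    # Assumes the caller passes the session ID as the first parameter.
    params, _method = xmlrpc.client.loads(body)
    return params[0] if params else None
```

Either locator yields an ID that can be handed to the same session 
manager, which never needs to know where the ID came from.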

>> Does that sound good?  Note that the attached interface conflates 
>> SessionStore and SessionManager.  Some interfaces make an explicit 
>> ApplicationSession, which is contained by Session and keyed off some 
>> application ID; my interface implies that separation, but does not 
>> enforce it, and does not offer any extra functionality at that level 
>> (e.g., per-ApplicationSession locks or transactions).
>>  
>>
> 
> 
> I'm not sure what you mean by ApplicationSession.  Perl's session object 
> is a dictionary, and you can store anything in it.  Our top-level object 
> has to be flexible due to grandfathering, unless we want to force 
> applications to translate to/from our session object to their native 
> session format.  Yet you define certain attributes/methods the Session 
> must have, which legacy Sessions don't.  I guess allow the application 
> to provide a subclass or compatible class, and let it worry about how to 
> upgrade its native session object.

I was thinking of pythonweb's "Store": 
http://pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/node153.html

I vaguely suggest in the interface that each application should put all 
of its data in a single key (based on the application name).  Now I 
think that should be based on a unique name (not the application name, 
because the application may exist multiple times in the process), and 
maybe with an entirely different manager.
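
A sketch of the one-key-per-application-instance idea, treating the 
shared session as a plain dict.  ApplicationView and the instance 
names are invented for the example:

```python
class ApplicationView:
    """Expose one application's slice of a shared session dict."""

    def __init__(self, session, unique_name):
        # unique_name identifies an application *instance*, so two copies
        # of the same application do not collide.
        self._data = session.setdefault(unique_name, {})

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __contains__(self, key):
        return key in self._data
```

Two copies of the same application mounted at different places then 
keep separate data as long as their unique names differ.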

> Regarding sessionless persistence, that reminds me of a disagreement I 
> had with Titus in designing session2.  Quixote provides Session.user 
> default None, but doesn't define what other values it can have.  I put a 
> full-fledged User object with username/group/permission info.  Titus 
> puts a string name and stores everything else in his application 
> database.  So his *SessionStore classes put the name in a VARCHAR column 
> and didn't save the rest of the session data.  I argued that "most 
> people will have a User object, and they'll expect the entire Session to 
> be pickled because that's what PHP/Perl do."  He relented, so the 
> current *SessionStores can be used either way.

In the interface I suggest anything pickleable can go in a key.  This 
requirement has been the source of some controversy in Webware, since 
people wanted to put open file objects and such in the session; mostly 
people coming from Java where apparently that's the norm.  Anyway, it's 
still possible with this interface to have a store that never pickles 
anything; I can just hope no one writes code they expect anyone else to 
use that demands in-memory session storage.  Those are lame even when 
you are using threads.
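
One way to catch unpickleable values early, even when the store in use 
happens to be in-memory, is to test-pickle on assignment.  This is a 
sketch, not part of the proposed interface, and it only checks item 
assignment (not update() or the constructor):

```python
import pickle


class PickleCheckedSession(dict):
    """Dict that rejects values a pickling store could not save."""

    def __setitem__(self, key, value):
        try:
            pickle.dumps(value)
        except (pickle.PicklingError, TypeError, AttributeError) as exc:
            raise ValueError("session value for %r is not pickleable: %s"
                             % (key, exc))
        super().__setitem__(key, value)
```

With this in place, storing an open file object fails at the point of 
assignment rather than later, when a different store tries to save the 
session.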

I think the example shows one reason the session shouldn't be considered 
a public API.  I think it's fine to put the username or the user object 
in the session -- we can debate the pluses and minuses, but it works -- 
but I think you should definitely wrap that implementation detail in 
something else.  E.g., request.user should return 
request.session['user'] or something.
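
A sketch of that wrapping, with a made-up Request class standing in 
for whatever request object a framework provides:

```python
class Request:
    def __init__(self, session):
        self.session = session

    @property
    def user(self):
        # Where the user lives is an implementation detail; callers only
        # see request.user.
        return self.session.get("user")

    @user.setter
    def user(self, value):
        self.session["user"] = value
```

If the user later moves out of the session, only the property changes 
and callers of request.user are unaffected.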

> Perhaps applications should store all session data directly, keyed by 
> session ID (and perhaps "username"), rather than using pickled 
> Sessions.  That would be a good idea for a parallel project.  I'm not 
> sure how relevant that would be to this API except to share "cookie 
> code".  This API + implementations are required in any case, both 
> because "most users" will not consider Python if it doesn't have "robust 
> session handling", and a common library would allow frameworks to use it 
> rather than reinventing the wheel incompatibly.  This is true regardless 
> of the merits of sessions.

I guess if applications each have their own SessionManager, they could 
have their own Session classes, and if they wanted to the Session 
objects could use application-specific storage and even an 
application-specific API (not just a dictionary interface).  I don't 
know what the point of that would be, though, since it's all 
application-specific and not generic, so you might as well just use the 
session ID and ignore the rest of the API.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From mso at oz.net  Wed Aug 17 07:33:10 2005
From: mso at oz.net (Mike Orr)
Date: Tue, 16 Aug 2005 22:33:10 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <4302C318.9050900@oz.net>
References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net>
Message-ID: <4302CC16.2050206@oz.net>

Mike Orr wrote:

>Regarding sessionless persistence, that reminds me of a disagreement I 
>had with Titus in designing session2.  Quixote provides Session.user 
>default None, but doesn't define what other values it can have.  I put a 
>full-fledged User object with username/group/permission info.  Titus 
>puts a string name and stores everything else in his application 
>database.  So his *SessionStore classes put the name in a VARCHAR column 
>and didn't save the rest of the session data.  I argued that "most 
>people will have a User object, and they'll expect the entire Session to 
>be pickled because that's what PHP/Perl do."  He relented, so the 
>current *SessionStores can be used either way.
>
>Perhaps applications should store all session data directly, keyed by 
>session ID (and perhaps "username"), rather than using pickled 
>Sessions.  That would be a good idea for a parallel project.  I'm not 
>sure how relevant that would be to this API except to share "cookie 
>code".  This API + implementations are required in any case, both 
>because "most users" will not consider Python if it doesn't have "robust 
>session handling", and a common library would allow frameworks to use it 
>rather than reinventing the wheel incompatibly.  This is true regardless 
>of the merits of sessions.
>  
>


Another thing about sessionless persistence.  I find sessions useful for 
storing miscellaneous data that would otherwise be sent to the browser 
and back.  Usually it's not a question of byte size but rather: (A) I 
don't want the user to see the data directly -- it contains more 
information about the application/server structure than I care to 
divulge, and (B) I don't want the user manipulating the data and sending 
back something invalid or in the wrong state -- which I would then have 
to error-check.

I could store the data in my relational database, but then I'd have to 
make a half-dozen tables for:

    .user    : a User instance.
    .search  : the latest search results (list of record IDs), the last
               page viewed (positive int), and the criteria to redo the
               search or repopulate the search form (dict).
    .message : a message to display at the next request.
    ... other stuff ??

So I guess sessions are a lazy way to have object-database features in a 
relational-database application.  At least for data that lasts longer 
than a request but shorter than a session timeout.

-- Mike Orr <mso at oz.net>

From chrism at plope.com  Wed Aug 17 07:48:44 2005
From: chrism at plope.com (Chris McDonough)
Date: Wed, 17 Aug 2005 01:48:44 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <4302CBA0.6040307@colorstudy.com>
References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net>
	<4302CBA0.6040307@colorstudy.com>
Message-ID: <1124257724.17688.11.camel@plope.dyndns.org>

FWIW, some interesting ideas (and not so interesting ideas) for
sessioning architecture in general are captured at

http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/UseCases

and

http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/CoreSessionTrackingDiscussion

UML that more or less represents Zope's current sessioning model is at:

http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/CoreSessionTrackingUML

- C



From mso at oz.net  Wed Aug 17 09:15:55 2005
From: mso at oz.net (Mike Orr)
Date: Wed, 17 Aug 2005 00:15:55 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <4302CBA0.6040307@colorstudy.com>
References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net>
	<4302CBA0.6040307@colorstudy.com>
Message-ID: <4302E42B.6020803@oz.net>

Ian Bicking wrote:

>> At minimum, the SessionManager links the SessionStore, Session, and 
>> application together.  It can be generic, along with 
>> loading/saving/locking.  (Although we might allow the application to 
>> choose a locking policy.)  
>
>
> That could be a little difficult, since multiple applications may be 
> sharing a session.  But at the same time, applications that don't 
> expect ConflictError are going to be pissed if you configure your 
> system for optimistic locking.

>
> Of course, given a session ID and a session store, each application 
> could have its own manager.


I wasn't thinking of multi-application sessions, much less whether they 
would have their own SessionManagers.  And since my applications don't 
have a locking policy, I have no opinion which one is best, if you want 
to impose one.  Certainly it makes sense that applications sharing a 
session must agree on a locking policy.  I'd lean toward a common 
SessionManager, especially to centralize the expiration.  Should 
applications sharing a session be allowed to have different expiration 
policies?  Perhaps SessionManager.delete_expired_sessions() is a good 
thing after all.

Another limitation is the byte size of the session pickle.  The 
SessionStore knows this, and the SessionManager should make it available 
to the application.  Then the application (or launcher script) can raise 
an exception at startup if it deems the size insufficient.  None would 
mean no limit, of course.

This happened to me when I was putting the entire result data in the 
session, then imported a dataset with 3000 records.  So "Browse All" 
finds 3000 results... and that doesn't fit into a 65535-byte BLOB.  The 
database truncates the pickle, and behold, an obscure error at the next 
request.  I decided a larger session was ridiculous, and switched to 
storing record IDs only.  ("3000 ints is not an unreasonable size!")
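
A minimal sketch of the size-limit idea above, in Python 3.  All of the 
names here (max_pickle_size, max_session_size, check_session_capacity, 
SessionTooLarge) are assumptions made for illustration and are not part 
of any proposed interface.  The in-memory store only stands in for a 
real backend such as a 65535-byte BLOB column.

```python
import pickle


class SessionTooLarge(Exception):
    """Raised when a pickled session exceeds the store's byte limit."""


class MemorySessionStore:
    # Hypothetical store; max_pickle_size is None when there is no limit.
    max_pickle_size = None

    def __init__(self):
        self._data = {}

    def save(self, session_id, session):
        blob = pickle.dumps(session)
        limit = self.max_pickle_size
        if limit is not None and len(blob) > limit:
            # Fail loudly instead of letting the backend truncate the pickle.
            raise SessionTooLarge(
                "session %r is %d bytes; limit is %d"
                % (session_id, len(blob), limit))
        self._data[session_id] = blob

    def load(self, session_id):
        return pickle.loads(self._data[session_id])


class BlobSessionStore(MemorySessionStore):
    # Stands in for a store backed by a 65535-byte BLOB column.
    max_pickle_size = 65535


class SessionManager:
    def __init__(self, store):
        self.store = store

    @property
    def max_session_size(self):
        # Pass the store's limit through to the application.
        return self.store.max_pickle_size


def check_session_capacity(manager, needed_bytes):
    """Call at startup; raise if the store cannot hold needed_bytes."""
    limit = manager.max_session_size
    if limit is not None and limit < needed_bytes:
        raise RuntimeError(
            "session store holds %d bytes, application needs %d"
            % (limit, needed_bytes))
```

With a check like this the application fails at startup, or at save 
time, rather than at the next request with a truncated pickle.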

> I was thinking of pythonweb's "Store": 
> http://pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/node153.html 
>
>
> I vaguely suggest in the interface that each application should put 
> all of its data in a single key (based on the application name).  Now 
> I think that should be based on a unique name (not the application 
> name, because the application may exist multiple times in the 
> process), and maybe with an entirely different manager.


Oooh, I haven't seen PythonWeb before.  Well worth coordinating with, if 
feasible.  I wonder how hard it would be to port Quixote to 
PythonWeb....  Seriously, they have a database-independent interactive 
tool a la mysql/pgsql/sqlite3?  "Wow, that's nifty."  But a common 
templating front-end??  "Get real.  I choose a template engine for its 
unique features, not the lowest common denominator." 


>> Regarding sessionless persistence, that reminds me of a disagreement 
>> I had with Titus in designing session2.  Quixote provides 
>> Session.user default None, but doesn't define what other values it 
>> can have.  I put a full-fledged User object with 
>> username/group/permission info.  Titus puts a string name and stores 
>> everything else in his application database.  So his *SessionStore 
>> classes put the name in a VARCHAR column and didn't save the rest of 
>> the session data.  I argued that "most people will have a User 
>> object, and they'll expect the entire Session to be pickled because 
>> that's what PHP/Perl do."  He relented, so the current *SessionStores 
>> can be used either way.
>
>
> In the interface I suggest anything pickleable can go in a key.  This 
> requirement has been the source of some controversy in Webware, since 
> people wanted to put open file objects and such in the session; mostly 
> people coming from Java where apparently that's the norm.  Anyway, 
> it's still possible with this interface to have a store that never 
> pickles anything; I can just hope no one writes code they expect 
> anyone else to use that demands in-memory session storage.  Those are 
> lame even when you are using threads.
>
> I think the example shows one reason the session shouldn't be 
> considered a public API.  I think it's fine to put the username or the 
> user object in the session -- we can debate the pluses and minuses, 
> but it works -- but I think you should definitely wrap that 
> implementation detail in something else.  E.g., request.user should 
> return request.session['user'] or something.


I'm not sure what you mean.  There has to be a public API or the 
application can't use the session.  "Should I set an attribute or a key, 
or call a method?"  ("Coffee, tea, or milk?")  There is no 
request.user.  Quixote has a get_user() function but it translates to 
session.user.  That's actually request.session.user but you're supposed 
to pretend you don't know that.  Or are you saying that applications 
should not set attributes?  In that case we might as well use a 
dictionary as the official Session object, as Perl does.  Of course 
you'd have to put the metadata somewhere....
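
One possible reading of the quoted suggestion, sketched in Python 3: 
the application asks for request.user (or calls get_user()), and the 
fact that the value lives in the session stays an implementation 
detail.  The Request and Session classes below are hypothetical and 
are not Quixote's actual classes; get_user() here only mirrors the 
name of the Quixote function mentioned above.

```python
class Session(dict):
    """Plain dictionary session, as in the quoted suggestion."""


class Request:
    def __init__(self, session):
        self.session = session

    @property
    def user(self):
        # Callers see request.user; where it is stored stays private.
        return self.session.get("user")

    @user.setter
    def user(self, value):
        self.session["user"] = value


def get_user(request):
    """Function-style accessor over the same hidden storage."""
    return request.user
```

If the storage later moves out of the session, only the property body 
changes; application code that uses request.user is unaffected.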

-- Mike Orr <mso at oz.net>

From mike_mp at zzzcomputing.com  Wed Aug 17 18:52:16 2005
From: mike_mp at zzzcomputing.com (mike bayer)
Date: Wed, 17 Aug 2005 12:52:16 -0400 (EDT)
Subject: [Web-SIG] Session interface
In-Reply-To: <1124227706.16301.240828160@webmail.messagingengine.com>
References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>  
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>  
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>  
	<5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com>  
	<5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com>  
	<5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com>  
	<5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com>  
	<57167.66.192.34.8.1124226417.squirrel@66.192.34.8>
	<1124227706.16301.240828160@webmail.messagingengine.com>
Message-ID: <42618.66.192.34.8.1124297536.squirrel@66.192.34.8>

Jonathan Ellis said:
>
> Now that's an example of when I think sessions are a poor solution.  IMO
> caching objects from the database is the job for the, well, database
> object cache. :)
>
> They are similar but not identical.  For instance, while session data
> typically expires after a certain amount of time, permanent data should
> never expire unless invalidated by an update.
>

putting a few user preferences in the session instead of constructing
and/or installing a separate database caching system is cheating, but it's
a small cheat.  I think small cheats are fine to get a job done; the exact
specification and design of big architectural features are usually derived
from the set of small cheats they are replacing.

From jjinux at gmail.com  Wed Aug 17 20:34:34 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed, 17 Aug 2005 11:34:34 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com>
References: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com>
Message-ID: <c41f67b905081711342998f979@mail.gmail.com>

Wow!  I'm dumbfounded by this whole conversation!  I thought session
backends were something inane enough that we could agree on them!  I
have the same use cases as Geoffrey.  No, cookies are not a good
replacement for sessions since you have to validate them every time you
use them.  You can't trust them unless you encrypt and sign them, and
I wasn't aware that that many people were doing that.  Neither is
relying on a cookie to time out sufficient to control a session
timeout.  Clients lie.  Perhaps I have much to learn.  I'm going to
sit back and just read :-/
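
For reference, a minimal sketch of the "sign them" half of that point, 
using only the Python 3 standard library.  The helper names, the "|" 
separator and SECRET_KEY are assumptions for illustration.  This signs 
the value but does not encrypt it, and it does nothing about timeouts: 
the server would still need to track expiry itself, since the client 
controls the cookie lifetime.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; keep it server-side


def sign_value(value, key=SECRET_KEY):
    """Return 'value|hexdigest' so tampering can be detected later."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "%s|%s" % (value, digest)


def verify_value(signed, key=SECRET_KEY):
    """Return the original value, or None if the signature does not match."""
    value, sep, digest = signed.rpartition("|")
    if not sep:
        return None
    expected = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None
```

Every read of the cookie has to go through verify_value(), which is 
the "validate them every time" cost mentioned above.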

-jj

On 8/16/05, Geoffrey Talvola <gtalvola at nameconnector.com> wrote:
> Jonathan Ellis wrote:
> > Still, it can be good to have a simple place to store non-permanent
> > information.
> 
> For example...
> 
> I think a good use of sessions is in remembering selections that have been
> made earlier on.  For example, suppose you have a reporting application
> where you allow the user to select one or more items to report on from a
> list box, several filtering options in dropdowns or checkboxes, sorting and
> grouping behavior, etc.  You want to remember those settings so that if the
> user returns to the report selection page, their last selected settings are
> pre-selected.  But, unless the user chooses to save those settings as a
> "stored report", you'd like to forget the settings when the user logs out or
> when they close their browser.
> 
> Also, assume that your application already has this bundle of selections in
> the form of a Python object.
> 
> Isn't the cleanest, easiest, and more efficient way to handle this to simply
> save the Python object in a session variable?  In some cases, for example
> using Webware's in-memory sessions, for example, this data never has to be
> marshaled or leave the application server at all.
> 
> If I didn't have sessions, I think using either cookies or a back-end db
> would be more work, less clean, and less efficient in this case.
> 
> - Geoff
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/jjinux%40gmail.com
> 


-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From titus at caltech.edu  Wed Aug 17 21:05:26 2005
From: titus at caltech.edu (Titus Brown)
Date: Wed, 17 Aug 2005 12:05:26 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <c41f67b905081711342998f979@mail.gmail.com>
References: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com>
	<c41f67b905081711342998f979@mail.gmail.com>
Message-ID: <20050817190526.GH30939@caltech.edu>

-> Wow!  I'm dumbfounded by this whole conversation!  I thought session
-> backends were something innane enough that we could agree on them!  I
-> have the same use cases as Geoffrey.  No, cookies are not a good
-> replacement for sessions since you have to validate them everytime you
-> use them.  You can't trust them unless you encrypt and sign them, and
-> I wasn't aware that that many people were doing that.  Neither is
-> relying on a cookie to time out sufficient to control a session
-> timeout.  Clients lie.  Perhaps I have much to learn.  I'm going to
-> sit back and just read :-/

(What he said ;)

--titus

From jjinux at gmail.com  Wed Aug 17 21:17:03 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed, 17 Aug 2005 12:17:03 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
Message-ID: <c41f67b905081712174fb3dd34@mail.gmail.com>

> (And I'm also aware that "scaling down" is important, but the rule that all
> state goes either in the browser or the application DB scales down just as
> well as it scales up.)

What's wrong with storing serialized session state in the database?

-jj

-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From pje at telecommunity.com  Wed Aug 17 21:54:34 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 17 Aug 2005 15:54:34 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <c41f67b905081712174fb3dd34@mail.gmail.com>
References: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>

At 12:17 PM 8/17/2005 -0700, Shannon -jj Behrens wrote:
> > (And I'm also aware that "scaling down" is important, but the rule that all
> > state goes either in the browser or the application DB scales down just as
> > well as it scales up.)
>
>What's wrong with storing serialized session state in the database?

Nothing.  My point was that state either belongs to the client, or it 
belongs to the *application* database.  It's web-tier storage that forces 
you to do session affinity when scaling the number of web servers, and to 
deal with locking and other issues when scaling processes on a single web 
server.  The database tier is also the best place for persistent storage of 
users' data because it then reflects a *consistent* state with all the 
other application data.  If you restore it from a backup after a crash, the 
data is consistent.  Likewise, you only have one set of DBAs, and only one 
system to crashproof.  If you're building a system with a lot of users that 
causes somebody to lose thousands of dollars a minute when the system's 
down, you really want to minimize the number of moving parts, and have a 
relatively simple recovery strategy, in which "lose everybody's session 
data because we can't restore the DB and the session store to the same 
state" is not a recommended option.

Meanwhile, clients scale with the number of clients, so if you can get away 
with storing something client side, then that works great.  Most 
client-side storage I've done is for stuff that if the client fakes it, you 
really don't care.  If they fake their default reporting selections, for 
example, who cares?


From mike_mp at zzzcomputing.com  Thu Aug 18 00:25:19 2005
From: mike_mp at zzzcomputing.com (mike bayer)
Date: Wed, 17 Aug 2005 18:25:19 -0400 (EDT)
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
References: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
Message-ID: <61788.66.192.34.8.1124317519.squirrel@66.192.34.8>

Phillip J. Eby said:
>  My point was that state either belongs to the client, or it
> belongs to the *application* database.  It's web-tier storage that forces
> you to do session affinity when scaling the number of web servers, and to
> deal with locking and other issues when scaling processes on a single web
> server.  The database tier is also the best place for persistent storage
> of
> users' data because it then reflects a *consistent* state with all the
> other application data.

this is definitely the best approach for a big-time, multi-servered
architecture.  But even in this case, I think it's a good idea to approach
per-user-session state information with code that is conceptually aware of
it being session-scoped information...meaning even if all my state is in
the database, I'd still want to access state which is session-scoped via a
"session" API.  having a strong concept of "session scope" makes it easier
to model things like data caching for the right amount of time, user
interface flow, creating multi-step transactions, etc.

the point of the session API with the switchable backend is you can build
smaller applications and prototypes with file-based sessions and later
expand the backend to talk to a database.  an application should ideally
be able to put whatever is "session-scoped" into that session without
concern for size or efficiency....it's the backend's job to be ready for
it.

there is value in being able to use the concept of "sessions" without
having to create a specialized database schema every single time, despite
the fact that the specialized schema becomes necessary when you want to
scale up.
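
a rough sketch of what "session API with a switchable backend" could
look like, in Python 3.  the class names and the load/save methods are
assumptions for illustration, not a proposed standard.  DictBackend only
stands in for a database-backed store, and FileBackend assumes the
session ID is a safe token (e.g. hex digits) because it is used in a
file name.

```python
import os
import pickle


class FileBackend:
    """Stores each session as a pickle file in a directory."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, session_id):
        # Assumes session_id is a safe token, never raw client input.
        return os.path.join(self.directory, session_id + ".pickle")

    def load(self, session_id):
        try:
            with open(self._path(session_id), "rb") as f:
                return pickle.load(f)
        except FileNotFoundError:
            return {}

    def save(self, session_id, data):
        with open(self._path(session_id), "wb") as f:
            pickle.dump(data, f)


class DictBackend:
    """Stand-in for a database-backed store; keeps pickles in a dict."""

    def __init__(self):
        self._rows = {}

    def load(self, session_id):
        blob = self._rows.get(session_id)
        return pickle.loads(blob) if blob is not None else {}

    def save(self, session_id, data):
        self._rows[session_id] = pickle.dumps(data)


class Session(dict):
    """Dictionary-style session; the backend is chosen at construction."""

    def __init__(self, session_id, backend):
        super().__init__(backend.load(session_id))
        self.session_id = session_id
        self.backend = backend

    def save(self):
        self.backend.save(self.session_id, dict(self))
```

application code only touches Session, so moving from files to a
database means swapping the backend object and nothing else.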

- mike

From pje at telecommunity.com  Thu Aug 18 00:49:22 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 17 Aug 2005 18:49:22 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <61788.66.192.34.8.1124317519.squirrel@66.192.34.8>
References: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>

At 06:25 PM 8/17/2005 -0400, mike bayer wrote:
>But even in this case, I think its a good idea to approach
>per-user-session state information with code that is conceptually aware of
>it being session-scoped information...meaning even if all my state is in
>the database, id still want to access state which is session-scoped via a
>"session" API.  having a strong concept of "session scope" makes it easier
>to model things like data caching for the right amount of time, user
>interface flow, creating multi-step transactions, etc.

That really hasn't been my experience.  Partly, this is because I tend to 
use RESTful approaches that put 99% of all statefulness in the 
browser.  For example, if I have a multi-page form, I embed all the 
previous pages' data as hidden fields on the subsequent pages.  The entire 
form is processed by a single validation routine, so it doesn't matter what 
the client sends or in what order, because as soon as all the data is both 
present and valid, the form is done.  Similarly, the vast majority of UI 
flow is easiest to model as URL-per-state, so that the browser is in charge 
of the flow, and the back button works.
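
A minimal sketch of that single validation routine, in Python 3.  The 
field names and checks are made up for illustration; the point is only 
that every request is validated against the whole form, whatever page 
it came from, and that earlier answers travel along as hidden fields.

```python
from html import escape

# Hypothetical three-page form; each field maps to a validity check.
REQUIRED_FIELDS = {
    "name": lambda v: bool(v.strip()),
    "email": lambda v: "@" in v,
    "quantity": lambda v: v.isdigit() and int(v) > 0,
}


def validate_form(submitted):
    """Check every field on every request.

    Returns (missing, invalid); the form is done when both are empty.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in submitted]
    invalid = [name for name, check in REQUIRED_FIELDS.items()
               if name in submitted and not check(submitted[name])]
    return missing, invalid


def hidden_fields(submitted):
    """Render the data collected so far as hidden inputs for the next page."""
    return "\n".join(
        '<input type="hidden" name="%s" value="%s">'
        % (escape(name, quote=True), escape(value, quote=True))
        for name, value in sorted(submitted.items()))
```

Because the hidden fields come back from the client, they are 
re-validated on every submission; nothing about the flow is trusted.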

As for caching, that's something that you tune when you have to tune it, 
for whatever you're tuning it for.  And that's on the basis of what type of 
object you're persisting.  Note that if you have a Cart type, let's say, 
then you don't really have a case where some Carts are session-specific and 
some are not!  Session-like behavior is inherent in the object types 
involved, so there's no real benefit to creating a secondary classification 
scheme for session scope.  The only session API I need in that case is:

     cart = get_cart(get_cart_id(request))

And since the cart is just another persistent application object, it's part 
of the same transaction, and I have nothing else to mess around with.
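
One way those two functions might be filled in, as a Python 3 sketch 
under stated assumptions: the request is taken to be a WSGI environ 
dict, the cart ID is assumed to arrive in a cookie named "cart_id", 
and the _CARTS dict stands in for the application database.  None of 
these details come from a real toolkit.

```python
from http.cookies import SimpleCookie

_CARTS = {}  # stand-in for the application database, keyed by cart ID


class Cart:
    def __init__(self, cart_id):
        self.cart_id = cart_id
        self.items = []


def get_cart_id(environ):
    """Read the cart ID from a 'cart_id' cookie in a WSGI environ, or None."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get("cart_id")
    return morsel.value if morsel is not None else None


def get_cart(cart_id):
    """Load the cart like any other persistent object; create it if absent."""
    if cart_id is None:
        return None
    if cart_id not in _CARTS:
        _CARTS[cart_id] = Cart(cart_id)
    return _CARTS[cart_id]
```

The client holds only the ID; the cart itself is ordinary application 
data and needs no separate session store.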

You also mentioned prototyping, but a good object persistence toolkit 
shouldn't be tied strictly to SQL; you ought to be able to plug in a 
"pickle all the data to disk" mode and use it for *all* your application 
data, not just the session-specific objects.


From fumanchu at amor.org  Thu Aug 18 01:05:17 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Wed, 17 Aug 2005 16:05:17 -0700
Subject: [Web-SIG] and now for something completely different!
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727DB@exchange.hqamor.amorhq.net>

Phillip J. Eby wrote:
> You also mentioned prototyping, but a good object persistence toolkit 
> shouldn't be tied strictly to SQL; you ought to be able to plug in a 
> "pickle all the data to disk" mode and use it for *all* your 
> application data, not just the session-specific objects.

And for extra points, a good object-persistence toolkit should let you
put some data into a DB and some into shelve and leave some in RAM. You
pick.
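
A small sketch of the "you pick" idea in Python 3, using the standard 
shelve module for the on-disk case.  The Persistence, RamStore and 
ShelveStore names and the put/get/route methods are hypothetical, and 
a database-backed store would simply be a third class with the same 
two methods.

```python
import shelve


class RamStore:
    """Keeps objects in memory only."""

    def __init__(self):
        self._objects = {}

    def put(self, key, obj):
        self._objects[key] = obj

    def get(self, key):
        return self._objects[key]


class ShelveStore:
    """Persists pickled objects to disk with the stdlib shelve module."""

    def __init__(self, path):
        self.path = path

    def put(self, key, obj):
        with shelve.open(self.path) as db:
            db[key] = obj

    def get(self, key):
        with shelve.open(self.path) as db:
            return db[key]


class Persistence:
    """Routes each kind of object to whichever store was picked for it."""

    def __init__(self, default):
        self.default = default
        self.routes = {}

    def route(self, kind, store):
        self.routes[kind] = store

    def store_for(self, kind):
        return self.routes.get(kind, self.default)

    def put(self, kind, key, obj):
        self.store_for(kind).put(key, obj)

    def get(self, kind, key):
        return self.store_for(kind).get(key)
```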

Oh, but this is web-sig, not db-sig. ;)


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From jjinux at gmail.com  Thu Aug 18 03:08:04 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed, 17 Aug 2005 18:08:04 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<61788.66.192.34.8.1124317519.squirrel@66.192.34.8>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
Message-ID: <c41f67b905081718085accacf3@mail.gmail.com>

I checked with a bunch of "really smart people" who are familiar with
a variety of Web technologies.  I was worried that this idea "sessions
are considered evil" was widespread, and I didn't know about it. 
Apparently, that is not the case.  Phillip, I'm not discounting your
opinion or even arguing against it, but apparently, the entire world
didn't decide to start hating session scope behind my back ;)

/me giggles

-jj

On 8/17/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 06:25 PM 8/17/2005 -0400, mike bayer wrote:
> >But even in this case, I think its a good idea to approach
> >per-user-session state information with code that is conceptually aware of
> >it being session-scoped information...meaning even if all my state is in
> >the database, id still want to access state which is session-scoped via a
> >"session" API.  having a strong concept of "session scope" makes it easier
> >to model things like data caching for the right amount of time, user
> >interface flow, creating multi-step transactions, etc.
> 
> That really hasn't been my experience.  Partly, this is because I tend to
> use RESTful approaches that put 99% of all statefulness in the
> browser.  For example, if I have a multi-page form, I embed all the
> previous pages' data as hidden fields on the subsequent pages.  The entire
> form is processed by a single validation routine, so it doesn't matter what
> the client sends or in what order, because as soon as all the data is both
> present and valid, the form is done.  Similarly, the vast majority of UI
> flow is easiest to model as URL-per-state, so that the browser is in charge
> of the flow, and the back button works.
> 
> As for caching, that's something that you tune when you have to tune it,
> for whatever you're tuning it for.  And that's on the basis of what type of
> object you're persisting.  Note that if you have a Cart type, let's say,
> then you don't really have a case where some Carts are session-specific and
> some are not!  Session-like behavior is inherent in the object types
> involved, so there's no real benefit to creating a secondary classification
> scheme for session scope.  The only session API I need in that case is:
> 
>      cart = get_cart(get_cart_id(request))
> 
> And since the cart is just another persistent application object, it's part
> of the same transaction, and I have nothing else to mess around with.
> 
> You also mentioned prototyping, but a good object persistence toolkit
> shouldn't be tied strictly to SQL; you ought to be able to plug in a
> "pickle all the data to disk" mode and use it for *all* your application
> data, not just the session-specific objects.
> 
> 


-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From pje at telecommunity.com  Thu Aug 18 03:32:45 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 17 Aug 2005 21:32:45 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <c41f67b905081718085accacf3@mail.gmail.com>
References: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<61788.66.192.34.8.1124317519.squirrel@66.192.34.8>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com>

At 06:08 PM 8/17/2005 -0700, Shannon -jj Behrens wrote:
>I checked with a bunch of "really smart people" who are familiar with
>a variety of Web technologies.  I was worried that this idea "sessions
>are considered evil" was widespread, and I didn't know about it.

Sadly, it's not widespread, any more than RESTful applications are, or 
object-publishing, or any of the other "the way the web was won" approaches 
to web applications.  In the Java world, for example, it's just assumed 
that you have to apply tons of resources and trickery to scale your 
sessions, because that's just How Things Are.

The reason it's How Things Are in Java-land is because Java made sessions 
part of their servlet and other specs right from the start -- a serious 
error that I was hoping we could avoid in Python-land.  At least PHP gives 
you session management hooks that make it easy to put session data in the 
application database!

It is, however, becoming gradually known in Java-land that the "physical 
three-tier model" is insane, and IMO that model is fairly closely related 
to the idea that you should store sessions in the web tier.  I'd guess it's 
going to be a couple more years before "web tier sessions considered 
harmful" is known by any but the most cynical
veterans of building high-volume, database-intensive applications, though.

To be precise, what I object to are:

1. Web-tier sessions that store application data in a different database 
that may or may not be backed up, and may or may not even be a "decent" 
database

2. "bag of data" sessions that encourage people to throw arbitrary objects 
in there without thinking about what the information's real lifetime 
is.  (If it's a preference, you want it to either persist on the client or 
the server, permanently.  If it's credentials, you want it to time out on 
the client.  If it's application state, you really need it in your database 
for integrity/synchronization reasons.  If it's transient state like a 
status message, it doesn't belong in the DB, it belongs on the client.  And 
so on.)

So, given these principles, I don't see much need for a session manager 
besides client-state management, and a good O-R mapper.  If you have those, 
then the rest is trivial.


From ianb at colorstudy.com  Thu Aug 18 04:16:32 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 17 Aug 2005 21:16:32 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com>
References: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>	<4301124C.7040708@colorstudy.com>	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>	<61788.66.192.34.8.1124317519.squirrel@66.192.34.8>	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com>
Message-ID: <4303EF80.2060706@colorstudy.com>

Phillip J. Eby wrote:
> The reason it's How Things Are in Java-land is because Java made sessions 
> part of their servlet and other specs right from the start -- a serious 
> error that I was hoping we could avoid in Python-land.

Too late; all the major (and even all the minor) Python web programming 
environments have sessions.


> At least PHP gives 
> you session management hooks that make it easy to put session data in the 
> application database!

That shouldn't be hard here either.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From mike_mp at zzzcomputing.com  Thu Aug 18 04:33:09 2005
From: mike_mp at zzzcomputing.com (michael bayer)
Date: Wed, 17 Aug 2005 22:33:09 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
References: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
Message-ID: <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com>


On Aug 17, 2005, at 6:49 PM, Phillip J. Eby wrote:

> That really hasn't been my experience.  Partly, this is because I  
> tend to use RESTful approaches that put 99% of all statefulness in  
> the browser.  For example, if I have a multi-page form, I embed all  
> the previous pages' data as hidden fields on the subsequent pages.   
> The entire form is processed by a single validation routine, so it  
> doesn't matter what the client sends or in what order, because as  
> soon as all the data is both present and valid, the form is done.   
> Similarly, the vast majority of UI flow is easiest to model as URL- 
> per-state, so that the browser is in charge of the flow, and the  
> back button works.

it's usually not my experience either, and I have rarely written any  
kind of app that uses sessions.  99% of everything I've done relies  
upon browser state as well.  although despite my being there "when  
the web was won" in 95, I am hesitant to call myself a RESTful  
developer...to me, REST seems to be taking some common sense ideas  
and turning them into some kind of rigid ideological crusade, which  
is just as bad as all the other ideological crusades we "web winners"  
had to fight with IIS and active server pages, EJB, UML, SOAP, etc.

the app I work on is a document management system where users have to  
edit large sets of fields, and do a lot of reloading in order to load  
in new sections of the document or save various subsets of data.  It's  
been running and being expanded regularly for several years, and it  
does it all using client-state only, but it has begun to outgrow that  
approach; it would be much more succinctly written storing the user's  
current workspace within something that at least conceptually is a  
"session".  it would also allow popups, IFRAMES and future Ajax  
controls to all access the same user-workspace without having to  
perform vast Javascript gymnastics (which it does right now).

a document editing system is also a good example of where objects  
need to be persisted in two different scopes, i.e. a session-scope as  
well as a permanent scope.   I don't really think a session has
anything to do with a "physical three-tiered model".   physically, it
can be wherever you want.  I just think it's advantageous from a
conceptual point of view.



From pje at telecommunity.com  Thu Aug 18 04:51:35 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 17 Aug 2005 22:51:35 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <4303EF80.2060706@colorstudy.com>
References: <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<61788.66.192.34.8.1124317519.squirrel@66.192.34.8>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050817223310.01b307e8@mail.telecommunity.com>

At 09:16 PM 8/17/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>The reason it's How Things Are in Java-land is because Java made sessions 
>>part of their servlet and other specs right from the start -- a serious 
>>error that I was hoping we could avoid in Python-land.
>
>Too late; all the major (and even all the minor) Python web programming 
>environments have sessions.

I seem to recall that it's part of the Java servlets *specification*, 
whereas we did manage to avoid that trap in WSGI.  :)


>>At least PHP gives you session management hooks that make it easy to put 
>>session data in the application database!
>
>That shouldn't be hard here either.

Yep.  That's why I was pushing for standardizing that part separately from 
any actual storage facility, and for having good ways of managing the 
client-side state, which every "session" facility needs.

If client-side state management turns out to be more library than framework 
or spec, so be it; we can nominate it for stdlib inclusion in 2.5, and it's 
one less thing for people to think about.  "Boring" in this case is a good 
thing, it means we have a solved problem.  :)

What I *don't* want to standardize is the "bag of persistent objects" 
session interface as the primary way of accessing session data; I'd rather 
make the client key <-> retrieval aspect explicit, so that it's clear that 
you can totally choose how that links up, e.g.:

       session_id = get_client_state(env, 'session.id', new_hook, timeout_hook)
       my_bag_of_junk = session_store[session_id]

To put it another way: I'd like to distinguish "session variables" 
(client-side values) from "session objects" (server-side objects), and make 
the boundary between them very clear in the API.  That doesn't mean a 
session store can't offer a shortcut API, but hopefully the standardization 
of session object stores is *in terms of* the session variables API, so 
that e.g. the callbacks you need are the same, you still specify somewhere 
what session variable you'll use, etc.
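
A minimal sketch of that boundary, in Python 3 for runnability.  Everything
here is an assumption made for illustration: the message only shows the call
``get_client_state(env, 'session.id', new_hook, timeout_hook)``, so the
cookie-based lookup, the hook semantics, and the dict standing in for
``session_store`` are hypothetical.

```python
from http.cookies import SimpleCookie
import secrets


def get_client_state(environ, name, new_hook, timeout_hook=None):
    """Return the client-side value stored under `name` (hypothetical helper).

    Assumption: the value travels in a cookie.  When the cookie is missing,
    `new_hook` is called to mint a fresh value.  `timeout_hook` is accepted
    only to mirror the call shown in the message; this sketch never calls it.
    """
    cookie = SimpleCookie(environ.get('HTTP_COOKIE', ''))
    if name in cookie:
        return cookie[name].value
    return new_hook()


# Stand-in for any server-side store keyed by the session variable.
session_store = {}


def handle(environ):
    # Step 1: the "session variable" (client-side value).
    session_id = get_client_state(environ, 'session.id', secrets.token_urlsafe)
    # Step 2: the "session object" (server-side), looked up explicitly.
    return session_store.setdefault(session_id, {})
```

A real version would also have to send the new value back to the client
(for example in a Set-Cookie header), which this sketch leaves out.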

Note too that focusing our effort at this API level lets us address 
"interesting" problems such as when redirection is needed to start a 
session, when we need to replace page content to notify that a session has 
timed out, etc.  These are all client-state management problems and nothing 
to do with the persistence question, but are more interesting problems to 
solve (IMO) than re-solving the same old object persistence problems over 
and over again.


From pje at telecommunity.com  Thu Aug 18 05:00:49 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 17 Aug 2005 23:00:49 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com>
References: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>

At 10:33 PM 8/17/2005 -0400, michael bayer wrote:
>it's usually not my experience either, and I have rarely written any kind 
>of app that uses sessions.  99% of everything I've done relies upon 
>browser state as well.  although despite my being there "when the web was 
>won" in '95, I am hesitant to call myself a RESTful developer...to me, REST 
>seems to be taking some common sense ideas and turning them into some kind 
>of rigid ideological crusade, which is just as bad as all the other 
>ideological crusades we "web winners" had to fight with IIS and active 
>server pages, EJB, UML, SOAP, etc.

I agree; I just find it useful to use the REST banner because before that 
word came around, there was nothing to call the approach.  I'm a pragmatic 
RESTee in that browsers don't do PUT and DELETE so POST is pretty much what 
we have to work with for human-usable applications today.


>a document editing system is also a good example of where objects need to 
>be persisted in two different scopes, i.e. a session-scope as well as a 
>permanent scope.   I don't really think a session has anything to do with a 
>"physical three-tiered model".   physically, it can be wherever you 
>want.  I just think it's advantageous from a conceptual point of view.

I don't object to server-side objects that are session-specific; I object 
to the "bag of arbitrary objects" session interface, that is typically 
stored in a web tier or middle tier.  Those are two distinct sins that are 
usually coupled in what most people think of as "a session".  When I say I 
consider sessions harmful, it's specifically those two characteristics of 
the common meaning of the term.  I'm not saying that I think there's no 
such thing as a "session" in the sense of a browsing session.  Shopping 
carts would be pretty hard to do, for example, without session-specific 
server-side objects.  I just think that storing the shopping cart data in 
anything other than your application database is almost certainly a Very 
Bad Idea.
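
As a rough illustration of keeping cart data in the application database,
keyed by the session identifier, here is a sketch using the standard
library's sqlite3 module (Python 3).  The table name, column names and
helper functions are invented for the example; they are not taken from any
framework discussed in this thread.

```python
import sqlite3

# An in-memory database stands in for the application database.
db = sqlite3.connect(':memory:')
db.execute("""CREATE TABLE cart_items (
    session_id TEXT NOT NULL,
    sku        TEXT NOT NULL,
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (session_id, sku))""")


def add_to_cart(session_id, sku, quantity=1):
    """Add `quantity` of `sku` to the cart belonging to `session_id`."""
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "UPDATE cart_items SET quantity = quantity + ? "
            "WHERE session_id = ? AND sku = ?",
            (quantity, session_id, sku))
        if cur.rowcount == 0:
            db.execute("INSERT INTO cart_items VALUES (?, ?, ?)",
                       (session_id, sku, quantity))


def cart_contents(session_id):
    """Return {sku: quantity} for one session's cart."""
    rows = db.execute(
        "SELECT sku, quantity FROM cart_items "
        "WHERE session_id = ? ORDER BY sku",
        (session_id,))
    return dict(rows.fetchall())
```

The cart is ordinary application data here; the only session-specific part
is the key used to find it.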


From ianb at colorstudy.com  Thu Aug 18 05:21:41 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 17 Aug 2005 22:21:41 -0500
Subject: [Web-SIG] Session interface, v2
Message-ID: <4303FEC5.3050408@colorstudy.com>

Same location:

http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py

This version separates out SessionManager from SessionStore, and 
suggests that managers be per-application (or maybe per-framework).  I 
also expanded docstrings and made a bunch of other changes.  Open questions are 
marked with ???.  I'm also copying the interface below (example at the 
bottom):

class SessionError(Exception):
     pass

class InvalidSession(SessionError):
     """
     Raised when an invalid session ID is used.
     """

class SessionNotFound(SessionError, LookupError):
     """
     Raised when a session can't be found.
     """

class ConflictError(SessionError):
     """
     Raised when the ``locking_policy`` is ``optimistic``, and a
     session being saved is stale.
     """

def create_session_id():
     """Return a unique session ID (an ASCII string).

     This string must be made up of a-zA-Z0-9_-.

     ???: Should we allow hints, like ``REMOTE_ADDR``?
     """

class ISessionListener:

     """
     Objects with this interface can be appended to the ``listeners``
     attribute of a session manager or session.
     """

     def create_session(session_store, new_session):
         """Called when a new session is created.
         """

     def delete_session(session_store, session_id):
         """Called before a session is deleted.

         This can load the session; this will not affect the ultimate
         deletion of the session.
         """

     def rollback(session_store, session):
         """Called before a session is abandoned via .rollback()"""

class ISessionManager:

     """
     The session manager represents policy related to sessions;
     expiration, collection, locking.  It also typically belongs to one
     'application', and ties together the session store with the
     session objects.
     """

     id = """The string-identifier for this session manager.

     All applications that share this session manager need to use the
     same id when creating the session manager.

     This string should be made up of a-zA-Z0-9_.-
     """

     locking_policy = """The lock policy.

     This is one of these strings:

     ``'optimistic'``:
       Optimistic locking; concurrent sessions may be opened for writing;
       however, if a session is saved that was loaded before the last save
       of the session, a ConflictError will be raised.

     ``'lossy'``:
       First-come-first-serve.  No locking is done; if a session is written
       it overwrites any other session data that was written.

     ``'serialized'``:
       All sessions opened for writing are serialized; the request is
       blocked until session is available to be opened.
     """

     session_factory = """A callable to produce sessions

     This should be a class or object like ``ISession``.
     """

     listeners = """A list of ISessionListeners.

     When certain events happen, a method on every object in this list
     will be called.
     """

     store = """An ISessionStore"""

     def __init__(id, store, session_factory, locking_policy='lossy'):
         """Initialize the variables

         ???: Does ``__init__`` need to be standardized?
         """


     def load_session(id):
         """Return the session from the given ID.

         This method may block if ``locking_policy`` is ``'serialized'``.

         ???: Does this always return a new session object?  I think it
         shouldn't.
         """

     def load_session_read_only(id):
         """Return a read-only version of the session.

         Read-only sessions do not need to be locked as aggressively.
         Also, loading a read-only session will not update its
         last-accessed time, so you may use this to peek at sessions.

         This cannot ensure that the values stored in the session are
         immutable, so it is very possible that you could make implicit
         changes to the session object and then they will be thrown
         away.
         """

     def create_session(id=None):
         """Create a new session object for the given id.

         If ``id`` is None then a new id will be generated.

         This will call ``create_session(session_store, new_session)``
         on each listener.
         """

     def save_session(session):
         """Save the given session.

         This may raise a ``ConflictError``
         """

     def unlock_session(session):
         """If the session store is locked for any reason, unlock it.

         It is not an error if no lock exists on the session.
         ``save_session()`` implies ``unlock_session()``.

         This method makes the session obsolete.
         """

     def delete_session(id):
         """Delete the given session.

         This is given the id of the session, not the session object
         itself.

         This calls ``delete_session(session_store, session_id)`` on
         each listener.
         """

     def delete_expired_sessions():
         """Scan for and delete any expired sessions.

         ???: How are sessions defined to be expired?  Should listeners
         participate?  Should they be able to cancel an expiration?
         """

     def session_ids():
         """Return a list of session IDs.

         ???: Should this return other metadata, like last accessed
         time?
         """

     def last_accessed(id):
         """The integer timestamp when the identified session was last
         accessed.

         Loading the session read-only does not update this value, only
         writing or calling ``touch()``
         """

     def last_written(id):
         """The integer timestamp when the session was last written to
         """

     def touch(id):
         """Update the session's last_accessed time.
         """

class ISession:

     id = """The string (str, not unicode) ID of this session"""

     manager = """Reference to parent ISessionManager object"""

     read_only = """Boolean, if this session was loaded read-only"""

     last_accessed = """Last access integer timestamp"""

     creation_time = """Creation integer timestamp"""

     loaded_timestamp = """Integer timestamp when session was loaded

     If the session manager's ``locking_policy`` is ``optimistic``, when the
     session is saved if the ``last_written`` time is later than this time
     a ``ConflictError`` will be raised.
     """

     obsolete = """
     Boolean; true if this session object has been deleted.  All
     other methods should fail once this is true.  This attribute
     is writable."""

     listeners = """A list of ISessionListener instances"""

     data = """The data being stored.

     This should be pickleable.  The other instance variables are
     metadata, and are not saved as the 'body' of the session; only
     this data is.

     Typically this is a dictionary-like object; however, if you want
     application-specific storage this object could have a specific
     interface, so long as your session store understands how to save
     it.

     ???: Should there be some way to identify this kind of
     tightly-bound-to-storage session data from free-form (like a
     dictionary) session data?  If there was, then application-specific
     storage could use something custom for its sessions, but fall back
     on something more generic (e.g., pickle and stuff the string
     somewhere) for other sessions.
     """

     # ???: Should the expire time be overloadable on a per-session
     # basis?  If listeners can cancel the expiration, then this can be
     # done in an ad hoc way

     # ???: Should there be a way of marking the session "dirty"?  Maybe
     # some soft version of a hash should be kept to detect changes?  (a
     # hash that could hash mutable objects)

     def __init__(id, manager, read_only, last_accessed, creation_time,
                  data):
         """Create the session object

         If the session is new, then ``data`` will be None; otherwise it
         will contain the unpickled data.
         """

     def __getitem__(name):
         """Return the object by the given name."""

     def __setitem__(name, value):
         """Add or overwrite the named object.

         The object should be pickleable.
         """

     def __delitem__(name):
         """Delete the named object."""

     def touch():
         """Update the session's last_accessed time.

         Typically just calls ``self.manager.touch(self.id)``
         """

     def commit():
         """Calls ``self.manager.save_session(self)``
         """

     def rollback():
         """Calls ``self.manager.unlock_session(self)``.

         Also calls ``rollback(session_store, self)`` on each
         listener.
         """

class ISessionStore:

     """
     This is responsible for storing sessions.
     """

     def save_session(session):
         """Save the session

         This uses both ``session.id`` and ``session.manager.id`` to
         save the session.
         """

     def load_session(session_store_id, session_id, read_only,
                      session_factory):
         """Load the session"""

     def session_ids(session_store_id):
         """Returns a list of session IDs

         ???: Plus other metadata?
         """

     def delete_session(session_store_id, session_id):
         """Delete the session"""

     def touch(session_store_id, session_id):
         """Update the last accessed time for the session"""

     def write_lock_session(session_store_id, session_id):
         """Lock the session for writing

         ???: Should there be a way of loading a session without
         blocking on a lock (e.g., getting an exception when trying to
         load a locked session)?
         """


"""
Example usage::

     session_store = (create or identify from configuration)

     # This is in a typical web framework...

     def get_session(request):
         session_id = request.get_cookie('session_id')
         if session_id is None:
             session_id = create_session_id()
             request.response.set_cookie('session_id', session_id)
         session_manager = get_session_manager(request)
         session = session_manager.load_session(session_id)
         # A callback to be run when the request has been finished:
         request.run_when_done(session_store.save_session, session)
         return session

     def get_session_manager(request):
         # The application id should be unique to this instance of the
         # application.  But if you don't mind being a little sloppy
         # you could use the framework name here (that would make it
         # possible for an application to clobber the session variables
         # from another application).
         appid = get_app_id(request)
         session_manager = SessionManager(
             appid, get_session_store(request), MySessionClass)
         return session_manager

     def get_session_store(request):
         return request.environ['session.store']

     class MySessionClass(UserDict):
         def __init__(self, id, manager, read_only, last_accessed,
                      creation_time, data):
             self.id = id
             self.manager = manager
             self.read_only = read_only
             self.last_accessed = last_accessed
             self.creation_time = creation_time
             if data is None:
                 data = {}
             self.data = data

"""

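To make the ``'optimistic'`` and ``'lossy'`` policies described in the
interface concrete, here is a small in-memory sketch (Python 3).  It is not
an implementation of the interface: ``MemoryManager`` and ``MemorySession``
are hypothetical names, and a version counter is swapped in for the
``loaded_timestamp`` / ``last_written`` comparison, since integer timestamps
can tie within the same second.

```python
import copy
import itertools


class ConflictError(Exception):
    """Raised when a stale session is saved under optimistic locking."""


class MemorySession:
    def __init__(self, id, data, loaded_version):
        self.id = id
        self.data = data
        # Version of the stored data at the moment this copy was loaded.
        self.loaded_version = loaded_version


class MemoryManager:
    """Toy manager supporting the 'optimistic' and 'lossy' policies only."""

    def __init__(self, locking_policy='lossy'):
        self.locking_policy = locking_policy
        self._data = {}                      # session id -> (version, data)
        self._versions = itertools.count(1)  # monotonically increasing

    def load_session(self, id):
        version, data = self._data.get(id, (0, {}))
        # Hand out a private copy so concurrent loads do not share state.
        return MemorySession(id, copy.deepcopy(data), version)

    def save_session(self, session):
        current_version, _ = self._data.get(session.id, (0, {}))
        if (self.locking_policy == 'optimistic'
                and current_version != session.loaded_version):
            # Someone else saved after this copy was loaded.
            raise ConflictError(session.id)
        new_version = next(self._versions)
        self._data[session.id] = (new_version, copy.deepcopy(session.data))
        session.loaded_version = new_version
```

With ``'lossy'`` the second of two concurrent saves silently wins; with
``'optimistic'`` it raises ``ConflictError`` and the caller decides whether
to reload and retry.
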
From renesd at gmail.com  Thu Aug 18 05:43:38 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Thu, 18 Aug 2005 13:43:38 +1000
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com>
	<5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>
Message-ID: <64ddb72c05081720434ff0868d@mail.gmail.com>

Some more requirements for sessions can be found on the PHP page on sessions.

Hash function declaration:
   Choosing e.g. md5/sha.  Also, by using a distributed hash function you
can easily route the request to a specific web server.  So with one
rewrite rule you can have your scalable sessions/session affinity.
The function could simply prepend a number 1-100 to the session
id, which relates to a particular webserver.

Tag rewriting:
    i.e. which tags to do rewriting in, e.g. where it appends
?SESSIONID=ABCFED938743523 to your output HTML.


url_rewriter.tags  string

    url_rewriter.tags specifies which HTML tags are rewritten to
include session id if transparent sid support is enabled. Defaults to
a=href,area=href,frame=src,input=src,form=fakeentry,fieldset=


http://php.net/session

... and now for all the arguments pro Session rolled up into one paragraph.

Taking load off the database server (with sessions) is a way to make an
application more scalable.  Often the database server is the
bottleneck of the web app.  Being able to move some load to the
client, or the webservers, is a good option to have.  Being able to not
use 2 tiers is also what people may want.  In this way sessions allow
you to scale up, and down.  Sessions allow you to do a lot of jobs
which databases are not needed for.  Sessions are also more reliable
and secure than cookies.  Cookies may not be enabled on the browser,
and storing some stuff on the client side in the clear, or even
encrypted, is dangerous.  Sessions are understood by a large number of
PHP/Java/Perl/Apache people.  Lots of the Python web frameworks have
implemented sessions too.  This means sessions will be used.  So
making a good working implementation of sessions that everyone can
share would be double plus good.


On 8/18/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:33 PM 8/17/2005 -0400, michael bayer wrote:
> >it's usually not my experience either, and I have rarely written any kind
> >of app that uses sessions.  99% of everything I've done relies upon
> >browser state as well.  although despite my being there "when the web was
> >won" in '95, I am hesitant to call myself a RESTful developer...to me, REST
> >seems to be taking some common sense ideas and turning them into some kind
> >of rigid ideological crusade, which is just as bad as all the other
> >ideological crusades we "web winners" had to fight with IIS and active
> >server pages, EJB, UML, SOAP, etc.
> 
> I agree; I just find it useful to use the REST banner because before that
> word came around, there was nothing to call the approach.  I'm a pragmatic
> RESTee in that browsers don't do PUT and DELETE so POST is pretty much what
> we have to work with for human-usable applications today.
> 
> 
> >a document editing system is also a good example of where objects need to
> >be persisted in two different scopes, i.e. a session-scope as well as a
> >permanent scope.   I don't really think a session has anything to do with a
> >"physical three-tiered model".   physically, it can be wherever you
> >want.  I just think it's advantageous from a conceptual point of view.
> 
> I don't object to server-side objects that are session-specific; I object
> to the "bag of arbitrary objects" session interface, that is typically
> stored in a web tier or middle tier.  Those are two distinct sins that are
> usually coupled in what most people think of as "a session".  When I say I
> consider sessions harmful, it's specifically those two characteristics of
> the common meaning of the term.  I'm not saying that I think there's no
> such thing as a "session" in the sense of a browsing session.  Shopping
> carts would be pretty hard to do, for example, without session-specific
> server-side objects.  I just think that storing the shopping cart data in
> anything other than your application database is almost certainly a Very
> Bad Idea.
>

From floydophone at gmail.com  Thu Aug 18 06:03:02 2005
From: floydophone at gmail.com (Peter Hunt)
Date: Thu, 18 Aug 2005 00:03:02 -0400
Subject: [Web-SIG] and now for something completely different!
Message-ID: <6654eac40508172103342ad54a@mail.gmail.com>

Phillip -

I agree with you on all counts, except for the issue of how to determine 
when a session ends (timeouts, etc), and how to clean up the associated 
objects (Carts etc) with them.

Peter Hunt

From fumanchu at amor.org  Thu Aug 18 07:35:13 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Wed, 17 Aug 2005 22:35:13 -0700
Subject: [Web-SIG] and now for something completely different!
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727DC@exchange.hqamor.amorhq.net>

Phillip J. Eby wrote:
> I'm a pragmatic RESTee in that browsers don't do PUT
> and DELETE, so POST is pretty much what we have to
> work with for human-usable applications today.

Unless you can rely on XMLHttpRequest, which supports arbitrary methods
(which is why I made CP 2.1 fully support all HTTP methods).
Fortunately, I'm currently in a position where I can do that. ;)


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From chrism at plope.com  Thu Aug 18 07:41:51 2005
From: chrism at plope.com (Chris McDonough)
Date: Thu, 18 Aug 2005 01:41:51 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <64ddb72c05081720434ff0868d@mail.gmail.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com>
	<5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>
	<64ddb72c05081720434ff0868d@mail.gmail.com>
Message-ID: <1124343711.31954.26.camel@plope.dyndns.org>

On Thu, 2005-08-18 at 13:43 +1000, Rene Dudfield wrote:

> ... and now for all the arguments pro Session rolled up into one paragraph.
> 
> Taking load off the database server(with sessions) is a way to make an
> application more scalable.  

In my experience, they can make applications less scalable because
typically people don't need to know much about how sessions work so they
tend to overuse them without understanding their cost.  Very general
persistent session implementations that serialize object data into a
blob are typically even more expensive than simple relational database
row reads and writes, too.  This cost is amplified by session ease of
use.

> Often the database server is the
> bottleneck of the web app.  Being able to move some load to the
> client, or the webservers is a good option to have.

This is probably true for a lot of folks but my web apps are almost
always CPU bound at the web/application server.  I wish I had the
database-too-slow problem.

>   Being able to not
> use 2 tiers is also what people may want.  In this way sessions allow
> you to scale up, and down.  Sessions allow you to do a lot of jobs
> which databases are not needed for.

Typically persistent sessions are backed by some sort of database
anyway.  It's just that they're craftily coded in such a way that you
typically don't need to know much about it.

>   Sessions are also more reliable,
> and secure than cookies.
>  Cookies may not be enabled on the browser,

The most common way of enabling sessions is via cookies, and whether
sessions work reliably or not is often contingent on cookies.  Formvar
or URL-encoded session identifiers tend to be hit-and-miss and much
harder to maintain across pages.

> and storing some stuff on the client side in the clear, or even
> encrypted is dangerous.

I agree.  At least it's harder to get right.

>   Sessions are understood by a large amount of
> php/java/perl/apache people.  Lots of the python web frameworks have
> implemented sessions too.  This means sessions will be used.  So
> making a good working implementation of sessions that everyone can
> share would be double plus good.

What might be more practical and easier to think about, because its scope
is so much smaller, is a common "browser identifier" implementation.

The most useful purpose of a session is to allow you to store state
across requests by some anonymous browser.  If you can reliably detect
that "the requesting browser is the browser identified by token ABC123"
and that token can be associated with the browser reliably for some
extended period of time, that's half the battle.  This can be done with
a cookie, a URL element, a form variable, or a query string element.
The association between an identifier and a browser doesn't really even
need to time out; it could live forever with no ill effect.

Creating namespaces that can be written to from within application code
and which expire after some number of minutes of inactivity and so forth
(aka sessions) could be written in terms of storing and retrieving data
based on this browser identifier.
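
A sketch of what such a browser identifier might look like when carried in a
cookie (Python 3, standard library only).  The cookie name ``browser_id``,
the function names and the in-memory ``namespaces`` dict are all made up for
illustration; expiry of the namespaces after inactivity is left out.

```python
from http.cookies import SimpleCookie
import secrets

COOKIE_NAME = 'browser_id'   # hypothetical cookie name


def identify_browser(environ):
    """Return (token, headers) for a WSGI-style environ.

    `headers` is a list of extra response headers: empty when the browser
    already presented a token, a Set-Cookie header when one was just minted.
    """
    cookie = SimpleCookie(environ.get('HTTP_COOKIE', ''))
    if COOKIE_NAME in cookie:
        return cookie[COOKIE_NAME].value, []
    token = secrets.token_urlsafe(16)
    out = SimpleCookie()
    out[COOKIE_NAME] = token
    out[COOKIE_NAME]['path'] = '/'
    return token, [('Set-Cookie', out[COOKIE_NAME].OutputString())]


# Session-like namespaces layered on top of the identifier.
namespaces = {}   # token -> dict of application data


def namespace_for(token):
    return namespaces.setdefault(token, {})
```

Note that this sketch trusts whatever token the browser presents, which is
fine for identification but not, on its own, for authentication.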

- C


From ianb at colorstudy.com  Thu Aug 18 17:32:36 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 18 Aug 2005 10:32:36 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <64ddb72c05081720434ff0868d@mail.gmail.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>	<4301124C.7040708@colorstudy.com>	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>	<467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com>	<5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>
	<64ddb72c05081720434ff0868d@mail.gmail.com>
Message-ID: <4304AA14.90207@colorstudy.com>

Rene Dudfield wrote:
> Some more requirements for sessions can be found at the php page on sessions.
> 
> Hash function declaration:
>    Choosing e.g. md5/sha.  Also, by using a distributed hash function you
> can easily route the request to a specific web server.  So with one
> rewrite rule you can have your scalable sessions/session affinity. 
> The function could simply prepend a number 1-100 to the session
> id, which relates to a particular webserver.

Right-o, I've seen that feature before.  Maybe create_session_id() 
should grow a prefix argument, and for now it'll be up to the glue code 
to provide that.  It's really a configuration parameter.  Though I 
suppose you could turn the SERVER_ADDR into an 8-byte code, which would 
probably identify the proper server.  Or maybe you should pick it up 
from an environment variable... bah, it'll only be clear in the 
context of a specific environment and configuration.

> Tag rewriting:
>     i.e. which tags to do rewriting in, e.g. where it appends
> ?SESSIONID=ABCFED938743523 to your output HTML.

That would certainly be well implemented by middleware.

> url_rewriter.tags  string
> 
>     url_rewriter.tags specifies which HTML tags are rewritten to
> include session id if transparent sid support is enabled. Defaults to
> a=href,area=href,frame=src,input=src,form=fakeentry,fieldset=

Huh, what are fakeentry and fieldset?


-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Thu Aug 18 18:44:09 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 18 Aug 2005 12:44:09 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <6654eac40508172103342ad54a@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050818124137.01ad9f20@mail.telecommunity.com>

At 12:03 AM 8/18/2005 -0400, Peter Hunt wrote:
>Phillip -
>
>I agree with you on all counts, except for the issue of how to determine 
>when a session ends (timeouts, etc), and how to clean up the associated 
>objects (Carts etc) with them.

I'm not sure I ever said how I clean up the associated objects, but my 
preference is to have an automated process remove them when they haven't 
been touched for N amount of time, and set the cookie expiration so it 
expires before that N is elapsed.  Or actually, I set the cookie to expire 
after N, and do the cleanup at time N+M.
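
The N and N+M scheme can be sketched as follows (Python 3).  The concrete
values of N and M and the ``last_touched`` mapping are placeholders chosen
for the example, not anything prescribed in this thread.

```python
import time

COOKIE_LIFETIME = 30 * 60   # N: seconds until the cookie expires (assumed)
CLEANUP_GRACE = 10 * 60     # M: extra seconds before cleanup (assumed)


def purge_stale(last_touched, now=None):
    """Delete entries untouched for more than N + M seconds.

    `last_touched` maps session id -> timestamp of last access.
    Returns the list of ids that were removed.
    """
    if now is None:
        now = time.time()
    cutoff = now - (COOKIE_LIFETIME + CLEANUP_GRACE)
    stale = [sid for sid, ts in last_touched.items() if ts < cutoff]
    for sid in stale:
        del last_touched[sid]
    return stale
```

Because the cookie is gone after N, nothing the cleanup removes at N+M can
still be reached by a well-behaved client.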


From jjinux at gmail.com  Thu Aug 18 22:19:10 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Thu, 18 Aug 2005 13:19:10 -0700
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <1124343711.31954.26.camel@plope.dyndns.org>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net>
	<5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com>
	<4301124C.7040708@colorstudy.com>
	<5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com>
	<5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com>
	<5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com>
	<467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com>
	<5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>
	<64ddb72c05081720434ff0868d@mail.gmail.com>
	<1124343711.31954.26.camel@plope.dyndns.org>
Message-ID: <c41f67b9050818131927acb2b5@mail.gmail.com>

> What might be more practical and easier to think about because its scope
> is so much smaller is a common "browser identifier" implementation.
> 
> The most useful purpose of a session is to allow you to store state
> across requests by some anonymous browser.  If you can reliably detect
> that "the requesting browser is the browser identified by token ABC123"
> and that token can be associated with the browser reliably for some
> extended period of time, that's half the battle.  This can be done with
> a cookie, a URL element, a form variable, or a query string element.
> The association between an identifier and a browser doesn't really even
> need to time out; it could live forever with no ill effect.
> 
> Creating namespaces that can be written to from within application code
> and which expire after some number of minutes of inactivity and so forth
> (aka sessions) could be written in terms of storing and retrieving data
> based on this browser identifier.

It turns out that having one unique ID per browser is a bad idea. 
Specifically, if a client gives you a cookie with an sid, and you've
never heard about that sid (perhaps the session timed out), create a
new sid.  Also, it makes sense to change the sid when the user
successfully logs in.  There is a newly discovered set of session
injection attacks to be avoided:

    http://www.acros.si/papers/session_fixation.pdf

I hadn't heard about them until recently.  It's interesting reading.
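
A minimal sketch of those two rules, assuming sessions live in a plain dict
and that a modern Python with the standard secrets module is available.  The
function names new_sid(), resolve_sid() and rotate_on_login() are made up for
illustration.

```python
import secrets


def new_sid():
    """Generate an unguessable session id."""
    return secrets.token_hex(16)


def resolve_sid(presented_sid, known_sessions):
    """Return the sid to use for this request.

    A sid the server has never issued (or has already expired) is not
    adopted; a fresh one is created instead.
    """
    if presented_sid in known_sessions:
        return presented_sid
    sid = new_sid()
    known_sessions[sid] = {}
    return sid


def rotate_on_login(old_sid, known_sessions):
    """Move the session data to a fresh sid after a successful login."""
    data = known_sessions.pop(old_sid, {})
    sid = new_sid()
    known_sessions[sid] = data
    return sid
```

With this, a sid planted by an attacker before login is either rejected
outright or becomes useless once the victim authenticates.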

I hope you'll find that to be helpful.

Best Regards,
-jj

-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.

From renesd at gmail.com  Fri Aug 19 04:27:49 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Fri, 19 Aug 2005 12:27:49 +1000
Subject: [Web-SIG] WSGI app in a zope directory?
Message-ID: <64ddb72c05081819273ddf8645@mail.gmail.com>

Hey,

does anyone know of a way to get a wsgi app inside of zope?

Cheers.

From mso at oz.net  Sat Aug 20 21:56:31 2005
From: mso at oz.net (Mike Orr)
Date: Sat, 20 Aug 2005 12:56:31 -0700
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <4303FEC5.3050408@colorstudy.com>
References: <4303FEC5.3050408@colorstudy.com>
Message-ID: <43078AEF.4000309@oz.net>

Ian Bicking wrote:

>Same location:
>
>http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py
>  
>


Good work.


>This version separates out SessionManager from SessionStore, and 
>suggests that managers be per-application (or maybe per-framework).
>
There's per-application/per-framework at the class level and instance 
level; I'm not sure which you're referring to.

Regarding the former, I was saying we may be able to make a generic 
SessionManager class usable as-is by several frameworks.  You seemed to 
doubt this, but I argued we should at least try.

The latter only matters in multi-application deployments, where several 
applications (possibly different frameworks) are sharing a session.  
Several features below seem to exist only for this environment, and I'm 
having a hard time evaluating them without knowing the complete use 
cases you're trying to support.  You've said bits about that but maybe 
we can flesh it out.

* Scenario 1: Two apps mounted at /foo and /bar, using a common Paste 
dispatcher.  Both applications are embedded in the same process.  
(threaded or asynchronous servers)
* Scenario 2: Same, but the apps are in separate processes.  The 
dispatcher remains.  (forking servers)
* Scenario 3: Two apps mounted at /foo and /bar, using separate handlers 
in the Apache config.  At no point is there a common Python process 
between them.
* Scenario 4: Two apps in different virtual hosts.
* Scenario 5: Two apps in different webservers.
* Others: ??

Which situations are you trying to support, which session-related 
objects would there be, and how would they interrelate?  At what point 
do we say scenarios won't attract enough users to justify our time?

I'm also not sure how these would relate to your "application inversion" 
paradigm.  I'm used to applications as single long-running units that 
can hold shared state.  But your Paste implementation seems to suggest 
instantiating the application for each URL, and maybe the application 
would last for only one request.  I'm not sure how easy that will be to 
port some applications to it, or how this impacts the session 
classes/instances.


>class SessionError(Exception):
>     pass
>
>class InvalidSession(SessionError):
>     """
>     Raised when an invalid session ID is used.
>     """
>
>class SessionNotFound(SessionError, LookupError):
>     """
>     Raised when a session can't be found.
>     """
>
>class ConflictError(SessionError):
>     """
>     Raised when the ``locking_policy`` is ``optimistic``, and a
>     session being saved is stale.
>     """
>
>def create_session_id():
>  
>

Could go into a SessionCookie class, along with anything else that can 
be used by both session-based and sessionless fans.

>class ISessionListener:
>  
>


Is this just an extra, or what are listeners for?  Is this for 
per-application behavior with a shared manager?

>class ISessionManager:
>
>     id = """The string-identifier for this session manager.
>
>     All applications that share this session manager need to use the
>     same id when creating the session manager.
>  
>


With this rule I was expecting some central repository of session 
managers, and factory functions a la logger.getLogger(), but there 
doesn't seem to be any.  What's the purpose of the SessionManager id?


>     locking_policy = """The lock policy.
>
>     This is one of these strings:
>
>     ``'optimistic'``:
>       Optimistic locking; concurrent sessions may be opened for writing;
>       however, if a session is saved that was loaded before the last save
>       of the session, a ConflictError will be raised.
>
>     ``'lossy'``:
>       First-come-first-serve.  No locking is done; if a session is written
>       it overwrites any other session data that was written.
>
>     ``'serialized'``:
>       All sessions opened for writing are serialized; the request is
>       blocked until session is available to be opened.
>     """
>  
>


Optimistic locking sounds like a pain.  The application would have to 
catch the error and then... what?  Say "Sorry, your form input was 
thrown away."  Redo the operation somehow (isn't that the same as lossy 
operation?).  Reconcile the two states somehow (how?)?  Not that we 
shouldn't provide it, just that it will need more howto documentation.


>     def delete_expired_sessions():
>         """Scan for and delete any expired sessions.
>
>         ???: How are sessions defined to be expired?  Should listeners
>         participate?  Should they be able to cancel an expiration?
>         """
>
>     def session_ids():
>         """Return a list of session IDs.
>
>         ???: Should this return other metadata, like last accessed
>         time?
>         """
>  
>


If so, it shouldn't be called .session_ids().


>class ISession:
>
>     manager = """Reference to parent ISessionManager object"""
>
>     def __init__(id, manager, read_only, last_accessed, creation_time, 
>data):
>         """Create the session object
>
>     def touch():
>         """Update the session's last_accessed time.
>
>         Typically just calls ``self.manager.touch(self.id)``
>         """
>
>     def commit():
>         """Calls ``self.manager.save_session(self)``
>         """
>
>     def rollback():
>         """Calls ``self.manager.unlock_session(self)``.
>
>         Also calls ``session_listener.rollback(self)``.
>         """
>  
>


These look like they don't belong here.  The application already has a 
reference to the SessionManager and should call it directly.  It points 
up a difference in philosophy between the session being a "dumb object" 
(no reference to the manager) vs being manager-aware.  Is the latter 
necessary?  Are you thinking of cases where the session would be 
provided by the middleware, then the application would have to dispose of 
the session at the end of the request?  The middleware could provide a 
reference to the session manager for this.  Although that would expose 
irrelevant methods.

> class ISessionStore:
>      def load_session(session_store_id, session_id, read_only,
> session_factory):
>      def session_ids(session_store_id):
>      def delete_session(session_store_id, session_id):
>      def touch(session_store_id, session_id):
>      def write_lock_session(session_store_id, session_id):

Isn't session_store_id 'self'?  Specifying it seems to imply this is a meta SessionStore, not an individual store.  Why would a deployment have multiple stores?


From ianb at colorstudy.com  Sat Aug 20 23:46:00 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 20 Aug 2005 16:46:00 -0500
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <43078AEF.4000309@oz.net>
References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net>
Message-ID: <4307A498.3000408@colorstudy.com>

Mike Orr wrote:
> Ian Bicking wrote:
> 
>> Same location:
>>
>> http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py
>>  
>>
> 
> 
> Good work.
> 
> 
>> This version separates out SessionManager from SessionStore, and 
>> suggests that managers be per-application (or maybe per-framework).
>>
> There's per-application/per-framework at the class level and instance 
> level; I'm not sure which you're referring to.
> 
> Regarding the former, I was saying we may be able to make a generic 
> SessionManager class usable as-is by several frameworks.  You seemed to 
> doubt this, but I argued we should at least try.

I don't think several frameworks should share a single SessionManager 
*instance*.  But I do think we can make a class that can embody all the 
features that are needed for typical sessions, and frameworks use 
instances.  I think people who want something else from a session -- 
application-specific storage, for instance -- may need their own 
SessionManager class.

> The latter only matters in multi-application deployments, where several 
> applications (possibly different frameworks) are sharing a session.  
> Several features below seem to exist only for this environment, and I'm 
> having a hard time evaluating them without knowing the complete use 
> cases you're trying to support.  You've said bits about that but maybe 
> we can flesh it out.
> 
> * Scenario 1: Two apps mounted at /foo and /bar, using a common Paste 
> dispatcher.  Both applications are embedded in the same process.  
> (threaded or asynchronous servers)

This is the case that drives a lot of the issues.  Say, for instance, 
that the two applications are instances of the same basic app (e.g., 
blogs for two different users).  If they share a session they'll 
overwrite each other's values, or become hopelessly confused by 
seemingly inconsistent data.

If each of them has a separate app id and separate session managers, 
then they'll never see the other's data.  But you can only do that with 
some fixed id (generated randomly or by hand, either way probably stored 
in the configuration).  Hmm... potentially you could just generate such 
an id from the configuration file's name (if not given something more 
specific); that's a little sloppy, but generally likely to be unique and 
stable.
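
One way the fallback could look, as a sketch only: default_app_id() is a
hypothetical helper, and hashing the absolute path is an assumption made here
so that two files with the same name in different directories do not collide.

```python
import hashlib
import os


def default_app_id(config_path):
    """Derive a stable app id from a configuration file's location."""
    full = os.path.abspath(config_path)
    digest = hashlib.sha1(full.encode("utf-8")).hexdigest()[:12]
    stem = os.path.splitext(os.path.basename(full))[0]
    # Readable prefix plus a short hash of the full path.
    return "%s-%s" % (stem, digest)
```

The id stays the same across restarts as long as the file is not moved,
which is the "sloppy but stable" property described above.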

> * Scenario 2: Same, but the apps are in separate processes.  The 
> dispatcher remains.  (forking servers)

If the two apps share the same pool of long-lived worker processes, then 
all the same issues remain as with scenario 1.  This isn't really an 
issue of threaded vs. multiprocess, but an issue of processes that run 
multiple independent applications over time.  A common pool of worker 
processes would be similar to PHP, except that PHP tends to throw away 
more information each request... though I believe session clobbering 
would be a problem in PHP if you had two apps on the same domain that 
shared a session variable name.

> * Scenario 3: Two apps mounted at /foo and /bar, using separate handlers 
> in the Apache config.  At no point is there a common Python process 
> between them.

It depends on the configuration, but clobbering could happen here too. 
If both apps use the same session id (e.g., they use the same cookie 
name) and share session store configuration (they are writing to the 
same location), then it will be a problem.  Using session managers with 
separate app ids they can share session store configuration safely.

> * Scenario 4: Two apps in different virtual hosts.

Probably not an issue because the session id won't be shared.  A good 
session id manager might be able to handle this, though, by forwarding 
the user between the two hosts with a special GET variable that triggers 
the setting of a cookie; if that was happening it would be like scenario 3.

> * Scenario 5: Two apps in different webservers.

Much like scenario 4; problems are possible without the session manager, 
but increasingly less likely.

Most conflict issues could also be fixed by not sharing a session id 
between applications (and probably using a configurable session cookie 
name).

> * Others: ??
> 
> Which situations are you trying to support, which session-related 
> objects would there be, and how would they interrelate?  

I want to support all of them.  In part this is because I have a vision 
of much more granular applications, so I want it to be possible to 
deploy small applications with little risk of interaction problems.

> At what point 
> do we say scenarios won't attract enough users to justify our time?

Well, I'm just thinking about the simple session stores, not so much about 
the application-specific stores.  So I'm leaving something out there.

> I'm also not sure how these would relate to your "application inversion" 
> paradigm.  I'm used to applications as single long-running units that 
> can hold shared state.  But your Paste implementation seems to suggest 
> instantiating the application for each URL, and maybe the application 
> would last for only one request.  I'm not sure how easy that will be to 
> port some applications to it, or how this impacts the session 
> classes/instances.

I don't think this really relates a whole lot.  Paste doesn't need to 
instantiate for each URL, it could fetch an already-instantiated 
application just as well.  paste.urlmap only dispatches to pre-existing 
applications, for instance, while paste.urlparser instantiates.

>> def create_session_id():
>>  
>>
> 
> Could go into a SessionCookie class, along with anything else that can 
> be used by both session-based and sessionless fans.

It could, but session IDs can come from elsewhere.  E.g., you might want 
to use it as an argument in an XMLRPC class.  So I think it's pretty 
independent of any particular browser identification technique.

>> class ISessionListener:
>>  
>>
> 
> 
> Is this just an extra, or what are listeners for?  Is this for 
> per-application behavior with a shared manager?

It's kind of an extra.  I'm not really sure what would be done with it.  
An example I gave before about storing files and only storing the 
filename in the session would be helped by listeners, as you could add a 
file-deleting listener that was triggered on session delete.  Any time 
you put data associated with the session somewhere outside of the 
session store I think this will be useful.
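
A rough sketch of such a file-deleting listener.  The listener method name
on_delete(), the "upload_path" key and the delete_session() helper are all
hypothetical; the actual ISessionListener methods are not shown in this
thread.

```python
import os


class FileCleanupListener:
    """Hypothetical listener: removes an uploaded file with its session."""

    def __init__(self, key="upload_path"):
        self.key = key

    def on_delete(self, session_data):
        # Only the filename lives in the session; the file itself is on disk.
        path = session_data.get(self.key)
        if path and os.path.exists(path):
            os.remove(path)


def delete_session(sessions, session_id, listeners):
    """Remove a session and pass the dropped data to every listener."""
    data = sessions.pop(session_id, None)
    if data is not None:
        for listener in listeners:
            listener.on_delete(data)
```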

>> class ISessionManager:
>>
>>     id = """The string-identifier for this session manager.
>>
>>     All applications that share this session manager need to use the
>>     same id when creating the session manager.
>>  
>>
> 
> 
> With this rule I was expecting some central repository of session 
> managers, and factory functions a la logger.getLogger(), but there 
> doesn't seem to be any.  What's the purpose of the SessionManager id?

The session manager id is used by the session store, to keep the 
sessions separate.  Actual session data is keyed by (session_manager_id, 
session_id), so that separate applications have separate 
session_manager_ids, and separate browsers have separate session_ids.
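
To make the keying concrete, here is an in-memory stand-in for a store.
DictSessionStore and its method names are assumptions for illustration, not
the interface in the svn file.

```python
class DictSessionStore:
    """In-memory session store keyed by (session_manager_id, session_id)."""

    def __init__(self):
        self._data = {}

    def save(self, session_manager_id, session_id, data):
        self._data[(session_manager_id, session_id)] = dict(data)

    def load(self, session_manager_id, session_id):
        return dict(self._data[(session_manager_id, session_id)])

    def session_ids(self, session_manager_id):
        # Only the ids that belong to this manager are visible.
        return sorted(sid for (mid, sid) in self._data
                      if mid == session_manager_id)
```

Two applications can then receive the same browser cookie (the same
session_id) and still never see each other's data.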

>>     locking_policy = """The lock policy.
>>
>>     This is one of these strings:
>>
>>     ``'optimistic'``:
>>       Optimistic locking; concurrent sessions may be opened for writing;
>>       however, if a session is saved that was loaded before the last save
>>       of the session, a ConflictError will be raised.
>>
>>     ``'lossy'``:
>>       First-come-first-serve.  No locking is done; if a session is 
>> written
>>       it overwrites any other session data that was written.
>>
>>     ``'serialized'``:
>>       All sessions opened for writing are serialized; the request is
>>       blocked until session is available to be opened.
>>     """
>>  
>>
> 
> 
> Optimistic locking sounds like a pain.  The application would have to 
> catch the error and then... what?  Say "Sorry, your form input was 
> thrown away."  Redo the operation somehow (isn't that the same as lossy 
> operation?).  Reconcile the two states somehow (how?)?  Not that we 
> shouldn't provide it, just that it will need more howto documentation.

It is a bit of a pain.  In Zope they catch ConflictErrors, roll back 
everything, and restart the request.  I've had this bite me, as it just 
makes the contention worse, but for sessions in particular it might not 
be so bad (as long as *everything* is transactional and can be rolled back).

Anyway, it's there mostly for the frameworks that already know how to 
handle this.
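
The catch-and-restart approach can be sketched like this.  It is not Zope's
implementation; VersionedStore, run_with_retry() and the version-counter
scheme are assumptions chosen to show the idea, and the sketch only works
because the handler touches nothing but the session data.

```python
class ConflictError(Exception):
    """Raised when a session being saved is stale."""


class VersionedStore:
    """Toy store that bumps a version number on every successful save."""

    def __init__(self):
        self._rows = {}  # session_id -> (version, data)

    def load(self, session_id):
        version, data = self._rows.get(session_id, (0, {}))
        return version, dict(data)

    def save(self, session_id, loaded_version, data):
        current_version, _ = self._rows.get(session_id, (0, {}))
        if current_version != loaded_version:
            # Someone else saved since this copy was loaded.
            raise ConflictError(session_id)
        self._rows[session_id] = (current_version + 1, dict(data))


def run_with_retry(store, session_id, handler, attempts=3):
    """Reload the session and re-run the handler whenever a save conflicts."""
    for _ in range(attempts):
        version, data = store.load(session_id)
        handler(data)
        try:
            store.save(session_id, version, data)
            return data
        except ConflictError:
            continue
    raise ConflictError(session_id)
```

Unlike the lossy policy, the retried handler runs against the freshly
loaded data, so the competing write is kept rather than overwritten.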

>> class ISession:
>>
>>     manager = """Reference to parent ISessionManager object"""
>>
>>     def __init__(id, manager, read_only, last_accessed, creation_time, 
>> data):
>>         """Create the session object
>>
>>     def touch():
>>         """Update the session's last_accessed time.
>>
>>         Typically just calls ``self.manager.touch(self.id)``
>>         """
>>
>>     def commit():
>>         """Calls ``self.manager.save_session(self)``
>>         """
>>
>>     def rollback():
>>         """Calls ``self.manager.unlock_session(self)``.
>>
>>         Also calls ``session_listener.rollback(self)``.
>>         """
>>  
>>
> 
> 
> These look like they don't belong here.  The application already has a 
> reference to the SessionManager and should call it directly.  It points 
> up a difference in philosophy between the session being a "dumb object" 
> (no reference to the manager) vs being manager-aware.  Is the latter 
> necessary?  Are you thinking of cases where the session would be 
> provided by the middleware, then the application would have to dispose of 
> the session at the end of the request?  The middleware could provide a 
> reference to the session manager for this.  Although that would expose 
> irrelevant methods.

Mostly these are there both to make the interface slightly nicer (many 
times you won't have to interact with the session manager), and to 
facilitate per-session session listeners.  I'm not sure per-session 
listeners are a good idea, though.

>> class ISessionStore:
>>      def load_session(session_store_id, session_id, read_only,
>> session_factory):
>>      def session_ids(session_store_id):
>>      def delete_session(session_store_id, session_id):
>>      def touch(session_store_id, session_id):
>>      def write_lock_session(session_store_id, session_id):
> 
> 
> Isn't session_store_id 'self'?  Specifying it seems to imply this is a 
> meta SessionStore, not an individual store.  Why would a deployment have 
> multiple stores?

Oops, this was a leftover from when SessionManager was named 
SessionStore.  These should all be session_manager_id.  Fixed in svn.


-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From renesd at gmail.com  Sun Aug 21 01:42:34 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Sun, 21 Aug 2005 09:42:34 +1000
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <4307A498.3000408@colorstudy.com>
References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net>
	<4307A498.3000408@colorstudy.com>
Message-ID: <64ddb72c050820164212e0eacf@mail.gmail.com>

Looks quite good.  It should be able to handle all the uses I have for
sessions.  I am sure it will change a little once implementation
starts.

> > * Scenario 2: Same, but the apps are in separate processes.  The
> > dispatcher remains.  (forking servers)
> 
> If the two apps share the same pool of long-lived worker processes, then
> all the same issues remain as with scenario 1.  This isn't really an
> issue of threaded vs. multiprocess, but an issue of processes that run
> multiple independent applications over time.  A common pool of worker
> processes would be similar to PHP, except that PHP tends to throw away
> more information each request... though I believe session clobbering
> would be a problem in PHP if you had two apps on the same domain that
> shared a session variable name.
> 

Yes, session clobbering can happen with PHP.  PHP gets around it by
allowing you to set the session.name.  E.g. PHP_SESSION becomes
MY_BLOG_PHP_SESSION, just like in the proposal with SessionManager
and its app_id.  http://www.php.net/function.session-name.php

> > * Scenario 3: Two apps mounted at /foo and /bar, using separate handlers
> > in the Apache config.  At no point is there a common Python process
> > between them.
> 
> It depends on the configuration, but clobbering could happen here too.
> If both apps use the same session id (e.g., they use the same cookie
> name) and share session store configuration (they are writing to the
> same location), then it will be a problem.  Using session managers with
> separate app ids they can share session store configuration safely.
> 
> > * Scenario 4: Two apps in different virtual hosts.
> 
> Probably not an issue because the session id won't be shared.  A good
> session id manager might be able to handle this, though, by forwarding
> the user between the two hosts with a special GET variable that triggers
> the setting of a cookie; if that was happening it would be like scenario 3.
> 

The most secure way for virtual hosts would be to use different
session stores?  Using different session stores for separate domains
should be the default for a little extra security?  However, using the
same SessionStores across virtual domains could be quite useful for
passing users' settings among virtual domains (just like Ian said
above).

From ianb at colorstudy.com  Sun Aug 21 01:56:31 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 20 Aug 2005 18:56:31 -0500
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <64ddb72c050820164212e0eacf@mail.gmail.com>
References: <4303FEC5.3050408@colorstudy.com>
	<43078AEF.4000309@oz.net>	<4307A498.3000408@colorstudy.com>
	<64ddb72c050820164212e0eacf@mail.gmail.com>
Message-ID: <4307C32F.9050205@colorstudy.com>

Rene Dudfield wrote:
>>>* Scenario 4: Two apps in different virtual hosts.
>>
>>Probably not an issue because the session id won't be shared.  A good
>>session id manager might be able to handle this, though, by forwarding
>>the user between the two hosts with a special GET variable that triggers
>>the setting of a cookie; if that was happening it would be like scenario 3.
>>
> 
> 
> The most secure way for virtual hosts would be to use different
> session stores?  Using different session stores for separate domains
> should be the default for a little extra security?  However, using the
> same SessionStores across virtual domains could be quite useful for
> passing users' settings among virtual domains (just like Ian said
> above).

As long as session ids are generated properly, there should be no 
overlap in ids unless you are using the same browser identification 
(i.e., the same cookie).  So if the virtual hosts aren't explicitly 
sharing session ids there's no real problem (as long as all those 
applications are trusted to read any session, of course).

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From mso at oz.net  Sun Aug 21 03:08:37 2005
From: mso at oz.net (Mike Orr)
Date: Sat, 20 Aug 2005 18:08:37 -0700
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <4307A498.3000408@colorstudy.com>
References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net>
	<4307A498.3000408@colorstudy.com>
Message-ID: <4307D415.7070705@oz.net>

Ian Bicking wrote:

> Mike Orr wrote:
> I don't think several frameworks should share a single SessionManager 
> *instance*.  


Isn't that what being a session manager means?  The single gateway to 
the stores.  Otherwise it's more a case of two instances co-managing.  
That sounds more difficult, since the two managers may have different 
bugs and thus an unintentional difference in policy.


> class ISessionManager:
>
>>>
>>>     id = """The string-identifier for this session manager.
>>>
>>>     All applications that share this session manager need to use the
>>>     same id when creating the session manager.
>>>  
>>>
>>
>>
>> With this rule I was expecting some central repository of session 
>> managers, and factory functions a la logger.getLogger(), but there 
>> doesn't seem to be any.  What's the purpose of the SessionManager id?
>
>
> The session manager id is used by the session store, to keep the 
> sessions separate.  Actual session data is keyed by 
> (session_manager_id, session_id), so that separate applications have 
> separate session_manager_ids, and separate browsers have separate 
> session_ids.


OK, we're using different terminology for the same thing.  I would call 
that an application ID.  Two applications that want to share sessions 
would use the same ID, and two instances of a blogging application that 
don't want to share would have different app IDs.  MySQLSessionStore has 
an app ID in the constructor, and the session is saved under (app_id, 
session_id).  It defaults to '' if you only have one application and are 
too lazy to make up a name.

Calling it app_id seems to make more sense.  The user would find it 
logical to have to name their applications (=session namespaces).  
Whereas naming "session managers" sounds like an obscure implementation 
detail not related to this.  I would think a session manager ID is its 
memory address, and why on earth would we want to know that?


>>> class ISessionStore:
>>>      def load_session(session_store_id, session_id, read_only,
>>> session_factory):
>>>      def session_ids(session_store_id):
>>>      def delete_session(session_store_id, session_id):
>>>      def touch(session_store_id, session_id):
>>>      def write_lock_session(session_store_id, session_id):
>>
>>
>>
>> Isn't session_store_id 'self'?  Specifying it seems to imply this is 
>> a meta SessionStore, not an individual store.  Why would a deployment 
>> have multiple stores?
>
>
> Oops, this was a leftover from when SessionManager was named 
> SessionStore.  These should all be session_manager_id.  Fixed in svn.


OK, translating 'session_manager_id' to 'app_id', this almost makes 
sense.  So a SessionStore instance can handle multiple applications. 
Is this likely?  I'd like to find some way to avoid passing this value 
to every method, since from the application's perspective, there's only 
one that matters.

From ianb at colorstudy.com  Sun Aug 21 08:29:19 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 21 Aug 2005 01:29:19 -0500
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <4307D415.7070705@oz.net>
References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net>
	<4307A498.3000408@colorstudy.com> <4307D415.7070705@oz.net>
Message-ID: <43081F3F.2030805@colorstudy.com>

Mike Orr wrote:
>> The session manager id is used by the session store, to keep the 
>> sessions separate.  Actual session data is keyed by 
>> (session_manager_id, session_id), so that separate applications have 
>> separate session_manager_ids, and separate browsers have separate 
>> session_ids.
> 
> 
> 
> 
> OK, we're using different terminology for the same thing.  I would call 
> that an application ID.  Two applications that want to share sessions 
> would use the same ID, and two instances of a blogging application that 
> don't want to share would have different app IDs.  MySQLSessionStore has 
> an app ID in the constructor, and the session is saved under (app_id, 
> session_id).  It defaults to '' if you only have one application and are 
> too lazy to make up a name.
> 
> Calling it app_id seems to make more sense.  The user would find it 
> logical to have to name their applications (=session namespaces).  
> Whereas naming "session managers" sounds like an obscure implementation 
> detail not related to this.  I would think a session manager ID is its 
> memory address, and why on earth would we want to know that?

The session manager needs to be instantiated with the app id, and we 
could rename it there, yes.  It doesn't really matter to me.

>>>> class ISessionStore:
>>>>      def load_session(session_store_id, session_id, read_only,
>>>> session_factory):
>>>>      def session_ids(session_store_id):
>>>>      def delete_session(session_store_id, session_id):
>>>>      def touch(session_store_id, session_id):
>>>>      def write_lock_session(session_store_id, session_id):
>>>
>>>
>>>
>>>
>>> Isn't session_store_id 'self'?  Specifying it seems to imply this is 
>>> a meta SessionStore, not an individual store.  Why would a deployment 
>>> have multiple stores?
>>
>>
>>
>> Oops, this was a leftover from when SessionManager was named 
>> SessionStore.  These should all be session_manager_id.  Fixed in svn.
> 
> 
> 
> OK, translating 'session_manager_id' to 'app_id', this almost makes 
> sense.  So a SessionStore instance can handle multiple applications. Is 
> this likely?  I'd like to find some way to avoid passing this value to 
> every method, since from the application's perspective, there's only one 
> that matters.

The session manager embodies that context, so you never pass that 
around.  The session manager also has the locking policy; as you noticed 
you don't want optimistic locking unless you are ready for 
ConflictErrors, and you don't want lossy if you are relying on the 
session for something important.  So applications shouldn't share that 
setting either.  The SessionStore interface is fairly dumb about what 
it's storing, so it should be able to support multiple policies 
simultaneously.
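
A sketch of that split, under assumed names: SharedStore and the
load_session()/save_session() signatures here are illustrative only.  The
manager is created once with its app id and locking policy, so application
code never passes the id around.  The policy is merely recorded here, not
enforced.

```python
class SharedStore:
    """Dumb store: knows nothing about policies, only about keys."""

    def __init__(self):
        self._data = {}

    def load(self, app_id, session_id):
        return dict(self._data.get((app_id, session_id), {}))

    def save(self, app_id, session_id, data):
        self._data[(app_id, session_id)] = dict(data)


class SessionManager:
    """Binds one application's id and locking policy to a shared store."""

    POLICIES = ("optimistic", "lossy", "serialized")

    def __init__(self, store, app_id, locking_policy="lossy"):
        if locking_policy not in self.POLICIES:
            raise ValueError("unknown locking policy: %r" % (locking_policy,))
        self.store = store
        self.app_id = app_id
        self.locking_policy = locking_policy

    def load_session(self, session_id):
        # The app id is supplied here, not by the caller.
        return self.store.load(self.app_id, session_id)

    def save_session(self, session_id, data):
        self.store.save(self.app_id, session_id, data)
```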

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From ianb at colorstudy.com  Sun Aug 21 22:22:55 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 21 Aug 2005 15:22:55 -0500
Subject: [Web-SIG] More on app configuration...
Message-ID: <4308E29F.6040607@colorstudy.com>

So, I got in the first bit of working code for paste.deploy, which is a 
continuation of the work I mentioned back in the thread "WSGI 
deployment: an experiment": 
http://mail.python.org/pipermail/web-sig/2005-July/001598.html

It's still incomplete; I only just got the very first tests to pass. 
The code is in http://svn.pythonpaste.org/Paste/Deploy/trunk/

But I thought I'd describe what I'm currently thinking.  First, this 
isn't really configuration-file-based so much as URI-based.  Right now 
there are only two schemes:

   egg:EggSpec#entry_point_name
   config:config_filename#section_name

And I'll probably add:

   python:[protocol/]import_path
     (I'm not sure where protocol should really go in this case)

Plain imports don't have an explicit protocol, so they are a little 
harder to handle.  Eggs use entry points, which have explicit protocols. 
Configuration files use section name prefixes to denote the protocol, 
though since configuration files don't contain actual code, they usually 
refer to something with an explicit protocol.
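
To make the two URI forms above concrete, here is a minimal sketch of 
splitting such a string into scheme, target and fragment.  The names 
DeployURI and parse_deploy_uri are made up for this illustration and are 
not part of paste.deploy; returning None for a missing fragment is also 
an assumption.

```python
from typing import NamedTuple, Optional


class DeployURI(NamedTuple):
    """Hypothetical parsed form of 'scheme:target#name'."""
    scheme: str
    target: str
    name: Optional[str]


# Only the two schemes described in the message are accepted here.
KNOWN_SCHEMES = ("egg", "config")


def parse_deploy_uri(uri: str) -> DeployURI:
    # Everything before the first ':' is the scheme.
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme not in KNOWN_SCHEMES:
        raise ValueError("unknown or missing scheme in %r" % uri)
    # An optional '#fragment' names the entry point or config section.
    target, _, name = rest.partition("#")
    return DeployURI(scheme, target, name or None)
```

With this, parse_deploy_uri("egg:MyApp#main") yields the scheme "egg", 
the target "MyApp" and the name "main".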

Right now there are only a few protocols -- paste.app_factory1, 
paste.composit_factory1, paste.filter_factory1, and paste.server_factory1.

I added "composit" for applications that bring together other 
applications.  This includes URL dispatchers, pipelines, and some other 
things.  This is like filters, which wrap a single application, but 
composits get a reference back into the application loader, so they can 
load applications by name.  Of course anyone can load an application by 
URI, so it's not strictly necessary to have a separate type.  But I 
think it's helpful to make explicit when an application is really just a 
dispatcher, vs. a real terminal application.  I can't think of a better 
name than "composit", but if anyone has ideas...

I haven't decided exactly how configuration will work.  Right now I've 
included both global shared configuration and local configuration. 
Global configuration is inherited throughout the system, and exists in 
one flat namespace.  Local configuration can be added in a configuration 
file.  Applications can explicitly pull defaults from the global 
configuration, e.g. "email_errors" might be filled by 
"system_admin_email".  Hopefully this will satisfy those who don't like 
a big global pile of settings.
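
The split between global and local configuration could look roughly like 
this from the application side.  This is only a sketch: the factory 
signature (a global_conf dict plus local settings as keyword arguments) 
and the "settings" attribute are assumptions made for the example, not a 
description of the actual paste.app_factory1 protocol.

```python
def app_factory(global_conf, **local_conf):
    """Hypothetical app factory: local settings win, globals fill gaps."""
    settings = dict(local_conf)
    # Explicitly pull a default from the global configuration, as in the
    # "email_errors" / "system_admin_email" example above.
    settings.setdefault("email_errors",
                        global_conf.get("system_admin_email"))

    def app(environ, start_response):
        body = ("errors go to %s" % settings["email_errors"]).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    # Exposed only so the example is easy to inspect.
    app.settings = settings
    return app
```

The application never reads the global namespace implicitly; it names 
the one global key it is willing to fall back on.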

So a configuration section might look like:

   [DEFAULT]
   # This section holds global configuration
   system_admin_email = ianb at colorstudy.com

   [app:main]
   use = egg:MyApp#main
   # or you could do:
   # paste.app_factory1 = import_spec:object
   # override a global setting:
   set system_admin_email = webmaster at host.com
   # and a local setting:
   database = mysql://localhost/myapp

It uses ConfigParser, because it's dumb and the closest thing to making 
no decision on configuration formats.  Maybe later it can use the file 
extension to denote format; I suppose if so then I should make .ini 
required right now (another scheme could be added for a different 
format, but that seems wrong).
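
As a rough illustration, the example section above can be read with the 
standard library parser (the module is spelled "configparser" in Python 
3).  The helper read_section and the way it treats the "set " prefix are 
assumptions made for this sketch; it is not the paste.deploy loader.

```python
import configparser

EXAMPLE = """\
[DEFAULT]
system_admin_email = ianb at colorstudy.com

[app:main]
use = egg:MyApp#main
set system_admin_email = webmaster at host.com
database = mysql://localhost/myapp
"""


def read_section(text, section):
    """Return (global_conf, local_conf) for one section (hypothetical)."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    defaults = parser.defaults()
    global_conf = dict(defaults)
    local_conf = {}
    for key, value in parser.items(section):
        if key.startswith("set "):
            # "set name = value" overrides a global setting.
            global_conf[key[4:].strip()] = value
        elif key in defaults and value == defaults[key]:
            # Inherited unchanged from [DEFAULT]; already in global_conf.
            continue
        else:
            local_conf[key] = value
    return global_conf, local_conf
```

Calling read_section(EXAMPLE, "app:main") returns the overridden global 
setting in the first dictionary and "use" and "database" in the second.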

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From ianb at colorstudy.com  Mon Aug 22 06:08:24 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 21 Aug 2005 23:08:24 -0500
Subject: [Web-SIG] PasteDeploy 0.1 (was: Re: More on app configuration...)
In-Reply-To: <4308E29F.6040607@colorstudy.com>
References: <4308E29F.6040607@colorstudy.com>
Message-ID: <43094FB8.7090505@colorstudy.com>

I did a bunch more work on this today.  It's still in an early state, 
but I decided I should release versions more often.  So it's out there:

   http://cheeseshop.python.org/pypi/PasteDeploy/0.1

But probably more interesting to start with, the documentation:

   http://pythonpaste.org/deploy/paste-deploy.html

I also wrote some interfaces (just for documentation purposes):

http://svn.pythonpaste.org/Paste/Deploy/trunk/paste/deploy/interfaces.py

I'm feeling pretty good about how it turned out.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From renesd at gmail.com  Mon Aug 22 08:34:28 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Mon, 22 Aug 2005 16:34:28 +1000
Subject: [Web-SIG] PasteDeploy 0.1 (was: Re: More on app
	configuration...)
In-Reply-To: <43094FB8.7090505@colorstudy.com>
References: <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
Message-ID: <64ddb72c050821233445da5238@mail.gmail.com>

Hey,

a what is it good for/why use it section would be good on the web page.


Cheers,

On 8/22/05, Ian Bicking <ianb at colorstudy.com> wrote:
> I did a bunch more work on this today.  It's still in an early state,
> but I decided I should release versions more often.  So it's out there:
> 
>    http://cheeseshop.python.org/pypi/PasteDeploy/0.1
> 
> But probably more interesting to start with, the documentation:
> 
>    http://pythonpaste.org/deploy/paste-deploy.html
> 
> I also wrote some interfaces (just for documentation purposes):
> 
> http://svn.pythonpaste.org/Paste/Deploy/trunk/paste/deploy/interfaces.py
> 
> I'm feeling pretty good about how it turned out.
> 
> --
> Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/renesd%40gmail.com
>

From ianb at colorstudy.com  Mon Aug 22 19:44:32 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 22 Aug 2005 12:44:32 -0500
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <64ddb72c050821233445da5238@mail.gmail.com>
References: <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
Message-ID: <430A0F00.6050807@colorstudy.com>

Rene Dudfield wrote:
> Hey,
> 
> a what is it good for/why use it section would be good on the web page.

Good point.  It's not a complete solution yet so I'm not sure exactly 
how to describe it; but this is what I put for now:

Paste Deployment is a system for finding and configuring WSGI 
applications and servers. For WSGI application consumers it provides a 
single, simple function (loadapp) for loading a WSGI application from a 
configuration file or a Python Egg. For WSGI application providers it 
only asks for a single, simple entry point to your application, so that 
application users don't need to be exposed to the implementation details 
of your application.

The result is something a system administrator can install and manage 
without knowing any Python, or the details of the WSGI application or 
its container.


As an aside I've also added a couple features this morning to make the 
common case of pipelining filters a bit easier to configure.

Hmm... it's also just occurred to me that filters should be easier to 
define.  In almost all cases I find I want to curry the configuration so 
it can be applied at the same time the wrapped application is passed in. 
  I might add another protocol for that.
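
A minimal sketch of what currying the configuration could mean for a 
filter: the factory takes the configuration first and returns a function 
that is later given the application to wrap.  The names make_filter and 
powered_by, and the factory signature, are invented for this example.

```python
def make_filter(global_conf, **local_conf):
    """Hypothetical curried filter factory."""
    header_value = local_conf.get("powered_by", "unknown")

    def apply_filter(app):
        # Called later, when the wrapped application is known.
        def wrapped(environ, start_response):
            def custom_start_response(status, headers, exc_info=None):
                headers = list(headers) + [("X-Powered-By", header_value)]
                return start_response(status, headers, exc_info)
            return app(environ, custom_start_response)
        return wrapped

    return apply_filter


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Configuration is applied first, the application second.
filtered_app = make_filter({}, powered_by="paste")(hello_app)
```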

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 23 02:26:35 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 22 Aug 2005 20:26:35 -0400
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <430A0F00.6050807@colorstudy.com>
References: <64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>

At 12:44 PM 8/22/2005 -0500, Ian Bicking wrote:
>Hmm... it's also just occurred to me that filters should be easier to
>define.  In almost all cases I find I want to curry the configuration so
>it can be applied at the same time the wrapped application is passed in.
>   I might add another protocol for that.

I think the format is improving, as it was now clear enough for me to 
figure out what I'd like to change.  ;-)

I stole this example off your blog, and then rewrote it using a slightly 
more advanced version of my last syntax proposal:

     # Put one login system in front of the entire site
     #
     [login wrapper from Paste]
     database = "mysql://localhost/userdb"
     table    = "users"

     # Then this passes different path prefixes to different apps
     #
     [urlmap from Paste]
     "/"     = static()
     "/cms"  = auth(filebrowser_app())
     "/blog" = blog()


     # variables used later
     #
     [config = vars]
     admin_email = "me at example.com"
     document_root = "/home/me/htdocs"

     # a very simple app...
     #
     [static = static from Paste]
     document_root = config.document_root

     # the login filter should give us a username; this just restricts
     # who can access
     #
     [auth = auth wrapper from Paste]
     require_role = "admin"
     admin_email = config.admin_email

     # this application is distributed in an egg
     #
     [filebrowser_app = filebrowser from FileBrowser]
     document_root = config.document_root
     admin_email = config.admin_email


     # In this case the app isn't distributed as an Egg with
     # entry_points, so we manually create a glue function blog_app
     # and just invoke it here
     #
     [blog = myglue.apps:blog_app]
     admin_email = config.admin_email


Most of the above should be pretty obvious, but a few points anyway:

* This format is generic; it has nothing to do with WSGI in particular and 
can be used to assemble any component tree.  It also supports implementing 
the "wsgi services" concept.

* Argument names can be either an identifier or a quoted string

* You can use factories from a default group (e.g. 'vars' above might 
effectively be short for 'vars from WSGIUtils')

* named sections ("[name = ...]") have to come after the unnamed sections, 
and they are turned into "curried" factory objects that are available in 
the eval() namespace used for all expressions.  When called in an 
expression, they can accept keyword arguments to override the defaults in 
the named section.  They have properties with the same names as the values 
defined in that section.

* The first part of a section (after the "name=", if any) is an import spec 
for a factory, or if it's followed by "from" or "wrapper from", then it's 
the name of an entry point that advertises a factory.

* "wrapper" means that the factory will be called with two positional 
arguments; non-wrappers are called with one argument.  Named wrappers can 
be passed a positional argument if used in another factory argument 
expression - this will be the object they should wrap.

* The last unnamed section is the effective "result" of parsing the file, 
although it will be wrapped by any contiguous preceding "wrapper" sections

The parser for this format would of course be considerably more complex 
than the Paste-Deploy parser (especially since evaluation would be done 
lazily), but I think the syntax is both cleaner and more powerful.  The 
factory signatures are:

     def non_wrapper_factory(parent_component, **kw):
         ...

     def wrapper_factory(child_component, parent_component, **kw):
         ...

With the parent/child parameters always being supplied positionally.  The 
idea is that parent_component will be used to create a chain of service 
contexts, and child_component is an application to be wrapped by middleware.
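
One way the "curried" factory objects described above might behave is 
sketched below.  The class CurriedFactory and the two toy factories are 
hypothetical; only the two factory signatures come from the proposal. 
The loader is assumed to supply parent_component when it builds the 
object.

```python
class CurriedFactory:
    """Hypothetical named section: defaults plus a factory to call."""

    def __init__(self, factory, parent_component, defaults,
                 is_wrapper=False):
        self._factory = factory
        self._parent = parent_component
        self._defaults = dict(defaults)
        self._is_wrapper = is_wrapper

    def __getattr__(self, name):
        # Expose section values as properties, e.g. config.admin_email.
        defaults = self.__dict__.get("_defaults", {})
        if name in defaults:
            return defaults[name]
        raise AttributeError(name)

    def __call__(self, *child, **overrides):
        # Keyword arguments override the defaults from the section.
        kw = dict(self._defaults)
        kw.update(overrides)
        if self._is_wrapper:
            if len(child) != 1:
                raise TypeError("a wrapper needs exactly one object to wrap")
            return self._factory(child[0], self._parent, **kw)
        if child:
            raise TypeError("a non-wrapper takes no positional argument")
        return self._factory(self._parent, **kw)


def static_factory(parent_component, **kw):
    return ("static", kw)


def auth_factory(child_component, parent_component, **kw):
    return ("auth", child_component, kw)
```

In an expression such as auth(static()), the inner call builds the 
application and the outer call passes it to the wrapper as 
child_component.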

I've thought this through enough that I know how I could implement all of 
the features shown, but it may be a week or two at least before I could try 
hacking together an implementation.  Also, the services side of it isn't 
really fleshed out yet, and it may also be that we need to provide some 
simple "builtin" functions in the eval() namespace to do things like lookup 
services or load other deployment files, etc.


From ianb at colorstudy.com  Tue Aug 23 04:03:18 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 22 Aug 2005 21:03:18 -0500
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
References: <64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
Message-ID: <430A83E6.5030302@colorstudy.com>

Phillip J. Eby wrote:
> At 12:44 PM 8/22/2005 -0500, Ian Bicking wrote:
> 
>> Hmm... it's also just occurred to me that filters should be easier to
>> define.  In almost all cases I find I want to curry the configuration so
>> it can be applied at the same time the wrapped application is passed in.
>>   I might add another protocol for that.
> 
> 
> I think the format is improving, as it was now clear enough for me to 
> figure out what I'd like to change.  ;-)
> 
> I stole this example off your blog, and then rewrote it using a slightly 
> more advanced version of my last syntax proposal:
> 
>     # Put one login system in front of the entire site
>     #
>     [login wrapper from Paste]
>     database = "mysql://localhost/userdb"
>     table    = "users"
> 
>     # Then this passes different path prefixes to different apps
>     #
>     [urlmap from Paste]
>     "/"     = static()
>     "/cms"  = auth(filebrowser_app())
>     "/blog" = blog()

One aspect of paste.deploy that wasn't shown in that example is that 
it's easy to refer to other configuration files.  It would actually be 
more realistic to do:

   [composit:app]
   use = egg:Paste#urlmap
   / = config:static_root.ini
   /cms = config:filebrowser.ini
   /blog = config:blog.ini

And if filebrowser.ini defined an authentication filter named "auth", 
you could add this to blog.ini to reuse that configuration:

   [filter-app:main]
   use = config:filebrowser.ini#auth
   next = blog

   [app:blog]
   ....

And so forth.  I think this will be really useful to me (when I have my 
sysadmin/deployer hat on) -- it's something I left out of my own 
previous specs, but I think incorrectly.

>     # variables used later
>     #
>     [config = vars]
>     admin_email = "me at example.com"
>     document_root = "/home/me/htdocs"

This seems useful.  I had thought about some way of using the globals in 
expressions; but with pure-string expressions it's not easy to do much 
of interest.

>     # a very simple app...
>     #
>     [static = static from Paste]
>     document_root = config.document_root
> 
>     # the login filter should give us a username; this just restricts
>     # who can access
>     #
>     [auth = auth wrapper from Paste]
>     require_role = "admin"
>     admin_email = config.admin_email
> 
>     # this application is distributed in an egg
>     #
>     [filebrowser_app = filebrowser from FileBrowser]
>     document_root = config.document_root
>     admin_email = config.admin_email

However, in paste.deploy there does remain real global configuration, so 
you wouldn't have to manually copy in values from the globals.  While 
admittedly it makes the interface slightly less elegant from the Python 
side, I think it's an important feature.

>     # In this case the app isn't distributed as an Egg with
>     # entry_points, so we manually create a glue function blog_app
>     # and just invoke it here
>     #
>     [blog = myglue.apps:blog_app]
>     admin_email = config.admin_email
> 
> 
> Most of the above should be pretty obvious, but a few points anyway:
> 
> * This format is generic; it has nothing to do with WSGI in particular 
> and can be used to assemble any component tree.  It also supports 
> implementing the "wsgi services" concept.

Ditto paste.deploy.  Not all of the bits are well defined in the 
implementation, but there's nothing inside or out that's connected to WSGI.

> * Argument names can be either an identifier or a quoted string

I tried to avoid anything fancy; if I was going to do something fancy 
I'd feel a need to look at all the configuration formats currently 
available for Python, and if not reuse them, at least steal from them.

But it's clear that plain ConfigParser parsing is pretty lame.

> * You can use factories from a default group (e.g. 'vars' above might 
> effectively be short for 'vars from WSGIUtils')

How is that default group determined?  What is a "group"?

> * named sections ("[name = ...]") have to come after the unnamed 
> sections, and they are turned into "curried" factory objects that are 
> available in the eval() namespace used for all expressions.  When called 
> in an expression, they can accept keyword arguments to override the 
> defaults in the named section.  They have properties with the same names 
> as the values defined in that section.

The properties are fine; I can't say the calling syntax appeals to me 
particularly.

> * The first part of a section (after the "name=", if any) is an import 
> spec for a factory, or if it's followed by "from" or "wrapper from", 
> then it's the name of an entry point that advertises a factory.

How do you determine the entry point type?  Or is there one entry point 
type for anything available in a configuration file?  paste.deploy 
defines an entry point type for each kind of object.

> * "wrapper" means that the factory will be called with two positional 
> arguments; non-wrappers are called with one argument.  Named wrappers 
> can be passed a positional argument if used in an another factory 
> argument expression - this will be the object they should wrap.

This part is unclear to me.

> * The last unnamed section is the effective "result" of parsing the 
> file, although it will be wrapped by any contiguous preceding "wrapper" 
> sections

This isn't clear to me when reading the configuration file.  INI files 
are flat, and I wouldn't expect them to be usefully ordered, especially 
in a way that puts particular importance on the last unnamed section.

I'd feel more comfortable with a nested configuration format in that case.

> The parser for this format would of course be considerably more complex 
> than the Paste-Deploy parser (especially since evaluation would be done 
> lazily), but I think the syntax is both cleaner and more powerful.  The 
> factory signatures are:
> 
>     def non_wrapper_factory(parent_component, **kw):
>         ...
> 
>     def wrapper_factory(child_component, parent_component, **kw):
>         ...
> 
> With the parent/child parameters always being supplied positionally.  
> The idea is that parent_component will be used to create a chain of 
> service contexts, and child_component is an application to be wrapped by 
> middleware.
> 
> I've thought this through enough that I know how I could implement all 
> of the features shown, but it may be a week or two at least before I 
> could try hacking together an implementation.  Also, the services side 
> of it isn't really fleshed out yet, and it may also be that we need to 
> provide some simple "builtin" functions in the eval() namespace to do 
> things like lookup services or load other deployment files, etc.

I dunno... I can't say much about the services, because I don't really 
know what you intend with those.  These are some things I like about 
your example:

* More structured/richer section names could be good; paste.deploy's 
"use" could go as a result.

* A clear notion of evaluation and variables would be nice.

* A config format with good quoting rules is called for.  ConfigParser 
isn't anything more than a stop-gap.

But some things I don't like:

* Using ordering in a syntax that doesn't feel ordered or nested.

* Using function composition to represent application/filter 
composition.  But only sometimes.

* "name from egg_spec" reads nice on one level, but is vague on another 
level.  Even if "egg:egg_spec#name" doesn't read well, I think it is 
nicely self-describing.

* eval() scares me a bit; if I used eval() I would feel a need to keep 
sufficient information around to do proper tracebacks that include the 
source configuration file.  But all-strings isn't great either. 
Evaluation without conditionals seems like it goes only half-way; OTOH 
conditionals get to something too complex for configuration.  So however 
it goes, configuration should be somewhere in the middle of completely 
dumb (ConfigParser, unevaluated values), and completely general (Python 
code).  Where in the middle I'm unsure.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 23 05:12:44 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 22 Aug 2005 23:12:44 -0400
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <430A83E6.5030302@colorstudy.com>
References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>

At 09:03 PM 8/22/2005 -0500, Ian Bicking wrote:
>One aspect of paste.deploy that wasn't shown in that example is that it's 
>easy to refer to other configuration files.  It would actually be more 
>realistic to do:
>
>   [composit:app]
>   use = egg:Paste#urlmap
>   / = config:static_root.ini
>   /cms = config:filebrowser.ini
>   /blog = config:blog.ini

In the minimum case, one could do that without any change to the syntax I 
proposed:

     [static = file]
     filename = "static_root.ini"

But I think it would be nicer to just provide a "builtin function" to load 
a component from a file, since it's common enough to deserve a primitive.


>And if filebrowser.ini defined an authentication filter named "auth", you 
>could add this to blog.ini to reuse that configuration:
>
>   [filter-app:main]
>   use = config:filebrowser.ini#auth
>   next = blog

One of the things I really dislike about the PasteDeploy syntax is that it 
mingles factory arguments with chaining, which seems like mixing metalevels 
to me; I never know whether an argument is intended for the parser (e.g. 
use, next) or for the factory.  My blog.ini would look like this (in its 
entirety):

    [file]
    # Borrow the filebrowser 'auth' wrapper
    filename = "filebrowser.ini"
    factory = "auth"

    [myglue.apps:blog_app]


>However, in paste.deploy there does remain real global configuration, so 
>you wouldn't have to manually copy in values from the globals.  While 
>admittedly it makes the interface slightly less elegant from the Python 
>side, I think it's an important feature.

That's easily emulated if you need it; just create a configuration service 
or services that can be acquired via the parent_component links.  Actually, 
the format I propose allows numerous other ways to emulate that feature on 
varying scales, but doesn't force all factories to understand any one 
specific configuration protocol.


>>* Argument names can be either an identifier or a quoted string
>
>I tried to avoid anything fancy; if I was going to do something fancy I'd 
>feel a need to look at all the configuration formats currently for Python, 
>and if not reuse them at least steal from them.
>
>But it's clear that plain ConfigParser parsing is pretty lame.

The only reason for allowing string literals is to avoid coming up with a 
lame new escaping scheme for use cases like the URL map.


>>* You can use factories from a default group (e.g. 'vars' above might 
>>effectively be short for 'vars from WSGIUtils')
>
>How is that default group determined?  What is a "group"?

Er, sorry, I meant entry point group, like "wsgi.factories" or whatever.  I 
was just pointing out that the meaning of a non-import string could be 
loaded from an entry point group, and that the group might vary depending 
on the application loading the configuration file.


>>* named sections ("[name = ...]") have to come after the unnamed 
>>sections, and they are turned into "curried" factory objects that are 
>>available in the eval() namespace used for all expressions.  When called 
>>in an expression, they can accept keyword arguments to override the 
>>defaults in the named section.  They have properties with the same names 
>>as the values defined in that section.
>
>The properties are fine; I can't say the calling syntax appeals to me 
>particularly.

I thought about *not* calling them (except for wrappers), but then the 
properties would have to go.


>>* The first part of a section (after the "name=", if any) is an import 
>>spec for a factory, or if it's followed by "from" or "wrapper from", then 
>>it's the name of an entry point that advertises a factory.
>
>How do you determine the entry point type?  Or is there one entry point 
>type for anything available in a configuration file?  paste.deploy defines 
>an entry point type for each kind of object.

I'm thinking that the loader gets passed some arguments to determine what 
entry point group to use.  This format, by the way, only requires one group 
for all the "normal" entry points, because the "wrapper" keyword 
distinguishes between the two factory signatures -- which are the only 
signatures you get.


>>* "wrapper" means that the factory will be called with two positional 
>>arguments; non-wrappers are called with one argument.  Named wrappers can 
>>be passed a positional argument if used in an another factory argument 
>>expression - this will be the object they should wrap.
>
>This part is unclear to me.

See the urlmap in the example, where "/cms" = auth(filebrowser_app()).  
'auth' is a "wrapper", so it can be called with something to wrap (e.g. 
'filebrowser_app()').


>* Using ordering in a syntax that doesn't feel ordered or nested.

Fair enough.  However, I'm used to ordered .ini files (they do exist), so 
I'm not sure that's enough on its own to rule out the syntax.  Also, we 
could nix the '[]' for section headings and come up with something else, e.g.:

      login wrapper from Paste:
        database = "mysql://localhost/userdb"
        table    = "users"

      urlmap from Paste:
        "/"     = static()
        "/cms"  = auth(filebrowser_app())
        "/blog" = blog()

      def config() as vars:
          admin_email = "me at example.com"
          document_root = "/home/me/htdocs"

      def static() as static from Paste:
          document_root = config.document_root

      def auth() as auth wrapper from Paste:
          require_role = "admin"
          admin_email = config.admin_email

      def filebrowser_app() as filebrowser from FileBrowser:
          document_root = config.document_root
          admin_email = config.admin_email

      def blog() as myglue.apps:blog_app:
          admin_email = config.admin_email


This probably wouldn't be any harder to parse than my initial proposal, as 
I was thinking of using the "tokenize" module, and in this case a DEDENT 
token would indicate the end of a section.  I'm not sure I like the 'def 
x()' bit; it makes it look a little too much like Python, even though it 
also seems good to have it be like Python.
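
The tokenize idea can be sketched as follows: a logical line at 
indentation depth zero starts a section, and INDENT/DEDENT tokens track 
where its body begins and ends.  The function split_sections is a 
made-up helper, it keeps only the first physical line of each logical 
line, and it assumes the text starts with an unindented header.

```python
import io
import tokenize

# Token types that never start a logical line of interest.
SKIP = {tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER}


def split_sections(text):
    """Return a list of (header, [body lines]) pairs (hypothetical)."""
    sections = []
    depth = 0
    current = None  # text of the logical line being read
    for tok in tokenize.generate_tokens(io.StringIO(text).readline):
        if tok.type == tokenize.INDENT:
            depth += 1
        elif tok.type == tokenize.DEDENT:
            # Returning to depth 0 ends the current section.
            depth -= 1
        elif tok.type == tokenize.NEWLINE:
            if current is None:
                continue
            if depth == 0:
                sections.append((current.rstrip(":").rstrip(), []))
            else:
                sections[-1][1].append(current)
            current = None
        elif tok.type not in SKIP and current is None:
            current = tok.line.strip()
    return sections
```

The tokenizer only checks lexical structure, so headers such as "urlmap 
from Paste:" pass through even though they are not valid Python 
statements.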


>* Using function composition to represent application/filter 
>composition.  But only sometimes.

Only sometimes you don't like it?  :)  Or do you mean that the format I 
gave only uses it sometimes, and that's what you dislike?  (i.e., you'd be 
fine if it was always done that way?)


>* "name from egg_spec" reads nice on one level, but is vague on another 
>level.  Even if "egg:egg_spec#name" doesn't read well, I think it is 
>nicely self-describing.

Um, wha???  The only difference between the two is that one of them has 
"egg:" in front of it, which seems a bit redundant to me.  That's probably 
because I assume that in the long run eggs will be so ubiquitous that it 
really will be redundant to explicitly refer to them as such.  :)

Conversely, if I assume that some further description is required, I would 
want to say "pypi:" or "project:" or something else of that sort, because 
"egg" isn't the essential nature of the thing; the name is a *project* 
name, while eggs are an implementation detail.


>* eval() scares me a bit; if I used eval() I would feel a need to keep 
>sufficient information around to do proper tracebacks that include the 
>source configuration file.

Sure, that can be done, especially if one does a couple of tricks with 
compile(), new.code(), and co_firstlineno.
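
For illustration, here is one way to get tracebacks that point at the 
configuration file and line.  Note the substitution: new.code() belongs 
to the Python 2 "new" module, so this sketch uses ast.increment_lineno() 
from current Python to reach the same goal.  The helper name 
compile_config_expr and the file name used with it are assumptions.

```python
import ast


def compile_config_expr(expr, filename, lineno):
    """Compile an expression so errors report filename and lineno."""
    tree = ast.parse(expr, filename=filename, mode="eval")
    # Shift line numbers so line 1 of the expression maps to `lineno`.
    ast.increment_lineno(tree, lineno - 1)
    return compile(tree, filename, "eval")


# An expression found on line 12 of a (hypothetical) deploy.ini:
code = compile_config_expr("config.admin_email", "deploy.ini", 12)
```

If evaluating "code" fails, the innermost traceback entry names 
deploy.ini and line 12 rather than an anonymous string.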


>But all-strings isn't great either. Evaluation without conditionals seems 
>like it goes only half-way; OTOH conditionals get to something too complex 
>for configuration.  So however it goes, configuration should be somewhere 
>in the middle of completely dumb (ConfigParser, unevaluated values), and 
>completely general (Python code).  Where in the middle I'm unsure.

eval() isn't full Python code, and you can set a restricted set of builtins 
if you really want.  I don't see much point to dumbing it down any further 
than that, though.
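
A small sketch of what evaluating with a restricted set of builtins 
could look like.  The helper eval_restricted is invented for this 
example, and emptying __builtins__ should not be read as a security 
sandbox; it only limits which names a configuration expression can see.

```python
from types import SimpleNamespace


def eval_restricted(expr, namespace, allowed_builtins=None):
    """Evaluate one expression with only the given builtins visible."""
    env = {"__builtins__": dict(allowed_builtins or {})}
    return eval(expr, env, dict(namespace))


# Names a config file might define, as in the [config = vars] section.
config = SimpleNamespace(admin_email="me@example.com",
                         document_root="/home/me/htdocs")
names = {"config": config}

admin = eval_restricted("config.admin_email", names)
root_length = eval_restricted("len(config.document_root)", names,
                              allowed_builtins={"len": len})
```

Because eval() accepts only expressions, statements such as assignments 
or imports are rejected with a SyntaxError, and a builtin like open() is 
simply an unknown name unless it is explicitly allowed.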


From ianb at colorstudy.com  Tue Aug 23 06:30:19 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 22 Aug 2005 23:30:19 -0500
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
Message-ID: <430AA65B.2090409@colorstudy.com>

Phillip J. Eby wrote:
> At 09:03 PM 8/22/2005 -0500, Ian Bicking wrote:
> 
>> One aspect of paste.deploy that wasn't shown in that example is that 
>> it's easy to refer to other configuration files.  It would actually be 
>> more realistic to do:
>>
>>   [composit:app]
>>   use = egg:Paste#urlmap
>>   / = config:static_root.ini
>>   /cms = config:filebrowser.ini
>>   /blog = config:blog.ini
> 
> 
> In the minimum case, one could do that without any change to the syntax 
> I proposed:
> 
>     [static = file]
>     filename = "static_root.ini"
> 
> But I think it would be nicer to just provide a "builtin function" to 
> load a component from a file, since it's common enough to deserve a 
> primitive.

Definitely; files are the essence of configuration, so file references 
should be a core concept.  (Of course ZODB objects can work much like 
files, and wherever else you put the configuration is likely to have a 
filesystem-like feel.)

>> And if filebrowser.ini defined an authentication filter named "auth", 
>> you could add this to blog.ini to reuse that configuration:
>>
>>   [filter-app:main]
>>   use = config:filebrowser.ini#auth
>>   next = blog
> 
> 
> One of the things I really dislike about the PasteDeploy syntax is that 
> it mingles factory arguments with chaining, which seems like mixing 
> metalevels to me; I never know whether an argument is intended for the 
> parser (e.g. use, next) or for the factory.  My blog.ini would look like 
> this (in its entirety):
> 
>    [file]
>    # Borrow the filebrowser 'auth' wrapper
>    filename = "filebrowser.ini"
>    factory = "auth"
> 
>    [myglue.apps:blog_app]

I'm fine getting rid of "use", maybe like:

   [filter-app:main config:filebrowser.ini#auth]

Well... upon writing it, it doesn't look that nice.  But in theory... or 
maybe something like:

   [filter-app:main]
   config:filebrowser.ini#auth

Where the first (non-comment) line was interpreted as the thing being 
loaded.  That makes it look very different.  Or whatever; I don't feel 
that strongly about it.

>> However, in paste.deploy there does remain real global configuration, 
>> so you wouldn't have to manually copy in values from the globals.  
>> While admittedly it makes the interface slightly less elegant from the 
>> Python side, I think it's an important feature.
> 
> 
> That's easily emulated if you need it; just create a configuration 
> service or services that can be acquired via the parent_component 
> links.  Actually, the format I propose allows numerous other ways to 
> emulate that feature on varying scales, but doesn't force all factories 
> to understand any one specific configuration protocol.

It's important to me, and it's not intuitive to me what you envision. 
So I feel a need to see services in action, replacing global configuration.

I'll admit, it felt a little funny to me when I converted middleware to 
have a global configuration parameter that they simply ignored.  But 
*not* having that access would bother me more.

>>> * You can use factories from a default group (e.g. 'vars' above might 
>>> effectively be short for 'vars from WSGIUtils')
>>
>>
>> How is that default group determined?  What is a "group"?
> 
> 
> Er, sorry, I meant entry point group, like "wsgi.factories" or 
> whatever.  I was just pointing out that the meaning of a non-import 
> string could be loaded from an entry point group, and that the group 
> might vary depending on the application loading the configuration file.

In paste.deploy I allow for future groups that ultimately return the 
same kind of object, so the group adds important information. 
paste.composit_factory1 and paste.app_factory1 both return the same kind 
of object, for instance.

>>> * named sections ("[name = ...]") have to come after the unnamed 
>>> sections, and they are turned into "curried" factory objects that are 
>>> available in the eval() namespace used for all expressions.  When 
>>> called in an expression, they can accept keyword arguments to 
>>> override the defaults in the named section.  They have properties 
>>> with the same names as the values defined in that section.
>>
>>
>> The properties are fine; I can't say the calling syntax appeals to me 
>> particularly.
> 
> 
> I thought about *not* calling them (except for wrappers), but then the 
> properties would have to go.

I like the properties more than the composition.

>>> * The first part of a section (after the "name=", if any) is an 
>>> import spec for a factory, or if it's followed by "from" or "wrapper 
>>> from", then it's the name of an entry point that advertises a factory.
>>
>>
>> How do you determine the entry point type?  Or is there one entry 
>> point type for anything available in a configuration file?  
>> paste.deploy defines an entry point type for each kind of object.
> 
> 
> I'm thinking that the loader gets passed some arguments to determine 
> what entry point group to use.  This format, by the way, only requires 
> one group for all the "normal" entry points, because the "wrapper" 
> keyword distinguishes between the two factory signatures -- which are 
> the only signatures you get.

In paste.deploy the syntax (filter:, etc) and the group are redundant. 
So if you accidentally treat an application like a filter it'll be 
caught before you call the object with the wrong parameters.

I'm also realizing that positional parameters are bad; I think I'll be 
changing to calling with purely keyword parameters.  It's too easy to 
mix up positional parameters, and pass in the wrong object in the wrong 
location, and then it only gets caught later when you try to use the 
wrong object in a way it doesn't support.  That was one of the more 
common errors I produced when converting my code.
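
A minimal sketch of that failure mode, using made-up factory names (these
are illustrative assumptions, not paste.deploy's actual signatures).  With
keyword-only parameters a swapped call is rejected right away instead of
failing later:

```python
# Hypothetical filter factories; the names and signatures are illustrative.
def make_filter_positional(app, global_conf):
    return {"app": app, "global_conf": global_conf}


def make_filter_keyword(*, app, global_conf):
    # Keyword-only: every caller has to name each argument.
    return {"app": app, "global_conf": global_conf}


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


conf = {"debug": "true"}

# Arguments swapped by mistake: nothing complains here, so the bug only
# surfaces later when the "app" turns out to be a dict.
swapped = make_filter_positional(conf, demo_app)

# With keywords the order cannot be confused ...
ok = make_filter_keyword(global_conf=conf, app=demo_app)

# ... and a purely positional call is rejected immediately.
try:
    make_filter_keyword(conf, demo_app)
    rejected = False
except TypeError:
    rejected = True
```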

>>> * "wrapper" means that the factory will be called with two positional 
>>> arguments; non-wrappers are called with one argument.  Named wrappers 
>>> can be passed a positional argument if used in an another factory 
>>> argument expression - this will be the object they should wrap.
>>
>>
>> This part is unclear to me.
> 
> 
> See the urlmap in the example, where "/blog" = auth(blog()).  'auth' is 
> a "wrapper", so it can be called with something to wrap (e.g. 'blog()').

But the wrapper there is called with one argument, and the app with 
zero; but you say the wrapper has two and the app one...?

>> * Using ordering in a syntax that doesn't feel ordered or nested.
> 
> 
> Fair enough.  However, I'm used to ordered .ini files (they do exist), 
> so I'm not sure that's enough on its own to rule out the syntax.  Also, 
> we could nix the '[]' for section headings and come up with something 
> else, e.g.:
> 
>      login wrapper from Paste:
>        database = "mysql://localhost/userdb"
>        table    = "users"
> 
>      urlmap from Paste:
>        "/"     = static()
>        "/cms"  = auth(filebrowser_app())
>        "/blog" = blog()
> 
>      def config() as vars:
>          admin_email = "me at example.com"
>          document_root = "/home/me/htdocs"
> 
>      def static() as static from Paste:
>          document_root = config.document_root
> 
>      def auth() as auth wrapper from Paste:
>          require_role = "admin"
>          admin_email = config.admin_email
> 
>      def filebrowser_app() as filebrowser from FileBrowser:
>          document_root = config.document_root
>          admin_email = config.admin_email
> 
>      def blog() as myglue.apps:blog_app:
>          admin_email = config.admin_email

That's not any different to me, I guess.  This would be a better use of 
indentation:

main = pipeline:
   login wrapper from Paste:
       config...
   urlmap from Paste:
       "/" = static
       # for some reason this feels a lot better than
       # auth(filebrowser_app()) to me:
       "/cms" = pipeline(auth, filebrowser_app)
       "/blog" = blog

config = vars:
   admin_email = "me@example.com"
   document_root = "/home/me/htdocs"

static = static from Paste:
   document_root = config.document_root

# gotta admit I still really prefer "filter" to "wrapper"
auth = auth wrapper from Paste:
     require_role = "admin"
     admin_email = config.admin_email

filebrowser_app = filebrowser from FileBrowser:
     # better use of properties than config.document_root, really:
     document_root = static.document_root
     admin_email = config.admin_email

# or you could do the whole thing like:

filebrowser = pipeline:
     auth wrapper from Paste:
         require_role = "admin"
         admin_email = config.admin_email
     filebrowser from FileBrowser:
         document_root = static.document_root

# maybe you could specialize/clone like:

auth2 = auth:
     require_role = "editor"


That configuration feels way better to me.  The named applications are 
full peers with the unnamed applications/objects (and I think there just 
shouldn't be top-level unnamed applications).


>> * Using function composition to represent application/filter 
>> composition.  But only sometimes.
> 
> 
> Only sometimes you don't like it?  :)  Or do you mean that the format I 
> gave only uses it sometimes, and that's what you dislike?  (i.e., you'd 
> be fine if it was always done that way?)

I don't like the function calling at all, but even more so because it's 
inconsistent (it isn't used for the main app).

>> * "name from egg_spec" reads nice on one level, but is vague on 
>> another level.  Even if "egg:egg_spec#name" doesn't read well, I think 
>> it is nicely self-describing.
> 
> 
> Um, wha???  The only difference between the two is that one of them has 
> "egg:" in front of it, which seems a bit redundant to me.  That's 
> probably because I assume that in the long run eggs will be so 
> ubiquitous that it really will be redundant to explicitly refer to them 
> as such.  :)

In paste.deploy config files are full peers to Eggs, and can be used 
anywhere that an egg: reference can be used.  I think that's a neat 
feature.  I don't want to tack on referencing other config files, like a 
special loader factory or textual inclusion hacks or anything like that.

Config files describe applications.  Egg entry points describe 
applications.  They should be peers.
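
As a rough sketch of what "peers" could mean in practice, both kinds of
reference can be split by the same code into (scheme, target, name).  The
function name is hypothetical, and falling back to "main" when no #name is
given is an assumption made for this example:

```python
def parse_reference(ref):
    """Split 'scheme:target#name' into (scheme, target, name).

    Hypothetical helper: only the 'egg:' and 'config:' schemes discussed
    in this thread are accepted.
    """
    scheme, sep, rest = ref.partition(":")
    if not sep or scheme not in ("egg", "config"):
        raise ValueError("unsupported reference: %r" % (ref,))
    target, _, name = rest.partition("#")
    # Assumption: a missing '#name' means the entry called 'main'.
    return scheme, target, name or "main"


egg_ref = parse_reference("egg:Paste#auth")
file_ref = parse_reference("config:filebrowser.ini#auth")
default_ref = parse_reference("config:blog.ini")
```

Whatever loads the application afterwards only has to dispatch on the
scheme; the caller does not need to know which kind of reference it holds.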

> Conversely, if I assume that some further description is required, I 
> would want to say "pypi:" or "project:" or something else of that sort, 
> because "egg" isn't the essential nature of the thing; the name is a 
> *project* name, while eggs are an implementation detail.

egg: is an access method, just like http: or whatever.  It doesn't say 
what the URI describes, just how to find it.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 23 16:42:30 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 10:42:30 -0400
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <430AA65B.2090409@colorstudy.com>
References: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>

At 11:30 PM 8/22/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>At 09:03 PM 8/22/2005 -0500, Ian Bicking wrote:
>>>However, in paste.deploy there does remain real global configuration, so 
>>>you wouldn't have to manually copy in values from the globals.
>>>While admittedly it makes the interface slightly less elegant from the 
>>>Python side, I think it's an important feature.
>>
>>That's easily emulated if you need it; just create a configuration 
>>service or services that can be acquired via the parent_component 
>>links.  Actually, the format I propose allows numerous other ways to 
>>emulate that feature on varying scales, but doesn't force all factories 
>>to understand any one specific configuration protocol.
>
>It's important to me, and it's not intuitive to me what you envision. So I 
>feel a need to services in action, replacing global configuration.

Using a config service in a factory to get a default argument value:

     def some_app_factory(parent_component, **args):
         config = parent_component.get_service("global_config")
         args.setdefault('someparam', config['someparam'])

Registering a config service (old syntax):

     [globalconfigservice from SomeEgg]
     someparam = "foo"

     [...next component in the stack...]

The config service would respond to 'get_service("global_config")' by 
returning self.

The idea is that when you chain non-wrapper components in a pipeline, each 
one gets the previous component as its "parent component", so you can 
"acquire" services from your parents.  Components nearer to you (i.e. more 
local) can override more global service definitions.


>>I'm thinking that the loader gets passed some arguments to determine what 
>>entry point group to use.  This format, by the way, only requires one 
>>group for all the "normal" entry points, because the "wrapper" keyword 
>>distinguishes between the two factory signatures -- which are the only 
>>signatures you get.
>
>In paste.deploy the syntax (filter:, etc) and the group are redundant. So 
>if you accidentally treat an application like a filter it'll be caught 
>before you call the object with the wrong parameters.

Well, that could certainly be done with this approach too.


>>>>* "wrapper" means that the factory will be called with two positional 
>>>>arguments; non-wrappers are called with one argument.  Named wrappers 
>>>>can be passed a positional argument if used in an another factory 
>>>>argument expression - this will be the object they should wrap.
>>>
>>>
>>>This part is unclear to me.
>>
>>See the urlmap in the example, where "/blog" = auth(blog()).  'auth' is a 
>>"wrapper", so it can be called with something to wrap (e.g. 'blog()').
>
>But the wrapper there is called with one argument, and the app with zero; 
>but you say the wrapper has two and the app one...?

Named sections have a default parent_component argument, so that you don't 
have to explicitly pass them in.
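
A minimal sketch of how a named section might be turned into such a
curried factory.  The NamedSection class and the argument order (parent
first, then the wrapped object) are assumptions for illustration; nothing
here is a settled interface:

```python
class NamedSection:
    """Hypothetical curried factory built from a named config section."""

    def __init__(self, factory, parent_component, defaults, is_wrapper=False):
        self.factory = factory
        self.parent_component = parent_component
        self.defaults = dict(defaults)
        self.is_wrapper = is_wrapper

    def __call__(self, *wrapped, **overrides):
        # Keyword arguments override the defaults from the section body.
        args = dict(self.defaults)
        args.update(overrides)
        if self.is_wrapper:
            if len(wrapped) != 1:
                raise TypeError("a wrapper needs exactly one object to wrap")
            # Two positional arguments: the parent, then the wrapped object.
            return self.factory(self.parent_component, wrapped[0], **args)
        if wrapped:
            raise TypeError("non-wrappers take no positional arguments")
        # One positional argument: the implicit parent component.
        return self.factory(self.parent_component, **args)


def blog_factory(parent_component, **args):
    return ("blog", args)


def auth_factory(parent_component, wrapped, **args):
    return ("auth", wrapped, args)


root = object()  # stand-in for the top-level parent component
blog = NamedSection(blog_factory, root, {"admin_email": "me@example.com"})
auth = NamedSection(auth_factory, root, {"require_role": "admin"},
                    is_wrapper=True)

# In the config expression the author writes one and zero arguments;
# the factories still receive two and one.
result = auth(blog())
```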

># or you could do the whole thing like:
>
>filebrowser = pipeline:
>     auth wrapper from Paste:
>         require_role = "admin"
>         admin_email = config.admin_email
>     filebrowser from FileBrowser:
>         document_root = static.document_root
>
># maybe you could specialize/clone like:
>
>auth2 = auth:
>     require_role = "editor"

Interesting.  If we used "in" to include other files then you could refer 
to e.g.:

     foo = main in "some.ini":
         # override params

Also, I was thinking that in this syntax, you want to be able to leave off 
the trailing ':' for simple definitions, so that this would be a complete 
definition, without needing a body:

     foo = main in "some.ini"

Finally, I think we could drop the "pipeline" keyword and simply use a ':' 
to define a name, which then gives us a way to stack components inside a 
definition, e.g.:

     main:
         login wrapper from Paste:
             # blah
         urlmap from Paste:
             "/":     static
             "/blog": main in "blog.ini"
             "/cms":
                  auth wrapper from Paste:
                      require_role = "admin"
                  filebrowser from FileBrowser:
                      document_root = static.document_root


The idea here is that if you want to pass a component or components to a 
factory, you have to use ':' syntax, with either a one-line component 
specifier on the same line, or a multi-component stack as an indented suite.

An interesting question is whether you should be able to refer to nested 
definitions as factory prototypes (ala your auth2/auth) or whether only 
top-level names should be usable.  For example in this:

     foo: bar from baz

     spam:
         foo: snickety from lemon
         scuzz: foo
         sprim:
             thingy: foo

Does "scuzz: foo" refer to the inner foo or the outer foo?  What about 
"thingy: foo"?

I'm inclined to say that both refer to the spam: foo rather than the 
outermost "foo".  i.e., more or less the same rules as Python scopes.
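
That lookup rule can be sketched with a chain of name tables, one per
nesting level.  The Scope class is hypothetical and the stored values are
just the spec strings from the example:

```python
class Scope:
    """Hypothetical name table for one nesting level of a config file."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def define(self, name, value):
        self.names[name] = value

    def lookup(self, name):
        # Search the innermost scope first, then each enclosing scope.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(name)


top = Scope()
top.define("foo", "bar from baz")

spam = Scope(parent=top)
spam.define("foo", "snickety from lemon")

sprim = Scope(parent=spam)

scuzz = spam.lookup("foo")    # resolved inside spam
thingy = sprim.lookup("foo")  # sprim has no foo, so spam's foo is found
```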

One minor problem with this syntax overall, though, is that it's a bit 
context-dependent.  Whether "foo:" means "define foo" or "create a foo" is 
just a matter of alternating layers.  It would be better if the syntax were 
less ambiguous, e.g.:

     main :=
         login wrapper from Paste:
             # blah
         urlmap from Paste:
             "/"     := static
             "/blog" := main in "blog.ini"
             "/cms"  :=
                  auth wrapper from Paste:
                      require_role = "admin"
                  filebrowser from FileBrowser:
                      document_root = static.document_root

But that doesn't actually seem to help visually, and makes it harder to 
write because you have to remember all the time whether you need ":" or 
":=".  Maybe this would be better:

     main is:
         login wrapper from Paste:
             # blah
         urlmap from Paste:
             match_mode = "longest"
             "/" is static
             "/blog" is main in "blog.ini"
             "/cms" is:
                  auth wrapper from Paste:
                      require_role = "admin"
                  filebrowser from FileBrowser:
                      document_root = static.document_root

What's nice about this is that now you can unambiguously create a top-level 
object in the simple case:

     zapp from Zope:
         cfg_file = "site.zcml"

Without needing to do:

     main is:
         zapp from Zope:
             cfg_file = "site.zcml"

Which is a pain, IMO.  Although I suppose we could allow:

     main is zapp from Zope:
         cfg_file = "site.zcml"

which isn't too bad.


>>>* "name from egg_spec" reads nice on one level, but is vague on another 
>>>level.  Even if "egg:egg_spec#name" doesn't read well, I think it is 
>>>nicely self-describing.
>>
>>Um, wha???  The only difference between the two is that one of them has 
>>"egg:" in front of it, which seems a bit redundant to me.  That's 
>>probably because I assume that in the long run eggs will be so ubiquitous 
>>that it really will be redundant to explicitly refer to them as such.  :)
>
>In paste.deploy config files are full peers to Eggs, and can be used 
>anywhere that an egg: reference can be used.  I think that's a neat 
>feature.  I don't want to tack on referencing other config files, like a 
>special loader factory or textual inclusion hacks or anything like that.
>
>Config files describe applications.  Egg entry points describe 
>applications.  They should be peers.

Okay.  The "in" syntax I gave above allows that, although I could also go 
for only using "from", as long as config URLs are quoted strings.  I also 
think the strings should be relative or absolute URLs, rather than 
filenames.  (So that '/' has the same meaning on all platforms.)  That will 
be something of a pain for Windows users who may need to include drive 
letters, but oh well.  We can always treat the letters A-Z as a special 
"file:" protocol to fix that.  :)


>>Conversely, if I assume that some further description is required, I 
>>would want to say "pypi:" or "project:" or something else of that sort, 
>>because "egg" isn't the essential nature of the thing; the name is a 
>>*project* name, while eggs are an implementation detail.
>
>egg: is an access method, just like http: or whatever.  It doesn't say 
>what the URI describes, just how to find it.

Ah, but that's just it.  The project name is a URN, not a URL, precisely 
because it *doesn't* describe how to locate the resource, it just names the 
resource and tells the system to go find it.


From ianb at colorstudy.com  Tue Aug 23 18:03:05 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Aug 2005 11:03:05 -0500
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
References: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
Message-ID: <430B48B9.90607@colorstudy.com>

Phillip J. Eby wrote:
>> It's important to me, and it's not intuitive to me what you envision. 
>> So I feel a need to services in action, replacing global configuration.
> 
> 
> Using a config service in a factory to get a default argument value:
> 
>     def some_app_factory(parent_component, **args):
>         config = parent_component.get_service("global_config")
>         args.setdefault('someparam', config['someparam'])

So "parent_component" is some special object created by the config loader...

> Registering a config service (old syntax):
> 
>     [globalconfigservice from SomeEgg]
>     someparam = "foo"

...and I assume in this case globalconfigservice does something along 
these lines:

def globalconfigservice(parent_component, next, **args):
     config = parent_component.get_service('global_config').copy()
     config.update(args)
     component = Component(parent_component)
     component.save_service('global_config', config)
     return next(component)

Obviously I'm making up the component interface here.


>     [...next component in the stack...]
> 
> The config service would respond to 'get_service("global_config")' by 
> returning self.
> 
> The idea is that when you chain non-wrapper components in a pipeline, 
> each one gets the previous component as its "parent component", so you 
> can "acquire" services from your parents.  Components nearer to you 
> (i.e. more local) can override more global service definitions.

OK, well now I'm a bit confused... is globalconfigservice a wrapper?  I 
assume globalconfigservice can't modify the parent_component it is 
passed, and has to create a new one?

>> # or you could do the whole thing like:
>>
>> filebrowser = pipeline:
>>     auth wrapper from Paste:
>>         require_role = "admin"
>>         admin_email = config.admin_email
>>     filebrowser from FileBrowser:
>>         document_root = static.document_root
>>
>> # maybe you could specialize/clone like:
>>
>> auth2 = auth:
>>     require_role = "editor"
> 
> 
> Interesting.  If we used "in" to include other files then you could 
> refer to e.g.:
> 
>     foo = main in "some.ini":
>         # override params

Hmm... it would be nice to allow configuration filenames to be 
variables.  Though "in" and "from" don't scream "config file" and "egg" 
to me -- they are both equally vague terms.  I'd rather see "in egg" and 
"in file".

> Also, I was thinking that in this syntax, you want to be able to leave 
> off the trailing ':' for simple definitions, so that this would be a 
> complete definition, without needing a body:
> 
>     foo = main in "some.ini"

Yes, that works well.

> Finally, I think we could drop the "pipeline" keyword and simply use a 
> ':' to define a name, which then gives us a way to stack components 
> inside a definition, e.g.:
> 
>     main:
>         login wrapper from Paste:
>             # blah
>         urlmap from Paste:
>             "/":     static
>             "/blog": main in "blog.ini"
>             "/cms":
>                  auth wrapper from Paste:
>                      require_role = "admin"
>                  filebrowser from FileBrowser:
>                      document_root = static.document_root
> 
> 
> The idea here is that if you want to pass a component or components to a 
> factory, you have to use ':' syntax, with either a one-line component 
> specifier on the same line, or a multi-component stack as an indented 
> suite.

This starts looking a lot like class statements (especially when class 
statements get reused as data definitions).  And of course a bit like 
YAML.  But then both those resemblances are okay.

> An interesting question is whether you should be able to refer to nested 
> definitions as factory prototypes (ala your auth2/auth) or whether only 
> top-level names should be usable.  For example in this:
> 
>     foo: bar from baz
> 
>     spam:
>         foo: snickety from lemon
>         scuzz: foo
>         sprim:
>             thingy: foo
> 
> Does "scuzz: foo" refer to the inner foo or the outer foo?  What about 
> "thingy: foo"?
> 
> I'm inclined to say that both refer to the spam: foo rather than the 
> outermost "foo".  i.e., more or less the same rules as Python scopes.

I agree.  Will spam.foo be an unambiguous representation?  It seems like 
it should be.  Would there be a global object, like globals.foo?

> One minor problem with this syntax overall, though, is that it's a bit 
> context-dependent.  Whether "foo:" means "define foo" or "create a foo" 
> is just a matter of alternating layers.  It would be better if the 
> syntax were less ambiguous, e.g.:

I don't see the distinction between "define" and "create".  By this 
distinction do you mean that pieces of the loading process are lazy?  Can 
all parts be lazy?  (I.e., the config file defines named factories, and the 
body of sections isn't evaluated until those factories are invoked.)

>     main :=
>         login wrapper from Paste:
>             # blah
>         urlmap from Paste:
>             "/"     := static
>             "/blog" := main in "blog.ini"
>             "/cms"  :=
>                  auth wrapper from Paste:
>                      require_role = "admin"
>                  filebrowser from FileBrowser:
>                      document_root = static.document_root

...for instance, when this was "main = pipeline:", it was clear this was 
just another "create", except using "pipeline" to create the object, and 
pipeline looks at the section contents.  The unnamed sections below it 
are just like positional parameters (would named sections be ordered? -- 
I've always wanted ordered class statements, I imagine I'd like to keep 
order here too)

I don't have any attachment to "pipeline", but I think some word is fine 
in that position, and I don't see why this is a particularly "special" 
construct (except of course that it should be builtin).  Would this be 
allowed?:

main = urlmap from Paste:
   "/" = static from Paste:
     document_root = "/home/me/htdocs"

> But that doesn't actually seem to help visually, and makes it harder to 
> write because you have to remember all the time whether you need ":" or 
> ":=".  Maybe this would be better:
> 
>     main is:
>         login wrapper from Paste:
>             # blah
>         urlmap from Paste:
>             match_mode = "longest"
>             "/" is static
>             "/blog" is main in "blog.ini"
>             "/cms" is:
>                  auth wrapper from Paste:
>                      require_role = "admin"
>                  filebrowser from FileBrowser:
>                      document_root = static.document_root

While I'm not attached to "pipeline", "is" is about as vague as "in" and 
"from".

> What's nice about this is that now you can unambiguously create a 
> top-level object in the simple case:
> 
>     zapp from Zope:
>         cfg_file = "site.zcml"
> 
> Without needing to do:
> 
>     main is:
>         zapp from Zope:
>             cfg_file = "site.zcml"

You could do:

main = zapp from Zope:
     cfg_file = "site.zcml"

Assuming "main" was a special magic name for the primary application.  I 
would certainly assume that reading the config file (even if I'd never seen 
these config files before).  I, for instance, do not like Python's "if 
__name__=='__main__'" idiom; I think using a conventional name to 
indicate the primary function of a file is just fine.

> Which is a pain, IMO.  Although I suppose we could allow:
> 
>     main is zapp from Zope:
>         cfg_file = "site.zcml"
> 
> which isn't too bad.
> 
> 
>>>> * "name from egg_spec" reads nice on one level, but is vague on 
>>>> another level.  Even if "egg:egg_spec#name" doesn't read well, I 
>>>> think it is nicely self-describing.
>>>
>>>
>>> Um, wha???  The only difference between the two is that one of them 
>>> has "egg:" in front of it, which seems a bit redundant to me.  That's 
>>> probably because I assume that in the long run eggs will be so 
>>> ubiquitous that it really will be redundant to explicitly refer to 
>>> them as such.  :)
>>
>>
>> In paste.deploy config files are full peers to Eggs, and can be used 
>> anywhere that an egg: reference can be used.  I think that's a neat 
>> feature.  I don't want to tack on referencing other config files, like 
>> a special loader factory or textual inclusion hacks or anything like 
>> that.
>>
>> Config files describe applications.  Egg entry points describe 
>> applications.  They should be peers.
> 
> 
> Okay.  The "in" syntax I gave above allows that, although I could also 
> go for only using "from", as long as config URLs are quoted strings.  I 
> also think the strings should be relative or absolute URLs, rather than 
> filenames.  (So that '/' has the same meaning on all platforms.)  That 
> will be something of a pain for Windows users who may need to include 
> drive letters, but oh well.  We can always treat the letters A-Z as a 
> special "file:" protocol to fix that.  :)

By URLs, do you just mean that they use URL syntax, URL quoting of 
filenames, etc?  That's fine by me; I normalize \ to / in paste.deploy 
and run urllib.unquote on the result already.  I'm not sure what to do 
with \'s; they are dumb and annoying and I hate them, but when they slip 
into the system it should at least handle them reasonably.

While it is slightly annoying to keep track of it, I think it's 
important that filenames be defined as relative to the config file that 
they are contained in.  The current working directory is useless, and 
always using absolute filenames makes config files very hard to reuse.
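
A small sketch of that resolution rule, written for current Python (where
unquote lives in urllib.parse).  The function name is made up, and this is
not the code paste.deploy actually runs:

```python
import posixpath
from urllib.parse import unquote


def resolve_config_ref(containing_config, ref):
    """Resolve `ref` relative to the config file that mentions it.

    Hypothetical helper: backslashes are normalized to '/', URL quoting
    is undone, and relative names are joined to the directory of the
    containing config file rather than the current working directory.
    """
    ref = unquote(ref.replace("\\", "/"))
    base_dir = posixpath.dirname(containing_config.replace("\\", "/"))
    # posixpath.join discards base_dir when ref is already absolute.
    return posixpath.normpath(posixpath.join(base_dir, ref))


relative = resolve_config_ref("/etc/site/main.ini", "apps/blog.ini")
quoted = resolve_config_ref("/etc/site/main.ini", "..\\shared\\my%20blog.ini")
absolute = resolve_config_ref("/etc/site/main.ini", "/srv/other.ini")
```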

>>> Conversely, if I assume that some further description is required, I 
>>> would want to say "pypi:" or "project:" or something else of that 
>>> sort, because "egg" isn't the essential nature of the thing; the name 
>>> is a *project* name, while eggs are an implementation detail.
>>
>>
>> egg: is an access method, just like http: or whatever.  It doesn't say 
>> what the URI describes, just how to find it.
> 
> 
> Ah, but that's just it.  The project name is a URN, not a URL, precisely 
> because it *doesn't* describe how to locate the resource, it just names 
> the resource and tells the system to go find it.

Well, sure it says how to find it -- load pkg_resources, get the package 
by name, etc.  There's always a "system, go do stuff for me" step, 
that's how computers work.


-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 23 19:08:21 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 13:08:21 -0400
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <430B48B9.90607@colorstudy.com>
References: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>

At 11:03 AM 8/23/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>>It's important to me, and it's not intuitive to me what you envision. So 
>>>I feel a need to services in action, replacing global configuration.
>>
>>Using a config service in a factory to get a default argument value:
>>     def some_app_factory(parent_component, **args):
>>         config = parent_component.get_service("global_config")
>>         args.setdefault('someparam', config['someparam'])
>
>So "parent_component" is some special object created by the config loader...

Each component in a "pipeline" receives the previous non-wrapper component 
in the pipeline as its parent.  The top-level parent would be an object 
whose get_service() always returns None or raises an error or something 
like that.  (I'm being vague because we haven't started nailing down a 
precise "services" spec and don't want to mix it in with the syntax 
discussion for now.)


>>Registering a config service (old syntax):
>>     [globalconfigservice from SomeEgg]
>>     someparam = "foo"
>
>...and I assume in this case globalconfigservice does something along the 
>lines:
>
>def globalconfigservice(parent_component, next, **args):
>     config = parent_component.get_service('global_config').copy()
>     config.update(args)
>     component = Component(parent_component)
>     component.save_service('global_config', config)
>     return next(component)
>
>Obviously I'm making up the component interface here.

I was thinking something more like this:

     class globalconfigservice:
         def __init__(self, parent_component, **args):
             self._parent = parent_component
             self._data = args

         def get_service(self, key):
             if key=='global_config':
                 return self
             return self._parent.get_service(key)

         def __getitem__(self,key):
             try:
                 return self._data[key]
             except KeyError:
                 previous = self._parent.get_service('global_config')
                 if previous is None:
                     raise
                 result = self._data[key] = previous[key]
                 return result

This isn't a wrapper, so it doesn't know about the "next" component, and 
doesn't need to.  Parent components can be shared by multiple 
children.  Wrappers, on the other hand, transform their child, and are not 
considered a parent component.
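
To make the acquisition behavior concrete, here is the class above
exercised with two stacked config services.  RootComponent is a
hypothetical top-level parent that simply returns None, and the factory
returns its arguments only so the result can be inspected:

```python
class RootComponent:
    """Hypothetical top-level parent: offers no services at all."""

    def get_service(self, key):
        return None


class globalconfigservice:
    def __init__(self, parent_component, **args):
        self._parent = parent_component
        self._data = args

    def get_service(self, key):
        if key == 'global_config':
            return self
        return self._parent.get_service(key)

    def __getitem__(self, key):
        try:
            return self._data[key]
        except KeyError:
            previous = self._parent.get_service('global_config')
            if previous is None:
                raise
            # Cache the inherited value locally for later lookups.
            result = self._data[key] = previous[key]
            return result


def some_app_factory(parent_component, **args):
    config = parent_component.get_service("global_config")
    args.setdefault('someparam', config['someparam'])
    return args  # returned only for demonstration


outer = globalconfigservice(RootComponent(), someparam="foo", debug="off")
inner = globalconfigservice(outer, debug="on")  # nearer, more local

app_args = some_app_factory(inner)
```

Here inner overrides "debug" but inherits "someparam" from outer, so the
factory sees someparam="foo" without anyone copying it by hand.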

>>     [...next component in the stack...]
>>The config service would respond to 'get_service("global_config")' by 
>>returning self.
>>The idea is that when you chain non-wrapper components in a pipeline, 
>>each one gets the previous component as its "parent component", so you 
>>can "acquire" services from your parents.  Components nearer to you (i.e. 
>>more local) can override more global service definitions.
>
>OK, well now I'm a bit confused... is globalconfigservice a wrapper?  I 
>assume globalconfigservice can't modify the parent_component it is passed, 
>and has to create a new one?

No. The globalconfigservice *becomes* the parent_component of the 
components that follow it, until another non-wrapper component is defined 
(which then becomes the parent of those that follow it, and so on).


>Hmm... it would be nice to allow configuration filenames to be 
>variables.  Though "in" and "from" don't scream "config file" and "egg" to 
>me -- they are both equally vague terms.  I'd rather see "in egg" and "in 
>file".

I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', 
since these two syntaxes can cover everything you or I have thus far imagined.


>>An interesting question is whether you should be able to refer to nested 
>>definitions as factory prototypes (ala your auth2/auth) or whether only 
>>top-level names should be usable.  For example in this:
>>     foo: bar from baz
>>     spam:
>>         foo: snickety from lemon
>>         scuzz: foo
>>         sprim:
>>             thingy: foo
>>Does "scuzz: foo" refer to the inner foo or the outer foo?  What about 
>>"thingy: foo"?
>>I'm inclined to say that both refer to the spam: foo rather than the 
>>outermost "foo".  i.e., more or less the same rules as Python scopes.
>
>I agree.  Will spam.foo be an unambiguous representation?  It seems like 
>it should be.  Would there be a global object, like globals.foo?

I guess there could be, but then I lean towards making a file be one object 
by default.  If you want a named top-level, you could do:

main from:
     bar is squidge from spim
     main is bar:
        foo = "whee"

That is, we could allow targetless "from" to promote a name from a new 
child context.
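
On the earlier scoping question (whether "scuzz: foo" and "thingy: foo" see
the inner or the outer foo), here is a rough sketch of Python-like lookup,
assuming a hypothetical Scope class that is not part of any proposal in
this thread. Each scope checks its own names first and then walks outward:

```python
class Scope:
    """Hypothetical namespace with an optional enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def define(self, name, value):
        self.names[name] = value

    def lookup(self, name):
        # Walk from the innermost scope outward, like Python's name lookup.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise KeyError(name)


top = Scope()
top.define('foo', 'bar from baz')

spam = Scope(parent=top)
spam.define('foo', 'snickety from lemon')

sprim = Scope(parent=spam)

print(spam.lookup('foo'))   # the "scuzz: foo" case
print(sprim.lookup('foo'))  # the "thingy: foo" case
```

Under this rule both lookups resolve to the definition inside spam, and the
outermost foo is only reachable from scopes that do not shadow it.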


>>One minor problem with this syntax overall, though, is that it's a bit 
>>context-dependent.  Whether "foo:" means "define foo" or "create a foo" 
>>is just a matter of alternating layers.  It would be better if the syntax 
>>were less ambiguous, e.g.:
>
>I don't see the distinction between "define" and "create".

By define I mean "bind the following to the name foo", and by create I mean 
"create an instance using the foo factory".


>   By this distinction do you mean that pieces of the loading process are 
> lazy?  Can all parts be lazy?  (I.e., the config file defines named 
> factories, the body of sections isn't evaluated until those factories are 
> invoked)

No; I was strictly speaking of the context-specific nature of that specific 
syntax, because it alternates layers of defining names and invoking 
factories, such that a given snippet of syntax can't be independently 
understood by a reader.


>>     main :=
>>         login wrapper from Paste:
>>             # blah
>>         urlmap from Paste:
>>             "/"     := static
>>             "/blog" := main in "blog.ini"
>>             "/cms"  :=
>>                  auth wrapper from Paste:
>>                      require_role = "admin"
>>                  filebrowser from FileBrowser:
>>                      document_root = static.document_root
>
>...for instance, when this was "main = pipeline:", it was clear this was 
>just another "create", except using "pipeline" to create the object, and 
>pipeline looks at the section contents.  The unnamed sections below it are 
>just like positional parameters (would named sections be ordered? -- I've 
>always wanted ordered class statements, I imagine I'd like to keep order 
>here too)

I don't really want them to be positional parameters, I want them to 
stack.  If pipelines were rare, I'd just nest them and use e.g. a 'next' 
keyword.  However, nested pipelines mean you have to indent everything 
every time you add a new wrapper, which would be like having to do "else: 
if:" instead of "elif:".


>I don't have any attachment to "pipeline", but I think some word is fine 
>in that position, and I don't see why this is a particularly "special" 
>construct (except of course that it should be builtin).  Would this be 
>allowed?:
>
>main = urlmap from Paste:
>   "/" = static from Paste:
>     document_root = "/home/me/htdocs"

This syntax is ambiguous, because you don't know if the thing after the '=' 
should be parsed as a Python expression or as a constructor expression, at 
least not without significant parser lookahead.  Significant lookahead 
isn't that good for a human reader, either.  That's why I think we need 
syntax to distinguish "object definition" from "value assignment".


>>But that doesn't actually seem to help visually, and makes it harder to 
>>write because you have to remember all the time whether you need ":" or 
>>":=".  Maybe this would be better:
>>     main is:
>>         login wrapper from Paste:
>>             # blah
>>         urlmap from Paste:
>>             match_mode = "longest"
>>             "/" is static
>>             "/blog" is main in "blog.ini"
>>             "/cms" is:
>>                  auth wrapper from Paste:
>>                      require_role = "admin"
>>                  filebrowser from FileBrowser:
>>                      document_root = static.document_root
>
>While I'm not attached to "pipeline", "is" is about as vague as "in" and 
>"from".

Well, I'm fine with dropping "in", so we would have only two special 
keywords, "is" and "from", and they're not interchangeable, so there's a 
minimum of ambiguity.  Also, I chose "from" because of the similarity to 
importing, and "is" implies object identity as well as definition (e.g. 
"the definition of main is...").

(One of the things I'm trying to do with this syntax, btw, is stick with 
Python's tokens and keywords, so that the tokenize module can do most of 
the heavy lifting, and I'd also prefer we didn't introduce new reserved 
words that aren't keywords in Python.)
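
As a rough illustration of that point, the standard tokenize module will
split a line of the proposed syntax into tokens even though the line is not
valid Python, because tokenizing does not check grammar. The sample line is
taken from the examples in this thread; the snippet uses Python 3 and only
shows the token stream, not a parser for the format:

```python
import io
import token
import tokenize

source = 'main is urlmap from Paste:\n    match_mode = "longest"\n'

# generate_tokens() takes a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    print(token.tok_name[tok.type], repr(tok.string))
```

Keywords such as "is" and "from" come back as ordinary NAME tokens, so a
config parser built on this would have to recognize them itself. The
indented block shows up as INDENT and DEDENT tokens.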


>Assuming "main" was a special magic name for the primary application.  I 
>would certainly assume that reading the config file (even if I'd never seen 
>these config files before).  I, for instance, do not like Python's "if 
>__name__=='__main__'" idiom; I think using a conventional name to indicate 
>the primary function of a file is just fine.

Well, __name__=='__main__' doesn't apply here.  I see this as the 
difference between def statements and regular statements in a 
module.  Function bodies aren't executed unless they're used, so it seems 
wrong to me to have a def main.  If the magic name were __main__ I could 
accept it more, except for the fact that it would then highlight the point 
that if the idiom is common enough to need a magic name, then it's common 
enough to warrant a way of doing it without a name!


>>Okay.  The "in" syntax I gave above allows that, although I could also go 
>>for only using "from", as long as config URLs are quoted strings.  I also 
>>think the strings should be relative or absolute URLs, rather than 
>>filenames.  (So that '/' has the same meaning on all platforms.)  That 
>>will be something of a pain for Windows users who may need to include 
>>drive letters, but oh well.  We can always treat the letters A-Z as a 
>>special "file:" protocol to fix that.  :)
>
>By URLs, do you just mean that they use URL syntax, URL quoting of 
>filenames, etc?

Yes.  And that relative URLs are interpreted as relative to the URL that 
was used to load the file they're in.  But also that absolute URLs are 
allowed, which may include application- or framework-specific URLs, and the 
loading facility should be hookable to do the actual URL joining and 
retrieving.  ZConfig works like this, and PEAK hooks into it so that all of 
PEAK's special urls like "pkgfile:" and such can be used.  I've definitely 
got an eye on using this format we're discussing as a nice schema-free 
alternative to ZConfig.
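
A small sketch of the relative-URL rule, using the standard library's
urljoin (Python 3 spelling). The resolve() helper and the example paths are
made up for illustration. Custom schemes such as "pkgfile:" would need the
hookable loader mentioned above, which this does not attempt:

```python
from urllib.parse import urljoin


def resolve(base_url, reference):
    """Resolve a config reference against the URL of the file containing it."""
    return urljoin(base_url, reference)


base = 'file:///srv/site/deploy/main.ini'

print(resolve(base, 'blog.ini'))                         # sibling file
print(resolve(base, '../shared/db.ini'))                 # parent directory
print(resolve(base, 'http://example.com/conf/cms.ini'))  # absolute URL wins
```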


>   That's fine by me; I normalize \ to / in paste.deploy and run 
> urllib.unquote on the result already.  I'm not sure what to do with \'s; 
> they are dumb and annoying and I hate them, but when they slip into the 
> system it should at least handle them reasonably.

I think \ should have its normal meaning in a string literal, unless a 
"raw" literal is used.


>While it is slightly annoying to keep track of it, I think it's important 
>that filenames be defined as relative to the config file that they are 
>contained in.  The current working directory is useless, and always using 
>absolute filenames makes config files very hard to reuse.

Agreed; they should be interpreted as URLs relative to the current 
file.  ZConfig (and PEAK's wrapping of it) both use this approach and it 
works well.


>>>>Conversely, if I assume that some further description is required, I 
>>>>would want to say "pypi:" or "project:" or something else of that sort, 
>>>>because "egg" isn't the essential nature of the thing; the name is a 
>>>>*project* name, while eggs are an implementation detail.
>>>
>>>egg: is an access method, just like http: or whatever.  It doesn't say 
>>>what the URI describes, just how to find it.
>>
>>Ah, but that's just it.  The project name is a URN, not a URL, precisely 
>>because it *doesn't* describe how to locate the resource, it just names 
>>the resource and tells the system to go find it.
>
>Well, sure it says how to find it -- load pkg_resources, get the package 
>by name, etc.  There's always a "system, go do stuff for me" step, that's 
>how computers work.

I'm referring here to the technical meaning of a "naming" system versus an 
"addressing" system.  An addressing system identifies a canonical "naming 
authority" that provides global uniqueness, whereas a "naming" system only 
implies the context in which the name may be understood.  You can read up on 
the RFCs on URNs vs. URLs (which are both subtypes of URI), or you can read 
up on JNDI, LDAP, X.500 and other "naming" services if you don't believe 
me.  An 'egg:' URI would be a URN, not a URL, and the 'egg' makes no sense 
in either case, because an egg is a resource *type*, not a naming or 
addressing scheme.  Thus, if I were to create a URI scheme for eggs, I 
would use a name like 'pypi:' or 'py-project:' or something like that, to 
denote the naming scheme.


From ianb at colorstudy.com  Tue Aug 23 22:16:00 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Aug 2005 15:16:00 -0500
Subject: [Web-SIG] [Paste] Re:  PasteDeploy 0.1
In-Reply-To: <Pine.LNX.4.62.0508231558290.8482@hydrogen.sabren.com>
References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<Pine.LNX.4.62.0508231558290.8482@hydrogen.sabren.com>
Message-ID: <430B8400.4050003@colorstudy.com>

Michal Wallace wrote:
> I'm working on a similar problem, and I really like your approach
> here, but I feel like I'm going to have to reinvent the wheel for 
> my particular framework, because it's RESTlike. See, in addition 
> to each URL, I'd like to be able to dispatch based on the HTTP 
> method (GET,PUT,POST,DELETE...)
> 
> What would be nice (for me) is if you could do something like:
> 
> 
>    GET / = config:static_root.ini
>    POST /cms = config:filebrowser.ini
>    * /blog = config:blog.ini

This shouldn't be a problem (in paste.deploy or the alternatives we're 
discussing) -- the "urlmap" I refer to is just (not very complicated) 
Python code.  You could do the same thing dispatching on HTTP methods. 
With paste.deploy you'd do something like:

[composit:main]
use = egg:MyFramework#httpdispatch
GET = config:static_root.ini ...
...

If you want to do both at once (dispatch both on path like urlmap, and 
on HTTP method) you'd have to make something like urlmap that also keeps 
track of methods, and use "GET / = ...".  Potentially urlmap could 
support all of these (maybe not all as egg:Paste#urlmap, but with the 
same basic code).  Right now it matches based on path prefix and domain, 
and I've meant to add ports, and HTTP method would be easy enough.
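
To give a sense of how little code method dispatch takes, here is a rough
sketch of a WSGI dispatcher keyed on REQUEST_METHOD. The names
make_method_dispatcher and text_app are invented for this example. This is
not urlmap or any existing Paste code, and hooking it up as
"egg:MyFramework#httpdispatch" would additionally need a factory entry
point, which is not shown:

```python
def make_method_dispatcher(apps, default=None):
    """Return a WSGI app that picks a child app by REQUEST_METHOD.

    `apps` maps method names such as 'GET' to WSGI applications;
    `default` plays the role of "*" (any method).
    """
    def dispatcher(environ, start_response):
        app = apps.get(environ['REQUEST_METHOD'], default)
        if app is None:
            allowed = ', '.join(sorted(apps))
            start_response('405 Method Not Allowed',
                           [('Content-Type', 'text/plain'),
                            ('Allow', allowed)])
            return [b'Method not allowed\n']
        return app(environ, start_response)
    return dispatcher


def text_app(body):
    """Tiny stand-in application that always answers 200 with `body`."""
    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [body]
    return app


main = make_method_dispatcher({'GET': text_app(b'static'),
                               'POST': text_app(b'cms')})
```

Combining this with path matching would mean either nesting one dispatcher
inside the other or teaching a single dispatcher about both keys.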

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Tue Aug 23 22:27:05 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 16:27:05 -0400
Subject: [Web-SIG] [Paste] Re:  PasteDeploy 0.1
Message-ID: <5.1.1.6.0.20050823162659.02024808@mail.telecommunity.com>

At 04:05 PM 8/23/2005 -0400, Michal Wallace wrote:
>What would be nice (for me) is if you could do something like:
>
>    GET / = config:static_root.ini
>    POST /cms = config:filebrowser.ini
>    * /blog = config:blog.ini
>
>Where "*" indicates "any HTTP method"... And of
>course "*" could be the default, so if you don't
>care about methods people could just use the
>existing syntax.

This is accommodated fairly easily within the syntax currently being discussed:

     methodmap from Paste:
         GET is urlmap from Paste:
             "/" is main from "static_root.ini"
         POST is urlmap from Paste:
              "/cms" is main from "filebrowser.ini"
         "*" is urlmap from Paste:
              "/blog" is main from "blog.ini"

Although this might also be spelled:

      main from:
         byURL is url_dispatcher from Paste
         byMethod is method_dispatcher from Paste

         main is byMethod:
             GET is byURL:
                 "/" is main from "static_root.ini"
             POST is byURL:
                  "/cms" is main from "filebrowser.ini"
             "*" is byURL:
                  "/blog" is main from "blog.ini"


From ianb at colorstudy.com  Tue Aug 23 22:37:17 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Aug 2005 15:37:17 -0500
Subject: [Web-SIG] [Paste] Re:  PasteDeploy 0.1
In-Reply-To: <Pine.LNX.4.62.0508231558290.8482@hydrogen.sabren.com>
References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<Pine.LNX.4.62.0508231558290.8482@hydrogen.sabren.com>
Message-ID: <430B88FD.3000702@colorstudy.com>

Michal Wallace wrote:
>    GET / = config:static_root.ini
>    POST /cms = config:filebrowser.ini
>    * /blog = config:blog.ini

One interesting thing about this sort of thing is, REST or no, you 
probably aren't going to do method-based dispatch on a server level, 
since it's hard to actually partition applications that way.  For 
example, you could almost put a transparent webdav layer on top of 
something else, except GET is overloaded, and you'd actually end up with 
some user-agent-based dispatch, which doesn't seem particularly RESTful.

But I can imagine using this deployment format as an internal format 
when setting up your otherwise-encapsulated application.


-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com  Tue Aug 23 22:41:27 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Aug 2005 15:41:27 -0500
Subject: [Web-SIG] [Paste] Re:  PasteDeploy 0.1
In-Reply-To: <Pine.LNX.4.62.0508231558290.8482@hydrogen.sabren.com>
References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<Pine.LNX.4.62.0508231558290.8482@hydrogen.sabren.com>
Message-ID: <430B89F7.4070407@colorstudy.com>

Whoops, hit the send key before I had finished the response...

Michal Wallace wrote:
>    GET / = config:static_root.ini
>    POST /cms = config:filebrowser.ini
>    * /blog = config:blog.ini

One interesting thing about this sort of thing is, REST or no, you 
probably aren't going to do method-based dispatch on a server level, 
since it's hard to actually partition applications that way.  For 
example, you could almost put a transparent webdav layer on top of 
something else, except GET is overloaded, and you'd actually end up with 
some user-agent-based dispatch, which doesn't seem particularly RESTful.

But I can imagine using this deployment format as an internal format 
when setting up your otherwise-encapsulated application.  This is the 
way some of the regex-based dispatching frameworks work (like Django), 
or something like the Rails Routes port could work.  These require 
configuration, and the configuration we're discussing here is actually 
pretty reasonable for those kinds of systems.

When you are doing that, I'd guess you'd put the configuration in your 
package (in the .egg-info directory or elsewhere), and then create a 
little shell of a function that loads the application described by the 
configuration file.

Anyway, another use case to keep in mind; I'd thought about 
configuration files contained inside distributions, but I hadn't 
actually thought of a good reason for it until now.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com  Wed Aug 24 00:27:39 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Aug 2005 17:27:39 -0500
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
References: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
Message-ID: <430BA2DB.6070003@colorstudy.com>

Phillip J. Eby wrote:
> I was thinking something more like this:
> 
>     class globalconfigservice:
>         def __init__(self, parent_component, **args):
>             self._parent = parent_component
>             self._data = args
> 
>         def get_service(self, key):
>             if key=='global_config':
>                 return self
>             return self._parent.get_service(key)
> 
>         def __getitem__(self,key):
>             try:
>                 return self._data[key]
>             except KeyError:
>                 previous = self._parent.get_service('global_config')
>                 if previous is None:
>                     raise
>                 result = self._data[key] = previous[key]
>                 return result
> 
> This isn't a wrapper, so it doesn't know about the "next" component, and 
> doesn't need to.  Parent components can be shared by multiple children.  
> Wrappers, on the other hand, transform their child, and are not 
> considered a parent component.

So services (aka components) are just objects with .get_service(key) 
methods?  Is there any other API or semantics implied?

>> OK, well now I'm a bit confused... is globalconfigservice a wrapper?  
>> I assume globalconfigservice can't modify the parent_component it is 
>> passed, and has to create a new one?
> 
> 
> No. The globalconfigservice *becomes* the parent_component of the 
> components that follow it, until another non-wrapper component is 
> defined (which then becomes the parent of those that follow it, and so on).

Does the configuration somehow indicate that something produces a 
component, as opposed to producing the object-in-question (WSGI 
application for us)?  I'm not clear how an application, an application 
wrapper, and a component wrapper are distinguished.

>> Hmm... it would be nice to allow configuration filenames to be 
>> variables.  Though "in" and "from" don't scream "config file" and 
>> "egg" to me -- they are both equally vague terms.  I'd rather see "in 
>> egg" and "in file".
> 
> 
> I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', 
> since these two syntaxes can cover everything you or I have thus far 
> imagined.

What exactly do you envision for config URLs?

>>>     main :=
>>>         login wrapper from Paste:
>>>             # blah
>>>         urlmap from Paste:
>>>             "/"     := static
>>>             "/blog" := main in "blog.ini"
>>>             "/cms"  :=
>>>                  auth wrapper from Paste:
>>>                      require_role = "admin"
>>>                  filebrowser from FileBrowser:
>>>                      document_root = static.document_root
>>
>>
>> ...for instance, when this was "main = pipeline:", it was clear this 
>> was just another "create", except using "pipeline" to create the 
>> object, and pipeline looks at the section contents.  The unnamed 
>> sections below it are just like positional parameters (would named 
>> sections be ordered? -- I've always wanted ordered class statements, I 
>> imagine I'd like to keep order here too)
> 
> 
> I don't really want them to be positional parameters, I want them to 
> stack.  If pipelines were rare, I'd just nest them and use e.g. a 'next' 
> keyword.  However, nested pipelines mean you have to indent everything 
> every time you add a new wrapper, which would be like having to do 
> "else: if:" instead of "elif:".

A stack and positional parameters are nearly the same thing...

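def pipeline(*args):
     # The last positional argument is the application; the rest wrap it.
     app = args[-1]
     wrappers = list(args[:-1])  # args is a tuple, which has no .reverse()
     wrappers.reverse()          # apply the innermost wrapper first
     for wrapper in wrappers:
         app = wrapper(app)
     return app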

...?  Throw in a couple other arguments and whatnot for keywords or 
whatever, it doesn't matter.

Another case where I would use positional parameters to do something 
different would be a cascading dispatcher, like:

main is cascade from Paste:
     static from Paste:
         document_root = "/..."
     blog from MyBlog:
         ...
     catch = 404

I can phrase the same thing in other ways (in paste.deploy it uses the 
sorted keys that start with "app"), but it seems like an unexpected 
bonus if the format is general enough to do this.
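
For concreteness, a cascading dispatcher along these lines might look
roughly like the sketch below. It is a simplification written for this
example, not Paste's actual cascade. It assumes at least one application,
buffers each response body, does not support the start_response write()
callable, and does not replay the request body between attempts:

```python
def cascade(apps, catch=(404,)):
    """Try each WSGI app in order and return the first response whose
    status code is not in `catch`; the last app's response is always used."""
    def cascading_app(environ, start_response):
        for index, app in enumerate(apps):
            captured = []

            def capture(status, headers, exc_info=None):
                # Remember the response line instead of sending it.
                captured[:] = [status, headers]

            result = app(environ, capture)
            try:
                body = list(result)  # force the app to run to completion
            finally:
                if hasattr(result, 'close'):
                    result.close()

            status, headers = captured
            code = int(status.split()[0])
            is_last = index == len(apps) - 1
            if code not in catch or is_last:
                start_response(status, headers)
                return body
    return cascading_app


def not_found(environ, start_response):
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not here\n']


def blog(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'blog\n']


main = cascade([not_found, blog])
```

Here the order of the applications is the whole point, which is why
positional (ordered) sections would be convenient for this case.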

>> I don't have any attachment to "pipeline", but I think some word is 
>> fine in that position, and I don't see why this is a particularly 
>> "special" construct (except of course that it should be builtin).  
>> Would this be allowed?:
>>
>> main = urlmap from Paste:
>>   "/" = static from Paste:
>>     document_root = "/home/me/htdocs"
> 
> 
> This syntax is ambiguous, because you don't know if the thing after the 
> '=' should be parsed as a Python expression or as a constructor 
> expression, at least not without significant parser lookahead.  
> Significant lookahead isn't that good for a human reader, either.  
> That's why I think we need syntax to distinguish "object definition" 
> from "value assignment".

OK, I see the issue now.  I guess "is" is fine; I think using different 
punctuation like := is much too subtle.

>>> But that doesn't actually seem to help visually, and makes it harder 
>>> to write because you have to remember all the time whether you need 
>>> ":" or ":=".  Maybe this would be better:
>>>     main is:
>>>         login wrapper from Paste:
>>>             # blah
>>>         urlmap from Paste:
>>>             match_mode = "longest"
>>>             "/" is static
>>>             "/blog" is main in "blog.ini"
>>>             "/cms" is:
>>>                  auth wrapper from Paste:
>>>                      require_role = "admin"
>>>                  filebrowser from FileBrowser:
>>>                      document_root = static.document_root
>>
>>
>> While I'm not attached to "pipeline", "is" is about as vague as "in" 
>> and "from".
> 
> 
> Well, I'm fine with dropping "in", so we would have only two special 
> keywords, "is" and "from", and they're not interchangeable, so there's a 
> minimum of ambiguity.  Also, I chose "from" because of the similarity to 
> importing, and "is" implies object identity as well as definition (e.g. 
> "the definition of main is...").

If you really want similarities, invert "is" and call it "as".  "foo 
from AnEgg as main:".  But I think that's backwards, so I wouldn't 
really advocate for it.

> (One of the things I'm trying to do with this syntax, btw, is stick with 
> Python's tokens and keywords, so that the tokenize module can do most of 
> the heavy lifting, and I'd also prefer we didn't introduce new reserved 
> words that aren't keywords in Python.)
> 
> 
>> Assuming "main" was a special magic name for the primary application.  
>> I would certainly assume that reading the config file (even if I'd never 
>> seen these config files before).  I, for instance, do not like 
>> Python's "if __name__=='__main__'" idiom; I think using a conventional 
>> name to indicate the primary function of a file is just fine.
> 
> 
> Well, __name__=='__main__' doesn't apply here.  I see this as the 
> difference between def statements and regular statements in a module.  
> Function bodies aren't executed unless they're used, so it seems wrong 
> to me to have a def main.  If the magic name were __main__ I could 
> accept it more, except for the fact that it would then highlight the 
> point that if the idiom is common enough to need a magic name, then it's 
> common enough to warrant a way of doing it without a name!

I think we disagree about one-app-per-file, and perhaps you also have a 
notion that doesn't come out in all of your examples that you want a 
stack represented at the top-level of the file...?  That is, like:

   auth from Paste:
     ...
   # wraps...
   session from Session:
     ...
   # wraps
   main from MyApp:
     ...


If that's what you are getting at, I *really* don't like that.  Config 
files don't use top-level ordering often at all.  The few cases where 
order matters, it's purely as priority for overlapping options (like 
rewrite rules).  And those few cases suck anyway because of the 
ambiguity of overlap, so it's kind of the exception that proves the rule.

I'm okay with ordering *under* the top-level names, like:

   main is:
     auth from Paste:
       ...
     session from Session:
       ...

... it doesn't appeal to me, but it doesn't bother me.

I don't want "main" (or worse "__main__") to be special, just to be 
conventional, like as a default to a keyword argument.

It's an ugly wart that every (good!) script in Python looks like:

   def main():
       ...

   if __name__ == '__main__':
       main()

That's nothing but stupid boilerplate, because otherwise you can't get 
at that function if you put everything in the "if" statement.  In the 
same way, I want to be able to pick pieces out of a 
configuration file without creating the main application, and I want to 
be able to look in the main application without creating it (since it's 
mostly opaque once it's been created).

__main__ is completely unnecessary, as "main" seems quite special on its 
own without scary underscores.  It's a very natural name, and one that 
should be intuitive to anyone reading the file.  That it has a name 
shows that it is a distinct entity, but a series of unnamed entries in 
the config file doesn't imply that in the same way.

>>   That's fine by me; I normalize \ to / in paste.deploy and run 
>> urllib.unquote on the result already.  I'm not sure what to do with 
>> \'s; they are dumb and annoying and I hate them, but when they slip 
>> into the system it should at least handle them reasonably.
> 
> 
> I think \ should have its normal meaning in a string literal, unless a 
> "raw" literal is used.

Hmm... it doesn't really matter to me, since I never use Windows.  But 
whenever I gaze upon Windows filenames in Python they hurt my eyes.

I agree anything in "" should be a string literal, with all the string 
literal rules.  Maybe these don't have to be string literals.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Wed Aug 24 01:32:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 19:32:39 -0400
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <430BA2DB.6070003@colorstudy.com>
References: <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>

At 05:27 PM 8/23/2005 -0500, Ian Bicking wrote:
>So services (aka components) are just objects with .get_service(key) 
>methods?  Is there any other API or semantics implied?

Not at the handwavy level we're currently discussing them with, no.


>>No. The globalconfigservice *becomes* the parent_component of the 
>>components that follow it, until another non-wrapper component is defined 
>>(which then becomes the parent of those that follow it, and so on).
>
>Does the configuration somehow indicate that something produces a 
>component, as opposed to producing the object-in-question (WSGI 
>application for us)?  I'm not clear how an application, an application 
>wrapper, and a component wrapper are distinguished.

In the syntax I've been using to date, "wrapper" simply indicates that the 
component wishes to receive the components following it as an argument, 
replacing them with the wrapper's return value.  All non-wrappers are just 
components.

As I've been thinking through the implementation some more, I've realized 
that the "wrapper" keyword isn't really needed, if the construction 
responsibilities are divided a bit differently than I first had in 
mind.  More on that in a later post.


>>I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', 
>>since these two syntaxes can cover everything you or I have thus far imagined.
>
>What exactly do you envision for config URLs?

In the simple case, they should just be relative URLs.


>Another case where I would use positional parameters to do something 
>different would be a cascading dispatcher, like:
>
>main is cascade from Paste:
>     static from Paste:
>         document_root = "/..."
>     blog from MyBlog:
>         ...
>     catch = 404

This syntax is ambiguous at 1-token lookahead, because you can't tell up 
front whether "static" is supposed to be a name that's being assigned (and 
therefore followed by "is" or "="), or whether it's a factory name (in 
which case it may be followed by "." and more identifiers, possibly 
followed by "from").

There might be a way to disambiguate it by complicating the grammar, I 
suppose, but I'm not sure I like it.  The way I've currently conceived of 
the grammar is that you can have either assignment (namespace) scopes or 
sequence scopes.  In my way of thinking, the top-level is a sequence scope, 
and everything else is a namespace scope, unless you introduce a sequence 
scope using "is:".  Thus, I see your example above as simply beginning 
"main is:", and then the contents can be a sequence.


>I think we disagree about one-app-per-file, and perhaps you also have a 
>notion that doesn't come out in all of your examples that you want a stack 
>represented at the top-level of the file...?  That is, like:
>
>   auth from Paste:
>     ...
>   # wraps...
>   session from Session:
>     ...
>   # wraps
>   main from MyApp:
>     ...
>
>
>If that's what you are getting at, I *really* don't like that.  Config 
>files don't use top-level ordering often at all.

That depends quite a lot on what the configuration file does, and its format.

However, if you would like to make it not be that way, all you have to do is:

     main from:
         # named stuff here

My reasoning for this is as follows.  In the simplest possible case, a user 
should be able to deploy an application using only this, as their entire file:

     app from SomeCoolApp

In other words, the above is the "hello world" of this language.  Your 
variation would be:

     main is app from SomeCoolApp

Not a lot of difference at this initial level, but now let's add a 
filter.  My way:

     login from Paste
     app from SomeCoolApp

Your way:

     main is:
         login from Paste
         app from SomeCoolApp

The big difference between your take and my take on this is that I'm 
viewing a file as specifying an object, while you're viewing it as defining 
a namespace of objects.
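
As a rough illustration of the "file specifies an object" reading, a top-level sequence can be folded into a single pipeline: the last entry builds the application, and each earlier entry wraps whatever follows it. This is only a sketch in plain Python; build_pipeline and the toy login / some_cool_app factories are hypothetical names, not part of any proposed API.

```python
def build_pipeline(entries):
    """Fold a sequence of factories into one object.

    The last entry builds the application; every earlier entry is a
    filter factory that wraps whatever comes after it.
    """
    *filters, app_factory = entries
    app = app_factory()
    for make_filter in reversed(filters):
        app = make_filter(app)
    return app


# Toy stand-ins (hypothetical): each component just returns a label.
def some_cool_app():
    return lambda: "app"


def login(inner):
    return lambda: "login(" + inner() + ")"


pipeline = build_pipeline([login, some_cool_app])
print(pipeline())
```

With a single entry the same function simply returns the application, which is the "hello world" case described above.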


>   The few cases where order matters, it's purely as priority for 
> overlapping options (like rewrite rules).  And those few cases suck 
> anyway because of the ambiguity of overlap, so it's kind of the exception 
> that proves the rule.

But pipelines are sequences too.


>That's nothing but stupid boilerplate, because otherwise you can't get at 
>that function if you put everything in the "if" statement.  In the same 
>way, I want to be able to pick pieces out of a configuration 
>file without creating the main application, and I want to be able to look 
>in the main application without creating it (since it's mostly opaque once 
>it's been created).

You're making the assumption that what you "get" is the created object, 
while I'm assuming that what you get is a partially-applied factory, with 
properties that return configuration values or other factories.  You still 
have to call the factory to create the objects.

IOW, the way I see it is that you parse a configuration file by providing 
some scope-and-context information, and you get a factory object back.  If 
the factory object is a namespace, then you can access its properties to 
get values or child factories.  So, to create a library configuration file, 
I'd assume something like:

     some_factory:
         foo is blah:
             ...
         bar is feh:
             ...

What 'some_factory' actually creates is unimportant if it never gets 
called, and if you're just pulling pieces out of it in another 
configuration file, it won't get called.
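
A minimal sketch of that idea in plain Python, assuming a hypothetical Factory class (none of these names come from an existing library): attribute access returns settings or child factories, and nothing is built until a factory is actually called.

```python
class Factory:
    """A partially-applied factory whose settings are readable as attributes."""

    def __init__(self, target, **settings):
        self.target = target
        self.settings = settings

    def __getattr__(self, name):
        # Only reached when normal lookup fails: expose settings and
        # child factories without building anything.
        settings = self.__dict__.get("settings", {})
        if name in settings:
            return settings[name]
        raise AttributeError(name)

    def __call__(self, **overrides):
        kw = dict(self.settings, **overrides)
        # Child factories are built only when the parent is called.
        built = {k: v() if isinstance(v, Factory) else v for k, v in kw.items()}
        return self.target(**built)


calls = []  # records which targets were actually invoked


def blah(**kw):
    calls.append("blah")
    return ("blah", kw)


def feh(**kw):
    calls.append("feh")
    return ("feh", kw)


def some_factory(**kw):
    calls.append("some_factory")
    return ("some_factory", kw)


library = Factory(some_factory, foo=Factory(blah, size=1), bar=Factory(feh))

foo = library.foo          # pick a piece out; nothing is built yet
size = library.foo.size    # configuration values are readable too
built_foo = foo(size=2)    # only 'blah' runs; 'some_factory' never does
```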


>__main__ is completely unnecessary, as "main" seems quite special on its 
>own without scary underscores.  It's a very natural name, and one that 
>should be intuitive to anyone reading the file.  That it has a name shows 
>that it is a distinct entity, but a series of unnamed entries in the 
>config file doesn't imply that in the same way.

Yeah, it's just that it seems weird to me to have URLs represent namespaces 
that contain objects, but not be able to have URLs refer to objects!  That 
seems downright strange.

It also seems to me that the common case will be to define a single 
pipeline in a file (often with just a single component!), and that making 
the library developer's job easier (by avoiding the 'some_factory:' wrapper 
at the top level) makes the deployer's job harder (by requiring a "main 
is:" wrapper).

That pretty much seems like the tradeoff; either the multi-config developer 
has to do an extra indent, or else the deployer does.  My inclination is to 
favor the deployer.


>Maybe these don't have to be string literals.

They do if we want to keep it compatible with Python's tokenizer, and I 
definitely want that.  For one thing, it potentially allows implementing a 
pgen-based C parser for this.
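
To illustrate the tokenizer point only (not the grammar), here is a small sketch that feeds one of the examples from this thread to the standard library's tokenize module. The sample text is taken from the "main is:" example above; the tokenizer checks nothing beyond the lexical level, so this shows only that NAME, INDENT and DEDENT tokens come out as a parser would need them.

```python
import io
import tokenize

SOURCE = (
    "main is:\n"
    "    login from Paste\n"
    "    app from SomeCoolApp\n"
)

tokens = list(tokenize.generate_tokens(io.StringIO(SOURCE).readline))

# Token kinds in order, e.g. NAME, OP, NEWLINE, INDENT, ...
kinds = [tokenize.tok_name[tok.type] for tok in tokens]

# Keywords such as "is" and "from" are plain NAME tokens at this level.
names = [tok.string for tok in tokens if tok.type == tokenize.NAME]

print(kinds)
print(names)
```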

Speaking of parsers, here's my current idea of the grammar:

   sequence ::= object+
   object   ::= qname source? (suite | NEWLINE)
   source   ::= "from" (STRING | project)?
   suite    ::= ":" INDENT assign+ DEDENT
   assign   ::= (NAME | STRING) ( ("=" testlist NEWLINE) | ("is" objects) )
   objects  ::= object | ":" INDENT sequence DEDENT
   qname    ::= NAME ("." NAME)*

   project  ::= NAME ("-" NAME)* versions? extras?
   versions ::= cmpop version ("," cmpop version)* ","?
   version  ::= INT | FLOAT | STRING    # maybe just string?
   cmpop    ::= "<" | "<=" | "==" | "!=" | ">=" | ">"
   extras   ::= "[" NAME ("," NAME)* ","? "]"

As you can see, the core syntax is just seven productions, not counting the 
five for egg project requirements and the "testlist" productions from the 
Python expression grammar.  So, it's pretty darn simple as languages go.

My rough concept of the semantics is that suites represent functions, and 
definitions are a cross between setting function attributes on the function 
defined by the enclosing suite, and setting a default value for a keyword 
argument within that enclosing function.  i.e.:

    foo:
        bar is baz:
           spam = 23

is roughly equivalent to:

    def __main__(**kw):
        # 'bar' defaults to whatever the nested 'bar' definition builds
        kw.setdefault('bar', __main__.bar())
        return foo(**kw)

    def bar(**kw):
        kw.setdefault('spam', 23)
        return baz(**kw)

    bar.spam = 23

    __main__.bar = bar

For sequences of definitions, you get a function whose attributes come from 
the namespace of the last suite in the sequence.

This is all *rough* semantics, mind you; it will almost certainly *not* be 
implemented using Python functions, because of the need to manage many 
levels of nested scopes, and the calling signatures won't exactly match 
this either.  I'm just giving this "as functions" sketch to give an idea of 
why the whole thing can readily be introspected as data if you want it to be.


From renesd at gmail.com  Wed Aug 24 02:17:30 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Wed, 24 Aug 2005 10:17:30 +1000
Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1)
In-Reply-To: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
References: <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
	<430BA2DB.6070003@colorstudy.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
Message-ID: <64ddb72c05082317175d69d49@mail.gmail.com>

Hey,

are custom config files with custom parsers needed or wanted for configuration?

Would not a .ini, python, xml, sql db, file system, or even apache
style config file be better?

If a common format is used then:
1) less code to maintain.
2) less to learn/document.


Cheers,

From pje at telecommunity.com  Wed Aug 24 02:46:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 20:46:39 -0400
Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1)
In-Reply-To: <64ddb72c05082317175d69d49@mail.gmail.com>
References: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
	<430BA2DB.6070003@colorstudy.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>

At 10:17 AM 8/24/2005 +1000, Rene Dudfield wrote:
>Hey,
>
>are custom config files with custom parsers needed or wanted for 
>configuration?
>
>Would not a .ini, python, xml, sql db, file system, or even apache
>style config file be better?
>
>If a common format is used then:
>1) less code to maintain.
>2) less to learn/document.

That would only be true if there were a common format that worked.  The 
main problem is that all of those formats simply push the complexity from 
the syntax to the semantic level!

One previous proposal was for an .ini variant that could handle pipelines 
easily, but could not do URL dispatch without awkward hacks to the .ini 
syntax.  .ini files are extremely difficult to use for any kind of nesting.

Python files are possible, and that approach has been discussed a bit, but 
the full Python language may be a little overpowered for configuration, 
while at the same time not offering convenient constructs for simple things.

XML is too verbose, redundant, and strict, and simply pushes the issue to 
defining the XML schema involved.  Also, the very use of XML tends to 
attract XML geeks who then nitpick about whether you're using XSD or DTDs 
properly and why you shouldn't use attributes for data, blah blah blah.  ;)

I'm not sure what you mean by SQL DB, but if you mean putting the 
configuration in a database, I don't see why that would be useful or 
good.  Similarly, I don't know what you mean by "file system".

Apache-style configuration (like ZConfig) can also get very ugly very 
quickly when nesting gets involved, and it has no built-in way to reference 
items within the configuration, so like XML and .ini files it forces you to 
invent your own reference semantics layered atop the basic syntax.

(You didn't mention YAML, but I'll point out anyway that it has way too 
many subsyntaxes, punctuation tricks and suchlike to be easy for humans to 
write, while not expanding on the capabilities of XML that much.)

Really the problem is that of the basic possible syntaxes, Python and XML 
are the only ones that come close to having adequate expressive power.  XML 
falls short of being able to implement the more complex use cases without 
creating some sort of mini-programming language within XML, and Python 
requires verbose procedural constructs to create declarative hierarchies 
that would be easy in XML.

Thus, the proposal that I've been fronting at the moment is actually a 
hybrid of XML-like structure and Python-like language characteristics.  If 
it fails, I'm not sure what I'd fall back to.

The nice thing about this "Python data language" is that I can see a lot of 
applications besides web stuff.  For example, Chandler's UI really wants to 
have a more declarative format than can easily be done in pure Python, but 
a more computationally-flexible format than can easily be done in XML.  I 
can basically see this "data language" being used for a lot of things that 
otherwise would be done crudely with .ini, .xml, ZConfig, or one of the 
other "standard" formats.  Consider, for example, the grotesque hack of 
.ini syntax used by the "logging" module to define loggers, handlers, and 
filters -- and then consider what it could look like if it used this "data 
language" instead.
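
For readers who have not seen it, here is a small sketch of that logging .ini layout, read with plain configparser rather than with logging.config itself. The section names follow the documented fileConfig format; the particular logger, handler and formatter names (root, console, plain) are just examples. The point is that the logger -> handler -> formatter nesting exists only by naming convention.

```python
import configparser

LOGGING_INI = """
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=plain

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=INFO
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(levelname)s %(message)s
"""

# interpolation=None so the '%' in the format string is left alone
cp = configparser.ConfigParser(interpolation=None)
cp.read_string(LOGGING_INI)

# A name listed in one section selects another section by prefix;
# the reference semantics are layered on top of the flat syntax.
handler_name = cp["logger_root"]["handlers"]
handler = cp["handler_" + handler_name]
formatter = cp["formatter_" + handler["formatter"]]

print(handler["class"], formatter["format"])
```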

I would say that there is definitely a real need for a declarative Python 
object definition syntax that supports nesting and internal references, and 
so if we can come up with something good, it can and should *become* a 
standard for such purposes, well beyond the scope of its initial mission of 
being a WSGI deployment syntax.


From renesd at gmail.com  Wed Aug 24 03:14:30 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Wed, 24 Aug 2005 11:14:30 +1000
Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1)
In-Reply-To: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
References: <4308E29F.6040607@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
	<430BA2DB.6070003@colorstudy.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<64ddb72c05082317175d69d49@mail.gmail.com>
	<5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
Message-ID: <64ddb72c05082318145a9cafce@mail.gmail.com>

On 8/24/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> 
> I'm not sure what you mean by SQL DB, but if you mean putting the
> configuration in a database, I don't see why that would be useful or
> good.  Similarly, I don't know what you mean by "file system".
> 

By sql db I meant storing configuration in a database.  Which has many
advantages including scaling, searching, ACID, permissions etc etc.

By filesystem I mean the style used by djb's tools, /proc/ and others.  An
example for virtual hosts might be:

virtual_hosts/
virtual_hosts/1/
virtual_hosts/1/name
virtual_hosts/1/ip_address
virtual_hosts/1/port
virtual_hosts/1/directory
virtual_hosts/1/access_log_path
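
A rough sketch of how such a tree could be read, assuming one value per file (the load_tree helper and the sample values are made up for illustration):

```python
import os
import tempfile


def load_tree(path):
    """Directories become dicts; files become stripped string values."""
    if os.path.isdir(path):
        return {
            name: load_tree(os.path.join(path, name))
            for name in sorted(os.listdir(path))
        }
    with open(path) as f:
        return f.read().strip()


# Build a throwaway tree shaped like the listing above, then read it back.
with tempfile.TemporaryDirectory() as root:
    host = os.path.join(root, "virtual_hosts", "1")
    os.makedirs(host)
    sample = [("name", "example.org"), ("ip_address", "192.0.2.10"), ("port", "8080")]
    for name, value in sample:
        with open(os.path.join(host, name), "w") as f:
            f.write(value + "\n")
    config = load_tree(root)

print(config["virtual_hosts"]["1"]["port"])
```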


Good luck with your configurationing!

From michal at sabren.com  Wed Aug 24 05:46:57 2005
From: michal at sabren.com (Michal Wallace)
Date: Tue, 23 Aug 2005 23:46:57 -0400 (EDT)
Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1)
In-Reply-To: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
References: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
	<430BA2DB.6070003@colorstudy.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
Message-ID: <Pine.LNX.4.62.0508232343560.22836@hydrogen.sabren.com>

On Tue, 23 Aug 2005, Phillip J. Eby wrote:

> I would say that there is definitely a real need for a declarative Python 
> object definition syntax that supports nesting and internal references, and 
> so if we can come up with something good, it can and should *become* a 
> standard for such purposes, well beyond the scope of its initial mission of 
> being a WSGI deployment syntax.

Well, if that's all you want to do, then 
why not just add some syntactic sugar 
to pickle?

Sincerely,
 
Michal J Wallace
Sabren Enterprises, Inc.
-------------------------------------
contact: michal at sabren.com
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-------------------------------------


From pje at telecommunity.com  Wed Aug 24 07:55:02 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 24 Aug 2005 01:55:02 -0400
Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1)
In-Reply-To: <Pine.LNX.4.62.0508232343560.22836@hydrogen.sabren.com>
References: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
	<430BA2DB.6070003@colorstudy.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050824015409.01b1eb18@mail.telecommunity.com>

At 11:46 PM 8/23/2005 -0400, Michal Wallace wrote:
>On Tue, 23 Aug 2005, Phillip J. Eby wrote:
>
> > I would say that there is definitely a real need for a declarative Python
> > object definition syntax that supports nesting and internal references, 
> and
> > so if we can come up with something good, it can and should *become* a
> > standard for such purposes, well beyond the scope of its initial 
> mission of
> > being a WSGI deployment syntax.
>
>Well, if that's all you want to do, then
>why not just add some syntactic sugar
>to pickle?

pickles aren't a declarative format; they're procedural.
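
One way to see this is to disassemble a pickle with the standard library's pickletools: the stream is a list of instructions for a small stack machine, not a description of the object. A minimal sketch (the dict being pickled is an arbitrary example):

```python
import pickle
import pickletools

data = pickle.dumps({"a": 1}, protocol=0)

# Each entry is an opcode executed in order by the unpickler.
ops = [op.name for op, arg, pos in pickletools.genops(data)]
print(ops)

restored = pickle.loads(data)
```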


From ianb at colorstudy.com  Wed Aug 24 08:01:17 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 24 Aug 2005 01:01:17 -0500
Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1)
In-Reply-To: <64ddb72c05082318145a9cafce@mail.gmail.com>
References: <4308E29F.6040607@colorstudy.com>	<64ddb72c050821233445da5238@mail.gmail.com>	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>	<430BA2DB.6070003@colorstudy.com>	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>	<64ddb72c05082317175d69d49@mail.gmail.com>	<5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
	<64ddb72c05082318145a9cafce@mail.gmail.com>
Message-ID: <430C0D2D.8050907@colorstudy.com>

Rene Dudfield wrote:
> On 8/24/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> 
>>I'm not sure what you mean by SQL DB, but if you mean putting the
>>configuration in a database, I don't see why that would be useful or
>>good.  Similarly, I don't know what you mean by "file system".
>>
> 
> 
> By sql db I meant storing configuration in a database.  Which has many
> advantages including scaling, searching, ACID, permissions etc etc.

Do you mean like putting the configuration files in a database?  That 
shouldn't be a problem if there's a consistent way to access files 
(pkg_resources?) that handles (or has an interface for) virtual file 
systems.  If it doesn't go in initially, I expect it would be a simple 
refactoring otherwise.

If you don't intend to use text configuration files, then you'd have to 
code your own logic to put the pieces together.  This is perfectly fine 
to do, and quite reasonable as well.  If, for instance, you were doing 
some system where new applications were deployed automatically based on 
a very constrained configuration, you can easily do that 
programmatically in Python without involving any configuration files.
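
A minimal sketch of that programmatic route, using only plain WSGI callables and the standard library's wsgiref test helper. The names hello_app, add_header and build, and the layout of the settings dict, are made up for illustration.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def add_header(app, name, value):
    """Middleware factory: append one response header."""
    def middleware(environ, start_response):
        def wrapped_start(status, headers, exc_info=None):
            return start_response(status, headers + [(name, value)], exc_info)
        return app(environ, wrapped_start)
    return middleware


def build(settings):
    """Assemble the stack from a constrained, dict-based configuration."""
    app = hello_app
    for name, value in settings.get("extra_headers", []):
        app = add_header(app, name, value)
    return app


application = build({"extra_headers": [("X-Deployed-By", "script")]})

# Exercise the result once without a server.
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers


body = b"".join(application(environ, start_response))
```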

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From michal at sabren.com  Wed Aug 24 09:05:36 2005
From: michal at sabren.com (Michal Wallace)
Date: Wed, 24 Aug 2005 03:05:36 -0400 (EDT)
Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1)
In-Reply-To: <5.1.1.6.0.20050824015409.01b1eb18@mail.telecommunity.com>
References: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
	<5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com>
	<430BA2DB.6070003@colorstudy.com>
	<5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com>
	<5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com>
	<5.1.1.6.0.20050824015409.01b1eb18@mail.telecommunity.com>
Message-ID: <Pine.LNX.4.62.0508240213470.10075@hydrogen.sabren.com>

On Wed, 24 Aug 2005, Phillip J. Eby wrote:

> At 11:46 PM 8/23/2005 -0400, Michal Wallace wrote:
> > On Tue, 23 Aug 2005, Phillip J. Eby wrote:
>
> > > I would say that there is definitely a real need for a
> > > declarative Python object definition syntax that supports
> > > nesting and internal references, and so if we can come up
> > > with something good, it can and should *become* a standard for
> > > such purposes, well beyond the scope of its initial mission
> > > of being a WSGI deployment syntax.
>
> > Well, if that's all you want to do, then
> > why not just add some syntactic sugar
> > to pickle?
> 
> pickles aren't a declarative format; they're procedural.


Huh. So it is. I didn't know that. :)

I guess my real point is that it seems like 
a huge leap to come up with a whole new language, 
when python itself can do the job just fine. 

For example, if you set up a coding standard 
where data classes have an empty constructor,
then you can do something like this:


class Instance:
    def __init__(self, Class, **kwargs):
        self.class_ = Class
        self.kwargs = kwargs

    def eval(self):
        # build the object, then set each keyword as an attribute
        obj = self.class_()
        for k, v in self.kwargs.items():
            setattr(obj, k, v)
        return obj


and maybe this for forward references:

class Promise:
    def __init__(self, thunk):
        self.thunk = thunk

    def eval(self):
        return self.thunk()


Then you can make all kinds of complicated things
declaratively:

class Node:
   pass

def aComplicatedStructure():
   loop = Instance(Node, next=Promise(lambda: loop))
   return Instance(Node, child=loop, other={"a":"b"})


The only thing missing is to walk the tree and replace 
any Promise or Instance node with the result of its 
eval().
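
One possible shape for that tree walk, as a self-contained sketch. The resolve function and its memo dict are assumptions of this sketch, not part of the classes above; the memo is what lets a self-referencing structure like "loop" terminate. Instance and Promise are redefined here in minimal form so the example runs on its own.

```python
class Instance:
    def __init__(self, cls, **kwargs):
        self.cls = cls
        self.kwargs = kwargs


class Promise:
    def __init__(self, thunk):
        self.thunk = thunk


def resolve(node, _built=None):
    """Replace Instance/Promise nodes with real objects, depth-first."""
    if _built is None:
        _built = {}
    if isinstance(node, Promise):
        return resolve(node.thunk(), _built)
    if isinstance(node, Instance):
        if id(node) in _built:
            return _built[id(node)]
        obj = node.cls()
        # Register before descending so cycles find the object already built.
        _built[id(node)] = obj
        for key, value in node.kwargs.items():
            setattr(obj, key, resolve(value, _built))
        return obj
    if isinstance(node, dict):
        return {k: resolve(v, _built) for k, v in node.items()}
    if isinstance(node, (list, tuple)):
        return type(node)(resolve(v, _built) for v in node)
    return node


class Node:
    pass


def a_complicated_structure():
    loop = Instance(Node, next=Promise(lambda: loop))
    return Instance(Node, child=loop, other={"a": "b"})


root = resolve(a_complicated_structure())
```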

I'm sure there's a way to do all that without the 
restriction on __init__, too... Just add another 
class along those lines that handles parameters to
the constructor.

Now, I'm *not* saying this is the way to go for WSGI.
But if you're going to shoot for the moon and propose 
a standard to use for *everything*, I think plain old 
python is more than adequate.

Sincerely,
 
Michal J Wallace
Sabren Enterprises, Inc.
-------------------------------------
contact: michal at sabren.com
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-------------------------------------


From ianb at colorstudy.com  Mon Aug 29 02:01:21 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 28 Aug 2005 19:01:21 -0500
Subject: [Web-SIG] WSGI config, transparency
Message-ID: <43125051.8040202@colorstudy.com>

Anyway, I'm okay with the discussion having died down a bit -- I'll keep 
working on paste.deploy and see how that works out, and revisit this later. 
Right now I'm more interested in how this affects the rest of the "system".

One issue I've come upon is how to make applications and frameworks both 
encapsulated and transparent.  Specifically if I have an application 
which uses a framework, and the framework uses several pieces of 
middleware, I need both *some* of the framework configuration to be 
exposed, and some application configuration parameters to be exposed.

This can continue further when one logical application is composed of 
subapplications.  For instance, imagine I have an admin interface built 
on Subway, with a web frontend in Wareweb, and a WebDAV interface from 
PyFileServer.  The three applications/frameworks can create a single 
logical application.  But how do I present a unified face for the 
application?

With global and flat configuration, the default is nearly complete 
transparency, with some potential for collision.  Without global 
configuration the default is an opaque system, with no possibility of 
collision.  In a practical sense the global configuration is easier to 
get working, and more adaptable for the system administrator.

Anyway, that's the issue I'm thinking about now.  In paste.deploy it's 
kind of handled by:

   [app:someapp]
   set master_setting = foo
   ... include subapp somehow ...

   [app:subapp]
   get some_local_setting = master_setting
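
To make the intent concrete, here is a rough sketch of those "set"/"get" semantics using plain dicts. This is not paste.deploy's implementation; resolve_section and the dict layout are hypothetical. "set" changes the shared (global) settings seen by anything included below, and "get" copies a shared value into the section's own settings.

```python
def resolve_section(section, inherited):
    """Return (global_conf, local_conf) for one section.

    'set name' entries override the inherited global settings;
    'get name' entries pull a global value in under a local name.
    """
    global_conf = dict(inherited)
    local_conf = {}
    # Apply every 'set' first so 'get' sees the final global values.
    for key, value in section.items():
        if key.startswith("set "):
            global_conf[key[4:].strip()] = value
    for key, value in section.items():
        if key.startswith("get "):
            local_conf[key[4:].strip()] = global_conf[value]
        elif not key.startswith("set "):
            local_conf[key] = value
    return global_conf, local_conf


someapp = {"set master_setting": "foo"}
subapp = {"get some_local_setting": "master_setting"}

shared, someapp_local = resolve_section(someapp, inherited={})
# The included subapp inherits whatever someapp exposed.
_, subapp_local = resolve_section(subapp, inherited=shared)

print(subapp_local)
```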

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From ianb at colorstudy.com  Tue Aug 30 00:57:36 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 29 Aug 2005 17:57:36 -0500
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <4303FEC5.3050408@colorstudy.com>
References: <4303FEC5.3050408@colorstudy.com>
Message-ID: <431392E0.4010001@colorstudy.com>

Ian Bicking wrote:
> Same location:
> 
> http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py

BTW, session stores aren't really what I'm focused on at the moment.  I 
just thought I'd be helpful moving the discussion forward past general 
requirements to something easier to discuss, like an interface.  But I 
doubt I'll be working on this anytime soon, as there's lots of other 
projects that I'm more focused on right now.

So I'd encourage anyone interested in this to start some work on it, 
perhaps using this interface as a starting point.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org