From ianb at colorstudy.com Tue Aug 2 06:28:51 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 01 Aug 2005 23:28:51 -0500 Subject: [Web-SIG] WSGI: Another level of indirection Message-ID: <42EEF683.6040306@colorstudy.com> Maybe a way to handle this configuration is to put in another level of abstraction, sad as that is. I'm thinking configuration files could have something like PEP 263's encodings, except that it would be an indication of who knows how to build the WSGI application from the file. So it might look like: # -*- wsgi-build: paste.wsgi_deploy:DeploymentConfig -*- Which would work with the experimental stuff I mentioned before. It should also work with .ini files, Python source, and probably other configuration file syntaxes. At some point perhaps we'll come up with a standard (aka default) builder, but this could remain useful despite that. It also means I can go forward with this right now and still be future compatible. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 2 06:46:19 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 02 Aug 2005 00:46:19 -0400 Subject: [Web-SIG] WSGI: Another level of indirection In-Reply-To: <42EEF683.6040306@colorstudy.com> Message-ID: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com> At 11:28 PM 8/1/2005 -0500, Ian Bicking wrote: >Maybe a way to handle this configuration is to put in another level of >abstraction, sad as that is. > >I'm thinking configuration files could have something like PEP 263's >encodings, except that it would be an indication of who knows how to >build the WSGI application from the file. So it might look like: > ># -*- wsgi-build: paste.wsgi_deploy:DeploymentConfig -*- > >Which would work with the experimental stuff I mentioned before. It >should also work with .ini files, Python source, and probably other >configuration file syntaxes. 
At some point perhaps we'll come up with a >standard (aka default) builder, but this could remain useful despite >that. Now you're *really* scaring me. Honestly, there's no difference between this proposal and saying that we'll use "#!" lines to operationally determine the format by specifying an interpreter for it. There's really no *abstraction* taking place here. > It also means I can go forward with this right now and still be >future compatible. I can understand the desire, but I think it would be a bad idea to give this any kind of official standing or allow it to warp the process of getting to a workable deployment standard. Better for you to develop your format(s) and try to make them that convinces everyone they're worth standardizing on, knowing that if you fail, your format will be a dead end. :) That should provide you with extra motivation to make it a really good format for the rest of us. ;) I haven't had a chance to have a serious look in detail at your last format proposal, but hope to soon. From ianb at colorstudy.com Tue Aug 2 18:12:29 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 02 Aug 2005 11:12:29 -0500 Subject: [Web-SIG] WSGI: Another level of indirection In-Reply-To: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com> References: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com> Message-ID: <42EF9B6D.1080906@colorstudy.com> Phillip J. Eby wrote: > At 11:28 PM 8/1/2005 -0500, Ian Bicking wrote: > >> Maybe a way to handle this configuration is to put in another level of >> abstraction, sad as that is. >> >> I'm thinking configuration files could have something like PEP 263's >> encodings, except that it would be an indication of who knows how to >> build the WSGI application from the file. So it might look like: >> >> # -*- wsgi-build: paste.wsgi_deploy:DeploymentConfig -*- >> >> Which would work with the experimental stuff I mentioned before. 
>> It should also work with .ini files, Python source, and probably other
>> configuration file syntaxes.  At some point perhaps we'll come up with a
>> standard (aka default) builder, but this could remain useful despite
>> that.
>
> Now you're *really* scaring me.  Honestly, there's no difference between
> this proposal and saying that we'll use "#!" lines to operationally
> determine the format by specifying an interpreter for it.  There's
> really no *abstraction* taking place here.

Well, #! is constrained to executable paths, unlike the comment.  But if it wasn't, sure, this is just like that... but I don't see the problem (or the reason for such shock ;).  Deep down #! is a good feature for executables, and this is just the analog.

The API would go:

    app = load_wsgi_app_from_file('foo.ini')

    import pkg_resources

    def load_wsgi_app_from_file(filename):
        interp_spec = None
        f = open(filename)
        try:
            for line in f:
                if not line.strip():
                    continue
                assert line.startswith('#'), "No interpreter found"
                if '-*-' in line:
                    # the text between the markers is "wsgi-build: <spec>"
                    decl = line.split('-*-')[1].strip()
                    interp_spec = decl.split(':', 1)[1].strip()
                    break
        finally:
            f.close()
        assert interp_spec, "No interpreter found"
        if ' ' in interp_spec:
            # "<distribution> <entry point name>"
            dist, name = interp_spec.split()[:2]
            interp = pkg_resources.load_entry_point(
                dist, 'wsgi.config_interpreter', name)
        else:
            # "<module>:<attribute>"
            interp = pkg_resources.EntryPoint.parse(
                'x=' + interp_spec).load(False)
        return interp(filename)

>> It also means I can go forward with this right now and still be
>> future compatible.
>
> I can understand the desire, but I think it would be a bad idea to give
> this any kind of official standing or allow it to warp the process of
> getting to a workable deployment standard.  Better for you to develop
> your format(s) and try to make them that convinces everyone they're
> worth standardizing on, knowing that if you fail, your format will be a
> dead end.  :)  That should provide you with extra motivation to make it
> a really good format for the rest of us.  ;)

I bring this up because I'm not sure there is a One Best Way for the deployment.
This is also something I can apply to the deployment configuration I already have in Paste (not the experimental stuff, but the configuration files in Paste). I think other legacy systems (and *every* current framework has something like this) can very possibly be handled the same way, requiring only the addition of one comment line to current configurations. This also leaves the possibility of flattening the configuration some without trying to jam incompatible features in. And load_wsgi_app_from_file, under whatever name, is a function that needs to exist in any spec. Standardizing it first doesn't seem that strange to me. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Tue Aug 2 18:16:29 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 02 Aug 2005 11:16:29 -0500 Subject: [Web-SIG] WSGI: Another level of indirection In-Reply-To: <42EF9B6D.1080906@colorstudy.com> References: <5.1.1.6.0.20050802003254.026944e0@mail.telecommunity.com> <42EF9B6D.1080906@colorstudy.com> Message-ID: <42EF9C5D.8090601@colorstudy.com> Ian Bicking wrote: > I think other legacy systems (and > *every* current framework has something like this) can very possibly be > handled the same way, requiring only the addition of one comment line to > current configurations. It occurs to me that # is a comment most places, but not in XML files, so some alternate way of annotating XML files is also necessary. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From jjinux at gmail.com Thu Aug 4 02:02:42 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Wed, 3 Aug 2005 17:02:42 -0700 Subject: [Web-SIG] Fwd: ANN: cssutils 0.8a2 (alpha release) In-Reply-To: <42ECF61C.808@t-online.de> References: <42ECF61C.808@t-online.de> Message-ID: Hmm, I thought we *didn't* have a way to parse CSS. I guess that's no longer true. 
-jj ---------- Forwarded message ---------- From: Christof Date: Jul 31, 2005 9:02 AM Subject: ANN: cssutils 0.8a2 (alpha release) To: python-announce-list at python.org what is it ---------- A Python package to parse and build CSS Cascading Style Sheets. Partly implements the DOM Level 2 Stylesheets and DOM Level 2 CSS interfaces. The implementation uses some Python standard features like standard lists for classes like css.CSSRuleList and is hopefully a bit easier to use. changes since the last release ------------------------------ **MAJOR API CHANGE** reflecting DOM Level 2 Stylesheets and DOM Level 2 CSS see http://cthedot.de/cssutils/ for a complete list of changes, examples, etc. license ------- cssutils is published under the LGPL. download -------- download cssutils 0.8a2 alpha - 050731 from http://cthedot.de/cssutils/ This is an alpha release so use at your own risk! Some parts will not work as expected... Any bug report is welcome. cssutils needs * Python 2.3 (tested with Python 2.4.1 on Windows XP only) * maybe PyXML (tested with PyXML 0.8.4 installed) any comment will be appreciated, thanks christof hoeke

cssutils 0.8a2 - cssutils - CSS Cascading Style Sheets library for Python (31-Jul-05)

--
http://mail.python.org/mailman/listinfo/python-announce-list
Support the Python Software Foundation: http://www.python.org/psf/donations.html

--
I have decided to switch to Gmail, but messages to my Yahoo account will still get through.

From james at pythonweb.org Sun Aug 7 18:23:59 2005
From: james at pythonweb.org (James Gardner)
Date: Sun, 07 Aug 2005 17:23:59 +0100
Subject: [Web-SIG] WSGI deployment config
In-Reply-To: <42EA57D2.1060902@colorstudy.com>
References: <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com>
Message-ID: <42F6359F.9000504@pythonweb.org>

Hi All,

Have there been any more developments off-list about the format of the config file for WSGI deployment?

I'd like to apply the entry points labeling idea to sections in the pipeline config file and propose the following extensions to the format. Here is an example to start with:

    [database: connection from database == 0.6.0]
    host = 'mysqldb.pythonweb.org'
    user = 'foo'
    password = 'bar'

    [connection from database == 0.6.0]
    extra-non-standard-params = 'params'

    [application: testApplication from web == 0.6.0]
    message = 'Hello World!'

Each section represents the configuration of one piece of middleware as before. Standard configuration sections are labeled and non-standard extensions to standard sections use the same deployment string but with no label so in the example extra-non-standard-params = 'params' is considered a non-standard extension to database configuration.

This has three advantages:

1. Standardisation between similar WSGI middleware components becomes easier because we could all agree to name standard database connection parameters as database so middleware can be more interoperable. Non-standard extensions can be named in a similar config section but without the database label so that we define an extensible base standard.

2. Configuration can be accessed in code by name eg config.get('database') or config.getAll('database') to get custom extensions too. This means that whatever version of a package you are using you can still refer to the correct configuration easily and also use the configuration file in external scripts eg. to setup necessary database tables etc without creating the full middleware chain.

3. It allows us to create a configuration hierarchy. I've written a WSGI framework named Bricks http://www.pythonweb.org/bricks/ and the way it works is to have a global config file for all applications at a site and then a local config file if the application needs to override global settings or provide extra middleware. The logic behind this is that things like database connections are likely to be used by all applications across a site and a new application you have installed from a third party is not going to have the correct database settings so you would want to use the settings defined in the global config file. Using the new config file format we could simply say that if a global configuration does not already have a named config section which appears in a local config file then the local configuration is added below the last piece of global configuration that matched (or at the end if no matches were found).

We can also define an extension to this basic format: always include and always exclude determined by a + or - sign just before the entry point name so that we can also override global settings in a local config file and provide a flexible configuration chain of as many config files as we liked.

Here is an example to illustrate. We have an application which doesn't need authorisation middleware but does need a session store. It needs a database connection but is only capable of interacting with the one it specifies, it also needs some configuration of its own. It is installed on a site with other applications which use database, session and auth middleware.
The site administrator wants all applications to use GZip encoded output.

global.wsgi:

    [gzip: gzip from compression==0.1.0]

    [database: connection from database== 0.6.0]
    adapter = 'mysql'
    database = 'test'
    user = 'foo'
    password = 'bar'

    [auth: auth from database==0.6.0]
    extra-non-standard-params = 'params'

    [session: session from web==0.6.0]
    params = 'interesting'

local.wsgi:

    # Override any other database definitions
    [+database: connection from database==0.6.0]
    adapter = 'engine'
    database = 'default'

    # Define a session configuration to be used if no other is available
    [session: session from web==0.6.0]
    params = 'default'

    [application: appSettings from app==0.1.0]
    name = 'foo'

    # The user installing the application wants to specifically
    # exclude auth middleware since they know it isn't needed
    [-auth]

If the global configuration file wasn't present, the application configuration would look like this:

    [database: connection from database==0.6.0]
    adapter = 'engine'
    database = 'default'

    [session: session from web==0.6.0]
    params = 'default'

    [application: appSettings from app==0.1.0]
    name = 'foo'

But when it is installed on a site with the global config file it looks like this:

    [gzip: gzip from compression==0.1.0]

    [database: connection from database==0.6.0]
    adapter = 'engine'
    database = 'default'

    [session: session from web==0.6.0]
    params = 'interesting'

    [application: appSettings from app==0.1.0]
    name = 'foo'

So as you can see this allows a flexible deployment hierarchy. What do you think?

On a broader point I'd like to see pipeline configuration (what we are talking about here) quite separate from the actual deployment details such as where the application is going to be installed on a URL. I don't think there is any need to standardise the latter as long as all frameworks are capable of using the basic pipeline and deploying it in a way they see fit, otherwise you start making the main application configuration too framework-specific.
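
The merge rules described in this message (a local section is added after the last matching global section, "+" always overrides, "-" always excludes) can be sketched roughly as below. This is only an illustration, not code from Bricks: the function name merge_sections and the representation of a section as a (label, params) pair are assumptions, and the "from package==version" part of each header is ignored for brevity.

```python
def merge_sections(global_sections, local_sections):
    """Merge two ordered lists of (label, params) pairs.

    Hypothetical helper: labels in local_sections may start with
    "+" (always include) or "-" (always exclude).
    """
    merged = list(global_sections)
    last_match = None  # label of the most recently matched or added section

    def index_of(label):
        for i, (name, _) in enumerate(merged):
            if name == label:
                return i
        return None

    for raw_label, params in local_sections:
        if raw_label.startswith("-"):
            # Always exclude: drop the section if the global file defined it.
            pos = index_of(raw_label[1:])
            if pos is not None:
                del merged[pos]
            continue

        force = raw_label.startswith("+")
        label = raw_label[1:] if force else raw_label
        pos = index_of(label)
        if pos is not None:
            # Already present: the global version wins unless "+" was given.
            if force:
                merged[pos] = (label, params)
            last_match = label
            continue

        # Not present: add it below the last match, or at the end.
        anchor = index_of(last_match) if last_match is not None else None
        insert_at = len(merged) if anchor is None else anchor + 1
        merged.insert(insert_at, (label, params))
        last_match = label

    return merged


global_wsgi = [
    ("gzip", {}),
    ("database", {"adapter": "mysql", "database": "test"}),
    ("auth", {"extra-non-standard-params": "params"}),
    ("session", {"params": "interesting"}),
]
local_wsgi = [
    ("+database", {"adapter": "engine", "database": "default"}),
    ("session", {"params": "default"}),
    ("application", {"name": "foo"}),
    ("-auth", {}),
]

with_global = merge_sections(global_wsgi, local_wsgi)
standalone = merge_sections([], local_wsgi)
```

Under these assumptions, with_global and standalone contain the same sections, in the same order, as the two merged listings shown above.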
James From pje at telecommunity.com Sun Aug 7 19:16:53 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 07 Aug 2005 13:16:53 -0400 Subject: [Web-SIG] WSGI deployment config In-Reply-To: <42F6359F.9000504@pythonweb.org> References: <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com> Message-ID: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> At 05:23 PM 8/7/2005 +0100, James Gardner wrote: >This has three advantages: > >1. Standardisation between similar WSGI middleware components becomes >easier because we could all agree to name standard database connection >parameters as database so middleware can be more interoperable. >Non-standard extensions can be named in a similar config section but >without the database label so that we define an extensible base standard. The point isn't to have a standardized format or globally accessible configuration, it's to *hide* configuration so that other objects don't have to know about it. >2. Configuration can be accessed in code by name eg config.get('database') >or config.getAll('database') to get custom extensions too. This means that >whatever version of a package you are using you can still refer to the >correct configuration easily and also use the configuration file in >external scripts eg. to setup necessary database tables etc without >creating the full middleware chain. That doesn't require access to the data as data; the database should just be a service. An example of why: PEAK uses database connection URLs like "postgres://foo:bar at example.com/dbname" to designate databases, so it would be a step back to force PEAK users to use your user/password/etc. configuration scheme in order to be able to interoperate. It makes more sense, therefore, to have configuration be private to components unless those components *want* to share that configuration. However, the way they share it might be different than the input. 
For example, PEAK has a URL connection class that has user/password/etc. attributes on it, so it could certainly implement an interface to provide that information to components that want it. But that doesn't mean that the *source* configuration was done that way. Preserving a separation between interface and implementation is vital to the maintainability of the overall system. >3. It allows us to create a configuration hierarchy. I've written a WSGI >framework named Bricks http://www.pythonweb.org/bricks/ and the way it >works is to have a global config file for all applications at a site and >then a local config file if the application needs to override global >settings or provide extra middleware. The logic behind this is that things >like database connections are likely to be used by all applications across >a site and a new application you have installed from a third party is not >going to have the correct database settings so you would want to use the >settings defined in the global config file. Using the new config file >format we could simply say that if a global configuration does not already >have a named config section which appears in a local config file then the >local configuration is added below the last piece of global configuration >that matched (or at the end if no matches were found). I'm -1 on exposing the data as direct configuration. It should be opaque, and accessed as *services*. Otherwise you're just reinventing the worst problems of Zope 2-era design. We probably *do* need a way to declare services (like your database example), and a service discovery API. We *don't* want to make deployment data into directly-accessible configuration. This doesn't mean you can't create service objects whose whole job is to provide configuration data in some way, it just means that the deployment parameters themselves should be opaque. 
The reason for this is that without encapsulation, you get spaghetti dependencies, and it becomes difficult to change things programmatically if you have no way to influence data dynamically. This was a really big problem in older versions of Zope 2, that encouraged acquisition of random configuration properties. There's really no point in us repeating that mistake.

Here's what I'd suggest as an alternative, using a slight syntax tweak:

    [sql service from somedbpackage]
    conn = "some://url"  # or you can do it the awkward way instead
    # ... etc.

So "service" or "service from" are the keywords to define a service. For "service from", the first part is looked up in a wsgi.service_factories entry point group. For "service", it's just imported. Either way, the factory is invoked with the previous service provider to create a kind of "service chain". The current head of the service chain is passed into middleware and application factories as the first parameter, so they can use it to find services. We then define a simple API for walking the service chain and locating services by name or other keys.

This approach is capable of doing everything you've proposed, except that it doesn't provide access to the private configuration data of individual services. It would be possible, however, to load the service chain from a deployment file without instantiating applications or middleware, in order to e.g. run utility programs. You can still include arbitrary configuration if you want, just by creating a service whose job is to provide such information.

The only other piece I think we're missing is a way to handle branching, because our pipeline configuration is quite linear. There's no obvious way to branch at the moment, except by having a way to configure a middleware component to refer to other pipelines.
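
To make the "service chain" idea above a little more concrete, here is a minimal sketch of loading only the services from an already-parsed deployment spec, as a utility program might. Nothing here is a proposed API: SqlService, find_service, load_services and the (kind, factory, params) tuple layout are all made up for illustration, and the entry point lookup in wsgi.service_factories is left out.

```python
class SqlService:
    """Hypothetical service; its connection URL stays private to it."""

    name = "sql"

    def __init__(self, previous, conn):
        self.previous = previous  # the service defined before this one
        self._conn = conn

    def describe(self):
        # Expose behaviour rather than the raw deployment parameter.
        return "sql service (%s backend)" % self._conn.split(":", 1)[0]


def find_service(head, name):
    """Walk from the most recently defined service back to the first."""
    service = head
    while service is not None:
        if service.name == name:
            return service
        service = service.previous
    raise LookupError(name)


def load_services(spec):
    """Build only the service chain, skipping applications and middleware."""
    head = None
    for kind, factory, params in spec:
        if kind == "service":
            # Each factory receives the previous head of the chain.
            head = factory(head, **params)
    return head


def never_called(*args, **kwargs):
    raise AssertionError("load_services does not build applications")


spec = [
    ("service", SqlService, {"conn": "some://url"}),
    ("application", never_called, {"message": "Hello World!"}),
]
head = load_services(spec)
description = find_service(head, "sql").describe()
```

A utility script gets at the sql service this way without the application factory ever being invoked, and without seeing the conn parameter itself.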
From james at pythonweb.org Sun Aug 7 20:33:53 2005 From: james at pythonweb.org (James Gardner) Date: Sun, 07 Aug 2005 19:33:53 +0100 Subject: [Web-SIG] WSGI deployment config In-Reply-To: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> References: <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com> <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> Message-ID: <42F65411.4040901@pythonweb.org> Phillip J. Eby wrote: > This approach is capable of doing everything you've proposed, except > that it doesn't provide access to the private configuration data of > individual services. It would be possible, however, to load the > service chain from a deployment file without instantiating > applications or middleware, in order to e.g. run utility programs. > You can still include arbitrary configuration if you want, just by > creating a service whose job is to provide such information. OK, fair point and I'm perfectly happy with this. > The point isn't to have a standardized format or globally accessible > configuration, it's to *hide* configuration so that other objects > don't have to know about it. What exactly are you defining as a service then? A service would have to have some way of providing its useful code to utilities etc as well as deploying middleware. In the original model each WSGI middleware component might rely on other ones, both the middleware component and middleware it relies on might need configuration. We can describe the whole middleware chain in a config file so that it can all be configured at once. I might be missing the point but is your idea of services that by passing them the service chain they have an opportunity to decide what services to load based on services they rely on and thereby bypass some of the configuration? Surely almost all middleware would need at least some configuration so you are unlikely to make the config file much shorter? 
> The only other piece I think we're missing is a way to handle > branching, because our pipeline configuration is quite linear. > There's no obvious way to branch at the moment, except by having a way > to configure a middleware component to refer to other pipelines. I don't think I've quite caught your full vision here. Using the services idea my understanding is just that an application needs certain services to function and also certain configuration for those services before it can run, since many applications on the same site may need the same services configured in the same way it is useful to be able to share configuration and to do that it is helpful for a local application to inherit configuration from another source of components, possibly in the way I suggested. I don't think branching really fits into that model so how are you envisaging deployments? Cheers, James From pje at telecommunity.com Sun Aug 7 21:45:07 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 07 Aug 2005 15:45:07 -0400 Subject: [Web-SIG] WSGI deployment config In-Reply-To: <42F65411.4040901@pythonweb.org> References: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com> <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com> At 07:33 PM 8/7/2005 +0100, James Gardner wrote: >Phillip J. Eby wrote: > >>This approach is capable of doing everything you've proposed, except that >>it doesn't provide access to the private configuration data of individual >>services. It would be possible, however, to load the service chain from >>a deployment file without instantiating applications or middleware, in >>order to e.g. run utility programs. >>You can still include arbitrary configuration if you want, just by >>creating a service whose job is to provide such information. 
> >OK, fair point and I'm perfectly happy with this. > >>The point isn't to have a standardized format or globally accessible >>configuration, it's to *hide* configuration so that other objects don't >>have to know about it. > >What exactly are you defining as a service then? A component that needs to be available to one or more other components, based on some lookup key (like a name or an interface). > A service would have to have some way of providing its useful code to > utilities etc as well as deploying middleware. I think maybe you're confusing something here. I'm suggesting that there be a chain of service providers, and that the WSGI API to load a pipeline should return both a top-down middleware-to-app chain, and a bottom-up service-to-service chain. Thus, a utility program could load a WSGI file and gain access to the service chain, ignoring the middleware. But, I'm not saying that services are *part* of the middleware chain; middleware components get created with access to the middleware chain, but the services themselves are not middleware. > In the original model each WSGI middleware component might rely on other > ones, both the middleware component and middleware it relies on might > need configuration. We can describe the whole middleware chain in a > config file so that it can all be configured at once. I might be missing > the point but is your idea of services that by passing them the service > chain they have an opportunity to decide what services to load based on > services they rely on and thereby bypass some of the configuration? I don't understand you. They just get what services they need from the chain. They don't "bypass" configuration they never cared about in the first place. > Surely almost all middleware would need at least some configuration so > you are unlikely to make the config file much shorter? But their configuration is in the parameters that get passed to their factory, e.g. [fooware from blah] something1 = "feh" # etc. 
>>The only other piece I think we're missing is a way to handle branching,
>>because our pipeline configuration is quite linear.
>>There's no obvious way to branch at the moment, except by having a way to
>>configure a middleware component to refer to other pipelines.
>
>I don't think I've quite caught your full vision here. Using the services
>idea my understanding is just that an application needs certain services
>to function and also certain configuration for those services before it
>can run, since many applications on the same site may need the same
>services configured in the same way it is useful to be able to share
>configuration and to do that

In which case, there should be a mechanism for configuring things based on other service lookups, e.g.

    [spazware from spiz]
    fidgety = lookup("fizzit.ping")

if we allowed 'lookup()' to mean, "search the service chain above me for a configuration service and return the value of 'fizzit.ping'".

My point here isn't to propose that this be the API, I'm just presenting a general concept. "Wiring" of configuration by simply acquiring values from a global namespace doesn't work well even for applications developed entirely by a single person; it definitely doesn't scale to plug-and-play of components developed by an entire community.

>it is helpful for a local application to inherit configuration from
>another source of components, possibly in the way I suggested. I don't
>think branching really fits into that model so how are you envisaging
>deployments?

The branching was for saying things like "/foo goes to pipeline A, and /bar goes to pipeline B".

It's becoming clear to me, though, that we need to *ban* the word "configuration" from this discussion, because it's way too overloaded, and everybody brings unique baggage to it. If we don't use that word, we'll have to actually explain what we really mean. :)

So, in that spirit, I will now rephrase my proposal so as not to use the word "configuration".

A "pipeline spec" describes how to deploy a WSGI application, optionally with middleware filters and services, by providing parameters to designated factories. There are three kinds of factories: application, middleware, and service. All three kinds are invoked with the parameters defined in the spec and the most-recently specified service object. Middleware factories also receive the *next* middleware or application component defined below them in the spec.

An example middleware factory signature:

    def make_middleware(last_service, application_to_wrap, **params):

Example application and service factory signatures:

    def make_app(last_service, **params):

    def make_service(last_service, **params):

Just as the middleware-to-application links form a "downward" chain of responsibility for handling WSGI requests, the service-to-service links form an "upward" chain of responsibility for acquiring service components. There needs to be a specification for how to search the chain; for example we could have a 'get_service(key)' method required on service components, and if the service doesn't recognize the key it just calls 'last_service.get_service()'.

In some circumstances, the same value or object is needed as a factory parameter for more than one component. In these cases, it would be useful to be able to have a way to specify shared parameters in the specification. Ordinarily, these shared parameters will be defined at some "high level" of the overall system, such as in a server-wide pipeline spec, and then acquired in "low level" pipeline specs for specific areas of the server or individual application components.

We can thus envision a "shared parameter service" interface for publishing values that need to be used often, and an API in the pipeline spec to indicate that a parameter should be retrieved from the nearest shared parameter service that offers a value for a given name.
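
As a rough illustration of the three factory signatures and the shared parameter idea, here is one way they might fit together. Only the factory signatures and the get_service(key) delegation come from the text above; EmptyChain, SharedParameters, the Shared marker, resolve(), the ("param", name) key convention and the example e-mail address are all assumptions made for this sketch.

```python
class EmptyChain:
    """Hypothetical end of the service chain: it recognizes no keys."""

    def get_service(self, key):
        raise LookupError(key)


class SharedParameters:
    """Hypothetical shared parameter service."""

    def __init__(self, last_service, **values):
        self.last_service = last_service
        self.values = values

    def get_service(self, key):
        # Keys of the form ("param", name) are an assumption of this sketch.
        if isinstance(key, tuple) and key[0] == "param" and key[1] in self.values:
            return self.values[key[1]]
        # Unrecognized key: pass the request up the chain.
        return self.last_service.get_service(key)


class Shared:
    """Spec-level marker: fetch this parameter from the service chain."""

    def __init__(self, name):
        self.name = name


def resolve(last_service, params):
    """Replace Shared markers with values found via the service chain."""
    return {
        key: last_service.get_service(("param", value.name))
        if isinstance(value, Shared) else value
        for key, value in params.items()
    }


def make_service(last_service, **params):
    return SharedParameters(last_service, **params)


def make_app(last_service, **params):
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [params["message"].encode("utf-8")]
    return app


def make_middleware(last_service, application_to_wrap, **params):
    def middleware(environ, start_response):
        # Only records the address; a real filter would act on errors.
        environ["example.email_errors_to"] = params["email_errors_to"]
        return application_to_wrap(environ, start_response)
    return middleware


# A server-wide spec publishes the value; a lower-level spec adds its own.
server = make_service(EmptyChain(), sysadmin_email="admin@example.com")
local = make_service(server, theme="plain")

app = make_app(local, message="Hello World!")
pipeline = make_middleware(
    local, app, **resolve(local, {"email_errors_to": Shared("sysadmin_email")})
)
```

The middleware never learns that its email_errors_to value came from a parameter called sysadmin_email; the mapping lives in the spec, and the lookup skips the nearer service because it offers no value under that name.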
This approach is superior to using a common namespace for parameters, because the level of abstraction at which shared parameters are defined is more likely to be concepts like "system administrator e-mail", but that value might then be used for more specific component parameters like "email errors to" and "administrator login ID". So, being able to say that the "email_errors_to" parameter for a given component should be looked up from "sysadmin_email" in the shared parameter service allows for parameters to be cleanly shared between components. From ianb at colorstudy.com Mon Aug 8 19:47:56 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 08 Aug 2005 12:47:56 -0500 Subject: [Web-SIG] WSGI deployment config In-Reply-To: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com> References: <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com> <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com> Message-ID: <42F79ACC.7080706@colorstudy.com> OK, this is starting to become a bit more clear to me... Phillip J. Eby wrote: >> A service would have to have some way of providing its useful code to >>utilities etc as well as deploying middleware. > > > I think maybe you're confusing something here. I'm suggesting that there > be a chain of service providers, and that the WSGI API to load a pipeline > should return both a top-down middleware-to-app chain, and a bottom-up > service-to-service chain. Thus, a utility program could load a WSGI file > and gain access to the service chain, ignoring the middleware. > > But, I'm not saying that services are *part* of the middleware chain; > middleware components get created with access to the middleware chain, but > the services themselves are not middleware. 
So, thinking back to the transaction middleware I speculated about: http://blog.ianbicking.org/more-perfect-app-server-wsgi-transactions.html In your model with services, I think you are suggesting some middleware like this will still exist. In fact, it would look very close to the way it looks in that example, except that instead of putting the Manager in the WSGI environment, some service would create the manager, and both the middleware and a transaction-user would use this service to get the manager. (In case it creates confusion, I think Zope uses a different term for the manager; maybe it is simply a "transaction", I can't remember now) So for many services some middleware would still be necessary, if the service was able to do anything to the request. That middleware would be mostly a shell. That's fine with me -- that's how I'm writing most of my middleware anyway, except that the "service" part is relatively ad hoc, and if you use it outside of the web environment you have to wire up the configuration on your own. Which isn't what I want either. If you *don't* want a middleware for every request/response-modifying service, then you'd need some uber-middleware like I mentioned back in http://mail.python.org/pipermail/web-sig/2005-July/001532.html -- in addition to saving some frames in the call stack, that would probably make pipeline specification easier. But maybe not a whole lot easier, as there's usually additional details (like ordering) that are necessary to specify in the context of a web request. >>it is helpful for a local application to inherit configuration from >>another source of components, possibly in the way I suggested. I don't >>think branching really fits into that model so how are you envisaging >>deployments? > > > The branching was for saying things like "/foo goes to pipeline A, and /bar > goes to pipeline B". 
The spec I gave in "WSGI deployment: an experiment" (http://mail.python.org/pipermail/web-sig/2005-July/001598.html) handles arbitrary kinds of branching, basically by naming both applications and middleware filters, and allowing application factories to call back into the configuration file. So pipeline is just another application factory, just like urlmap or other kinds of branching. Maybe this could be handled with an application-building service since we're passing services around anyway. > A "pipeline spec" describes how to deploy a WSGI application, optionally > with middleware filters and services, by providing parameters to designated > factories. There are three kinds of factories: application, middleware, > and service. All three kinds are invoked with the parameters defined in > the spec and the most-recently specified service object. Middleware > factories also receive the *next* middleware or application component > defined below them in the spec. > > An example middleware factory signature: > > def make_middleware(last_service, application_to_wrap, **params): It might add to the consistency if make_middleware takes the same parameters as the other two factories, except it builds "middleware" (or "middleware filters" to make the term less enterprisy) which are functions that, when passed in an application, return an application that wraps that application. Though I would not object to a method instead of just calling the factory; I think we risk a maze of function calls, all looking the same. Then the higher-level operation is "build something of type foo", where foo is a WSGI application, a WSGI middleware filter, a service, or something else. 
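The alternative suggested just above (a middleware factory with the same parameters as the other factories, returning a filter that is then applied to an application) might look roughly like this. The names make_filter and build_pipeline, and the header parameters, are hypothetical and not taken from Paste or from any proposed spec.

```python
def make_filter(last_service, **params):
    """Hypothetical filter factory: same signature as make_app/make_service."""
    header = (params.get("header_name", "X-Example"),
              params.get("header_value", "1"))

    def filter_(application_to_wrap):
        # The filter takes an application and returns a wrapping application.
        def wrapped(environ, start_response):
            def start_with_header(status, headers, exc_info=None):
                return start_response(status, list(headers) + [header], exc_info)
            return application_to_wrap(environ, start_with_header)
        return wrapped

    return filter_


def build_pipeline(last_service, filter_specs, app):
    # filter_specs is a list of (factory, params) pairs, outermost first.
    for factory, params in reversed(filter_specs):
        app = factory(last_service, **params)(app)
    return app
```

The extra call makes the two steps explicit: first "build something of type filter", then apply it to whatever application sits below it.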
> Example application and service factory signatures: > > def make_app(last_service, **params): > > def make_service(last_service, **params): > > Just as the middleware-to-application links form a "downward" chain of > responsibility for handling WSGI requests, the service-to-service links > form an "upward" chain of responsibility for acquiring service > components. There needs to be a specification for how to search the chain; > for example we could have a 'get_service(key)' method required on service > components, and if the service doesn't recognize the key it just calls > 'last_service.get_service()'. > > In some circumstances, the same value or object is needed as a factory > parameter for more than one component. In these cases, it would be useful > to be able to have a way to specify shared parameters in the > specification. Ordinarily, these shared parameters will be defined at some > "high level" of the overall system, such as in a server-wide pipeline spec, > and then acquired in "low level" pipeline specs for specific areas of the > server or individual application components. We can thus envision a > "shared parameter service" interface for publishing values that need to be > used often, and an API in the pipeline spec to indicate that a parameter > should be retrieved from the nearest shared parameter service that offers a > value for a given name. > > This approach is superior to using a common namespace for parameters, > because the level of abstraction at which shared parameters are defined is > more likely to be concepts like "system administrator e-mail", but that > value might then be used for more specific component parameters like "email > errors to" and "administrator login ID". So, being able to say that the > "email_errors_to" parameter for a given component should be looked up from > "sysadmin_email" in the shared parameter service allows for parameters to > be cleanly shared between components. 
I think this would address some of the configuration concerns I've had. I don't mind being very explicit in my code about how configuration is acquired; I just don't want to push that work onto the person doing the configuration, and I want sensible (and possibly derivative) defaults.

While Zope 2 gets hairy in its use of Acquisition -- essentially adding dynamic scoping to the core of the system -- the basic technique is not necessarily incorrect. Lisps get by okay with dynamic scopes, but they clearly mark variables as being so typed (like *current-output-stream*). If we add dynamic-scope-like-functionality, we just need to make sure it's clear where it's being used, and that it's not the default so it isn't used when not necessary.

-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From pje at telecommunity.com Tue Aug 9 03:20:33 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 08 Aug 2005 21:20:33 -0400 Subject: [Web-SIG] WSGI deployment config In-Reply-To: <42F79ACC.7080706@colorstudy.com> References: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com> <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com> <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050808151235.025bf4b8@mail.telecommunity.com>

At 12:47 PM 8/8/2005 -0500, Ian Bicking wrote:
>OK, this is starting to become a bit more clear to me...

Cool. :) Sometimes the best way to get rid of confusing communication is to make the communication even more difficult. :) (e.g., banning the word "configuration")

>In your model with services, I think you are suggesting some middleware
>like this will still exist.
In fact, it would look very close to the way >it looks in that example, except that instead of putting the Manager in >the WSGI environment, some service would create the manager, and both the >middleware and a transaction-user would use this service to get the manager. Yes. >So for many services some middleware would still be necessary, if the >service was able to do anything to the request. Well yeah, if you want to wrap an app rather than just use the service. > That middleware would be mostly a shell. That's fine with me -- that's > how I'm writing most of my middleware anyway, except that the "service" > part is relatively ad hoc, and if you use it outside of the web > environment you have to wire up the configuration on your own. Which > isn't what I want either. Note that you can use pipeline specs to configure arbitrary service chains, without WSGI even being involved. So, to a certain extent, services can stand on their own. What's interesting about that (to me anyway) is that if there are bridges that allow PEAK or Zope services to be used as WSGI services, then pipelines can be used to bridge various frameworks' service systems - without a web application in sight. >If you *don't* want a middleware for every request/response-modifying >service, then you'd need some uber-middleware like I mentioned back in >http://mail.python.org/pipermail/web-sig/2005-July/001532.html -- in >addition to saving some frames in the call stack, that would probably make >pipeline specification easier. But maybe not a whole lot easier, as >there's usually additional details (like ordering) that are necessary to >specify in the context of a web request. Well, to me, the "uber middleware" is just an object with a generic function for its __call__ method, that has "before", "after", and "around" methods registered to do stuff like transaction wrapping, error handling, and any other sort of middleware-ish things. 
So, it's not very "uber" in implementation complexity from my POV to have such a thing, and it takes care of many of the stacking issues. >The spec I gave in "WSGI deployment: an experiment" >(http://mail.python.org/pipermail/web-sig/2005-July/001598.html) handles >arbitrary kinds of branching, basically by naming both applications and >middleware filters, and allowing application factories to call back into >the configuration file. So pipeline is just another application factory, >just like urlmap or other kinds of branching. > >Maybe this could be handled with an application-building service since >we're passing services around anyway. Hm. An interesting point. I haven't yet seen a branching/alternatives syntax I like though. The big problem IMO is that a branching mechanism requires nesting ability, whereas pipelines are "flat and happy". :) Unfortunately, .ini syntax rapidly breaks down when nesting begins, which makes me tend to think that we should have a separate "site map" file that maps locations and other rules to groups of pipelines. >>A "pipeline spec" describes how to deploy a WSGI application, optionally >>with middleware filters and services, by providing parameters to >>designated factories. There are three kinds of factories: application, >>middleware, and service. All three kinds are invoked with the parameters >>defined in the spec and the most-recently specified service >>object. Middleware factories also receive the *next* middleware or >>application component defined below them in the spec. >>An example middleware factory signature: >> def make_middleware(last_service, application_to_wrap, **params): > >It might add to the consistency if make_middleware takes the same >parameters as the other two factories, except it builds "middleware" (or >"middleware filters" to make the term less enterprisy) which are functions >that, when passed in an application, return an application that wraps that >application. 
Though I would not object to a method instead of just >calling the factory; I think we risk a maze of function calls, all looking >the same. I don't see a problem with the signature being different, to be honest. Making it the same implies a similarity that doesn't exist. If we were to change for consistency's sake, we should instead change the application factory signature to match that of middleware, and use 'None' for the 'application_to_wrap' in that case. Applications and middleware are more alike than either of them are like services. >I think this would address some of the configuration concerns I've had. I >don't mind being very explicit in my code about how configuration is >acquired; I just don't want to push that work onto the person doing the >configuration, and I want sensible (and possibly derivative) defaults. You can do that, sure. >While Zope 2 gets hairy in its use of Acquisition -- essentially adding >dynamic scoping to the core of the system -- the basic technique is not >necessary correct. Lisps get by okay with dynamic scopes, but they >clearly mark variables as being so typed (like >*current-output-stream*). If we add dynamic-scope-like-functionality, we >just need to make sure it's clear where it's being used, and that it's not >the default so it isn't used when not necessary. Right - explicit indirection or redirection avoids a lot of problems here. 
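For concreteness, the "uber middleware" idea from earlier in this message could be approximated as below. This is only a sketch under assumptions: it uses plain lists of hooks rather than a real generic function, and the class name HookMiddleware and its attributes are invented for the example.

```python
class HookMiddleware:
    """Hypothetical single middleware with registered hooks.

    Uses plain hook lists, not generic-function dispatch.
    """

    def __init__(self, application):
        self.application = application
        self.before = []  # callables taking environ
        self.after = []   # callables taking (environ, exc); exc is None on success
        self.around = []  # callables taking an app and returning a wrapping app

    def __call__(self, environ, start_response):
        app = self.application
        for wrap in reversed(self.around):
            app = wrap(app)  # the first registered wrapper ends up outermost
        for hook in self.before:
            hook(environ)
        try:
            # Note: "after" hooks run once the app has returned its iterable,
            # not once the response body has been fully consumed.
            result = app(environ, start_response)
        except Exception as exc:
            for hook in reversed(self.after):
                hook(environ, exc)
            raise
        for hook in reversed(self.after):
            hook(environ, None)
        return result
```

A transaction wrapper would then register a "before" hook that begins the transaction and an "after" hook that commits when exc is None and aborts otherwise; ordering between hooks is just registration order.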
From ianb at colorstudy.com Tue Aug 9 05:17:45 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 08 Aug 2005 22:17:45 -0500 Subject: [Web-SIG] WSGI deployment config In-Reply-To: <5.1.1.6.0.20050808151235.025bf4b8@mail.telecommunity.com> References: <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com> <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> <42EA57D2.1060902@colorstudy.com> <42E95EC4.9040906@colorstudy.com> <42EA57D2.1060902@colorstudy.com> <5.1.1.6.0.20050807124830.026a44c0@mail.telecommunity.com> <5.1.1.6.0.20050807145244.0269bd38@mail.telecommunity.com> <5.1.1.6.0.20050808151235.025bf4b8@mail.telecommunity.com> Message-ID: <42F82059.2060402@colorstudy.com> Phillip J. Eby wrote: > I imAt 12:47 PM 8/8/2005 -0500, Ian Bicking wrote: > >> OK, this is starting to become a bit more clear to me... > > > Cool. :) Sometimes the best way to get rid of confusing communication > is to make the communication even more difficult. :) (e.g., banning > the word "configuration") Well, it wasn't really that per se; after reading through the latest thread between you and James it became a bit clearer what you meant by services. You've been a bit vague about services up until now (and I'm not familiar with Zope or PEAK services, so I've just been guessing at what you've meant). >> So for many services some middleware would still be necessary, if the >> service was able to do anything to the request. > > > Well yeah, if you want to wrap an app rather than just use the service. I'm thinking of any service that needs to modify the request and response, or watch the request in some way (e.g., a transaction service that needs to watch for unexpected exceptions, or a session service that needs to add a cookie to the response when starting a new session). Services certainly have a much larger scope than that, but then most of that larger scope is workable as mere "libraries" (except for the configuration problem, which services do address). 
>> If you *don't* want a middleware for every request/response-modifying >> service, then you'd need some uber-middleware like I mentioned back in >> http://mail.python.org/pipermail/web-sig/2005-July/001532.html -- in >> addition to saving some frames in the call stack, that would probably >> make pipeline specification easier. But maybe not a whole lot easier, >> as there's usually additional details (like ordering) that are >> necessary to specify in the context of a web request. > > > Well, to me, the "uber middleware" is just an object with a generic > function for its __call__ method, that has "before", "after", and > "around" methods registered to do stuff like transaction wrapping, error > handling, and any other sort of middleware-ish things. So, it's not > very "uber" in implementation complexity from my POV to have such a > thing, and it takes care of many of the stacking issues. Uber in that it doesn't have any specific purpose, and really leads into the direction of framework instead of library. All the middleware to date are targetted at providing one bit of functionality; there's a certain clarity to that. A single more powerful middleware is interesting; but it's also harder to imagine it being complete. >> The spec I gave in "WSGI deployment: an experiment" >> (http://mail.python.org/pipermail/web-sig/2005-July/001598.html) >> handles arbitrary kinds of branching, basically by naming both >> applications and middleware filters, and allowing application >> factories to call back into the configuration file. So pipeline is >> just another application factory, just like urlmap or other kinds of >> branching. >> >> Maybe this could be handled with an application-building service since >> we're passing services around anyway. > > > Hm. An interesting point. I haven't yet seen a branching/alternatives > syntax I like though. The big problem IMO is that a branching mechanism > requires nesting ability, whereas pipelines are "flat and happy". 
:)

I find that the use of named applications/filters makes the nesting reasonable. I'm happy enough with the syntax I propose. But then, I also think that there's still something to the idea of another layer of indirection, and configuration files that self-identify. I really see no reason to think we can fully identify the Right Way to configure applications (including all meanings of "configure") here and now. I'm happy with One Good Way, and future extensibility. So I still think "# -*- config.loader:ref -*-" is a good idea.

> Unfortunately, .ini syntax rapidly breaks down when nesting begins,
> which makes me tend to think that we should have a separate "site map"
> file that maps locations and other rules to groups of pipelines.

Nesting is one way of looking at it; but then mere references work okay as well. I think at any point we want to be able to say "get the thing from this file" instead of "get the thing from this section". Given that, it doesn't seem like nesting is a good fit.

Are there any specific problems you have with my previous proposal?

-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From jjinux at gmail.com Fri Aug 12 12:11:25 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Fri, 12 Aug 2005 03:11:25 -0700 Subject: [Web-SIG] and now for something completely different! Message-ID:

Hey guys,

Maybe I'm just ignorant (highly probable), but I'm really having a hard time keeping up with the "configuration" emails, especially when each of you is using slightly different definitions and trying to reach slightly different goals. Please forgive me for coming out and stating this.

With the number of participants in the conversations, it doesn't seem like we're making a huge amount of progress, although perhaps I should shut up and be patient.

In the meantime, I'd like to propose that we framework authors try to start sharing our backend session code. Let's just create a library like Apache::Session .
As much as possible, I think we can make it framework agnostic, relying on the framework itself to respond to callbacks for doing things like setting session cookies and creating a database cursor. Just like with WSGI, the frameworks need not change their external APIs. Let's keep it simple and just make it a library. (I'm not sure the Twisted folks can participate because things on the Twisted side are always so different, but hopefully I'm wrong.) In any case, it's just a proposal to try to share more code. If I can get two other major frameworks to say they'll commit to working with me and using/contributing to the library, I'll start the endeavor and give them CVS commit rights. We need not write much new code. I'd like to reuse code that each of us already has. This will have the benefit of a lot of peer review. Perhaps this will make for a slightly better (Python Web) world :-D Best Regards, -jj -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From fumanchu at amor.org Fri Aug 12 13:14:49 2005 From: fumanchu at amor.org (Robert Brewer) Date: Fri, 12 Aug 2005 04:14:49 -0700 Subject: [Web-SIG] and now for something completely different! Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3772790@exchange.hqamor.amorhq.net> Shannon -jj Behrens wrote: > ...I'd like to propose that we framework authors try to > start sharing our backend session code. Let's just > create a library like Apache::Session > . > As much as possible, I think we can make it framework > agnostic, relying on the framework itself to respond > to callbacks for doing things like setting session > cookies and creating a database cursor. Just like > with WSGI, the frameworks need not change their > external APIs. Let's keep it simple and just make > it a library. Sounds great. Let's see what we can come up with. 
Robert Brewer
CherryPy Team
fumanchu at amor.org

From ianb at colorstudy.com Fri Aug 12 18:41:56 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 12 Aug 2005 11:41:56 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: References: Message-ID: <42FCD154.6010001@colorstudy.com>

Shannon -jj Behrens wrote:
> Maybe I'm just ignorant (highly probable), but I'm really having a
> hard time keeping up with the "configuration" emails, especially when
> each of you is using slightly different definitions and trying to
> reach slightly different goals. Please forgive me for coming out and
> stating this.

No, not at all; it's not been going that fast, and I myself feel simultaneously over- and underwhelmed by the discussion -- it's dense and hard to follow, yet indecisive :-/

At this point I'm going to try to do some more formal refactoring in Paste of the configuration experiments I've done so far, and maybe bring it up again when that's more complete. Or something; I'll keep reading if other people put out ideas.

> In the meantime, I'd like to propose that we framework authors try to
> start sharing our backend session code. Let's just create a library
> like Apache::Session
> . As much
> as possible, I think we can make it framework agnostic, relying on the
> framework itself to respond to callbacks for doing things like setting
> session cookies and creating a database cursor. Just like with WSGI,
> the frameworks need not change their external APIs. Let's keep it
> simple and just make it a library.

I think that would be useful. Flup has a fairly decoupled session store (http://www.saddi.com/software/flup/ in http://svn.saddi.com/flup/trunk/flup/middleware/session.py). Is there other current work that should be considered?
PythonWeb has a session module, but I don't know what its insides look like: http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html Paste has one too, but it's Not Very Good ;) I started using the flup session, but I got lazy and never flipped the switch to make it the default. There's been some discussion about sessions in the last few months on the Quixote list as well. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From jjinux at gmail.com Fri Aug 12 19:28:33 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Fri, 12 Aug 2005 10:28:33 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <42FCD154.6010001@colorstudy.com> References: <42FCD154.6010001@colorstudy.com> Message-ID: If we get CherryPy (awesome, Robert!), Quixote, and Paste onboard, I'll consider it a huge success. -jj On 8/12/05, Ian Bicking wrote: > Shannon -jj Behrens wrote: > > Maybe I'm just ignorant (highly probable), but I'm really having a > > hard time keeping up with the "configuration" emails, especially when > > each of you is using slightly different definitions and trying to > > reach slightly different goals. Please forgive me for coming out and > > stating this. > > No, not at all; it's not been going that fast, and I myself feel > simultaneously over- and underwhelmed by the discussion -- it's dense > hard to follow, yet indecisive :-/ > > At this point I'm going to try to do some more formal refactoring in > Paste of the configuration experiments I've done so far, and maybe bring > it up again when that's more complete. Or something; I'll keep reading > if other people put out ideas. > > > In the meantime, I'd like to propose that we framework authors try to > > start sharing our backend session code. Let's just create a library > > like Apache::Session > > . 
As much > > as possible, I think we can make it framework agnostic, relying on the > > framework itself to respond to callbacks for doing things like setting > > session cookies and creating a database cursor. Just like with WSGI, > > the frameworks need not change their external APIs. Let's keep it > > simple and just make it a library. > > I think that would be useful. Flup has a fairly decoupled session store > (http://www.saddi.com/software/flup/ in > http://svn.saddi.com/flup/trunk/flup/middleware/session.py). Is there > other current work that should be considered? PythonWeb has a session > module, but I don't know what its insides look like: > http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html > > Paste has one too, but it's Not Very Good ;) I started using the flup > session, but I got lazy and never flipped the switch to make it the > default. There's been some discussion about sessions in the last few > months on the Quixote list as well. > > -- > Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org > -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From james at pythonweb.org Fri Aug 12 19:40:32 2005 From: james at pythonweb.org (James Gardner) Date: Fri, 12 Aug 2005 18:40:32 +0100 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <42FCD154.6010001@colorstudy.com> References: <42FCD154.6010001@colorstudy.com> Message-ID: <42FCDF10.1030103@pythonweb.org> Ian Bicking wrote: >PythonWeb has a session >module, but I don't know what its insides look like: >http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html > I was going to suggest it might be worth looking at the PythonWeb web.session module as a basis. The version in 0.5.3 is fairly well developed after long discussions with Felix Schwarz on the pythonweb mailing list. 
The API is separate from the implementation so you can write different drivers for different storage mechanisms. I wrote a driver to use an SQL database engine and that driver itself uses the PythonWeb database module which provides an abstraction layer to work on multiple engines. Osvaldo Santana Neto kindly donated a file-based driver. There is also a WSGI implementation to use the session module at: http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/example-wsgiSession.html

The module uses the concept of a manager to manage multiple stores. The idea is that different applications have different stores, so that their keys don't overwrite each other's by mistake, but that all those applications can share the same session cookie and expire at the same time. You can also set the cookie properties and have the time the session stores expire different from the time the cookie expires if you really want to.

I think it makes a good starting point anyway; the docs are quite comprehensive and I'd also be happy to give CVS access to anyone who wanted it. Unfortunately I don't use any other session software so I don't know how well web.session compares to others. If we base the new session module on something else I'd also be happy to update the web modules and bricks to use the new session module (possibly as a driver) instead if it provided the same features.

Sharing code is definitely a good idea, but I'd also like to agree a new WSGI standard, because apart from end user benefits I think that will massively speed up the rate at which different framework authors use each other's code in their own projects, and the more that happens the more things will get naturally integrated anyway.

James

P.S. I'm currently updating all the components on pythonweb.org to use the new Eggs format at http://peak.telecommunity.com/DevCenter/PythonEggs .
They are a very exciting technology and if you are keen on experimenting with them and want to have a go with web.session you can test the 0.6.0 alpha of the web module (which includes web.session) by installing the latest version of setuptools and running the following command:

    python easy_install.py web

If that doesn't work you'll have to use the old 0.5.3 web modules (the session module is actually unchanged). The eggs themselves are at http://www.pythonweb.org/pythonweb/release/ for those who are interested.

From mso at oz.net Fri Aug 12 23:08:31 2005 From: mso at oz.net (mso@oz.net) Date: Fri, 12 Aug 2005 14:08:31 -0700 (PDT) Subject: [Web-SIG] and now for something completely different! In-Reply-To: <42FCD154.6010001@colorstudy.com> References: <42FCD154.6010001@colorstudy.com> Message-ID: <33071.161.55.66.150.1123880911.squirrel@www.oz.net>

Ian Bicking wrote:
> Paste has one too, but it's Not Very Good ;) I started using the flup session, but I got lazy and never flipped the switch to make it the default. There's been some discussion about sessions in the last few months on the Quixote list as well.

session2 is at http://quixote.idyll.org/ . It was made due to the lack of persistent session stores in Quixote. There's a threefold structure:

Session: Copy of Quixote's session class. You can set attributes but not keys. DictSession also allows keys. There's a .user attribute (default None) and a .set_user(user) method, but those can be ignored.

SessionManager: Interface between the framework and store. The implementation is Quixote-specific, but one could probably make an abstract superclass or WSGI class.

SessionStore: Base class of storage backends: DirectorySessionStore, DurusSessionStore, MySQLSessionStore, PostgresSessionStore, ShelveSessionStore.

If a Quixote application were installed in Paste and used a third-party session manager, the session object would have to:
- allow arbitrary attributes.
- default .user to None.
- have a .set_user(user) method that merely sets .user.
Otherwise people would have to modify their applications.

--
-- Mike Orr

From ksenia at ksenia.nl Fri Aug 12 23:42:52 2005 From: ksenia at ksenia.nl (Ksenia Marasanova) Date: Fri, 12 Aug 2005 23:42:52 +0200 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <33071.161.55.66.150.1123880911.squirrel@www.oz.net> References: <42FCD154.6010001@colorstudy.com> <33071.161.55.66.150.1123880911.squirrel@www.oz.net> Message-ID:

On 12 Aug 2005, at 23:08, mso at oz.net wrote:

> If a Quixote application were installed in Paste and used a third-party
> session manager, the session object would have to:
> - allow arbitrary attributes.
> - default .user to None.
> - have a .set_user(user) method that merely sets .user.
> Otherwise people would have to modify their applications.

Actually, I recently migrated a few old applications from Quixote1 "native" sessions to Flup Session middleware :) Except for arbitrary attributes, which I don't have, this is it:

    from flup import session
    from quixote import publish

    def _get_user(self):
        # hasattr() needs both the object and the attribute name
        if hasattr(self, '_user'):
            if self._user is not None:
                # some app-specific code to get user from db
                return user

    def _set_user(self, user):
        # user is SQLObject instance, we can only store ID
        if user is None:
            self._user = None
        else:
            self._user = user.id

    def set_user(self, user):
        self.user = user

    session.Session.user = property(_get_user, _set_user)
    session.Session.set_user = set_user

    class MyPublisher(publish.Publisher):
        def start_request(self, request):
            request.session = request.environ['com.saddi.service.session'].session
            publish.Publisher.start_request(self, request)

From renesd at gmail.com Sat Aug 13 06:24:06 2005 From: renesd at gmail.com (Rene Dudfield) Date: Sat, 13 Aug 2005 14:24:06 +1000 Subject: [Web-SIG] and now for something completely different!
In-Reply-To: References: Message-ID: <64ddb72c05081221246c60c756@mail.gmail.com>

Ok, here's my super list of wanted session features.

Multiple reader, single writer locking. Or MVCC would be nice :) Otherwise, if you use it for multiple requests at once (as with Ajax apps), everything slows way down. Having a way in the API to say 'I am just opening this for reading' would be really nice. Then backends that can implement this functionality can implement it. Backends that can't can implement locking however they want and ignore the read/write options passed.

Performance and locking for session objects are quite hard to get right if they are to be used by lots of different people, apps, and frameworks.

Also, having a specific close() method, rather than relying on garbage collection, is important.

Lazy opening of sessions is also good. So if it isn't touched then don't bother opening it.

Support for cookie-based and URL-based sessions is also very important. It is also important to be able to choose which method you want to use.

Security features like IP address and referer checking can probably be implemented separately, as can only allowing a user to get a session on one computer. These are optional things, but should be possible to do with whatever the session design is.

Allowing a single browser to have multiple sessions open at once would also be good. This way you can avoid name clashes when mixing applications, or have separate session configurations for different parts of your application. E.g., database sessions for the admin section, and memory-based ones for your front end.

Cheers.

On 8/12/05, Shannon -jj Behrens wrote:
> Hey guys,
>
> Maybe I'm just ignorant (highly probable), but I'm really having a
> hard time keeping up with the "configuration" emails, especially when
> each of you is using slightly different definitions and trying to
> reach slightly different goals. Please forgive me for coming out and
> stating this.
> > With the number of participants in the conversations, it doesn't seem > like we're making a huge amount of progress, although perhaps I should > shut up and be patient. > > In the meantime, I'd like to propose that we framework authors try to > start sharing our backend session code. Let's just create a library > like Apache::Session > . As much > as possible, I think we can make it framework agnostic, relying on the > framework itself to respond to callbacks for doing things like setting > session cookies and creating a database cursor. Just like with WSGI, > the frameworks need not change their external APIs. Let's keep it > simple and just make it a library. > > (I'm not sure the Twisted folks can participate because things on the > Twisted side are always so different, but hopefully I'm wrong.) > > In any case, it's just a proposal to try to share more code. If I can > get two other major frameworks to say they'll commit to working with > me and using/contributing to the library, I'll start the endeavor and > give them CVS commit rights. We need not write much new code. I'd > like to reuse code that each of us already has. This will have the > benefit of a lot of peer review. > > Perhaps this will make for a slightly better (Python Web) world :-D > > Best Regards, > -jj > > -- > I have decided to switch to Gmail, but messages to my Yahoo account will > still get through. > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: http://mail.python.org/mailman/options/web-sig/renesd%40gmail.com > From titus at caltech.edu Sun Aug 14 19:54:25 2005 From: titus at caltech.edu (Titus Brown) Date: Sun, 14 Aug 2005 10:54:25 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <42FCD154.6010001@colorstudy.com> References: <42FCD154.6010001@colorstudy.com> Message-ID: <20050814175425.GF17009@caltech.edu> -> I think that would be useful. 
Flup has a fairly decoupled session store
-> (http://www.saddi.com/software/flup/ in
-> http://svn.saddi.com/flup/trunk/flup/middleware/session.py). Is there
-> other current work that should be considered? PythonWeb has a session
-> module, but I don't know what its insides look like:
-> http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html
->
-> Paste has one too, but it's Not Very Good ;) I started using the flup
-> session, but I got lazy and never flipped the switch to make it the
-> default. There's been some discussion about sessions in the last few
-> months on the Quixote list as well.

I've been decoupled from Web-SIG e-mails for the last two months, but Mike Orr and I built a simple session store for Quixote that has a fairly simple and generic storage API:

http://cafepy.com/quixote_extras/titus/session2/session2/store/SessionStore.py

With the comments deleted, here's the core API:

    class SessionStore:
        def load_session(self, id, default=None):
            pass

        def save_session(self, session):
            pass

        def delete_session(self, session):
            pass

        def has_session(self, id):
            return self.load_session(id, None)

The only constraint is that 'id' must be a string in order for it to work with all of the session stores.

We have implemented stores for postgres, durus, mysql, directory/file, and shelve persistence mechanisms.

cheers,
--titus

From speno at isc.upenn.edu Sun Aug 14 22:05:26 2005
From: speno at isc.upenn.edu (John Speno)
Date: Sun, 14 Aug 2005 16:05:26 -0400
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <20050814175425.GF17009@caltech.edu>
References: <42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu>
Message-ID:

Another session related wish: A few CherryPy users have requested[1] that there be an API for registering callbacks on sessions with the intent that those callbacks are invoked when a session is destroyed. Apparently this is something they are familiar with in the java servlet world.
[1]http://www.cherrypy.org/ticket/250 From jjinux at gmail.com Mon Aug 15 19:17:41 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Mon, 15 Aug 2005 10:17:41 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <20050814175425.GF17009@caltech.edu> References: <42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu> Message-ID: Heh, I'm overwhelmed by too much code and not enough direction. Naturally, I've got nice session code in Aquarium as well. *Sigh* this Python Web thing is going to be the death of me! -jj On 8/14/05, Titus Brown wrote: > -> I think that would be useful. Flup has a fairly decoupled session store > -> (http://www.saddi.com/software/flup/ in > -> http://svn.saddi.com/flup/trunk/flup/middleware/session.py). Is there > -> other current work that should be considered? PythonWeb has a session > -> module, but I don't know what its insides look like: > -> http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html > -> > -> Paste has one too, but it's Not Very Good ;) I started using the flup > -> session, but I got lazy and never flipped the switch to make it the > -> default. There's been some discussion about sessions in the last few > -> months on the Quixote list as well. > > I've been decoupled from Web-SIG e-mails for the last two months, but > Mike Orr and I built a simple session store for Quixote that has a > fairly simple and generic storage API: > > http://cafepy.com/quixote_extras/titus/session2/session2/store/SessionStore.py > > With the comments deleted, here's the core API: > > class SessionStore: > def load_session(self, id, default=None): > pass > > def save_session(self, session): > pass > > def delete_session(self, session): > pass > > def has_session(self, id): > return self.load_session(id, None) > > The only constraint is that 'id' must be a string in order for it to > work with all of the session stores. 
> > We have implemented stores for postgres, durus, mysql, directory/file, > and shelve persistence mechanisms. > > cheers, > --titus > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: http://mail.python.org/mailman/options/web-sig/jjinux%40gmail.com > -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From chrism at plope.com Mon Aug 15 19:25:33 2005 From: chrism at plope.com (Chris McDonough) Date: Mon, 15 Aug 2005 13:25:33 -0400 Subject: [Web-SIG] and now for something completely different! In-Reply-To: References: <42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu> Message-ID: <1124126733.30493.17.camel@localhost.localdomain> I've also got reams of code in Zope for sessions. Maybe we should just wait til the next PyCon and have a consolidation sprint. - C On Mon, 2005-08-15 at 10:17 -0700, Shannon -jj Behrens wrote: > Heh, I'm overwhelmed by too much code and not enough direction. > Naturally, I've got nice session code in Aquarium as well. *Sigh* > this Python Web thing is going to be the death of me! > > -jj > > On 8/14/05, Titus Brown wrote: > > -> I think that would be useful. Flup has a fairly decoupled session store > > -> (http://www.saddi.com/software/flup/ in > > -> http://svn.saddi.com/flup/trunk/flup/middleware/session.py). Is there > > -> other current work that should be considered? PythonWeb has a session > > -> module, but I don't know what its insides look like: > > -> http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html > > -> > > -> Paste has one too, but it's Not Very Good ;) I started using the flup > > -> session, but I got lazy and never flipped the switch to make it the > > -> default. There's been some discussion about sessions in the last few > > -> months on the Quixote list as well. 
> > > > I've been decoupled from Web-SIG e-mails for the last two months, but > > Mike Orr and I built a simple session store for Quixote that has a > > fairly simple and generic storage API: > > > > http://cafepy.com/quixote_extras/titus/session2/session2/store/SessionStore.py > > > > With the comments deleted, here's the core API: > > > > class SessionStore: > > def load_session(self, id, default=None): > > pass > > > > def save_session(self, session): > > pass > > > > def delete_session(self, session): > > pass > > > > def has_session(self, id): > > return self.load_session(id, None) > > > > The only constraint is that 'id' must be a string in order for it to > > work with all of the session stores. > > > > We have implemented stores for postgres, durus, mysql, directory/file, > > and shelve persistence mechanisms. > > > > cheers, > > --titus > > _______________________________________________ > > Web-SIG mailing list > > Web-SIG at python.org > > Web SIG: http://www.python.org/sigs/web-sig > > Unsubscribe: http://mail.python.org/mailman/options/web-sig/jjinux%40gmail.com > > > > From titus at caltech.edu Mon Aug 15 19:32:45 2005 From: titus at caltech.edu (Titus Brown) Date: Mon, 15 Aug 2005 10:32:45 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <1124126733.30493.17.camel@localhost.localdomain> References: <42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu> <1124126733.30493.17.camel@localhost.localdomain> Message-ID: <20050815173245.GA19517@caltech.edu> -> I've also got reams of code in Zope for sessions. -> -> Maybe we should just wait til the next PyCon and have a consolidation -> sprint. -> -> On Mon, 2005-08-15 at 10:17 -0700, Shannon -jj Behrens wrote: -> > Heh, I'm overwhelmed by too much code and not enough direction. -> > Naturally, I've got nice session code in Aquarium as well. *Sigh* -> > this Python Web thing is going to be the death of me! 
I'd be surprised if the session *storage* code turned out to be all that different between these frameworks. I'm willing to change function & class names if it means I'd be using/testing/building on other people's work. Session code itself is a much stickier wicket, as far as I can tell. --titus From ianb at colorstudy.com Mon Aug 15 19:33:44 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 15 Aug 2005 12:33:44 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: References: <42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu> Message-ID: <4300D1F8.8090007@colorstudy.com> Shannon -jj Behrens wrote: > Heh, I'm overwhelmed by too much code and not enough direction. > Naturally, I've got nice session code in Aquarium as well. *Sigh* > this Python Web thing is going to be the death of me! If everyone is reasonably comfortable with what sessions should do, can we just design an API and figure out the implementation later? -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From jjinux at gmail.com Mon Aug 15 20:22:33 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Mon, 15 Aug 2005 11:22:33 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <4300D1F8.8090007@colorstudy.com> References: <42FCD154.6010001@colorstudy.com> <20050814175425.GF17009@caltech.edu> <4300D1F8.8090007@colorstudy.com> Message-ID: The only thing I'm still concerned about is the locking. I lock access to the set of sessions when creating or deleting one, but I don't bother locking access to a single session. I think other people may have more strict requirements. I agree with Titus that we should stick to worrying about the backend storage at this point since it's less of a monster. -jj On 8/15/05, Ian Bicking wrote: > Shannon -jj Behrens wrote: > > Heh, I'm overwhelmed by too much code and not enough direction. > > Naturally, I've got nice session code in Aquarium as well. 
*Sigh* > > this Python Web thing is going to be the death of me! > > If everyone is reasonably comfortable with what sessions should do, can > we just design an API and figure out the implementation later? -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From fumanchu at amor.org Mon Aug 15 20:25:54 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon, 15 Aug 2005 11:25:54 -0700 Subject: [Web-SIG] and now for something completely different! Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> Ian Bicking wrote: > Shannon -jj Behrens wrote: > > Heh, I'm overwhelmed by too much code and not enough direction. > > Naturally, I've got nice session code in Aquarium as well. *Sigh* > > this Python Web thing is going to be the death of me! > > If everyone is reasonably comfortable with what sessions > should do, can we just design an API and figure out the > implementation later? That depends on where you draw the line between the two. ;) It's pretty easy to define an "implementation-less" API that consists of: create, read, update, delete. The first critical implementation discussion (which affects the API) should be around concurrency, and if multiple locking strategies need to be supported. In flup, for example, the entire session store is locked if the same session is requested more than once simultaneously. Pythonweb doesn't seem to mention concurrency at all. Paste mentions it's not supported. ;) Quixote's session2 stores have flags for multithreading/multiprocess but seem to not actually do anything with those flags. The concern is not only response time, but atomicity. In the comments for Aquarium's SessionContainer: "Concerning locking: in general, a global lock (of some sort) should be used so that creating, deleting, reading, and writing sessions is serialized. However, it is not necessary to have a lock for each session. 
If a user wishes to use two browser windows at the same time, the last writer wins." That is a design decision which not all frameworks (or other consumers of our session lib) might share. Apparently, given the current Python session modules out there, it's common to survive without caring? I know Mike Robinson has worked many long nights trying to make a session module for CherryPy which can consistently pass simple hit-counter tests. ;) Personally, I'd like to pursue an MROW solution. It would be nice if our final product supported multiple concurrency strategies. The decision about which strategy to use could be left to framework authors (who would wish to begin migration by maintaining maximum backward-compatibility), or to their users, if those options can be described simply enough. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From mso at oz.net Mon Aug 15 22:40:11 2005 From: mso at oz.net (mso@oz.net) Date: Mon, 15 Aug 2005 13:40:11 -0700 (PDT) Subject: [Web-SIG] and now for something completely different! In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> Message-ID: <33026.161.55.66.150.1124138411.squirrel@www.oz.net> Robert Brewer wrote: > Quixote's session2 stores have flags for > multithreading/multiprocess but seem to not actually do anything with > those flags. Correct, the flags are just indications to the caller. The caller might raise an exception if a thread-unsafe store is paired with a multithreaded server. There's no database locking code, although Postgres uses a transaction for the immediate operation. > Apparently, given the current Python > session modules out there, it's common to survive without caring? I haven't seen locking in any of the modules I've used, nor any particular errors caused by this. 
Is it defined what behavior the server should have if the user has the same site opened in two tabs and clicks back and forth? -- -- Mike Orr From ianb at colorstudy.com Mon Aug 15 22:46:19 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 15 Aug 2005 15:46:19 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> Message-ID: <4300FF1B.7080308@colorstudy.com> Robert Brewer wrote: >>If everyone is reasonably comfortable with what sessions >>should do, can we just design an API and figure out the >>implementation later? > > > That depends on where you draw the line between the two. ;) It's pretty > easy to define an "implementation-less" API that consists of: create, > read, update, delete. Yes, but we're all clever enough to know that's incomplete ;) > The first critical implementation discussion (which affects the API) > should be around concurrency, and if multiple locking strategies need to > be supported. In flup, for example, the entire session store is locked > if the same session is requested more than once simultaneously. > Pythonweb doesn't seem to mention concurrency at all. Paste mentions > it's not supported. ;) Quixote's session2 stores have flags for > multithreading/multiprocess but seem to not actually do anything with > those flags. I think it definitely is wrong to lock the session for concurrent reads -- that's a likely case, and can unnecessarily serialize access to things like images, or block a website during a long download (if that download uses the session, which is quite possible if the download requires authentication information). > The concern is not only response time, but atomicity. 
In the comments for Aquarium's SessionContainer:
>
> "Concerning locking: in general, a global lock (of some sort)
> should be used so that creating, deleting, reading, and writing
> sessions is serialized. However, it is not necessary to have
> a lock for each session. If a user wishes to use two browser
> windows at the same time, the last writer wins."
>
> That is a design decision which not all frameworks (or other consumers
> of our session lib) might share. Apparently, given the current Python
> session modules out there, it's common to survive without caring? I know
> Mike Robinson has worked many long nights trying to make a session
> module for CherryPy which can consistently pass simple hit-counter
> tests. ;) Personally, I'd like to pursue an MROW solution.

In practice race conditions are very uncommon. Simultaneous requests from the same session are uncommon, since the few simultaneous requests that do occur are likely to be for boring resources like images. If you have an image bug on a page that also writes the session, maybe you'd have a problem. I'd be okay saying "don't do that" because usually people don't do that, so it's not very compelling.

It's possible that Ajax techniques would make concurrency more likely, but I'm not sure. One realistic case might be an upload-notification system, where the file is uploaded into a hidden iframe and the resources being submitted to could write to the session to signal when the upload was finished; but the user might be doing something in another frame at the same time. For that case I think you could just not use the session (I don't think it's a good communication medium for stuff like that). But with frames and multiple windows at least it's vaguely possible concurrent writes could happen.

OTOH conflict errors are the wrong answer to concurrent writes in a significant number of cases, where a little lossiness is preferable. Generally it becomes more complex/interesting if you have transactional sessions.
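
The trade-off between clobbering and conflict errors can be sketched roughly as follows. This is only an illustration, not code from any of the frameworks mentioned: `VersionedStore`, `ConflictError`, and the `check_version` flag are hypothetical names. Each saved session carries a version counter; a strict save refuses data based on a stale read, while a lossy save simply lets the last writer win.

```python
import threading


class ConflictError(Exception):
    """Raised when a strict save is attempted with stale data (hypothetical)."""


class VersionedStore:
    """In-memory store illustrating two concurrent-write policies."""

    def __init__(self):
        self._lock = threading.Lock()   # guards the dict, not each session
        self._data = {}                 # session id -> (version, data dict)

    def load(self, sid):
        """Return (version, copy of data); unknown ids start at version 0."""
        with self._lock:
            version, data = self._data.get(sid, (0, {}))
            return version, dict(data)

    def save(self, sid, seen_version, data, check_version=True):
        """Store data. With check_version, reject writes based on stale reads."""
        with self._lock:
            current, _ = self._data.get(sid, (0, {}))
            if check_version and current != seen_version:
                raise ConflictError("session %r changed since it was read" % sid)
            self._data[sid] = (current + 1, dict(data))
            return current + 1


store = VersionedStore()
v_a, a = store.load("abc")   # request A reads
v_b, b = store.load("abc")   # request B reads at the same time
a["hits"] = 1
b["hits"] = 1
store.save("abc", v_a, a)    # A writes first
try:
    store.save("abc", v_b, b)                    # strict: B's read is stale
    outcome = "saved"
except ConflictError:
    outcome = "conflict"
store.save("abc", v_b, b, check_version=False)   # lossy: last writer wins
```

Note that in the lossy case the hit counter ends up at 1 even though two requests incremented it, which is the "little lossiness" in question.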
> It would be nice if our final product supported multiple concurrency > strategies. The decision about which strategy to use could be left to > framework authors (who would wish to begin migration by maintaining > maximum backward-compatibility), or to their users, if those options can > be described simply enough. I'm -1 on multiple strategies, unless there's a really good reason for it. I'd like to see if we can do the Best Most Complete strategy without making compromises or creating a too-difficult API; if so, then why not use that? -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From jonathan at carnageblender.com Mon Aug 15 22:51:42 2005 From: jonathan at carnageblender.com (Jonathan Ellis) Date: Mon, 15 Aug 2005 13:51:42 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <4300FF1B.7080308@colorstudy.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <4300FF1B.7080308@colorstudy.com> Message-ID: <1124139102.28016.240738111@webmail.messagingengine.com> On Mon, 15 Aug 2005 15:46:19 -0500, "Ian Bicking" said: > > That is a design decision which not all frameworks (or other consumers > > of our session lib) might share. Apparently, given the current Python > > session modules out there, it's common to survive without caring? I know > > Mike Robinson has worked many long nights trying to make a session > > module for CherryPy which can consistently pass simple hit-counter > > tests. ;) Personally, I'd like to pursue an MROW solution. > > In practice race conditions are very uncommon. Simultaneous requests > from the same session are uncommon, since what few simultaneous requests > that occur are likely to be for boring resources like images. If you > have an image bug on a page that also writes the session, maybe you'd > have a problem. I'd be okay saying "don't do that" because usually > people don't do that, so it's not very compelling. 
I wouldn't be okay with non-threadsafe sessions. -Jonathan From ianb at colorstudy.com Mon Aug 15 22:57:55 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 15 Aug 2005 15:57:55 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <1124139102.28016.240738111@webmail.messagingengine.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <4300FF1B.7080308@colorstudy.com> <1124139102.28016.240738111@webmail.messagingengine.com> Message-ID: <430101D3.4010001@colorstudy.com> Jonathan Ellis wrote: > On Mon, 15 Aug 2005 15:46:19 -0500, "Ian Bicking" > said: > >>>That is a design decision which not all frameworks (or other consumers >>>of our session lib) might share. Apparently, given the current Python >>>session modules out there, it's common to survive without caring? I know >>>Mike Robinson has worked many long nights trying to make a session >>>module for CherryPy which can consistently pass simple hit-counter >>>tests. ;) Personally, I'd like to pursue an MROW solution. >> >>In practice race conditions are very uncommon. Simultaneous requests >>from the same session are uncommon, since what few simultaneous requests >>that occur are likely to be for boring resources like images. If you >>have an image bug on a page that also writes the session, maybe you'd >>have a problem. I'd be okay saying "don't do that" because usually >>people don't do that, so it's not very compelling. > > > I wouldn't be okay with non-threadsafe sessions. Non-threadsafe in what manner? Certainly they should be usable in threaded environments, and should never blow up or anything. I just assume that. The question is whether, if there's two concurrent writers (threaded or multiprocess), they should be serialized (and how), or if one of them simply clobbers the other. Threads or multiprocess, it's really the same issue. 
Except perhaps for isolation -- threads could *potentially* see changes in other threads, but that's not possible for multiple processes. So probably they should always be isolated; not a big deal, but something to consider. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Mon Aug 15 23:11:23 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 15 Aug 2005 17:11:23 -0400 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <4300FF1B.7080308@colorstudy.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> Message-ID: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> At 03:46 PM 8/15/2005 -0500, Ian Bicking wrote: >Robert Brewer wrote: > >>If everyone is reasonably comfortable with what sessions > >>should do, can we just design an API and figure out the > >>implementation later? > > > > > > That depends on where you draw the line between the two. ;) It's pretty > > easy to define an "implementation-less" API that consists of: create, > > read, update, delete. > >Yes, but we're all clever enough to know that's incomplete ;) Personally, I think the most important part of session services is just managing the session itself; start, begin, timeout, and getting an identifier in and out of the request/response. For me, create/read/update/delete/persist/GC responsibility belongs entirely to the application. To put it another way: I don't believe in session variables, only session-specific application objects. An ecommerce application should have persistent carts and items and the like; the only purpose of a session is to find out which cart to look at. In this way, concurrency and all the other questions being raised here are irrelevant. Or at least they're irrelevant to the session management part, anyway. 
:) So I'd personally prefer that any session service standards distinguish between management of the session itself, from storage of data associated with the session. The latter is just a standard object-persistence or object-relational problem and can easily be dealt with as such, distinct from session management issues like cookies vs. URLs, timeouts, ID generation, and so forth. (Note that even GC of abandoned sessions is highly subject to business rules, and it would be crazy for us to try and encompass the possible rule variations within a relatively simple component specification.) While it may be nice to have persistence services that are optimized for session-like use cases, it doesn't make a lot of sense to tightly couple them to session management. Just like WSGI splits things into application and server, I think a session spec should split them into client-state-management and server-state-storage, so that we can mix and match from the best of both worlds. Of course, I personally prefer to use whatever the application's storage is for my session management, so I'll probably have little reason to get involved in the "storage" side of the session equation. Indeed, I'd argue that applications that *don't* put their session data in the application's main DB should have very very good reasons for doing so, and I've never heard a good enough reason yet. :) Well, there's, "my application's DB suxors", but that means you ought to upgrade the application DB instead if you can. :) From jonathan at carnageblender.com Mon Aug 15 23:41:07 2005 From: jonathan at carnageblender.com (Jonathan Ellis) Date: Mon, 15 Aug 2005 14:41:07 -0700 Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: <430101D3.4010001@colorstudy.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <4300FF1B.7080308@colorstudy.com> <1124139102.28016.240738111@webmail.messagingengine.com> <430101D3.4010001@colorstudy.com>
Message-ID: <1124142067.1070.240740982@webmail.messagingengine.com>

On Mon, 15 Aug 2005 15:57:55 -0500, "Ian Bicking" said:
> Jonathan Ellis wrote:
> > On Mon, 15 Aug 2005 15:46:19 -0500, "Ian Bicking"
> >>In practice race conditions are very uncommon. Simultaneous requests
> >>from the same session are uncommon, since what few simultaneous requests
> >>that occur are likely to be for boring resources like images. If you
> >>have an image bug on a page that also writes the session, maybe you'd
> >>have a problem. I'd be okay saying "don't do that" because usually
> >>people don't do that, so it's not very compelling.
> >
> >
> > I wouldn't be okay with non-threadsafe sessions.
>
> Non-threadsafe in what manner? Certainly they should be usable in
> threaded environments, and should never blow up or anything. I just
> assume that.
>
> The question is whether, if there's two concurrent writers (threaded or
> multiprocess), they should be serialized (and how), or if one of them
> simply clobbers the other.

Well, if your goal is "usable in [concurrent] environments," you're really talking about serializing anyway. Consider some hypothetical API:

    def session_for_user(uname):
        if not session_exists(uname):
            create_session(uname)
        return session_retrieve(uname)

Depending on how soon session_exists can tell that a session is being created, if two requests for the same session come in close enough together (and it's worth remembering that this could easily be the result of a single browser hitting refresh on a very heavily loaded machine), the second request could get either an incompletely initialized session object, or a different session object entirely.
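
One way to close the race described above is to make the exists/create/retrieve sequence atomic. The sketch below is an illustration under stated assumptions, not any framework's actual API: `SessionRegistry` and its in-memory dict are hypothetical, and a single in-process lock like this would not help a multiprocess deployment.

```python
import threading


class SessionRegistry:
    """Hypothetical in-memory registry with an atomic get-or-create."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = {}

    def session_for_user(self, uname):
        # Check, create, and retrieve under one lock so no caller can
        # observe a half-initialized session or create a duplicate.
        with self._lock:
            session = self._sessions.get(uname)
            if session is None:
                session = {"user": uname}       # fully built before publishing
                self._sessions[uname] = session
            return session


registry = SessionRegistry()
results = []


def worker():
    results.append(registry.session_for_user("alice"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads should end up holding the very same session object, which is the serialization being argued for here.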
-Jonathan From ianb at colorstudy.com Tue Aug 16 00:08:12 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 15 Aug 2005 17:08:12 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> Message-ID: <4301124C.7040708@colorstudy.com> Phillip J. Eby wrote: > Of course, I personally prefer to use whatever the application's storage > is for my session management, so I'll probably have little reason to get > involved in the "storage" side of the session equation. Indeed, I'd > argue that applications that *don't* put their session data in the > application's main DB should have very very good reasons for doing so, > and I've never heard a good enough reason yet. :) Well, there's, "my > application's DB suxors", but that means you ought to upgrade the > application DB instead if you can. :) There's useful reasons for non-application code to store things in the session, and the particulars of the application storage aren't really applicable. For instance, with this pattern: http://blog.ianbicking.org/web-application-patterns-status-notification.html -- you put transient messages in the session. But there's no point to using a fancy application session storage which means documentation and configuration and whatnot. Maybe you have no impediments to throwing random data into your application data stores, but I do. I think there's quite a few other use cases for this same kind of thing which I think implies that there should be a standard generic location to store session information. Or you can ignore that and use the session ID only. 
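
A rough sketch of the status-notification pattern mentioned above, assuming only that the session behaves like a dict. The function names `add_message` and `pop_messages` and the `_messages` key are made up for illustration and are not taken from Paste or the linked post.

```python
def add_message(session, text):
    """Queue a transient message to show on the next page view."""
    session.setdefault("_messages", []).append(text)


def pop_messages(session):
    """Return queued messages and clear them so each is shown only once."""
    return session.pop("_messages", [])


session = {}                          # stand-in for a real dict-like session
add_message(session, "Record saved.")
shown = pop_messages(session)         # the first render gets the message
shown_again = pop_messages(session)   # later renders get nothing
```

Nothing here cares how or where the session is stored, which is the point: non-application code only needs some generic place to put a value.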
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 16 00:52:35 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 15 Aug 2005 18:52:35 -0400 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <4301124C.7040708@colorstudy.com> References: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> At 05:08 PM 8/15/2005 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>Of course, I personally prefer to use whatever the application's storage >>is for my session management, so I'll probably have little reason to get >>involved in the "storage" side of the session equation. Indeed, I'd >>argue that applications that *don't* put their session data in the >>application's main DB should have very very good reasons for doing so, >>and I've never heard a good enough reason yet. :) Well, there's, "my >>application's DB suxors", but that means you ought to upgrade the >>application DB instead if you can. :) > >There's useful reasons for non-application code to store things in the >session, and the particulars of the application storage aren't really >applicable. For instance, with this pattern: >http://blog.ianbicking.org/web-application-patterns-status-notification.html >-- you put transient messages in the session. If I needed to do what you're doing on that page, I'd probably just put the message in a cookie, and reset it once it was used. In other words, a session isn't necessary just to have client-specific state, especially for something so short-lived as that example. > But there's no point to using a fancy application session storage which > means documentation and configuration and whatnot. 
Maybe you have no > impediments to throwing random data into your application data stores, > but I do. The reason I enforce this particular discipline is specifically to *prevent* "random data" from being added *anywhere*. A session object that you can just throw any old data into is sloppy from my POV, because scaling most session backend systems well is a hard problem. If you are making a small-scale quick-and-dirty system, okay, whatever, but in the megahits/month range and up, I think session variable design needs to be much more systematic to ensure it can be scaled. Therefore, my philosophy is that every bit of client-specific state goes either in the application DB, or it goes in the browser. Anywhere in-between the two is a liability from my perspective, because it introduces a new tier that needs to be factored into design of the app's transaction model, scaling and reliability plans, etc. Ergo, there darn well better be a really good reason for introducing that extra tier. (And you'll notice the existence of this tier produces exactly the problems I know that it's "common wisdom" that sessions are supposed to be an important thing to have, especially since ASP and PHP provide them out of the box. (And at least PHP lets you implement the storage however you like!) But I view sessions of that kind with roughly the same disdain as I view Perl or Tcl's weak typing; they mask problems that I want to know about. I'm well aware that I'm in the minority on this point, but that doesn't mean I'm not still right. :) (And I'm also aware that "scaling down" is important, but the rule that all state goes either in the browser or the application DB scales down just as well as it scales up.) From mso at oz.net Tue Aug 16 01:05:43 2005 From: mso at oz.net (mso@oz.net) Date: Mon, 15 Aug 2005 16:05:43 -0700 (PDT) Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> Message-ID: <33293.161.55.66.150.1124147143.squirrel@www.oz.net> Phillip J. Eby wrote: > So I'd personally prefer that any session service standards distinguish > between management of the session itself, from storage of data associated > with the session. Yes, but the web-sig needs to define both APIs, and encourage generic implementations of both. Otherwise every framework or every user has to write their own storage backends. Speaking from experience with Quixote, which has no persistent sessions out of the box. Also, the serialization method(s) need to be documented. That's a property of the storage object. All existing ones I know of use pickle (sometimes encapsulated by Durus or shelve), but that may not be the case forever. Plus there's pickle vs cPickle; I've heard the latter has Unicode problems. > Of course, I personally prefer to use whatever the application's storage > is for my session management That's what I've been doing too. session2 is made to play nicely with your application's database, sticking to whatever table you designate for it. > To put it another way: I don't believe in session variables, > only session-specific application objects. An ecommerce application > should > have persistent carts and items and the like; the only purpose of a > session > is to find out which cart to look at. We already have some frameworks with dict-like sessions and others with a standard session object. Assuming we had a hybrid object that accepts both, I don't know why any application *has* to have a custom session object. But there's no reason to arbitrarily preclude it either. 
> Indeed, I'd argue > that applications that *don't* put their session data in the application's > main DB should have very very good reasons for doing so, and I've never > heard a good enough reason yet. Ian Bicking wrote: > There's useful reasons for non-application code to store things in the > session, and the particulars of the application storage aren't really > applicable. For instance, with this pattern: > http://blog.ianbicking.org/web-application-patterns-status-notification.html > -- you put transient messages in the session. But there's no point to > using a fancy application session storage which means documentation and > configuration and whatnot. Maybe you have no impediments to throwing > random data into your application data stores, but I do. I wouldn't call that example "non-application" code. Setting a message in the session for the subsequent request to display is very useful. "Record added", "Add cancelled", "logged out", etc. I'm not sure third-party code (middleware) should be able to add a message directly, but that may turn out to be a significant feature of certain middleware. -- -- Mike Orr From fumanchu at amor.org Tue Aug 16 01:47:57 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon, 15 Aug 2005 16:47:57 -0700 Subject: [Web-SIG] and now for something completely different! Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727AC@exchange.hqamor.amorhq.net> Me: > It would be nice if our final product supported multiple > concurrency strategies. The decision about which strategy > to use could be left to framework authors (who would wish > to begin migration by maintaining maximum > backward-compatibility), or to their users, if those > options can be described simply enough. > > ... > > The concern is not only response time, but atomicity. 
In > the comments for Aquarium's SessionContainer: > > "Concerning locking: in general, a global lock (of some sort) > should be used so that creating, deleting, reading, and writing > sessions is serialized. However, it is not necessary to have > a lock for each session. If a user wishes to use two browser > windows at the same time, the last writer wins." > > That is a design decision which not all frameworks (or > other consumers of our session lib) might share. > Apparently, given the current Python session modules > out there, it's common to survive without caring? > I know Mike Robinson has worked many long nights > trying to make a session module for CherryPy which > can consistently pass simple hit-counter tests. ;) > Personally, I'd like to pursue an MROW solution. Ian: > In practice race conditions are very uncommon. > Simultaneous requests from the same session are > uncommon, since what few simultaneous requests > that occur are likely to be for boring resources > like images. If you have an image bug on a page > that also writes the session, maybe you'd have a > problem. I'd be okay saying "don't do that" > because usually people don't do that, so it's > not very compelling. Images are only boring until they're not--a Google-style map server for example. > It's possible that Ajax techniques would make > concurrency more likely, but I'm not sure. Most definitely. As page composition swings back to a client-side-pull model, I expect more pages to be written in a lot of javascript querying RESTful HTTP wrappers around data stores, and much of that "pull" will be concurrent. One GET pulls a minimal HTML page; that page includes Javascript that then populates the page with data via multiple AJAX requests. > ...with frames and multiple windows > at least it's vaguely possible concurrent writes > could happen. OTOH conflict errors are the wrong > answer to concurrent writes in a signficant number > of cases, where a little lossiness is preferable. 
> Generally it becomes more complex/interesting if you have
> transactional sessions.
>
> ...
>
> I'm -1 on multiple strategies, unless there's a really good
> reason for it. I'd like to see if we can do the Best Most
> Complete strategy without making compromises or creating
> a too-difficult API; if so, then why not use that?

I'd be -1 on them too, except that I see a "really good reason": expectations differ wildly because application needs differ wildly. Conflict errors are the right answer in a significant number of cases. Lossiness is unacceptable in many. If we can do "the Best Most Complete strategy", great! But I won't hold my breath. If our common session module meets 75% of the needs of existing frameworks, we've made no progress whatsoever, in my mind. Let's shoot for 90%+.

Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From grosser.meister.morti at gmx.net Tue Aug 16 03:27:26 2005
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Tue, 16 Aug 2005 03:27:26 +0200
Subject: [Web-SIG] httplib ICY support (3 lines of code)
Message-ID: <430140FE.3070808@gmx.net>

Hi. I hope this is the right place to post this. To add ICY support to the httplib module you just have to add 2 lines and 2 characters! ;)

ICY is a streaming protocol developed by Nullsoft (SHOUTcast and Winamp). It's identical to HTTP/1.0, but the server sends ICY instead of HTTP/1.0 in the status line. The other differences are additional header fields, but httplib doesn't need to care about those.

To add ICY support to httplib, simply replace this at line 308 in httplib.py:

    if not version.startswith('HTTP/'):

with:

    if version == 'ICY':
        version = "HTTP/1.0"
    elif not version.startswith('HTTP/'):

Now I can write a little stream-dumping ICY proxy. ;)

-panzi

From mike_mp at zzzcomputing.com Tue Aug 16 17:48:36 2005
From: mike_mp at zzzcomputing.com (mike bayer)
Date: Tue, 16 Aug 2005 11:48:36 -0400 (EDT)
Subject: [Web-SIG] and now for something completely different!
Message-ID: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8>

If I may throw my hat in the ring here, the session object I have built for Myghty accomplishes the following things, which were the important facets of a session for me:

- It is neutral of its backend storage system. I developed a simple "storage" API that currently has DBM, memory and plain file-based systems, and people have also been clamoring for a memcached version, which is easy enough to add. Myghty uses this backend containment system both for its page caching and session libraries.

- The backend storage system supplies locking which locks amongst threads and processes; the session implementation ensures that this lock is only against its own session ID. I was basically going for an improvement over mod_python's session, which locks all sessions against a single Apache global mutex, and stores everyone's session in one huge DBM file. My session object, when using file-based containment, always keeps every session's information in separate files and was modeled after Apache::Session in this regard.

- Because a "read" operation also registers a "last accessed time" data member, it's not using multiple reader/single writer style locking; everyone is a writer. However, since I am sensitive to iframes, ajax calls, and dynamic image calls hitting the same session concurrently within a request, which I'd rather not slow down, I do something less than optimal: I open the session store and read the full thing into memory first when it's accessed, and then immediately unlock. This obviously can create problems for an application that is storing huge amounts of data in its session which is not required in full for any one request. Two improvements to this behavior would be to either make the "last accessed time" be written out just once per request and then to allow multiple readers, or to improve the containment API to supply "last accessed time" automatically.
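
The read-then-unlock behaviour in the last point can be sketched roughly as follows. This is only an illustration under stated assumptions, not Myghty's actual API: `InMemoryBackend`, `load_session` and `save_session` are hypothetical names, the backend is a plain dict rather than DBM or files, and the locks here only cover threads in one process, not multiple processes.

```python
import pickle
import threading
import time
from collections import defaultdict


class InMemoryBackend:
    """Hypothetical stand-in for a DBM/file/memcached container."""

    def __init__(self):
        self._data = {}
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()

    def lock_for(self, session_id):
        # One lock per session ID, so unrelated sessions never block each other.
        with self._locks_guard:
            return self._locks[session_id]

    def read(self, session_id):
        return self._data.get(session_id)

    def write(self, session_id, blob):
        self._data[session_id] = blob


def load_session(backend, session_id):
    """Read the whole session under its own lock, then release immediately."""
    with backend.lock_for(session_id):
        blob = backend.read(session_id)
        data = pickle.loads(blob) if blob is not None else {}
        # Every read also records an access time, so every reader is a writer.
        data["_accessed_time"] = time.time()
        backend.write(session_id, pickle.dumps(data))
    # The lock is released here; the caller works on an in-memory copy.
    return data


def save_session(backend, session_id, data):
    """Write the in-memory copy back; the last writer wins."""
    with backend.lock_for(session_id):
        backend.write(session_id, pickle.dumps(data))
```

Because the lock is held only for the duration of the read or write, concurrent requests on the same session do not wait on each other for the length of a request, at the cost of last-writer-wins semantics.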
I mostly was using Apache::Session as a guide to the architectural features I wanted to see, which include flexibility of containment and locking systems as well as a separation between individual sessions. - mike From jonathan at carnageblender.com Tue Aug 16 18:08:13 2005 From: jonathan at carnageblender.com (Jonathan Ellis) Date: Tue, 16 Aug 2005 09:08:13 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8> References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8> Message-ID: <1124208493.11438.240803964@webmail.messagingengine.com> On Tue, 16 Aug 2005 11:48:36 -0400 (EDT), "mike bayer" said: > - because a "read" operation also registers a "last accessed time" data > member, its not using multiple reader/single writer style locking, > everyone is a writer. However, since I am sensitive to iframes, ajax > calls, and dynamic image calls hitting the same session concurrently > within a request which I'd rather not slow down, I do something less than > optimal which is I open the session store and read the full thing into > memory first when its accessed, and then immediately unlock. This > obviously can create problems for an application that is storing huge > amounts of data in its session which is not required in full for any one > request. I don't think read/write locking for sessions is a Must Have, either. It's nice if it's easy to do (which it is, in a threaded situation), but fundamentally the session is not the right place for caching Lots Of Stuff. -Jonathan From ianb at colorstudy.com Tue Aug 16 18:10:44 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 11:10:44 -0500 Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8> References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8> Message-ID: <43021004.4090407@colorstudy.com> mike bayer wrote: > I mostly was using Apache::Session as a guide to the architectural > features I wanted to see, which include flexibility of containment and > locking systems as well as a separation between individual sessions. Is there a good API guide to Apache::Session somewhere? -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Tue Aug 16 18:28:04 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 11:28:04 -0500 Subject: [Web-SIG] Session interface Message-ID: <43021414.9080102@colorstudy.com> I wrote a possible interface for sessions: http://aspn.activestate.com/ASPN/CodeDoc/Apache-mod_perl_guide/src/modules.html It's not my most thoughtful effort, but maybe it can be a discussion point. Feel free to offer completely different APIs if you think this one sucks. I basically just threw in properties and methods for all the functionality I've thought of by reading a couple APIs and the discussion here, without actually thinking about how it goes together :-/ In this interface presumably you make subclasses of an abstract class to implement different storage backends and do some kinds of configuration. Thinking on it more, probably a good place to start would be agreeing on specific terminology for the objects involved, since I've seen several different sets of terminology, many of which use the same words for different ideas: Session: An instance of this represents one user/browser's session. SessionStore: An instance of this represents the persistence mechanism. This is a functional component, not embodying any policy. SessionManager: This is a container for sessions, and uses a SessionStore. This contains all the policy for loading, saving, locking, expiring sessions. Does that sound good? 
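
One way to picture the three terms above is the rough sketch below. It is not the posted interface: the method names (`load`, `save`, `delete`, `create`), the in-memory dict, the `timeout` default and the injectable `clock` are all assumptions made for illustration.

```python
import time
import uuid


class SessionStore:
    """Persistence mechanism only; no policy. (Hypothetical in-memory version.)"""

    def __init__(self):
        self._records = {}

    def load(self, session_id):
        return self._records.get(session_id)

    def save(self, session_id, record):
        self._records[session_id] = record

    def delete(self, session_id):
        self._records.pop(session_id, None)


class Session(dict):
    """One user/browser's session: a dict plus its ID."""

    def __init__(self, session_id, data=None):
        super().__init__(data or {})
        self.session_id = session_id


class SessionManager:
    """Container for sessions; holds the policy (here, only expiration)."""

    def __init__(self, store, timeout=1800, clock=time.time):
        self.store = store
        self.timeout = timeout  # seconds of inactivity before expiry (assumed)
        self.clock = clock      # injectable so expiry can be tested

    def create(self):
        session = Session(uuid.uuid4().hex)
        self.save(session)
        return session

    def load(self, session_id):
        record = self.store.load(session_id)
        if record is None:
            return None
        if self.clock() - record["accessed"] > self.timeout:
            # Expiration is policy, so the manager decides; the store just deletes.
            self.store.delete(session_id)
            return None
        return Session(session_id, record["data"])

    def save(self, session):
        record = {"data": dict(session), "accessed": self.clock()}
        self.store.save(session.session_id, record)
```

Swapping the storage backend would then mean replacing only `SessionStore`, while locking and expiry rules stay in `SessionManager`.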
Note that the attached interface conflates SessionStore and SessionManager. Some interfaces make an explicit ApplicationSession, which is contained by Session and keyed off some application ID; my interface implies that separation, but does not enforce it, and does not offer any extra functionality at that level (e.g., per-ApplicationSession locks or transactions). From ianb at colorstudy.com Tue Aug 16 18:50:38 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 11:50:38 -0500 Subject: [Web-SIG] Session interface (corrected URL) In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727B3@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E37727B3@exchange.hqamor.amorhq.net> Message-ID: <4302195E.7060704@colorstudy.com> Robert Brewer wrote: >>I wrote a possible interface for sessions: >> > > http://aspn.activestate.com/ASPN/CodeDoc/Apache-mod_perl_guide/src/modul > es.html > > You wrote Apache::Session, ::DBI, ::Request, AND ::SubProcess? I must > remember to put that in my memoirs... Doh! Clearly my copy-and-paste skills are lacking ;) http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Tue Aug 16 18:54:45 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 11:54:45 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8> References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8> Message-ID: <43021A55.50007@colorstudy.com> mike bayer wrote: > - because a "read" operation also registers a "last accessed time" data > member, its not using multiple reader/single writer style locking, > everyone is a writer. 
However, since I am sensitive to iframes, ajax > calls, and dynamic image calls hitting the same session concurrently > within a request which I'd rather not slow down, I do something less than > optimal which is I open the session store and read the full thing into > memory first when its accessed, and then immediately unlock. This > obviously can create problems for an application that is storing huge > amounts of data in its session which is not required in full for any one > request.

I think we can all agree that we're not expecting sessions to be primary storage for large objects, so we shouldn't worry too much about this.

However, as a use case for objects derived from the session, consider an upload form with validation. If someone uploads a large file but has an invalid form, you might want to keep the file around on the server side. You can't put it in the form (hidden or not) because then you needlessly retransfer the file. You can't leave the filename in the input field, because browsers don't allow that. So in a lot of ways this is where it would be nice to put a big file in the session. But you should really put it in a temporary directory and put the filename in the session (you could put the filename in a signed field in the form, but ignore that for now). The advantage of putting it in the session is that the session has tracking, a timeout, etc. So with the API I gave you might do:

    import os

    def delete_upload_filename(session_id):
        # Read-only load of the expiring session.
        session = session_store.load_session_read_only(session_id)
        if 'upload_filename' in session:
            filename = session['upload_filename']
            # Remove the temporary upload if it is still on disk.
            if os.path.exists(filename):
                os.unlink(filename)

    session['upload_filename'] = '/tmp/foo.jpg'
    session.store.expire_session_callbacks.append(delete_upload_filename)

Though there are a couple of issues. The session store should be passed along with the session ID. It should be specified that loading a session from this callback will not cancel its expiration.
Maybe per-session callbacks should be allowed; in which case the callbacks would have to be identifiable by a string or some pickleable value, since you can't pickle the functions themselves. I suppose you could implement the callback as an instance with a __call__ method, which pickle turns into a class name plus __dict__ values. I hate overusing __call__; if it has to be an instance (to be pickleable), then might as well give it a method name, and maybe call other methods as well. Then it essentially becomes an ad hoc event system. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 16 18:55:16 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 16 Aug 2005 12:55:16 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <43021414.9080102@colorstudy.com> Message-ID: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> At 11:28 AM 8/16/2005 -0500, Ian Bicking wrote: >I wrote a possible interface for sessions: >http://aspn.activestate.com/ASPN/CodeDoc/Apache-mod_perl_guide/src/modules.html Um, wha? >Session: > An instance of this represents one user/browser's session. >SessionStore: > An instance of this represents the persistence mechanism. This > is a functional component, not embodying any policy. >SessionManager: > This is a container for sessions, and uses a SessionStore. This > contains all the policy for loading, saving, locking, expiring > sessions. Which of these is responsible for managing client-side state? (i.e. cookie reading, setting, expiration, and refresh?) Maybe this is clearer in what you actually wrote, but the link above gives no clue. 
:) From ianb at colorstudy.com Tue Aug 16 19:02:12 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 12:02:12 -0500 Subject: [Web-SIG] Session interface In-Reply-To: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> References: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> Message-ID: <43021C14.9060702@colorstudy.com> Phillip J. Eby wrote: >> Session: >> An instance of this represents one user/browser's session. >> SessionStore: >> An instance of this represents the persistence mechanism. This >> is a functional component, not embodying any policy. >> SessionManager: >> This is a container for sessions, and uses a SessionStore. This >> contains all the policy for loading, saving, locking, expiring >> sessions. > > > Which of these is responsible for managing client-side state? (i.e. > cookie reading, setting, expiration, and refresh?) SessionManager is responsible for expiration. I'm not sure what you are thinking of for refresh. Updating last-accessed time? That would be the SessionManager as well. Cookies are not handled at all by these objects -- that's one of those boring details I think is best left to library users (frameworks, services, middleware), or put in another object. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Tue Aug 16 19:05:06 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 12:05:06 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <43021A55.50007@colorstudy.com> References: <45421.66.192.34.8.1124207316.squirrel@66.192.34.8> <43021A55.50007@colorstudy.com> Message-ID: <43021CC2.3040102@colorstudy.com> Ian Bicking wrote: > Though there's a couple issues. The sessino store should be passed > along with the session ID. It should be specified that loading a > session from this callback will not cancel its expiration. 
Maybe > per-session callbacks should be allowed; in which case the callbacks > would have to be identifiable by a string or some pickleable value, > since you can't pickle the functions themselves. I suppose you could > implement the callback as an instance with a __call__ method, which > pickle turns into a class name plus __dict__ values. I hate overusing > __call__; if it has to be an instance (to be pickleable), then might as > well give it a method name, and maybe call other methods as well. Then > it essentially becomes an ad hoc event system. A more complete event system would also let people like Phillip who don't want to use ad hoc storage to simply ignore that part, and use the session ID and events to manage data in their application storage. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 16 19:23:20 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 16 Aug 2005 13:23:20 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <43021C14.9060702@colorstudy.com> References: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> At 12:02 PM 8/16/2005 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>>Session: >>> An instance of this represents one user/browser's session. >>>SessionStore: >>> An instance of this represents the persistence mechanism. This >>> is a functional component, not embodying any policy. >>>SessionManager: >>> This is a container for sessions, and uses a SessionStore. This >>> contains all the policy for loading, saving, locking, expiring >>> sessions. >> >>Which of these is responsible for managing client-side state? (i.e. >>cookie reading, setting, expiration, and refresh?) > >SessionManager is responsible for expiration. I'm not sure what you are >thinking of for refresh. Updating last-accessed time? 
That would be the
>SessionManager as well.

By refresh, I mean updating a cookie's expiration time.

> Cookies are not handled at all by these objects -- that's one of those
> boring details I think is best left to library users (frameworks,
> services, middleware), or put in another object.

Wow. Those boring details, as you call them, are the entire concept of "session" to me. Now that you've posted the right interface URL, I'm looking at it and not seeing anything there that seems related to what I think of as sessions.

To me, session management is totally about managing the client-side state, since anything I'm storing on the server is application state and just gets stored the way anything else does. Some of the client-side state concerns:

* Triggering actions when the state information isn't available (due to being a new session or a client-side timeout)
* Initial expiration vs. refresh policy
* signed vs. unsigned data

If you handle these well, then simply storing real data in your application DB solves all problems, with no need for any of the objects in the interface you defined.

Or, to put it differently, I suppose I could wrap a pure client-side storage solution in the interfaces you propose, but it would be overkill, since concurrency would be a non-issue (among others). It would also be slightly broken in that the interfaces you've written up don't deal with any of the *interesting* details, which (IMO) are all in the client-state policy areas. (All the concurrency/scaling/cache-sharing/etc. issues of session stores vanish if you only have client-stored and db-stored data.)
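
To make the "signed vs. unsigned data" and "expiration vs. refresh" concerns concrete, here is a minimal sketch of a signed cookie value that carries its own expiration time. Nothing in it comes from the posted interface: `SECRET`, `sign_value`, `verify_value` and the `mac|expires|value` layout are assumptions chosen for illustration, and a real deployment would need its own key management.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical server-side secret, never sent to the client


def _mac(payload, secret):
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_value(value, expires_at, secret=SECRET):
    """Return 'mac|expires|value', where the MAC covers both expiry and value."""
    payload = "%d|%s" % (expires_at, value)
    return "%s|%s" % (_mac(payload, secret), payload)


def verify_value(cookie, now=None, secret=SECRET):
    """Return the original value, or None if tampered with, malformed or expired."""
    now = time.time() if now is None else now
    try:
        mac, expires_text, value = cookie.split("|", 2)
        expires_at = int(expires_text)
    except ValueError:
        return None
    expected = _mac("%d|%s" % (expires_at, value), secret)
    # Constant-time comparison, on bytes so odd input cannot raise.
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return None
    if now >= expires_at:
        return None
    return value
```

A refresh policy would then amount to calling `sign_value` again with a later `expires_at` on each hit and resending the cookie; an initial-expiration-only policy would sign once and never re-issue.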
From ianb at colorstudy.com Tue Aug 16 19:46:49 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 12:46:49 -0500 Subject: [Web-SIG] Session interface In-Reply-To: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> References: <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> Message-ID: <43022689.9080504@colorstudy.com> Phillip J. Eby wrote: >> SessionManager is responsible for expiration. I'm not sure what you >> are thinking of for refresh. Updating last-accessed time? That would >> be the SessionManager as well. > > > By refresh, I mean updating a cookie's expiration time. I'm not sure; I always try to make the cookie last longer than the session can. I suppose you could store information about when the cookie is supposed to expire in the session itself (since you can't read expiration times from the cookie). Or you could store the expiration as part of the cookie data; I haven't thought about doing it that way. >> Cookies are not handled at all by these objects -- that's one of >> those boring details I think is best left to library users >> (frameworks, services, middleware), or put in another object. > > > Wow. Those boring details, as you call them, are the entire concept of > "session" to me. Now that you've posted the right interface URL, I'm > looking at it and not seeing anything there that seems related to what I > think of as sessions. OK, maybe not boring, but impossible to put in a library in any useful way. If you do put them in a library, all you've really created is a big document on possible use cases and some really boring (as in trivial) functions -- write_cookie_header(), cookie_header_tuple(), add_session_id_to_url(), read_session_id(), etc. 
If it built on some other standard (services, middleware, etc), then maybe it would be useful; we have no such standard, so I don't see any useful work to be done there. Instead of inventing a single-use framework to build on, or trying to tackle the larger framework standardization, I'd rather ignore the issue and assume that we attain and save the session ID elsewhere. I think most of us have a clear idea of what we want a session to be, which includes persistence; at least, that's what all the APIs discussed so far have been about, and that's what "session" means in most frameworks. It's not what you want, and that's fine -- I think if you can get a session ID and notification of events you can do what you want to do just fine, and ignore the rest. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 16 21:45:47 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 16 Aug 2005 15:45:47 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <43022689.9080504@colorstudy.com> References: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> At 12:46 PM 8/16/2005 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>>SessionManager is responsible for expiration. I'm not sure what you are >>>thinking of for refresh. Updating last-accessed time? That would be >>>the SessionManager as well. >> >>By refresh, I mean updating a cookie's expiration time. > >I'm not sure; I always try to make the cookie last longer than the session >can. I suppose you could store information about when the cookie is >supposed to expire in the session itself (since you can't read expiration >times from the cookie). 
Or you could store the expiration as part of the >cookie data; I haven't thought about doing it that way. Note that if you store state client-side (login info, for example), then cookie expiration is a convenient way to get the client to do your garbage collection. If I want somebody's login to time out after 30 minutes of inactivity (or 8 hours, or whatever), the easy way to do that is to just set the cookie to time out, and refresh the expiration time on each hit. > If it built on some other standard (services, middleware, etc), then > maybe it would be useful; we have no such standard, so I don't see any > useful work to be done there. I suppose you have a point there, in that I'd see such management as a useful place for a middleware plus a service. But that's one reason why I'd like to see the services API spec'd out. :) > Instead of inventing a single-use framework to build on, or trying to > tackle the larger framework standardization, I'd rather ignore the issue > and assume that we attain and save the session ID elsewhere. > >I think most of us have a clear idea of what we want a session to be, >which includes persistence; at least, that's what all the APIs discussed >so far have been about, and that's what "session" means in most >frameworks. It's not what you want, and that's fine -- I think if you can >get a session ID and notification of events you can do what you want to do >just fine, and ignore the rest. Yeah, I'm just trying to point out that you keep saying "we're trying to solve this problem", and I say, "you know, if you do it this way, it's not a problem." And then you say, "yes, but if you do it that way, then there are no problems for us to solve." (i.e., it's "boring", "trivial", etc.) At which point I say, "yes, exactly!", thinking that we now agree that it's silly to do things in a way that makes them into a problem. 
But apparently you think that means we should instead spend time making problems and solving them, since so many other people have chosen to make their lives hard in that particular way. :) To put it another way, I see an opportunity here to educate developers about better ways of doing things, rather than to institutionalize wasteful ways of doing them. But I realize that I'm apparently the only person who thinks that way, so I'll shut up now. (At least here, anyway; in general I'm still going to talk about sessions being Considered Harmful from both the scalability and simplicity perspectives.) From fumanchu at amor.org Tue Aug 16 22:07:08 2005 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 16 Aug 2005 13:07:08 -0700 Subject: [Web-SIG] Session interface Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727BF@exchange.hqamor.amorhq.net> Phillip J. Eby wrote: > To put it another way, I see an opportunity here to > educate developers about better ways of doing things, I (and some of the other CherryPy devs) agree that there are better ways, particularly when using a persistent server process. > ...rather than to institutionalize wasteful > ways of doing them. The issue for us as framework developers is that the wasteful ways are *already* institutionalized. Education is a worthy goal, but if it takes 5 years to convince a majority of Python web developers that they don't need sessions, we need safe and strong implementations of sessions in the interim. I think that if we chose to ship CherryPy, for example, without any session functionality, we'd lose the very audience we want to educate. > But I realize that I'm apparently the only > person who thinks that way, so I'll shut up now. > (At least here, anyway; in general I'm still going > to talk about sessions being Considered Harmful from > both the scalability and simplicity perspectives.) Please continue talking about it! [But as you say, probably not within this thread ;)]. 
I never use sessions, and am interested in communicating the benefits of that approach to a wider audience. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From jonathan at carnageblender.com Tue Aug 16 22:14:18 2005 From: jonathan at carnageblender.com (Jonathan Ellis) Date: Tue, 16 Aug 2005 13:14:18 -0700 Subject: [Web-SIG] Session interface In-Reply-To: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> References: <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> Message-ID: <1124223258.7911.240822848@webmail.messagingengine.com> On Tue, 16 Aug 2005 15:45:47 -0400, "Phillip J. Eby" said: > >I'm not sure; I always try to make the cookie last longer than the session > >can. I suppose you could store information about when the cookie is > >supposed to expire in the session itself (since you can't read expiration > >times from the cookie). Or you could store the expiration as part of the > >cookie data; I haven't thought about doing it that way. Sure, sessions are overused and abused. Particularly among certain classes of developers which I won't characterize here. :) But there's a reason they're in such common use; it's a huge waste (particular for low-bandwidth clients) to store anything more than absolutely necessary in a cookie that the client sends repeatedly. Much more efficient to send "here's my token" which the server uses to retrieve the rest. -Jonathan From pje at telecommunity.com Tue Aug 16 22:37:31 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 16 Aug 2005 16:37:31 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <1124223258.7911.240822848@webmail.messagingengine.com> References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> At 01:14 PM 8/16/2005 -0700, Jonathan Ellis wrote: >Sure, sessions are overused and abused. Particularly among certain >classes of developers which I won't characterize here. :) > >But there's a reason they're in such common use; it's a huge waste >(particular for low-bandwidth clients) to store anything more than >absolutely necessary in a cookie that the client sends repeatedly. Much >more efficient to send "here's my token" which the server uses to >retrieve the rest. I agree; and in fact until I saw Ian's status-message example, I've never had need to store anything in a cookie except login credentials or an identifier used to find application objects like a shopping cart. IOW, cookies are fundamentally for short strings. However, if your session data consists solely of short strings, or short-lived medium-size strings (like a status message) then it works out nicely. If you have session data other than short strings, then you should store it with your application data, since it's clearly data that's part of your application. There are plenty of object-relational solutions and you can select your transaction/locking policies to suit your application. You can then handle load balancing at the web tier without having to play session-affinity tricks at the load balancer. 
The last time I wrote apps using a session store was in 1997, which was also when I wrote a session store of my own as part of a Python ASP emulator. I quickly realized that session stores quickly become persistence systems in their own right, unless you draw the line somewhere. However, if you draw the line at identifiers and other short strings, then you can just draw the line at the client and avoid the whole problem. From pje at telecommunity.com Tue Aug 16 22:40:55 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 16 Aug 2005 16:40:55 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E37727BF@exchange.hqamor.amo rhq.net> Message-ID: <5.1.1.6.0.20050816163746.030b12c8@mail.telecommunity.com> At 01:07 PM 8/16/2005 -0700, Robert Brewer wrote: >Phillip J. Eby wrote: > > To put it another way, I see an opportunity here to > > educate developers about better ways of doing things, > >I (and some of the other CherryPy devs) agree that there are better >ways, particularly when using a persistent server process. It's nice to know I'm not the only crazy one around here. ;) >The issue for us as framework developers is that the wasteful ways are >*already* institutionalized. Education is a worthy goal, but if it takes >5 years to convince a majority of Python web developers that they don't >need sessions, we need safe and strong implementations of sessions in >the interim. I think that if we chose to ship CherryPy, for example, >without any session functionality, we'd lose the very audience we want >to educate. So make a session store that uses cookies only, and upsell it as your new "RESTful session storage option, with infinite scalability". ;) The flip side is then getting more good relationally-backed persistence systems out there, to take up the complex-objects side of the equation. 
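A rough sketch of what a cookie-only store for short strings might look like, assuming Python's standard http.cookies module. The function names, the "session" cookie name, the 30-minute default and the size budget are all made up for illustration; nothing here is an API proposed in this thread. Re-sending the header on every hit is what refreshes the expiration. The value is unsigned, so it should only carry data the client is allowed to change.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlencode

# Hypothetical budget: browsers commonly cap a single cookie near 4 KB.
MAX_COOKIE_CHARS = 4000


def dump_session_cookie(data, name="session", max_age=1800, path="/"):
    """Return a Set-Cookie header value holding a dict of short strings.

    Emitting this on each response pushes the expiry forward, so an
    idle client simply drops the cookie after max_age seconds.
    """
    value = urlencode(sorted(data.items()))
    if len(value) > MAX_COOKIE_CHARS:
        raise ValueError("session data too large for a cookie")
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age
    cookie[name]["path"] = path
    return cookie[name].OutputString()


def load_session_cookie(cookie_header, name="session"):
    """Parse a Cookie request header back into a dict (empty if absent)."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if name not in cookie:
        return {}
    return dict(parse_qsl(cookie[name].value))
```

Anything that does not fit in a handful of short strings is a sign the data belongs with the application's own persistent objects instead.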
From jonathan at carnageblender.com Tue Aug 16 22:48:50 2005 From: jonathan at carnageblender.com (Jonathan Ellis) Date: Tue, 16 Aug 2005 13:48:50 -0700 Subject: [Web-SIG] Session interface In-Reply-To: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> Message-ID: <1124225330.11574.240825141@webmail.messagingengine.com> On Tue, 16 Aug 2005 16:37:31 -0400, "Phillip J. Eby" said: > At 01:14 PM 8/16/2005 -0700, Jonathan Ellis wrote: > >But there's a reason they're in such common use; it's a huge waste > >(particular for low-bandwidth clients) to store anything more than > >absolutely necessary in a cookie that the client sends repeatedly. Much > >more efficient to send "here's my token" which the server uses to > >retrieve the rest. > > I agree; and in fact until I saw Ian's status-message example, I've never > had need to store anything in a cookie except login credentials or an > identifier used to find application objects like a shopping cart. > > IOW, cookies are fundamentally for short strings. However, if your > session > data consists solely of short strings, or short-lived medium-size strings > (like a status message) then it works out nicely. Sure, but given the choice between N short strings and one, one is better. :) > If you have session data other than short strings, then you should store > it > with your application data, since it's clearly data that's part of your > application. Still, it can be good to have a simple place to store non-permanent information. Is the potential for abuse worth it? Perhaps not. 
I also can't think of a time when I needed sessions in the past 5 or so years. -Jonathan From mike_mp at zzzcomputing.com Tue Aug 16 23:06:57 2005 From: mike_mp at zzzcomputing.com (mike bayer) Date: Tue, 16 Aug 2005 17:06:57 -0400 (EDT) Subject: [Web-SIG] Session interface In-Reply-To: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> Message-ID: <57167.66.192.34.8.1124226417.squirrel@66.192.34.8> Phillip J. Eby said: > I agree; and in fact until I saw Ian's status-message example, I've never > had need to store anything in a cookie except login credentials or an > identifier used to find application objects like a shopping cart. > > IOW, cookies are fundamentally for short strings. However, if your > session > data consists solely of short strings, or short-lived medium-size strings > (like a status message) then it works out nicely. > theres also security considerations regarding using only cookies without server side sessions. For login tokens, if theres no corresponding server-side token to match up that it is in fact a current login and not something left over from a long-closed session, then some kind of clever encryption combined with time information must be used on the client-side token that can guarantee the login is recent and valid. I always use server-side sessions for logins for this reason. 
I also think server-side sessions are an easy place to store user preferences and permissioning information originally loaded from the database, as a quick and easy way to cut down on repeated database calls per request, which is not as cleanly represented as an extra few thousand characters sent back and forth with every request. all that said, my current employer uses cookie-only sessions for scalability reasons. might this be-all-end-all session API also have a "client-only" implementation available ? - mike From gtalvola at nameconnector.com Tue Aug 16 23:08:59 2005 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Tue, 16 Aug 2005 17:08:59 -0400 Subject: [Web-SIG] Session interface Message-ID: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com> Jonathan Ellis wrote: > Still, it can be good to have a simple place to store non-permanent > information. For example... I think a good use of sessions is in remembering selections that have been made earlier on. For example, suppose you have a reporting application where you allow the user to select one or more items to report on from a list box, several filtering options in dropdowns or checkboxes, sorting and grouping behavior, etc. You want to remember those settings so that if the user returns to the report selection page, their last selected settings are pre-selected. But, unless the user chooses to save those settings as a "stored report", you'd like to forget the settings when the user logs out or when they close their browser. Also, assume that your application already has this bundle of selections in the form of a Python object. Isn't the cleanest, easiest, and more efficient way to handle this to simply save the Python object in a session variable? In some cases, for example using Webware's in-memory sessions, for example, this data never has to be marshaled or leave the application server at all. 
If I didn't have sessions, I think using either cookies or a back-end db would be more work, less clean, and less efficient in this case. - Geoff From jonathan at carnageblender.com Tue Aug 16 23:28:26 2005 From: jonathan at carnageblender.com (Jonathan Ellis) Date: Tue, 16 Aug 2005 14:28:26 -0700 Subject: [Web-SIG] Session interface In-Reply-To: <57167.66.192.34.8.1124226417.squirrel@66.192.34.8> References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> <57167.66.192.34.8.1124226417.squirrel@66.192.34.8> Message-ID: <1124227706.16301.240828160@webmail.messagingengine.com> On Tue, 16 Aug 2005 17:06:57 -0400 (EDT), "mike bayer" said: > I also > think server-side sessions are an easy place to store user preferences > and > permissioning information originally loaded from the database, as a quick > and easy way to cut down on repeated database calls per request, which is > not as cleanly represented as an extra few thousand characters sent back > and forth with every request. Now that's an example of when I think sessions are a poor solution. IMO caching objects from the database is the job for the, well, database object cache. :) They are similar but not identical. For instance, while session data typically expires after a certain amount of time, permanent data should never expire unless invalidated by an update. -Jonathan From pje at telecommunity.com Tue Aug 16 23:42:40 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 16 Aug 2005 17:42:40 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconne ctor.com> Message-ID: <5.1.1.6.0.20050816171536.030b1b08@mail.telecommunity.com> At 05:08 PM 8/16/2005 -0400, Geoffrey Talvola wrote: >Jonathan Ellis wrote: > > Still, it can be good to have a simple place to store non-permanent > > information. > >For example... > >I think a good use of sessions is in remembering selections that have been >made earlier on. For example, suppose you have a reporting application >where you allow the user to select one or more items to report on from a >list box, several filtering options in dropdowns or checkboxes, sorting and >grouping behavior, etc. You want to remember those settings so that if the >user returns to the report selection page, their last selected settings are >pre-selected. But, unless the user chooses to save those settings as a >"stored report", you'd like to forget the settings when the user logs out or >when they close their browser. > >Also, assume that your application already has this bundle of selections in >the form of a Python object. > >Isn't the cleanest, easiest, and more efficient way to handle this to simply >save the Python object in a session variable? No. :) I have to admit I'm probably biased by early Zope experience, where cookie variables are as easy to use as form variables or any other kind of variable. Just set the cookies to save the options, then refer to them in the page. Sweet and simple. And if you set the cookie path to the path of the page, then the client doesn't have to send them on every request, only the ones where it makes a difference. > In some cases, for example >using Webware's in-memory sessions, for example, this data never has to be >marshaled or leave the application server at all. > >If I didn't have sessions, I think using either cookies or a back-end db >would be more work, less clean, and less efficient in this case. 
Maybe that's a limitation of the framework? As I said, I'm probably spoiled by how easily Zope merges GET/POST/cookie variables, such that form variables override cookies, but if the form variable isn't supplied the cookie is used as a default. That one simple behavior made "smart forms" really easy to make in Zope and Zope-like systems. From pje at telecommunity.com Tue Aug 16 23:51:14 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 16 Aug 2005 17:51:14 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <57167.66.192.34.8.1124226417.squirrel@66.192.34.8> References: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com> At 05:06 PM 8/16/2005 -0400, mike bayer wrote: >theres also security considerations regarding using only cookies without >server side sessions. For login tokens, if theres no corresponding >server-side token to match up that it is in fact a current login and not >something left over from a long-closed session, then some kind of clever >encryption combined with time information must be used on the client-side >token that can guarantee the login is recent and valid. That's why I listed signed vs. unsigned data as one of the concerns that should be part of a client-side session API design. You don't need encryption, btw, you just need a signature. Signatures are easily done by using a hashing algorithm and a secret key. And by easily done, I mean a few lines of Python. 
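A minimal sketch of such a signature, assuming the standard hmac and hashlib modules and a timestamp carried inside the signed value so that stale tokens can be rejected. The names sign, unsign, BadSignature and SECRET_KEY, and the "digest|timestamp|data" layout, are illustrative assumptions only.

```python
import hashlib
import hmac
import time

# Hypothetical key; in practice it is loaded from configuration, not source.
SECRET_KEY = b"replace-me"


class BadSignature(ValueError):
    """Raised when a token is malformed, tampered with, or too old."""


def _digest(key, payload):
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(data, key=SECRET_KEY, now=None):
    """Return 'digest|timestamp|data' for a short string."""
    timestamp = str(int(time.time() if now is None else now))
    payload = timestamp + "|" + data
    return _digest(key, payload) + "|" + payload


def unsign(token, key=SECRET_KEY, max_age=1800, now=None):
    """Return the original data, or raise BadSignature."""
    try:
        digest, timestamp, data = token.split("|", 2)
    except ValueError:
        raise BadSignature("malformed token")
    expected = _digest(key, timestamp + "|" + data)
    # Compare as bytes in constant time to avoid leaking the digest.
    if not hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        raise BadSignature("signature mismatch")
    current = time.time() if now is None else now
    if current - int(timestamp) > max_age:
        raise BadSignature("token expired")
    return data
```

The timestamp is covered by the signature, so a client cannot extend its own login by editing it; the data itself stays readable, which is fine for anything that only needs to be tamper-evident.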
Really the only "interesting" part of managing a hash-based signature is where to store the key such that all the server processes can access it, but it isn't part of your source code. You can do that with a file on a single server, but for multiple servers it's back to the DB or else you need a way to push out configuration to the servers. You also need key rotation such that your signatures indicate which key was used to sign them, so that people's keys don't suddenly stop working when you update your key. OTOH, if you have a multi-server setup you probably already know about all these problems and have ways to deal with them. From ianb at colorstudy.com Wed Aug 17 00:22:50 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Aug 2005 17:22:50 -0500 Subject: [Web-SIG] Secret keys (was: Session interface) In-Reply-To: <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com> References: <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com> Message-ID: <4302673A.6070009@colorstudy.com> Phillip J. Eby wrote: > Really the only "interesting" part of managing a hash-based signature is > where to store the key such that all the server processes can access it, > but it isn't part of your source code. You can do that with a file on a > single server, but for multiple servers it's back to the DB or else you > need a way to push out configuration to the servers. 
You also need key > rotation such that your signatures indicate which key was used to sign > them, so that people's keys don't suddenly stop working when you update > your key.

It would be nice if there was a standard way to get the "server's" secret key (or key(s)). Or, maybe more abstractly, to sign and confirm the signature of an item, like:

    signed_data = sign(data)
    # Raises exception if there's a problem:
    data = extract_signed_data(signed_data)

At that level any key rotation can be hidden. The mechanism is easy, the key management is actually not "hard", but it depends on what your definition of "server" is. That would be a ripe place for standardization; easy to define, useful, multiple implementations expected. But where do you stuff the functions?

It almost seems best to have server environments create or monkey patch some single module, since I can't really think of a reason that a single process should have multiple keys (except maybe in Zope, which has intraprocess security).

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From ianb at colorstudy.com Wed Aug 17 00:32:11 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Aug 2005 17:32:11 -0500
Subject: [Web-SIG] Session interface
In-Reply-To: <43021414.9080102@colorstudy.com>
References: <43021414.9080102@colorstudy.com>
Message-ID: <4302696B.6030601@colorstudy.com>

Anyone still interested in session libraries? Putting the wisdom of such a thing aside, any thoughts on a library itself?

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From pje at telecommunity.com Wed Aug 17 01:16:06 2005
From: pje at telecommunity.com (Phillip J.
Eby)
Date: Tue, 16 Aug 2005 19:16:06 -0400
Subject: [Web-SIG] Secret keys (was: Session interface)
In-Reply-To: <4302673A.6070009@colorstudy.com>
References: <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> <5.1.1.6.0.20050816174540.029e77a0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050816190849.01b26308@mail.telecommunity.com>

At 05:22 PM 8/16/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Really the only "interesting" part of managing a hash-based signature is
>>where to store the key such that all the server processes can access it,
>>but it isn't part of your source code. You can do that with a file on a
>>single server, but for multiple servers it's back to the DB or else you
>>need a way to push out configuration to the servers. You also need key
>>rotation such that your signatures indicate which key was used to sign
>>them, so that people's keys don't suddenly stop working when you update
>>your key.
>
>It would be nice if there was a standard way to get the "server's" secret
>key (or key(s)). Or, maybe more abstractly, to sign and confirm the
>signature of an item, like:
>
>    signed_data = sign(data)
>    # Raises exception if there's a problem:
>    data = extract_signed_data(signed_data)

The extraction facility should probably accept an optional timeout, too, so that messages older than the timeout are considered invalid.

>At that level any key rotation can be hidden.
The mechanism is easy, the
>key management is actually not "hard", but it depends on what your
>definition of "server" is. That would be a ripe place for
>standardization; easy to define, useful, multiple implementations
>expected. But where do you stuff the functions?

In a WSGI service, as soon as we finish that spec. :)

> It almost seems best to have server environments create or monkey patch
> some single module, since I can't really think of a reason that a single
> process should have multiple keys (except maybe in Zope, which has
> intraprocess security).

I'm not so much concerned about intraprocess security as I am with associating things with the right applications, and being able to use two independently-developed applications that depend on different key stores. With WSGI, you can run discrete apps in the same server process, so it seems to make more sense to put that in the pipeline than a monkeypatched module.

From chrism at plope.com Wed Aug 17 03:42:56 2005
From: chrism at plope.com (Chris McDonough)
Date: Tue, 16 Aug 2005 21:42:56 -0400
Subject: [Web-SIG] Session interface
In-Reply-To: <5.1.1.6.0.20050816171536.030b1b08@mail.telecommunity.com>
References: <5.1.1.6.0.20050816171536.030b1b08@mail.telecommunity.com>
Message-ID: <1124242977.30493.80.camel@localhost.localdomain>

I haven't been closely following this thread and this may have already been said but IMO sessions are most useful when the querying user is not identified and you need a place to stash data related to that user (e.g. a shopping cart). They are convenient in other circumstances but rarely necessary. I've never quite understood why people use server-side sessions for authentication. Maybe it's because they're typically so easy to use and have been sold as "the way to maintain state" in a web application to a lot of people. But in reality they can be quite expensive under high load because of their generality and there's almost always a better way.
On Tue, 2005-08-16 at 17:42 -0400, Phillip J. Eby wrote: > At 05:08 PM 8/16/2005 -0400, Geoffrey Talvola wrote: > >Jonathan Ellis wrote: > > > Still, it can be good to have a simple place to store non-permanent > > > information. > > > >For example... > > > >I think a good use of sessions is in remembering selections that have been > >made earlier on. For example, suppose you have a reporting application > >where you allow the user to select one or more items to report on from a > >list box, several filtering options in dropdowns or checkboxes, sorting and > >grouping behavior, etc. You want to remember those settings so that if the > >user returns to the report selection page, their last selected settings are > >pre-selected. But, unless the user chooses to save those settings as a > >"stored report", you'd like to forget the settings when the user logs out or > >when they close their browser. > > > >Also, assume that your application already has this bundle of selections in > >the form of a Python object. > > > >Isn't the cleanest, easiest, and more efficient way to handle this to simply > >save the Python object in a session variable? > > No. :) > > I have to admit I'm probably biased by early Zope experience, where cookie > variables are as easy to use as form variables or any other kind of > variable. Just set the cookies to save the options, then refer to them in > the page. Sweet and simple. And if you set the cookie path to the path of > the page, then the client doesn't have to send them on every request, only > the ones where it makes a difference. > > > > In some cases, for example > >using Webware's in-memory sessions, for example, this data never has to be > >marshaled or leave the application server at all. > > > >If I didn't have sessions, I think using either cookies or a back-end db > >would be more work, less clean, and less efficient in this case. > > Maybe that's a limitation of the framework? 
As I said, I'm probably > spoiled by how easily Zope merges GET/POST/cookie variables, such that form > variables override cookies, but if the form variable isn't supplied the > cookie is used as a default. That one simple behavior made "smart forms" > really easy to make in Zope and Zope-like systems. > > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com > From mso at oz.net Wed Aug 17 06:54:48 2005 From: mso at oz.net (Mike Orr) Date: Tue, 16 Aug 2005 21:54:48 -0700 Subject: [Web-SIG] Session interface In-Reply-To: <43021414.9080102@colorstudy.com> References: <43021414.9080102@colorstudy.com> Message-ID: <4302C318.9050900@oz.net> Regarding Ian's session interface: http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py Ian Bicking wrote: >Thinking on it more, probably a good place to start would be agreeing on >specific terminology for the objects involved, since I've seen several >different sets of terminology, many of which use the same words for >different ideas: > >Session: > An instance of this represents one user/browser's session. >SessionStore: > An instance of this represents the persistence mechanism. This > is a functional component, not embodying any policy. >SessionManager: > This is a container for sessions, and uses a SessionStore. This > contains all the policy for loading, saving, locking, expiring > sessions. > > At minimum, the SessionManager links the SessionStore, Session, and application together. It can be generic, along with loading/saving/locking. (Although we might allow the application to choose a locking policy.) But expiring is very application-specific, and it may not be the "application" doing it but a separate cron job. 
Perhaps most applications will be happy with an "expire all sessions unmodified for N minutes", but some will want to inspect the metadata and others the content. So maybe all the SessionManager can do is:

    .delete_session(id) => pass message directly to SessionStore
    .iter_sessions() => tuples of (id, metadata)
    .iter_sessions_with_content() => tuples of (id, metadata, content)

... where metadata includes the access time and whatever else we decide. Of course, iterating the content may be disk/memory intensive.

If .delete_expired_sessions() is included, the application would have to subclass SessionManager rather than just using it. That's not necessarily bad but a potential limitation. Or the application could kludge up a policy from your methods:

    cutoff = time.time() - (60 * 60 * 4)
    for sid in sm.session_ids():
        if sm.last_accessed(sid) < cutoff:
            sm.delete_session(sid)

I suppose kludgy is in the eye of the beholder. This would not be kludgy:

    cutoff = time.time() - (60 * 60 * 4)
    for sid, metadata in sm.iter_sessions():
        if metadata.atime < cutoff:
            sm.delete_session(sid)

Curses on anybody who says, "What's the difference?"

PS. Kudos for using .names_with_underscores rather than .studlyCaps.

Your other methods look all right at first glance. We'll know when we port existing frameworks to it whether it's adequate. (Or should that be "when we port it to existing frameworks"? Or "when we make existing frameworks use it as middleware"?) We'll also have to keep an eye on a usage pattern to recommend for future frameworks, and on whether this API has anything to do with the "sessionless" persistence patterns that have also been proposed.

Interesting ideas you've had about read/write vs read-only sessions. I'd say let's support read-only sessions, and maybe that will encourage applications to use them.

Session ID cookies seem like a generic thing this class should handle, especially for applications that don't otherwise use cookies.
XML-RPC encapsulates the XML (a necessary evil); why shouldn't we encapsulate the cookie (another necessary evil)?

>Does that sound good? Note that the attached interface conflates
>SessionStore and SessionManager. Some interfaces make an explicit
>ApplicationSession, which is contained by Session and keyed off some
>application ID; my interface implies that separation, but does not
>enforce it, and does not offer any extra functionality at that level
>(e.g., per-ApplicationSession locks or transactions).
>
>

I'm not sure what you mean by ApplicationSession. Perl's session object is a dictionary, and you can store anything in it. Our top-level object has to be flexible due to grandfathering, unless we want to force applications to translate to/from our session object to their native session format. Yet you define certain attributes/methods the Session must have, which legacy Sessions don't. I guess allow the application to provide a subclass or compatible class, and let it worry about how to upgrade its native session object.

Regarding sessionless persistence, that reminds me of a disagreement I had with Titus in designing session2. Quixote provides Session.user default None, but doesn't define what other values it can have. I put a full-fledged User object with username/group/permission info. Titus puts a string name and stores everything else in his application database. So his *SessionStore classes put the name in a VARCHAR column and didn't save the rest of the session data. I argued that "most people will have a User object, and they'll expect the entire Session to be pickled because that's what PHP/Perl do." He relented, so the current *SessionStores can be used either way.

Perhaps applications should store all session data directly, keyed by session ID (and perhaps "username"), rather than using pickled Sessions. That would be a good idea for a parallel project. I'm not sure how relevant that would be to this API except to share "cookie code".
This API + implementations are required in any case, both because "most users" will not consider Python if it doesn't have "robust session handling", and a common library would allow frameworks to use it rather than reinventing the wheel incompatibly. This is true regardless of the merits of sessions. -- Mike Orr From ianb at colorstudy.com Wed Aug 17 07:31:12 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 17 Aug 2005 00:31:12 -0500 Subject: [Web-SIG] Session interface In-Reply-To: <4302C318.9050900@oz.net> References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net> Message-ID: <4302CBA0.6040307@colorstudy.com> Mike Orr wrote: > Regarding Ian's session interface: > http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py > > Ian Bicking wrote: > >> Thinking on it more, probably a good place to start would be agreeing >> on specific terminology for the objects involved, since I've seen >> several different sets of terminology, many of which use the same >> words for different ideas: >> >> Session: >> An instance of this represents one user/browser's session. >> SessionStore: >> An instance of this represents the persistence mechanism. This >> is a functional component, not embodying any policy. >> SessionManager: >> This is a container for sessions, and uses a SessionStore. This >> contains all the policy for loading, saving, locking, expiring >> sessions. >> >> > > > At minimum, the SessionManager links the SessionStore, Session, and > application together. It can be generic, along with > loading/saving/locking. (Although we might allow the application to > choose a locking policy.) That could be a little difficult, since multiple applications may be sharing a session. But at the same time, applications that don't expect ConflictError are going to be pissed if you configure your system for optimistic locking. Of course, given a session ID and a session store, each application could have its own manager. Possibly. Hmm... 
interesting. In that case each SessionManager needs an id, which is a bit annoying -- it has to be stable and shared, because the same SessionManager has to be identifiable over multiple processes. But I hate inventing IDs all over the place. I feel like I'm pulling string keys out of my ass, and if I'm going to pull things out of my ass I at least don't want to then put them into my code. I sense UUIDs coming on :(

That said, this isn't the only place I need strings that are unique to an application instance.

> But expiring is very application-specific,
> and it may not be the "application" doing it but a separate cron job.
> Perhaps most applications will be happy with an "expire all sessions
> unmodified for N minutes", but some will want to inspect the metadata
> and others the content. So maybe all the SessionManager can do is:
>
> .delete_session(id) => pass message directly to SessionStore
> .iter_sessions() => tuples of (id, metadata)
> .iter_sessions_with_content() => tuples of (id, metadata, content)

I think metadata is probably good; or lazily-loaded sessions or something. The metadata is important I think, because updating metadata shouldn't be affected by locking and whatnot. I think Mike mentioned a problem with locking and updating the timestamp contained in the session -- we should avoid that.

> ... where metadata includes the access time and whatever else we
> decide. Of course, iterating the content may be disk/memory intensive.

Sure. We could have a callback to do filtering too, maybe with a default filter by expiration time. Or event callbacks.

> If .delete_expired_sessions() is included, the application would have to
> subclass SessionManager rather than just using it. That's not
> necessarily bad but a potential limitation.
Or the application could > kludge up a policy from your methods: > > cutoff = time.time() - (60 * 60 * 4) > for sid in sm.session_ids(): > if sm.last_accessed(sid) < cutoff: > sm.delete_session(sid) > > I suppose kludgy is in the eye of the beholder. This would not be kludgy: > > cutoff = time.time() - (60 * 60 * 4) > for sid, metadata in sm.iter_sessions(): > if metadata.atime < cutoff: > sm.delete_session(sid) > > Curses on anybody who says, "What's the difference?" > > PS. Kudos for using .names_with_underscores rather than .studlyCaps. > > Your other methods look all right at first glance. We'll know when we > port existing frameworks to it whether it's adequate. (Or should that > be "when we port it to existing frameworks"? Or "when we make existing > frameworks use it as middleware"?) We'll also have to keep an eye on a > usage pattern to recommend for future frameworks, and on whether this > API has anything to do with the "sessionless" persistance patterns that > have also been proposed. Acquiring or creating a session ID is outside of the scope of this interface, but I think that's much of what would be useful to sessionless users. Or, rather, people who want application-specific sessions. > Interesting ideas you've had about read/write vs read-only sessions. > I'd say let's support read-only sessions, and maybe that will encourage > applications to use them. > > Session ID cookies seem like a generic thing this class should handle, > especially for applications that don't otherwise use cookies. XML-RPC > encapsulates the XML (an necessary evil); why shouldn't we encapsulate > the cookie (another necessary evil)? XML-RPC contains the XML, but it doesn't deal with the transport really. And, just using XML-RPC as an example, what if you want to stuff the session ID inside the XML-RPC request instead of in a cookie header? 
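[Editorial sketch, not part of Ian's message.] The question above, whether the session ID travels in a cookie header or inside the XML-RPC request itself, can be kept out of the session interface by resolving the ID in one small helper. A minimal sketch, assuming a WSGI-style environ dict; the cookie name, the rpc_params dict and the function name are hypothetical and do not come from the thread or from any proposed standard:

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "session_id"  # hypothetical cookie name, not from the thread


def find_session_id(environ, rpc_params=None):
    """Return a session ID from an RPC parameter or a cookie, else None.

    `environ` is a WSGI-style dict. `rpc_params` is a hypothetical dict of
    already-decoded XML-RPC arguments supplied by the caller.
    """
    # Prefer an ID carried inside the request body, if the caller sent one.
    if rpc_params and rpc_params.get("session_id"):
        return rpc_params["session_id"]
    # Otherwise fall back to the Cookie header.
    cookie = SimpleCookie()
    cookie.load(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(COOKIE_NAME)
    return morsel.value if morsel is not None else None
```

Whatever this returns would then be handed to the session machinery, which never needs to know which transport carried the ID.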
But anyway, the reason I don't want to handle this is because this would be much easier if building upon a Standard That Does Not Yet Exist, and I'd rather avoid overlapping with that standard. >> Does that sound good? Note that the attached interface conflates >> SessionStore and SessionManager. Some interfaces make an explicit >> ApplicationSession, which is contained by Session and keyed off some >> application ID; my interface implies that separation, but does not >> enforce it, and does not offer any extra functionality at that level >> (e.g., per-ApplicationSession locks or transactions). >> >> > > > I'm not sure what you mean by ApplicationSession. Perl's session object > is a dictionary, and you can store anything in it. Our top-level object > has to be flexible due to grandfathering, unless we want to force > applications to translate to/from our session object to their native > session format. Yet you define certain attributes/methods the Session > must have, which legacy Sessions don't. I guess allow the application > to provide a subclass or compatible class, and let it worry about how to > upgrade its native session object. I was thinking of pythonweb's "Store": http://pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/node153.html I vaguely suggest in the interface that each application should put all of its data in a single key (based on the application name). Now I think that should be based on a unique name (not the application name, because the application may exist multiple times in the process), and maybe with an entirely different manager. > Regarding sessionless persistence, that reminds me of a disagreement I > had with Titus in designing session2. Quixote provides Session.user > default None, but doesn't define what other values it can have. I put a > full-fledged User object with username/group/permission info. Titus > puts a string name and stores everything else in his application > database. 
So his *SessionStore classes put the name in a VARCHAR column > and didn't save the rest of the session data. I argued that "most > people will have a User object, and they'll expect the entire Session to > be pickled because that's what PHP/Perl do." He relented, so the > current *SessionStores can be used either way. In the interface I suggest anything pickleable can go in a key. This requirement has been the source of some controversy in Webware, since people wanted to put open file objects and such in the session; mostly people coming from Java where apparently that's the norm. Anyway, it's still possible with this interface to have a store that never pickles anything; I can just hope no one writes code they expect anyone else to use that demands in-memory session storage. Those are lame even when you are using threads. I think the example shows one reason the session shouldn't be considered a public API. I think it's fine to put the username or the user object in the session -- we can debate the pluses and minuses, but it works -- but I think you should definitely wrap that implementation detail in something else. E.g., request.user should return request.session['user'] or something. > Perhaps applications should store all session data directly, keyed by > session ID (and perhaps "username"), rather than using pickled > Sessions. That would be a good idea for a parallel project. I'm not > sure how relevant that would be to this API except to share "cookie > code". This API + implementations are required in any case, both > because "most users" will not consider Python if it doesn't have "robust > session handling", and a common library would allow frameworks to use it > rather than reinventing the wheel incompatibly. This is true regardless > of the merits of sessions. 
I guess if applications each have their own SessionManager, they could have their own Session classes, and if they wanted to the Session objects could use application-specific storage and even an application-specific API (not just a dictionary interface). I don't know what the point of that would be, though, since it's all application-specific and not generic, so you might as well just use the session ID and ignore the rest of the API. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From mso at oz.net Wed Aug 17 07:33:10 2005 From: mso at oz.net (Mike Orr) Date: Tue, 16 Aug 2005 22:33:10 -0700 Subject: [Web-SIG] Session interface In-Reply-To: <4302C318.9050900@oz.net> References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net> Message-ID: <4302CC16.2050206@oz.net> Mike Orr wrote: >Regarding sessionless persistence, that reminds me of a disagreement I >had with Titus in designing session2. Quixote provides Session.user >default None, but doesn't define what other values it can have. I put a >full-fledged User object with username/group/permission info. Titus >puts a string name and stores everything else in his application >database. So his *SessionStore classes put the name in a VARCHAR column >and didn't save the rest of the session data. I argued that "most >people will have a User object, and they'll expect the entire Session to >be pickled because that's what PHP/Perl do." He relented, so the >current *SessionStores can be used either way. > >Perhaps applications should store all session data directly, keyed by >session ID (and perhaps "username"), rather than using pickled >Sessions. That would be a good idea for a parallel project. I'm not >sure how relevant that would be to this API except to share "cookie >code". 
This API + implementations are required in any case, both
>because "most users" will not consider Python if it doesn't have "robust
>session handling", and a common library would allow frameworks to use it
>rather than reinventing the wheel incompatibly. This is true regardless
>of the merits of sessions.
>
>

Another thing about sessionless persistence. I find sessions useful for storing miscellaneous data that would otherwise be sent to the browser and back. Usually it's not a question of byte size but rather: (A) I don't want the user to see the data directly -- it contains more information about the application/server structure than I care to divulge, and (B) I don't want the user manipulating the data and sending back something invalid or in the wrong state -- which I would then have to error-check.

I could store the data in my relational database, but then I'd have to make a half-dozen tables for:

    .user   : a User instance.
    .search : the latest search results (list of record IDs), the last
              page viewed (positive int), and the criteria to redo the
              search or repopulate the search form (dict).
    .message: a message to display at the next request.
    ... other stuff ??

So I guess sessions are a lazy way to have object-database features in a relational-database application. At least for data that lasts longer than a request but shorter than a session timeout.
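[Editorial sketch, not part of the original message.] The layout listed above could be held in one picklable object. This is only an illustration under stated assumptions: plain dicts stand in for the User instance and the session class, and the helper names new_session and pop_message are hypothetical, as is the choice to clear the message once it has been read.

```python
import pickle


def new_session():
    """Return a fresh session laid out like the attributes listed above."""
    return {
        "user": None,            # the message stores a User instance here
        "search": {
            "result_ids": [],    # record IDs only, not full records
            "page": 1,           # last page viewed
            "criteria": {},      # enough to redo the search or refill the form
        },
        "message": None,         # to be displayed at the next request
    }


def pop_message(session):
    """Return the pending message and clear it (an assumed display-once rule)."""
    message, session["message"] = session["message"], None
    return message


session = new_session()
session["user"] = {"username": "alice", "groups": ["staff"]}  # stand-in for a User
session["search"]["result_ids"] = [17, 42, 99]
session["message"] = "3 records found"

blob = pickle.dumps(session)     # what a store would write
restored = pickle.loads(blob)    # what the next request would read back
```

None of this data reaches the browser, which is the point made in (A) and (B) above; only the session ID does.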
-- Mike Orr From chrism at plope.com Wed Aug 17 07:48:44 2005 From: chrism at plope.com (Chris McDonough) Date: Wed, 17 Aug 2005 01:48:44 -0400 Subject: [Web-SIG] Session interface In-Reply-To: <4302CBA0.6040307@colorstudy.com> References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net> <4302CBA0.6040307@colorstudy.com> Message-ID: <1124257724.17688.11.camel@plope.dyndns.org> FWIW, some interesting ideas (and not so interesting ideas) for sessioning architecture in general are captured at http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/UseCases and http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/CoreSessionTrackingDiscussion UML that more or less represents Zope's current sessioning model is at: http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/CoreSessionTrackingUML - C
From mso at oz.net Wed Aug 17 09:15:55 2005 From: mso at oz.net (Mike Orr) Date: Wed, 17 Aug 2005 00:15:55 -0700 Subject: [Web-SIG] Session interface In-Reply-To: <4302CBA0.6040307@colorstudy.com> References: <43021414.9080102@colorstudy.com> <4302C318.9050900@oz.net> <4302CBA0.6040307@colorstudy.com> Message-ID: <4302E42B.6020803@oz.net> Ian Bicking wrote: >> At minimum, the SessionManager links the SessionStore, Session, and >> application together. It can be generic, along with >> loading/saving/locking. (Although we might allow the application to >> choose a locking policy.) > > > That could be a little difficult, since multiple applications may be > sharing a session. But at the same time, applications that don't > expect ConflictError are going to be pissed if you configure your > system for optimistic locking. > > Of course, given a session ID and a session store, each application > could have its own manager. I wasn't thinking of multi-application sessions, much less whether they would have their own SessionManagers. And since my applications don't have a locking policy, I have no opinion which one is best, if you want to impose one.
Certainly it makes sense that applications sharing a session must agree on a locking policy. I'd lean toward a common SessionManager, especially to centralize the expiration. Should applications sharing a session be allowed to have different expiration policies? Perhaps SessionManager.delete_expired_sessions() is a good thing after all.

Another limitation is the byte size of the session pickle. The SessionStore knows its own limit, and the SessionManager should make it available to the application. Then the application (or launcher script) can raise an exception at startup if it deems the size insufficient. None would mean no limit, of course. This happened to me when I was putting the entire result data in the session, then imported a dataset with 3000 records. So "Browse All" finds 3000 results... and that doesn't fit into a 65535-byte BLOB. The database truncates the pickle, and behold, an obscure error at the next request. I decided a larger session was ridiculous, and switched to storing record IDs only. ("3000 ints is not an unreasonable size!")

> I was thinking of pythonweb's "Store":
> http://pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/node153.html
>
>
> I vaguely suggest in the interface that each application should put
> all of its data in a single key (based on the application name). Now
> I think that should be based on a unique name (not the application
> name, because the application may exist multiple times in the
> process), and maybe with an entirely different manager.

Oooh, I haven't seen PythonWeb before. Well worth coordinating with, if feasible. I wonder how hard it would be to port Quixote to PythonWeb.... Seriously, they have a database-independent interactive tool a la mysql/pgsql/sqlite3? "Wow, that's nifty." But a common templating front-end?? "Get real. I choose a template engine for its unique features, not the lowest common denominator."
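[Editorial sketch, not part of the original message.] The two suggestions earlier in this message, a centralized delete_expired_sessions() and a size limit that the application can inspect, might combine roughly as follows. Everything here is an assumption made for illustration: the class name, the in-memory dict standing in for a SessionStore, and the choice to raise an exception rather than let a backend truncate the pickle. The method names iter_sessions, delete_session and delete_expired_sessions and the metadata.atime attribute are borrowed from the discussion; their signatures are guesses.

```python
import pickle
import time
from types import SimpleNamespace


class SessionTooLarge(Exception):
    """Raised instead of letting a backend truncate an oversized pickle."""


class ToySessionManager:
    """In-memory stand-in for a manager plus store (hypothetical API)."""

    def __init__(self, max_size=None):
        self.max_size = max_size  # bytes per pickle; None means no limit
        self._rows = {}           # session id -> (metadata, pickled content)

    def save_session(self, sid, content, now=None):
        blob = pickle.dumps(content)
        if self.max_size is not None and len(blob) > self.max_size:
            raise SessionTooLarge(f"{len(blob)} bytes exceeds {self.max_size}")
        atime = time.time() if now is None else now
        self._rows[sid] = (SimpleNamespace(atime=atime), blob)

    def iter_sessions(self):
        # Metadata only, so expiry never has to unpickle the content.
        for sid, (metadata, _blob) in list(self._rows.items()):
            yield sid, metadata

    def delete_session(self, sid):
        self._rows.pop(sid, None)

    def delete_expired_sessions(self, max_age, now=None):
        """Delete sessions not accessed within max_age seconds; return their IDs."""
        cutoff = (time.time() if now is None else now) - max_age
        expired = [sid for sid, meta in self.iter_sessions() if meta.atime < cutoff]
        for sid in expired:
            self.delete_session(sid)
        return expired
```

An application or launcher script could read max_size at startup and refuse to run if the limit is too small for what it plans to store, while a cron job could call delete_expired_sessions() without knowing anything about the session content.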
>> Regarding sessionless persistence, that reminds me of a disagreement >> I had with Titus in designing session2. Quixote provides >> Session.user default None, but doesn't define what other values it >> can have. I put a full-fledged User object with >> username/group/permission info. Titus puts a string name and stores >> everything else in his application database. So his *SessionStore >> classes put the name in a VARCHAR column and didn't save the rest of >> the session data. I argued that "most people will have a User >> object, and they'll expect the entire Session to be pickled because >> that's what PHP/Perl do." He relented, so the current *SessionStores >> can be used either way. > > > In the interface I suggest anything pickleable can go in a key. This > requirement has been the source of some controversy in Webware, since > people wanted to put open file objects and such in the session; mostly > people coming from Java where apparently that's the norm. Anyway, > it's still possible with this interface to have a store that never > pickles anything; I can just hope no one writes code they expect > anyone else to use that demands in-memory session storage. Those are > lame even when you are using threads. > > I think the example shows one reason the session shouldn't be > considered a public API. I think it's fine to put the username or the > user object in the session -- we can debate the pluses and minuses, > but it works -- but I think you should definitely wrap that > implementation detail in something else. E.g., request.user should > return request.session['user'] or something. I'm not sure what you mean. There has to be a public API or the application can't use the session. "Should I set an attribute or a key, or call a method?" ("Coffee, tea, or milk?") There is no request.user. Quixote has a get_user() function but it translates to session.user. That's actually request.session.user but you're supposed to pretend you don't know that. 
Or are you saying that applications should not set attributes? In that case we might as well use a dictionary as the official Session object, as Perl does. Of course you'd have to put the metadata somewhere....

-- Mike Orr

From mike_mp at zzzcomputing.com Wed Aug 17 18:52:16 2005
From: mike_mp at zzzcomputing.com (mike bayer)
Date: Wed, 17 Aug 2005 12:52:16 -0400 (EDT)
Subject: [Web-SIG] Session interface
In-Reply-To: <1124227706.16301.240828160@webmail.messagingengine.com>
References: <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816125208.0308ea08@mail.telecommunity.com> <5.1.1.6.0.20050816130439.03071168@mail.telecommunity.com> <5.1.1.6.0.20050816144007.030acf08@mail.telecommunity.com> <5.1.1.6.0.20050816162658.01b1b990@mail.telecommunity.com> <57167.66.192.34.8.1124226417.squirrel@66.192.34.8> <1124227706.16301.240828160@webmail.messagingengine.com>
Message-ID: <42618.66.192.34.8.1124297536.squirrel@66.192.34.8>

Jonathan Ellis said:
>
> Now that's an example of when I think sessions are a poor solution. IMO
> caching objects from the database is the job for the, well, database
> object cache. :)
>
> They are similar but not identical. For instance, while session data
> typically expires after a certain amount of time, permanent data should
> never expire unless invalidated by an update.
>

Putting a few user preferences in the session instead of constructing and/or installing a separate database caching system is cheating, but it's a small cheat. I think small cheats are fine to get a job done; the exact specification and design of big architectural features are usually derived from the set of small cheats they are replacing.
From jjinux at gmail.com Wed Aug 17 20:34:34 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed, 17 Aug 2005 11:34:34 -0700
Subject: [Web-SIG] Session interface
In-Reply-To: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com>
References: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com>
Message-ID:

Wow! I'm dumbfounded by this whole conversation! I thought session backends were something inane enough that we could agree on them! I have the same use cases as Geoffrey. No, cookies are not a good replacement for sessions since you have to validate them every time you use them. You can't trust them unless you encrypt and sign them, and I wasn't aware that that many people were doing that. Neither is relying on a cookie to time out sufficient to control a session timeout. Clients lie. Perhaps I have much to learn. I'm going to sit back and just read :-/

-jj

On 8/16/05, Geoffrey Talvola wrote:
> Jonathan Ellis wrote:
> > Still, it can be good to have a simple place to store non-permanent
> > information.
>
> For example...
>
> I think a good use of sessions is in remembering selections that have been
> made earlier on. For example, suppose you have a reporting application
> where you allow the user to select one or more items to report on from a
> list box, several filtering options in dropdowns or checkboxes, sorting and
> grouping behavior, etc. You want to remember those settings so that if the
> user returns to the report selection page, their last selected settings are
> pre-selected. But, unless the user chooses to save those settings as a
> "stored report", you'd like to forget the settings when the user logs out or
> when they close their browser.
>
> Also, assume that your application already has this bundle of selections in
> the form of a Python object.
>
> Isn't the cleanest, easiest, and more efficient way to handle this to simply
> save the Python object in a session variable?
In some cases, for example > using Webware's in-memory sessions, for example, this data never has to be > marshaled or leave the application server at all. > > If I didn't have sessions, I think using either cookies or a back-end db > would be more work, less clean, and less efficient in this case. > > - Geoff > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: http://mail.python.org/mailman/options/web-sig/jjinux%40gmail.com > -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From titus at caltech.edu Wed Aug 17 21:05:26 2005 From: titus at caltech.edu (Titus Brown) Date: Wed, 17 Aug 2005 12:05:26 -0700 Subject: [Web-SIG] Session interface In-Reply-To: References: <61957B071FF421419E567A28A45C7FE5029D28BE@mailbox.nameconnector.com> Message-ID: <20050817190526.GH30939@caltech.edu> -> Wow! I'm dumbfounded by this whole conversation! I thought session -> backends were something innane enough that we could agree on them! I -> have the same use cases as Geoffrey. No, cookies are not a good -> replacement for sessions since you have to validate them everytime you -> use them. You can't trust them unless you encrypt and sign them, and -> I wasn't aware that that many people were doing that. Neither is -> relying on a cookie to time out sufficient to control a session -> timeout. Clients lie. Perhaps I have much to learn. I'm going to -> sit back and just read :-/ (What he said ;) --titus From jjinux at gmail.com Wed Aug 17 21:17:03 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Wed, 17 Aug 2005 12:17:03 -0700 Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> Message-ID: > (And I'm also aware that "scaling down" is important, but the rule that all > state goes either in the browser or the application DB scales down just as > well as it scales up.) What's wrong with storing serialized session state in the database? -jj -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From pje at telecommunity.com Wed Aug 17 21:54:34 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 17 Aug 2005 15:54:34 -0400 Subject: [Web-SIG] and now for something completely different! In-Reply-To: References: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> At 12:17 PM 8/17/2005 -0700, Shannon -jj Behrens wrote: > > (And I'm also aware that "scaling down" is important, but the rule that all > > state goes either in the browser or the application DB scales down just as > > well as it scales up.) > >What's wrong with storing serialized session state in the database? Nothing. My point was that state either belongs to the client, or it belongs to the *application* database. It's web-tier storage that forces you to do session affinity when scaling the number of web servers, and to deal with locking and other issues when scaling processes on a single web server. 
The database tier is also the best place for persistent storage of users' data because it then reflects a *consistent* state with all the other application data. If you restore it from a backup after a crash, the data is consistent. Likewise, you only have one set of DBAs, and only one system to crashproof. If you're building a system with a lot of users that causes somebody to lose thousands of dollars a minute when the system's down, you really want to minimize the number of moving parts, and have a relatively simple recovery strategy, in which "lose everybody's session data because we can't restore the DB and the session store to the same state" is not a recommended option. Meanwhile, clients scale with the number of clients, so if you can get away with storing something client side, then that works great. Most client-side storage I've done is for stuff that if the client fakes it, you really don't care. If they fake their default reporting selections, for example, who cares? From mike_mp at zzzcomputing.com Thu Aug 18 00:25:19 2005 From: mike_mp at zzzcomputing.com (mike bayer) Date: Wed, 17 Aug 2005 18:25:19 -0400 (EDT) Subject: [Web-SIG] and now for something completely different! In-Reply-To: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> References: <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> Message-ID: <61788.66.192.34.8.1124317519.squirrel@66.192.34.8> Phillip J. Eby said: > My point was that state either belongs to the client, or it > belongs to the *application* database. 
It's web-tier storage that forces > you to do session affinity when scaling the number of web servers, and to > deal with locking and other issues when scaling processes on a single web > server. The database tier is also the best place for persistent storage > of > users' data because it then reflects a *consistent* state with all the > other application data. this is definitely the best approach for a big-time, multi-servered architecture. But even in this case, I think its a good idea to approach per-user-session state information with code that is conceptually aware of it being session-scoped information...meaning even if all my state is in the database, id still want to access state which is session-scoped via a "session" API. having a strong concept of "session scope" makes it easier to model things like data caching for the right amount of time, user interface flow, creating multi-step transactions, etc. the point of the session API with the switchable backend is you can build smaller applications and prototypes with file-based sessions and later expand the backend to talk to a database. an application should ideally be able to put whatever is "session-scoped" into that session without concern for size or efficiency....its the backend's job to be ready for it. there is value in being able to use the concept of "sessions" without having to create a specialized database schema every single time, despite the fact that the specialized schema becomes necessary when you want to scale up. - mike From pje at telecommunity.com Thu Aug 18 00:49:22 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 17 Aug 2005 18:49:22 -0400 Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: <61788.66.192.34.8.1124317519.squirrel@66.192.34.8> References: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> At 06:25 PM 8/17/2005 -0400, mike bayer wrote: >But even in this case, I think its a good idea to approach >per-user-session state information with code that is conceptually aware of >it being session-scoped information...meaning even if all my state is in >the database, id still want to access state which is session-scoped via a >"session" API. having a strong concept of "session scope" makes it easier >to model things like data caching for the right amount of time, user >interface flow, creating multi-step transactions, etc. That really hasn't been my experience. Partly, this is because I tend to use RESTful approaches that put 99% of all statefulness in the browser. For example, if I have a multi-page form, I embed all the previous pages' data as hidden fields on the subsequent pages. The entire form is processed by a single validation routine, so it doesn't matter what the client sends or in what order, because as soon as all the data is both present and valid, the form is done. Similarly, the vast majority of UI flow is easiest to model as URL-per-state, so that the browser is in charge of the flow, and the back button works. As for caching, that's something that you tune when you have to tune it, for whatever you're tuning it for. And that's on the basis of what type of object you're persisting. 
Note that if you have a Cart type, let's say, then you don't really have a case where some Carts are session-specific and some are not! Session-like behavior is inherent in the object types involved, so there's no real benefit to creating a secondary classification scheme for session scope. The only session API I need in that case is: cart = get_cart(get_cart_id(request)) And since the cart is just another persistent application object, it's part of the same transaction, and I have nothing else to mess around with. You also mentioned prototyping, but a good object persistence toolkit shouldn't be tied strictly to SQL; you ought to be able to plug in a "pickle all the data to disk" mode and use it for *all* your application data, not just the session-specific objects. From fumanchu at amor.org Thu Aug 18 01:05:17 2005 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 17 Aug 2005 16:05:17 -0700 Subject: [Web-SIG] and now for something completely different! Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727DB@exchange.hqamor.amorhq.net> Phillip J. Eby wrote: > You also mentioned prototyping, but a good object persistence toolkit > shouldn't be tied strictly to SQL; you ought to be able to plug in a > "pickle all the data to disk" mode and use it for *all* your > application data, not just the session-specific objects. And for extra points, a good object-persistence toolkit should let you put some data into a DB and some into shelve and leave some in RAM. You pick. Oh, but this is web-sig, not db-sig. ;) Robert Brewer System Architect Amor Ministries fumanchu at amor.org From jjinux at gmail.com Thu Aug 18 03:08:04 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Wed, 17 Aug 2005 18:08:04 -0700 Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <61788.66.192.34.8.1124317519.squirrel@66.192.34.8> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> Message-ID: I checked with a bunch of "really smart people" who are familiar with a variety of Web technologies. I was worried that this idea "sessions are considered evil" was widespread, and I didn't know about it. Apparently, that is not the case. Phillip, I'm not discounting your opinion or even arguing against it, but apparently, the entire world didn't decide to start hating session scope behind my back ;) /me giggles -jj On 8/17/05, Phillip J. Eby wrote: > At 06:25 PM 8/17/2005 -0400, mike bayer wrote: > >But even in this case, I think its a good idea to approach > >per-user-session state information with code that is conceptually aware of > >it being session-scoped information...meaning even if all my state is in > >the database, id still want to access state which is session-scoped via a > >"session" API. having a strong concept of "session scope" makes it easier > >to model things like data caching for the right amount of time, user > >interface flow, creating multi-step transactions, etc. > > That really hasn't been my experience. Partly, this is because I tend to > use RESTful approaches that put 99% of all statefulness in the > browser. For example, if I have a multi-page form, I embed all the > previous pages' data as hidden fields on the subsequent pages. The entire > form is processed by a single validation routine, so it doesn't matter what > the client sends or in what order, because as soon as all the data is both > present and valid, the form is done. 
Similarly, the vast majority of UI > flow is easiest to model as URL-per-state, so that the browser is in charge > of the flow, and the back button works. > > As for caching, that's something that you tune when you have to tune it, > for whatever you're tuning it for. And that's on the basis of what type of > object you're persisting. Note that if you have a Cart type, let's say, > then you don't really have a case where some Carts are session-specific and > some are not! Session-like behavior is inherent in the object types > involved, so there's no real benefit to creating a secondary classification > scheme for session scope. The only session API I need in that case is: > > cart = get_cart(get_cart_id(request)) > > And since the cart is just another persistent application object, it's part > of the same transaction, and I have nothing else to mess around with. > > You also mentioned prototyping, but a good object persistence toolkit > shouldn't be tied strictly to SQL; you ought to be able to plug in a > "pickle all the data to disk" mode and use it for *all* your application > data, not just the session-specific objects. > > -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From pje at telecommunity.com Thu Aug 18 03:32:45 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 17 Aug 2005 21:32:45 -0400 Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: References: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <61788.66.192.34.8.1124317519.squirrel@66.192.34.8> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com> At 06:08 PM 8/17/2005 -0700, Shannon -jj Behrens wrote: >I checked with a bunch of "really smart people" who are familiar with >a variety of Web technologies. I was worried that this idea "sessions >are considered evil" was widespread, and I didn't know about it. Sadly, it's not widespread, any more than RESTful applications are, or object-publishing, or any of the other "the way the web was won" approaches to web applications. In the Java world, for example, it's just assumed that you have to apply tons of resources and trickery to scale your sessions, because that's just How Things Are. The reason it's How Things Are in Java-land is because Java made sessions part of their servlet and other specs right from the start -- a serious error that I was hoping we could avoid in Python-land. At least PHP gives you session management hooks that make it easy to put session data in the application database! It is, however, becoming gradually known in Java-land that the "physical three-tier model" is insane, and IMO that model is fairly closely related to the idea that you should store sessions in the web tier. I'd guess it's going to be a couple more years before "web tier sessions considered harmful" is known by any but the most cynical veterans of building high-volume, database-intensive applications, though. To be precise, what I object to are: 1. 
Web-tier sessions that store application data in a different database that may or may not be backed up, and may or may not even be a "decent" database 2. "bag of data" sessions that encourage people to throw arbitrary objects in there without thinking about what the information's real lifetime is. (If it's a preference, you want it to either persist on the client or the server, permanently. If it's credentials, you want it to time out on the client. If it's application state, you really need it in your database for integrity/synchronization reasons. If it's transient state like a status message, it doesn't belong in the DB, it belongs on the client. And so on.) So, given these principles, I don't see much need for a session manager besides client-state management, and a good O-R mapper. If you have those, then the rest is trivial. From ianb at colorstudy.com Thu Aug 18 04:16:32 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 17 Aug 2005 21:16:32 -0500 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com> References: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <61788.66.192.34.8.1124317519.squirrel@66.192.34.8> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com> Message-ID: <4303EF80.2060706@colorstudy.com> Phillip J. Eby wrote: > The reason it's How Things Are in Java-land is because Java made sessions > part of their servlet and other specs right from the start -- a serious > error that I was hoping we could avoid in Python-land. 
Too late; all the major (and even all the minor) Python web programming environments have sessions. > At least PHP gives > you session management hooks that make it easy to put session data in the > application database! That shouldn't be hard here either. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From mike_mp at zzzcomputing.com Thu Aug 18 04:33:09 2005 From: mike_mp at zzzcomputing.com (michael bayer) Date: Wed, 17 Aug 2005 22:33:09 -0400 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> References: <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> Message-ID: <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com> On Aug 17, 2005, at 6:49 PM, Phillip J. Eby wrote: > That really hasn't been my experience. Partly, this is because I > tend to use RESTful approaches that put 99% of all statefulness in > the browser. For example, if I have a multi-page form, I embed all > the previous pages' data as hidden fields on the subsequent pages. > The entire form is processed by a single validation routine, so it > doesn't matter what the client sends or in what order, because as > soon as all the data is both present and valid, the form is done. > Similarly, the vast majority of UI flow is easiest to model as URL- > per-state, so that the browser is in charge of the flow, and the > back button works. its usually not my experience either, and I have rarely written any kind of app that uses sessions. 99% of everything I've done relies upon browser state as well. 
although despite my being there "when the web was won" in 95, I am hesitant to call myself a RESTFUL developer...to me, REST seems to be taking some common sense ideas and turning them into some kind of rigid ideological crusade, which is just as bad as all the other ideological crusades we "web winners" had to fight with IIS and active server pages, EJB, UML, SOAP, etc. the app i work on is a document management system where users have to edit large sets of fields, and do a lot of reloading in order to load in new sections of the document or save various subsets of data. It's been running and being expanded regularly for several years, and it does it all using client-state only, but it has begun to outgrow that approach; it would be much more succinctly written storing the user's current workspace within something that at least conceptually is a "session". it would also allow popups, IFRAMES and future Ajax controls to all access the same user-workspace without having to perform vast Javascript gymnastics (which it does right now). a document editing system is also a good example of where objects need to be persisted in two different scopes, i.e. a session-scope as well as a permanent scope. I don't really think a session has anything to do with a "physical three-tiered model". physically, it can be wherever you want. i just think it's advantageous from a conceptual point of view. From pje at telecommunity.com Thu Aug 18 04:51:35 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 17 Aug 2005 22:51:35 -0400 Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <4303EF80.2060706@colorstudy.com> References: <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <61788.66.192.34.8.1124317519.squirrel@66.192.34.8> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <5.1.1.6.0.20050817211540.027e99d8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050817223310.01b307e8@mail.telecommunity.com> At 09:16 PM 8/17/2005 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>The reason it's How Things Are in Java-land is because Java made sessions >>part of their servlet and other specs right from the start -- a serious >>error that I was hoping we could avoid in Python-land. > >Too late; all the major (and even all the minor) Python web programming >environments have sessions. I seem to recall that it's part of the Java servlets *specification*, whereas we did manage to avoid that trap in WSGI. :) >>At least PHP gives you session management hooks that make it easy to put >>session data in the application database! > >That shouldn't be hard here either. Yep. That's why I was pushing for standardizing that part separately from any actual storage facility, and for having good ways of managing the client-side state, which every "session" facility needs. If client-side state management turns out to be more library than framework or spec, so be it; we can nominate it for stdlib inclusion in 2.5, and it's one less thing for people to think about. "Boring" in this case is a good thing, it means we have a solved problem. 
:) What I *don't* want to standardize is the "bag of persistent objects" session interface as the primary way of accessing session data; I'd rather make the client key <-> retrieval aspect explicit, so that it's clear that you can totally choose how that links up, e.g.: session_id = get_client_state(env, 'session.id', new_hook, timeout_hook) my_bag_of_junk = session_store[session_id] To put it another way: I'd like to distinguish "session variables" (client-side values) from "session objects" (server-side objects), and make the boundary between them very clear in the API. That doesn't mean a session store can't offer a shortcut API, but hopefully the standardization of session object stores is *in terms of* the session variables API, so that e.g. the callbacks you need are the same, you still specify somewhere what session variable you'll use, etc. Note too that focusing our effort at this API level lets us address "interesting" problems such as when redirection is needed to start a session, when we need to replace page content to notify that a session has timed out, etc. These are all client-state management problems and nothing to do with the persistence question, but are more interesting problems to solve (IMO) than re-solving the same old object persistence problems over and over again. From pje at telecommunity.com Thu Aug 18 05:00:49 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 17 Aug 2005 23:00:49 -0400 Subject: [Web-SIG] and now for something completely different! 
In-Reply-To: <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com> References: <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com> At 10:33 PM 8/17/2005 -0400, michael bayer wrote: >its usually not my experience either, and I have rarely written any kind >of app that uses sessions. 99% of everything I've done relies upon >browser state as well. although despite my being there "when the web was >won" in 95, I am hesitant to call myself a RESTFUL developer...to me, REST >seems to be taking some common sense ideas and turning them into some kind >of rigid ideological crusade, which is just as bad as all the other >ideological crusades we "web winners" had to fight with IIS and active >server pages, EJB, UML, SOAP, etc. I agree; I just find it useful to use the REST banner because before that word came around, there was nothing to call the approach. I'm a pragmatic RESTee in that browsers don't do PUT and DELETE so POST is pretty much what we have to work with for human-usable applications today. >a document editing system is also a good example of where objects need to >be persisted in two different scopes, i.e. a session-scope as well as a >permanent scope. I dont really think a session has anything to do with a >"physical three-tiered model". physically, it can be whereever you >want. i just think its advantageous from a conceptual point of view. 
I don't object to server-side objects that are session-specific; I object to the "bag of arbitrary objects" session interface, that is typically stored in a web tier or middle tier. Those are two distinct sins that are usually coupled in what most people think of as "a session". When I say I consider sessions harmful, it's specifically those two characteristics of the common meaning of the term. I'm not saying that I think there's no such thing as a "session" in the sense of a browsing session. Shopping carts would be pretty hard to do, for example, without session-specific server-side objects. I just think that storing the shopping cart data in anything other than your application database is almost certainly a Very Bad Idea. From ianb at colorstudy.com Thu Aug 18 05:21:41 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 17 Aug 2005 22:21:41 -0500 Subject: [Web-SIG] Session interface, v2 Message-ID: <4303FEC5.3050408@colorstudy.com> Same location: http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py This version separates out SessionManager from SessionStore, and suggests that managers be per-application (or maybe per-framework). I also expanded docstrings and bunch of other changes. Open questions are marked with ???. I'm also copying the interface below (example at the bottom): class SessionError(Exception): pass class InvalidSession(SessionError): """ Raised when an invalid session ID is used. """ class SessionNotFound(SessionError, LookupError): """ Raised when a session can't be found. """ class ConflictError(SessionError): """ Raised when the ``locking_policy`` is ``optimistic``, and a session being saved is stale. """ def create_session_id(): """Return a unique session ID (an ASCII string). This string must be made up of a-zA-Z0-9_-. ???: Should we allow hints, like ``REMOTE_ADDR``? """ class ISessionListener: """ Objects with this interface can be appended to the ``listener`` attribute of a session manager or session. 
""" def create_session(session_store, new_session): """Called when a new session is created. """ def delete_session(session_store, session_id): """Called before a session is deleted. This can load the session; this will not affect the ultimate deletion of the session. """ def rollback(session_store, session): """Called before a session is abandoned via .rollback()""" class ISessionManager: """ The session manager represents policy related to sessions; expiration, collection, locking. It also typically belongs to one 'application', and ties together the session store with the session objects. """ id = """The string-identifier for this session manager. All applications that share this session manager need to use the same id when creating the session manager. This string should be made up of a-zA-Z0-9_.- """ locking_policy = """The lock policy. This is one of these strings: ``'optimistic'``: Optimistic locking; concurrent sessions may be opened for writing; however, if a session is saved that was loaded before the last save of the session, a ConflictError will be raised. ``'lossy'``: First-come-first-serve. No locking is done; if a session is written it overwrites any other session data that was written. ``'serialized`'': All sessions opened for writing are serialized; the request is blocked until session is available to be opened. """ session_factory = """A callable to produce sessions This should be a class or object like ``ISession``. """ listeners = """A list of ISessionListeners. When certain events happen, a method on every object in this list will be called. """ store = """A ISessionStore""" def __init__(id, store, session_factory, locking_policy='lossy'): """Initialize the variables ???: Does ``__init__`` need to be standardized? """ def load_session(id): """Return the session from the given ID. This method may block if ``locking_policy`` is ``'serialized'``. ???: Does this always return a new session object? I think it shouldn't. 
""" def load_session_read_only(id): """Return a read-only version of the session. Read-only sessions do not need to be locked as aggressively. Also, loading a read-only session will not update its last-accessed time, so you may use this to peek at sessions. This cannot ensure that the values stored in the session are immutable, so it is very possible that you could make implicit changes to the session object and then they will be thrown away. """ def create_session(id=None): """Create a new session object for the given id. If ``id`` is None then a new id will be generated. This will call ``session_listener.create(session_store, new_session)`` """ def save_session(session): """Save the given session. This may raise a ``ConflictError`` """ def unlock_session(session): """If the session store is locked for any reason, unlock it. It is not an error if no lock exists on the session. ``save_session()`` implies ``unlock_session()``. This method makes the session obsolete. """ def delete_session(id): """Delete the given session. This is given the id of the session, not the session object itself. This calls ``session_listener.delete(session_store, session_id)``. """ def delete_expired_sessions(): """Scan for and delete any expired sessions. ???: How are sessions defined to be expired? Should listeners participate? Should they be able to cancel an expiration? """ def session_ids(): """Return a list of session IDs. ???: Should this return other metadata, like last accessed time? """ def last_accessed(id): """The integer timestamp when the identified session was last accessed. Loading the session read-only does not update this value, only writing or calling ``touch()`` """ def last_written(id): """The integer timestamp when the session was last written to """ def touch(id): """Update the session's last_accessed time. 
""" class ISession: id = """The string (str, not unicode) ID of this session""" manager = """Reference to parent ISessionManager object""" read_only = """Boolean, if this session was loaded read-only""" last_accessed = """Last access integer timestamp""" creation_time = """Creation integer timestamp""" loaded_timestamp = """Integer timestamp when session was loaded If the session manager's ``locking_policy`` is ``optimistic``, when the session is saved if the ``last_written`` time is later than this time a ``ConflictError`` will be raised. """ obsolete = """ Boolean; true if this session object has been deleted. All other methods should fail once this is true. This attribute is writable.""" listeners = """A list of ISessionListener instances""" data = """The data being stored. This should be pickleable. The other instance variables are metadata, and are not saved as the 'body' of the session; only this data is. Typically this is a dictionary-like object; however, if you want application-specific storage this object could have a specific interface, so long as your session store understands how to save it. ???: Should there be some way to identify this kind of tightly-bound-to-storage session data from free-form (like a dictionary) session data? If there was, then application-specific storage could use something custom for its sessions, but fall on something more generic (e.g., pickle and stuff the string somewhere) for other sessions. """ # ???: Should the expire time be overloadable on a per-session # basis? If listeners can cancel the expiration, then this can be # done in an ad hoc way # ???: Should there be a way of marking the session "dirty"? Maybe # some soft version of a hash should be kept to detect changes? (a # hash that could hash mutable objects) def __init__(id, manager, read_only, last_accessed, creation_time, data): """Create the session object If the session is new, then ``data`` will be none; otherwise it will contain the unpickled data. 
        """

    def __getitem__(name):
        """Return the object by the given name."""

    def __setitem__(name, value):
        """Add or overwrite the named object.  The object should be
        pickleable.
        """

    def __delitem__(name):
        """Delete the named object."""

    def touch():
        """Update the session's last_accessed time.

        Typically just calls ``self.manager.touch(self.id)``
        """

    def commit():
        """Calls ``self.manager.save_session(self)``
        """

    def rollback():
        """Calls ``self.manager.unlock_session(self)``.

        Also calls ``session_listener.rollback(self)``.
        """


class ISessionStore:

    """
    This is responsible for storing sessions.
    """

    def save_session(session):
        """Save the session

        This uses both ``session.id`` and ``session.store.id`` to save
        the session.
        """

    def load_session(session_store_id, session_id, read_only,
                     session_factory):
        """Load the session"""

    def session_ids(session_store_id):
        """Returns a list of session IDs

        ???: Plus other metadata?
        """

    def delete_session(session_store_id, session_id):
        """Delete the session"""

    def touch(session_store_id, session_id):
        """Update the last accessed time for the session"""

    def write_lock_session(session_store_id, session_id):
        """Lock the session for writing

        ???: Should there be a way of loading a session without
        blocking on a lock (e.g., getting an exception when trying to
        load a locked session)?
        """


"""
Example usage::

    session_store = (create or identify from configuration)

    # This is in a typical web framework...
    def get_session(request):
        session_id = request.get_cookie('session_id')
        if session_id is None:
            session_id = create_session_id()
            request.response.set_cookie('session_id', session_id)
        session_manager = get_session_manager(request)
        session = session_manager.load_session(session_id)
        # A callback to be run when the request has been finished:
        request.run_when_done(session_store.save_session, session)
        return session

    def get_session_manager(request):
        # The application id should be unique to this instance of the
        # application.
        # But if you don't mind being a little sloppy
        # you could use the framework name here (that would make it
        # possible for an application to clobber the session variables
        # from another application).
        appid = get_app_id(request)
        session_manager = SessionManager(appid, get_session_store(request),
                                         MySessionClass)
        return session_manager

    def get_session_store(request):
        return request.environ['session.store']

    class MySessionClass(UserDict):
        def __init__(self, id, manager, read_only, last_accessed,
                     creation_time, data):
            self.id = id
            self.manager = manager
            self.read_only = read_only
            self.last_accessed = last_accessed
            self.creation_time = creation_time
            if data is None:
                data = {}
            self.data = data

"""

From renesd at gmail.com Thu Aug 18 05:43:38 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Thu, 18 Aug 2005 13:43:38 +1000
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com> <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com>
Message-ID: <64ddb72c05081720434ff0868d@mail.gmail.com>

Some more requirements for sessions can be found at the php page on sessions.

Hash function declaring:
Choosing e.g. md5/sha.  Also by using a distributed hash function you can easily route the request to a specific web server.  So with one rewrite rule you can have your scalable sessions/session affinity.  The function could simply prepend a number 1-100 to the session id, which relates to a particular webserver.

tag rewriting.
i.e., which tags to do rewriting in.
eg where it appends ?SESSIONID=ABCFED938743523 to your output html. url_rewriter.tags string url_rewriter.tags specifies which HTML tags are rewritten to include session id if transparent sid support is enabled. Defaults to a=href,area=href,frame=src,input=src,form=fakeentry,fieldset= http://php.net/session ... and now for all the arguments pro Session rolled up into one paragraph. Taking load off the database server(with sessions) is a way to make an application more scalable. Often the database server is the bottleneck of the web app. Being able to move some load to the client, or the webservers is a good option to have. Being able to not use 2 tiers is also what people may want. In this way sessions allow you to scale up, and down. Sessions allow you to do a lot of jobs which databases are not needed for. Sessions are also more reliable, and secure than cookies. Cookies may not be enabled on the browser, and storing some stuff on the client side in the clear, or even encrypted is dangerous. Sessions are understood by a large amount of php/java/perl/apache people. Lots of the python web frameworks have implemented sessions too. This means sessions will be used. So making a good working implementation of sessions that everyone can share would be double plus good. On 8/18/05, Phillip J. Eby wrote: > At 10:33 PM 8/17/2005 -0400, michael bayer wrote: > >its usually not my experience either, and I have rarely written any kind > >of app that uses sessions. 99% of everything I've done relies upon > >browser state as well. although despite my being there "when the web was > >won" in 95, I am hesitant to call myself a RESTFUL developer...to me, REST > >seems to be taking some common sense ideas and turning them into some kind > >of rigid ideological crusade, which is just as bad as all the other > >ideological crusades we "web winners" had to fight with IIS and active > >server pages, EJB, UML, SOAP, etc. 
> > I agree; I just find it useful to use the REST banner because before that > word came around, there was nothing to call the approach. I'm a pragmatic > RESTee in that browsers don't do PUT and DELETE so POST is pretty much what > we have to work with for human-usable applications today. > > > >a document editing system is also a good example of where objects need to > >be persisted in two different scopes, i.e. a session-scope as well as a > >permanent scope. I dont really think a session has anything to do with a > >"physical three-tiered model". physically, it can be whereever you > >want. i just think its advantageous from a conceptual point of view. > > I don't object to server-side objects that are session-specific; I object > to the "bag of arbitrary objects" session interface, that is typically > stored in a web tier or middle tier. Those are two distinct sins that are > usually coupled in what most people think of as "a session". When I say I > consider sessions harmful, it's specifically those two characteristics of > the common meaning of the term. I'm not saying that I think there's no > such thing as a "session" in the sense of a browsing session. Shopping > carts would be pretty hard to do, for example, without session-specific > server-side objects. I just think that storing the shopping cart data in > anything other than your application database is almost certainly a Very > Bad Idea. > From floydophone at gmail.com Thu Aug 18 06:03:02 2005 From: floydophone at gmail.com (Peter Hunt) Date: Thu, 18 Aug 2005 00:03:02 -0400 Subject: [Web-SIG] and now for something completely different! Message-ID: <6654eac40508172103342ad54a@mail.gmail.com> Phillip - I agree with you on all counts, except for the issue of how to determine when a session ends (timeouts, etc), and how to clean up the associated objects (Carts etc) with them. Peter Hunt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/web-sig/attachments/20050818/71310f4a/attachment.htm From fumanchu at amor.org Thu Aug 18 07:35:13 2005 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 17 Aug 2005 22:35:13 -0700 Subject: [Web-SIG] and now for something completely different! Message-ID: <3A81C87DC164034AA4E2DDFE11D258E37727DC@exchange.hqamor.amorhq.net> Phillip J. Eby wrote: > I'm a pragmatic RESTee in that browsers don't do PUT > and DELETE, so POST is pretty much what we have to > work with for human-usable applications today. Unless you can rely on XmlHttpRequest, which supports arbitrary methods (which is why I made CP 2.1 fully support all HTTP methods). Fortunately, I'm currently in a position where I can do that. ;) Robert Brewer System Architect Amor Ministries fumanchu at amor.org From chrism at plope.com Thu Aug 18 07:41:51 2005 From: chrism at plope.com (Chris McDonough) Date: Thu, 18 Aug 2005 01:41:51 -0400 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <64ddb72c05081720434ff0868d@mail.gmail.com> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com> <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com> <64ddb72c05081720434ff0868d@mail.gmail.com> Message-ID: <1124343711.31954.26.camel@plope.dyndns.org> On Thu, 2005-08-18 at 13:43 +1000, Rene Dudfield wrote: > ... and now for all the arguments pro Session rolled up into one paragraph. > > Taking load off the database server(with sessions) is a way to make an > application more scalable. 
In my experience, they can make applications less scalable because typically people don't need to know much about how sessions work so they tend to overuse them without understanding their cost. Very general persistent session implementations that serialize object data into a blob are typically even more expensive than simple relational database row reads and writes, too. This cost is amplified by session ease of use. > Often the database server is the > bottleneck of the web app. Being able to move some load to the > client, or the webservers is a good option to have. This is probably true for a lot of folks but my web apps are almost always CPU bound at the web/application server. I wish I had the database-too-slow problem. > Being able to not > use 2 tiers is also what people may want. In this way sessions allow > you to scale up, and down. Sessions allow you to do a lot of jobs > which databases are not needed for. Typically persistent sessions are backed by some sort of database anyway. It's just that they're craftily coded in such a way that you typically don't need to know much about it. > Sessions are also more reliable, > and secure than cookies. > Cookies may not be enabled on the browser, The most common way of enabling sessions is via cookies, and whether sessions work reliably or not is often contingent on cookies. Formvar or URL-encoded session identifiers tend to be hit-and-miss and much harder to maintain across pages. > and storing some stuff on the client side in the clear, or even > encrypted is dangerous. I agree. At least it's harder to get right. > Sessions are understood by a large amount of > php/java/perl/apache people. Lots of the python web frameworks have > implemented sessions too. This means sessions will be used. So > making a good working implementation of sessions that everyone can > share would be double plus good. 
What might be more practical and easier to think about because its scope is so much smaller is a common "browser identifier" implementation.

The most useful purpose of a session is to allow you to store state across requests by some anonymous browser.  If you can reliably detect that "the requesting browser is the browser identified by token ABC123" and that token can be associated with the browser reliably for some extended period of time, that's half the battle.  This can be done with a cookie, a URL element, a form variable, or a query string element.  The association between an identifier and a browser doesn't really even need to time out; it could live forever with no ill effect.

Creating namespaces that can be written to from within application code and which expire after some number of minutes of inactivity and so forth (aka sessions) could be written in terms of storing and retrieving data based on this browser identifier.

- C

From ianb at colorstudy.com Thu Aug 18 17:32:36 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 18 Aug 2005 10:32:36 -0500
Subject: [Web-SIG] and now for something completely different!
In-Reply-To: <64ddb72c05081720434ff0868d@mail.gmail.com>
References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com> <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com> <64ddb72c05081720434ff0868d@mail.gmail.com>
Message-ID: <4304AA14.90207@colorstudy.com>

Rene Dudfield wrote:
> Some more requirements for sessions can be found at the php page on sessions.
>
> Hash function declaring:
> Choosing e.g. md5/sha.
Also by using a distributed hash function you > can easily route the request to a specific web server. So with one > rewrite rule you can have your scalable sessions/session affinity. > The function could simply append the number 1-100 in front of session > id which relates to a particular webserver. Right-o, I've seen that feature before. Maybe create_session_id() should grow a prefix argument, and for now it'll be up to the glue code to provide that. It's really a configuration parameter. Though I suppose you could turn the SERVER_ADDR into a 8-byte code, which would probably identify the proper server. Or maybe you should pick it up from an environmental variable... bah, it'll only be clear in the context of a specific environment and configuration. > tag rewriting. > ie.Which tags to do rewriting in. eg where it appends > ?SESSIONID=ABCFED938743523 to your output html. That would certainly be well implemented by middleware. > url_rewriter.tags string > > url_rewriter.tags specifies which HTML tags are rewritten to > include session id if transparent sid support is enabled. Defaults to > a=href,area=href,frame=src,input=src,form=fakeentry,fieldset= Huh, what are fakeentry and fieldset? -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Thu Aug 18 18:44:09 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 18 Aug 2005 12:44:09 -0400 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <6654eac40508172103342ad54a@mail.gmail.com> Message-ID: <5.1.1.6.0.20050818124137.01ad9f20@mail.telecommunity.com> At 12:03 AM 8/18/2005 -0400, Peter Hunt wrote: >Phillip - > >I agree with you on all counts, except for the issue of how to determine >when a session ends (timeouts, etc), and how to clean up the associated >objects (Carts etc) with them. 
I'm not sure I ever said how I clean up the associated objects, but my preference is to have an automated process remove them when they haven't been touched for N amount of time, and set the cookie expiration so it expires before that N is elapsed. Or actually, I set the cookie to expire after N, and do the cleanup at time N+M. From jjinux at gmail.com Thu Aug 18 22:19:10 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Thu, 18 Aug 2005 13:19:10 -0700 Subject: [Web-SIG] and now for something completely different! In-Reply-To: <1124343711.31954.26.camel@plope.dyndns.org> References: <3A81C87DC164034AA4E2DDFE11D258E37727A1@exchange.hqamor.amorhq.net> <5.1.1.6.0.20050815165353.0271b5e0@mail.telecommunity.com> <4301124C.7040708@colorstudy.com> <5.1.1.6.0.20050815181303.00a04540@mail.telecommunity.com> <5.1.1.6.0.20050817154233.01b264b8@mail.telecommunity.com> <5.1.1.6.0.20050817184110.01b1e4a0@mail.telecommunity.com> <467390AD-4E57-4A6D-838B-B972EDF84AD3@zzzcomputing.com> <5.1.1.6.0.20050817225353.01b358e8@mail.telecommunity.com> <64ddb72c05081720434ff0868d@mail.gmail.com> <1124343711.31954.26.camel@plope.dyndns.org> Message-ID: > What might be more practical and easier to think about because its scope > is so much smaller is a a common "browser identifier" implementation. > > The most useful purpose of a session is to allow you to store state > across requests by some anonymous browser. If you can reliably detect > that "the requesting browser is the browser identified by token ABC123" > and that token can be associated with the browser reliably for some > extended period of time, that's half the battle. This can be done with > a cookie, a URL element, a form variable, or a query string element. > The association between an identifier and a browser doesn't really even > need to time out; it could live forever with no ill effect. 
> > Creating namespaces that can be written to from within application code > and which expire after some number of minutes of inactivity and so forth > (aka sessions) could be written in terms of storing and retrieving data > based on this browser identifier. It turns out that having one unique ID per browser is a bad idea. Specifically, if a client gives you a cookie with an sid, and you've never heard about that sid (perhaps the session timed out), create a new sid. Also, it makes sense to change the sid when the user successfully logs in. There are a newly discovered set of session injection attacks to be avoided: http://www.acros.si/papers/session_fixation.pdf I hadn't heard about them until recently. It's interesting reading. I hope you'll find that to be helpful. Best Regards, -jj -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From renesd at gmail.com Fri Aug 19 04:27:49 2005 From: renesd at gmail.com (Rene Dudfield) Date: Fri, 19 Aug 2005 12:27:49 +1000 Subject: [Web-SIG] WSGI app in a zope directory? Message-ID: <64ddb72c05081819273ddf8645@mail.gmail.com> Hey, does anyone know of a way to get a wsgi app inside of zope? Cheers. From mso at oz.net Sat Aug 20 21:56:31 2005 From: mso at oz.net (Mike Orr) Date: Sat, 20 Aug 2005 12:56:31 -0700 Subject: [Web-SIG] Session interface, v2 In-Reply-To: <4303FEC5.3050408@colorstudy.com> References: <4303FEC5.3050408@colorstudy.com> Message-ID: <43078AEF.4000309@oz.net> Ian Bicking wrote: >Same location: > >http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py > > Good work. >This version separates out SessionManager from SessionStore, and >suggests that managers be per-application (or maybe per-framework). > There's per-application/per-framework at the class level and instance level; I'm not sure which you're referring to. Regarding the former, I was saying we may be able to make a generic SessionManager class usable as-is by several frameworks. 
You seemed to doubt this, but I argued we should at least try. The latter only matters in multi-application deployments, where several applications (possibly different frameworks) are sharing a session. Several features below seem to exist only for this environment, and I'm having a hard time evaluating them without knowing the complete use cases you're trying to support. You've said bits about that but maybe we can flesh it out. * Scenario 1: Two apps mounted at /foo and /bar, using a common Paste dispatcher. Both applications are embedded in the same process. (threaded or asynchronous servers) * Scenario 2: Same, but the apps are in separate processes. The dispatcher remains. (forking servers) * Scenario 3: Two apps mounted at /foo and /bar, using separate handlers in the Apache config. At no point is there a common Python process between them. * Scenario 4: Two apps in different virtual hosts. * Scenario 5: Two apps in different webservers. * Others: ?? Which situations are you trying to support, which session-related objects would there be, and how would they interrelate? At what point do we say scenarios won't attract enough users to justify our time? I'm also not sure how these would relate to your "application inversion" paradigm. I'm used to applications as single long-running units that can hold shared state. But your Paste implementation seems to suggest instantiating the application for each URL, and maybe the application would last for only one request. I'm not sure how easy that will be to port some applications to it, or how this impacts the session classees/instances. >class SessionError(Exception): > pass > >class InvalidSession(SessionError): > """ > Raised when an invalid session ID is used. > """ > >class SessionNotFound(SessionError, LookupError): > """ > Raised when a session can't be found. > """ > >class ConflictError(SessionError): > """ > Raised when the ``locking_policy`` is ``optimistic``, and a > session being saved is stale. 
> """ > >def create_session_id(): > > Could go into a SessionCookie class, along with anything else that can be used by both session-based and sessionless fans. >class ISessionListener: > > Is this just an extra, or what are listeners for? Is this for per-application behavior with a shared manager? >class ISessionManager: > > id = """The string-identifier for this session manager. > > All applications that share this session manager need to use the > same id when creating the session manager. > > With this rule I was expecting some central repository of session managers, and factory functions a la logger.getLogger(), but there doesn't seem to be any. What's the purpose of the SessionManager id? > locking_policy = """The lock policy. > > This is one of these strings: > > ``'optimistic'``: > Optimistic locking; concurrent sessions may be opened for writing; > however, if a session is saved that was loaded before the last save > of the session, a ConflictError will be raised. > > ``'lossy'``: > First-come-first-serve. No locking is done; if a session is written > it overwrites any other session data that was written. > > ``'serialized`'': > All sessions opened for writing are serialized; the request is > blocked until session is available to be opened. > """ > > Optimistic locking sounds like a pain. The application would have to catch the error and then... what? Say "Sorry, your form input was thrown away." Redo the operation somehow (isn't that the same as lossy operation?). Reconcile the two states somehow (how?)? Not that we shouldn't provide it, just that it will need more howto documentation. > def delete_expired_sessions(): > """Scan for and delete any expired sessions. > > ???: How are sessions defined to be expired? Should listeners > participate? Should they be able to cancel an expiration? > """ > > def session_ids(): > """Return a list of session IDs. > > ???: Should this return other metadata, like last accessed > time? 
> """ > > If so, it shouldn't be called .session_ids(). >class ISession: > > manager = """Reference to parent ISessionManager object""" > > def __init__(id, manager, read_only, last_accessed, creation_time, >data): > """Create the session object > > def touch(): > """Update the session's last_accessed time. > > Typically just calls ``self.manager.touch(self.id)`` > """ > > def commit(): > """Calls ``self.manager.save_session(self)`` > """ > > def rollback(): > """Calls ``self.manager.unlock_session(self)``. > > Also calls ``session_listener.rollback(self)``. > """ > > These look like they don't belong here. The application already has a reference to the SessionManager and should call it directly. It points up a difference in philosophy between the session being a "dumb object" (no reference to the manager) vs being manager-aware. Is the latter necessary? Are you thinking of cases where the session would be provided by the middleware, then the application would have dispose of the session at the end of the request? The middleware could provide a reference to the session manager for this. Although that would expose irrelevant methods. > class ISessionStore: > def load_session(session_store_id, session_id, read_only, > session_factory): > def session_ids(session_store_id): > def delete_session(session_store_id, session_id): > def touch(session_store_id, session_id): > def write_lock_session(session_store_id, session_id): Isn't session_store_id 'self'? Specifying it seems to imply this is a meta SessionStore, not an individual store. Why would a deployment have multiple stores? 
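
[Editorial sketch, not part of the original thread.] The ``optimistic`` locking policy and the question of what an application does after a ``ConflictError`` may be easier to weigh with a small example. The following is only a sketch under stated assumptions: ``InMemoryStore``, its ``load``/``save`` methods, and the logical clock are hypothetical names invented for illustration and do not come from scarecrow_session_interface.py. It also keys data by a (manager id, session id) pair, which is one way separate applications could avoid clobbering each other in a shared store.

```python
import itertools


class ConflictError(Exception):
    """Raised when a stale session is saved under optimistic locking."""


class InMemoryStore:
    """Hypothetical store for illustration only (not the draft interface)."""

    def __init__(self):
        # A logical clock stands in for integer timestamps so that the
        # example is deterministic.
        self._clock = itertools.count(1)
        # Maps (manager_id, session_id) -> (written_at, data).
        self._data = {}

    def load(self, manager_id, session_id):
        """Return a copy of the data and the logical time of the load."""
        _, data = self._data.get((manager_id, session_id), (0, {}))
        return dict(data), next(self._clock)

    def save(self, manager_id, session_id, data, loaded_at):
        """Save, refusing if someone else wrote after we loaded."""
        written_at, _ = self._data.get((manager_id, session_id), (0, {}))
        if written_at > loaded_at:
            raise ConflictError(session_id)
        self._data[(manager_id, session_id)] = (next(self._clock), dict(data))


store = InMemoryStore()

# Two concurrent requests load the same session for writing.
first, first_loaded = store.load("blog-1", "abc")
second, second_loaded = store.load("blog-1", "abc")

first["cart"] = ["book"]
store.save("blog-1", "abc", first, first_loaded)

second["theme"] = "dark"
try:
    store.save("blog-1", "abc", second, second_loaded)
    conflict = False
except ConflictError:
    # One possible reaction: reload the fresh data and redo the change.
    conflict = True
    fresh, fresh_loaded = store.load("blog-1", "abc")
    fresh["theme"] = "dark"
    store.save("blog-1", "abc", fresh, fresh_loaded)

final, _ = store.load("blog-1", "abc")
other_app, _ = store.load("blog-2", "abc")
```

In this sketch the second writer redoes its change on freshly loaded data, so neither update is lost; whether a real framework can replay a request that way is exactly the open question raised in the message above.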
From ianb at colorstudy.com Sat Aug 20 23:46:00 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 20 Aug 2005 16:46:00 -0500
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <43078AEF.4000309@oz.net>
References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net>
Message-ID: <4307A498.3000408@colorstudy.com>

Mike Orr wrote:
> Ian Bicking wrote:
>
>> Same location:
>>
>> http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py
>>
>>
>
>
> Good work.
>
>
>> This version separates out SessionManager from SessionStore, and
>> suggests that managers be per-application (or maybe per-framework).
>>
> There's per-application/per-framework at the class level and instance
> level; I'm not sure which you're referring to.
>
> Regarding the former, I was saying we may be able to make a generic
> SessionManager class usable as-is by several frameworks.  You seemed to
> doubt this, but I argued we should at least try.

I don't think several frameworks should share a single SessionManager *instance*.  But I do think we can make a class that can embody all the features that are needed for typical sessions, and frameworks use instances.  I think people who want something else from a session -- application-specific storage, for instance -- may need their own SessionManager class.

> The latter only matters in multi-application deployments, where several
> applications (possibly different frameworks) are sharing a session.
> Several features below seem to exist only for this environment, and I'm
> having a hard time evaluating them without knowing the complete use
> cases you're trying to support.  You've said bits about that but maybe
> we can flesh it out.
>
> * Scenario 1: Two apps mounted at /foo and /bar, using a common Paste
> dispatcher.  Both applications are embedded in the same process.
> (threaded or asynchronous servers)

This is the case that drives a lot of the issues.
Say, for instance, that the two applications are instances of the same basic app (e.g., blogs for two different users).  If they share a session they'll overwrite each other's values, or become hopelessly confused by seemingly inconsistent data.  If each of them has a separate app id and separate session managers, then they'll never see the other's data.  But you can only do that with some fixed id (generated randomly or by hand, either way probably stored in the configuration).  Hmm... potentially you could just generate such an id from the configuration file's name (if not given something more specific); that's a little sloppy, but generally likely to be unique and stable.

> * Scenario 2: Same, but the apps are in separate processes.  The
> dispatcher remains.  (forking servers)

If the two apps share the same pool of long-lived worker processes, then all the same issues remain as with scenario 1.  This isn't really an issue of threaded vs. multiprocess, but an issue of processes that run multiple independent applications over time.  A common pool of worker processes would be similar to PHP, except that PHP tends to throw away more information each request... though I believe session clobbering would be a problem in PHP if you had two apps on the same domain that shared a session variable name.

> * Scenario 3: Two apps mounted at /foo and /bar, using separate handlers
> in the Apache config.  At no point is there a common Python process
> between them.

It depends on the configuration, but clobbering could happen here too.  If both apps use the same session id (e.g., they use the same cookie name) and share session store configuration (they are writing to the same location), then it will be a problem.  Using session managers with separate app ids they can share session store configuration safely.

> * Scenario 4: Two apps in different virtual hosts.

Probably not an issue because the session id won't be shared.
A good session id manager might be able to handle this, though, by forwarding the user between the two hosts with a special GET variable that triggers the setting of a cookie; if that was happening it would be like scenario 3.

> * Scenario 5: Two apps in different webservers.

Much like scenario 4; problems are possible without the session manager, but increasingly less likely.  Most conflict issues could also be fixed by not sharing a session id between applications (and probably using a configurable session cookie name).

> * Others: ??
>
> Which situations are you trying to support, which session-related
> objects would there be, and how would they interrelate?

I want to support all of them.  In part this is because I have a vision of much more granular applications, so I want it to be possible to deploy small applications with little risk of interaction problems.

> At what point
> do we say scenarios won't attract enough users to justify our time?

Well, I'm just thinking about the simple session stores, not much along the lines of the application-specific stores.  So I'm leaving something out there.

> I'm also not sure how these would relate to your "application inversion"
> paradigm.  I'm used to applications as single long-running units that
> can hold shared state.  But your Paste implementation seems to suggest
> instantiating the application for each URL, and maybe the application
> would last for only one request.  I'm not sure how easy that will be to
> port some applications to it, or how this impacts the session
> classes/instances.

I don't think this really relates a whole lot.  Paste doesn't need to instantiate for each URL, it could fetch an already-instantiated application just as well.  paste.urlmap only dispatches to pre-existing applications, for instance, while paste.urlparser instantiates.

>> def create_session_id():
>>
>>
>
> Could go into a SessionCookie class, along with anything else that can
> be used by both session-based and sessionless fans.
It could, but session IDs can come from elsewhere. E.g., you might want to use it as an argument in an XMLRPC class. So I think it's pretty independent of any particular browser identification technique. >> class ISessionListener: >> >> > > > Is this just an extra, or what are listeners for? Is this for > per-application behavior with a shared manager? It's kind of an extra. I'm not really sure what would be done with it. An example I gave before about storing files and only storing the filename in the session would be helped by listeners, as you could add a file-deleting listener that was triggered on session delete. Anytime when you put data associated with the session somewhere outside of the session store I think this will be useful. >> class ISessionManager: >> >> id = """The string-identifier for this session manager. >> >> All applications that share this session manager need to use the >> same id when creating the session manager. >> >> > > > With this rule I was expecting some central repository of session > managers, and factory functions a la logger.getLogger(), but there > doesn't seem to be any. What's the purpose of the SessionManager id? The session manager id is used by the session store, to keep the sessions separate. Actual session data is keyed by (session_manager_id, session_id), so that separate applications have separate session_manager_ids, and separate browsers have separate session_ids. >> locking_policy = """The lock policy. >> >> This is one of these strings: >> >> ``'optimistic'``: >> Optimistic locking; concurrent sessions may be opened for writing; >> however, if a session is saved that was loaded before the last save >> of the session, a ConflictError will be raised. >> >> ``'lossy'``: >> First-come-first-serve. No locking is done; if a session is >> written >> it overwrites any other session data that was written. 
>>
>>     ``'serialized'``:
>>         All sessions opened for writing are serialized; the request is
>>         blocked until session is available to be opened.
>>     """
>>
>
> Optimistic locking sounds like a pain.  The application would have to
> catch the error and then... what?  Say "Sorry, your form input was
> thrown away."  Redo the operation somehow (isn't that the same as lossy
> operation?).  Reconcile the two states somehow (how?)?  Not that we
> shouldn't provide it, just that it will need more howto documentation.

It is a bit of a pain.  In Zope they catch ConflictErrors, roll back
everything, and restart the request.  I've had this bite me, as it just
makes the contention worse, but for sessions in particular it might not
be so bad (as long as *everything* is transactional and can be rolled
back).  Anyway, it's there mostly for the frameworks that already know
how to handle this.

>> class ISession:
>>
>>     manager = """Reference to parent ISessionManager object"""
>>
>>     def __init__(id, manager, read_only, last_accessed, creation_time,
>>                  data):
>>         """Create the session object
>>
>>     def touch():
>>         """Update the session's last_accessed time.
>>
>>         Typically just calls ``self.manager.touch(self.id)``
>>         """
>>
>>     def commit():
>>         """Calls ``self.manager.save_session(self)``
>>         """
>>
>>     def rollback():
>>         """Calls ``self.manager.unlock_session(self)``.
>>
>>         Also calls ``session_listener.rollback(self)``.
>>         """
>>
>
> These look like they don't belong here.  The application already has a
> reference to the SessionManager and should call it directly.  It points
> up a difference in philosophy between the session being a "dumb object"
> (no reference to the manager) vs being manager-aware.  Is the latter
> necessary?  Are you thinking of cases where the session would be
> provided by the middleware, then the application would have to dispose of
> the session at the end of the request?  The middleware could provide a
> reference to the session manager for this.
Although that would expose > irrelevant methods. Mostly these are there both to make the interface slightly nicer (many times you won't have to interact with the session manager), and to facilitate per-session session listeners. I'm not sure per-session listeners are a good idea, though. >> class ISessionStore: >> def load_session(session_store_id, session_id, read_only, >> session_factory): >> def session_ids(session_store_id): >> def delete_session(session_store_id, session_id): >> def touch(session_store_id, session_id): >> def write_lock_session(session_store_id, session_id): > > > Isn't session_store_id 'self'? Specifying it seems to imply this is a > meta SessionStore, not an individual store. Why would a deployment have > multiple stores? Oops, this was a leftover from when SessionManager was named SessionStore. These should all be session_manager_id. Fixed in svn. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From renesd at gmail.com Sun Aug 21 01:42:34 2005 From: renesd at gmail.com (Rene Dudfield) Date: Sun, 21 Aug 2005 09:42:34 +1000 Subject: [Web-SIG] Session interface, v2 In-Reply-To: <4307A498.3000408@colorstudy.com> References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net> <4307A498.3000408@colorstudy.com> Message-ID: <64ddb72c050820164212e0eacf@mail.gmail.com> Looks quite good. It should be able to handle all the uses I have for sessions. I am sure it will change a little once it is started to be implemented. > > * Scenario 2: Same, but the apps are in separate processes. The > > dispatcher remains. (forking servers) > > If the two apps share the same pool of long-lived worker processes, then > all the same issues remain as with scenario 1. This isn't really an > issue of threaded vs. multiprocess, but an issue of processes that run > multiple independent applications over time. A common pool of worker > processes would be similar to PHP, except that PHP tends to throw away > more information each request... 
though I believe session clobbering
> would be a problem in PHP if you had two apps on the same domain that
> shared a session variable name.
>

Yes, session clobbering can happen with PHP.  PHP gets around it by
letting you set session.name, e.g. PHP_SESSION becomes
MY_BLOG_PHP_SESSION.  That is just like the SessionManager and its
app_id in the proposal.

http://www.php.net/function.session-name.php

> > * Scenario 3: Two apps mounted at /foo and /bar, using separate handlers
> > in the Apache config.  At no point is there a common Python process
> > between them.
>
> It depends on the configuration, but clobbering could happen here too.
> If both apps use the same session id (e.g., they use the same cookie
> name) and share session store configuration (they are writing to the
> same location), then it will be a problem.  Using session managers with
> separate app ids they can share session store configuration safely.
>
> > * Scenario 4: Two apps in different virtual hosts.
>
> Probably not an issue because the session id won't be shared.  A good
> session id manager might be able to handle this, though, but forwarding
> the user between the two hosts with a special GET variable that triggers
> the setting of a cookie; if that was happening it would be like scenario 3.
>

Wouldn't the most secure approach for virtual hosts be to use different
session stores?  Perhaps using different session stores for separate
domains should be the default, for a little extra security.  However,
using the same SessionStore across virtual domains could be quite useful
for passing user settings between virtual domains (just as Ian said
above).

From ianb at colorstudy.com Sun Aug 21 01:56:31 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 20 Aug 2005 18:56:31 -0500
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <64ddb72c050820164212e0eacf@mail.gmail.com>
References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net> <4307A498.3000408@colorstudy.com> <64ddb72c050820164212e0eacf@mail.gmail.com>
Message-ID: <4307C32F.9050205@colorstudy.com>

Rene Dudfield wrote:
>>>* Scenario 4: Two apps in different virtual hosts.
>>
>>Probably not an issue because the session id won't be shared.  A good
>>session id manager might be able to handle this, though, but forwarding
>>the user between the two hosts with a special GET variable that triggers
>>the setting of a cookie; if that was happening it would be like scenario 3.
>>
>
> The most secure way for virtual hosts would be to use different
> session stores?  Using different session stores for separate domains
> should be the default for a little extra security?  However using the
> same SessionStores across virtual domains could be quite useful for
> passing users settings amongst virtual domains (just like Ian said
> above).

As long as session ids are generated properly, there should be no
overlap in ids unless you are using the same browser identification
(i.e., the same cookie).  So if the virtual hosts aren't explicitly
sharing session ids there's no real problem (as long as all those
applications are trusted to read any session, of course).
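
A minimal sketch of what "generated properly" could mean, assuming ids are
drawn from the operating system's random source so that independent hosts
are very unlikely to produce the same one.  The name create_session_id
comes from the interface draft; the nbytes parameter and the use of
Python's secrets module are illustrative assumptions, not part of the
proposal:

```python
import secrets


def create_session_id(nbytes=16):
    """Return an unguessable session id as a hex string.

    nbytes is a hypothetical tuning knob: 16 random bytes give a
    32-character hex id.
    """
    return secrets.token_hex(nbytes)


# Two hosts calling this independently should not collide in practice.
ids = {create_session_id() for _ in range(1000)}
```

With ids like these, overlap between virtual hosts only happens when the
same cookie is deliberately shared.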
--
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From mso at oz.net Sun Aug 21 03:08:37 2005
From: mso at oz.net (Mike Orr)
Date: Sat, 20 Aug 2005 18:08:37 -0700
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <4307A498.3000408@colorstudy.com>
References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net> <4307A498.3000408@colorstudy.com>
Message-ID: <4307D415.7070705@oz.net>

Ian Bicking wrote:

> Mike Orr wrote:

> I don't think several frameworks should share a single SessionManager
> *instance*.

Isn't that what being a session manager means?  The single gateway to
the stores.  Otherwise it's more a case of two instances co-managing.
That sounds more difficult, since the two managers may have different
bugs and thus an unintentional difference in policy.

>>> class ISessionManager:
>>>
>>>     id = """The string-identifier for this session manager.
>>>
>>>     All applications that share this session manager need to use the
>>>     same id when creating the session manager.
>>>
>>
>> With this rule I was expecting some central repository of session
>> managers, and factory functions a la logger.getLogger(), but there
>> doesn't seem to be any.  What's the purpose of the SessionManager id?
>
> The session manager id is used by the session store, to keep the
> sessions separate.  Actual session data is keyed by
> (session_manager_id, session_id), so that separate applications have
> separate session_manager_ids, and separate browsers have separate
> session_ids.

OK, we're using different terminology for the same thing.  I would call
that an application ID.  Two applications that want to share sessions
would use the same ID, and two instances of a blogging application that
don't want to share would have different app IDs.  MySQLSessionStore
takes an app ID in the constructor, and the session is saved under
(app_id, session_id).  It defaults to '' if you only have one
application and are too lazy to make up a name.
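
Roughly, and assuming a plain dict stands in for the database table, the
keying looks like the sketch below.  DictSessionStore and its save/load
methods are made up for illustration; they are not the actual
MySQLSessionStore API:

```python
class DictSessionStore:
    """Hypothetical store: sessions live under (app_id, session_id)."""

    def __init__(self, backend, app_id=""):
        self.backend = backend  # shared storage, here just a dict
        self.app_id = app_id    # '' when there is only one application

    def save(self, session_id, data):
        self.backend[(self.app_id, session_id)] = dict(data)

    def load(self, session_id):
        return self.backend.get((self.app_id, session_id), {})


backend = {}
blog_a = DictSessionStore(backend, app_id="blog-a")
blog_b = DictSessionStore(backend, app_id="blog-b")

# Same browser (same session id), two blog instances, no clobbering.
blog_a.save("abc123", {"user": "alice"})
blog_b.save("abc123", {"user": "bob"})
```

Two stores constructed with the same app_id would see each other's
sessions, which is the sharing case.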
Calling it app_id seems to make more sense. The user would find it logical to have to name their applications (=session namespaces). Whereas naming "session managers" sounds like an obscure implementation detail not related to this. I would think a session manager ID is its memory address, and why on earth would we want to know that? >>> class ISessionStore: >>> def load_session(session_store_id, session_id, read_only, >>> session_factory): >>> def session_ids(session_store_id): >>> def delete_session(session_store_id, session_id): >>> def touch(session_store_id, session_id): >>> def write_lock_session(session_store_id, session_id): >> >> >> >> Isn't session_store_id 'self'? Specifying it seems to imply this is >> a meta SessionStore, not an individual store. Why would a deployment >> have multiple stores? > > > Oops, this was a leftover from when SessionManager was named > SessionStore. These should all be session_manager_id. Fixed in svn. OK, translating 'session_manager_id' to 'app_id', this almost makes sense. So a SessionStore instance can handle multiple applications. Is this likely? I'd like to find some way to avoid passing this value to every method, since from the application's perspective, there's only one that matters. From ianb at colorstudy.com Sun Aug 21 08:29:19 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Sun, 21 Aug 2005 01:29:19 -0500 Subject: [Web-SIG] Session interface, v2 In-Reply-To: <4307D415.7070705@oz.net> References: <4303FEC5.3050408@colorstudy.com> <43078AEF.4000309@oz.net> <4307A498.3000408@colorstudy.com> <4307D415.7070705@oz.net> Message-ID: <43081F3F.2030805@colorstudy.com> Mike Orr wrote: >> The session manager id is used by the session store, to keep the >> sessions separate. Actual session data is keyed by >> (session_manager_id, session_id), so that separate applications have >> separate session_manager_ids, and separate browsers have separate >> session_ids. 
> > > > > OK, we're using different terminology for the same thing. I would call > that an application ID. Two applications that want to share sessions > would use the same ID, and two instances of a blogging application that > don't want to share would have different app IDs MySQLSessionStore has > an app ID in the constructor, and the session is saved under (app_id, > session_id). It defaults to '' if you only have one application and are > too lazy to make up a name. > > Calling it app_id seems to make more sense. The user would find it > logical to have to name their applications (=session namespaces). > Whereas naming "session managers" sounds like an obscure implementation > detail not related to this. I would think a session manager ID is its > memory address, and why on earth would we want to know that? The session manager needs to be instantiated with the app id, and we could rename it there, yes. It doesn't really matter to me. >>>> class ISessionStore: >>>> def load_session(session_store_id, session_id, read_only, >>>> session_factory): >>>> def session_ids(session_store_id): >>>> def delete_session(session_store_id, session_id): >>>> def touch(session_store_id, session_id): >>>> def write_lock_session(session_store_id, session_id): >>> >>> >>> >>> >>> Isn't session_store_id 'self'? Specifying it seems to imply this is >>> a meta SessionStore, not an individual store. Why would a deployment >>> have multiple stores? >> >> >> >> Oops, this was a leftover from when SessionManager was named >> SessionStore. These should all be session_manager_id. Fixed in svn. > > > > OK, translating 'session_manager_id' to 'app_id', this almost makes > sense. So a SessionStore instance can handle multiple applications. Is > this likely? I'd like to find some way to avoid passing this value to > every method, since from the application's perspective, there's only one > that matters. The session manager embodies that context, so you never pass that around. 
The session manager also has the locking policy; as you noticed, you
don't want optimistic locking unless you are ready for ConflictErrors,
and you don't want lossy if you are relying on the session for something
important.  So applications shouldn't share that setting either.  The
SessionStore interface is fairly dumb about what it's storing, so it
should be able to support multiple policies simultaneously.

--
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com Sun Aug 21 22:22:55 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 21 Aug 2005 15:22:55 -0500
Subject: [Web-SIG] More on app configuration...
Message-ID: <4308E29F.6040607@colorstudy.com>

So, I got in the first bit of working code for paste.deploy, which is a
continuation of the work I mentioned back in the thread "WSGI
deployment: an experiment":
http://mail.python.org/pipermail/web-sig/2005-July/001598.html

It's still incomplete; I only just got the very first tests to pass.
The code is in http://svn.pythonpaste.org/Paste/Deploy/trunk/

But I thought I'd describe what I'm currently thinking.  First, this
isn't really configuration-file-based so much as URI-based.  Right now
there are only two schemes:

    egg:EggSpec#entry_point_name
    config:config_filename#section_name

And I'll probably add:

    python:[protocol/]import_path

(I'm not sure where protocol should really go in this case)

Plain imports don't have an explicit protocol, so they are a little
harder to handle.  Eggs use entry points, which have explicit protocols.
Configuration files use section name prefixes to denote the protocol;
though since configuration files don't contain actual code, they usually
refer to something with an explicit protocol.

Right now there are only a few protocols -- paste.app_factory1,
paste.composit_factory1, paste.filter_factory1, and
paste.server_factory1.  I added "composit" for applications that bring
together other applications.
This includes URL dispatchers, pipelines, and some other things.  This
is like filters, which wrap a single application, but composits get a
reference back into the application loader, so they can load
applications by name.  Of course anyone can load an application by URI,
so it's not strictly necessary to have a separate type.  But I think
it's helpful to make explicit when an application is really just a
dispatcher, vs. a real terminal application.  I can't think of a better
name than "composit", but if anyone has ideas...

I haven't decided exactly how configuration will work.  Right now I've
included both global shared configuration and local configuration.
Global configuration is inherited throughout the system, and exists in
one flat namespace.  Local configuration can be added in a configuration
file.  Applications can explicitly pull defaults from the global
configuration, e.g. "email_errors" might be filled by
"system_admin_email".  Hopefully this will satisfy those who don't like
a big global pile of settings.

So a configuration section might look like:

    [DEFAULT]
    # This section holds global configuration
    system_admin_email = ianb at colorstudy.com

    [app:main]
    use = egg:MyApp#main
    # or you could do:
    # paste.app_factory1 = import_spec:object
    # override a global setting:
    set system_admin_email = webmaster at host.com
    # and a local setting:
    database = mysql://localhost/myapp

It uses ConfigParser, because it's dumb and the closest thing to making
no decision on configuration formats.  Maybe later it can use the file
extension to denote format; I suppose if so then I should make .ini
required right now (another scheme could be added for a different
format, but that seems wrong).

--
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From ianb at colorstudy.com Mon Aug 22 06:08:24 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 21 Aug 2005 23:08:24 -0500
Subject: [Web-SIG] PasteDeploy 0.1 (was: Re: More on app configuration...)
In-Reply-To: <4308E29F.6040607@colorstudy.com> References: <4308E29F.6040607@colorstudy.com> Message-ID: <43094FB8.7090505@colorstudy.com> I did a bunch more work on this today. It's still in an early state, but I decided I should release versions more often. So it's out there: http://cheeseshop.python.org/pypi/PasteDeploy/0.1 But probably more interesting to start with, the documentation: http://pythonpaste.org/deploy/paste-deploy.html I also wrote some interfaces (just for documentation purposes): http://svn.pythonpaste.org/Paste/Deploy/trunk/paste/deploy/interfaces.py I'm feeling pretty good about how it turned out. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From renesd at gmail.com Mon Aug 22 08:34:28 2005 From: renesd at gmail.com (Rene Dudfield) Date: Mon, 22 Aug 2005 16:34:28 +1000 Subject: [Web-SIG] PasteDeploy 0.1 (was: Re: More on app configuration...) In-Reply-To: <43094FB8.7090505@colorstudy.com> References: <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> Message-ID: <64ddb72c050821233445da5238@mail.gmail.com> Hey, a what is it good for/why use it section would be good on the web page. Cheers, On 8/22/05, Ian Bicking wrote: > I did a bunch more work on this today. It's still in an early state, > but I decided I should release versions more often. So it's out there: > > http://cheeseshop.python.org/pypi/PasteDeploy/0.1 > > But probably more interesting to start with, the documentation: > > http://pythonpaste.org/deploy/paste-deploy.html > > I also wrote some interfaces (just for documentation purposes): > > http://svn.pythonpaste.org/Paste/Deploy/trunk/paste/deploy/interfaces.py > > I'm feeling pretty good about how it turned out. 
>
> --
> Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org
>

From ianb at colorstudy.com Mon Aug 22 19:44:32 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 22 Aug 2005 12:44:32 -0500
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <64ddb72c050821233445da5238@mail.gmail.com>
References: <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com>
Message-ID: <430A0F00.6050807@colorstudy.com>

Rene Dudfield wrote:
> Hey,
>
> a what is it good for/why use it section would be good on the web page.

Good point.  It's not a complete solution yet so I'm not sure exactly
how to describe it; but this is what I put for now:

    Paste Deployment is a system for finding and configuring WSGI
    applications and servers.  For WSGI application consumers it
    provides a single, simple function (loadapp) for loading a WSGI
    application from a configuration file or a Python Egg.  For WSGI
    application providers it only asks for a single, simple entry point
    to your application, so that application users don't need to be
    exposed to the implementation details of your application.

    The result is something a system administrator can install and
    manage without knowing any Python, or the details of the WSGI
    application or its container.

As an aside I've also added a couple features this morning to make the
common case of pipelining filters a bit easier to configure.

Hmm... it's also just occurred to me that filters should be easier to
define.  In almost all cases I find I want to curry the configuration so
it can be applied at the same time the wrapped application is passed in.
I might add another protocol for that.
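
A sketch of the currying idea, assuming a filter is just a callable that
takes the wrapped application plus keyword configuration.  The names
uppercase_filter, filter_factory and hello_app are hypothetical, and this
is not the paste.deploy protocol itself, only the shape of it:

```python
from functools import partial


def uppercase_filter(app, prefix=""):
    """Toy WSGI middleware: upper-cases the body and prepends a prefix."""
    def wrapped(environ, start_response):
        return [prefix.encode() + chunk.upper()
                for chunk in app(environ, start_response)]
    return wrapped


def filter_factory(**config):
    # Bind the configuration now; the application is supplied later.
    return partial(uppercase_filter, **config)


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


make_filter = filter_factory(prefix="> ")  # configuration time
app = make_filter(hello_app)               # wrapping time
```

The loader would call filter_factory once with the section's settings,
then apply the result to whatever application comes next.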
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 23 02:26:35 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 22 Aug 2005 20:26:35 -0400 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <430A0F00.6050807@colorstudy.com> References: <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> Message-ID: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> At 12:44 PM 8/22/2005 -0500, Ian Bicking wrote: >Hmm... it's also just occurred to me that filters should be easier to >define. In almost all cases I find I want to curry the configuration so >it can be applied at the same time the wrapped application is passed in. > I might add another protocol for that. I think the format is improving, as it was now clear enough for me to figure out what I'd like to change. ;-) I stole this example off your blog, and then rewrote it using a slightly more advanced version of my last syntax proposal: # Put one login system in front of the entire site # [login wrapper from Paste] database = "mysql://localhost/userdb" table = "users" # Then this passes different path prefixes to different apps # [urlmap from Paste] "/" = static() "/cms" = auth(filebrowser_app()) "/blog" = blog() # variables used later # [config = vars] admin_email = "me at example.com" document_root = "/home/me/htdocs" # a very simple app... 
# [static = static from Paste] document_root = config.document_root # the login filter should give us a username; this just restricts # who can access # [auth = auth wrapper from Paste] require_role = "admin" admin_email = config.admin_email # this application is distributed in an egg # [filebrowser_app = filebrowser from FileBrowser] document_root = config.document_root admin_email = config.admin_email # In this case the app isn't distributed as an Egg with # entry_points, so we manually create a glue function blog_app # and just invoke it here # [blog = myglue.apps:blog_app] admin_email = config.admin_email Most of the above should be pretty obvious, but a few points anyway: * This format is generic; it has nothing to do with WSGI in particular and can be used to assemble any component tree. It also supports implementing the "wsgi services" concept. * Argument names can be either an identifier or a quoted string * You can use factories from a default group (e.g. 'vars' above might effectively be short for 'vars from WSGIUtils') * named sections ("[name = ...]") have to come after the unnamed sections, and they are turned into "curried" factory objects that are available in the eval() namespace used for all expressions. When called in an expression, they can accept keyword arguments to override the defaults in the named section. They have properties with the same names as the values defined in that section. * The first part of a section (after the "name=", if any) is an import spec for a factory, or if it's followed by "from" or "wrapper from", then it's the name of an entry point that advertises a factory. * "wrapper" means that the factory will be called with two positional arguments; non-wrappers are called with one argument. Named wrappers can be passed a positional argument if used in an another factory argument expression - this will be the object they should wrap. 
* The last unnamed section is the effective "result" of parsing the file, although it will be wrapped by any contiguous preceding "wrapper" sections The parser for this format would of course be considerably more complex than the Paste-Deploy parser (especially since evaluation would be done lazily), but I think the syntax is both cleaner and more powerful. The factory signatures are: def non_wrapper_factory(parent_component, **kw): ... def wrapper_factory(child_component, parent_component, **kw): ... With the parent/child parameters always being supplied positionally. The idea is that parent_component will be used to create a chain of service contexts, and child_component is an application to be wrapped by middleware. I've thought this through enough that I know how I could implement all of the features shown, but it may be a week or two at least before I could try hacking together an implementation. Also, the services side of it isn't really fleshed out yet, and it may also be that we need to provide some simple "builtin" functions in the eval() namespace to do things like lookup services or load other deployment files, etc. From ianb at colorstudy.com Tue Aug 23 04:03:18 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 22 Aug 2005 21:03:18 -0500 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> References: <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> Message-ID: <430A83E6.5030302@colorstudy.com> Phillip J. Eby wrote: > At 12:44 PM 8/22/2005 -0500, Ian Bicking wrote: > >> Hmm... it's also just occurred to me that filters should be easier to >> define. In almost all cases I find I want to curry the configuration so >> it can be applied at the same time the wrapped application is passed in. >> I might add another protocol for that. 
> > > I think the format is improving, as it was now clear enough for me to > figure out what I'd like to change. ;-) > > I stole this example off your blog, and then rewrote it using a slightly > more advanced version of my last syntax proposal: > > # Put one login system in front of the entire site > # > [login wrapper from Paste] > database = "mysql://localhost/userdb" > table = "users" > > # Then this passes different path prefixes to different apps > # > [urlmap from Paste] > "/" = static() > "/cms" = auth(filebrowser_app()) > "/blog" = blog() One aspect of paste.deploy that wasn't shown in that example is that it's easy to refer to other configuration files. It would actually be more realistic to do: [composit:app] use = egg:Paste#urlmap / = config:static_root.ini /cms = config:filebrowser.ini /blog = config:blog.ini And if filebrowser.ini defined an authentication filter named "auth", you could add this to blog.ini to reuse that configuration: [filter-app:main] use = config:filebrowser.ini#auth next = blog [app:blog] .... And so forth. I think this will be really useful to me (when I have my sysadmin/deployer hat on) -- it's something I left out of my own previous specs, but I think incorrectly. > # variables used later > # > [config = vars] > admin_email = "me at example.com" > document_root = "/home/me/htdocs" This seems useful. I had thought about some way of using the globals in expressions; but with pure-string expressions it's not easy to do much of interest. > # a very simple app... 
> # > [static = static from Paste] > document_root = config.document_root > > # the login filter should give us a username; this just restricts > # who can access > # > [auth = auth wrapper from Paste] > require_role = "admin" > admin_email = config.admin_email > > # this application is distributed in an egg > # > [filebrowser_app = filebrowser from FileBrowser] > document_root = config.document_root > admin_email = config.admin_email However, in paste.deploy there does remain real global configuration, so you wouldn't have to manually copy in values from the globals. While admittedly it makes the interface slightly less elegant from the Python side, I think it's an important feature. > # In this case the app isn't distributed as an Egg with > # entry_points, so we manually create a glue function blog_app > # and just invoke it here > # > [blog = myglue.apps:blog_app] > admin_email = config.admin_email > > > Most of the above should be pretty obvious, but a few points anyway: > > * This format is generic; it has nothing to do with WSGI in particular > and can be used to assemble any component tree. It also supports > implementing the "wsgi services" concept. Ditto paste.deploy. Not all of the bits are well defined in the implementation, but there's nothing inside or out that's connected to WSGI. > * Argument names can be either an identifier or a quoted string I tried to avoid anything fancy; if I was going to do something fancy I'd feel a need to look at all the configuration formats currently for Python, and if not reuse them at least steal from them. But it's clear that plain ConfigParser parsing is pretty lame. > * You can use factories from a default group (e.g. 'vars' above might > effectively be short for 'vars from WSGIUtils') How is that default group determined? What is a "group"? 
> * named sections ("[name = ...]") have to come after the unnamed > sections, and they are turned into "curried" factory objects that are > available in the eval() namespace used for all expressions. When called > in an expression, they can accept keyword arguments to override the > defaults in the named section. They have properties with the same names > as the values defined in that section. The properties are fine; I can't say the calling syntax appeals to me particularly. > * The first part of a section (after the "name=", if any) is an import > spec for a factory, or if it's followed by "from" or "wrapper from", > then it's the name of an entry point that advertises a factory. How do you determine the entry point type? Or is there one entry point type for anything available in a configuration file? paste.deploy defines an entry point type for each kind of object. > * "wrapper" means that the factory will be called with two positional > arguments; non-wrappers are called with one argument. Named wrappers > can be passed a positional argument if used in an another factory > argument expression - this will be the object they should wrap. This part is unclear to me. > * The last unnamed section is the effective "result" of parsing the > file, although it will be wrapped by any contiguous preceding "wrapper" > sections This isn't clear to me when reading the configuration file. INI files are flat, and I wouldn't expect them to be usefully ordered, especially in a way that puts particular importance on the last unnamed section. I'd feel more comfortable with a nested configuration format in that case. > The parser for this format would of course be considerably more complex > than the Paste-Deploy parser (especially since evaluation would be done > lazily), but I think the syntax is both cleaner and more powerful. The > factory signatures are: > > def non_wrapper_factory(parent_component, **kw): > ... 
> > def wrapper_factory(child_component, parent_component, **kw): > ... > > With the parent/child parameters always being supplied positionally. > The idea is that parent_component will be used to create a chain of > service contexts, and child_component is an application to be wrapped by > middleware. > > I've thought this through enough that I know how I could implement all > of the features shown, but it may be a week or two at least before I > could try hacking together an implementation. Also, the services side > of it isn't really fleshed out yet, and it may also be that we need to > provide some simple "builtin" functions in the eval() namespace to do > things like lookup services or load other deployment files, etc. I dunno... I can't say much about the services, because I don't really know what you intend with those. These are some things I like about your example: * More structured/richer section names could be good; paste.deploy's "use" could go as a result. * A clear notion of evaluation and variables would be nice. * A config format with good quoting rules is called for. ConfigParser isn't anything more than a stop-gap. But some things I don't like: * Using ordering in a syntax that doesn't feel ordered or nested. * Using function composition to represent application/filter composition. But only sometimes. * "name from egg_spec" reads nice on one level, but is vague on another level. Even if "egg:egg_spec#name" doesn't read well, I think it is nicely self-describing. * eval() scares me a bit; if I used eval() I would feel a need to keep sufficient information around to do proper tracebacks that include the source configuration file. But all-strings isn't great either. Evaluation without conditionals seems like it goes only half-way; OTOH conditionals get to something too complex for configuration. 
So however it goes, configuration should be somewhere in the middle of completely dumb (ConfigParser, unevaluated values), and completely general (Python code). Where in the middle I'm unsure. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 23 05:12:44 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 22 Aug 2005 23:12:44 -0400 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <430A83E6.5030302@colorstudy.com> References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> At 09:03 PM 8/22/2005 -0500, Ian Bicking wrote: >One aspect of paste.deploy that wasn't shown in that example is that it's >easy to refer to other configuration files. It would actually be more >realistic to do: > > [composit:app] > use = egg:Paste#urlmap > / = config:static_root.ini > /cms = config:filebrowser.ini > /blog = config:blog.ini In the minimum case, one could do that without any change to the syntax I proposed: [static = file] filename = "static_root.ini" But I think it would be nicer to just provide a "builtin function" to load a component from a file, since it's common enough to deserve a primitive. >And if filebrowser.ini defined an authentication filter named "auth", you >could add this to blog.ini to reuse that configuration: > > [filter-app:main] > use = config:filebrowser.ini#auth > next = blog One of the things I really dislike about the PasteDeploy syntax is that it mingles factory arguments with chaining, which seems like mixing metalevels to me; I never know whether an argument is intended for the parser (e.g. use, next) or for the factory. 
My blog.ini would look like this (in its entirety): [file] # Borrow the filebrowser 'auth' wrapper filename = "filebrowser.ini" factory = "auth" [myglue.apps:blog_app] >However, in paste.deploy there does remain real global configuration, so >you wouldn't have to manually copy in values from the globals. While >admittedly it makes the interface slightly less elegant from the Python >side, I think it's an important feature. That's easily emulated if you need it; just create a configuration service or services that can be acquired via the parent_component links. Actually, the format I propose allows numerous other ways to emulate that feature on varying scales, but doesn't force all factories to understand any one specific configuration protocol. >>* Argument names can be either an identifier or a quoted string > >I tried to avoid anything fancy; if I was going to do something fancy I'd >feel a need to look at all the configuration formats currently for Python, >and if not reuse them at least steal from them. > >But it's clear that plain ConfigParser parsing is pretty lame. The only reason for allowing string literals is to avoid coming up with a lame new escaping scheme for use cases like the URL map. >>* You can use factories from a default group (e.g. 'vars' above might >>effectively be short for 'vars from WSGIUtils') > >How is that default group determined? What is a "group"? Er, sorry, I meant entry point group, like "wsgi.factories" or whatever. I was just pointing out that the meaning of a non-import string could be loaded from an entry point group, and that the group might vary depending on the application loading the configuration file. >>* named sections ("[name = ...]") have to come after the unnamed >>sections, and they are turned into "curried" factory objects that are >>available in the eval() namespace used for all expressions. When called >>in an expression, they can accept keyword arguments to override the >>defaults in the named section. 
They have properties with the same names >>as the values defined in that section. > >The properties are fine; I can't say the calling syntax appeals to me >particularly. I thought about *not* calling them (except for wrappers), but then the properties would have to go. >>* The first part of a section (after the "name=", if any) is an import >>spec for a factory, or if it's followed by "from" or "wrapper from", then >>it's the name of an entry point that advertises a factory. > >How do you determine the entry point type? Or is there one entry point >type for anything available in a configuration file? paste.deploy defines >an entry point type for each kind of object. I'm thinking that the loader gets passed some arguments to determine what entry point group to use. This format, by the way, only requires one group for all the "normal" entry points, because the "wrapper" keyword distinguishes between the two factory signatures -- which are the only signatures you get. >>* "wrapper" means that the factory will be called with two positional >>arguments; non-wrappers are called with one argument. Named wrappers can >>be passed a positional argument if used in an another factory argument >>expression - this will be the object they should wrap. > >This part is unclear to me. See the urlmap in the example, where "/blog" = auth(blog()). 'auth' is a "wrapper", so it can be called with something to wrap (e.g. 'blog()'). >* Using ordering in a syntax that doesn't feel ordered or nested. Fair enough. However, I'm used to ordered .ini files (they do exist), so I'm not sure that's enough on its own to rule out the syntax. 
Also, we could nix the '[]' for section headings and come up with something else, e.g.: login wrapper from Paste: database = "mysql://localhost/userdb" table = "users" urlmap from Paste: "/" = static() "/cms" = auth(filebrowser_app()) "/blog" = blog() def config() as vars: admin_email = "me at example.com" document_root = "/home/me/htdocs" def static() as static from Paste: document_root = config.document_root def auth() as auth wrapper from Paste: require_role = "admin" admin_email = config.admin_email def filebrowser_app() as filebrowser from FileBrowser: document_root = config.document_root admin_email = config.admin_email def blog() as myglue.apps:blog_app: admin_email = config.admin_email This probably wouldn't be any harder to parse than my initial proposal, as I was thinking of using the "tokenize" module, and in this case a DEDENT token would indicate the end of a section. I'm not sure I like the 'def x()' bit, makes it look a little too much like Python, at the same time as it seems good to have it be like Python. >* Using function composition to represent application/filter >composition. But only sometimes. Only sometimes you don't like it? :) Or do you mean that the format I gave only uses it sometimes, and that's what you dislike? (i.e., you'd be fine if it was always done that way?) >* "name from egg_spec" reads nice on one level, but is vague on another >level. Even if "egg:egg_spec#name" doesn't read well, I think it is >nicely self-describing. Um, wha??? The only difference between the two is that one of them has "egg:" in front of it, which seems a bit redundant to me. That's probably because I assume that in the long run eggs will be so ubiquitous that it really will be redundant to explicitly refer to them as such. 
:) Conversely, if I assume that some further description is required, I would want to say "pypi:" or "project:" or something else of that sort, because "egg" isn't the essential nature of the thing; the name is a *project* name, while eggs are an implementation detail. >* eval() scares me a bit; if I used eval() I would feel a need to keep >sufficient information around to do proper tracebacks that include the >source configuration file. Sure, that can be done, especially if one does a couple of tricks with compile(), new.code(), and co_firstlineno. >But all-strings isn't great either. Evaluation without conditionals seems >like it goes only half-way; OTOH conditionals get to something too complex >for configuration. So however it goes, configuration should be somewhere >in the middle of completely dumb (ConfigParser, unevaluated values), and >completely general (Python code). Where in the middle I'm unsure. eval() isn't full Python code, and you can set a restricted set of builtins if you really want. I don't see much point to dumbing it down any further than that, though. From ianb at colorstudy.com Tue Aug 23 06:30:19 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 22 Aug 2005 23:30:19 -0500 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> Message-ID: <430AA65B.2090409@colorstudy.com> Phillip J. Eby wrote: > At 09:03 PM 8/22/2005 -0500, Ian Bicking wrote: > >> One aspect of paste.deploy that wasn't shown in that example is that >> it's easy to refer to other configuration files. 
It would actually be >> more realistic to do: >> >> [composit:app] >> use = egg:Paste#urlmap >> / = config:static_root.ini >> /cms = config:filebrowser.ini >> /blog = config:blog.ini > > > In the minimum case, one could do that without any change to the syntax > I proposed: > > [static = file] > filename = "static_root.ini" > > But I think it would be nicer to just provide a "builtin function" to > load a component from a file, since it's common enough to deserve a > primitive. Definitely; files are the essence of configuration, file references should be a core concept. (Of course ZODB objects can work much like files, or wherever else you put the configuration is likely to have a filesystem-like feel) >> And if filebrowser.ini defined an authentication filter named "auth", >> you could add this to blog.ini to reuse that configuration: >> >> [filter-app:main] >> use = config:filebrowser.ini#auth >> next = blog > > > One of the things I really dislike about the PasteDeploy syntax is that > it mingles factory arguments with chaining, which seems like mixing > metalevels to me; I never know whether an argument is intended for the > parser (e.g. use, next) or for the factory. My blog.ini would look like > this (in its entirety): > > [file] > # Borrow the filebrowser 'auth' wrapper > filename = "filebrowser.ini" > factory = "auth" > > [myglue.apps:blog_app] I'm fine getting rid of "use", maybe like: [filter-app:main config:filebrowser.ini#auth] Well... upon writing it, it doesn't look that nice. But in theory... or maybe something like: [filter-app:main] config:filebrowser.ini#auth Where the first (non-comment) line was interpreted as the thing being loaded. That makes it look very different. Or whatever; I don't feel that strongly about it. >> However, in paste.deploy there does remain real global configuration, >> so you wouldn't have to manually copy in values from the globals. 
>> While admittedly it makes the interface slightly less elegant from the
>> Python side, I think it's an important feature.
>
>
> That's easily emulated if you need it; just create a configuration
> service or services that can be acquired via the parent_component
> links.  Actually, the format I propose allows numerous other ways to
> emulate that feature on varying scales, but doesn't force all factories
> to understand any one specific configuration protocol.

It's important to me, and it's not intuitive to me what you envision.
So I feel a need to see services in action, replacing global
configuration. I'll admit, it felt a little funny to me when I converted
middleware to have a global configuration parameter that they simply
ignored. But *not* having that access would bother me more.

>>> * You can use factories from a default group (e.g. 'vars' above might
>>> effectively be short for 'vars from WSGIUtils')
>>
>>
>> How is that default group determined?  What is a "group"?
>
>
> Er, sorry, I meant entry point group, like "wsgi.factories" or
> whatever.  I was just pointing out that the meaning of a non-import
> string could be loaded from an entry point group, and that the group
> might vary depending on the application loading the configuration file.

In paste.deploy I allow for future groups that ultimately return the
same kind of object, so the group adds important information.
paste.composit_factory1 and paste.app_factory1 both return the same kind
of object, for instance.

>>> * named sections ("[name = ...]") have to come after the unnamed
>>> sections, and they are turned into "curried" factory objects that are
>>> available in the eval() namespace used for all expressions.  When
>>> called in an expression, they can accept keyword arguments to
>>> override the defaults in the named section.  They have properties
>>> with the same names as the values defined in that section.
>> >> >> The properties are fine; I can't say the calling syntax appeals to me >> particularly. > > > I thought about *not* calling them (except for wrappers), but then the > properties would have to go. I like the properties more than the composition. >>> * The first part of a section (after the "name=", if any) is an >>> import spec for a factory, or if it's followed by "from" or "wrapper >>> from", then it's the name of an entry point that advertises a factory. >> >> >> How do you determine the entry point type? Or is there one entry >> point type for anything available in a configuration file? >> paste.deploy defines an entry point type for each kind of object. > > > I'm thinking that the loader gets passed some arguments to determine > what entry point group to use. This format, by the way, only requires > one group for all the "normal" entry points, because the "wrapper" > keyword distinguishes between the two factory signatures -- which are > the only signatures you get. In paste.deploy the syntax (filter:, etc) and the group are redundant. So if you accidentally treat an application like a filter it'll be caught before you call the object with the wrong parameters. I'm also realizing that positional parameters are bad; I think I'll be changing to calling with purely keyword parameters. It's too easy to mix up positional parameters, and pass in the wrong object in the wrong location, and then it only gets caught later when you try to use the wrong object in a way it doesn't support. That was one of the more common errors I produced when converting my code. >>> * "wrapper" means that the factory will be called with two positional >>> arguments; non-wrappers are called with one argument. Named wrappers >>> can be passed a positional argument if used in an another factory >>> argument expression - this will be the object they should wrap. >> >> >> This part is unclear to me. > > > See the urlmap in the example, where "/blog" = auth(blog()). 
'auth' is > a "wrapper", so it can be called with something to wrap (e.g. 'blog()'). But the wrapper there is called with one argument, and the app with zero; but you say the wrapper has two and the app one...? >> * Using ordering in a syntax that doesn't feel ordered or nested. > > > Fair enough. However, I'm used to ordered .ini files (they do exist), > so I'm not sure that's enough on its own to rule out the syntax. Also, > we could nix the '[]' for section headings and come up with something > else, e.g.: > > login wrapper from Paste: > database = "mysql://localhost/userdb" > table = "users" > > urlmap from Paste: > "/" = static() > "/cms" = auth(filebrowser_app()) > "/blog" = blog() > > def config() as vars: > admin_email = "me at example.com" > document_root = "/home/me/htdocs" > > def static() as static from Paste: > document_root = config.document_root > > def auth() as auth wrapper from Paste: > require_role = "admin" > admin_email = config.admin_email > > def filebrowser_app() as filebrowser from FileBrowser: > document_root = config.document_root > admin_email = config.admin_email > > def blog() as myglue.apps:blog_app: > admin_email = config.admin_email That's not any different to me, I guess. This would be a better use of indentation: main = pipeline: login wrapper from Paste: config... 
urlmap from Paste: "/" = static # for some reason this feels a lot better than # auth(filebrowser_app())) to me: "/cms" = pipeline(auth, filebrowser_app) "/blog" = blog config = vars: admin_email = "me at example.com" document_root = "/home/me/htdocs" static = static from Paste: document_root = config.document_root # gotta admit I still really prefer "filter" to "wrapper" auth = auth wrapper from Paste: require_role = "admin" admin_email = config.admin_email filebrowser_app = filebrowser from FileBrowser: # better use of properties than config.document_root, really: document_root = static.document_root admin_email = config.admin_email # or you could do the whole thing like: filebrowser = pipeline: auth wrapper from Paste: require_role = "admin" admin_email = config.admin_email filebrowser from FileBrowser: document_root = static.document_root # maybe you could specialize/clone like: auth2 = auth: require_role = "editor" That configuration feels way better to me. The named applications are full peers with the unnamed applications/objects (and I think there just shouldn't be top-level unnamed applications). >> * Using function composition to represent application/filter >> composition. But only sometimes. > > > Only sometimes you don't like it? :) Or do you mean that the format I > gave only uses it sometimes, and that's what you dislike? (i.e., you'd > be fine if it was always done that way?) I don't like the function calling at all, but even moreso because it's inconsistent (it isn't used for the main app). >> * "name from egg_spec" reads nice on one level, but is vague on >> another level. Even if "egg:egg_spec#name" doesn't read well, I think >> it is nicely self-describing. > > > Um, wha??? The only difference between the two is that one of them has > "egg:" in front of it, which seems a bit redundant to me. That's > probably because I assume that in the long run eggs will be so > ubiquitous that it really will be redundant to explicitly refer to them > as such. 
:) In paste.deploy config files are full peers to Eggs, and can be used anywhere that an egg: reference can be used. I think that's a neat feature. I don't want to tack on referencing other config files, like a special loader factory or textual inclusion hacks or anything like that. Config files describe applications. Egg entry points describe applications. They should be peers. > Conversely, if I assume that some further description is required, I > would want to say "pypi:" or "project:" or something else of that sort, > because "egg" isn't the essential nature of the thing; the name is a > *project* name, while eggs are an implementation detail. egg: is an access method, just like http: or whatever. It doesn't say what the URI describes, just how to find it. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 23 16:42:30 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 23 Aug 2005 10:42:30 -0400 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <430AA65B.2090409@colorstudy.com> References: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> At 11:30 PM 8/22/2005 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>At 09:03 PM 8/22/2005 -0500, Ian Bicking wrote: >>>However, in paste.deploy there does remain real global configuration, so >>>you wouldn't have to manually copy in values from the globals. >>>While admittedly it makes the interface slightly less elegant from the >>>Python side, I think it's an important feature. 
>> >>That's easily emulated if you need it; just create a configuration >>service or services that can be acquired via the parent_component >>links. Actually, the format I propose allows numerous other ways to >>emulate that feature on varying scales, but doesn't force all factories >>to understand any one specific configuration protocol. > >It's important to me, and it's not intuitive to me what you envision. So I >feel a need to services in action, replacing global configuration. Using a config service in a factory to get a default argument value: def some_app_factory(parent_component, **args): config = parent_component.get_service("global_config") args.setdefault('someparam', config['someparam']) Registering a config service (old syntax): [globalconfigservice from SomeEgg] someparam = "foo" [...next component in the stack...] The config service would respond to 'get_service("global_config")' by returning self. The idea is that when you chain non-wrapper components in a pipeline, each one gets the previous component as its "parent component", so you can "acquire" services from your parents. Components nearer to you (i.e. more local) can override more global service definitions. >>I'm thinking that the loader gets passed some arguments to determine what >>entry point group to use. This format, by the way, only requires one >>group for all the "normal" entry points, because the "wrapper" keyword >>distinguishes between the two factory signatures -- which are the only >>signatures you get. > >In paste.deploy the syntax (filter:, etc) and the group are redundant. So >if you accidentally treat an application like a filter it'll be caught >before you call the object with the wrong parameters. Well, that could certainly be done with this approach too. >>>>* "wrapper" means that the factory will be called with two positional >>>>arguments; non-wrappers are called with one argument. 
Named wrappers >>>>can be passed a positional argument if used in an another factory >>>>argument expression - this will be the object they should wrap. >>> >>> >>>This part is unclear to me. >> >>See the urlmap in the example, where "/blog" = auth(blog()). 'auth' is a >>"wrapper", so it can be called with something to wrap (e.g. 'blog()'). > >But the wrapper there is called with one argument, and the app with zero; >but you say the wrapper has two and the app one...? Named sections have a default parent_component argument, so that you don't have to explicitly pass them in. ># or you could do the whole thing like: > >filebrowser = pipeline: > auth wrapper from Paste: > require_role = "admin" > admin_email = config.admin_email > filebrowser from FileBrowser: > document_root = static.document_root > ># maybe you could specialize/clone like: > >auth2 = auth: > require_role = "editor" Interesting. If we used "in" to include other files then you could refer to e.g.: foo = main in "some.ini": # override params Also, I was thinking that in this syntax, you want to be able to leave off the trailing ':' for simple definitions, so that this would be a complete definition, without needing a body: foo = main in "some.ini" Finally, I think we could drop the "pipeline" keyword and simply use a ':' to define a name, which then gives us a way to stack components inside a definition, e.g.: main: login wrapper from Paste: # blah urlmap from Paste: "/": static "/blog": main in "blog.ini" "/cms": auth wrapper from Paste: require_role = "admin" filebrowser from FileBrowser: document_root = static.document_root The idea here is that if you want to pass a component or components to a factory, you have to use ':' syntax, with either a one-line component specifier on the same line, or a multi-component stack as an indented suite. 
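The two factory signatures from earlier in the thread (non-wrappers take parent_component; wrappers take child_component, then parent_component) suggest how an indented stack such as "auth wrapper" over "filebrowser" might be assembled. What follows is only a rough Python sketch: build_stack, the (factory, is_wrapper, kwargs) tuples, and the toy factories are hypothetical names made up for illustration, and for simplicity every component is given the same parent_component.

```python
def build_stack(entries, parent_component=None):
    """Compose a stack listed outermost-first.

    entries is a list of (factory, is_wrapper, kwargs) tuples (a made-up
    representation). The last entry is the innermost, non-wrapper component;
    the wrappers before it are applied from the inside out.
    """
    if not entries:
        raise ValueError("a stack needs at least one component")
    *wrappers, (app_factory, app_is_wrapper, app_kwargs) = entries
    if app_is_wrapper:
        raise ValueError("the innermost component must not be a wrapper")
    # Non-wrapper signature: factory(parent_component, **kw)
    component = app_factory(parent_component, **app_kwargs)
    for factory, is_wrapper, kwargs in reversed(wrappers):
        if not is_wrapper:
            raise ValueError("only wrappers may precede the innermost component")
        # Wrapper signature: factory(child_component, parent_component, **kw)
        component = factory(component, parent_component, **kwargs)
    return component


# Toy factories standing in for real applications and middleware.
def hello_app_factory(parent_component, greeting="hello"):
    def app(name):
        return f"{greeting} {name}"
    return app


def upper_wrapper_factory(child_component, parent_component):
    def wrapped(name):
        return child_component(name).upper()
    return wrapped


def exclaim_wrapper_factory(child_component, parent_component, mark="!"):
    def wrapped(name):
        return child_component(name) + mark
    return wrapped


stack = build_stack([
    (exclaim_wrapper_factory, True, {"mark": "?"}),
    (upper_wrapper_factory, True, {}),
    (hello_app_factory, False, {"greeting": "hi"}),
])
print(stack("ian"))
```

The first entry ends up outermost, which matches reading a stack top to bottom in the indented syntax. The proposal also lets non-wrapper components earlier in a stack act as parents for later ones; this sketch leaves that out.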
An interesting question is whether you should be able to refer to nested definitions as factory prototypes (ala your auth2/auth) or whether only top-level names should be usable. For example in this: foo: bar from baz spam: foo: snickety from lemon scuzz: foo sprim: thingy: foo Does "scuzz: foo" refer to the inner foo or the outer foo? What about "thingy: foo"? I'm inclined to say that both refer to the spam: foo rather than the outermost "foo". i.e., more or less the same rules as Python scopes. One minor problem with this syntax overall, though, is that it's a bit context-dependent. Whether "foo:" means "define foo" or "create a foo" is just a matter of alternating layers. It would be better if the syntax were less ambiguous, e.g.: main := login wrapper from Paste: # blah urlmap from Paste: "/" := static "/blog" := main in "blog.ini" "/cms" := auth wrapper from Paste: require_role = "admin" filebrowser from FileBrowser: document_root = static.document_root But that doesn't actually seem to help visually, and makes it harder to write because you have to remember all the time whether you need ":" or ":=". Maybe this would be better: main is: login wrapper from Paste: # blah urlmap from Paste: match_mode = "longest" "/" is static "/blog" is main in "blog.ini" "/cms" is: auth wrapper from Paste: require_role = "admin" filebrowser from FileBrowser: document_root = static.document_root What's nice about this is that now you can unambiguously create a top-level object in the simple case: zapp from Zope: cfg_file = "site.zcml" Without needing to do: main is: zapp from Zope: cfg_file = "site.zcml" Which is a pain, IMO. Although I suppose we could allow: main is zapp from Zope: cfg_file = "site.zcml" which isn't too bad. >>>* "name from egg_spec" reads nice on one level, but is vague on another >>>level. Even if "egg:egg_spec#name" doesn't read well, I think it is >>>nicely self-describing. >> >>Um, wha??? 
The only difference between the two is that one of them has >>"egg:" in front of it, which seems a bit redundant to me. That's >>probably because I assume that in the long run eggs will be so ubiquitous >>that it really will be redundant to explicitly refer to them as such. :) > >In paste.deploy config files are full peers to Eggs, and can be used >anywhere that an egg: reference can be used. I think that's a neat >feature. I don't want to tack on referencing other config files, like a >special loader factory or textual inclusion hacks or anything like that. > >Config files describe applications. Egg entry points describe >applications. They should be peers. Okay. The "in" syntax I gave above allows that, although I could also go for only using "from", as long as config URLs are quoted strings. I also think the strings should be relative or absolute URLs, rather than filenames. (So that '/' has the same meaning on all platforms.) That will be something of a pain for Windows users who may need to include drive letters, but oh well. We can always treat the letters A-Z as a special "file:" protocol to fix that. :) >>Conversely, if I assume that some further description is required, I >>would want to say "pypi:" or "project:" or something else of that sort, >>because "egg" isn't the essential nature of the thing; the name is a >>*project* name, while eggs are an implementation detail. > >egg: is an access method, just like http: or whatever. It doesn't say >what the URI describes, just how to find it. Ah, but that's just it. The project name is a URN, not a URL, precisely because it *doesn't* describe how to locate the resource, it just names the resource and tells the system to go find it. 
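The service "acquisition" described in the message above (a factory asks its parent_component for "global_config", and nearer components override more global ones) could look roughly like this in Python. Only the names get_service, parent_component, some_app_factory and "global_config" come from the thread; the Component class, its constructor, and the dict returned by the factory are assumptions made up for this sketch.

```python
class Component:
    """Hypothetical node in a chain of parent components."""

    def __init__(self, parent=None, services=None):
        self.parent = parent
        self.services = dict(services or {})

    def get_service(self, name):
        # Walk from this component up through its parents; the nearest
        # definition wins, so local services override global ones.
        node = self
        while node is not None:
            if name in node.services:
                return node.services[name]
            node = node.parent
        raise LookupError(f"no service named {name!r}")


def some_app_factory(parent_component, **args):
    # Use the acquired configuration only as a default for missing arguments.
    config = parent_component.get_service("global_config")
    args.setdefault("someparam", config["someparam"])
    return args  # stand-in for a real application object


root = Component(services={"global_config": {"someparam": "foo"}})
local = Component(parent=root, services={"global_config": {"someparam": "bar"}})
plain_child = Component(parent=root)

print(some_app_factory(root))
print(some_app_factory(local))
print(some_app_factory(plain_child, someparam="explicit"))
```

An explicitly passed argument is never overwritten, because setdefault only fills in a value when the key is absent.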
From ianb at colorstudy.com  Tue Aug 23 18:03:05 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Aug 2005 11:03:05 -0500
Subject: [Web-SIG] PasteDeploy 0.1
In-Reply-To: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
References: <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<4308E29F.6040607@colorstudy.com>
	<43094FB8.7090505@colorstudy.com>
	<64ddb72c050821233445da5238@mail.gmail.com>
	<5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com>
	<5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
	<5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com>
Message-ID: <430B48B9.90607@colorstudy.com>

Phillip J. Eby wrote:
>> It's important to me, and it's not intuitive to me what you envision.
>> So I feel a need to see services in action, replacing global configuration.
>
>
> Using a config service in a factory to get a default argument value:
>
>     def some_app_factory(parent_component, **args):
>         config = parent_component.get_service("global_config")
>         args.setdefault('someparam', config['someparam'])

So "parent_component" is some special object created by the config
loader...

> Registering a config service (old syntax):
>
>     [globalconfigservice from SomeEgg]
>     someparam = "foo"

...and I assume in this case globalconfigservice does something along
these lines:

    def globalconfigservice(parent_component, next, **args):
        config = parent_component.get_service('global_config').copy()
        config.update(args)
        component = Component(parent_component)
        component.save_service('global_config', config)
        return next(component)

Obviously I'm making up the component interface here.

>     [...next component in the stack...]
>
> The config service would respond to 'get_service("global_config")' by
> returning self.
> > The idea is that when you chain non-wrapper components in a pipeline, > each one gets the previous component as its "parent component", so you > can "acquire" services from your parents. Components nearer to you > (i.e. more local) can override more global service definitions. OK, well now I'm a bit confused... is globalconfigservice a wrapper? I assume globalconfigservice can't modify the parent_component it is passed, and has to create a new one? >> # or you could do the whole thing like: >> >> filebrowser = pipeline: >> auth wrapper from Paste: >> require_role = "admin" >> admin_email = config.admin_email >> filebrowser from FileBrowser: >> document_root = static.document_root >> >> # maybe you could specialize/clone like: >> >> auth2 = auth: >> require_role = "editor" > > > Interesting. If we used "in" to include other files then you could > refer to e.g.: > > foo = main in "some.ini": > # override params Hmm... it would be nice to allow configuration filenames to be variables. Though "in" and "from" don't scream "config file" and "egg" to me -- they are both equally vague terms. I'd rather see "in egg" and "in file". > Also, I was thinking that in this syntax, you want to be able to leave > off the trailing ':' for simple definitions, so that this would be a > complete definition, without needing a body: > > foo = main in "some.ini" Yes, that works well. 
> Finally, I think we could drop the "pipeline" keyword and simply use a
> ':' to define a name, which then gives us a way to stack components
> inside a definition, e.g.:
>
>     main:
>         login wrapper from Paste:
>             # blah
>         urlmap from Paste:
>             "/": static
>             "/blog": main in "blog.ini"
>             "/cms":
>                 auth wrapper from Paste:
>                     require_role = "admin"
>                 filebrowser from FileBrowser:
>                     document_root = static.document_root
>
>
> The idea here is that if you want to pass a component or components to a
> factory, you have to use ':' syntax, with either a one-line component
> specifier on the same line, or a multi-component stack as an indented
> suite.

This starts looking a lot like class statements (especially when class
statements get reused as data definitions). And of course a bit like
YAML. But then both those resemblances are okay.

> An interesting question is whether you should be able to refer to nested
> definitions as factory prototypes (ala your auth2/auth) or whether only
> top-level names should be usable.  For example in this:
>
>     foo: bar from baz
>
>     spam:
>         foo: snickety from lemon
>         scuzz: foo
>         sprim:
>             thingy: foo
>
> Does "scuzz: foo" refer to the inner foo or the outer foo?  What about
> "thingy: foo"?
>
> I'm inclined to say that both refer to the spam: foo rather than the
> outermost "foo".  i.e., more or less the same rules as Python scopes.

I agree. Will spam.foo be an unambiguous representation? It seems like
it should be. Would there be a global object, like globals.foo?

> One minor problem with this syntax overall, though, is that it's a bit
> context-dependent.  Whether "foo:" means "define foo" or "create a foo"
> is just a matter of alternating layers.  It would be better if the
> syntax were less ambiguous, e.g.:

I don't see the distinction between "define" and "create". By this
distinction do you mean that pieces of the loading process are lazy?
Can all parts be lazy?
(I.e., the config file defines named factories, the body of sections isn't evaluated until those factories are invoked) > main := > login wrapper from Paste: > # blah > urlmap from Paste: > "/" := static > "/blog" := main in "blog.ini" > "/cms" := > auth wrapper from Paste: > require_role = "admin" > filebrowser from FileBrowser: > document_root = static.document_root ...for instance, when this was "main = pipeline:", it was clear this was just another "create", except using "pipeline" to create the object, and pipeline looks at the section contents. The unnamed sections below it are just like positional parameters (would named sections be ordered? -- I've always wanted ordered class statements, I imagine I'd like to keep order here too) I don't have any attachment to "pipeline", but I think some word is fine in that position, and I don't see why this is a particularly "special" construct (except of course that it should be builtin). Would this be allowed?: main = urlmap from Paste: "/" = static from Paste: document_root = "/home/me/htdocs" > But that doesn't actually seem to help visually, and makes it harder to > write because you have to remember all the time whether you need ":" or > ":=". Maybe this would be better: > > main is: > login wrapper from Paste: > # blah > urlmap from Paste: > match_mode = "longest" > "/" is static > "/blog" is main in "blog.ini" > "/cms" is: > auth wrapper from Paste: > require_role = "admin" > filebrowser from FileBrowser: > document_root = static.document_root While I'm not attached to "pipeline", "is" is about as vague as "in" and "from". > What's nice about this is that now you can unambiguously create a > top-level object in the simple case: > > zapp from Zope: > cfg_file = "site.zcml" > > Without needing to do: > > main is: > zapp from Zope: > cfg_file = "site.zcml" You could do: main = zapp from Zope: cfg_file = "site.zcml" Assuming "main" was a special magic name for the primary application. 
I would certainly assume that reading the config file (even if I'd never seen these config files before). I, for instance, do not like Python's "if __name__=='__main__'" idiom; I think using a conventional name to indicate the primary function of a file is just fine. > Which is a pain, IMO. Although I suppose we could allow: > > main is zapp from Zope: > cfg_file = "site.zcml" > > which isn't too bad. > > >>>> * "name from egg_spec" reads nice on one level, but is vague on >>>> another level. Even if "egg:egg_spec#name" doesn't read well, I >>>> think it is nicely self-describing. >>> >>> >>> Um, wha??? The only difference between the two is that one of them >>> has "egg:" in front of it, which seems a bit redundant to me. That's >>> probably because I assume that in the long run eggs will be so >>> ubiquitous that it really will be redundant to explicitly refer to >>> them as such. :) >> >> >> In paste.deploy config files are full peers to Eggs, and can be used >> anywhere that an egg: reference can be used. I think that's a neat >> feature. I don't want to tack on referencing other config files, like >> a special loader factory or textual inclusion hacks or anything like >> that. >> >> Config files describe applications. Egg entry points describe >> applications. They should be peers. > > > Okay. The "in" syntax I gave above allows that, although I could also > go for only using "from", as long as config URLs are quoted strings. I > also think the strings should be relative or absolute URLs, rather than > filenames. (So that '/' has the same meaning on all platforms.) That > will be something of a pain for Windows users who may need to include > drive letters, but oh well. We can always treat the letters A-Z as a > special "file:" protocol to fix that. :) By URLs, do you just mean that they use URL syntax, URL quoting of filenames, etc? That's fine by me; I normalize \ to / in paste.deploy and run urllib.unquote on the result already.
I'm not sure what to do with \'s; they are dumb and annoying and I hate them, but when they slip into the system it should at least handle them reasonably. While it is slightly annoying to keep track of it, I think it's important that filenames be defined as relative to the config file that they are contained in. The current working directory is useless, and always using absolute filenames makes config files very hard to reuse. >>> Conversely, if I assume that some further description is required, I >>> would want to say "pypi:" or "project:" or something else of that >>> sort, because "egg" isn't the essential nature of the thing; the name >>> is a *project* name, while eggs are an implementation detail. >> >> >> egg: is an access method, just like http: or whatever. It doesn't say >> what the URI describes, just how to find it. > > > Ah, but that's just it. The project name is a URN, not a URL, precisely > because it *doesn't* describe how to locate the resource, it just names > the resource and tells the system to go find it. Well, sure it says how to find it -- load pkg_resources, get the package by name, etc. There's always a "system, go do stuff for me" step, that's how computers work. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Tue Aug 23 19:08:21 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 23 Aug 2005 13:08:21 -0400 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <430B48B9.90607@colorstudy.com> References: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> At 11:03 AM 8/23/2005 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>>It's important to me, and it's not intuitive to me what you envision. So >>>I feel a need to services in action, replacing global configuration. >> >>Using a config service in a factory to get a default argument value: >> def some_app_factory(parent_component, **args): >> config = parent_component.get_service("global_config") >> args.setdefault('someparam', config['someparam']) > >So "parent_component" is some special object created by the config loader... Each component in a "pipeline" receives the previous non-wrapper component in the pipeline as its parent. The top-level parent would be an object whose get_service() always returns None or raises an error or something like that. (I'm being vague because we haven't started nailing down a precise "services" spec and don't want to mix it in with the syntax discussion for now.) 
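The parent chain described above can be sketched in a few lines. This is only an illustration under stated assumptions: no "services" spec exists yet, so RootComponent and ConfigService are hypothetical names, the factory returns its merged settings instead of a real WSGI application, and the top-level parent is assumed to return None rather than raise.

```python
class RootComponent:
    """Hypothetical top-level parent: it provides no services at all."""

    def get_service(self, key):
        return None


class ConfigService:
    """Hypothetical non-wrapper component that answers 'global_config'
    and defers every other lookup to its parent."""

    def __init__(self, parent_component, **args):
        self._parent = parent_component
        # Settings from more global parents are inherited, local ones win.
        inherited = parent_component.get_service("global_config") or {}
        self._data = {**inherited, **args}

    def get_service(self, key):
        if key == "global_config":
            return self._data
        return self._parent.get_service(key)


def some_app_factory(parent_component, **args):
    # Use the config service only to fill in a default argument value.
    config = parent_component.get_service("global_config") or {}
    if "someparam" in config:
        args.setdefault("someparam", config["someparam"])
    return args  # stand-in for the WSGI application a real factory would build


root = RootComponent()
site = ConfigService(root, someparam="foo", admin_email="admin@example.com")
local = ConfigService(site, someparam="bar")
```

Here a factory whose parent is `local` sees the nearer definition of someparam, one whose parent is `site` sees the more global one, and an explicitly passed argument is never overridden.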
>>Registering a config service (old syntax):
>>     [globalconfigservice from SomeEgg]
>>     someparam = "foo"
>
>...and I assume in this case globalconfigservice does something along the
>lines:
>
>def globalconfigservice(parent_component, next, **args):
>     config = parent_component.get_service('global_config').copy()
>     config.update(args)
>     component = Component(parent_component)
>     component.save_service('global_config', config)
>     return next(component)
>
>Obviously I'm making up the component interface here.

I was thinking something more like this:

     class globalconfigservice:
         def __init__(self, parent_component, **args):
             self._parent = parent_component
             self._data = args

         def get_service(self, key):
             if key=='global_config':
                 return self
             return self._parent.get_service(key)

         def __getitem__(self,key):
             try:
                 return self._data[key]
             except KeyError:
                 previous = self._parent.get_service('global_config')
                 if previous is None:
                     raise
                 result = self._data[key] = previous[key]
                 return result

This isn't a wrapper, so it doesn't know about the "next" component, and
doesn't need to.  Parent components can be shared by multiple children.
Wrappers, on the other hand, transform their child, and are not considered
a parent component.

>>     [...next component in the stack...]
>>The config service would respond to 'get_service("global_config")' by
>>returning self.
>>The idea is that when you chain non-wrapper components in a pipeline,
>>each one gets the previous component as its "parent component", so you
>>can "acquire" services from your parents.  Components nearer to you (i.e.
>>more local) can override more global service definitions.
>
>OK, well now I'm a bit confused... is globalconfigservice a wrapper?  I
>assume globalconfigservice can't modify the parent_component it is passed,
>and has to create a new one?

No.
The globalconfigservice *becomes* the parent_component of the components that follow it, until another non-wrapper component is defined (which then becomes the parent of those that follow it, and so on). >Hmm... it would be nice to allow configuration filenames to be >variables. Though "in" and "from" don't scream "config file" and "egg" to >me -- they are both equally vague terms. I'd rather see "in egg" and "in >file". I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', since these two syntaxes can cover everything you or I have thus far imagined. >>An interesting question is whether you should be able to refer to nested >>definitions as factory prototypes (ala your auth2/auth) or whether only >>top-level names should be usable. For example in this: >> foo: bar from baz >> spam: >> foo: snickety from lemon >> scuzz: foo >> sprim: >> thingy: foo >>Does "scuzz: foo" refer to the inner foo or the outer foo? What about >>"thingy: foo"? >>I'm inclined to say that both refer to the spam: foo rather than the >>outermost "foo". i.e., more or less the same rules as Python scopes. > >I agree. Will spam.foo be an unambiguous representation? It seems like >it should be. Would there be a global object, like globals.foo? I guess there could be, but then I lean towards making a file be one object by default. If you want a named top-level, you could do: main from: bar is squidge from spim main is bar: foo = "whee" That is, we could allow targetless "from" to promote a name from a new child context. >>One minor problem with this syntax overall, though, is that it's a bit >>context-dependent. Whether "foo:" means "define foo" or "create a foo" >>is just a matter of alternating layers. It would be better if the syntax >>were less ambiguous, e.g.: > >I don't see the distinction between "define" and "create". By define I mean "bind the following to the name foo", and by create I mean "create an instance using the foo factory". 
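The scoping rule proposed above for nested definitions (inner names shadow outer ones, more or less as in Python) can be sketched as a chain of scopes. This is a rough illustration only: the Scope class and its methods are hypothetical, and the factory expressions are kept as plain strings rather than parsed.

```python
class Scope:
    """Hypothetical name table for one nesting level of a config file."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def define(self, name, value):
        # "define": bind the following to the name
        self.names[name] = value

    def child(self):
        return Scope(parent=self)

    def lookup(self, name):
        # Walk outward from the innermost scope until the name is found.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise KeyError(name)


top = Scope()
top.define("foo", "bar from baz")

spam = top.child()
spam.define("foo", "snickety from lemon")

sprim = spam.child()  # defines no "foo" of its own
```

With this rule both "scuzz: foo" (looked up in spam) and "thingy: foo" (looked up in sprim) resolve to the foo defined inside spam, while a lookup at the top level still finds the outermost foo.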
> By this distinction do you mean that pieces of the loading process > lazy? Can all parts be lazy? (I.e., the config file defines named > factories, the body of sections isn't evaluated until those factories are > invoked) No; I was strictly speaking of the context-specific nature of that specific syntax, because it alternates layers of defining names and invoking factories, such that a given snippet of syntax can't be independently understood by a reader. >> main := >> login wrapper from Paste: >> # blah >> urlmap from Paste: >> "/" := static >> "/blog" := main in "blog.ini" >> "/cms" := >> auth wrapper from Paste: >> require_role = "admin" >> filebrowser from FileBrowser: >> document_root = static.document_root > >...for instance, when this was "main = pipeline:", it was clear this was >just another "create", except using "pipeline" to create the object, and >pipeline looks at the section contents. The unnamed sections below it are >just like positional parameters (would named sections be ordered? -- I've >always wanted ordered class statements, I imagine I'd like to keep order >here too) I don't really want them to be positional parameters, I want them to stack. If pipelines were rare, I'd just nest them and use e.g. a 'next' keyword. However, nested pipelines mean you have to indent everything every time you add a new wrapper, which would be like having to do "else: if:" instead of "elif:". >I don't have any attachment to "pipeline", but I think some word is fine >in that position, and I don't see why this is a particularly "special" >construct (except of course that it should be builtin). Would this be >allowed?: > >main = urlmap from Paste: > "/" = static from Paste: > document_root = "/home/me/htdocs" This syntax is ambiguous, because you don't know if the thing after the '=' should be parsed as a Python expression or as a constructor expression, at least not without significant parser lookahead. 
Significant lookahead isn't that good for a human reader, either. That's why I think we need syntax to distinguish "object definition" from "value assignment". >>But that doesn't actually seem to help visually, and makes it harder to >>write because you have to remember all the time whether you need ":" or >>":=". Maybe this would be better: >> main is: >> login wrapper from Paste: >> # blah >> urlmap from Paste: >> match_mode = "longest" >> "/" is static >> "/blog" is main in "blog.ini" >> "/cms" is: >> auth wrapper from Paste: >> require_role = "admin" >> filebrowser from FileBrowser: >> document_root = static.document_root > >While I'm not attached to "pipeline", "is" is about as vague as "in" and >"from". Well, I'm fine with dropping "in", so we would have only two special keywords, "is" and "from", and they're not interchangeable, so there's a minimum of ambiguity. Also, I chose "from" because of the similarity to importing, and "is" implies object identity as well as definition (e.g. "the definition of main is..."). (One of the things I'm trying to do with this syntax, btw, is stick with Python's tokens and keywords, so that the tokenize module can do most of the heavy lifting, and I'd also prefer we didn't introduce new reserved words that aren't keywords in Python.) >Assuming "main" was a special magic name for the primary application. I >would certainly assume that reading the config file (even I'd never seen >these config files before). I, for instance, do not like Python's "if >__name__=='__main__'" idiom; I think using a conventional name to indicate >the primary function of a file is just fine. Well, __name__=='__main__' doesn't apply here. I see this as the difference between def statements and regular statements in a module. Function bodies aren't executed unless they're used, so it seems wrong to me to have a def main. 
If the magic name were __main__ I could accept it more, except for the fact that it would then highlight the point that if the idiom is common enough to need a magic name, then it's common enough to warrant a way of doing it without a name! >>Okay. The "in" syntax I gave above allows that, although I could also go >>for only using "from", as long as config URLs are quoted strings. I also >>think the strings should be relative or absolute URLs, rather than >>filenames. (So that '/' has the same meaning on all platforms.) That >>will be something of a pain for Windows users who may need to include >>drive letters, but oh well. We can always treat the letters A-Z as a >>special "file:" protocol to fix that. :) > >By URLs, do you just mean that they use URL syntax, URL quoting of >filenames, etc? Yes. And that relative URLs are interpreted as relative to the URL that was used to load the file they're in. But also that absolute URLs are allowed, which may include application or framework/specific URLs, and the loading facility should be hookable to do the actual URL joining and retrieving. ZConfig works like this, and PEAK hooks into it so that all of PEAK's special urls like "pkgfile:" and such can be used. I've definitely got an eye on using this format we're discussing as a nice schema-free alternative to ZConfig. > That's fine by me; I normalize \ to / in paste.deploy and run > urllib.unquote on the result already. I'm not sure what to do with \'s; > they are dumb and annoying and I hate them, but when they slip into the > system it should at least handle them reasonably. I think \ should have its normal meaning in a string literal, unless a "raw" literal is used. >While it is slightly annoying to keep track of it, I think it's important >that filenames be defined as relative to the config file that they are >contained in. The current working directory is useless, and always using >absolute filenames makes config files very hard to reuse. 
Agreed; they should be interpreted as URLs relative to the current file. ZConfig (and PEAK's wrapping of it) both use this approach and it works well. >>>>Conversely, if I assume that some further description is required, I >>>>would want to say "pypi:" or "project:" or something else of that sort, >>>>because "egg" isn't the essential nature of the thing; the name is a >>>>*project* name, while eggs are an implementation detail. >>> >>>egg: is an access method, just like http: or whatever. It doesn't say >>>what the URI describes, just how to find it. >> >>Ah, but that's just it. The project name is a URN, not a URL, precisely >>because it *doesn't* describe how to locate the resource, it just names >>the resource and tells the system to go find it. > >Well, sure it says how to find it -- load pkg_resources, get the package >by name, etc. There's always a "system, go do stuff for me" step, that's >how computers work. I'm referring here to the technical meaning of a "naming" system versus an "addressing" system. An addressing system identifies a canonical "naming authority" that provides global uniqueness, whereas a "naming" system only implies the context in which the name may be understood. You can read up the RFCs on URNs vs. URLs (which are both subtypes of URI), or you can read up on JNDI, LDAP, x.500 and other "naming" services if you don't believe me. An 'egg:' URI would be a URN, not a URL, and the 'egg' makes no sense in either case, because an egg is a resource *type*, not a naming or addressing scheme. Thus, if I were to create a URI scheme for eggs, I would use a name like 'pypi:' or 'py-project:' or something like that, to denote the naming scheme. 
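The rule agreed on above, that a config reference is resolved relative to the URL of the file containing it, maps directly onto standard URL joining. A minimal sketch, assuming modern Python's urllib.parse (the code in this thread predates it); resolve_config_url and the example paths are made up for illustration, and a hookable loader for framework-specific schemes is not shown.

```python
from urllib.parse import urljoin


def resolve_config_url(base_url, reference):
    """Resolve a reference found inside the config file loaded from base_url.

    Relative references are joined against the containing file's URL,
    never against the current working directory; absolute URLs pass
    through unchanged.
    """
    return urljoin(base_url, reference)


base = "file:///etc/myapp/site.ini"  # hypothetical location of the current file

sibling = resolve_config_url(base, "blog.ini")
shared = resolve_config_url(base, "../shared/auth.ini")
remote = resolve_config_url(base, "http://example.com/conf/cms.ini")
```

Because only the base URL changes, the same config file can be moved to another directory or host without editing the references inside it.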
From ianb at colorstudy.com Tue Aug 23 22:16:00 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 23 Aug 2005 15:16:00 -0500 Subject: [Web-SIG] [Paste] Re: PasteDeploy 0.1 In-Reply-To: References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> Message-ID: <430B8400.4050003@colorstudy.com> Michal Wallace wrote: > I'm working on a similar problem, and I really like your approach > here, but I feel like I'm going to have to reinvent the wheel for > my particular framework, because it's RESTlike. See, in addition > to each URL, I'd like to be able to dispatch based on the HTTP > method (GET,PUT,POST,DELETE...) > > What would be nice (for me) is if you could do something like: > > > GET / = config:static_root.ini > POST /cms = config:filebrowser.ini > * /blog = config:blog.ini This shouldn't be a problem (in paste.deploy or the alternatives we're discussing) -- the "urlmap" I refer to is just (not very complicated) Python code. You could do the same thing dispatching on HTTP methods. With paste.deploy you'd do something like: [composit:main] use = egg:MyFramework#httpdispatch GET = config:static_root.ini ... ... If you want to do both at once (dispatch both on path like urlmap, and on HTTP method) you'd have to make something like urlmap that also keeps track of methods, and use "GET / = ...". Potentially urlmap could support all of these (maybe not all as egg:Paste#urlmap, but with the same basic code). Right now it matches based on path prefix and domain, and I've meant to add ports, and HTTP method would be easy enough. 
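As a rough sketch of what such a method dispatcher could look like as plain WSGI code: method_dispatch, make_app and call are hypothetical helpers written for this example, not part of Paste or paste.deploy, and answering unknown methods with 405 is an assumption of the sketch.

```python
def method_dispatch(apps, default=None):
    """Return a WSGI app that picks a sub-application by HTTP method.

    apps maps method names ("GET", "POST", ...) to WSGI applications;
    default, if given, plays the role of the "*" entry.
    """
    def application(environ, start_response):
        app = apps.get(environ["REQUEST_METHOD"], default)
        if app is None:
            allowed = ", ".join(sorted(apps))
            start_response("405 Method Not Allowed",
                           [("Allow", allowed), ("Content-Type", "text/plain")])
            return [b"Method not allowed\n"]
        return app(environ, start_response)
    return application


def make_app(body):
    # Trivial stand-in for the applications the config files would build.
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]
    return app


def call(app, method):
    # Tiny test harness: invoke the app with a minimal environ.
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"REQUEST_METHOD": method}, start_response))
    return captured["status"], body


dispatcher = method_dispatch({"GET": make_app(b"static"),
                              "POST": make_app(b"cms")})
```

Dispatching on path and method at once would mean nesting this inside (or around) something like urlmap, or writing one dispatcher that keys on both.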
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From pje at telecommunity.com Tue Aug 23 22:27:05 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 16:27:05 -0400
Subject: [Web-SIG] [Paste] Re: PasteDeploy 0.1
Message-ID: <5.1.1.6.0.20050823162659.02024808@mail.telecommunity.com>

At 04:05 PM 8/23/2005 -0400, Michal Wallace wrote:
>What would be nice (for me) is if you could do something like:
>
>    GET /        = config:static_root.ini
>    POST /cms    = config:filebrowser.ini
>    * /blog      = config:blog.ini
>
>Where "*" indicates "any HTTP method"... And of
>course "*" could be the default, so if you don't
>care about methods people could just use the
>existing syntax.

This is accommodated fairly easily within the syntax currently being
discussed:

    methodmap from Paste:
        GET is urlmap from Paste:
            "/" is main from "static_root.ini"
        POST is urlmap from Paste:
            "/cms" is main from "filebrowser.ini"
        "*" is urlmap from Paste:
            "/blog" is main from "blog.ini"

Although this might also be spelled:

    main from:
        byURL is url_dispatcher from Paste
        byMethod is method_dispatcher from Paste
        main is byMethod:
            GET is byURL:
                "/" is main from "static_root.ini"
            POST is byURL:
                "/cms" is main from "filebrowser.ini"
            "*" is byURL:
                "/blog" is main from "blog.ini"

From ianb at colorstudy.com Tue Aug 23 22:37:17 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Aug 2005 15:37:17 -0500
Subject: [Web-SIG] [Paste] Re: PasteDeploy 0.1
In-Reply-To:
References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com>
Message-ID: <430B88FD.3000702@colorstudy.com>

Michal Wallace wrote:
>    GET /        = config:static_root.ini
>    POST /cms    = config:filebrowser.ini
>    * /blog      = config:blog.ini

One
interesting thing about this sort of thing is, REST or no, you probably aren't going to do method-based dispatch on a server level, since it's hard to actually partition applications that way. For example, you could almost put a transparent webdav layer on top of something else, except GET is overloaded, and you'd actually end up with some user-agent-based dispatch, which doesn't seem particularly RESTful. But I can imagine using this deployment format as an internal format when setting up your otherwise-encapsulated application. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Tue Aug 23 22:41:27 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 23 Aug 2005 15:41:27 -0500 Subject: [Web-SIG] [Paste] Re: PasteDeploy 0.1 In-Reply-To: References: <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> Message-ID: <430B89F7.4070407@colorstudy.com> Whoops, hit the send key before I had finished the response... Michal Wallace wrote: > GET / = config:static_root.ini > POST /cms = config:filebrowser.ini > * /blog = config:blog.ini One interesting thing about this sort of thing is, REST or no, you probably aren't going to do method-based dispatch on a server level, since it's hard to actually partition applications that way. For example, you could almost put a transparent webdav layer on top of something else, except GET is overloaded, and you'd actually end up with some user-agent-based dispatch, which doesn't seem particularly RESTful. But I can imagine using this deployment format as an internal format when setting up your otherwise-encapsulated application. 
This is the way some of the regex-based dispatching frameworks work (like Django), or something like the Rails Routes port could work. These require configuration, and the configuration we're discussing here is actually pretty reasonable for those kinds of systems. When you are doing that, I'd guess you'd put the configuration in your package (in the .egg-info directory or elsewhere), and then create a little shell of a function that loads the application described by the configuration file. Anyway, another use case to keep in mind; I'd thought about configuration files contained inside distributions, but I hadn't actually thought of a good reason for it until now. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Wed Aug 24 00:27:39 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 23 Aug 2005 17:27:39 -0500 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> References: <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> Message-ID: <430BA2DB.6070003@colorstudy.com> Phillip J. 
Eby wrote: > I was thinking something more like this: > > class globalconfigservice: > def __init__(self, parent_component, **args): > self._parent = parent_component > self._data = args > > def get_service(self, key): > if key=='global_config': > return self > return self._parent.get_service(key) > > def __getitem__(self,key): > try: > return self._data[key] > except KeyError: > previous = self._parent.get_service('global_config') > if previous is None: > raise > result = self._data[key] = previous[key] > return result > > This isn't a wrapper, so it doesn't know about the "next" component, and > doesn't need to. Parent components can be shared by multiple children. > Wrappers, on the other hand, transform their child, and are not > considered a parent component. So services (aka components) are just a objects with .get_service(key) methods? Is there any other API or semantics implied? >> OK, well now I'm a bit confused... is globalconfigservice a wrapper? >> I assume globalconfigservice can't modify the parent_component it is >> passed, and has to create a new one? > > > No. The globalconfigservice *becomes* the parent_component of the > components that follow it, until another non-wrapper component is > defined (which then becomes the parent of those that follow it, and so on). Does the configuration somehow indicate that something produces a component, as opposed to producing the object-in-question (WSGI application for us)? I'm not clear how an application, an application wrapper, and a component wrapper are distinguished. >> Hmm... it would be nice to allow configuration filenames to be >> variables. Though "in" and "from" don't scream "config file" and >> "egg" to me -- they are both equally vague terms. I'd rather see "in >> egg" and "in file". > > > I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', > since these two syntaxes can cover everything you or I have thus far > imagined. What exactly do you envision for config URLs? 
>>>     main :=
>>>         login wrapper from Paste:
>>>             # blah
>>>         urlmap from Paste:
>>>             "/" := static
>>>             "/blog" := main in "blog.ini"
>>>             "/cms" :=
>>>                 auth wrapper from Paste:
>>>                     require_role = "admin"
>>>                 filebrowser from FileBrowser:
>>>                     document_root = static.document_root
>>
>>
>> ...for instance, when this was "main = pipeline:", it was clear this
>> was just another "create", except using "pipeline" to create the
>> object, and pipeline looks at the section contents.  The unnamed
>> sections below it are just like positional parameters (would named
>> sections be ordered? -- I've always wanted ordered class statements, I
>> imagine I'd like to keep order here too)
>
>
> I don't really want them to be positional parameters, I want them to
> stack.  If pipelines were rare, I'd just nest them and use e.g. a 'next'
> keyword.  However, nested pipelines mean you have to indent everything
> every time you add a new wrapper, which would be like having to do
> "else: if:" instead of "elif:".

A stack and positional parameters are nearly the same thing...

def pipeline(*args):
    app = args[-1]
    wrappers = list(args[:-1])
    wrappers.reverse()
    for wrapper in wrappers:
        app = wrapper(app)
    return app

...?  Throw in a couple other arguments and whatnot for keywords or
whatever, it doesn't matter.

Another case where I would use positional parameters to do something
different would be a cascading dispatcher, like:

main is cascade from Paste:
    static from Paste:
        document_root = "/..."
    blog from MyBlog:
        ...
    catch = 404

I can phrase the same thing in other ways (in paste.deploy it uses the
sorted keys that start with "app"), but it seems like an unexpected bonus
if the format is general enough to do this.

>> I don't have any attachment to "pipeline", but I think some word is
>> fine in that position, and I don't see why this is a particularly
>> "special" construct (except of course that it should be builtin).
>> Would this be allowed?: >> >> main = urlmap from Paste: >> "/" = static from Paste: >> document_root = "/home/me/htdocs" > > > This syntax is ambiguous, because you don't know if the thing after the > '=' should be parsed as a Python expression or as a constructor > expression, at least not without significant parser lookahead. > Significant lookahead isn't that good for a human reader, either. > That's why I think we need syntax to distinguish "object definition" > from "value assignment". OK, I see the issue now. I guess "is" is fine; I think using different punctuation like := is much too subtle. >>> But that doesn't actually seem to help visually, and makes it harder >>> to write because you have to remember all the time whether you need >>> ":" or ":=". Maybe this would be better: >>> main is: >>> login wrapper from Paste: >>> # blah >>> urlmap from Paste: >>> match_mode = "longest" >>> "/" is static >>> "/blog" is main in "blog.ini" >>> "/cms" is: >>> auth wrapper from Paste: >>> require_role = "admin" >>> filebrowser from FileBrowser: >>> document_root = static.document_root >> >> >> While I'm not attached to "pipeline", "is" is about as vague as "in" >> and "from". > > > Well, I'm fine with dropping "in", so we would have only two special > keywords, "is" and "from", and they're not interchangeable, so there's a > minimum of ambiguity. Also, I chose "from" because of the similarity to > importing, and "is" implies object identity as well as definition (e.g. > "the definition of main is..."). If you really want similarities, invert "is" and call it "as". "foo from AnEgg as main:". But I think that's backwards, so I wouldn't really advocate for it. > (One of the things I'm trying to do with this syntax, btw, is stick with > Python's tokens and keywords, so that the tokenize module can do most of > the heavy lifting, and I'd also prefer we didn't introduce new reserved > words that aren't keywords in Python.) 
> > >> Assuming "main" was a special magic name for the primary application. >> I would certainly assume that reading the config file (even I'd never >> seen these config files before). I, for instance, do not like >> Python's "if __name__=='__main__'" idiom; I think using a conventional >> name to indicate the primary function of a file is just fine. > > > Well, __name__=='__main__' doesn't apply here. I see this as the > difference between def statements and regular statements in a module. > Function bodies aren't executed unless they're used, so it seems wrong > to me to have a def main. If the magic name were __main__ I could > accept it more, except for the fact that it would then highlight the > point that if the idiom is common enough to need a magic name, then it's > common enough to warrant a way of doing it without a name! I think we disagree about one-app-per-file, and perhaps you also have a notion that doesn't come out in all of your examples that you want a stack represented at the top-level of the file...? That is, like: auth from Paste: ... # wraps... session from Session: ... # wraps main from MyApp: ... If that's what you are getting at, I *really* don't like that. Config files don't use top-level ordering often at all. The few cases where order matters, it's purely as priority for overlapping options (like rewrite rules). And those few cases suck anyway because of the ambiguity of overlap, so it's kind of the exception that proves the rule. I'm okay with ordering *under* the top-level names, like: main is: auth from Paste: ... session from Session: ... ... it doesn't appeal to me, but it doesn't bother me. I don't want "main" (or worse "__main__") to be special, just to be conventional, like as a default to a keyword argument. It's an ugly wart that every (good!) script in Python looks like: def main(): ... 
if __name__ == '__main__': main() That's nothing but stupid boilerplate, because otherwise you can't get at that function if you put everything in the "if" statement. In the same way, I want to be able to be able to pick pieces out of a configuration file without creating the main application, and I want to be able to look in the main application without creating it (since it's mostly opaque once it's been created). __main__ is completely unnecessary, as "main" seems quite special on its own without scary underscores. It's a very natural name, and one that should be intuitive to anyone reading the file. That it has a name shows that it is a distinct entity, but a series of unnamed entries in the config file doesn't imply that in the same way. >> That's fine by me; I normalize \ to / in paste.deploy and run >> urllib.unquote on the result already. I'm not sure what to do with >> \'s; they are dumb and annoying and I hate them, but when they slip >> into the system it should at least handle them reasonably. > > > I think \ should have its normal meaning in a string literal, unless a > "raw" literal is used. Hmm... it doesn't really matter to me, since I never use Windows. But whenever I gaze upon Windows filenames in Python they hurt my eyes. I agree anything in "" should be a string literal, with all the string literal rules. Maybe these don't have to be string literals. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Wed Aug 24 01:32:39 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 23 Aug 2005 19:32:39 -0400 Subject: [Web-SIG] PasteDeploy 0.1 In-Reply-To: <430BA2DB.6070003@colorstudy.com> References: <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <64ddb72c050821233445da5238@mail.gmail.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> At 05:27 PM 8/23/2005 -0500, Ian Bicking wrote: >So services (aka components) are just a objects with .get_service(key) >methods? Is there any other API or semantics implied? Not at the handwavy level we're currently discussing them with, no. >>No. The globalconfigservice *becomes* the parent_component of the >>components that follow it, until another non-wrapper component is defined >>(which then becomes the parent of those that follow it, and so on). > >Does the configuration somehow indicate that something produces a >component, as opposed to producing the object-in-question (WSGI >application for us)? I'm not clear how an application, an application >wrapper, and a component wrapper are distinguished. In the syntax I've been using to date, "wrapper" simply indicates that the component wishes to receive the components following it as an argument, replacing them with the wrapper's return value. All non-wrappers are just components. As I've been thinking through the implementation some more, I've realized that the "wrapper" keyword isn't really needed, if the construction responsibilities are divided a bit differently than I first had in mind. 
More on that in a later post. >>I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', >>since these two syntaxes can cover everything you or I have thus far imagined. > >What exactly do you envision for config URLs? In the simple case, they should just be relative URLs. >Another case where I would use positional parameters to do something >different would be a cascading dispatcher, like: > >main is cascade from Paste: > static from Paste: > document_root = "/..." > blog from MyBlog: > ... > catch = 404 This syntax is ambiguous at 1-token lookahead, because you can't tell up front whether "static" is supposed to be a name that's being assigned (and therefore followed by "is" or "="), or whether it's a factory name (in which case it may be followed by "." and more identifiers, possibly followed by "from"). There might be a way to disambiguate it by complicating the grammar, I suppose, but I'm not sure I like it. The way I've currently conceived of the grammar is that you can have either assignment (namespace) scopes or sequence scopes. In my way of thinking, the top-level is a sequence scope, and everything else is a namespace scope, unless you introduce a sequence scope using "is:". Thus, I see your example above as simply beginning "main is:", and then the contents can be a sequence. >I think we disagree about one-app-per-file, and perhaps you also have a >notion that doesn't come out in all of your examples that you want a stack >represented at the top-level of the file...? That is, like: > > auth from Paste: > ... > # wraps... > session from Session: > ... > # wraps > main from MyApp: > ... > > >If that's what you are getting at, I *really* don't like that. Config >files don't use top-level ordering often at all. That depends quite a lot on what the configuration file does, and its format. However, if you would like to make it not be that way, all you have to do is: main from: # named stuff here My reasoning for this is as follows. 
In the simplest possible case, a user should be able to deploy an application using only this, as their entire file:

    app from SomeCoolApp

In other words, the above is the "hello world" of this language. Your variation would be:

    main is app from SomeCoolApp

Not a lot of difference at this initial level, but now let's add a filter. My way:

    login from Paste
    app from SomeCoolApp

Your way:

    main is:
        login from Paste
        app from SomeCoolApp

The big difference between your take and my take on this is that I'm viewing a file as specifying an object, while you're viewing it as defining a namespace of objects.

> The few cases where order matters, it's purely as priority for
> overlapping options (like rewrite rules). And those few cases suck
> anyway because of the ambiguity of overlap, so it's kind of the exception
> that proves the rule.

But pipelines are sequences too.

>That's nothing but stupid boilerplate, because otherwise you can't get at
>that function if you put everything in the "if" statement. In the same
>way, I want to be able to be able to pick pieces out of a configuration
>file without creating the main application, and I want to be able to look
>in the main application without creating it (since it's mostly opaque once
>it's been created).

You're making the assumption that what you "get" is the created object, while I'm assuming that what you get is a partially-applied factory, with properties that return configuration values or other factories. You still have to call the factory to create the objects.

IOW, the way I see it is that you parse a configuration file by providing some scope-and-context information, and you get a factory object back. If the factory object is a namespace, then you can access its properties to get values or child factories. So, to create a library configuration file, I'd assume something like:

    some_factory:
        foo is blah:
            ...
        bar is feh:
            ...
What 'some_factory' actually creates is unimportant if it never gets called, and if you're just pulling pieces out of it in another configuration file, it won't get called.

>__main__ is completely unnecessary, as "main" seems quite special on its
>own without scary underscores. It's a very natural name, and one that
>should be intuitive to anyone reading the file. That it has a name shows
>that it is a distinct entity, but a series of unnamed entries in the
>config file doesn't imply that in the same way.

Yeah, it's just that it seems weird to me to have URLs represent namespaces that contain objects, but not be able to have URLs refer to objects! That seems downright strange.

It also seems to me that the common case will be to define a single pipeline in a file (often with just a single component!), and that making the library developer's job easier (by avoiding the 'some_factory:' wrapper at the top level) makes the deployer's job harder (by requiring a "main is:" wrapper). That pretty much seems like the tradeoff; either the multi-config developer has to do an extra indent, or else the deployer does. My inclination is to favor the deployer.

>Maybe these don't have to be string literals.

They do if we want to keep it compatible with Python's tokenizer, and I definitely want that. For one thing, it potentially allows implementing a pgen-based C parser for this.

Speaking of parsers, here's my current idea of the grammar:

    sequence ::= object+
    object   ::= qname source? (suite | NEWLINE)
    source   ::= "from" (STRING | project)?
    suite    ::= ":" INDENT assign+ DEDENT
    assign   ::= (NAME | STRING) ( ("=" testlist NEWLINE) | ("is" objects) )
    objects  ::= object | ":" INDENT sequence DEDENT
    qname    ::= NAME ("." NAME)*

    project  ::= NAME ("-" NAME)* versions? extras?
    versions ::= cmpop version ("," cmpop version)* ","?
    version  ::= INT | FLOAT | STRING   # maybe just string?
    cmpop    ::= "<" | "<=" | "==" | "!=" | ">=" | ">"
    extras   ::= "[" NAME ("," NAME)* ","? "]"

As you can see, the core syntax is just seven productions, not counting the five for egg project requirements and the "testlist" productions from the Python expression grammar. So, it's pretty darn simple as languages go.

My rough concept of the semantics is that suites represent functions, and definitions are a cross between setting function attributes on the function defined by the enclosing suite, and setting a default value for a keyword argument within that enclosing function. i.e.:

    foo:
        bar is baz:
            spam = 23

is roughly equivalent to:

    def __main__(**kw):
        kw.setdefault('bar', __main__.bar())

    def bar(**kw):
        kw.setdefault('spam', 23)
        return baz(**kw)

    bar.spam = 23
    __main__.bar = bar

For sequences of definitions, you get a function whose attributes come from the namespace of the last suite in the sequence.

This is all *rough* semantics, mind you; it will almost certainly *not* be implemented using Python functions, because of the need to manage many levels of nested scopes, and the calling signatures won't exactly match this either. I'm just giving this "as functions" sketch to give an idea of why the whole thing can readily be introspected as data if you want it to be.

From renesd at gmail.com Wed Aug 24 02:17:30 2005
From: renesd at gmail.com (Rene Dudfield)
Date: Wed, 24 Aug 2005 10:17:30 +1000
Subject: [Web-SIG] cusom config files.
was (PasteDeploy 0.1) In-Reply-To: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> References: <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <430BA2DB.6070003@colorstudy.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> Message-ID: <64ddb72c05082317175d69d49@mail.gmail.com> Hey, are custom config files with custom parsers needed or wanted for configuration? Would not a .ini, python, xml, sql db, file system, or even apache style config file be better? If a common format is used then: 1) less code to maintain. 2) less to learn/document. Cheers, From pje at telecommunity.com Wed Aug 24 02:46:39 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 23 Aug 2005 20:46:39 -0400 Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1) In-Reply-To: <64ddb72c05082317175d69d49@mail.gmail.com> References: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <430BA2DB.6070003@colorstudy.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> At 10:17 AM 8/24/2005 +1000, Rene Dudfield wrote: >Hey, > >are custom config files with custom parsers needed or wanted for >configuration? > >Would not a .ini, python, xml, sql db, file system, or even apache >style config file be better? 
> >If a common format is used then: >1) less code to maintain. >2) less to learn/document. That would only be true if there were a common format that worked. The main problem is that all of those formats simply push the complexity from the syntax to the semantic level! One previous proposal was for an .ini variant that could handle pipelines easily, but could not do URL dispatch without awkward hacks to the .ini syntax. .ini files are extremely difficult to use for any kind of nesting. Python files are possible, and that approach has been discussed a bit, but the full Python language may be a little overpowered for configuration, while at the same time not offering convenient constructs for simple things. XML is too verbose, redundant, and strict, and simply pushes the issue to defining the XML schema involved. Also, the very use of XML tends to attract XML geeks who then nitpick about whether you're using XSD or DTDs properly and why you shouldn't use attributes for data, blah blah blah. ;) I'm not sure what you mean by SQL DB, but if you mean putting the configuration in a database, I don't see why that would be useful or good. Similarly, I don't know what you mean by "file system". Apache-style configuration (like ZConfig) can also get very ugly very quickly when nesting gets involved, and it has no built-in way to reference items within the configuration, so like XML and .ini files it forces you to invent your own reference semantics layered atop the basic syntax. (You didn't mention YAML, but I'll point out anyway that it has way too many subsyntaxes, punctuation tricks and suchlike to be easy for humans to write, while not expanding on the capabilities of XML that much.) Really the problem is that of the basic possible syntaxes, Python and XML are the only ones that come close to having adequate expressive power. 
XML falls short of being able to implement the more complex use cases without creating some sort of mini-programming language within XML, and Python requires verbose procedural constructs to create declarative hierarchies that would be easy in XML. Thus, the proposal that I've been fronting at the moment is actually a hybrid of XML-like structure and Python-like language characteristics. If it fails, I'm not sure what I'd fall back to. The nice thing about this "Python data language" is that I can see a lot of applications besides web stuff. For example, Chandler's UI really wants to have a more declarative format than can easily be done in pure Python, but a more computationally-flexible format than can easily be done in XML. I can basically see this "data language" being used for a lot of things that otherwise would be done crudely with .ini, .xml, ZConfig, or one of the other "standard" formats. Consider, for example, the grotesque hack of .ini syntax used by the "logging" module to define loggers, handlers, and filters -- and then consider what it could look like if it used this "data language" instead. I would say that there is definitely a real need for a declarative Python object definition syntax that supports nesting and internal references, and so if we can come up with something good, it can and should *become* a standard for such purposes, well beyond the scope of its initial mission of being a WSGI deployment syntax. From renesd at gmail.com Wed Aug 24 03:14:30 2005 From: renesd at gmail.com (Rene Dudfield) Date: Wed, 24 Aug 2005 11:14:30 +1000 Subject: [Web-SIG] cusom config files. 
was (PasteDeploy 0.1) In-Reply-To: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> References: <4308E29F.6040607@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <430BA2DB.6070003@colorstudy.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <64ddb72c05082317175d69d49@mail.gmail.com> <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> Message-ID: <64ddb72c05082318145a9cafce@mail.gmail.com> On 8/24/05, Phillip J. Eby wrote: > > I'm not sure what you mean by SQL DB, but if you mean putting the > configuration in a database, I don't see why that would be useful or > good. Similarly, I don't know what you mean by "file system". > By sql db I meant storing configuration in a database. Which has many advantages including scaling, searching, ACID, permissions etc etc. By filesystem I mean djb, /proc/ and others style. An example for virtual hosts might be: virtual_hosts/ virtual_hosts/1/ virtual_hosts/1/name virtual_hosts/1/ip_address virtual_hosts/1/port virtual_hosts/1/directory virtual_hosts/1/access_log_path Good luck with your configurationing! From michal at sabren.com Wed Aug 24 05:46:57 2005 From: michal at sabren.com (Michal Wallace) Date: Tue, 23 Aug 2005 23:46:57 -0400 (EDT) Subject: [Web-SIG] cusom config files. 
was (PasteDeploy 0.1) In-Reply-To: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> References: <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <430BA2DB.6070003@colorstudy.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> Message-ID: On Tue, 23 Aug 2005, Phillip J. Eby wrote: > I would say that there is definitely a real need for a declarative Python > object definition syntax that supports nesting and internal references, and > so if we can come up with something good, it can and should *become* a > standard for such purposes, well beyond the scope of its initial mission of > being a WSGI deployment syntax. Well, if that's all you want to do, then why not just add some syntactic sugar to pickle? Sincerely, Michal J Wallace Sabren Enterprises, Inc. ------------------------------------- contact: michal at sabren.com hosting: http://www.cornerhost.com/ my site: http://www.withoutane.com/ ------------------------------------- From pje at telecommunity.com Wed Aug 24 07:55:02 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 24 Aug 2005 01:55:02 -0400 Subject: [Web-SIG] cusom config files. 
was (PasteDeploy 0.1) In-Reply-To: References: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <430BA2DB.6070003@colorstudy.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050824015409.01b1eb18@mail.telecommunity.com> At 11:46 PM 8/23/2005 -0400, Michal Wallace wrote: >On Tue, 23 Aug 2005, Phillip J. Eby wrote: > > > I would say that there is definitely a real need for a declarative Python > > object definition syntax that supports nesting and internal references, > and > > so if we can come up with something good, it can and should *become* a > > standard for such purposes, well beyond the scope of its initial > mission of > > being a WSGI deployment syntax. > >Well, if that's all you want to do, then >why not just add some syntactic sugar >to pickle? pickles aren't a declarative format; they're procedural. From ianb at colorstudy.com Wed Aug 24 08:01:17 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 24 Aug 2005 01:01:17 -0500 Subject: [Web-SIG] cusom config files. 
was (PasteDeploy 0.1) In-Reply-To: <64ddb72c05082318145a9cafce@mail.gmail.com> References: <4308E29F.6040607@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <430BA2DB.6070003@colorstudy.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <64ddb72c05082317175d69d49@mail.gmail.com> <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> <64ddb72c05082318145a9cafce@mail.gmail.com> Message-ID: <430C0D2D.8050907@colorstudy.com> Rene Dudfield wrote: > On 8/24/05, Phillip J. Eby wrote: > >>I'm not sure what you mean by SQL DB, but if you mean putting the >>configuration in a database, I don't see why that would be useful or >>good. Similarly, I don't know what you mean by "file system". >> > > > By sql db I meant storing configuration in a database. Which has many > advantages including scaling, searching, ACID, permissions etc etc. Do you mean like putting the configuration files in a database? That shouldn't be a problem if there's a consistent way to access files (pkg_resources?) that handles (or has an interface for) virtual file systems. If it doesn't go in initially, I expect it would be a simple refactoring otherwise. If you don't intend to use text configuration files, then you'd have to code your own logic to put the pieces together. This is perfectly fine to do, and quite reasonable as well. If, for instance, you were doing some system where new applications were deployed automatically based on a very constrained configuration, you can easily do that programmatically in Python without involving any configuration files. 
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From michal at sabren.com Wed Aug 24 09:05:36 2005 From: michal at sabren.com (Michal Wallace) Date: Wed, 24 Aug 2005 03:05:36 -0400 (EDT) Subject: [Web-SIG] cusom config files. was (PasteDeploy 0.1) In-Reply-To: <5.1.1.6.0.20050824015409.01b1eb18@mail.telecommunity.com> References: <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <4308E29F.6040607@colorstudy.com> <43094FB8.7090505@colorstudy.com> <64ddb72c050821233445da5238@mail.gmail.com> <5.1.1.6.0.20050822195812.01b1cb38@mail.telecommunity.com> <5.1.1.6.0.20050822221935.01b1e4f0@mail.telecommunity.com> <5.1.1.6.0.20050823095002.01b1dd60@mail.telecommunity.com> <5.1.1.6.0.20050823122659.0299ee68@mail.telecommunity.com> <430BA2DB.6070003@colorstudy.com> <5.1.1.6.0.20050823183512.01b64ae8@mail.telecommunity.com> <5.1.1.6.0.20050823202008.01b1c980@mail.telecommunity.com> <5.1.1.6.0.20050824015409.01b1eb18@mail.telecommunity.com> Message-ID: On Wed, 24 Aug 2005, Phillip J. Eby wrote: > At 11:46 PM 8/23/2005 -0400, Michal Wallace wrote: > > On Tue, 23 Aug 2005, Phillip J. Eby wrote: > > > > I would say that there is definitely a real need for a > > > declarative Python object definition syntax that supports > > > nesting and internal references, and so if we can come up > > > with something good, it can and should *become* a standard for > > > such purposes, well beyond the scope of its initial mission > > > of being a WSGI deployment syntax. > > > Well, if that's all you want to do, then > > why not just add some syntactic sugar > > to pickle? > > pickles aren't a declarative format; they're procedural. Huh. So it is. I didn't know that. :) I guess my real point is that it seems like a huge leap to come up with a whole new language, when python itself can do the job just fine. 
For example, if you set up a coding standard where data classes have an empty constructor, then you can do something like this:

    class Instance:
        def __init__(self, Class, **kwargs):
            self.class_ = Class
            self.kwargs = kwargs
        def eval(self):
            obj = self.class_()
            # copy each declared attribute onto the new object
            for k, v in self.kwargs.items():
                setattr(obj, k, v)
            return obj

and maybe this for forward references:

    class Promise:
        def __init__(self, thunk):
            self.thunk = thunk
        def eval(self):
            return self.thunk()

Then you can make all kinds of complicated things declaratively:

    class Node:
        pass

    def aComplicatedStructure():
        loop = Instance(Node, next=Promise(lambda: loop))
        return Instance(Node, child=loop, other={"a":"b"})

The only thing missing is to walk the tree and replace any Promise or Instance node with the result of its eval().

I'm sure there's a way to do all that without the restriction on __init__, too... Just add another class along those lines that handles parameters to the constructor.

Now, I'm *not* saying this is the way to go for WSGI. But if you're going to shoot for the moon and propose a standard to use for *everything*, I think plain old python is more than adequate.

Sincerely,

Michal J Wallace
Sabren Enterprises, Inc.
-------------------------------------
contact: michal at sabren.com
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-------------------------------------

From ianb at colorstudy.com Mon Aug 29 02:01:21 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 28 Aug 2005 19:01:21 -0500
Subject: [Web-SIG] WSGI config, transparency
Message-ID: <43125051.8040202@colorstudy.com>

Anyway, I'm okay that the discussion has died down a bit -- I'll keep working on paste.deploy and see how that works out, and revisit this later. Right now I'm more interested in how this affects the rest of the "system".

One issue I've come upon is how to make applications and frameworks both encapsulated and transparent.
Specifically if I have an application which uses a framework, and the framework uses several pieces of middleware, I need both *some* of the framework configuration to be exposed, and some application configuration parameters to be exposed.

This can continue further when one logical application is composed of subapplications. For instance, imagine I have an admin interface built on Subway, with a web frontend in Wareweb, and a WebDAV interface from PyFileServer. The three applications/frameworks can create a single logical application. But how do I present a unified face for the application?

With global and flat configuration, the default is nearly complete transparency, with some potential for collision. Without global configuration the default is an opaque system, with no possibility of collision. In a practical sense the global configuration is easier to get working, and more adaptable for the system administrator.

Anyway, that's the issue I'm thinking about now. In paste.deploy it's kind of handled by:

    [app:someapp]
    set master_setting = foo
    ... include subapp somehow ...

    [app:subapp]
    get some_local_setting = master_setting

-- 
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From ianb at colorstudy.com Tue Aug 30 00:57:36 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 29 Aug 2005 17:57:36 -0500
Subject: [Web-SIG] Session interface, v2
In-Reply-To: <4303FEC5.3050408@colorstudy.com>
References: <4303FEC5.3050408@colorstudy.com>
Message-ID: <431392E0.4010001@colorstudy.com>

Ian Bicking wrote:
> Same location:
>
> http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py

BTW, session stores aren't really what I'm focused on at the moment. I just thought I'd be helpful moving the discussion forward past general requirements to something easier to discuss, like an interface. But I doubt I'll be working on this anytime soon, as there's lots of other projects that I'm more focused on right now.
So I'd encourage anyone interested in this to start some work on it, perhaps using this interface as a starting point. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org