From ben at groovie.org  Sat Jan  7 06:39:02 2006
From: ben at groovie.org (Ben Bangert)
Date: Fri, 6 Jan 2006 21:39:02 -0800
Subject: [Web-SIG] Session Middleware....apparently on hold,
	why not solve an easier problem first?
Message-ID: <ADDE8E33-AADA-4DC8-8362-191E02F20035@groovie.org>

I've been using wsgi middleware quite a bit lately, partly because my  
framework uses it extensively, and it occurred to me that maybe the  
whole session middleware stuff is an overly complex solution that  
might not ever work ideally.

This isn't to say that the problem doesn't exist, rather that perhaps  
looking to put the session itself in middleware isn't the greatest  
solution at this juncture. As the debate regarding how to make  
session middleware showed, many frameworks have different ideas about  
how the session works, ideas they aren't ready to give up.

However, almost all (if not all) session implementations do share two  
common themes:
- A unique ID is used to tag the session
- This unique ID is put into a cookie, typically signed/hashed in some
way to prevent hijacking/spoofing

The key reasons for session middleware:
  - Ability to share data between multiple WSGI apps regardless of  
framework
  - Reduce duplication of effort (though all the frameworks already  
have sessions....)

First, there's a huge problem I see regardless of the other debate on
session middleware: if you have multiple instances of the same webapp, you'd
need to tell each webapp to prefix itself in some unique way to avoid  
session key conflicts.

Second, to share data, the webapps will need to know in advance no  
matter what, that there is extra "stuff" they might want to look for  
in the session.

So here's my solution:
- No session middleware
- Cookie middleware (paste.auth.cookie already provides a secure  
cookie for you)
- Modify the frameworks in question to have the ability to get their  
session ID from environ['session.id'], rather than doing cookie  
business themselves

Essentially, rather than trying to have frameworks ditch their  
session in favor of something else, the framework keeps its session,  
and gets the ID of it from a more minimal cookie middleware layer. If  
webapps want to share small tidbits of information, they add it to  
the secure cookie. The most common, besides a session ID (so the
other webapp instances can have sessions stay active across webapps),  
would be a login token.

This way if the user crosses into a different webapp that the user  
hasn't logged into yet, the other webapp would see by the login token  
that the user has successfully logged in elsewhere, and it would then  
set up its own session with the same ID, and load the user record for
that login token. Maybe you have a back-end xml-rpc server that gives
you user profile data so you can retain profile data across webapps
that are logging in someone who never set up a user record with them...

Keeping the ID and login token in the cookie makes it easy to scale  
across multiple servers, etc.

As far as I can tell, all that's required to get this going is the
desire to do it, and minor feature enhancements to some frameworks  
(no idea which ones). The framework just needs the ability to not use  
its own cookies for sessions, and load a session given an ID.
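
A minimal sketch of the idea above, using only the standard library
rather than paste.auth.cookie (whose API is not shown here). Only
environ['session.id'] comes from the proposal; the cookie name, the
secret, and the class name SessionIdMiddleware are hypothetical.

```python
import hashlib
import hmac
import secrets
from http.cookies import CookieError, SimpleCookie

SECRET = b"change-me"        # hypothetical signing key
COOKIE_NAME = "session_id"   # hypothetical cookie name


def _mac(value):
    return hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value):
    """Return 'value.signature' so tampering can be detected."""
    return f"{value}.{_mac(value)}"


def unsign(signed):
    """Return the original value if the signature matches, else None."""
    value, sep, mac = signed.rpartition(".")
    if not sep:
        return None
    ok = hmac.compare_digest(mac.encode("utf-8"), _mac(value).encode("utf-8"))
    return value if ok else None


class SessionIdMiddleware:
    """Owns the cookie; the framework only reads environ['session.id']."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            morsel = SimpleCookie(environ.get("HTTP_COOKIE", "")).get(COOKIE_NAME)
        except CookieError:
            morsel = None
        session_id = unsign(morsel.value) if morsel else None
        is_new = session_id is None
        if is_new:
            session_id = secrets.token_hex(16)
        environ["session.id"] = session_id

        def wrapped_start_response(status, headers, exc_info=None):
            if is_new:
                # Only send a cookie when a fresh ID was issued.
                cookie = f"{COOKIE_NAME}={sign(session_id)}; Path=/; HttpOnly"
                headers = list(headers) + [("Set-Cookie", cookie)]
            return start_response(status, headers, exc_info)

        return self.app(environ, wrapped_start_response)
```

A framework wrapped this way would load its own session store keyed on
environ['session.id'] and never touch the cookie itself.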

Anyways, any thoughts on this? The requirements are low and in reach  
right now with a minimum of effort, so rather than just talk, this
will be easy to actually do and see how it works...

Cheers,
Ben

From ianb at colorstudy.com  Mon Jan  9 07:45:26 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 09 Jan 2006 00:45:26 -0600
Subject: [Web-SIG] ANN: Python Paste 0.4
Message-ID: <43C20686.4030309@colorstudy.com>

I'm pleased to announce the 0.4 release of the Paste Suite: Paste
core, Script, Deploy, and WebKit.

Locations
---------

Web:
   http://pythonpaste.org
Download:
   http://cheeseshop.python.org/pypi/Paste
   http://cheeseshop.python.org/pypi/PasteScript
   http://cheeseshop.python.org/pypi/PasteDeploy
   http://cheeseshop.python.org/pypi/PasteWebKit
     (or just use easy_install!)
Mailing list, etc:
   http://pythonpaste.org/community/

Changes
-------

This release brings bugfixes and new code.  Thanks to Clark Evans for
many of the contributions, and to Ben Bangert for giving this project
more exercise.  A quick overview:

* A new exception-catching middleware (evalexception) for interactive
   debugging, including through-the-browser execution of Python code in
   the context of a traceback.

* A paste.auth package for authentication/identification, including
   HTTP Basic/Digest, cookies, CAS (single-signon), and OpenID (another
   single-signon)

* Many added improvements and conveniences for people working on the
   HTTP level, in the httpexceptions, httpheaders, fileapp modules.

* Moving modules around to make the layout more logical (old imports
   are supported with warnings).

* Paste Script has experimental new commands to install and setup web
   applications.  Also improvements to the command-line interaction.

* Paste Deploy and WebKit are largely maintenance releases.

What Is It?
-----------

Paste is a set of tools for building web applications and frameworks
using WSGI.  All the pieces are framework-neutral, and facilitate both
high-level (non-leaky) abstractions built on top of them (frameworks)
and low-level access to raw WSGI/HTTP.

Paste Deploy is a way to configure WSGI applications and application
stacks using Egg plugin facilities and a simple configuration file.
It offers uniformity to web application deployment and configuration.

Paste Script is a pluggable frontend for managing projects, and for
serving up web applications using Paste Deploy and pluggable WSGI
servers.

Paste WebKit is an implementation of Webware/WebKit using WSGI and
the tools in Paste.


-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org

From web-sig at nice.net.nz  Wed Jan 11 03:53:23 2006
From: web-sig at nice.net.nz (Hadley Rich)
Date: Wed, 11 Jan 2006 15:53:23 +1300
Subject: [Web-SIG] Simple CGI sessions
Message-ID: <200601111553.23758.web-sig@nice.net.nz>

Hi all,

From a quick browse through the archives I can see sessions are a common 
subject here. Hopefully you're all not too sick of it to comment.

I've been wanting to use python for a web project I'm doing at the moment. I 
need to use sessions in this project and want to use cgi rather than 
mod_python or a framework. I couldn't find anything which specifically suited 
my needs on the web so I wrote this simple little module. 

Rather than messing up the whitespace I've posted it on my site here[1]

Since this is really the first bit of python I have written I'd appreciate 
anyone to see if they can see any glaring security holes or other problems 
with it. Feel free to be as harsh as you like.

Thanks.

hads

[1]http://nice.net.nz/tools/pysession
-- 
The faster I go, the behinder I get.
		-- Lewis Carroll

From exarkun at divmod.com  Wed Jan 11 04:58:05 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Tue, 10 Jan 2006 22:58:05 -0500
Subject: [Web-SIG] Simple CGI sessions
In-Reply-To: <200601111553.23758.web-sig@nice.net.nz>
Message-ID: <20060111035805.31401.1930500669.divmod.quotient.79@ohm>

On Wed, 11 Jan 2006 15:53:23 +1300, Hadley Rich <web-sig at nice.net.nz> wrote:
>Hi all,
>
>From a quick browse through the archives I can see sessions are a common
>subject here. Hopefully you're all not too sick of it to comment.
>
>I've been wanting to use python for a web project I'm doing at the moment. I
>need to use sessions in this project and want to use cgi rather than
>mod_python or a framework. I couldn't find anything which specifically suited
>my needs on the web so I wrote this simple little module.
>
>Rather than messing up the whitespace I've posted it on my site here[1]
>
>Since this is really the first bit of python I have written I'd appreciate
>anyone to see if they can see any glaring security holes or other problems
>with it. Feel free to be as harsh as you like.

Not related to your code at all...  You probably don't want to release the code under the PSF license.  There are various boring reasons that I won't bother to go into (the topic has been discussed on various lists, I'm sure you can find a thread if you want).  Consider the MIT license, instead.  It's much simpler and /probably/ means what you hoped the PSF license meant.

Jean-Paul

From web-sig at nice.net.nz  Wed Jan 11 05:13:18 2006
From: web-sig at nice.net.nz (Hadley Rich)
Date: Wed, 11 Jan 2006 17:13:18 +1300
Subject: [Web-SIG] Simple CGI sessions
In-Reply-To: <20060111035805.31401.1930500669.divmod.quotient.79@ohm>
References: <20060111035805.31401.1930500669.divmod.quotient.79@ohm>
Message-ID: <200601111713.18363.web-sig@nice.net.nz>

On Wednesday 11 January 2006 16:58, Jean-Paul Calderone wrote:
> Not related to your code at all...  You probably don't want to release the
> code under the PSF license.  There are various boring reasons that I won't
> bother to go into (the topic has been discussed on various lists, I'm sure
> you can find a thread if you want).  Consider the MIT license, instead.
> It's much simpler and /probably/ means what you hoped the PSF license
> meant.

Interesting. Thanks for that, the MIT license certainly looks nice and simple 
and, as you say, is what I was intending. Maybe if I get really bored I'll go 
and try to find those discussions you were talking about :)

hads

-- 
Make it right before you make it faster.

From cce at clarkevans.com  Wed Jan 11 17:06:21 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed, 11 Jan 2006 11:06:21 -0500
Subject: [Web-SIG] transaction  progress with cgi.FieldStorage
In-Reply-To: <43AAE16B.9040006@gmail.com>
References: <43AAE16B.9040006@gmail.com>
Message-ID: <20060111160621.GB61466@prometheusresearch.com>

kai,

I don't know if you're still interested, but I wrote something which
might help with your end-user goal of a progress meter:

   http://svn.w4py.org/Paste/trunk/paste/progress.py

Kind Regards,

Clark

P.S. it is far from perfect and marked experimental

On Thu, Dec 22, 2005 at 12:24:59PM -0500, kai wrote:
| Hi All,
| this is my first post on this list. I am working on a way to monitor the 
| progress of reading a file upload from wsgi.input.  I can currently 
| monitor the overall transfer and when individual files of a multiple 
| file upload are completed. The ultimate goal of this is to be able to 
| display a progress meter when someone is uploading a file.
| 
| To do this I subclassed cgi.FieldStorage but when I finished I had 
| modified most of the non-trivial methods just to hook in something to 
| monitor the transfer progress, oops.
| 
| Has anyone else found FieldStorage insufficient for certain tasks?
| Is there a general need for a more flexible FieldStorage replacement?
| 
| 
| kai keliikuli
| _______________________________________________
| Web-SIG mailing list
| Web-SIG at python.org
| Web SIG: http://www.python.org/sigs/web-sig
| Unsubscribe: http://mail.python.org/mailman/options/web-sig/cce%40clarkevans.comkkkkk
| 

From kai.keliikuli at gmail.com  Sat Jan 14 02:05:30 2006
From: kai.keliikuli at gmail.com (kai)
Date: Fri, 13 Jan 2006 20:05:30 -0500
Subject: [Web-SIG] transaction  progress with cgi.FieldStorage
In-Reply-To: <20060111160621.GB61466@prometheusresearch.com>
References: <43AAE16B.9040006@gmail.com>
	<20060111160621.GB61466@prometheusresearch.com>
Message-ID: <43C84E5A.3050008@gmail.com>

Hey Thanks Clark,
I've been gone for a bit and off this project. I'll dig into this
tomorrow.
Kai

Clark C. Evans wrote:
> kai,
> 
> I don't know if you're still interested, but I wrote something which
> might help with your end-user goal of a progress meter:
> 
>    http://svn.w4py.org/Paste/trunk/paste/progress.py
> 
> Kind Regards,
> 
> Clark
> 
> P.S. it is far from perfect and marked experimental
> 
> On Thu, Dec 22, 2005 at 12:24:59PM -0500, kai wrote:
> | Hi All,
> | this is my first post on this list. I am working on a way to monitor the 
> | progress of reading a file upload from wsgi.input.  I can currently 
> | monitor the overall transfer and when individual files of a multiple 
> | file upload are completed. The ultimate goal of this is to be able to 
> | display a progress meter when someone is uploading a file.
> | 
> | To do this I subclassed cgi.FieldStorage but when I finished I had 
> | modified most of the non-trivial methods just to hook in something to 
> | monitor the transfer progress, oops.
> | 
> | Has anyone else found FieldStorage insufficient for certain tasks?
> | Is there a general need for a more flexible FieldStorage replacement?
> | 
> | 
> | kai keliikuli
> 


From mo.babaei at gmail.com  Sat Jan 14 12:24:25 2006
From: mo.babaei at gmail.com (Mohamad Babaei)
Date: Sat, 14 Jan 2006 14:54:25 +0330
Subject: [Web-SIG] Best way to create a random image
Message-ID: <5bf3a41f0601140324j5fe06ca7ld7cf101fa4edf54d@mail.gmail.com>

Hello,
What's the best way to create a random image ?(something like images used in
some forms)


Regards,
M.B

From sa at c-area.ch  Sat Jan 14 17:20:42 2006
From: sa at c-area.ch (Steven Armstrong)
Date: Sat, 14 Jan 2006 17:20:42 +0100
Subject: [Web-SIG] Best way to create a random image
In-Reply-To: <5bf3a41f0601140324j5fe06ca7ld7cf101fa4edf54d@mail.gmail.com>
References: <5bf3a41f0601140324j5fe06ca7ld7cf101fa4edf54d@mail.gmail.com>
Message-ID: <43C924DA.7000702@c-area.ch>

On 01/14/06 12:24, Mohamad Babaei wrote:
> Hello,
> What's the best way to create a random image ?(something like images used in
> some forms)
> 

Here's an example of how we've done it for the Pyblosxom nospam plugin [1].

[1] http://www.c-area.ch/code/pyblosxom/plugins/nospam.py

hth
cheers
Steven

From jim at zope.com  Sun Jan 22 17:22:09 2006
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jan 2006 11:22:09 -0500
Subject: [Web-SIG] Communicating authenticated user information
Message-ID: <43D3B131.7080005@zope.com>

Typically, web servers provide access logs that include a label
for the authenticated user.

Often, WSGI applications (or middleware) provide their own user
authentication facilities.  Well, Zope does. :)

There doesn't seem to be a standard way for WSGI applications or
middleware to communicate the information necessary for a server
to log the authenticated user back to the server.

Am I missing something?  How do other people handle this?

Is Zope the only WSGI application that performs authentication
itself?

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From pje at telecommunity.com  Sun Jan 22 17:34:03 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jan 2006 11:34:03 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D3B131.7080005@zope.com>
Message-ID: <5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>

At 11:22 AM 1/22/2006 -0500, Jim Fulton wrote:
>Typically, web servers provide access logs that include a label
>for the authenticated user.
>
>Often, WSGI applications (or middleware) provide their own user
>authentication facilities.  Well, Zope does. :)
>
>There doesn't seem to be a standard way for WSGI applications or
>middleware to communicate the information necessary for a server
>to log the authenticated user back to the server.
>
>Am I missing something?  How do other people handle this?
>
>Is Zope the only WSGI application that performs authentication
>itself?

I think Zope is the only WSGI application that cares about communicating 
this information back to the web server's logs.  :)  Or at least, the only 
one whose author has said so.  :)

Perhaps an "X-Authenticated-User: foo" header could be added in a future 
spec version?  (And as an optional feature in the current PEP.)  This seems 
a simpler way to incorporate the feature than adding an extension API to 
environ.


From jim at zope.com  Sun Jan 22 18:13:50 2006
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jan 2006 12:13:50 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>
References: <5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>
Message-ID: <43D3BD4E.3050800@zope.com>

Phillip J. Eby wrote:
> At 11:22 AM 1/22/2006 -0500, Jim Fulton wrote:
> 
>> Typically, web servers provide access logs that include a label
>> for the authenticated user.
>>
>> Often, WSGI applications (or middleware) provide their own user
>> authentication facilities.  Well, Zope does. :)
>>
>> There doesn't seem to be a standard way for WSGI applications or
>> middleware to communicate the information necessary for a server
>> to log the authenticated user back to the server.
>>
>> Am I missing something?  How do other people handle this?
>>
>> Is Zope the only WSGI application that performs authentication
>> itself?
> 
> 
> I think Zope is the only WSGI application that cares about communicating 
> this information back to the web server's logs.  :)

I hope that's not true.  Certainly, if anyone else is doing authentication
in their applications or middleware, they *should* care about getting
information into the access logs.

> Or at least, the only one whose author has said so.  :)

Please, someone else speak up. :)


> Perhaps an "X-Authenticated-User: foo" header could be added in a future 
> spec version?  (And as an optional feature in the current PEP.) 

Perhaps. Note that it should be clear that this is solely for use
in the access log.  There should be no assumption that this is
a principal id or a login name.  It is really just a label for the
log.  To make this clearer, I'd use something like:
"X-Access-User-Label: foo".

> This seems a simpler way to incorporate the feature than adding an
> extension API to environ.

Why is that?  Isn't the env meant for communication between the WSGI
layers?  I'm not sure I'd want to send this information back to the browser.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From jim at zope.com  Sun Jan 22 18:28:01 2006
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jan 2006 12:28:01 -0500
Subject: [Web-SIG] Deployment tools
Message-ID: <43D3C0A1.5050902@zope.com>


Who is working on deployment tools for WSGI?  I'm aware of Paste Deploy.
Are there any other efforts underway?

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From pywebsig at xhaus.com  Sun Jan 22 18:45:34 2006
From: pywebsig at xhaus.com (Alan Kennedy)
Date: Sun, 22 Jan 2006 17:45:34 +0000
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D3BD4E.3050800@zope.com>
References: <5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>
	<43D3BD4E.3050800@zope.com>
Message-ID: <43D3C4BE.1090806@xhaus.com>

[Jim Fulton]
 >>>Is Zope the only WSGI application that performs authentication
 >>>itself?

[Phillip J. Eby]
 >>I think Zope is the only WSGI application that cares about
 >> communicating this information back to the web server's logs.  :)

[Jim Fulton]
 > I hope that's not true.  Certainly, if anyone else is doing
 > authentication in their applications or middleware, they
 > *should* care about getting information into the access logs.

Well, Apache records auth info in logs as well, and it seems like a 
perfectly reasonable thing for a server to do .....

http://httpd.apache.org/docs/2.0/logs.html#accesslog

[Phillip J. Eby]
 >> Perhaps an "X-Authenticated-User: foo" header could be added
 >> in a future spec version?  (And as an optional feature in the
 >> current PEP.)

[Jim Fulton]
 > Perhaps. Note that it should be clear that this is solely for use
 > in the access log.  There should be no assumption that this is
 > a principal id or a login name.  It is really just a label for the
 > log.  To make this clearer, I'd use something like:
 > "X-Access-User-Label: foo".

Sending X-headers seems hacky, and results in unnecessary information 
being transmitted back to the user (possibly revealing sensitive 
information, or opening security holes?)

I think that the communication mechanism for auth information is 
possibly best served by a simple convention between auth middleware 
authors. Perhaps servers that are aware that auth middleware is in use 
can put a callable into the WSGI environment, which auth middleware 
calls when it has auth'ed the user?

[Phillip J. Eby]
 > This seems a simpler way to incorporate the feature than adding
 > an extension API to environ.

[Jim Fulton]
 > Why is that?  Isn't the env meant for communication between
 > the WSGI layers?  I'm not sure I'd want to send this information
 > back to the browser.

I think an API could be very simple, and optional for servers that know 
they won't be logging auth information.

I agree about not sending this information back to the user: it's 
unnecessary and potentially dangerous.

Regards,

Alan Kennedy.

From pje at telecommunity.com  Sun Jan 22 19:25:49 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jan 2006 13:25:49 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D3C4BE.1090806@xhaus.com>
References: <43D3BD4E.3050800@zope.com>
	<5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>
	<43D3BD4E.3050800@zope.com>
Message-ID: <5.1.1.6.0.20060122132453.01e4c970@mail.telecommunity.com>

At 05:45 PM 1/22/2006 +0000, Alan Kennedy wrote:
>I agree about not sending this information back to the user: it's
>unnecessary and potentially dangerous.

Yep, it would be really dangerous to let me know who I just logged in to an 
application as.  I might find out who I really am! ;)


From jim at zope.com  Sun Jan 22 19:30:59 2006
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jan 2006 13:30:59 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060122132453.01e4c970@mail.telecommunity.com>
References: <43D3BD4E.3050800@zope.com>	<5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>	<43D3BD4E.3050800@zope.com>
	<5.1.1.6.0.20060122132453.01e4c970@mail.telecommunity.com>
Message-ID: <43D3CF63.40408@zope.com>

Phillip J. Eby wrote:
> At 05:45 PM 1/22/2006 +0000, Alan Kennedy wrote:
> 
>>I agree about not sending this information back to the user: it's
>>unnecessary and potentially dangerous.
> 
> 
> Yep, it would be really dangerous to let me know who I just logged in to an 
> application as.  I might find out who I really am! ;)

The point is that there's really no reason to send this to the client.
It is certainly conceivable that some app could consider this
information sensitive.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From pywebsig at xhaus.com  Sun Jan 22 20:08:05 2006
From: pywebsig at xhaus.com (Alan Kennedy)
Date: Sun, 22 Jan 2006 19:08:05 +0000
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060122132453.01e4c970@mail.telecommunity.com>
References: <43D3BD4E.3050800@zope.com>
	<5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>
	<43D3BD4E.3050800@zope.com>
	<5.1.1.6.0.20060122132453.01e4c970@mail.telecommunity.com>
Message-ID: <43D3D815.9020104@xhaus.com>

[Alan Kennedy]
>> I agree about not sending this information back to the user: it's
>> unnecessary and potentially dangerous.

[Phillip J. Eby]
> Yep, it would be really dangerous to let me know who I just logged in to 
> an application as.  I might find out who I really am! ;)

Very droll ;-)

What if other information, such as meta-information about the auth 
directory or database in which the credentials were looked up, was also 
communicated through X-headers, e.g. server connection details, etc.

Happy for that to go back to the user too?

If X-headers are to be used in WSGI, I think there should be something 
in the spec about whether or not they should be transmitted to the user.

Alan.

From ianb at colorstudy.com  Sun Jan 22 22:24:52 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 22 Jan 2006 15:24:52 -0600
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D3B131.7080005@zope.com>
References: <43D3B131.7080005@zope.com>
Message-ID: <43D3F824.4050903@colorstudy.com>

Jim Fulton wrote:
> Typically, web servers provide access logs that include a label
> for the authenticated user.
> 
> Often, WSGI applications (or middleware) provide their own user
> authentication facilities.  Well, Zope does. :)
> 
> There doesn't seem to be a standard way for WSGI applications or
> middleware to communicate the information necessary for a server
> to log the authenticated user back to the server.
> 
> Am I missing something?  How do other people handle this?
> 
> Is Zope the only WSGI application that performs authentication
> itself?

I do the authentication in my apps, but I am sloppy and do not record it 
;)  Well, that's not completely true.  In the rough access logger in 
Paste (http://pythonpaste.org/paste/translogger.py.html?f=8&l=80#8) I 
include environ['REMOTE_USER'] if it is present.  So if the WSGI environ 
that the middleware sees initially is the same environ that the 
authenticator writes to, then the middleware will see that change on
the way out and include it.  Using a header would solve the problem 
where the environment is completely changed (unlikely), or copied before 
REMOTE_USER is assigned (fairly likely).

I can imagine a convention of X-WSGI-Authenticated, where X-WSGI-* gets 
stripped by the server, and any middleware that is interested can watch 
for these headers.  Another option is a callback, but potentially 
multiple middlewares will be interested (multiple logs aren't hard to
imagine), and that complicates the callback.
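
A rough sketch of the X-WSGI-* convention, assuming the stripping is
done by an outermost wrapper standing in for the server. The class
name StripInternalHeaders and the on_internal hook are hypothetical;
nothing like this is in the WSGI spec.

```python
PREFIX = "x-wsgi-"


def split_internal_headers(headers):
    """Separate X-WSGI-* headers from those meant for the client."""
    public, internal = [], {}
    for name, value in headers:
        if name.lower().startswith(PREFIX):
            internal[name.lower()] = value
        else:
            public.append((name, value))
    return public, internal


class StripInternalHeaders:
    """Outermost layer: consume X-WSGI-* headers, never send them on."""

    def __init__(self, app, on_internal=None):
        self.app = app
        self.on_internal = on_internal  # e.g. an access logger

    def __call__(self, environ, start_response):
        def filtered_start_response(status, headers, exc_info=None):
            public, internal = split_internal_headers(headers)
            if self.on_internal is not None:
                self.on_internal(environ, internal)
            return start_response(status, public, exc_info)

        return self.app(environ, filtered_start_response)
```

Any middleware further in can watch for the same headers in its own
start_response wrapper and simply pass them along untouched.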

-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org

From jim at zope.com  Sun Jan 22 22:31:01 2006
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jan 2006 16:31:01 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D3F824.4050903@colorstudy.com>
References: <43D3B131.7080005@zope.com> <43D3F824.4050903@colorstudy.com>
Message-ID: <43D3F995.8050303@zope.com>

Ian Bicking wrote:
> Jim Fulton wrote:
> 
>> Typically, web servers provide access logs that include a label
>> for the authenticated user.
>>
>> Often, WSGI applications (or middleware) provide their own user
>> authentication facilities.  Well, Zope does. :)
>>
>> There doesn't seem to be a standard way for WSGI applications or
>> middleware to communicate the information necessary for a server
>> to log the authenticated user back to the server.
>>
>> Am I missing something?  How do other people handle this?
>>
>> Is Zope the only WSGI application that performs authentication
>> itself?
> 
> 
> I do the authentication in my apps,

Cool.

> but I am sloppy and do not record it
> ;)  Well, that's not completely true.  In the rough access logger in
> Paste (http://pythonpaste.org/paste/translogger.py.html?f=8&l=80#8) I
> include environ['REMOTE_USER'] if it is present.  So if the WSGI environ
> that the middleware sees initially is the same environ that the
> authenticator writes to, then the middleware will see that change on
> the way out and include it.  Using a header would solve the problem 
> where the environment is completely changed (unlikely), or copied before 
> REMOTE_USER is assigned (fairly likely).
> 
> I can imagine a convention of X-WSGI-Authenticated, where X-WSGI-* gets 
> stripped by the server,

Works for me.

> and any middleware that is interested can watch
> for these headers.  Another option is a callback, but potentially
> multiple middlewares will be interested (multiple logs aren't hard to
> imagine), and that complicates the callback.

I think just scribbling a value into the env or headers is fine.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From srichter at cosmos.phy.tufts.edu  Mon Jan 23 02:39:49 2006
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Sun, 22 Jan 2006 20:39:49 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>
References: <5.1.1.6.0.20060122113155.01e53728@mail.telecommunity.com>
Message-ID: <200601222039.49428.srichter@cosmos.phy.tufts.edu>

On Sunday 22 January 2006 11:34, Phillip J. Eby wrote:
> >Is Zope the only WSGI application that performs authentication
> >itself?
>
> I think Zope is the only WSGI application that cares about communicating
> this information back to the web server's logs.  :)  Or at least, the only
> one whose author has said so.  :)

Well, I originally worked with Itamar and James on the Twisted integration 
into Zope 3, when we noticed this problem.

> Perhaps an "X-Authenticated-User: foo" header could be added in a future
> spec version?  (And as an optional feature in the current PEP.)  This seems
> a simpler way to incorporate the feature than adding an extension API to
> environ.

We originally considered, and even implemented, the suggestion you made, but
decided it was a security problem and dismissed it. And a "convention" is not
really a viable solution either, since it defeats the point of a non-specific 
API, like WSGI.

We thought about the problem quite a bit and decided that the user is really
the only thing that the log really has to know from the application. So a 
simple callback that expects a simple string would be just fine.
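
Roughly what such a callback could look like, as a sketch only. The
environ key 'example.set_access_user' and the toy serve_one function
are hypothetical stand-ins for whatever a real server would provide.

```python
def serve_one(app, environ, access_log):
    """Toy stand-in for a server handling a single request."""
    label = {"user": "-"}

    def set_access_user(name):
        # The callback takes a plain string, used only as a log label.
        label["user"] = str(name)

    environ["example.set_access_user"] = set_access_user  # hypothetical key
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        return lambda data: None  # no-op write() for this toy server

    body = b"".join(app(environ, start_response))
    path = environ.get("PATH_INFO", "")
    access_log.append(f"{label['user']} {path} {captured['status']}")
    return body


def demo_app(environ, start_response):
    """Hypothetical app that authenticates and reports the user label."""
    report = environ.get("example.set_access_user")
    if report is not None:
        report("bob")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]
```

Nothing about the user goes into the response headers, so nothing
needs to be stripped before the response reaches the client.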

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From cce at clarkevans.com  Mon Jan 23 18:42:23 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Mon, 23 Jan 2006 12:42:23 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D3F824.4050903@colorstudy.com>
References: <43D3B131.7080005@zope.com> <43D3F824.4050903@colorstudy.com>
Message-ID: <20060123174223.GD29367@prometheusresearch.com>

I'm using paste.auth.* modules, and they fill-in environ['REMOTE_USER']
with the authenticated user.  I then use this information in later
processing stages and it works nicely for me and is quite simple.

On Sun, Jan 22, 2006 at 03:24:52PM -0600, Ian Bicking wrote:
| So if the WSGI environ that the middleware sees initially is the same 
| environ that the authenticator writes too, then the middleware will
| see that change on the way out and include it.

For this case, I would imagine that a good transaction logger 
would come *before* the authentication middleware, stuffing away 
the ``environ`` and then hook into the ``start_response()`` callback 
to actually log the transaction (including the ``REMOTE_USER``) when
the response is created.  I don't see how a header would help here.
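
Something along these lines, as a sketch rather than what Paste's
translogger actually does. AccessLogMiddleware, DemoAuthMiddleware and
the emit parameter are hypothetical names; it assumes the authenticator
writes REMOTE_USER into the same environ dict the logger was handed.

```python
class AccessLogMiddleware:
    """Sits *before* authentication and logs when the response starts."""

    def __init__(self, app, emit=print):
        self.app = app
        self.emit = emit  # any callable that accepts one log line

    def __call__(self, environ, start_response):
        def logging_start_response(status, headers, exc_info=None):
            # By now an inner authenticator has had a chance to set
            # REMOTE_USER on the environ we stashed in this closure.
            user = environ.get("REMOTE_USER", "-")
            method = environ.get("REQUEST_METHOD", "-")
            path = environ.get("PATH_INFO", "")
            self.emit(f"{user} {method} {path} {status}")
            return start_response(status, headers, exc_info)

        return self.app(environ, logging_start_response)


class DemoAuthMiddleware:
    """Hypothetical authenticator that always accepts a fixed user."""

    def __init__(self, app, user):
        self.app = app
        self.user = user

    def __call__(self, environ, start_response):
        environ["REMOTE_USER"] = self.user  # same dict, not a copy
        return self.app(environ, start_response)
```

Stacked as AccessLogMiddleware(DemoAuthMiddleware(app, "alice")), the
logger sees the user without any header being involved.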

| Using a header would solve the problem where the environment is 
| completely changed (unlikely), or copied before REMOTE_USER is 
| assigned (fairly likely).

Ok.  If you are "completely changing" the environment, you should
just copy it and send the copy on, so let us address these two cases
together.  In this situation, you also have to assume that the
authentication middleware happens *after* the request re-write or
you're in the situation described above (the logger can get the
REMOTE_USER).  I can picture two use-cases for this situation:

  Your server is doing an "internal redirect" to a sub-application
  that needs its own authentication.  In this case, why not just
  do an external redirect?

  Your server is doing N sub-requests, some of which require their
  own authentication, and assembling the results into a single
  response.  In this case, you'll need your own custom logging
  mechanism anyway... and I cannot imagine the complexity of 
  having N sub-branches that might return a 401.

In short, I can't think of any generic use-cases for this second
scenario (where authentication happens *after* a complete re-write
of the environ) that would work with a generic request logging;
and I don't see how a header would help.

Perhaps I'm missing something?

Best,

Clark

From pje at telecommunity.com  Mon Jan 23 20:25:35 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jan 2006 14:25:35 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060123174223.GD29367@prometheusresearch.com>
References: <43D3F824.4050903@colorstudy.com> <43D3B131.7080005@zope.com>
	<43D3F824.4050903@colorstudy.com>
Message-ID: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>

At 12:42 PM 1/23/2006 -0500, Clark C. Evans wrote:
>In short, I can't think of any generic use-cases for this second
>scenario (where authentication happens *after* a complete re-write
>of the environ) that would work with a generic request logging;
>and I don't see how a header would help.
>
>Perhaps I'm missing something?

You simply can't use environ values to communicate *up* the WSGI stack, 
since at no level is it guaranteed you have the "same" 
dictionary.  Response headers and callables (or mutables) in the environ 
are the only way to send stuff upstream.  You also have to be careful that 
any upstream communication doesn't bypass something that middleware should 
be allowed to control.

In the case of authentication, it should be sufficient to have a callable 
or mutable in the environ that can be called or set more than once per 
request, i.e. it only takes effect once the request is completed.  This 
allows outer middleware to override what inner middleware or the 
application set it to.


From cce at clarkevans.com  Mon Jan 23 20:52:51 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Mon, 23 Jan 2006 14:52:51 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
References: <43D3F824.4050903@colorstudy.com> <43D3B131.7080005@zope.com>
	<43D3F824.4050903@colorstudy.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
Message-ID: <20060123195251.GB31096@prometheusresearch.com>

On Mon, Jan 23, 2006 at 02:25:35PM -0500, Phillip J. Eby wrote:
| You simply can't use environ values to communicate *up* the WSGI stack, 
| since at no level is it guaranteed you have the "same" 
| dictionary.

The same could be said for response headers, no?  You've got a WSGI
stack of A, B, and C.  Just because "C" sets a header intended for A,
doesn't mean that B has to pass it on. 

| In the case of authentication, it should be sufficient to have a 
| callable or mutable in the environ that can be called or set more than 
| once per request, i.e. it only takes effect once the request is 
| completed.  This allows outer middleware to override what inner 
| middleware or the application set it to.

This is exactly what environ['REMOTE_USER'] is, a mutable value in
the environ that can be set more than once, and only the current 
value matters when create_response hits the request log middleware.

| Response headers and callables (or mutables) in the environ 
| are the only way to send stuff upstream.  You also have to be careful 
| that any upstream communication doesn't bypass something that middleware 
| should be allowed to control.

Of course you have to be careful and work out a protocol that all
intermediate middleware components agree upon.  However, beyond that
I fail to understand the distinctions you're making or why they 
are important.  Perhaps a tangible example would help to educate me?

Thanks so much,

Clark

From srichter at cosmos.phy.tufts.edu  Mon Jan 23 20:54:02 2006
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Mon, 23 Jan 2006 14:54:02 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
References: <43D3F824.4050903@colorstudy.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
Message-ID: <200601231454.02610.srichter@cosmos.phy.tufts.edu>

On Monday 23 January 2006 14:25, Phillip J. Eby wrote:
> In the case of authentication, it should be sufficient to have a callable
> or mutable in the environ that can be called or set more than once per
> request, i.e. it only takes effect once the request is completed.  This
> allows outer middleware to override what inner middleware or the
> application set it to.

+1. If we had this in the spec, I would be totally happy.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From pje at telecommunity.com  Mon Jan 23 21:02:09 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jan 2006 15:02:09 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060123195251.GB31096@prometheusresearch.com>
References: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<43D3F824.4050903@colorstudy.com> <43D3B131.7080005@zope.com>
	<43D3F824.4050903@colorstudy.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>

At 02:52 PM 1/23/2006 -0500, Clark C. Evans wrote:
>On Mon, Jan 23, 2006 at 02:25:35PM -0500, Phillip J. Eby wrote:
>| You simply can't use environ values to communicate *up* the WSGI stack,
>| since at no level is it guaranteed you have the "same"
>| dictionary.
>
>The same could be said for response headers, no?  You've got a WSGI
>stack of A, B, and C.  Just because "C" sets a header intended for A,
>doesn't mean that B has to pass it on.

That's a feature, not a bug.  However, the presumption is that middleware 
will in general pass through the same *set* of environ variables or 
response headers as it received, unless it has a reason to modify 
them.  This does not require middleware to pass the same 'environ' or 
'header' *objects*, just that in general they should pass through the 
*contents*.


>| In the case of authentication, it should be sufficient to have a
>| callable or mutable in the environ that can be called or set more than
>| once per request, i.e. it only takes effect once the request is
>| completed.  This allows outer middleware to override what inner
>| middleware or the application set it to.
>
>This is exactly what environ['REMOTE_USER'] is, a mutable value in
>the environ that can be set more than once,

Strings aren't mutable.


>and only the current
>value matters when create_response hits the request log middleware.

The current value in *which* environ?  The application doesn't necessarily 
have the same environ object as the server, so modifying it will make no 
difference to anything.


>| Response headers and callables (or mutables) in the environ
>| are the only way to send stuff upstream.  You also have to be careful
>| that any upstream communication doesn't bypass something that middleware
>| should be allowed to control.
>
>Of course you have to be careful and work out a protocol that all
>intermediate middleware components agree upon.  However, beyond that
>I fail to understand the distinctions you're making or why they
>are important.  Perhaps a tangible example would help to educate me?

Middleware is not required to pass the same environ to a child application 
that it received from its parent server, and environ objects are not 
returned to the caller.  Ergo, modifying 'environ' itself (as opposed to 
modifying an object *in* the environment), cannot guarantee that the server 
will "see" the change unless middleware specifically conspires to make this 
so.  This is the opposite of the way it should work, which is that it 
should be communicated to the server unless the middleware specifically 
conspires to prevent it (e.g. by stripping the environ entry that allows 
the communication, or by changing the value before returning).
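
A small sketch may make the distinction concrete. It is written for
Python 3 and uses hypothetical names throughout (CopyingMiddleware and the
'demo.notes' key are invented for illustration): rebinding a key in a
copied environ is invisible to the caller, while mutating an object that
both dicts share is visible.

```python
class CopyingMiddleware:
    """Hypothetical middleware that hands the application a copy of
    environ, as path-rewriting middleware may legitimately do."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        child_environ = dict(environ)      # new dict, same value objects
        child_environ['PATH_INFO'] = '/inner'
        return self.app(child_environ, start_response)


def app(environ, start_response):
    environ['REMOTE_USER'] = 'jim'         # rebinds a key in the copy only
    environ['demo.notes'].append('jim')    # mutates an object both dicts share
    start_response('200 OK', [])
    return [b'']


server_environ = {'PATH_INFO': '/', 'demo.notes': []}
CopyingMiddleware(app)(server_environ,
                       lambda status, headers, exc_info=None: None)
```

After the call, server_environ has no REMOTE_USER entry and its PATH_INFO
is unchanged, but the shared list under 'demo.notes' holds the value the
application appended.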


From cce at clarkevans.com  Mon Jan 23 21:12:55 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Mon, 23 Jan 2006 15:12:55 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
References: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<43D3F824.4050903@colorstudy.com> <43D3B131.7080005@zope.com>
	<43D3F824.4050903@colorstudy.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
Message-ID: <20060123201255.GE31096@prometheusresearch.com>

Thanks Phillip!  

This clears it up for me. Although, I disagree with the 
specification in this case; there does not seem to be a
reason why middleware shouldn't be required to send the
*same* environ dict along in subsequent calls. 

Certainly a middleware component may make N sub-requests
to subordinate applications; however, these are really
different requests from what the user made.  Further, 
there is likely to be more than 1 of them for this case.
So, it is quite different than the more usual case.

Regardless of my opinion on the matter, what *is* being
proposed for this particular problem; and more generally
for these sorts of situations?

Best,

Clark

From srichter at cosmos.phy.tufts.edu  Mon Jan 23 21:36:33 2006
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Mon, 23 Jan 2006 15:36:33 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060123201255.GE31096@prometheusresearch.com>
References: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
Message-ID: <200601231536.33527.srichter@cosmos.phy.tufts.edu>

On Monday 23 January 2006 15:12, Clark C. Evans wrote:
> Regardless of my opinion on the matter, what *is* being
> proposed for this particular problem; and more generally
> for these sorts of situations?

I think the following is being proposed (and also my favorite solution):

Specify a new environment variable called 'wsgi.user' (or something similar) 
that is a mutable and can be written several times. Only the last write 
(before the output is sent) is important. By default the variable is set to 
``None`` for not set.

Of course I am not good at writing specs, but it should say something like 
that. Of course, one could argue that you possibly want to send other 
information for logging to the server, but I would call this YAGNI.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From pje at telecommunity.com  Mon Jan 23 22:15:06 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jan 2006 16:15:06 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <200601231536.33527.srichter@cosmos.phy.tufts.edu>
References: <20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
Message-ID: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>

At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
>Specify a new environment variable called 'wsgi.user' (or something similar)
>that is a mutable and can be written several times. Only the last write
>(before the output is sent) is important. By default the variable is set to
>``None`` for not set.

I'd suggest a callable under 'wsgi.log_username', that takes one argument.

It should be specified whether it requires ASCII or Unicode.
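
As a rough sketch of what such a callable might look like (Python 3):
'wsgi.log_username' is only the name proposed in this thread, not a key
defined by the WSGI specification, and make_server_environ is a
hypothetical helper standing in for whatever a real server would do.

```python
def make_server_environ(path):
    """Build a minimal environ the way a hypothetical server might,
    with a per-request holder for the user name to log."""
    logged = {'user': None}

    def log_username(name):
        # May be called several times per request; the last call wins.
        logged['user'] = name

    environ = {'PATH_INFO': path, 'wsgi.log_username': log_username}
    return environ, logged


def app(environ, start_response):
    setter = environ.get('wsgi.log_username')
    if setter is not None:        # treat the extension as optional
        setter('anonymous')
        setter('stephan')         # a later call overrides the earlier one
    start_response('200 OK', [])
    return [b'']


def copying_middleware(environ, start_response):
    # Even a copied environ carries the same callable object.
    return app(dict(environ), start_response)


environ, logged = make_server_environ('/')
copying_middleware(environ, lambda status, headers, exc_info=None: None)
```

Because the callable travels as a value inside environ, it survives the
copy, and middleware that wants control can replace the entry with a
wrapper or drop it before calling inward.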


From srichter at cosmos.phy.tufts.edu  Mon Jan 23 22:18:22 2006
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Mon, 23 Jan 2006 16:18:22 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
References: <20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
Message-ID: <200601231618.23058.srichter@cosmos.phy.tufts.edu>

On Monday 23 January 2006 16:15, Phillip J. Eby wrote:
> I'd suggest a callable under 'wsgi.log_username', that takes one argument.

Sounds good to me.

> It should be specified whether it requires ASCII or Unicode.

I don't care; I think ASCII is fine; we can have the application handle the 
encoding.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From ianb at colorstudy.com  Mon Jan 23 22:29:32 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 23 Jan 2006 15:29:32 -0600
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060123201255.GE31096@prometheusresearch.com>
References: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>	<43D3F824.4050903@colorstudy.com>
	<43D3B131.7080005@zope.com>	<43D3F824.4050903@colorstudy.com>	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
Message-ID: <43D54ABC.2010009@colorstudy.com>

Clark C. Evans wrote:
> Thanks Phillip!  
> 
> This clears it up for me. Although, I disagree with the 
> specification in this case; there does not seem to be a
> reason why middleware shouldn't be required to send the
> *same* environ dict along in subsequent calls. 

Paste already does this, for the N subrequest method.  This is done at 
least in paste.cascade, where we retry the request several times until 
something responds with a non-404.  Since it is common at least for 
subapplications to rewrite SCRIPT_NAME/PATH_INFO, you can't pass later 
objects the same dictionary as previous objects.  Well, I suppose you 
could update the one-and-only environ from a copy you made before 
sending the request on.  But anyway, it doesn't do that.
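
For readers unfamiliar with the pattern, a cascade of this kind might be
sketched as below (Python 3). This is an illustration under simplifying
assumptions, not the paste.cascade implementation: response bodies are
buffered into lists, close() on the sub-application iterables is ignored,
and each attempt receives its own copy of environ.

```python
def cascade(apps):
    """Return a WSGI app that tries each app in turn until one answers
    with something other than 404. Illustrative only."""

    def cascading_app(environ, start_response):
        for index, app in enumerate(apps):
            captured = {}

            def capture(status, headers, exc_info=None):
                captured['status'] = status
                captured['headers'] = headers

            # Each attempt gets its own copy, so path rewriting by one
            # sub-application cannot leak into the next attempt.
            body = list(app(dict(environ), capture))
            is_last = index == len(apps) - 1
            if not captured['status'].startswith('404') or is_last:
                start_response(captured['status'], captured['headers'])
                return body
        start_response('404 Not Found', [])  # only when apps is empty
        return [b'']

    return cascading_app


def missing(environ, start_response):
    environ['PATH_INFO'] = '/rewritten'      # damages only its own copy
    start_response('404 Not Found', [])
    return [b'']


def found(environ, start_response):
    start_response('200 OK', [])
    return [environ['PATH_INFO'].encode('ascii')]


statuses = []
result = cascade([missing, found])(
    {'PATH_INFO': '/page'},
    lambda status, headers, exc_info=None: statuses.append(status))
```

The second application still sees the original PATH_INFO, which is the
reason the attempts cannot simply share one dictionary.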

I'd like to do this same thing (N subrequests) sometime in the future 
for server-side HTML Overlays.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From cce at clarkevans.com  Tue Jan 24 04:15:48 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Mon, 23 Jan 2006 22:15:48 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
References: <20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
Message-ID: <20060124031548.GA36152@prometheusresearch.com>

On Mon, Jan 23, 2006 at 04:15:06PM -0500, Phillip J. Eby wrote:
| At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
| > Specify a new environment variable called 'wsgi.user' (or something 
| > similar) that is a mutable and can be written several times. Only 
| > the last write (before the output is sent) is important. By default
| > the variable is set to ``None`` for not set.

Why not ``wsgi.context`` or something like that which defaults to 
an empty dictionary.  Then you can put what ever you want in it; 
``wsgi.user`` just seems to be a bit too specific.

| I'd suggest a callable under 'wsgi.log_username', that takes one 
| argument.

I think this is way too specific; it doesn't address the general
problem: how do you pass information back up the middleware stack.

| It should be specified whether it requires ASCII or Unicode.

Why cannot it just accept a Python string?  You can always check
if it is Unicode or not.

Best,

Clark

From cce at clarkevans.com  Tue Jan 24 04:30:53 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Mon, 23 Jan 2006 22:30:53 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D54ABC.2010009@colorstudy.com>
References: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<43D3F824.4050903@colorstudy.com> <43D3B131.7080005@zope.com>
	<43D3F824.4050903@colorstudy.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<43D54ABC.2010009@colorstudy.com>
Message-ID: <20060124033053.GB36152@prometheusresearch.com>

I'm not convinced that we shouldn't just require WSGI middleware
to forward on the *exact* same ``environ`` as it receives.

On Mon, Jan 23, 2006 at 03:29:32PM -0600, Ian Bicking wrote:
| Paste already does this, for the N subrequest method.  This is done at 
| least in paste.cascade, where we retry the request several times until 
| something responds with a non-404.

Yes; this is exactly the sort of edge case that I think will elude just
about any "general" solution.  How would Phillip's recent suggestion,
for example, a ``wsgi.log_username`` work in this situation?

Assertion:

  If a WSGI middleware component _isn't_ passing on the actual
  ``environ`` given by its parent, then it is an edge case where
  this problem can't be solved anyway.

| I suppose you could update the one-and-only environ from a copy 
| you made before sending the request on.  But anyway, it doesn't do that.

Yes, you could for this case _copy_ the ``environ`` and then when
one of the cascade applications returns, you can update the 
original ``environ`` with the saved copy. 

Suggested Wording:

   A WSGI Middleware component (that is, one that receives a 
   request and forwards it on to another component) must forward 
   on the *exact* same ``environ`` dict that it received.

| I'd like to do this same thing (N subrequests) sometime in the future 
| for server-side HTML Overlays.

The above restriction won't hurt these use-cases (which you must
be careful about anyway), and it addresses the current issue:
how does one pass information back up the call chain.

Best,

Clark

From srichter at cosmos.phy.tufts.edu  Tue Jan 24 13:31:35 2006
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Tue, 24 Jan 2006 07:31:35 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060124031548.GA36152@prometheusresearch.com>
References: <20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060124031548.GA36152@prometheusresearch.com>
Message-ID: <200601240731.35407.srichter@cosmos.phy.tufts.edu>

On Monday 23 January 2006 22:15, Clark C. Evans wrote:
> On Mon, Jan 23, 2006 at 04:15:06PM -0500, Phillip J. Eby wrote:
> | At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
> | > Specify a new environment variable called 'wsgi.user' (or something
> | > similar) that is a mutable and can be written several times. Only
> | > the last write (before the output is sent) is important. By default
> | > the variable is set to ``None`` for not set.
>
> Why not ``wsgi.context`` or something like that which defaults to
> an empty dictionary.  Then you can put what ever you want in it;
> ``wsgi.user`` just seems to be a bit too specific.

But if you use a dictionary you need to specify all allowed keys. The server 
needs to know from the standard (WSGI) what it is looking for. The twisted 
guys and we have thought about other possible data for logging and we could 
not come up with any. If you have real use cases for other data, please let 
me know.

> | I'd suggest a callable under 'wsgi.log_username', that takes one
> | argument.
>
> I think this is way too specific; it doesn't address the general
> problem: how do you pass information back up the middleware stack.

You cannot address this issue generally. The point of WSGI is that it is a 
well-defined API that specifies exactly what to expect. Let's take your 
suggestion. Let's say there is a dictionary that can contain anything. Zope 3 
(acting as the application) decides to put a key named "user" into the 
dictionary. But Twisted (acting as the server) looks for "remote-user". Since 
the key is not specified in the specification, we have gained absolutely 
nothing.

> | It should be specified whether it requires ASCII or Unicode.
>
> Why cannot it just accept a Python string?  You can always check
> if it is Unicode or not.

Because encoding might be arbitrary. It has to be clearly specified in the 
specs what to expect.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From pje at telecommunity.com  Tue Jan 24 17:33:56 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jan 2006 11:33:56 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060124031548.GA36152@prometheusresearch.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>

At 10:15 PM 1/23/2006 -0500, Clark C. Evans wrote:
>On Mon, Jan 23, 2006 at 04:15:06PM -0500, Phillip J. Eby wrote:
>| At 03:36 PM 1/23/2006 -0500, Stephan Richter wrote:
>| > Specify a new environment variable called 'wsgi.user' (or something
>| > similar) that is a mutable and can be written several times. Only
>| > the last write (before the output is sent) is important. By default
>| > the variable is set to ``None`` for not set.
>
>Why not ``wsgi.context`` or something like that which defaults to
>an empty dictionary.  Then you can put what ever you want in it;
>``wsgi.user`` just seems to be a bit too specific.

We want to be specific, as it wouldn't be a very good specific-ation 
otherwise.  :)


>| I'd suggest a callable under 'wsgi.log_username', that takes one
>| argument.
>
>I think this is way too specific; it doesn't address the general
>problem: how do you pass information back up the middleware stack.

There is no "general problem" which anyone is trying to solve.  The use 
case requested by Jim and Stephan is quite specific.


>| It should be specified whether it requires ASCII or Unicode.
>
>Why cannot it just accept a Python string?  You can always check
>if it is Unicode or not.

I'm pointing out that the use case under consideration isn't specific 
*enough* yet.  Do people's log files support unicode?  Do the 
authentication systems?  This hasn't been made clear, and it should be.


From jim at zope.com  Tue Jan 24 17:37:25 2006
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jan 2006 11:37:25 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>	<20060123201255.GE31096@prometheusresearch.com>	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>	<20060123201255.GE31096@prometheusresearch.com>	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
Message-ID: <43D657C5.5020900@zope.com>

Phillip J. Eby wrote:
...
> I'm pointing out that the use case under consideration isn't specific 
> *enough* yet.  Do people's log files support unicode?  Do the 
> authentication systems?  This hasn't been made clear, and it should be.

I agree.  I think we should be guided by the common log file format.
Log data are written to files and are thus not unicode. The user
info is *just* documentation, so it is really up to the app what to
show imo.  Further, because the common log file format is space
delimited, the user info cannot contain spaces.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From pje at telecommunity.com  Tue Jan 24 17:41:04 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jan 2006 11:41:04 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060124033053.GB36152@prometheusresearch.com>
References: <43D54ABC.2010009@colorstudy.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<43D3F824.4050903@colorstudy.com> <43D3B131.7080005@zope.com>
	<43D3F824.4050903@colorstudy.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<43D54ABC.2010009@colorstudy.com>
Message-ID: <5.1.1.6.0.20060124113426.0439ca70@mail.telecommunity.com>

At 10:30 PM 1/23/2006 -0500, Clark C. Evans wrote:
>Suggested Wording:
>
>    A WSGI Middleware component (that is, one that receives a
>    request and forwards it on to another component) must forward
>    on the *exact* same ``environ`` dict that it received.

-1.  This invalidates current WSGI design principles and can't go in any 
WSGI 1.x version, and even for a WSGI 2.x it would need a heck of a lot 
more justification.

Note that WSGI is an HTTP analogue, it is not a web server API.  In the 
context of this discussion, I'm now more convinced than ever that the right 
place to communicate information back to the server is via response 
headers, and that's how this use case should be addressed in WSGI 1.1, as 
it maintains the functional composition of middleware better than an 
environ-supplied extension.  In WSGI the design principle needs to be 
"Isolation beats cleanliness".
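
A response-header approach could look roughly like the following sketch
(Python 3). The header name 'X-Wsgi-Log-Username' and the HeaderLogger
class are assumptions made up for illustration; no such header is defined
by WSGI. The outermost layer reads the header for its log and strips it so
that it is never sent to the client.

```python
LOG_HEADER = 'x-wsgi-log-username'   # hypothetical name, not standardized


class HeaderLogger:
    """Outermost middleware (or server shim): reads the private header
    for the log and removes it from the outgoing response."""

    def __init__(self, app, log):
        self.app = app
        self.log = log

    def __call__(self, environ, start_response):
        def filtering_start_response(status, headers, exc_info=None):
            user = '-'
            kept = []
            for name, value in headers:
                if name.lower() == LOG_HEADER:
                    user = value          # last occurrence wins
                else:
                    kept.append((name, value))
            self.log.append((user, status))
            return start_response(status, kept, exc_info)
        return self.app(environ, filtering_start_response)


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('X-Wsgi-Log-Username', 'jim')])
    return [b'ok']


sent = []
log = []
stack = HeaderLogger(app, log)
stack({}, lambda status, headers, exc_info=None: sent.append(headers))
```

Intermediate middleware that copies or rewrites environ does not affect
this path, and middleware that wants to veto the value can simply rewrite
or drop the header on the way out.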


From ianb at colorstudy.com  Tue Jan 24 17:53:53 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 24 Jan 2006 10:53:53 -0600
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D657C5.5020900@zope.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>	<20060123201255.GE31096@prometheusresearch.com>	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>	<20060123201255.GE31096@prometheusresearch.com>	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com>
Message-ID: <43D65BA1.8030801@colorstudy.com>

Jim Fulton wrote:
> Phillip J. Eby wrote:
> ...
> 
>>I'm pointing out that the use case under consideration isn't specific 
>>*enough* yet.  Do people's log files support unicode?  Do the 
>>authentication systems?  This hasn't been made clear, and it should be.
> 
> 
> I agree.  I think we should be guided by the common log file format.
> Log data are written to files and are thus not unicode. The user
> info is *just* documentation, so it is really up to the app what to
> show imo.  Further, because the common log file format is space
> delimited, the user info cannot contain spaces.

In theory the log file could be encoded in some way, and could include 
spaces in usernames.  Maybe in this case unicode should be allowed, and 
spaces allowed, with the caveat that the log file may not represent 
this?  So for a common log:

   if isinstance(username, unicode):
       username = username.encode('ascii', 'replace')
   username = username.replace(' ', '')

It is up to the consumer to handle any unicode, and to maintain the 
integrity of their log format regardless of input.
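
On a current Python 3, where str is already Unicode, the same idea might be
sketched as below. The function name clf_username is hypothetical, and the
choice to drop whitespace and substitute '?' for non-ASCII characters
simply mirrors the snippet above rather than any agreed rule.

```python
def clf_username(username):
    """Return a user field safe for a space-delimited, ASCII log line."""
    if username is None or username == '':
        return '-'                     # common log format marks "no value" as '-'
    if isinstance(username, bytes):
        username = username.decode('ascii', 'replace')
    # Non-ASCII characters become '?'; decode back to str for writing.
    ascii_name = username.encode('ascii', 'replace').decode('ascii')
    # Remove every run of whitespace; fall back to '-' if nothing is left.
    return ''.join(ascii_name.split()) or '-'
```

A consumer that prefers to keep the information could url-encode the name
instead of replacing characters; the point is only that the log writer,
not the application, enforces the log format.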

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From michal at sabren.com  Tue Jan 24 18:35:48 2006
From: michal at sabren.com (Michal Wallace)
Date: Tue, 24 Jan 2006 12:35:48 -0500 (EST)
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D657C5.5020900@zope.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com>
Message-ID: <Pine.LNX.4.62.0601241213540.7966@hydrogen.sabren.com>

On Tue, 24 Jan 2006, Jim Fulton wrote:

> Phillip J. Eby wrote:
> ...
> > I'm pointing out that the use case under consideration isn't specific 
> > *enough* yet.  Do people's log files support unicode?  Do the 
> > authentication systems?  This hasn't been made clear, and it should be.
>
> I agree.  I think we should be guided by the common log file format.
> Log data are written to files and are thus not unicode. The user
> info is *just* documentation, so it is really up to the app what to
> show imo.  Further, because the common log file format is space
> delimited, the user info cannot contain spaces.


I'm curious. Suppose that a subset of your site
or application requires users to log in through
an HTML form, but that for some reason there's 
also a general HTTP authentication username and 
password over the whole system.

Which username should the web server log?

I think you guys are trying to solve this at
the wrong level. This problem should be 
handled by the web server itself. Even if you
want to use your own custom forms, there is
already a really nice solution to this problem
for apache:

http://aspn.activestate.com/ASPN/CodeDoc/Apache-AuthCookie/AuthCookie.html

If a particular webserver doesn't allow this
sort of approach, then maybe the work should 
be done there?

Meanwhile, you can always do your own logging
for the events you actually want to record at
the application layer.

Maybe I just don't understand why this is 
important. Can someone (Jim) explain why this
is a requirement in the first place?

Sincerely,
 
Michal J Wallace
Sabren Enterprises, Inc.
-------------------------------------
contact: michal at sabren.com
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-------------------------------------


From cce at clarkevans.com  Tue Jan 24 19:55:34 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Tue, 24 Jan 2006 13:55:34 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D65BA1.8030801@colorstudy.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
Message-ID: <20060124185534.GC43166@prometheusresearch.com>

On Tue, Jan 24, 2006 at 11:33:56AM -0500, Phillip J. Eby wrote:
| > I think this is way too specific; it doesn't address the general
| > problem: how do you pass information back up the middleware stack.

| There is no "general problem" which anyone is trying to solve.  The use 
| case requested by Jim and Stephan is quite specific.

Yes there is; it is passing information from applications back to 
middleware (or the server), you even talk about it yourself:

  | I'm now more convinced than ever that the right place to 
  | communicate information back to the server is via response
  | headers, and that's how this use case should be addressed in 
  | WSGI 1.1, as it maintains the functional composition of middleware 

If this is the solution, great.  However, I really don't like the
``environ`` options out there w/ mutable objects.  Can you please
then specify a *general* mechanism for headers that won't be 
sent to the client?  My server needs to know which ones to strip.
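
A rough sketch of what such stripping could look like on the server or
middleware side.  WSGI defines no convention for server-only headers, so
the "X-WSGI-Internal-" prefix and the function names here are purely
assumptions for illustration:

```python
# Hypothetical prefix; the WSGI spec defines no such convention.
INTERNAL_PREFIX = "x-wsgi-internal-"


def split_headers(headers):
    """Separate server-only headers from those meant for the client."""
    internal, public = [], []
    for name, value in headers:
        if name.lower().startswith(INTERNAL_PREFIX):
            internal.append((name, value))
        else:
            public.append((name, value))
    return internal, public


def strip_internal(app, on_internal):
    """Middleware: pass internal headers to on_internal, forward the rest."""
    def wrapper(environ, start_response):
        def filtered_start(status, headers, exc_info=None):
            internal, public = split_headers(headers)
            on_internal(internal)
            return start_response(status, public, exc_info)
        return app(environ, filtered_start)
    return wrapper
```

The point of the sketch is only that the stripping rule has to be agreed
on somewhere; without a shared rule each server would guess differently.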

On Tue, Jan 24, 2006 at 10:53:53AM -0600, Ian Bicking wrote:
| Jim Fulton wrote:
| > Phillip J. Eby wrote:
| >>I'm pointing out that the use case under consideration isn't specific 
| >>*enough* yet.  Do people's log files support unicode?  Do the 
| >>authentication systems?  This hasn't been made clear, and it should be.
| > 
| > I agree.  I think we should be guided by the common log file format.
| > Log data are written to files and are thus not unicode. The user
| > info is *just* documentation, so it is really up to the app what to
| > show imo.  Further, because the common log file format is space
| > delimited, the user info cannot contain spaces.
| 
| It is up to the consumer to handle any unicode, and to maintain the 
| integrity of their log format regardless of input.

I second Ian's opinion.  I have to log Russian user names and web pages.
Internally I use Unicode strings, and when writing to common log file
format, I simply urlencode the string.  This takes care of spaces and
non-ASCII code points.
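
A minimal sketch of that urlencoding step, assuming Python 3's
urllib.parse (which postdates this thread); the helper name log_safe is
made up for illustration:

```python
from urllib.parse import quote, unquote


def log_safe(user):
    """Percent-encode a user name for a space-delimited log field."""
    # safe="" also encodes "/"; non-ASCII text is encoded as UTF-8 first.
    return quote(user, safe="")


encoded = log_safe("Иван Петров")
# The result is plain ASCII with no spaces, and unquote() reverses it.
```

Because the encoding is reversible, a log analysis tool can recover the
original name with unquote().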

Thus, the WSGI specification should not restrict the character set,
since some other logging middleware might want to use XML(UTF-8) or
write each access to a database that is unicode aware.  The value
should be *any* python string object; let the logging module determine
the type and encoding and handle it as needed.

On Tue, Jan 24, 2006 at 12:35:48PM -0500, Michal Wallace wrote:
| I think you guys are trying to solve this at the wrong level. 
| This problem should be handled by the web server itself.

People are writing features like this as specific middleware
components so that you don't have a bloated web-server.  

| Maybe I just don't understand why this is important. Can 
| someone (Jim) explain why this is a requirement in the 
| first place?

Well, the general problem is how to communicate information from
applications back to the middleware or server.  This is one use
case; there are others, I am sure.

On Tue, Jan 24, 2006 at 07:31:35AM -0500, Stephan Richter wrote:
| > Why not ``wsgi.context`` or something like that which defaults to
| > an empty dictionary.  Then you can put what ever you want in it;
| > ``wsgi.user`` just seems to be a bit too specific.
| 
| But if you use a dictionary you need to specify all allowed keys.

Fine, use REMOTE_USER as this is a CGI standard.

| The twisted guys and us have thought about other possible data for
| logging and we could not come up with any. If you have real use cases
| for other data, please let me know.

Depends on the logging level:

0. Trace messages
1. The database instance name used 
2. A sequence of SQL queries executed
3. Files that were created by the request
etc.

I can think of a lot of things I might want to log (and in fact do log);
that said, I'm not interested in this specific application -- I want
the general case spelled-out.  How do I pass information from the
application back to the middleware reliably?

| > Why cannot it just accept a Python string?  You can always check
| > if it is Unicode or not.
| 
| Because encoding might be arbitrary. It has to be clearly specified in the 
| specs what to expect.

It's very easy for the logging module to check what it has and act
intelligently.   We aren't using C89...

Best,

Clark

From herb at dynamic-solutions.com  Tue Jan 24 19:12:03 2006
From: herb at dynamic-solutions.com (Herb Lainchbury)
Date: Tue, 24 Jan 2006 10:12:03 -0800
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <Pine.LNX.4.62.0601241213540.7966@hydrogen.sabren.com>
Message-ID: <00b401c62111$b3abf6f0$6601a8c0@hercules>

It seems to me that applications should be concerned only about what
applications need to be concerned about.  If an application happens to be a
login screen for a system, then that application will be concerned with
setting the userid somewhere.  

The fact that the middleware or the webserver wants to log that information
somewhere is none of the application's business.

The middleware is providing the service to the application so the middleware
gets to define the interface for those services.  One of those services is
the userid.  Most applications will just want to know the userid and can get
it by accessing wsgi.userid.   

Applications that need to modify the userid (like login & logout
applications) can do so by setting wsgi.userid.  For example: if the login
application decides that a user has entered sufficient credentials to
satisfy the authentication criteria then it just sets the wsgi.userid
attribute and it's done.
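
As a sketch only: 'wsgi.userid' is the key proposed in this thread, not
part of the WSGI spec, and the function names below are hypothetical.
Note that the middleware sees the value only if every layer in between
passes along the very same environ dict.

```python
def userid_logging(app, log):
    """Middleware: expose 'wsgi.userid' and record its final value."""
    def wrapper(environ, start_response):
        environ.setdefault("wsgi.userid", None)
        body = app(environ, start_response)
        # Only sees the value if inner layers kept this same dict.
        log.append(environ["wsgi.userid"])
        return body
    return wrapper


def login_app(environ, start_response):
    """Toy login application; the credentials check is omitted."""
    environ["wsgi.userid"] = "alice"
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"logged in"]
```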

Now, if the middleware wants to do something with that value (like log it),
great, but the application doesn't need to know that and need not be
concerned with it.

There may be other values that applications can set but userid is one that
many applications will need and I prefer the specific member for this
purpose.

BTW, if authentication is being done with HTTP then there's no need for the
login and logout applications and the middleware still has the
responsibility of providing the userid attribute for applications.  The
middleware sets that attribute based on the HTTP information available.

Herb Lainchbury
Dynamic Solutions Inc.
www.dynamic-solutions.com



From pje at telecommunity.com  Tue Jan 24 23:17:29 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jan 2006 17:17:29 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <Pine.LNX.4.62.0601241213540.7966@hydrogen.sabren.com>
References: <43D657C5.5020900@zope.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com>
Message-ID: <5.1.1.6.0.20060124171439.03e7a098@mail.telecommunity.com>

At 12:35 PM 1/24/2006 -0500, Michal Wallace wrote:
>Maybe I just don't understand why this is
>important. Can someone (Jim) explain why this
>is a requirement in the first place?

I'd like to know too, although the obvious argument is backward 
compatibility for people accustomed to ZServer as Zope migrates away from it.

I've personally never felt a need to feed this data back to the web server, 
probably because I'm so used to using FastCGI, which has no *way* to feed 
it back to the web server, and I prefer to look at the application's 
logs.  But Zope is a content management system, not an "application" in the 
sense I mean, so the same use cases don't necessarily apply.


From pje at telecommunity.com  Tue Jan 24 23:34:19 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jan 2006 17:34:19 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060124185534.GC43166@prometheusresearch.com>
References: <43D65BA1.8030801@colorstudy.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
Message-ID: <5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>

At 01:55 PM 1/24/2006 -0500, Clark C. Evans wrote:
>On Tue, Jan 24, 2006 at 11:33:56AM -0500, Phillip J. Eby wrote:
>| > I think this is way too specific; it doesn't address the general
>| > problem: how do you pass information back up the middleware stack.
>
>| There is no "general problem" which anyone is trying to solve.  The use
>| case requested by Jim and Stephan is quite specific.
>
>Yes there is; it is passing information from applications back to
>middleware (or the server), you even talk about it yourself:

That doesn't mean I admit it's a "general problem".  So far, only this one 
exceedingly obscure use case (whose only merit that I know of so far is 
backward compatibility) has presented a clear and present need for such 
communication at all.


>   | I'm now more convinced than ever that the right place to
>   | communicate information back to the server is via response
>   | headers, and that's how this use case should be addressed in
>   | WSGI 1.1, as it maintains the functional composition of middleware
>
>If this is the solution, great.  However, I really don't like the
>``environ`` options out there w/ mutable objects.  Can you please
>then specify a *general* mechanism for headers that won't be
>sent to the client?  My server needs to know which ones to strip.

For compatibility reasons, this would have to be a WSGI 1.1 feature.


>On Tue, Jan 24, 2006 at 10:53:53AM -0600, Ian Bicking wrote:
>| Jim Fulton wrote:
>| > Phillip J. Eby wrote:
>| >>I'm pointing out that the use case under consideration isn't specific
>| >>*enough* yet.  Do people's log files support unicode?  Do the
>| >>authentication systems?  This hasn't been made clear, and it should be.
>| >
>| > I agree.  I think we should be guided by the common log file format.
>| > Log data are written to files and are thus not unicode. The user
>| > info is *just* documentation, so it is really up to the app what to
>| > show imo.  Further, because the common log file format is space
>| > delimited, the user info cannot contain spaces.
>|
>| It is up to the consumer to handle any unicode, and to maintain the
>| integrity of their log format regardless of input.
>
>I second Ian's opinion.

I agree with it also - but you don't appear to, since you seem to be 
proposing something different in this next paragraph:

>Thus, the WSGI specification should not restrict the character set,
>since some other logging middleware might want to use XML(UTF-8) or
>write each access to a database that is unicode aware.  The value
>should be *any* python string object; let the logging module determine
>the type and encoding and handle it as needed.

+1 for ASCII strings or Unicode objects, but -1 on strings with arbitrary 
encoding.  That's precisely what we *don't* want.
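
One way a consumer might enforce that rule, sketched for Python 3, where
"Unicode object" corresponds to str and "string" to bytes.  The function
name is an assumption, not anything from the spec:

```python
def normalize_user(value):
    """Return user info as text, rejecting bytes of unknown encoding."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        # Raises UnicodeDecodeError for anything outside ASCII.
        return value.decode("ascii")
    raise TypeError("user info must be str or ASCII bytes")
```

Byte strings in some other encoding fail loudly here instead of being
silently misread.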


>Well, the general problem is how to communicate information from
>applications back to the middleware or server.  This is one use
>case; there are others, I am sure.

Please feel free to revisit this issue when you have some; in the meantime, 
declaring it a general problem doesn't make it so.  If a pattern emerges 
from the specific solution to multiple specific problems, then it will be 
worth looking at a general solution.  For now, there is only one specific 
problem and generalizing it would be premature, given the exceedingly 
narrow niche of the issue at hand: writing a user name into a server's 
common access log so that existing ZServer users will have backward 
compatibility.

By turning that narrowly-stated issue into a general problem, you're 
dissolving three dimensions of specificity at once: i.e., you're turning 
the problem into essentially "communicating something about anything to 
anybody", which no longer carries any useful information for making design 
tradeoffs, especially since you are not presenting any alternative use 
cases that present examples with different values along any of the 
generalized dimensions.  This is unsound design practice, and it simply 
leads to people jabbering misunderstandings at each other while they think 
they're communicating.  Let's stick to real-life use cases, please, not 
theoretical ones.  In the meantime, extension APIs as provided for by the 
existing PEP present an adequate ad hoc "upstream" communications facility.


From jim at zope.com  Tue Jan 24 23:34:38 2006
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jan 2006 17:34:38 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <Pine.LNX.4.62.0601241213540.7966@hydrogen.sabren.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com>
	<Pine.LNX.4.62.0601241213540.7966@hydrogen.sabren.com>
Message-ID: <43D6AB7E.5050905@zope.com>

Michal Wallace wrote:
> On Tue, 24 Jan 2006, Jim Fulton wrote:
> 
...

> Maybe I just don't understand why this is 
> important. Can someone (Jim) explain why this
> is a requirement in the first place?

We do our own authentication for lots of reasons, including:

- Zope can provide user and group management facilities that
   are convenient to use,

- Zope can integrate with external systems that haven't been integrated with
   the server,

- Zope can use authentication schemes that the server may not support.

History has shown us that many users find this useful.

If Zope performs authentication, then we'd like the authentication to show
up in the access logs.

People sometimes use Zope behind another web server, but often people
don't.  When they don't and are using Zope with ZServer (medusa) or Twisted,
then it should be possible to give ZServer or Twisted the information to log
appropriately.

If this isn't possible with WSGI, then we can write our own access logs.
I'd prefer not to have to do that because the times included in any
access logs we write won't be accurate, as the request as seen by the
web client will end later than the time we're done with the request.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From michal at sabren.com  Wed Jan 25 01:06:30 2006
From: michal at sabren.com (Michal Wallace)
Date: Tue, 24 Jan 2006 19:06:30 -0500 (EST)
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <43D6AB7E.5050905@zope.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com>
	<Pine.LNX.4.62.0601241213540.7966@hydrogen.sabren.com>
	<43D6AB7E.5050905@zope.com>
Message-ID: <Pine.LNX.4.62.0601241853001.9071@hydrogen.sabren.com>

On Tue, 24 Jan 2006, Jim Fulton wrote:

> Michal Wallace wrote:
>
> > Maybe I just don't understand why this is important. Can someone (Jim)
> > explain why this is a requirement in the first place?
> 
> We do our own authentication for lots of reasons, including:
... 
> History has shown us that many users find this useful.


No, I understand why you do your own authentication.
Simply having the ability to log out trumps HTTP 
authentication every time. 

What I'm trying to understand is the next thought in
the chain:
 
> If Zope performs authentication, then we'd like 
> the authentication to show up in the access logs.

Why do you want this? 
What do people do with the information?

To me it makes a lot more sense to log application-level
events: so-and-so tried to do this, etc... Whereas at
the web server log level, you're logging that so-and-so's 
browser requested a gif or a css file.

I'm not trying to argue here. I'm just trying to
understand what value you're getting out of the
logs. 

Sincerely,
 
Michal J Wallace
Sabren Enterprises, Inc.
-------------------------------------
contact: michal at sabren.com
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-------------------------------------


From cce at clarkevans.com  Wed Jan 25 03:42:27 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Tue, 24 Jan 2006 21:42:27 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
	<5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
Message-ID: <20060125024227.GB53631@prometheusresearch.com>

On Tue, Jan 24, 2006 at 05:34:19PM -0500, Phillip J. Eby wrote:
| By turning that narrowly-stated issue into a general problem, you're 
| dissolving three dimensions of specificity at once: i.e., you're turning 
| the problem into essentially "communicating something about anything to 
| anybody", which no longer carries any useful information for making 
| design tradeoffs, especially since you are not presenting any 
| alternative use cases that present examples with different values along 
| any of the generalized dimensions.  This is unsound design practice, and 
| it simply leads to people jabbering misunderstandings at each other 
| while they think they're communicating.  Let's stick to real-life use 
| cases, please, not theoretical ones.  In the meantime, extension APIs as 
| provided for by the existing PEP present an adequate ad hoc "upstream" 
| communications facility.

Nice sermon; now can we get back to the issue being discussed without
being argumentative and sanctimonious?

Another use case for passing information "up" the WSGI stack is where
you have two 'orthogonal' but decoupled modules, each of which has a
role/interface that could be implemented by an equivalent replacement:

  'paste.auth.digest'   
     This does authentication handling, sending a 401 back to 
     the server if REMOTE_USER is not already filled in.

  'paste.auth.cookie'
     This looks for a cookie and injects REMOTE_USER into the
     environ on the way "down"; it then looks for a REMOTE_USER
     to save via a cookie on the way "up".  It is a simple and
     elegant mechanism.

However, this implementation violates your vision of WSGI, since I am
assuming that the later stacks will pass along the current environment:

    On Mon, Jan 23, 2006 at 02:25:35PM -0500, Phillip J. Eby wrote:
    | You simply can't use environ values to communicate *up*
    | the WSGI stack, since at no level is it guaranteed you
    | have the "same" dictionary.  Response headers and
    | callables (or mutables) in the environ are the only way to
    | send stuff upstream.  You also have to be careful that any
    | upstream communication doesn't bypass something that
    | middleware should be allowed to control.
    |
    | In the case of authentication, it should be sufficient to
    | have a callable or mutable in the environ that can be
    | called or set more than once per request, i.e. it only
    | takes effect once the request is completed.  This allows
    | outer middleware to override what inner middleware or the
    | application set it to. 
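
The callable-in-environ idea quoted above might look roughly like this.
It is a sketch under assumptions: the 'example.set_user' key and all
function names are invented, and the value is reported when the
application call returns rather than when the response is fully sent.

```python
def auth_reporter(app, report):
    """Outer middleware: offers a setter and reports the final user."""
    def wrapper(environ, start_response):
        state = {"user": None}

        def set_user(name):
            # May be called several times per request; the last call wins.
            state["user"] = name

        environ["example.set_user"] = set_user  # hypothetical key
        body = app(environ, start_response)
        report(state["user"])  # takes effect only now
        return body
    return wrapper


def app(environ, start_response):
    # The callable still works when reached through a copied environ.
    set_user = dict(environ)["example.set_user"]
    set_user("guest")
    set_user("alice")
    start_response("200 OK", [])
    return [b"ok"]
```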

The problem with "fixing" my implementation with this approach is
that it unnecessarily couples cookie and digest modules.  I don't
think it is either necessary or a good idea to have decoupled modules
dependent on each other via a callable in the ``environ``.
So, I reject this approach, and I suggested that the same ``environ`` 
object should be passed all the way down the WSGI stack.
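
To make the dependency concrete, here is a minimal sketch.  It is not
the actual paste.auth.cookie or paste.auth.digest code; every function
name is made up.  It shows the REMOTE_USER pattern working when the same
dict is passed down and failing when an inner layer forwards a copy.

```python
def cookie_auth(app, saved):
    """Sketch of the pattern: read REMOTE_USER from environ on the way up."""
    def wrapper(environ, start_response):
        body = app(environ, start_response)
        # Works only if inner layers mutated this very dict.
        saved.append(environ.get("REMOTE_USER"))
        return body
    return wrapper


def copying_middleware(app):
    """An inner layer that forwards a copy of environ, as WSGI permits."""
    def wrapper(environ, start_response):
        return app(dict(environ), start_response)
    return wrapper


def digest_auth(environ, start_response):
    """Stand-in for an authenticator that fills in REMOTE_USER."""
    environ["REMOTE_USER"] = "alice"
    start_response("200 OK", [])
    return [b"ok"]
```

With cookie_auth(digest_auth, saved) the user name reaches the outer
layer; with copying_middleware in between, the outer layer sees nothing.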

    On Tue, Jan 24, 2006 at 11:41:04AM -0500, Phillip J. Eby wrote:
    | At 10:30 PM 1/23/2006 -0500, Clark C. Evans wrote:
    | >Suggested Wording:
    | >
    | >   A WSGI Middleware component (that is, one that receives a
    | >   request and forwards it on to another component) must forward
    | >   on the *exact* same ``environ`` dict that it received.
    | 
    | -1.  This invalidates current WSGI design principles and can't
    | go in any WSGI 1.x version, and even for a WSGI 2.x it would 
    | need a heck of a lot more justification.

Having the *same* ``environ`` passed all the way up the stack works --
nicely.  I've not yet seen a rationale why WSGI should not have this
limitation; Ian presented 2 use cases in paste where a different environ
is passed down the stack, however, both of his cases can be fixed (as I
demonstrated) to be compliant with the suggested wording above.

    On Tue, Jan 24, 2006 at 11:41:04AM -0500, Phillip J. Eby wrote:
    | Note that WSGI is an HTTP analogue, it is not a web server
    | API.  In the context of this discussion, I'm now more
    | convinced than ever that the right place to communicate
    | information back to the server is via response headers,
    | and that's how this use case should be addressed in WSGI
    | 1.1, as it maintains the functional composition of
    | middleware better than an environ-supplied extension.  In
    | WSGI the design principle needs to be "Isolation beats
    | cleanliness".

Well, regardless of what you intended of WSGI, it is a web server API;
and a particularly good low-level one.  The current usage I have of
using the ``environ`` to pass information *up* does provide a great deal
of isolation, and the solutions so far don't have the same advantages.

On Tue, Jan 24, 2006 at 05:34:19PM -0500, Phillip J. Eby wrote:
| >On Tue, Jan 24, 2006 at 10:53:53AM -0600, Ian Bicking wrote:
| >| It is up to the consumer to handle any unicode, and to maintain the
| >| integrity of their log format regardless of input.
| >
| >I second Ian's opinion.
| 
| +1 for ASCII strings or Unicode objects

Perfect.


Kind Regards,

Clark 

From pje at telecommunity.com  Wed Jan 25 05:37:30 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jan 2006 23:37:30 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060125024227.GB53631@prometheusresearch.com>
References: <5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
	<5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20060124231537.023970a8@mail.telecommunity.com>

At 09:42 PM 1/24/2006 -0500, Clark C. Evans wrote:
>Nice sermon; now can we get back to the issue being discussed without
>being argumentative and sanctimonious?

I didn't notice anyone being either of those.  As for the sermon, however, 
I'm glad you enjoyed it.  :)


>Another use case for passing information "up" the WSGI stack is where
>you have two 'orthogonal' but decoupled modules, each of which has a
>role/interface that could be implemented by an equivalent replacement:
>
>   'paste.auth.digest'
>      This does authentication handling, sending a 401 back to
>      the server if REMOTE_USER is not already filled in.
>
>   'paste.auth.cookie'
>      This looks for a cookie and injects REMOTE_USER into the
>      environ on the way "down"; it then looks for a REMOTE_USER
>      to save via a cookie on the way "up".  It is a simple and
>      elegant mechanism.

I don't see why an extension API placed in the environ, such as 
"paste.auth.set_user" doesn't satisfy this use case.


>The problem with "fixing" my implementation with this approach is
>that it unnecessarily couples cookie and digest modules.

You lost me.  How does it do that in any way that the 'REMOTE_USER' 
variable does not?


>So, I reject this approach, and I suggested that the same ``environ``
>object should be passed all the way down the WSGI stack.

And as I've already said, this simply isn't possible in WSGI 1.x, as it's 
not backward compatible.  That needs to be a 2.x revision, if it happens at 
all.


>Having the *same* ``environ`` passed all the way up the stack works --
>nicely.

So do extension APIs.


>   I've not yet seen a rationale why WSGI should not have this
>limitation;

Because WSGI is designed for functional composability.  Requiring environ 
passthrough breaks that by creating a global coupling.  If anything, in a 
2.x WSGI version I would lean towards getting rid of extension APIs and 
replacing them with some kind of additional response facility, as it's 
still too easy to create global coupling or to bypass middleware via 
extension APIs.


>Ian presented 2 use cases in paste where a different environ
>is passed down the stack, however, both of his cases can be fixed (as I
>demonstrated) to be compliant with the suggested wording above.

So we can make it harder for people to write middleware, in order to make 
it easier for people to introduce global coupling?  That doesn't sound like 
a useful tradeoff -- certainly not one that overcomes the cost of changing 
the spec.


>Well, regardless of what you intended of WSGI, it is a web server API;
>and a particularly good low-level one.  The current usage I have of
>using the ``environ`` to pass information *up* does provide a great deal
>of isolation, and the solutions so far don't have the same advantages.

Not so.  It's just as easy to create a 'paste.remote_user' environ key that 
contains a 1-element list with a value in it, if you insist on having 
global coupling.  That works today with the existing spec and likely always 
will, is trivial to implement, and requires no "fixing" of existing 
middleware that isn't broken.
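
A sketch of that 1-element-list technique, with hypothetical function
names.  The inner application deliberately works on a shallow copy of
environ to show that the list, unlike a plain string value, is still
shared with the outer middleware:

```python
def remote_user_holder(app, log):
    """Outer middleware: put a mutable holder in environ, read it after."""
    def wrapper(environ, start_response):
        holder = [None]
        environ["paste.remote_user"] = holder
        body = app(environ, start_response)
        log.append(holder[0])
        return body
    return wrapper


def inner_app(environ, start_response):
    shallow = dict(environ)  # simulate a layer that forwards a copy
    shallow["paste.remote_user"][0] = "alice"
    start_response("200 OK", [])
    return [b"ok"]
```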


From michal at sabren.com  Wed Jan 25 06:41:01 2006
From: michal at sabren.com (Michal Wallace)
Date: Wed, 25 Jan 2006 00:41:01 -0500 (EST)
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060125024227.GB53631@prometheusresearch.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
	<5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
	<20060125024227.GB53631@prometheusresearch.com>
Message-ID: <Pine.LNX.4.62.0601250029100.19269@hydrogen.sabren.com>

On Tue, 24 Jan 2006, Clark C. Evans wrote:

>     On Mon, Jan 23, 2006 at 02:25:35PM -0500, Phillip J. Eby wrote:
>     | You simply can't use environ values to communicate *up*
>     | the WSGI stack, since at no level is it guaranteed you
>     | have the "same" dictionary.  Response headers and
>     | callables (or mutables) in the environ are the only way to
>     | send stuff upstream.  You also have to be careful that any
>     | upstream communication doesn't bypass something that
>     | middleware should be allowed to control.

> So, I reject this approach, and I suggested that the same ``environ`` 
> object should be passed all the way down the WSGI stack.

Unfortunately, if you require it to be the exact same 
*object* then you're making the requirement that 
everything in the stack happens in the same process, 
on the same machine. 

That means you can't distribute the magic over xml-rpc or SOAP
or some other protocol, and you might want to do that if you're
using a load balancing feature or want part of the system
to run as a different user.

I suppose you could pass *copies* around, either of
the whole dictionary or just certain values... (maybe?)
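
A small illustration of the problem, using pickle as a stand-in for
whatever serialization an XML-RPC or SOAP hop would do (an assumption
made only to keep the example self-contained):

```python
import pickle


def send_over_wire(environ):
    """Stand-in for an RPC hop: the receiver gets a deserialized copy."""
    return pickle.loads(pickle.dumps(environ))


local = {"REMOTE_USER": "", "holder": [None]}
remote = send_over_wire(local)

# Changes made on the far side never reach the original dict,
# not even through a mutable value stored inside it.
remote["REMOTE_USER"] = "alice"
remote["holder"][0] = "alice"
```

After this runs, local is unchanged, so any copy-based scheme needs an
explicit step to send the changed values back.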

Sincerely,
 
Michal J Wallace
Sabren Enterprises, Inc.
-------------------------------------
contact: michal at sabren.com
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-------------------------------------


From cce at clarkevans.com  Wed Jan 25 07:53:38 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed, 25 Jan 2006 01:53:38 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060124231537.023970a8@mail.telecommunity.com>
References: <20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
	<5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
	<5.1.1.6.0.20060124231537.023970a8@mail.telecommunity.com>
Message-ID: <20060125065338.GA93754@prometheusresearch.com>

On Wed, Jan 25, 2006 at 12:41:01AM -0500, Michal Wallace wrote:
| Unfortunately, if you require it to be the exact same 
| *object* then you're making the requirement that 
| everything in the stack happens in the same process, 
| on the same machine. 

Correct.  Phillip's extension APIs approach has the same short-coming;
it does seem that using response headers is the only sane way to go
about solving this problem.

| That means you can't distribute the magic over xml-rpc or SOAP
| or some other protocol, and you might want to do that if you're
| using a load balancing feature or want part of the system
| to run as a different user.

In other words, each WSGI component in a stack should, ideally, not be
dependent upon mutable objects and should only use values that can be
passed by value. It's additional work; but I'll buy that one -- I just
don't buy the idea that extension APIs are superior to just requiring the
``environ`` be constant through a given request.

This seems to be where Phillip is headed:

    On Tue, Jan 24, 2006 at 11:37:30PM -0500, Phillip J. Eby wrote:
    | WSGI is designed for functional composability.  Requiring
    | environ passthrough breaks that by creating a global coupling.
    | If anything, in a 2.x WSGI version I would lean towards getting
    | rid of extension APIs and replacing them with some kind of
    | additional response facility, as it's still too easy to create
    | global coupling or to bypass middleware via extension APIs.


On Tue, Jan 24, 2006 at 11:37:30PM -0500, Phillip J. Eby wrote:
| I don't see why an extension API placed in the environ, such as 
| "paste.auth.set_user" doesn't satisfy this use case.

I must not have explained the modules clearly enough; sorry for the
repetition, but let me take another stab at it.  

I have several authentication modules, one for HTML form authentication,
basic, digest, and quite a few others.  The function of these modules is
to ensure that environ['REMOTE_USER'] exists.  If a remote user is
already provided, they are a no-op.  Otherwise, they do what is
necessary (a 401, 302, returning an HTML form, etc.) in order to
get a remote user and fill in the environ.

Then I have a class of restoration modules, one which uses a signed key,
and another one that does path re-writing.  These modules look to see if
they have enough information to fill in a REMOTE_USER, if not, they are
a no-op on the way in.  On the way out, however, they *look* at the 
``environ`` to see if REMOTE_USER was set -- if it was set they do 
what ever they need to *save* this information.

Hence, the interface between these modules is simply using the
well-understood CGI variable ``REMOTE_USER``.  They can be used
independently of each other, and in creative combinations.
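
A minimal sketch of one such authentication module, assuming HTTP basic
authentication and an in-memory credential store. The names
``require_remote_user``, ``check_basic`` and ``USERS`` are hypothetical and
are not taken from any existing package; only the ``REMOTE_USER``
convention comes from the description above.

```python
import base64

USERS = {'alice': 'secret'}  # hypothetical credential store


def check_basic(header):
    """Return the user name from a Basic Authorization header, or None."""
    scheme, _, payload = header.partition(' ')
    if scheme.lower() != 'basic':
        return None
    try:
        decoded = base64.b64decode(payload).decode('utf-8')
    except (ValueError, UnicodeDecodeError):
        return None
    user, _, password = decoded.partition(':')
    return user if USERS.get(user) == password else None


def require_remote_user(application, realm='example'):
    """Hypothetical middleware: make sure environ['REMOTE_USER'] exists."""
    def middleware(environ, start_response):
        if environ.get('REMOTE_USER'):
            # A remote user is already provided, so this module is a no-op.
            return application(environ, start_response)
        user = check_basic(environ.get('HTTP_AUTHORIZATION', ''))
        if user is None:
            # No usable credentials: challenge with a 401.
            start_response('401 Unauthorized',
                           [('WWW-Authenticate', 'Basic realm="%s"' % realm),
                            ('Content-Type', 'text/plain')])
            return [b'authentication required\n']
        environ['REMOTE_USER'] = user
        return application(environ, start_response)
    return middleware


def app(environ, start_response):
    """Downstream application that only looks at REMOTE_USER."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['REMOTE_USER'].encode('utf-8')]
```

Because the only contract is ``REMOTE_USER``, a form-based or digest
module could replace this one without the downstream application noticing.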

| You lost me.  How does it do that in any way that the 'REMOTE_USER' 
| variable does not?

Let's talk about both sorts of modules independently.  First, the CGI
variable 'REMOTE_USER' is already well documented; and the goal of the
authentication modules is simple -- fill in that environment variable.
Your approach requires that an additional activity/burden is imposed on
these sorts of modules.

I agree that the cookie module isn't quite as straight-forward, but
it isn't that bad.  Since the authentication modules are already
filling in the 'REMOTE_USER' to meet the expectations of standard
software components, it makes sense to obtain that information
directly from the environment.

In summary, I think extension APIs are more brittle and are a 
poor substitute for just using a shared environ both up and
down the WSGI stack.

| >So, I reject this approach, and I suggested that the same ``environ``
| >object should be passed all the way down the WSGI stack.
| 
| And as I've already said, this simply isn't possible in WSGI 1.x, as 
| it's not backward compatible.  That needs to be a 2.x revision, if it 
| happens at all.

The WSGI middleware components that actually create their own environ
are few and far between.  This is an uncommon edge case.

| >Ian presented 2 use cases in paste where a different environ
| >is passed down the stack, however, both of his cases can be fixed (as I
| >demonstrated) to be compliant with the suggested wording above.
| 
| So we can make it harder for people to write middleware, in order to 
| make it easier for people to introduce global coupling?  That doesn't 
| sound like a useful tradeoff -- certainly not one that overcomes the 
| cost of changing the spec.


The change needed is trivial and minor compared to most other things you
have to get correct while writing WSGI middleware; and given the
relative immaturity of WSGI at this point (especially in edge cases like
this), I doubt it is the problem that you make it out to be.

In summary; I think that a response-headers approach as proposed by
Phillip is the best (but higher overhead) approach.  However, I disagree
that some sort of extension API is preferable to just keeping the
``environ`` constant throughout the request.

Best,

Clark

From jim at zope.com  Wed Jan 25 12:15:13 2006
From: jim at zope.com (Jim Fulton)
Date: Wed, 25 Jan 2006 06:15:13 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <Pine.LNX.4.62.0601241853001.9071@hydrogen.sabren.com>
References: <5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com>
	<Pine.LNX.4.62.0601241213540.7966@hydrogen.sabren.com>
	<43D6AB7E.5050905@zope.com>
	<Pine.LNX.4.62.0601241853001.9071@hydrogen.sabren.com>
Message-ID: <43D75DC1.7060507@zope.com>

Michal Wallace wrote:
> On Tue, 24 Jan 2006, Jim Fulton wrote:
> 
> 
>>Michal Wallace wrote:
>>
>>
>>>Maybe I just don't understand why this is important. Can someone (Jim)
>>>explain why this
>>>is a requirement in the first place?
>>
>>We do our own authentication for lots of reasons, including:
> 
> ... 
> 
>>History has shown us that many users find this useful.
> 
> 
> 
> No, I understand why you do your own authentication.
> Simply having the ability to log out trumps HTTP 
> authentication every time. 
> 
> What I'm trying to understand is the next thought in
> the chain:
>  
> 
>>If Zope performs authentication, then we'd like 
>>the authentication to show up in the access logs.
> 
> 
> Why do you want this? 
> What do people do with the information?

Ask the authors of the Apache common log format.

When you see an entry in the access log, it is often useful to
know:

- Was it a request from an anonymous user?

- If not who made the request?

Zope 3.2, which uses WSGI exclusively for HTTP requests no
longer has this information and we have received numerous
complaints.

> To me it makes a lot more sense to log application-level
> events: so-and-so tried to do this, etc... Whereas at
> the web server log level, you're logging that so-and-so's 
> browser requested a gif or a css file.

We are also logging requests that change application state.
For these, some indication of who performed the action is
important.  Or you might be logging a request in which
someone is downloading information that requires login,
perhaps because someone had to pay for the privilege.
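
To make the logging use case concrete, here is a rough sketch of building
a common-log-format line from a WSGI ``environ``. The function name
``common_log_line`` and its parameters are hypothetical; the point is only
that the third field is empty ("-") unless something has told the logger
who the user is.

```python
def common_log_line(environ, status, size, timestamp):
    """Format one access-log line: host ident authuser [date] "request" status bytes."""
    # The authuser field is "-" for anonymous requests.
    user = environ.get('REMOTE_USER') or '-'
    request = '%s %s %s' % (environ.get('REQUEST_METHOD', '-'),
                            environ.get('PATH_INFO', '/'),
                            environ.get('SERVER_PROTOCOL', 'HTTP/1.0'))
    return '%s - %s [%s] "%s" %s %s' % (
        environ.get('REMOTE_ADDR', '-'), user, timestamp, request,
        status, size if size else '-')
```

If the application authenticates the user itself and the server never
sees ``REMOTE_USER``, every line comes out with "-" in the user field.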

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From pje at telecommunity.com  Wed Jan 25 18:17:29 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 25 Jan 2006 12:17:29 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <20060125065338.GA93754@prometheusresearch.com>
References: <5.1.1.6.0.20060124231537.023970a8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
	<5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
	<5.1.1.6.0.20060124231537.023970a8@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20060125121126.04130c78@mail.telecommunity.com>

At 01:53 AM 1/25/2006 -0500, Clark C. Evans wrote:
>Hence, the interface between these modules is simply using the
>well-understood CGI variable ``REMOTE_USER``.  They can be used
>independently of each other, and in creative combinations.

If each middleware or application does this:

     remote_user = environ.setdefault('paste.remote_user', [])

And then uses the contents of that list as the thing to check or modify, 
then you will get the exact same result as the "pass the same environ" 
approach, except that it's actually compatible with PEP 333, as opposed to 
relying on implementation accidents.  This doesn't seem especially 
difficult to me.
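
A small sketch of why the shared list works even when some middleware
hands a *different* environ dictionary downstream. Only the
``paste.remote_user`` key comes from the text above; the helper names
(``set_user``, ``get_user``, ``copying_middleware``, ``logger``) are
made up for illustration.

```python
def set_user(environ, user):
    """Record the user in the shared list instead of rebinding a key."""
    remote_user = environ.setdefault('paste.remote_user', [])
    remote_user[:] = [user]


def get_user(environ):
    remote_user = environ.setdefault('paste.remote_user', [])
    return remote_user[0] if remote_user else None


def copying_middleware(application):
    """Middleware that passes a copy of environ down, as PEP 333 allows."""
    def middleware(environ, start_response):
        sub_environ = dict(environ)  # shallow copy: the list object is shared
        return application(sub_environ, start_response)
    return middleware


def logger(application, log):
    """Upstream component that wants to know the user after the call."""
    def middleware(environ, start_response):
        # Create the list *before* calling downstream so copies share it.
        environ.setdefault('paste.remote_user', [])
        result = application(environ, start_response)
        log.append(get_user(environ))
        return result
    return middleware


def app(environ, start_response):
    set_user(environ, 'alice')
    start_response('200 OK', [])
    return [b'ok']
```

Setting ``environ['REMOTE_USER']`` inside ``app`` would be lost at the
copy; mutating the list is visible upstream because both dictionaries
refer to the same list object.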


>The WSGI middleware components that actually create their own environ
>are few and far between.  This is an uncommon edge case.

Composability of applications is a critical requirement for WSGI 
middleware.  It doesn't matter how uncommon it is.  Even if there were 
*zero* implementations of such middleware right now, that principle would 
take precedence, meaning you'd have to have a proposal that would preserve 
composability.  Right now, you haven't described a way to do that without 
introducing temporal coupling (or worse) among subrequests.


From ianb at colorstudy.com  Wed Jan 25 21:55:22 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 25 Jan 2006 14:55:22 -0600
Subject: [Web-SIG] WSGI extension spec/doc
Message-ID: <43D7E5BA.3060504@colorstudy.com>

I'm thinking that we should start a less-formal spec about WSGI 
conventions.  Here's some initial things:

* Username callback, per recent discussion

* A key to suppress the catching of unexpected errors (e.g., when 
running in a test, you want exceptions to go all the way up)

* For internal redirects, a way to indicate the previous WSGI request 
environment

* A key for the session id, per Ben's email

* Some CGIish variables.  We all know REMOTE_USER (though I've seen 
LOGIN_USER too, IIS?).  If I want to put user group information into the 
environment as a simple string, where would that go?  I think there's 
other ones like this too.  Some of these are just collecting current 
knowledge about CGI-like environments that we're likely to encounter.

In each of these cases, you don't *need* to use any of these (you can 
make your own ad hoc techniques), or pay attention to the values that 
may be in the environment for these.  But, barring some reason to the 
contrary, why not stick with convention?  Basically I want to document 
some conventions.  Particularly the simple ones.

I almost think we can do this in the wiki, because it's not that formal 
of a process.  I imagine a few stages:

* People talk about what they are already doing.  This way people know 
what precedent exists at least.

* We make a new namespace (or reuse 'wsgi') where things get moved after 
some discussion.  I would prefer a namespace other than wsgi.

* If we agree on something we later realize is stupid, we abandon the 
old key and use another one.  Because these are all optional to support 
for now (and for the foreseeable future), specific versions of a spec and 
whatnot don't seem necessary.  Maybe people can use 'x-wsgi' (or 
x-whatever-we-decide) as a namespace for proposals (when accompanied by 
code).
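
As an illustration of how small such a convention can be, here is a
sketch of error-catching middleware that honors a "don't catch" key.
The key name ``x-wsgi.throw_errors`` is only a placeholder in the
proposal namespace mentioned above, not an agreed name.

```python
import sys

THROW_ERRORS_KEY = 'x-wsgi.throw_errors'  # placeholder key name


def catch_errors(application):
    """Turn unexpected exceptions into a 500 unless the key asks otherwise."""
    def middleware(environ, start_response):
        if environ.get(THROW_ERRORS_KEY):
            # A test harness wants exceptions to go all the way up.
            return application(environ, start_response)
        try:
            # Note: this only covers errors raised while the application
            # is called, not errors raised later while iterating the body.
            return application(environ, start_response)
        except Exception:
            start_response('500 Internal Server Error',
                           [('Content-Type', 'text/plain')],
                           sys.exc_info())
            return [b'internal server error\n']
    return middleware
```

A test runner would set the key in the environ it builds, and every
error catcher in the stack that follows the convention steps aside.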

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From cce at clarkevans.com  Thu Jan 26 02:43:04 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed, 25 Jan 2006 20:43:04 -0500
Subject: [Web-SIG] Communicating *up* the WSGI Stack
Message-ID: <20060126014304.GA4252@prometheusresearch.com>

In the recent discussion, "Communicating Authenticated User
Information", Jim Fulton asked for a standard way for WSGI
applications to communicate the authenticated user back to
the web server for logging purposes.   I have a related need
for signed cookies (including both the user and the session
identifier); and in some of my application development other
similar needs.

The current solution under consideration is a 'wsgi.set_user' 
callback that the application can call to inform up-stream
WSGI middleware components about the current user.  If we go
with an approach like this, I'd rather it be generalized via
something like 'wsgi.upstream(key, value)' which any WSGI
middleware layer could hook into.

However, I'm much more intrigued by PJE's suggestion that we
use response headers, like 'X-Authenticated-User', rather
than adding an extension API for this feature.  This solution 
has the advantage that a 'WSGI' stack could be distributed
over multiple servers, as suggested by Michal Wallace.

Since I'm a generalist nut (a good or a bad thing), I was
thinking of a solution that would let me communicate session
identifiers, and other application specific information
up-stream to the logging service (and perhaps other
services).  I was wondering if this could be done in a more
general way, and took a quick stab at it:

   http://clarkevans.com/tmp/gateway-environment/spec.html

The goal of this sort of header is to allow just about any
CGI or other environment variable to be communicated *up* a
server stack in a response header (as well as down the stack
if distributing WSGI services over more than one server); 
in a way that would be stripped before the final response 
went out to the server.
 
With a header such as this, Jim's request could be addressed
(using a Russian user nick-named "sugar") as:

    Gateway-Environment:
        REMOTE_IDENT = "=?UTF-8?b?0YHQsNGF0LDRgA==?="
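
A rough sketch of the helper tools this would need, using the standard
``email.header`` module for the RFC 2047 encoded-word. The
``KEY = "value"`` layout is assumed from the example above rather than
from the linked draft, and the function names are hypothetical.

```python
from email.header import Header, decode_header

HEADER_NAME = 'Gateway-Environment'


def encode_value(text):
    """Encode a unicode value as an RFC 2047 encoded-word."""
    return Header(text, 'utf-8').encode()


def decode_value(encoded):
    """Decode a (possibly encoded-word) header value back to text."""
    parts = []
    for chunk, charset in decode_header(encoded):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or 'ascii')
        parts.append(chunk)
    return ''.join(parts)


def pop_gateway_environment(response_headers):
    """Split response headers into (public_headers, upstream_variables).

    The Gateway-Environment headers are removed so that they never
    reach the final response.
    """
    public, upstream = [], {}
    for name, value in response_headers:
        if name.lower() == HEADER_NAME.lower():
            key, _, raw = value.partition('=')
            upstream[key.strip()] = decode_value(raw.strip().strip('"'))
        else:
            public.append((name, value))
    return public, upstream
```

A logging component would call ``pop_gateway_environment`` on the
headers passed to its ``start_response`` wrapper, keep what it needs from
the second result, and forward only the first.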

Any thoughts?  If this seems like a positive direction, I 
could work on a few helper tools to make it painless.

Best,

Clark

From herb at dynamic-solutions.com  Thu Jan 26 03:29:12 2006
From: herb at dynamic-solutions.com (Herb Lainchbury)
Date: Wed, 25 Jan 2006 18:29:12 -0800
Subject: [Web-SIG] Communicating *up* the WSGI Stack
In-Reply-To: <20060126014304.GA4252@prometheusresearch.com>
Message-ID: <01c501c62220$4ba68560$6601a8c0@hercules>

Sounds reasonable.

If you're going to provide a facility for passing stuff up (or sideways in
the case of multi-server feature), and if you're going to strip it off
before sending it to the web server, you might as well pickle/marshal a
whole object in which you can toss anything you want in.  

It's sort of like a global request storage area that lives only for the
current request.  The framework would be allowed to peek in and get the
userid if it's available.

Herb Lainchbury
Dynamic Solutions Inc.
www.dynamic-solutions.com


From cce at clarkevans.com  Thu Jan 26 06:38:43 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu, 26 Jan 2006 00:38:43 -0500
Subject: [Web-SIG] Communicating authenticated user information
In-Reply-To: <5.1.1.6.0.20060125121126.04130c78@mail.telecommunity.com>
References: <5.1.1.6.0.20060123142235.0398c388@mail.telecommunity.com>
	<5.1.1.6.0.20060123145609.01e4cdb8@mail.telecommunity.com>
	<20060123201255.GE31096@prometheusresearch.com>
	<5.1.1.6.0.20060123161301.0370c7c0@mail.telecommunity.com>
	<5.1.1.6.0.20060124113058.01e51ec8@mail.telecommunity.com>
	<43D657C5.5020900@zope.com> <43D65BA1.8030801@colorstudy.com>
	<5.1.1.6.0.20060124171743.03e7c568@mail.telecommunity.com>
	<5.1.1.6.0.20060124231537.023970a8@mail.telecommunity.com>
	<5.1.1.6.0.20060125121126.04130c78@mail.telecommunity.com>
Message-ID: <20060126053843.GA16369@prometheusresearch.com>

Uncle! Uncle!

On Wed, Jan 25, 2006 at 12:17:29PM -0500, Phillip J. Eby wrote:
| If each middleware or application does this:
| 
|     remote_user = environ.setdefault('paste.remote_user', [])
| 
| And then uses the contents of that list as the thing to check or modify, 
| then you will get the exact same result as the "pass the same environ" 
| approach, except that it's actually compatible with PEP 333, as opposed 
| to relying on implementation accidents.

Ok, assuming that we want an "extension API" for this sort of thing; I'd
rather have a bit more general solution. At the very least, it would be
nice to have a unified way to pass the REMOTE_USER up the WSGI stack so
that each WSGI middleware toolkit doesn't have to roll their own (ie,
paste.remote_user and zope.set_user).  But ideally, the solution should
handle more than just REMOTE_USER since I need to track session
identifiers and other environment changes.  Here is a proposal.

  wsgi.notify(key, value)

    This optional environment variable is a function used to notify
    previous stages of processing about a change in the ``environ``.
    Authentication middleware components, for example, would want to
    do something like:

      if environ.get('wsgi.notify'):
         environ.get('wsgi.notify')('REMOTE_USER','foo')
      environ['REMOTE_USER'] = 'foo'
 
    when setting a common environment variable which may be useful
    to previous processing stages.  Prior stages may then 
    watch for particularly important changes by replacing this
    function, making sure to call prior instances, like:
    
      class Logger:
         def __init__(self, application):
             self.application = application
             self.user_counts = {}   # user name -> number of responses
         def __call__(self, environ, start_response):
             prev_notify = environ.get('wsgi.notify', lambda k,v: None)
             def notify(k,v):
                 # remember the user, then pass the event further upstream
                 if 'REMOTE_USER' == k:
                     environ['bing.user'] = v
                 prev_notify(k,v)
             def _start_response(status, response_headers, exc_info=None):
                 user = environ.get('bing.user','anonymous')
                 self.user_counts[user] = self.user_counts.get(user,0) + 1
                 return start_response(status, response_headers, exc_info)
             environ['wsgi.notify'] = notify
             return self.application(environ, _start_response)

| >The WSGI middleware components that actually create their own environ
| >are few and far between.  This is an uncommon edge case.
| 
| Composability of applications is a critical requirement for WSGI 
| middleware.  It doesn't matter how uncommon it is.  Even if there were 
| *zero* implementations of such middleware right now, that principle 
| would take precedence, meaning you'd have to have a proposal that would 
| preserve composability.  Right now, you haven't described a way to do 
| that without introducing temporal coupling (or worse) among subrequests.

If you assume a single thread of control; ie, all sub-requests are done
sequentially, then "extension APIs" share all of the same pitfalls as
mandating a single ``environ``.  I've demonstrated how this is possible
in an earlier message.

However, in a *threaded* environment, the approach I proposed is
unworkable if sub-requests are executed in parallel.  In this case,
strange and nasty consequences would exist if multiple sub-applications
were accessing the same ``environ`` dict.  It is for this reason that
I'm throwing in the towel.

I hope something like the proposal above, or my other attempt to 
formalize a response-based approach, is closer to your liking.

Best Wishes,

Clark

From floydophone at gmail.com  Tue Jan 31 04:20:09 2006
From: floydophone at gmail.com (Peter Hunt)
Date: Mon, 30 Jan 2006 22:20:09 -0500
Subject: [Web-SIG] Standardized template API
Message-ID: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>

Hi guys -

I think a lot of web frameworks and applications are using a template
engine. We should probably have an officially sanctioned templating engine
plugin API, as it would ease adoption of existing Python web framework
solutions.

I think all it would take is a Web-Sig (and perhaps a PEP) blessing the
TurboGears template engine plugin API [1].

Thoughts?

Peter Hunt

[1] http://www.turbogears.org/docs/plugins/template.html

From ianb at colorstudy.com  Tue Jan 31 04:44:34 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 30 Jan 2006 21:44:34 -0600
Subject: [Web-SIG] Standardized template API
In-Reply-To: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
Message-ID: <43DEDD22.9010208@colorstudy.com>

Peter Hunt wrote:
> I think all it would take is a Web-Sig (and perhaps a PEP) blessing the 
> TurboGears template engine plugin API [1].
> 
> Thoughts?
> 
> Peter Hunt
> 
> [1] http://www.turbogears.org/docs/plugins/template.html

I concur.  I've started using it in a project of mine.  There's a couple 
things I'd like to add that have come up on the TG list:

* Add "template_file" and "template_string" arguments to .render(), 
which take filenames and strings.

* Add a find_template callback, which given a filename can, for 
instance, use a search path to find the file.  So the framework or 
application would pass this function into the plugin somehow.  Maybe 
this could be extended some for use in situations where templates aren't 
found on the filesystem.

* Add some methods for quoting -- one for text that is already a 
HTML/XML literal, and one for text that should be quoted.  Some 
templates default one way (quote everything unless explicitly asked not 
to in some way), and some the other way (include everything as though it 
is markup, unless explicitly quoting it).  This makes it possible -- 
though not incredibly easy -- to write some template-language-neutral 
libraries.  I would assume most applications would actually be written 
to a specific templating language, so this is only for libraries that go 
out of their way to be neutral.
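
To make the three additions concrete, here is a sketch of what a plugin
with these features might look like. It is built on ``string.Template``
purely for illustration; the class name, the ``Markup`` marker type and
the ``quote`` helper are hypothetical, and this is not the TurboGears
plugin API as it currently exists.

```python
import html
from string import Template


class Markup(str):
    """Marks text that is already an HTML/XML literal."""


def quote(value):
    """Escape plain text; leave values already marked as markup alone."""
    if isinstance(value, Markup):
        return value
    return Markup(html.escape(str(value)))


class StringTemplatePlugin:
    """Hypothetical plugin that quotes everything unless told otherwise."""

    def __init__(self, find_template=None):
        # find_template maps a template name to a filename; the default
        # assumes the name already is a usable filename.
        self.find_template = find_template or (lambda name: name)

    def render(self, info, template_file=None, template_string=None):
        if (template_file is None) == (template_string is None):
            raise ValueError(
                'pass exactly one of template_file or template_string')
        if template_string is None:
            filename = self.find_template(template_file)
            with open(filename, encoding='utf-8') as f:
                template_string = f.read()
        quoted = {key: quote(value) for key, value in info.items()}
        return Template(template_string).substitute(quoted)
```

A library that wants to stay neutral would wrap its own HTML fragments in
``Markup`` and pass user-supplied text as plain strings, leaving the
escaping decision to the plugin.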

-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org

From jim at zope.com  Tue Jan 31 11:11:49 2006
From: jim at zope.com (Jim Fulton)
Date: Tue, 31 Jan 2006 05:11:49 -0500
Subject: [Web-SIG] Standardized template API
In-Reply-To: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
Message-ID: <43DF37E5.1070406@zope.com>

Peter Hunt wrote:
> Hi guys -
> 
> I think a lot of web frameworks and applications are using a template 
> engine. We should probably have an officially sanctioned templating 
> engine plugin API, as it would ease adoption of existing Python web 
> framework solutions.
> 
> I think all it would take is a Web-Sig (and perhaps a PEP) blessing the 
> TurboGears template engine plugin API [1].
> 
> Thoughts?

Different frameworks will have very different ways of handling templates.
I don't think there is a standard way to do this and I don't think it
would be appropriate for the web-sig to try to pick one.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From ianb at colorstudy.com  Tue Jan 31 16:30:43 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 31 Jan 2006 09:30:43 -0600
Subject: [Web-SIG] Standardized template API
In-Reply-To: <43DF37E5.1070406@zope.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
	<43DF37E5.1070406@zope.com>
Message-ID: <43DF82A3.7020309@colorstudy.com>

Jim Fulton wrote:
>>I think a lot of web frameworks and applications are using a template 
>>engine. We should probably have an officially sanctioned templating 
>>engine plugin API, as it would ease adoption of existing Python web 
>>framework solutions.
>>
>>I think all it would take is a Web-Sig (and perhaps a PEP) blessing the 
>>TurboGears template engine plugin API [1].
>>
>>Thoughts?
> 
> 
> Different frameworks will have very different ways of handling templates.
> I don't think there is a standard way to do this and I don't think it
> would be appropriate for the web-sig to try to pick one.

Right now there is a significant intersection of templates and 
frameworks where this applies, but that intersection is not 
all-encompassing.  This interface will only work for template languages 
that are given some dictionary of values and have internal logic to 
render that.  This includes most current Python templating languages, 
but does not include templates like PyMeld (and derivatives) which are 
just structures manipulated from the outside, or templates like Nevow 
that are based on callbacks.

Just because there is a standard doesn't mean anyone has to use it.  I 
don't see why every standard has to satisfy everyone; if 50% of people 
use a standard instead of using framework-specific ad hoc interfaces, 
that's useful enough.  Especially a standard like this which doesn't 
imply much of any logic, it's just something you use.

-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org

From dangoor at gmail.com  Tue Jan 31 16:53:41 2006
From: dangoor at gmail.com (Kevin Dangoor)
Date: Tue, 31 Jan 2006 10:53:41 -0500
Subject: [Web-SIG] Standardized template API
In-Reply-To: <43DEDD22.9010208@colorstudy.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
	<43DEDD22.9010208@colorstudy.com>
Message-ID: <3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>

On 1/30/06, Ian Bicking <ianb at colorstudy.com> wrote:
> Peter Hunt wrote:
> > I think all it would take is a Web-Sig (and perhaps a PEP) blessing the
> > TurboGears template engine plugin API [1].

Thanks for bringing this up, Peter. Discussions to date have been ad
hoc on the TurboGears mailing list, and it's better to open this up.

> >
> > Thoughts?
> >
> > Peter Hunt
> >
> > [1] http://www.turbogears.org/docs/plugins/template.html
>
> I concur.  I've started using it in a project of mine.  There's a couple
> things I'd like to add that have come up on the TG list:
>
> * Add "template_file" and "template_string" arguments to .render(),
> which take filenames and strings.

Would template_file be expecting an absolute filename? (just
confirming that this wouldn't do any more lookups or path searching...
just open and go!)

These seem pleasant enough to me.

> * Add a find_template callback, which given a filename can, for
> instance, use a search path to find the file.  So the framework or
> application would pass this function into the plugin somehow.  Maybe
> this could be extended some for use in situations where templates aren't
> found on the filesystem.

Being able to pass a string to render() would also allow use where the
templates aren't in the filesystem.

In fact, if find_template is just returning a filename, then this
mechanism couldn't work for pulling the template from a database.

> * Add some methods for quoting -- one for text that is already a
> HTML/XML literal, and one for text that should be quoted.  Some
> templates default one way (quote everything unless explicitly asked not
> to in some way), and some the other way (include everything as though it
> is markup, unless explicitly quoting it).  This makes it possible --
> though not incredibly easy -- to write some template-language-neutral
> libraries.  I would assume most applications would actually be written
> to a specific templating language, so this is only for libraries that go
> out of their way to be neutral.

That sounds potentially hairy. Not the specifics of having the quoting
methods, but rather the notion of having template-language-neutral
libraries that use these. Are you thinking of something specific? that
might make it clearer to me.

Kevin

From wilk-ml at flibuste.net  Tue Jan 31 17:17:08 2006
From: wilk-ml at flibuste.net (William Dode)
Date: Tue, 31 Jan 2006 16:17:08 +0000 (UTC)
Subject: [Web-SIG] Standardized template API
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
Message-ID: <dro2i4$g07$1@sea.gmane.org>

On 31-01-2006, Peter Hunt wrote:
> Hi guys -
>
> I think a lot of web frameworks and applications are using a template
> engine. We should probably have an officially sanctioned templating engine
> plugin API, as it would ease adoption of existing Python web framework
> solutions.

What about pep292 ?
string.Template

http://www.python.org/doc/current/lib/node109.html
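
For reference, a minimal sketch of what PEP 292 substitution looks like.
The template text and the values are made up for illustration; only
string.Template and its substitute/safe_substitute methods come from the
standard library.

```python
from string import Template

# $name and $count are placeholders in the PEP 292 "$-string" syntax.
greeting = Template("Hello, $name! You have $count new messages.")

# substitute() raises KeyError if a placeholder has no value.
full = greeting.substitute(name="Peter", count=3)

# safe_substitute() leaves unknown placeholders untouched instead.
partial = greeting.safe_substitute(name="Peter")

print(full)
print(partial)
```

Note that string.Template only does simple substitution: it has no
loops, conditionals, or includes.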

-- 
William Dodé - http://flibuste.net


From ianb at colorstudy.com  Tue Jan 31 17:48:53 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 31 Jan 2006 10:48:53 -0600
Subject: [Web-SIG] Standardized template API
In-Reply-To: <3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>	
	<43DEDD22.9010208@colorstudy.com>
	<3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>
Message-ID: <43DF94F5.7080108@colorstudy.com>

Kevin Dangoor wrote:
>>>Thoughts?
>>>
>>>Peter Hunt
>>>
>>>[1] http://www.turbogears.org/docs/plugins/template.html
>>
>>I concur.  I've started using it in a project of mine.  There's a couple
>>things I'd like to add that have come up on the TG list:
>>
>>* Add "template_file" and "template_string" arguments to .render(),
>>which take filenames and strings.
> 
> 
> Would template_file be expecting an absolute filename? (just
> confirming that this wouldn't do any more lookups or path searching...
> just open and go!)

I'm thinking if it is an absolute filename, it should be treated as an 
exact reference.  If not, then you search for the file in the search 
path.  Though maybe that logic should be in find_template.  For 
instance, sometimes you might want to force the template to be under 
some base directory (e.g., for security reasons), so find_template might 
treat absolute filenames as relative to that directory.

I think some default/recommended implementation of find_template would 
also be useful, at least to see how common features are implemented in 
terms of it.
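
As a rough sketch of what such a default might look like (the function
name make_find_template, its arguments, and the exceptions raised are
all assumptions for illustration, not part of any existing plugin API):

```python
import os


def make_find_template(search_path, base_dir=None):
    """Return a find_template(name) callback (hypothetical signature)."""

    def find_template(name):
        if base_dir is not None:
            # Confined mode: treat every name, absolute or not, as
            # relative to base_dir and refuse anything that escapes it.
            base = os.path.realpath(base_dir)
            candidate = os.path.realpath(
                os.path.join(base, name.lstrip("/\\")))
            if os.path.commonpath([base, candidate]) != base:
                raise ValueError("template outside base directory: %r" % name)
            if os.path.isfile(candidate):
                return candidate
            raise LookupError(name)
        if os.path.isabs(name):
            # An absolute filename is an exact reference.
            if os.path.isfile(name):
                return name
            raise LookupError(name)
        # Otherwise walk the search path in order.
        for directory in search_path:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
        raise LookupError(name)

    return find_template
```

The framework would build the callback once and pass it to the plugin;
whether absolute names are exact or confined is then a deployment
decision rather than something each plugin decides.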

>>* Add a find_template callback, which given a filename can, for
>>instance, use a search path to find the file.  So the framework or
>>application would pass this function into the plugin somehow.  Maybe
>>this could be extended some for use in situations where templates aren't
>>found on the filesystem.
> 
> 
> Being able to pass a string to render() would also allow use where the
> templates aren't in the filesystem.

Yes; I was thinking of implementing Subway's use of docstrings for 
templates, and realized I couldn't do it with the interface as it is.

> In fact, if find_template is just returning a filename, then this
> mechanism couldn't work for pulling the template from a database.

It could work partially.  I was figuring that if you did something like 
include another template or inherit or whatnot (every language has its 
own name for this) then that template would have to be in the 
filesystem, or more generally somewhere that find_template can find it.

Maybe it would be better if find_template actually ultimately called 
load_template, and load_template became a bit more useful.  So instead 
of render being called with the template name, it is called with 
whatever load_template returns.

This would also allow languages like PyMeld to participate partially, in 
that users could use load_template to get the templates, even if they 
could not use render.
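
Roughly, something like the following. This is only a sketch: the class,
the method signatures, and the use of string.Template as the "engine"
are assumptions for illustration, not the current plugin interface.

```python
from string import Template


class StringTemplatePlugin:
    """Hypothetical plugin backed by string.Template (illustration only)."""

    def load_template(self, source):
        # Turn raw template text into whatever object render() expects.
        return Template(source)

    def render(self, template, info):
        # 'template' is the object returned by load_template, not a name.
        return template.substitute(info)


def make_find_template(plugin, sources):
    """Build a find_template callback over an in-memory dict of sources."""
    def find_template(name):
        # Locate the source, then hand it to the plugin to load.
        return plugin.load_template(sources[name])
    return find_template


plugin = StringTemplatePlugin()
find_template = make_find_template(plugin, {"greeting": "Hello, $name!"})
template = find_template("greeting")
print(plugin.render(template, {"name": "Kevin"}))
```

A language that cannot implement render() could still provide
load_template() and let the caller work with the loaded object directly.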

>>* Add some methods for quoting -- one for text that is already a
>>HTML/XML literal, and one for text that should be quoted.  Some
>>templates default one way (quote everything unless explicitly asked not
>>to in some way), and some the other way (include everything as though it
>>is markup, unless explicitly quoting it).  This makes it possible --
>>though not incredibly easy -- to write some template-language-neutral
>>libraries.  I would assume most applications would actually be written
>>to a specific templating language, so this is only for libraries that go
>>out of their way to be neutral.
> 
> 
> That sounds potentially hairy. Not the specifics of having the quoting
> methods, but rather the notion of having template-language-neutral
> libraries that use these. Are you thinking of something specific? That
> might make it clearer to me.

I'm not sure at this point.  You'd actually have to tell the library 
what template language you were planning to use with it; I don't think 
any single quoting will work, because the quoting might happen deep in 
some nested structure.  But the idea is that a library like a 
widget-generation library could work better across languages.
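
A very small sketch of the two quoting methods, for the HTML case only.
The names Literal and quote are hypothetical, and a real plugin would
supply its own versions for its own language:

```python
from html import escape


class Literal(str):
    """Marks text that is already HTML/XML and must not be quoted again."""


def quote(value):
    """Return value as markup, escaping it unless it is already a Literal."""
    if isinstance(value, Literal):
        return value
    return Literal(escape(str(value), quote=True))


print(quote("<b>bold?</b>"))
print(quote(Literal("<b>bold!</b>")))
```

A widget library would call quote() on user data and wrap its own
generated markup in Literal, whichever way the template language
defaults.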

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From cce at clarkevans.com  Tue Jan 31 20:03:50 2006
From: cce at clarkevans.com (Clark C. Evans)
Date: Tue, 31 Jan 2006 14:03:50 -0500
Subject: [Web-SIG] Standardized template API
In-Reply-To: <43DF94F5.7080108@colorstudy.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
	<43DEDD22.9010208@colorstudy.com>
	<3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>
	<43DF94F5.7080108@colorstudy.com>
Message-ID: <20060131190350.GB2211@prometheusresearch.com>

On Tue, Jan 31, 2006 at 10:48:53AM -0600, Ian Bicking wrote:
| Kevin Dangoor wrote:
| >>>[1] http://www.turbogears.org/docs/plugins/template.html
| >>
| >>I concur.  I've started using it in a project of mine.  There's a couple
| >>things I'd like to add that have come up on the TG list:
| >>
| >>* Add "template_file" and "template_string" arguments to .render(),
| >>which take filenames and strings.
| > 
| > Would template_file be expecting an absolute filename? (just
| > confirming that this wouldn't do any more lookups or path searching...
| > just open and go!)
| 
| I'm thinking if it is an absolute filename, it should be treated as an 
| exact reference.  If not, then you search for the file in the search 
| path.  Though maybe that logic should be in find_template.  For 
| instance, sometimes you might want to force the template to be under 
| some base directory (e.g., for security reasons), so find_template might 
| treat absolute filenames as relative to that directory.

I'd stick with the notion of a "template_name" that is neither the
template file nor the template body.  Then you'd want a template factory
method that takes the name and produces the template body (compiled if
necessary).  This needs to be once indirect since a template may refer
to other sub-templates. This way your template could be stored
in-memory, on-disk, or in a database, or even remotely using an HTTP
cache.  The actual storage mechanism for the template source code should
not be part of this interface.
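
A minimal sketch of the idea, with string.Template standing in for a
real engine. TemplateFactory, NamedTemplate, and the include() method
are hypothetical names used only for illustration:

```python
from string import Template


class TemplateFactory:
    """Hypothetical factory: maps a template name to a template object."""

    def __init__(self, load_source):
        # load_source(name) -> str; it may read memory, disk, a database,
        # or anything else. The factory does not care which.
        self._load_source = load_source

    def get(self, name):
        return NamedTemplate(self, self._load_source(name))


class NamedTemplate:
    def __init__(self, factory, source):
        self._factory = factory
        self._template = Template(source)

    def render(self, **values):
        return self._template.substitute(values)

    def include(self, name, **values):
        # Sub-templates are looked up by name through the same factory.
        return self._factory.get(name).render(**values)


sources = {"page": "<h1>$title</h1>$body", "footer": "-- $owner"}
factory = TemplateFactory(sources.__getitem__)
page = factory.get("page")
footer = page.include("footer", owner="Clark")
print(page.render(title="Hi", body=footer))
```

Swapping the dict for a database or HTTP fetch only changes the
load_source callable; nothing that renders templates has to know.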

Best,

Clark

From floydophone at gmail.com  Tue Jan 31 20:36:58 2006
From: floydophone at gmail.com (Peter Hunt)
Date: Tue, 31 Jan 2006 14:36:58 -0500
Subject: [Web-SIG] Standardized template API
In-Reply-To: <6949EC6CD39F97498A57E0FA55295B210148851E@ex9.hostedexchange.local>
References: <6949EC6CD39F97498A57E0FA55295B210148851E@ex9.hostedexchange.local>
Message-ID: <6654eac40601311136i4a9c492bqb741d2befc7f8afe@mail.gmail.com>

On 1/31/06, Robert Brewer <fumanchu at amor.org> wrote:
>
>
> That's true, but I'd caution that "it's just something you use" hides a
> mountain of difficulties. A standard like WSGI can be relatively free in
> its interface design (in order to meet a host of specialized needs),
> because it has a limited user group. A template standard, in contrast,
> will need much more attention paid to "ease of use", which will
> constrain its interface design.


The user group of a standardized template API would be a framework author.
Different frameworks would still access their templating engines in
different ways, but with a standardized template engine API, you'd be able
to plug different templating engines (i.e. django, cheetah, kid, zpt, psp)
into different frameworks, and have it Just Work.

Peter Hunt

From fumanchu at amor.org  Tue Jan 31 20:31:52 2006
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 31 Jan 2006 11:31:52 -0800
Subject: [Web-SIG] Standardized template API
Message-ID: <6949EC6CD39F97498A57E0FA55295B210148851E@ex9.hostedexchange.local>

Ian Bicking wrote:
> Just because there is a standard doesn't mean anyone has to 
> use it.  I don't see why every standard has to satisfy
> everyone; if 50% of people use a standard instead of
> using framework-specific ad hoc interfaces, that's useful
> enough.  Especially a standard like this which doesn't
> imply much of any logic, it's just something you use.

That's true, but I'd caution that "it's just something you use" hides a
mountain of difficulties. A standard like WSGI can be relatively free in
its interface design (in order to meet a host of specialized needs),
because it has a limited user group. A template standard, in contrast,
will need much more attention paid to "ease of use", which will
constrain its interface design.

Don't forget that the current template landscape is fragmented in part
because the contenders compete over "ease of use". If a template author
feels their primary value lies in their "Pythonic API", then a common
proxy interface will be perceived as reducing their value. All of which
is to say, there's as much of a social issue to solve here as there is a
technical issue.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From pje at telecommunity.com  Tue Jan 31 20:59:43 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 31 Jan 2006 14:59:43 -0500
Subject: [Web-SIG] Standardized template API
In-Reply-To: <20060131190350.GB2211@prometheusresearch.com>
References: <43DF94F5.7080108@colorstudy.com>
	<6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
	<43DEDD22.9010208@colorstudy.com>
	<3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>
	<43DF94F5.7080108@colorstudy.com>
Message-ID: <5.1.1.6.0.20060131144811.03ff2ff0@mail.telecommunity.com>

At 02:03 PM 1/31/2006 -0500, Clark C. Evans wrote:
>I'd stick with the notion of a "template_name" that is neither the
>template file nor the template body.  Then you'd want a template factory
>method that takes the name and produces the template body (compiled if
>necessary).  This needs to be once indirect since a template may refer
>to other sub-templates. This way your template could be stored
>in-memory, on-disk, or in a database, or even remotely using an HTTP
>cache.  The actual storage mechanism for the template source code should
>not be part of this interface.

I'd go even further than this, to note that frameworks need to be able to 
cache "compiled" versions of templates.  The template "engine" should be 
able to return a "compiled" template that the framework can use to do 
rendering in future.
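
As a rough sketch of that contract (CachingEngine and its method names
are hypothetical, and string.Template merely stands in for a real
compile step):

```python
from string import Template


class CachingEngine:
    """Hypothetical engine that compiles each named template only once."""

    def __init__(self, load_source, compile_source=Template):
        self._load_source = load_source        # name -> source text
        self._compile_source = compile_source  # source text -> compiled object
        self._compiled = {}
        self.compile_count = 0                 # exposed only for the demo

    def get_template(self, name):
        try:
            return self._compiled[name]
        except KeyError:
            self.compile_count += 1
            compiled = self._compile_source(self._load_source(name))
            self._compiled[name] = compiled
            return compiled


engine = CachingEngine({"foo": "Hello, $who!"}.__getitem__)
first = engine.get_template("foo")
second = engine.get_template("foo")
print(first.substitute(who="world"), engine.compile_count)
```

The cache could equally live in the framework; the point is only that
the engine hands back a reusable object rather than rendering from
source each time.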

Note too that frameworks such as Zope 3 and peak.web have concepts like 
localization and "skinning" that require the ability to switch out the 
actual provider of a named template or "view", and in at least the case of 
peak.web this can be polymorphic.  That is, one "skin" could implement 
template "foo" using Kid and another one using ZPT.  So the name of a 
template needs to be independent of its implementation language, or the 
ability to plug in different engines is moot.  (Currently, TurboGears 
embeds the chosen template language in code, which means a deployer can't 
skin the application effectively.)

Finally, I'd note that an increasingly common use case for template storage 
is likely to be via pkg_resources lookup, so that applications can be 
deployed as eggs.  Ideally, future versions of Zope 3 and peak.web would 
perhaps allow one egg to provide skins or "layers" for views defined in 
other eggs, so that users can plug in third-party skins for applications.

Actually, if we're going to come up with something really useful here, it 
would probably be a good idea to expand the scope from just defining 
templates, to defining a "resource access" protocol to cover both static 
resource files and templates that may need to be localized or 
skinned.  That would be a much bigger win, IMO, since it would put more 
control in the hands of deployers and customizers, instead of just making 
it possible for developers to use whatever template language they fancy.  :)


From pywebsig at xhaus.com  Tue Jan 31 23:02:26 2006
From: pywebsig at xhaus.com (Alan Kennedy)
Date: Tue, 31 Jan 2006 22:02:26 +0000
Subject: [Web-SIG] Standardized template API
In-Reply-To: <20060131190350.GB2211@prometheusresearch.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>	<43DEDD22.9010208@colorstudy.com>	<3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>	<43DF94F5.7080108@colorstudy.com>
	<20060131190350.GB2211@prometheusresearch.com>
Message-ID: <43DFDE72.9050705@xhaus.com>

[Clark C. Evans]
> I'd stick with the notion of a "template_name" that is neither the
> template file nor the template body.  Then you'd want a template factory
> method that takes the name and produces the template body (compiled if
> necessary).  

I agree.

If you're looking for an existing model (in Java), the Spring framework 
has "View" objects (i.e. the V in MVC) and "View Resolver" objects. The 
latter resolve logical template names to actual templates, compiled if 
necessary.

View Interface
http://static.springframework.org/spring/docs/1.2.x/api/org/springframework/web/servlet/View.html

ViewResolver Interface
http://static.springframework.org/spring/docs/1.2.x/api/org/springframework/web/servlet/ViewResolver.html

> This way your template could be stored
> in-memory, on-disk, or in a database, or even remotely using an HTTP
> cache.  The actual storage mechanism for the template source code should
> not be part of this interface.

A very important requirement IMHO.

Regards,

Alan.


From ianb at colorstudy.com  Tue Jan 31 23:36:06 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 31 Jan 2006 16:36:06 -0600
Subject: [Web-SIG] Standardized template API
In-Reply-To: <20060131190350.GB2211@prometheusresearch.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
	<43DEDD22.9010208@colorstudy.com>
	<3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>
	<43DF94F5.7080108@colorstudy.com>
	<20060131190350.GB2211@prometheusresearch.com>
Message-ID: <43DFE656.5060906@colorstudy.com>

Clark C. Evans wrote:
> On Tue, Jan 31, 2006 at 10:48:53AM -0600, Ian Bicking wrote:
> | Kevin Dangoor wrote:
> | >>>[1] http://www.turbogears.org/docs/plugins/template.html
> | >>
> | >>I concur.  I've started using it in a project of mine.  There's a couple
> | >>things I'd like to add that have come up on the TG list:
> | >>
> | >>* Add "template_file" and "template_string" arguments to .render(),
> | >>which take filenames and strings.
> | > 
> | > Would template_file be expecting an absolute filename? (just
> | > confirming that this wouldn't do any more lookups or path searching...
> | > just open and go!)
> | 
> | I'm thinking if it is an absolute filename, it should be treated as an 
> | exact reference.  If not, then you search for the file in the search 
> | path.  Though maybe that logic should be in find_template.  For 
> | instance, sometimes you might want to force the template to be under 
> | some base directory (e.g., for security reasons), so find_template might 
> | treat absolute filenames as relative to that directory.
> 
> I'd stick with the notion of a "template_name" that is neither the
> template file nor the template body.  Then you'd want a template factory
> method that takes the name and produces the template body (compiled if
> necessary).  This needs to be once indirect since a template may refer
> to other sub-templates. This way your template could be stored
> in-memory, on-disk, or in a database, or even remotely using an HTTP
> cache.  The actual storage mechanism for the template source code should
> not be part of this interface.

I guess there's a couple issues.  I don't have a problem with 
template_name in the abstract, so long as it is not confused with module 
names (which implies all templates are modules).  Also, I think template 
names should be unambiguously meaningful only in the context of one 
plugin instance.  Lastly, I would rather template names not be 
restricted to Python module semantics, with "." as an indicator of 
hierarchy, and valid Python identifiers for the names.

Then, there's some question about what a template name is.  Do names 
include extensions?  Can they optionally include extensions?  Are 
extensions determined by the plugin, or by find_template, or a 
combination of the two?

Oh, and I actually don't want template_names to be read as particularly 
hierarchical, I now realize.  So a template "foo.bar" (or "foo/bar", 
depending on the syntax we might choose) should not in any way involve 
finding "foo" and then finding "bar" from that.  If find_template() 
works that way, fine (annoying but fine), but the plugins should make no 
such assumption.  So maybe I want template names to be opaque to the 
plugin, though we should agree on standard conventions.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From mike_mp at zzzcomputing.com  Tue Jan 31 23:43:24 2006
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Tue, 31 Jan 2006 17:43:24 -0500 (EST)
Subject: [Web-SIG] Standardized template API
In-Reply-To: <20060131190350.GB2211@prometheusresearch.com>
References: <6654eac40601301920n2448c068q83b33ee4c0d4fa37@mail.gmail.com>
	<43DEDD22.9010208@colorstudy.com>
	<3f085ecd0601310753o19edc503we008f2c4c6b80ebf@mail.gmail.com>
	<43DF94F5.7080108@colorstudy.com>
	<20060131190350.GB2211@prometheusresearch.com>
Message-ID: <16954.66.192.34.8.1138747404.squirrel@www.geekisp.com>

Clark C. Evans wrote:
>
> I'd stick with the notion of a "template_name" that is neither the
> template file nor the template body.  Then you'd want a template factory
> method that takes the name and produces the template body (compiled if
> necessary).  This needs to be once indirect since a template may refer
> to other sub-templates. This way your template could be stored
> in-memory, on-disk, or in a database, or even remotely using an HTTP
> cache.  The actual storage mechanism for the template source code should
> not be part of this interface.
>

i agree with this totally; you definitely need to have some layer of
indirection between a template identifier and the template object itself
that is more abstract than a file location or otherwise.  just a "string"
is all it needs to be.  in the case of myghty, there is an entire list of
resolution rules that can choose to serve the actual template, each of
which can use whatever datasource it wants; file locations, URLs,
in-memory identifiers, database schemes are all fair game.
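
a rough sketch of the general shape, not myghty's actual API (every
function name below is made up for illustration): each rule gets the
identifier string and either returns template source or passes.

```python
import os


def memory_resolver(templates):
    """Rule that serves templates from an in-memory dict."""
    def resolve(name):
        return templates.get(name)
    return resolve


def directory_resolver(root):
    """Rule that serves templates from files under a directory."""
    def resolve(name):
        path = os.path.join(root, name)
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as handle:
                return handle.read()
        return None
    return resolve


def resolve_template(name, resolvers):
    """Try each rule in order; the first one that answers wins."""
    for resolve in resolvers:
        source = resolve(name)
        if source is not None:
            return source
    raise LookupError("no resolver could find template %r" % name)
```

the caller only ever deals in the identifier string; which datasource
answered is invisible to it.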

- mike