From mo.babaei at gmail.com Thu Dec 1 08:48:12 2005
From: mo.babaei at gmail.com (mohammad babaei)
Date: Thu, 1 Dec 2005 11:18:12 +0330
Subject: [Web-SIG] Database Module in a Web Application
Message-ID: <5bf3a41f0511302348t4b84c5a4g8ac66a4ff7644adf@mail.gmail.com>

Hi,

I'm going to write my first web application in Python. Is it a good idea to write a database module that handles the connection to the database and executes queries?

Regards
M.B

From tsoehnli at gmu.edu Thu Dec 1 18:08:53 2005
From: tsoehnli at gmu.edu (Timothy Soehnlin)
Date: Thu, 01 Dec 2005 12:08:53 -0500
Subject: [Web-SIG] Sessions and Headers
Message-ID: <200512011208.53930.tsoehnli@gmu.edu>

Hello All,

Okay, let's get down to business. I am wondering if anyone knows of a framework-independent Session library. I am looking to bring a Session library into my framework, but everything I have found so far seems to be unnecessarily integrated with the frameworks. And before I get all gung ho and go and write my own Session libraries, I was wondering if anyone knows of a library that I could use, and save myself some time.

On another note, I am also wanting to integrate multiple server environments, and specifically with this question, mod_python. Now I have my framework working with mod_python, but I have recently created a standard request object that all the different server environments plug into by initializing the object with an environment dictionary, a file to read the user data from (for posts and whatnot), and then a write function that gives direct control to returning the request output to the user. In mod_python the headers are automagically submitted when the function write is invoked the first time. I need this to not be.
I need to have total control over the headers, as my standard Request Object handles header manipulation and submission.

Thank you for your time and consideration.

Sincerely,
Timothy Soehnlin

--
I would rather be known as a Christian and despised, than to be overlooked, and thought of as one of the world.

From ben at groovie.org Thu Dec 1 18:04:25 2005
From: ben at groovie.org (Ben Bangert)
Date: Thu, 1 Dec 2005 09:04:25 -0800
Subject: [Web-SIG] Sessions and Headers
In-Reply-To: <200512011208.53930.tsoehnli@gmu.edu>
References: <200512011208.53930.tsoehnli@gmu.edu>
Message-ID: <078AD67D-7233-467A-A9EF-5425407B7058@groovie.org>

On Dec 1, 2005, at 9:08 AM, Timothy Soehnlin wrote:

> Okay, let's get down to business. I am wondering if anyone knows of a framework-independent Session library. I am looking to bring a Session library into my framework, but everything I have found so far seems to be unnecessarily integrated with the frameworks. And before I get all gung ho and go and write my own Session libraries, I was wondering if anyone knows of a library that I could use, and save myself some time.

Many frameworks' session systems can be used completely independently of the framework. Myghty's has been used in various scenarios, partly because it works without a problem in mod_python, WSGI, etc. and has a consistent interface across all of the environments. Ian Bicking wrote a WSGI session middleware module that handles sessions completely independently of any framework, though I'm not sure offhand how that'd work with mod_python.

I won't be surprised to see other framework authors offer advice on how to use their respective session object outside of their framework, as they're typically modular enough to function in this manner. Most of them provide a dict-style interface, some use attributes, etc. In the end, I think you'll have enough choices where you can sift it out and find the one that works best for you.
Cheers,
Ben

From fumanchu at amor.org Thu Dec 1 19:20:47 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Thu, 1 Dec 2005 10:20:47 -0800
Subject: [Web-SIG] Sessions and Headers
Message-ID:

Timothy Soehnlin wrote:

> On another note, I am also wanting to integrate multiple server environments, and specifically with this question, mod_python. Now I have my framework working with mod_python but I have recently created a standard request object that all the different server environments plug into by initializing the object with an environment dictionary, a file to read the user data from (for posts and whatnot), and then a write function that gives direct control to returning the request output to the user.

Congratulations, you just reinvented WSGI. ;)

Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From ianb at colorstudy.com Thu Dec 1 21:14:44 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 01 Dec 2005 14:14:44 -0600
Subject: [Web-SIG] Sessions and Headers
In-Reply-To: <078AD67D-7233-467A-A9EF-5425407B7058@groovie.org>
References: <200512011208.53930.tsoehnli@gmu.edu> <078AD67D-7233-467A-A9EF-5425407B7058@groovie.org>
Message-ID: <438F59B4.7000900@colorstudy.com>

Ben Bangert wrote:

> Ian Bicking wrote a WSGI session middleware module that handles sessions completely independently of any framework, though I'm not sure offhand how that'd work with mod_python.

It's nothing to write home about. Flup has a somewhat better session, and an object that is clearly usable outside WSGI; but it only has a couple of actual stores (e.g., no database), and some room for improvement, so it isn't terribly notable either.

There was some talk about this on this list a while ago, but it never really went anywhere. I proposed an interface, but since I lacked actual intention to implement it, it didn't go anywhere either.
But it still exists, of course: http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py -- it might be useful to an implementor.

In an actually-extracted form, I don't know about any session library for Python. In an extractable form, I'm sure many frameworks have something. An extracted session library would be welcome. I'm personally getting by with a session that is much lamer than the one my proposed interface would imply, which is probably fine since I only put non-critical data in it anyway. So a simpler session library would be cool too. I think it should leave out things like configuration, but there's still useful functionality to be done.

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From colin at owlfish.com Sun Dec 4 15:46:14 2005
From: colin at owlfish.com (Colin Stewart)
Date: Sun, 04 Dec 2005 14:46:14 +0000
Subject: [Web-SIG] ANN: WSGIUtils 0.7
Message-ID: <1133707574.3157.8.camel@roll>

Hi,

I've released WSGIUtils 0.7. This is a minor update, but with one notable fix. Here's what's changed:

New features:
- Added minimal support for SetupTools.

Bug fixes:
- Changed "error.timeout" to "socket.timeout".
- Changed package name from "WSGI Utils" to "WSGIUtils" for greater compatibility with other tools.

The package can be found at http://www.owlfish.com/software/wsgiutils/

WSGIUtils is a package of standalone utility libraries that ease the development of simple WSGI programs. The package is divided into two main components which can be used individually or in combination:

* wsgiServer is a multi-threaded WSGI web server based on SimpleHTTPServer.
* wsgiAdaptor is a simple WSGI application that provides basic authentication, signed cookies and persistent sessions.

From tsoehnli at gmu.edu Tue Dec 6 18:38:21 2005
From: tsoehnli at gmu.edu (Timothy Soehnlin)
Date: Tue, 06 Dec 2005 12:38:21 -0500
Subject: [Web-SIG] Sessions and Headers
Message-ID: <200512061238.21746.tsoehnli@gmu.edu>

Hello All,

In a previous post I wrote about Sessions and Headers. The Sessions topic was addressed but the Headers point was never focused on. I was wondering about controlling headers in mod_python. In mod_python the headers are automagically submitted when the function write is invoked the first time. I need this to not be. I need to have total control over the headers, and control when and if they are sent to the client. I was wondering if there are any settings, examples, etc. that any of you all would know about.

Thank you for your time and consideration.

Sincerely,
Timothy Soehnlin

--
I would rather be known as a Christian and despised, than to be overlooked, and thought of as one of the world.

From chris.arndt at web.de Tue Dec 6 20:41:25 2005
From: chris.arndt at web.de (Christopher Arndt)
Date: Tue, 06 Dec 2005 19:41:25 +0000
Subject: [Web-SIG] cgipython 2.4.x binary for FreeBSD 4.7?
Message-ID: <4395E965.5040507@web.de>

Hi,

Does anybody have, or can anybody build or point me to, a binary of cgipython 2.4.x (preferably 2.4.2) (http://www.egenix.com/files/python/mxCGIPython.html) for FreeBSD 4.7?

I am trying to install a decent Python version at a web hoster (Verio) which apparently has FreeBSD (and only Python 1.5.2). The output of 'uname -a' says:

    FreeBSD mydomain.com 4.7-RELEASE-p22 FreeBSD 4.7-RELEASE-p22 #5: Tue May 3 13:36:49 MDT 2005 root at somemachine:/usr/home/somepath i386

I've tried the binaries for Python 2.3.5 provided by Oleg Broytmann for FreeBSD 4.9 and it basically works, but it lacks the '_random' module on which cgi.py relies.* The 2.4.x versions he has are only for FreeBSD 5.4 and did not work for me.
Testing all this is very difficult, because the error.log does not show stderr from CGI scripts :-(

Alternatively, are there any other single-file/cgi-ready Python distros I could try?

Chris

* indirectly through tempfile.py and random.py

From fumanchu at amor.org Tue Dec 6 21:51:20 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 6 Dec 2005 12:51:20 -0800
Subject: [Web-SIG] Sessions and Headers
Message-ID:

Timothy Soehnlin wrote:

> In mod_python the headers are automagically submitted when the function write is invoked the first time. I need this to not be.

You can do that either informally, by not calling req.write in your own code until you've built the complete response entity, or strictly, by wrapping the request object so that the write method (and flush) spools output until you're done.

I *think* you are implying more constraints than that, but until you expand on them, they're hard to address. ;)

Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From grahamd at dscpl.com.au Tue Dec 6 22:54:03 2005
From: grahamd at dscpl.com.au (Graham Dumpleton)
Date: Tue, 6 Dec 2005 16:54:03 -0500
Subject: [Web-SIG] Sessions and Headers
Message-ID: <1133906043.9681@dscpl.user.openhosting.com>

Timothy Soehnlin wrote ..

> Hello All,
>
> In a previous post I wrote about Sessions and Headers. The Sessions topic was addressed but the Headers point was never focused on. I was wondering about controlling headers in mod_python. In mod_python the headers are automagically submitted when the function write is invoked the first time. I need this to not be. I need to have total control over the headers, and control when and if they are sent to the client. I was wondering if there are any settings, examples, etc that any of you all would know about.

Don't use req.write() incrementally; instead, accumulate the response as a list of strings or in a StringIO instance.
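
The spooling approach suggested in the replies above could look roughly like the following. This is only a sketch under stated assumptions: the sole thing it relies on is that the request object has a write(data) method, as mod_python's does. BufferedResponse and FakeRequest are hypothetical names invented for illustration and are not part of mod_python.

```python
import io


class BufferedResponse:
    """Spool body output so headers can still be changed until finish()."""

    def __init__(self, req):
        self._req = req               # any object exposing write(data)
        self._buffer = io.StringIO()  # accumulates the response body

    def write(self, data):
        # Nothing reaches the real request yet, so no headers are sent.
        self._buffer.write(data)

    def flush(self):
        # Deliberately a no-op: flushing early would commit the headers.
        pass

    def finish(self):
        # One single write of the accumulated body, after headers are set.
        self._req.write(self._buffer.getvalue())


class FakeRequest:
    """Hypothetical stand-in for a real request object, used for the demo."""

    def __init__(self):
        self.writes = []

    def write(self, data):
        self.writes.append(data)


req = FakeRequest()
out = BufferedResponse(req)
out.write("Hello, ")
out.write("world")
# ... headers would be finalised here, before anything is sent ...
out.finish()
print(req.writes)
```

The design choice is simply to move the "first write sends headers" moment to a point the application controls.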
Then at the point that you finally want to send content, i.e., after you have set your headers, call req.write() once with the accumulated content.

Note that there is a separate mod_python mailing list; you would be better off using that if you want to get a response. The mailing list you are posting to is not specifically about mod_python and so you are less likely to get a response. See the mod_python web site for how to get onto the mod_python mailing list.

Graham

From jim at zope.com Thu Dec 15 19:58:49 2005
From: jim at zope.com (Jim Fulton)
Date: Thu, 15 Dec 2005 13:58:49 -0500
Subject: [Web-SIG] Is the size argument to the input-stream read method optional?
Message-ID: <43A1BCE9.8020403@zope.com>

The PEP is unclear on this and should be clarified, IMO.

Jim

--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From jim at zope.com Thu Dec 15 19:47:30 2005
From: jim at zope.com (Jim Fulton)
Date: Thu, 15 Dec 2005 13:47:30 -0500
Subject: [Web-SIG] Thread-management middleware components?
Message-ID: <43A1BA42.8090406@zope.com>

Has anyone written any thread-management middleware components for WSGI? Many web applications need to run application code in separate threads. Often, the number of threads needs to be limited, either by throttling the rate of thread creation, or by dispatching requests to a thread pool. This is a capability that could be provided by a server; however, it seems that it might be functionality better provided at an intermediate layer to make it more pluggable.

Jim

--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From jim at zope.com Thu Dec 15 21:01:44 2005
From: jim at zope.com (Jim Fulton)
Date: Thu, 15 Dec 2005 15:01:44 -0500
Subject: [Web-SIG] When must applications call the WSGI start_response callable.
Message-ID: <43A1CBA8.2020706@zope.com>

I'm a bit unclear about the timing of the start_response call. I think this is because the PEP is unclear, but perhaps I missed something.

It doesn't appear that the PEP says when the start_response callable must be called. It gives several examples. In most, the callback is called when the application is called, but in one example, the callback is called in the __iter__ of the result of calling the application.

Here's what I think the PEP should say (something like):

"The start_response callback must be:

- called when the application is called,

- called when the result iterator is computed, or

- it must be called asynchronously, typically from an application thread.

Normally an application will call the start_response callable when the application is called or when the result iterator is constructed, as shown in the first 2 examples. An application, or more commonly, a middleware component that provides its own thread management might delay starting the response. A server should not begin iterating over the result until the start_response callable has been called."

Why do I want this? It appears that this would be needed to enable middleware components that manage application threads. I can imagine though that there aren't any existing servers that handle what I've suggested correctly.

I do think it would be straightforward for servers to handle this correctly, especially for asynchronous servers like Twisted and asyncore-based servers. Perhaps this could be an optional feature of the servers. Servers supporting this feature would be prepared to delay response output until start_response is called. Servers unable to do this would generate errors if start_response hasn't been called by the time the result iterator has been constructed.

In any case, I think the PEP needs to specify more clearly when start_response can be called.

Jim

--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From ianb at colorstudy.com Thu Dec 15 21:11:21 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 15 Dec 2005 14:11:21 -0600
Subject: [Web-SIG] When must applications call the WSGI start_response callable.
In-Reply-To: <43A1CBA8.2020706@zope.com>
References: <43A1CBA8.2020706@zope.com>
Message-ID: <43A1CDE9.1000108@colorstudy.com>

Jim Fulton wrote:

> I'm a bit unclear about the timing of the start_response call. I think this is because the PEP is unclear, but perhaps I missed something.
>
> It doesn't appear that the PEP says when the start_response callable must be called. It gives several examples. In most, the callback is called when the application is called, but in one example, the callback is called in the __iter__ of the result of calling the application.
>
> Here's what I think the PEP should say (something like):
>
> "The start_response callback must be:
>
> - called when the application is called,
>
> - called when the result iterator is computed, or
>
> - it must be called asynchronously, typically from an application thread.
>
> Normally an application will call the start_response callable when the application is called or when the result iterator is constructed, as shown in the first 2 examples. An application, or more commonly, a middleware component that provides its own thread management might delay starting the response. A server should not begin iterating over the result until the start_response callable has been called."

My impression is that it is the application's responsibility to call start_response before the first item is returned from the iterator, and it is an error if it does not. However, in paste.lint (http://svn.pythonpaste.org/Paste/trunk/paste/lint.py) I check that start_response is called before the application returns the iterator.
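
The two orderings being compared here can be made concrete with a small sketch. It is an illustration only, not code from paste.lint or from any server: run() is a hypothetical toy driver, and the body is written as bytes on the assumption of a current Python 3 interpreter. The application is a generator, so start_response is called during the first iteration rather than when the application is called; the driver therefore checks only at the moment the first body string arrives.

```python
def application(environ, start_response):
    # Generator app: nothing in this body runs until the server asks
    # for the first item, so start_response is called "late".
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Foo"


def run(app, environ):
    """Toy server loop; returns (status, headers, body) instead of using a socket."""
    started = []
    chunks = []

    def start_response(status, headers, exc_info=None):
        started.append((status, headers))
        return chunks.append  # stands in for the write() callable

    result = app(environ, start_response)
    try:
        for chunk in result:
            # Headers must be known by the time the first body string shows up.
            if not started:
                raise AssertionError("start_response() not called before first body string")
            chunks.append(chunk)
    finally:
        if hasattr(result, "close"):
            result.close()

    if not started:
        raise AssertionError("start_response() was never called")
    status, headers = started[-1]
    return status, headers, b"".join(chunks)


print(run(application, {}))
```

A stricter check, of the kind described for paste.lint above, would instead test `started` immediately after `app(environ, start_response)` returns, and would reject this generator-based application.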
So I guess, at least where I've been inserting paste.lint, that I haven't encountered other examples in practice. But then most of the places I've used it, I wrote the application, and so I've never felt compelled to use a different order. If that's not correct, I'd like to update paste.lint.

> Why do I want this? It appears that this would be needed to enable middleware components that manage application threads. I can imagine though that there aren't any existing servers that handle what I've suggested correctly.
>
> I do think it would be straightforward for servers to handle this correctly, especially for asynchronous servers like Twisted and asyncore-based servers. Perhaps this could be an optional feature of the servers. Servers supporting this feature would be prepared to delay response output until start_response is called. Servers unable to do this would generate errors if start_response hasn't been called by the time the result iterator has been constructed.

I suppose this wouldn't be particularly bad for threaded or multiprocess servers either -- they use a thread/process until the request is completed regardless of what happens. I can see how it could be used to greater effect in an asynchronous server. However, I'd rather it not be optional, as most WSGI apps won't do this, and so servers won't get good testing on this or may just not implement it, and then some apps and some servers won't be compatible.

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From ianb at colorstudy.com Thu Dec 15 21:22:30 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 15 Dec 2005 14:22:30 -0600
Subject: [Web-SIG] Thread-management middleware components?
In-Reply-To: <43A1BA42.8090406@zope.com>
References: <43A1BA42.8090406@zope.com>
Message-ID: <43A1D086.1060704@colorstudy.com>

Jim Fulton wrote:

> Has anyone written any thread-management middleware components for WSGI?
> Many web applications need to run application code in separate threads. Often, the number of threads needs to be limited, either by throttling the rate of thread creation, or by dispatching requests to a thread pool. This is a capability that could be provided by a server, however, it seems that it might be functionality better provided at an intermediate layer to make it more pluggable.

Right now all threading, and concurrency generally, is handled by the server. Since it *has* to be handled by the server, I'm not sure what the advantage would be to duplicating that functionality? Well, strictly speaking you could have a server with wsgi.threaded and wsgi.multiprocess both being false, and the server presumably being asynchronous, but I think that's challenging to fit into the WSGI spec -- there was some discussion some time ago that dwindled off, and I don't think there was ever any resolution on handling asynchronous servers/apps in WSGI.

I don't see a need for a lot of interchangeable thread pools, a handful at most should do.

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From ianb at colorstudy.com Thu Dec 15 20:59:25 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 15 Dec 2005 13:59:25 -0600
Subject: [Web-SIG] Is the size argument to the input-stream read method optional?
In-Reply-To: <43A1BCE9.8020403@zope.com>
References: <43A1BCE9.8020403@zope.com>
Message-ID: <43A1CB1D.7000900@colorstudy.com>

Jim Fulton wrote:

> The PEP is unclear on this and should be clarified, IMO.

My experience in using implementations is that many servers do not require the read size argument (they don't give a TypeError), but they block without it, or if you read past CONTENT_LENGTH. So it should probably be required in the spec, since it's required in practice.
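
In application code, the defensive habit this implies is to always pass an explicit size taken from CONTENT_LENGTH. The helper below is a minimal sketch of that habit; read_request_body is a hypothetical name, and the environ used in the demo is a hand-built stand-in rather than one produced by a real server.

```python
import io


def read_request_body(environ):
    """Read exactly CONTENT_LENGTH bytes from wsgi.input, never more."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0  # treat a malformed header as "no body"
    if length <= 0:
        return b""
    # Always pass an explicit size: a bare read(), or a read past
    # CONTENT_LENGTH, may block on some servers.
    return environ["wsgi.input"].read(length)


# Demo: the stream holds more data than CONTENT_LENGTH announces.
payload = b"name=value"
demo_environ = {
    "CONTENT_LENGTH": str(len(payload)),
    "wsgi.input": io.BytesIO(payload + b"EXTRA"),
}
print(read_request_body(demo_environ))
```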
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From foom at fuhm.net Thu Dec 15 21:29:00 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu, 15 Dec 2005 15:29:00 -0500
Subject: [Web-SIG] When must applications call the WSGI start_response callable.
In-Reply-To: <43A1CBA8.2020706@zope.com>
References: <43A1CBA8.2020706@zope.com>
Message-ID: <1EFFD451-F82C-4973-B2AE-9311B1500A08@fuhm.net>

On Dec 15, 2005, at 3:01 PM, Jim Fulton wrote:

> Normally an application will call the start_response callable when the application is called or when the result iterator is constructed, as shown in the first 2 examples. An application, or more commonly, a middleware component that provides its own thread management might delay starting the response. A server should not begin iterating over the result until the start_response callable has been called."

But it's my understanding that this is valid:

    def test_calledStartResponseLate(self):
        def application(environ, start_response):
            start_response("200 OK", {})
            yield "Foo"

start_response is called _inside_ the first iteration of the result. So the server has to iterate at least once, even if start_response was not called...

I was led to believe this was a valid thing to do from the following wording:

> (Note: the application must invoke the start_response() callable before the iterable yields its first body string, so that the server can send the headers before any body content. However, this invocation may be performed by the iterable's first iteration, so servers must not assume that start_response() has been called before they begin iterating over the iterable.)

James

From jim at zope.com Thu Dec 15 21:55:19 2005
From: jim at zope.com (Jim Fulton)
Date: Thu, 15 Dec 2005 15:55:19 -0500
Subject: [Web-SIG] Thread-management middleware components?
In-Reply-To: <43A1D086.1060704@colorstudy.com>
References: <43A1BA42.8090406@zope.com> <43A1D086.1060704@colorstudy.com>
Message-ID: <43A1D837.8060404@zope.com>

Ian Bicking wrote:

> Jim Fulton wrote:
>
>> Has anyone written any thread-management middleware components for WSGI? Many web applications need to run application code in separate threads. Often, the number of threads needs to be limited, either by throttling the rate of thread creation, or by dispatching requests to a thread pool. This is a capability that could be provided by a server, however, it seems that it might be functionality better provided at an intermediate layer to make it more pluggable.
>
> Right now all threading and generally concurrency is handled by the server. Since it *has* to be handled by the server,

Why does it have to be handled by the server?

> I'm not sure what the advantage would be to duplicating that functionality?

The advantage is that it gives people deploying an application more control. We've recently switched to using WSGI for HTTP in Zope. Our default "out of the box" server of choice is Twisted; however, the current thread-management strategy used by Twisted's WSGI server doesn't meet our needs. I could try to get Twisted to change its strategy, and I probably will, but it would be more flexible to be able to plug something in.

> Well, strictly speaking you could have a server with wsgi.threaded and wsgi.multiprocess both being false, and the server presumably being asynchronous, but I think that's challenging to fit into the WSGI spec -- there was some discussion some time ago that dwindled off, and I don't think there was ever any resolution on handling asynchronous servers/apps in WSGI.

We have long experience with combining an asynchronous network server with a threaded application server. Asynchronous network servers can handle I/O with lots of network clients very efficiently, but only if an application doesn't block.
Real applications often take significant time to compute results. A thread-management facility that bridges asynchronous servers with threaded applications can work very well.

It's possible that my need is specific to using asynchronous servers, but I consider working well with asynchronous servers to be a pretty important requirement.

> I don't see a need for a lot of interchangeable thread pools, a handful at most should do.

I'm not sure what you mean by this. On the one hand, I'd like to be free to choose my own thread-management strategy. On the other hand, if there are multiple asynchronous servers, I don't see why they should each have to maintain their own thread-management subsystems if one can be shared among the different servers.

Jim

--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From jim at zope.com Thu Dec 15 21:59:04 2005
From: jim at zope.com (Jim Fulton)
Date: Thu, 15 Dec 2005 15:59:04 -0500
Subject: [Web-SIG] When must applications call the WSGI start_response callable.
In-Reply-To: <1EFFD451-F82C-4973-B2AE-9311B1500A08@fuhm.net>
References: <43A1CBA8.2020706@zope.com> <1EFFD451-F82C-4973-B2AE-9311B1500A08@fuhm.net>
Message-ID: <43A1D918.1010803@zope.com>

James Y Knight wrote:

> On Dec 15, 2005, at 3:01 PM, Jim Fulton wrote:
>
>> Normally an application will call the start_response callable when the application is called or when the result iterator is constructed, as shown in the first 2 examples. An application, or more commonly, a middleware component that provides its own thread management might delay starting the response. A server should not begin iterating over the result until the start_response callable has been called."
>
> But it's my understanding that this is valid:
>
>     def test_calledStartResponseLate(self):
>         def application(environ, start_response):
>             start_response("200 OK", {})
>             yield "Foo"
>
> start_response is called _inside_ the first iteration of the result. So the server has to iterate at least once, even if start_response was not called...
>
> I was led to believe this was a valid thing to do from the following wording:
>
>> (Note: the application must invoke the start_response() callable before the iterable yields its first body string, so that the server can send the headers before any body content. However, this invocation may be performed by the iterable's first iteration, so servers must not assume that start_response() has been called before they begin iterating over the iterable.)

Aargh, I didn't see that, despite looking for it. I said I may have missed it. Hm.

Jim

--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From pje at telecommunity.com Thu Dec 15 21:35:56 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 15 Dec 2005 15:35:56 -0500
Subject: [Web-SIG] When must applications call the WSGI start_response callable.
In-Reply-To: <43A1CBA8.2020706@zope.com>
Message-ID: <5.1.1.6.0.20051215151106.01e176f0@mail.telecommunity.com>

At 03:01 PM 12/15/2005 -0500, Jim Fulton wrote:

>I'm a bit unclear about the timing of the start_response call. I think this is because the PEP is unclear, but perhaps I missed something.
>
>It doesn't appear that the PEP says when the start_response callable must be called. It gives several examples. In most, the callback is called when the application is called, but in one example, the callback is called in the __iter__ of the result of calling the application.

Hm.
I thought there was something there saying that it had to be called by the time the first value is yielded by the iterable, but it's not explicit. The example *server* in the PEP, however, raises an AssertionError if you violate this rule.

>Here's what I think the PEP should say (something like):
>
>"The start_response callback must be:
>
>- called when the application is called,
>
>- called when the result iterator is computed, or
>
>- it must be called asynchronously, typically from an application thread.

-1 on enabling asynchrony here; it would enormously complicate the design of servers. WSGI is a purely synchronous protocol. Any asynchrony within an application must be masked from the server.

>Normally an application will call the start_response callable when the application is called or when the result iterator is constructed, as shown in the first 2 examples. An application, or more commonly, a middleware component that provides its own thread management might delay starting the response. A server should not begin iterating over the result until the start_response callable has been called."

This would completely break the existing design. Note in particular that some applications do not call start_response until they're in their first iterator next() call; notably any generator-based WSGI apps will do this.

>Why do I want this? It appears that this would be needed to enable middleware components that manage application threads.

No, it's not needed. Such middleware would simply have to return iterators that communicate with the other threads (e.g. via a queue). These iterators would simply have to block until output is available.

> I can imagine though that there aren't any existing servers that handle what I've suggested correctly.

There probably aren't *any*, actually.

>I do think it would be straightforward for servers to handle this correctly, especially for asynchronous servers like Twisted and asyncore-based servers.
>Perhaps this could be an optional feature of the servers. Servers supporting this feature would be prepared to delay response output until start_response is called. Servers unable to do this would generate errors if start_response hasn't been called by the time the result iterator has been constructed.

About a year ago, there was some discussion of designing such an optional "async server" API extension to allow basically the same sort of thing; the only part of the idea that was incorporated, is that an iterator is allowed to yield empty strings to suggest to an async server that it should do other things for a while before trying to get another block from the iterator.

The main thing that kept the async API from gelling was that there was nobody with adequate use cases to motivate the definition. Perhaps that has changed now.

>In any case, I think the PEP needs to specify more clearly when start_response can be called.

It's tempting at this point to allow start_response() to occur at any time until the first non-empty string is yielded, rather than the first string. This would make your thread-management middleware possible, but unfortunately would require a protocol version change, from 1.0 to 1.1. Servers in the field (especially those based on the wsgiref.handlers module) currently require start_response() to be called before the first string, so your middleware couldn't rely on this feature unless it was either optional or a "1.1" feature.

On the other hand, it would probably make more sense to define a server extension like 'wsgi_async.delayed_start'. If present, this would be a special value you could return to indicate that you'll actually respond later.
So the threading middleware might look like:

    def threader_mw(environ, start_response):
        if 'wsgi_async.delayed_start' in environ:
            # add environ+start_response to threadqueue
            # (threadqueue is a placeholder for the middleware's work queue)
            threadqueue.put((environ, start_response))
            return environ['wsgi_async.delayed_start']
        else:
            # run request synchronously
            # (application is a placeholder for the wrapped WSGI app)
            return application(environ, start_response)

The threads would then have to use write() to send data. Anyway, this would allow async servers to let apps handle their own thread pooling, although in the general case I think it's a lousy idea. An async server like Twisted already has a thread pooling facility, and application-specific pools would just duplicate that and waste resources. Meanwhile, this hypothetical threading middleware seems like useless overhead for synchronous servers. From pje at telecommunity.com Thu Dec 15 22:03:08 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 15 Dec 2005 16:03:08 -0500 Subject: [Web-SIG] When must applications call the WSGI start_response callable. In-Reply-To: <1EFFD451-F82C-4973-B2AE-9311B1500A08@fuhm.net> References: <43A1CBA8.2020706@zope.com> <43A1CBA8.2020706@zope.com> Message-ID: <5.1.1.6.0.20051215160226.030ba288@mail.telecommunity.com> At 03:29 PM 12/15/2005 -0500, James Y Knight wrote: >I was led to believe this was a valid thing to do from the following >wording: > > (Note: the application must invoke the start_response() callable > > before the iterable yields its first body string, so that the > > server can send the headers before any body content. However, this > > invocation may be performed by the iterable's first iteration, so > > servers must not assume that start_response() has been called > > before they begin iterating over the iterable.) Aha! I knew it was in there somewhere. :) From jim at zope.com Thu Dec 15 22:02:59 2005 From: jim at zope.com (Jim Fulton) Date: Thu, 15 Dec 2005 16:02:59 -0500 Subject: [Web-SIG] When must applications call the WSGI start_response callable.
In-Reply-To: <43A1CDE9.1000108@colorstudy.com> References: <43A1CBA8.2020706@zope.com> <43A1CDE9.1000108@colorstudy.com> Message-ID: <43A1DA03.4030502@zope.com> Ian Bicking wrote: > Jim Fulton wrote: ... >> Why do I want this? It appears that this would be needed to enable >> middleware components that manage application threads. I can imagine >> though that there aren't any existing servers that handle what I've >> suggested correctly. >> >> I do think it would be straightforward for servers to handle this >> correctly, especially for asynchronous servers like Twisted >> and asyncore-based servers. Perhaps this could be an optional feature >> of the servers. Servers supporting this feature would be prepared to >> delay response output until start_response is called. Servers unable >> to do this would generate errors if start_response hasn't been called >> by the time the result iterator has been constructed. > > > I suppose this wouldn't be particularly bad for threaded or multiprocess > servers either -- they use a thread/process until the request is > completed regardless of what happens. Except that it makes the implementation a bit more complex. > I can see how it could be used to > greater effect in an asynchronous server. However, I'd rather it not be > optional, as most WSGI apps won't do this, and so servers won't get good > testing on this or may just not implement it, and then some apps and > some servers won't be compatible. I mostly agree, except that I think this feature may only be useful for asynchronous servers. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From ianb at colorstudy.com Thu Dec 15 22:10:51 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 15 Dec 2005 15:10:51 -0600 Subject: [Web-SIG] Thread-management middleware components?
In-Reply-To: <43A1D837.8060404@zope.com> References: <43A1BA42.8090406@zope.com> <43A1D086.1060704@colorstudy.com> <43A1D837.8060404@zope.com> Message-ID: <43A1DBDB.2030805@colorstudy.com> Jim Fulton wrote: >> Right now all threading and generally concurrency is handled by the >> server. Since it *has* to be handled by the server, > > > Why does it have to be handled by the server? Because most WSGI apps are blocking, so unless you want the server to be non-concurrent it has to handle this. Of course you could design a non-concurrent WSGI server that *had* to be used with some threading middleware. WSGI doesn't seem like a good fit for that, though. > > I'm not sure what >> the advantage would be to duplicating that functionality? > > > The advantage is that it gives people deploying an application > more control. We've recently switched to using WSGI for > HTTP in Zope. Our default "out of the box" server of choice > is Twisted, however, the current thread-management strategy used by > Twisted's WSGI server doesn't meet our needs. I could try to get > Twisted to change its strategy, and I probably will, but it would > be more flexible to be able to plug something in. I think in this particular case -- barring direct changes to Twisted -- it would make more sense to build on Twisted's non-WSGI asynchronous application support, and build a threadpool that calls WSGI from there. > > Well, >> strictly speaking you could have a server with wsgi.multithread and >> wsgi.multiprocess both being false, and the server presumably being >> asynchronous, but I think that's challenging to fit into the WSGI spec >> -- there was some discussion some time ago that dwindled off, and I >> don't think there was ever any resolution on handling asynchronous >> servers/apps in WSGI. > > > We have long experience with combining an asynchronous network server > with a threaded application server.
Asynchronous network servers can > handle I/O with lots of network clients very efficiently, but only > if an application doesn't block. Real applications often take > significant time to compute results. A thread-management facility that > bridges asynchronous servers with threaded applications can work very well. > > It's possible that my need is specific to using asynchronous servers, > but I consider working well with asynchronous servers to be a pretty > important requirement. I think the server has to be synchronous by the time it calls a WSGI app. There's nothing saying that the WSGI support in Twisted is the WSGI support you have to use. My impression is that it is hard to standardize anything async-related because they use slightly different conventions on how to do async (e.g., Deferred vs. ad hoc callbacks). So... whatever standardization there is to be done there is probably below WSGI. >> I don't see a need for a lot of interchangeable thread pools, a >> handful at most should do. > > > I'm not sure what you mean by this. > > On the one hand, I'd like to be free to choose my own thread-management > strategy. On the other hand, if there are multiple asynchronous servers, > I don't see why they should each have to maintain their own > thread-management > subsystems if one can be shared among the different servers. Sure, but if there's only, say, 4 viable strategies and 3 serious async servers (are there even that many of either?) then it's easier just to figure out how to plug each strategy in on a case-by-case basis, and discuss the concrete issues with the server developers. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From jim at zope.com Thu Dec 15 22:39:07 2005 From: jim at zope.com (Jim Fulton) Date: Thu, 15 Dec 2005 16:39:07 -0500 Subject: [Web-SIG] Thread-management middleware components?
In-Reply-To: <43A1DBDB.2030805@colorstudy.com> References: <43A1BA42.8090406@zope.com> <43A1D086.1060704@colorstudy.com> <43A1D837.8060404@zope.com> <43A1DBDB.2030805@colorstudy.com> Message-ID: <43A1E27B.4020306@zope.com> Ian Bicking wrote: > Jim Fulton wrote: > >>> Right now all threading and generally concurrency is handled by the >>> server. Since it *has* to be handled by the server, >> >> >> >> Why does it have to be handled by the server? > > > Because most WSGI apps are blocking, so unless you want the server to be > non-concurrent it has to handle this. Of course you design a > non-concurrent WSGI server that *had* to be used with some threading > middleware. Actually, I suggest a WSGI server that *can* be used with threading middleware. > WSGI doesn't seem like a good fit for that, though. ... > I think in this particular case -- barring direct changes to Twisted -- > it would make more sense to build on Twisted's non-WSGI asyncronous > application support, and build a threadpool that calls WSGI from there. I don't want to maintain a non-WSGI interface and I don't want to maintain my own WSGI Twisted interface. ... > I think the server has to be synchronous by the time it calls a WSGI > app. There's nothing saying that the WSGI support in Twisted is the > WSGI support you have to use. No, but I have good reasons for wanting to use it. > My impression is that it is hard to standardize anything async-related > because they use slightly different conventions on how to do async > (e.g., Deferred vs. ad hoc callbacks). So... whatever standardization > there is to be done there is probably below WSGI. ... > Sure, but if there's only, say, 4 viable strategies and 3 serious async > servers (are there even that many of either?) then it's easier just to > figure out how to plug each strategy in on a case-by-case basis, and > discuss the concrete issues with the server developers. That's what I'll do if I have to. 
Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From foom at fuhm.net Sat Dec 17 22:50:05 2005 From: foom at fuhm.net (James Y Knight) Date: Sat, 17 Dec 2005 16:50:05 -0500 Subject: [Web-SIG] WSGI thread affinity/interleaving Message-ID: <4085084D-D9F8-4A45-8C22-D34C287519AE@fuhm.net> So this came up when I was writing the twisted WSGI support, but at that point I just took the most conservative view and forgot to revisit the issue. 1) Take the following application:

    import thread

    def simple_wsgi_app(environ, start_response):
        # start_response takes the status and a list of header tuples
        start_response("200 OK", [])
        yield str(thread.get_ident())
        yield str(thread.get_ident())

Is there any guarantee that both times the iterator's .next() is called, they will be on the same thread? 2)

    d = {}
    def simple_wsgi_app(environ, start_response):
        d[thread.get_ident()] = 0
        start_response("200 OK", [])
        yield "Start"
        assert d[thread.get_ident()] == 0
        d[thread.get_ident()] += 1
        yield "Done"

Is there any guarantee that this will work? That is, is it possible that at the first "yield", another application will be allowed to run, in the same thread? (Of course you'd probably actually want to use threading.local, not a dict of thread.get_ident, but, same idea.) In Twisted, from the first entry to an application's code, until it's finished, it runs on the single thread, with nothing else running on that thread. This means that any app which is paused from generating too much output data for the client will be holding up a thread. When the app returns an iterator, Twisted could be running other requests on that thread while waiting for the client to read some data, if thread affinity is not required. James From pje at telecommunity.com Sat Dec 17 23:25:21 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Sat, 17 Dec 2005 17:25:21 -0500 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <4085084D-D9F8-4A45-8C22-D34C287519AE@fuhm.net> Message-ID: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> At 04:50 PM 12/17/2005 -0500, James Y Knight wrote: >So this came up when I was writing the twisted WSGI support, but at >that point I just took the most conservative view and forgot to >revisit the issue. > >1) >Take the following application:
>def simple_wsgi_app(environ, start_response):
>    start_response("200 OK", [])
>    yield str(thread.get_ident())
>    yield str(thread.get_ident())
>
>Is there any guarantee that both times the iterator's .next() is >called, they will be on the same thread?
I thought I included something in the spec to the effect that there's no guarantee that each next() will be called in the same thread. But it might just have been discussed and not actually edited into the spec. >2)
>d = {}
>def simple_wsgi_app(environ, start_response):
>    d[thread.get_ident()] = 0
>    start_response("200 OK", [])
>    yield "Start"
>    assert d[thread.get_ident()] == 0
>    d[thread.get_ident()] += 1
>    yield "Done"
>
>Is there any guarantee that this will work? That is, is it possible >that at the first "yield", another application will be allowed to >run, in the same thread? > >(Of course you'd probably actually want to use threading.local, not a >dict of thread.get_ident, but, same idea.)
Yeah, that was the thing, I don't think we wanted to guarantee thread affinity across yields, either in the sense of restricting a thread for one app *or* an app to one thread. This does mean that iterator-based apps can't rely on thread-local variables. I've recently written a "Contextual" library that actually makes it easy for the task controller to manage this, by swapping a thread's context in and out when you switch between tasks, but of course it won't work for anything that doesn't use Contextual variables.
I originally proposed Contextual for the stdlib in a pre-PEP, but Guido waved it off on the basis that PEPs 342 and 343 aren't field-deployed yet and the usefulness is unproven. WSGI, however, would be an example of a case where contextual task-locals are needed even with today's Python, sans PEPs 342 and 343. From foom at fuhm.net Sun Dec 18 19:27:20 2005 From: foom at fuhm.net (James Y Knight) Date: Sun, 18 Dec 2005 13:27:20 -0500 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> Message-ID: On Dec 17, 2005, at 5:25 PM, Phillip J. Eby wrote: > Yeah, that was the thing, I don't think we wanted to guarantee > thread affinity across yields, either in the sense of restricting a > thread for one app *or* an app to one thread. > > This does mean that iterator-based apps can't rely on thread-local > variables. I've recently written a "Contextual" library that > actually makes it easy for the task controller to manage this, by > swapping a thread's context in and out when you switch between > tasks, but of course it won't work for anything that doesn't use > Contextual variables. I originally proposed Contextual for the > stdlib in a pre-PEP, but Guido waved it off on the basis that PEPs > 342 and 343 aren't field-deployed yet and the usefulness is > unproven. WSGI, however, would be an example of a case where > contextual task-locals are needed even with today's Python, sans > PEPs 342 and 343. I'm worried about database access. Most DBAPI adapters have threadsafety level 2: "Threads may share the module and connections.". So with those, at least, it should be fine to move a connection between threads, since "share OK" implies "move OK". However, no documentation I've found has said anything separately about whether it's safe to _move_ a cursor between threads. 
It seems likely to me that it would not be safe, at least in some database adapters. And if it's not safe, that means a WSGI result iterator cannot use any DBAPI cursor functionality which seems a drag. Does anybody have practical experience with the safety of moving a DBAPI cursor between threads? James From ianb at colorstudy.com Sun Dec 18 20:33:05 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Sun, 18 Dec 2005 13:33:05 -0600 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> Message-ID: <43A5B971.1010408@colorstudy.com> James Y Knight wrote: > I'm worried about database access. Most DBAPI adapters have > threadsafety level 2: "Threads may share the module and > connections.". So with those, at least, it should be fine to move a > connection between threads, since "share OK" implies "move OK". > However, no documentation I've found has said anything separately > about whether it's safe to _move_ a cursor between threads. It seems > likely to me that it would not be safe, at least in some database > adapters. And if it's not safe, that means a WSGI result iterator > cannot use any DBAPI cursor functionality which seems a drag. > > Does anybody have practical experience with the safety of moving a > DBAPI cursor between threads? I haven't done that, but SQLite (2?) notably doesn't allow you to move a connection between threads. I'm not actually sure what problems it causes if you do move them -- it may simply be an overzealous warning. CCing DB-SIG -- people there might know more details. 
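
One way an application can sidestep the cursor question entirely is to finish all of its database work before returning its iterable, so that no connection or cursor is ever touched across a yield. A minimal sketch in current Python, assuming the stdlib sqlite3 module and a throwaway in-memory table; rows_app and the table t are hypothetical names used only for illustration:

```python
import sqlite3


def rows_app(environ, start_response):
    # Hypothetical WSGI app: every DB-API call happens inside this one
    # call, i.e. on whichever thread the server used to invoke the app.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (name TEXT)")
        conn.executemany("INSERT INTO t VALUES (?)", [("a",), ("b",)])
        # fetchall() materializes the result, so the cursor is finished
        # before the server starts iterating over the response body.
        rows = conn.execute("SELECT name FROM t ORDER BY name").fetchall()
    finally:
        conn.close()

    start_response("200 OK", [("Content-Type", "text/plain")])
    # A plain list rather than a generator: nothing database-related
    # is left to run when the server iterates.
    return [("%s\n" % name).encode("ascii") for (name,) in rows]
```

The obvious cost is that the whole result set sits in memory, so this does nothing for the streaming case; it only avoids relying on cursors surviving a change of thread.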
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From p.f.moore at gmail.com Sun Dec 18 22:33:12 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 18 Dec 2005 21:33:12 +0000 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <43A5B971.1010408@colorstudy.com> References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> <43A5B971.1010408@colorstudy.com> Message-ID: <79990c6b0512181333h7445b21ch4153f127b74ca556@mail.gmail.com> On 12/18/05, Ian Bicking wrote: > James Y Knight wrote: > > Does anybody have practical experience with the safety of moving a > > DBAPI cursor between threads? > > I haven't done that, but SQLite (2?) notably doesn't allow you to move a > connection between threads. I'm not actually sure what problems it > causes if you do move them -- it may simply be an overzealous warning. > > CCing DB-SIG -- people there might know more details. I can confirm that cx_Oracle does not like cursors being shared between threads. I even recall crashes (but can't verify this - once I checked and found I shouldn't be doing this, I stopped - the problem was intermittent, as is the nature of thread bugs :-(). Paul. From gh at ghaering.de Sun Dec 18 23:18:23 2005 From: gh at ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=) Date: Sun, 18 Dec 2005 23:18:23 +0100 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <43A5B971.1010408@colorstudy.com> References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> <43A5B971.1010408@colorstudy.com> Message-ID: <43A5E02F.4090206@ghaering.de> Ian Bicking wrote: > James Y Knight wrote: >> I'm worried about database access. Most DBAPI adapters have >> threadsafety level 2: "Threads may share the module and >> connections.". So with those, at least, it should be fine to move a >> connection between threads, since "share OK" implies "move OK". 
>> However, no documentation I've found has said anything separately >> about whether it's safe to _move_ a cursor between threads. It seems >> likely to me that it would not be safe, at least in some database >> adapters. And if it's not safe, that means a WSGI result iterator >> cannot use any DBAPI cursor functionality which seems a drag. >> >> Does anybody have practical experience with the safety of moving a >> DBAPI cursor between threads? > > I haven't done that, but SQLite (2?) notably doesn't allow you to move a > connection between threads. I'm not actually sure what problems it > causes if you do move them -- it may simply be an overzealous warning. It's the same for SQLite 3. The problem is, as far as I understand, that POSIX file locks don't work reliably when they're accessed from multiple threads. That's why the SQLite *docs* always said that you cannot share a SQLite database handle between threads. And pysqlite as well as apsw both fire exceptions if you try to do so. In recent SQLite 3.x versions, SQLite itself would detect this and return an error on *nix too FWIW. pysqlite does have an option to turn the check off, for people who want to shoot themselves in the foot. Fortunately for them, they nowadays get an error-message from SQLite on non-Windows systems anyway ;-) -- Gerhard From foom at fuhm.net Mon Dec 19 01:34:56 2005 From: foom at fuhm.net (James Y Knight) Date: Sun, 18 Dec 2005 19:34:56 -0500 Subject: [Web-SIG] [DB-SIG] WSGI thread affinity/interleaving In-Reply-To: <43A5F757.2030906@egenix.com> References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> <43A5B971.1010408@colorstudy.com> <43A5F757.2030906@egenix.com> Message-ID: <6B850331-4947-4824-84A3-2C04BC32BEA8@fuhm.net> On Dec 18, 2005, at 6:57 PM, M.-A. Lemburg wrote: > Ian Bicking wrote: > >> James Y Knight wrote: >> >>> I'm worried about database access. Most DBAPI adapters have >>> threadsafety level 2: "Threads may share the module and >>> connections.". 
So with those, at least, it should be fine to move a >>> connection between threads, since "share OK" implies "move OK". >>> > > What exactly do you mean with "move" ? Sharing a > connection refers to multiple threads creating cursors > on this connection. I'm asking about moving a cursor, that is, accessing it sequentially first from one thread, then later from another thread. This is potentially asking less than sharing, that is, accessing it simultaneously from two threads. For example, a simple class without any locking, that only modifies itself, would generally be movable between threads, but not sharable. Adding a mutex would make it both. James From foom at fuhm.net Mon Dec 19 19:48:03 2005 From: foom at fuhm.net (James Y Knight) Date: Mon, 19 Dec 2005 13:48:03 -0500 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <43A5B971.1010408@colorstudy.com> References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> <43A5B971.1010408@colorstudy.com> Message-ID: On Dec 18, 2005, at 2:33 PM, Ian Bicking wrote: > James Y Knight wrote: > >> I'm worried about database access. Most DBAPI adapters have >> threadsafety level 2: "Threads may share the module and >> connections.". So with those, at least, it should be fine to move >> a connection between threads, since "share OK" implies "move >> OK". However, no documentation I've found has said anything >> separately about whether it's safe to _move_ a cursor between >> threads. It seems likely to me that it would not be safe, at >> least in some database adapters. And if it's not safe, that means >> a WSGI result iterator cannot use any DBAPI cursor functionality >> which seems a drag. >> Does anybody have practical experience with the safety of moving >> a DBAPI cursor between threads? >> > > I haven't done that, but SQLite (2?) notably doesn't allow you to > move a connection between threads. 
I'm not actually sure what > problems it causes if you do move them -- it may simply be an > overzealous warning. > > CCing DB-SIG -- people there might know more details. Okay, so I think the overall recommendation from DB-SIG is "don't do that". I'm not sure where that leaves the WSGI discussion now? "Don't use databases from a result iterator", I guess (unless threadsafety == 3)? But do anybody else's WSGI server implementations move apps between threads? I don't especially want to make Twisted's be unique in this way even if it is technically allowed, as I can only see it causing problems when people's apps *do* try to use databases from result iterators and *do* work everywhere else... James From fumanchu at amor.org Mon Dec 19 20:59:28 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon, 19 Dec 2005 11:59:28 -0800 Subject: [Web-SIG] WSGI thread affinity/interleaving Message-ID: <6949EC6CD39F97498A57E0FA55295B2153CB79@ex9.hostedexchange.local> James Y Knight wrote: > >> I'm worried about database access. Most DBAPI adapters have > >> threadsafety level 2: "Threads may share the module and > >> connections.". So with those, at least, it should be fine to move > >> a connection between threads, since "share OK" implies "move > >> OK". However, no documentation I've found has said anything > >> separately about whether it's safe to _move_ a cursor between > >> threads. It seems likely to me that it would not be safe, at > >> least in some database adapters. And if it's not safe, > that means > >> a WSGI result iterator cannot use any DBAPI cursor functionality > >> which seems a drag. > > Okay, so I think the overall recommendation from DB-SIG is "don't do > that". I'm not sure where that leaves the WSGI discussion > now? "Don't > use databases from a result iterator", I guess (unless threadsafety > == 3)? But do anybody else's WSGI server implementations move apps > between threads? 
I don't especially want to make Twisted's be unique > in this way even if it is technically allowed, as I can only see it > causing problems when people's apps *do* try to use databases from > result iterators and *do* work everywhere else... I have to admit that none of the apps, servers, or gateways I've worked on have allowed for thread-moving or -sharing. I'm pretty well convinced that CherryPy, for example, won't be able to support that anytime soon--thread isolation is too well baked in. Couldn't someone write a piece of WSGI middleware that takes requests from an async server and dispatches them to a pool of Queues? The consumer side of the Queue would then call the WSGI app with the same thread each time for a given request, but the async-server side would be free to create new requests and fetch results from different threads. Sort of an async-to-threaded bridge. I would think, even if you chose not to build that into your WSGI wrapper, that it would be generic enough to be quite useful for any async server + threaded app. I'll refrain from any predictions about performance, however... ;) Robert Brewer System Architect Amor Ministries fumanchu at amor.org From pje at telecommunity.com Mon Dec 19 21:34:32 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 19 Dec 2005 15:34:32 -0500 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <6949EC6CD39F97498A57E0FA55295B2153CB79@ex9.hostedexchange. local> Message-ID: <5.1.1.6.0.20051219152139.034a22c8@mail.telecommunity.com> At 11:59 AM 12/19/2005 -0800, Robert Brewer wrote: >Couldn't someone write a piece of WSGI middleware that takes requests >from an async server and dispatches them to a pool of Queues? The >consumer side of the Queue would then call the WSGI app with the same >thread each time for a given request, but the async-server side would be >free to create new requests and fetch results from different threads. >Sort of an async-to-threaded bridge. 
I would think, even if you chose >not to build that into your WSGI wrapper, that it would be generic >enough to be quite useful for any async server + threaded app. I'll >refrain from any predictions about performance, however... ;) This was Jim Fulton's suggestion, and it's beginning to make more sense. :) Unfortunately I don't think there's a reasonable way to integrate it with the host server's threadpool (e.g. the Twisted threadpool). We should keep an eye, however, on the fact that the vast majority of WSGI apps' requests can and should be handled in a single synchronous iteration. Multiple iterations are primarily useful for large files, and streaming/push applications. These are the *only* reasons the spec allows multiple writes or iterations. Applications are supposed to do their own buffering in all other cases, to minimize the number of blocks shuffled up and down the middleware chain. That being the case, the simplest way to ensure thread affinity in Twisted is to just farm out the entire processing of a given request to a reactor.callInThread(). The only applications for which this is not suitable will be large files and streaming/push, which will tie up threads that they probably shouldn't. To handle those use cases, a customized threadpool mechanism would be needed, wherein each thread would have an event loop going over the currently active iterators and adding new ones from a master request queue whenever the thread-local queue dropped below a threshold. From renesd at gmail.com Mon Dec 19 22:45:11 2005 From: renesd at gmail.com (Rene Dudfield) Date: Tue, 20 Dec 2005 08:45:11 +1100 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <5.1.1.6.0.20051219152139.034a22c8@mail.telecommunity.com> References: <5.1.1.6.0.20051219152139.034a22c8@mail.telecommunity.com> Message-ID: <64ddb72c0512191345l7f5295f6o141dc3aee0931560@mail.gmail.com> Large files should just return a file.
So that the file descriptor is available for the most efficient sending. So you could use sendfile(2), or another helper process send the file. On 12/20/05, Phillip J. Eby wrote: > At 11:59 AM 12/19/2005 -0800, Robert Brewer wrote: > >Couldn't someone write a piece of WSGI middleware that takes requests > >from an async server and dispatches them to a pool of Queues? The > >consumer side of the Queue would then call the WSGI app with the same > >thread each time for a given request, but the async-server side would be > >free to create new requests and fetch results from different threads. > >Sort of an async-to-threaded bridge. I would think, even if you chose > >not to build that into your WSGI wrapper, that it would be generic > >enough to be quite useful for any async server + threaded app. I'll > >refrain from any predictions about performance, however... ;) > > This was Jim Fulton's suggestion, and it's beginning to makes more > sense. :) Unfortunately I don't think there's a reasonable way to > integrate it with the host server's threadpool (e.g. the Twisted threadpool). > > We should keep an eye, however, on the fact that the vast majority of WSGI > apps' requests can and should be handled in a single synchronous > iteration. Multiple iterations are primarily useful for large files, and > streaming/push applications. These are the *only* reason the spec allows > multiple writes or iterations. Applications are supposed to do their own > buffering in all other cases, to minimize the number of blocks shuffled up > and down the middleware chain. > > That being the case, the simplest way to ensure thread affinity in Twisted > is to just farm out the entire processing of a given request to a > reactor.callInThread(). The only applications for which this is not > suitable will be large files and streaming/push, which will tie up threads > that they probably shouldn't. 
To handle those use cases, a customized > threadpool mechanism would be needed, wherein each thread would have an > event loop going over the currently active iterators and adding new ones > from a master request queue whenever the thread-local queue dropped below a > threshold. > > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: http://mail.python.org/mailman/options/web-sig/renesd%40gmail.com > From foom at fuhm.net Mon Dec 19 23:22:09 2005 From: foom at fuhm.net (James Y Knight) Date: Mon, 19 Dec 2005 17:22:09 -0500 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <5.1.1.6.0.20051219152139.034a22c8@mail.telecommunity.com> References: <5.1.1.6.0.20051219152139.034a22c8@mail.telecommunity.com> Message-ID: <31A611AA-5C46-4516-AF89-9FCDF054FFCE@fuhm.net> On Dec 19, 2005, at 3:34 PM, Phillip J. Eby wrote: > We should keep an eye, however, on the fact that the vast majority > of WSGI apps' requests can and should be handled in a single > synchronous iteration. Multiple iterations are primarily useful > for large files, and streaming/push applications. These are the > *only* reason the spec allows multiple writes or iterations. > Applications are supposed to do their own buffering in all other > cases, to minimize the number of blocks shuffled up and down the > middleware chain. > > That being the case, the simplest way to ensure thread affinity in > Twisted is to just farm out the entire processing of a given > request to a reactor.callInThread(). Yes, this is how it works currently. I was pondering relaxing that, if the spec allowed. I'm now pretty much convinced that WSGI servers _should not_ move applications among threads between yields of the result iterator, and thus, will be leaving the twisted code that handles this alone. Even though the requirement is not stated in the spec, it seems to be a practical requirement. 
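
For what it's worth, the thread-affinity approach described above (run the whole request, including every step of the result iterator, on one thread) can be sketched with nothing but the standard library. This is only an illustration in current Python, not Twisted's actual code: run_in_one_thread, demo_app and the _DONE sentinel are made-up names, and a real server would use a thread pool rather than one thread per request.

```python
import queue
import threading

_DONE = object()  # sentinel: the worker has finished producing output


def run_in_one_thread(app, environ):
    """Call a WSGI app and iterate its result entirely in one worker thread.

    Returns (response, chunks). `response` is a dict filled in by
    start_response; `chunks` is a generator that may be consumed from
    any thread, because it only reads from a queue.
    """
    out = queue.Queue()
    response = {}

    def start_response(status, headers, exc_info=None):
        response["status"] = status
        response["headers"] = headers
        return out.put  # the write() callable just enqueues data

    def worker():
        result = None
        try:
            result = app(environ, start_response)
            for chunk in result:
                out.put(chunk)
        finally:
            # Errors are not forwarded in this sketch; the consumer
            # simply sees the body end early.
            if hasattr(result, "close"):
                result.close()
            out.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()

    def chunks():
        while True:
            item = out.get()
            if item is _DONE:
                return
            yield item

    return response, chunks()


def demo_app(environ, start_response):
    # Same shape as the example earlier in the thread: report the
    # thread id on each side of a yield.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield str(threading.get_ident()).encode("ascii")
    yield str(threading.get_ident()).encode("ascii")
```

With this arrangement both chunks from demo_app carry the same thread id, whichever thread drains the generator. The price is the one already noted: a slow client, or a streaming app, keeps its worker thread occupied for the life of the response.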
> The only applications for which this is not suitable will be large > files and streaming/push, which will tie up threads that they > probably shouldn't. Large files is already supported by wsgi.file_wrapper, at least if you're not fiddling with the file as it goes through. That leaves streaming/push, which I'm not sure is a big enough use case to actually care about. At least IMO, if you want efficient streaming support without using up a bunch of threads, use twisted's APIs directly rather than some yet-to-be-invented WSGI extension. James From fumanchu at amor.org Mon Dec 19 23:23:46 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon, 19 Dec 2005 14:23:46 -0800 Subject: [Web-SIG] WSGI thread affinity/interleaving Message-ID: <6949EC6CD39F97498A57E0FA55295B2153CB7D@ex9.hostedexchange.local> Rene Dudfield wrote: > Large files should just return a file. So that the file descriptor is > available for the most efficient sending. > > So you could use sendfile(2), or another helper process send the file. Large *files*, perhaps, but using HTTP for static files is so 2001 ;). The "streaming/push" requirement is more important to me. Just this morning, one of my users ran a large report (which I thought had been set up to stream its output, but isn't doing that now). He specifically asked that it not wait to be completely-formed before rendering: The GSR used to build immediately on the screen when choosing "Current Trips". Now the entire GSR builds "off screen"...then pops on the screen when it is completely built. This makes for a lot of waiting. Can we get the GSR to build immediately on the screen again? Robert Brewer System Architect Amor Ministries fumanchu at amor.org From pje at telecommunity.com Mon Dec 19 23:25:05 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon, 19 Dec 2005 17:25:05 -0500 Subject: [Web-SIG] WSGI thread affinity/interleaving In-Reply-To: <64ddb72c0512191345l7f5295f6o141dc3aee0931560@mail.gmail.com> References: <5.1.1.6.0.20051219152139.034a22c8@mail.telecommunity.com> <5.1.1.6.0.20051219152139.034a22c8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051219172250.0209c488@mail.telecommunity.com> At 08:45 AM 12/20/2005 +1100, Rene Dudfield wrote: >Large files should just return a file. So that the file descriptor is >available for the most efficient sending. > >So you could use sendfile(2), or another helper process send the file. This isn't an option for e.g. files stored in a database (including ZODB), or generated on the fly, although I suppose you could use a temporary file. Even with a temporary file, however, it doesn't address streaming/push. From jim at zope.com Wed Dec 21 16:20:44 2005 From: jim at zope.com (Jim Fulton) Date: Wed, 21 Dec 2005 10:20:44 -0500 Subject: [Web-SIG] Is the size argument to the input-stream read method optional? In-Reply-To: <43A1CB1D.7000900@colorstudy.com> References: <43A1BCE9.8020403@zope.com> <43A1CB1D.7000900@colorstudy.com> Message-ID: <43A972CC.9090204@zope.com> Ian Bicking wrote: > Jim Fulton wrote: > >> The PEP is unclear on this and should be clarified, IMO. > > > My experience in using implementations is many servers do not require > the read size argument (they don't give a TypeError), but they block > without it, or if you read past CONTENT_LENGTH. So it should probably > be required in the spec, since it's required in practice. Does this constitute a decision? Can somebody update the PEP? I am able and willing to if requested to. :) Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Dec 21 17:25:05 2005 From: jim at zope.com (Jim Fulton) Date: Wed, 21 Dec 2005 11:25:05 -0500 Subject: [Web-SIG] Questions/suggestions on 'wsgi.file_wrapper' Message-ID: <43A981E1.4090609@zope.com> Here are some questions and suggestions on the 'wsgi.file_wrapper' part of the WSGI API: 1. Does this need to be optional? It seems that it would be easy for any server to provide this, and it would be nice for applications to be able to rely on it. 2. If the file-like object passed has a close method, wouldn't it be acceptable for the iterator returned by wsgi.file_wrapper to close it when iteration is done? I would slightly prefer: "It may have a close() method, and if so, the iterable returned by wsgi.file_wrapper must have a close() method that invokes the original file-like object's close() method, or the iterable must close the file when the file-like object's read method returns no data." I prefer this because it allows a simple generator implementation of a default wsgi.file_wrapper. 3. The server should be allowed to use the file wrapper in a different thread than the one used to run the application. This should be noted. Applications should not return file-like objects that rely on running in the same thread. This too should be noted. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Dec 21 18:29:13 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 21 Dec 2005 12:29:13 -0500 Subject: [Web-SIG] Is the size argument to the input-stream read method optional?
In-Reply-To: <43A972CC.9090204@zope.com> References: <43A1CB1D.7000900@colorstudy.com> <43A1BCE9.8020403@zope.com> <43A1CB1D.7000900@colorstudy.com> Message-ID: <5.1.1.6.0.20051221122030.03cf9858@mail.telecommunity.com> At 10:20 AM 12/21/2005 -0500, Jim Fulton wrote: >Ian Bicking wrote: > > Jim Fulton wrote: > > > >> The PEP is unclear on this and should be clarified, IMO. > > > > > > My experience in using implementations is many servers do not require > > the read size argument (they don't give a TypeError), but they block > > without it, or if you read past CONTENT_LENGTH. So it should probably > > be required in the spec, since it's required in practice. > >Does this constitude a decision? Can somebody update the PEP? I thought the PEP was actually pretty clear on this already. It says that the application should not attempt to read more data than is specified by CONTENT_LENGTH - which means that you can't omit the read() argument and avoid that. An application that omits the argument is therefore off-spec, and a server is thus well within its rights to reject this. As far as I know, there is also no circumstance under which a previously-working application (using CGI or some similar protocol) would be able to use read() without an argument and work correctly with any non-ancient version of HTTP. I'm happy to entertain suggestions for language that would make this more obvious. How about just adding """The "size" argument is required and must be a positive integer.""" to the existing note 1? From jim at zope.com Wed Dec 21 18:38:26 2005 From: jim at zope.com (Jim Fulton) Date: Wed, 21 Dec 2005 12:38:26 -0500 Subject: [Web-SIG] Is the size argument to the input-stream read method optional? 
In-Reply-To: <5.1.1.6.0.20051221122030.03cf9858@mail.telecommunity.com> References: <43A1CB1D.7000900@colorstudy.com> <43A1BCE9.8020403@zope.com> <43A1CB1D.7000900@colorstudy.com> <5.1.1.6.0.20051221122030.03cf9858@mail.telecommunity.com> Message-ID: <43A99312.1060502@zope.com> Phillip J. Eby wrote: > At 10:20 AM 12/21/2005 -0500, Jim Fulton wrote: > >> Ian Bicking wrote: >> > Jim Fulton wrote: >> > >> >> The PEP is unclear on this and should be clarified, IMO. >> > >> > >> > My experience in using implementations is many servers do not require >> > the read size argument (they don't give a TypeError), but they block >> > without it, or if you read past CONTENT_LENGTH. So it should probably >> > be required in the spec, since it's required in practice. >> >> Does this constitude a decision? Can somebody update the PEP? > > > I thought the PEP was actually pretty clear on this already. It says > that the application should not attempt to read more data than is > specified by CONTENT_LENGTH - which means that you can't omit the read() > argument and avoid that. An application that omits the argument is > therefore off-spec, and a server is thus well within its rights to > reject this. As far as I know, there is also no circumstance under > which a previously-working application (using CGI or some similar > protocol) would be able to use read() without an argument and work > correctly with any non-ancient version of HTTP. In Zope and twisted's wsgi server implementation, the input read method treats the character at position content length (counting from 1) as the last character in the file. So read without argument reads the remaining characters up to the content length. This isn't inconsistent with the current language. > I'm happy to entertain suggestions for language that would make this > more obvious. How about just adding """The "size" argument is required > and must be a positive integer.""" to the existing note 1? I think this is an improvement. +1. 
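The two behaviours discussed in this thread can be put side by side in a short sketch. It assumes a Python 3 style byte stream; the LimitedInput class is a hypothetical stand-in for what a server might do, not the actual Zope or Twisted code. The application side always passes an explicit size taken from CONTENT_LENGTH, as the proposed wording requires, while the server side caps reads so that an argument-less read() stops at the content length instead of blocking.

```python
import io


class LimitedInput:
    """Wrap an input stream so reads never go past CONTENT_LENGTH."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def read(self, size=-1):
        # Treat a missing, negative, or oversized request as "the rest of the body".
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


# The bytes after the 7-byte body stand in for data that belongs to a later request.
raw = io.BytesIO(b"a=1&b=2NEXT-REQUEST")
environ = {"CONTENT_LENGTH": "7", "wsgi.input": raw}
length = int(environ.get("CONTENT_LENGTH") or 0)

# Application side: always pass an explicit size.
body = environ["wsgi.input"].read(length)

# Server side: a capped stream makes an argument-less read() stop at the limit.
capped = LimitedInput(io.BytesIO(b"a=1&b=2NEXT-REQUEST"), 7)
first = capped.read()
second = capped.read()
```

An application written against the stricter wording works on both kinds of server; one that relies on read() without a size only works where the server caps the stream.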
Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Dec 21 18:41:39 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 21 Dec 2005 12:41:39 -0500 Subject: [Web-SIG] Questions/suggestions on 'wsgi.file_wrapper' In-Reply-To: <43A981E1.4090609@zope.com> Message-ID: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> At 11:25 AM 12/21/2005 -0500, Jim Fulton wrote: >Here are some questions and sugesstions on the 'wsgi.file_wrapper' >part of the WSGI API: > >1. Does this need to be optional? It seems that it would be > easy for any server to provide this, it would be nice for > applications to be able to rely in it. It's intentionally optional because its presence signifies that the server can do things *better* than the application, if and only if the object is a "real" operating system file or other "special" object. The only reason the spec requires only a "file-like" object rather than an object with a valid "fileno()" method, is because somebody wanted to support Jython objects wrapping Java sio(?) objects, for a Java equivalent of sendfile(). >2. If the file-like object passed has a close method, wouldn't > it be acceptable for the iterator returned by wsgi.file_wrapper > to close it when iteration is done? > > I would slightly prefer: > > "It may have a close() method, and if so, the iterable returned by > wsgi.file_wrapper must have a close() method that invokes the original > file-like object's close() method, or the iterable must close the file > when the file-like object's read method returns no data." > > I prefer this because it allows a simple generator implementation of > a default wsgi.file_wrapper. I'm sorry, I don't understand what you're asking for here. I think maybe you have a misunderstanding about why the spec is arranged the way it is here. 
It is intended to ensure that any middleware between the server and the application will be able to treat the wrapper as a valid WSGI application return value. The server is allowed to strip off the wrapper, if that's in fact what it receives. But the wrapper has to be a 100% valid WSGI return value, or middleware will get confused. The server must also only do special handling *if* it receives the wrapper as a return value; it can't assume that just because you called file_wrapper() that it is going to use that handler. If I understand your suggestion correctly, you're asking to change that in a way that disallows early closing, and I don't think that should be allowed. If the file has a close(), any middleware involved needs to be allowed to call it. >3. The server should be allowed to use the file wrapper in a different > thread than the one used to run the application. This should be noted. > Applications should not return file-like objects that rely on running > in the same thread. This too should be noted. This seems reasonable to me. For the actual use cases file_wrapper was intended to support (sendfile() and the Java equivalent thereof) this should be no problem at all. From jim at zope.com Wed Dec 21 19:06:50 2005 From: jim at zope.com (Jim Fulton) Date: Wed, 21 Dec 2005 13:06:50 -0500 Subject: [Web-SIG] Questions/suggestions on 'wsgi.file_wrapper' In-Reply-To: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> References: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> Message-ID: <43A999BA.9040207@zope.com> Phillip J. Eby wrote: > At 11:25 AM 12/21/2005 -0500, Jim Fulton wrote: > >> Here are some questions and sugesstions on the 'wsgi.file_wrapper' >> part of the WSGI API: >> >> 1. Does this need to be optional? It seems that it would be >> easy for any server to provide this, it would be nice for >> applications to be able to rely in it. 
> > > It's intentionally optional because its presence signifies that the > server can do things *better* than the application, if and only if the > object is a "real" operating system file or other "special" object. The > only reason the spec requires only a "file-like" object rather than an > object with a valid "fileno()" method, is because somebody wanted to > support Jython objects wrapping Java sio(?) objects, for a Java > equivalent of sendfile(). I guess I'm puzzled how the server can fail to do at least as well as the application. Can you think of a case where an application wants to output a file and can do better than a simple fallback iterator provided by the server? > >> 2. If the file-like object passed has a close method, wouldn't ... > If I understand your suggestion correctly, you're asking to change that > in a way that disallows early closing, and I don't think that should be > allowed. Ah! I see. Good point. OK, I withdraw my suggestion. ... >> 3. The server should be allowed to use the file wrapper in a different >> thread than the one used to run the application. This should be >> noted. >> Applications should not return file-like objects that rely on running >> in the same thread. This too should be noted. > > > This seems reasonable to me. For the actual use cases file_wrapper was > intended to support (sendfile() and the Java equivalent thereof) this > should be no problem at all. Cool. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Dec 21 19:31:05 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed, 21 Dec 2005 13:31:05 -0500 Subject: [Web-SIG] Questions/suggestions on 'wsgi.file_wrapper' In-Reply-To: <43A999BA.9040207@zope.com> References: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051221132221.02102060@mail.telecommunity.com> At 01:06 PM 12/21/2005 -0500, Jim Fulton wrote: >Phillip J. Eby wrote: >>At 11:25 AM 12/21/2005 -0500, Jim Fulton wrote: >> >>>Here are some questions and sugesstions on the 'wsgi.file_wrapper' >>>part of the WSGI API: >>> >>>1. Does this need to be optional? It seems that it would be >>> easy for any server to provide this, it would be nice for >>> applications to be able to rely in it. >> >>It's intentionally optional because its presence signifies that the >>server can do things *better* than the application, if and only if the >>object is a "real" operating system file or other "special" object. The >>only reason the spec requires only a "file-like" object rather than an >>object with a valid "fileno()" method, is because somebody wanted to >>support Jython objects wrapping Java sio(?) objects, for a Java >>equivalent of sendfile(). > >I guess I'm puzzled how the server can fail to do at least as well >as the application. Can you think of a case where an application wants to >output a file and can do better than a simple fallback iterator provided >by the server? Again, file_wrapper was created as an optional hack to allow sendfile() and java.nio.FileChannel to work. It's a little late to go back and make it required unless we want to start trying to make a WSGI 1.1 spec. At this point, it's optional because it was optional and everybody's gone and implemented servers that either do or don't comply with the existing spec. We're not really in a position to change the spec without a new spec. 
About a year ago the SIG consensus was basically, "it's done; anything from here on out has to be either a clarification of something already decided, or addition of new optional features (like an async API)". Once that was done, people have been making implementations left and right, so it's not fair to go back and make them retroactively noncompliant for not implementing an explicitly optional feature. From jim at zope.com Wed Dec 21 19:49:48 2005 From: jim at zope.com (Jim Fulton) Date: Wed, 21 Dec 2005 13:49:48 -0500 Subject: [Web-SIG] Questions/suggestions on 'wsgi.file_wrapper' In-Reply-To: <5.1.1.6.0.20051221132221.02102060@mail.telecommunity.com> References: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> <5.1.1.6.0.20051221132221.02102060@mail.telecommunity.com> Message-ID: <43A9A3CC.4090801@zope.com> Phillip J. Eby wrote: > At 01:06 PM 12/21/2005 -0500, Jim Fulton wrote: > >> Phillip J. Eby wrote: >> >>> At 11:25 AM 12/21/2005 -0500, Jim Fulton wrote: >>> >>>> Here are some questions and sugesstions on the 'wsgi.file_wrapper' >>>> part of the WSGI API: >>>> >>>> 1. Does this need to be optional? It seems that it would be >>>> easy for any server to provide this, it would be nice for >>>> applications to be able to rely in it. >>> >>> >>> It's intentionally optional because its presence signifies that the >>> server can do things *better* than the application, if and only if >>> the object is a "real" operating system file or other "special" >>> object. The only reason the spec requires only a "file-like" object >>> rather than an object with a valid "fileno()" method, is because >>> somebody wanted to support Jython objects wrapping Java sio(?) >>> objects, for a Java equivalent of sendfile(). >> >> >> I guess I'm puzzled how the server can fail to do at least as well >> as the application. 
Can you think of a case where an application >> wants to >> output a file and can do better than a simple fallback iterator provided >> by the server? > > > Again, file_wrapper was created as an optional hack to allow sendfile() > and java.nio.FileChannel to work. It's a little late to go back and > make it required unless we want to start trying to make a WSGI 1.1 spec. > > At this point, it's optional because it was optional and everybody's > gone and implemented servers that either do or don't comply with the > existing spec. We're not really in a position to change the spec > without a new spec. About a year ago the SIG consensus was basically, > "it's done; anything from here on out has to be either a clarification > of something already decided, or addition of new optional features (like > an async API)". > > Once that was done, people have been making implementations left and > right, so it's not fair to go back and make them retroactively > noncompliant for not implementing an explicitly optional feature. That's a fair point. I suggest it is something to consider in a later rev of the PEP, but I don't think it alone would justify a later rev. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Dec 21 20:17:07 2005 From: jim at zope.com (Jim Fulton) Date: Wed, 21 Dec 2005 14:17:07 -0500 Subject: [Web-SIG] Should system environment variables appear in a WSGI environ? Message-ID: <43A9AA33.1090001@zope.com> The PEP describes CGI and WSGI ("wsgi.") environment variables that must and should be included. It also describes a mechanism for the server to add server-specific environment variables. It doesn't explicitly say that the server should not include other environment variables, such as process environment variables. 
It does say that all additional variables it provides should be documented, which could be construed to mean that it shouldn't add additional variables. :) Would it be reasonable to say that a server should not include process environment variables? Zope currently exposes most of the environment it's given and I don't want to expose process environment variables. I'm wondering if I need to cleanse the environment I'm given. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From ianb at colorstudy.com Wed Dec 21 20:22:02 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 21 Dec 2005 13:22:02 -0600 Subject: [Web-SIG] WSGI: QUERY_STRING and cgi stdlib module Message-ID: <43A9AB5A.30803@colorstudy.com> I thought I'd note that in testing I noticed that if QUERY_STRING is missing the cgi module falls back on sys.argv, which is awful. WSGI says QUERY_STRING is optional, but if you pass the WSGI environment to cgi.FieldStorage you get this bug. Should QUERY_STRING just be required? It's almost always set anyway. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Wed Dec 21 20:40:12 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 21 Dec 2005 14:40:12 -0500 Subject: [Web-SIG] Should system environment variables appear in a WSGI environ? In-Reply-To: <43A9AA33.1090001@zope.com> Message-ID: <5.1.1.6.0.20051221143530.02098510@mail.telecommunity.com> At 02:17 PM 12/21/2005 -0500, Jim Fulton wrote: >The PEP describes CGI and WSGI ("wsgi.") environment variables that must >and should be included. It also describes a mechanism for the server to >add server-specific environment variables. It doesn't explicitly say >that the server should not include other environment variables, such as >process environment variables.
It does say that all additional variables >it provides should be documented, which could be construed to mean that >it shouldn't add additional variables. :) The intent was to say that if you provide additional CGI-like variables (like HTTPS=on and SSL_PROTOCOL), you should document them. >Would it be reasonable to say that a server should not include process >environment variables? No; the spec explicitly says the server can, and strongly implies they should as a way to allow configuration of applications that expect to use their environment as configuration. See: http://www.python.org/peps/pep-0333.html#application-configuration """Servers and gateways should support this by allowing an application's deployer to specify name-value pairs to be placed in environ. In the simplest case, this support can consist merely of copying all operating system-supplied environment variables from os.environ into the environ.""" So, if you cleanse the process environment, you should provide an alternative way for application deployers to put name-value pairs into the environ. From mso at oz.net Thu Dec 22 04:32:28 2005 From: mso at oz.net (Mike Orr) Date: Wed, 21 Dec 2005 19:32:28 -0800 Subject: [Web-SIG] Questions/suggestions on 'wsgi.file_wrapper' In-Reply-To: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> References: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> Message-ID: <43AA1E4C.6050601@oz.net> Phillip J. Eby wrote: >At 11:25 AM 12/21/2005 -0500, Jim Fulton wrote: > > > >>Here are some questions and sugesstions on the 'wsgi.file_wrapper' >>part of the WSGI API: >> >>1. Does this need to be optional? It seems that it would be >> easy for any server to provide this, it would be nice for >> applications to be able to rely in it. >> >> > >It's intentionally optional because its presence signifies that the server >can do things *better* than the application, if and only if the object is a >"real" operating system file or other "special" object. 
The only reason >the spec requires only a "file-like" object rather than an object with a >valid "fileno()" method, is because somebody wanted to support Jython >objects wrapping Java sio(?) objects, for a Java equivalent of sendfile(). > > Allowing a file-like object like StringIO also allows the environment to be pickled and sent to another process. This lets a Python web server talk directly to a Python application server using WSGI, rather than having to kludge through SCGI and then repackage it to WSGI. I don't know of any web servers that do this yet but it would be a shame to lose the capability. If we require a file object, the environment becomes non-pickleable because you can't serialize an open file. SCGI uses passfd, which somehow works, but not on Windows. If we require .fileno(), one could have an object that quickly writes the content to a file and passes that fileno, but I don't see what that gains. -- Mike Orr From pje at telecommunity.com Thu Dec 22 04:53:10 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 21 Dec 2005 22:53:10 -0500 Subject: [Web-SIG] Questions/suggestions on 'wsgi.file_wrapper' In-Reply-To: <43AA1E4C.6050601@oz.net> References: <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> <5.1.1.6.0.20051221122927.0278a5b8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051221225120.020ed5b8@mail.telecommunity.com> At 07:32 PM 12/21/2005 -0800, Mike Orr wrote: >Phillip J. Eby wrote: > > >At 11:25 AM 12/21/2005 -0500, Jim Fulton wrote: > > > > > > > >>Here are some questions and sugesstions on the 'wsgi.file_wrapper' > >>part of the WSGI API: > >> > >>1. Does this need to be optional? It seems that it would be > >> easy for any server to provide this, it would be nice for > >> applications to be able to rely in it. 
> >> > >> > > > >It's intentionally optional because its presence signifies that the server > >can do things *better* than the application, if and only if the object is a > >"real" operating system file or other "special" object. The only reason > >the spec requires only a "file-like" object rather than an object with a > >valid "fileno()" method, is because somebody wanted to support Jython > >objects wrapping Java sio(?) objects, for a Java equivalent of sendfile(). > > > > > >Allowing a file-like object like StringIO also allows the environment to >be pickled and sent to another process. This lets a Python web server >talk directly to a Python application server using WSGI, rather than >having to kludge through SCGI and then repackage it to WSGI. I don't >know of any web servers that do this yet but it would be a shame to lose >the capability. > >If we require a file object, the environment becomes non-pickleable >because you can't serialize an open file. SCGI uses passfd, which >somehow works, but not on Windows. If we require .fileno(), one could >have an object that quickly writes the content to a file and passes that >fileno, but I don't see what that gains. I think perhaps you've confused the 'file_wrapper' API with the file-like objects in the environment. The discussion above is about 'file_wrapper' objects *returned* by the application, not the input/stderr objects in the environment. From kai.keliikuli at gmail.com Thu Dec 22 18:24:59 2005 From: kai.keliikuli at gmail.com (kai) Date: Thu, 22 Dec 2005 12:24:59 -0500 Subject: [Web-SIG] transaction progress with cgi.FieldStorage Message-ID: <43AAE16B.9040006@gmail.com> Hi All, this is my first post on this list. I am working on a way to monitor the progress of reading a file upload from wsgi.input. I can currently monitor the overall transfer and when individual files of a multiple file upload are completed. 
The ultimate goal of this is to be able to display a progress meter when someone is uploading a file. To do this I subclassed cgi.FieldStorage but when I finished I had modified most of the non-trivial methods just to hook in something to monitor the transfer progress, oops. Has anyone else found FieldStorage insufficient for certain tasks? Is there a general need for a more flexible FieldStorage replacement? kai keliikuli From tsoehnli at gmu.edu Sat Dec 24 18:53:16 2005 From: tsoehnli at gmu.edu (tsoehnli@gmu.edu) Date: Sat, 24 Dec 2005 12:53:16 -0500 Subject: [Web-SIG] cgi.fieldstorage In-Reply-To: References: Message-ID: I found the cgi library to be too bulky for cgi actually. Its load time was enough to double the processing time of my scripts. I changed that though, by hand recoding most of everything, and removed certain things, like regular expressions, from the process, and replaced it with urllib's fast quote and unquote. Right now to process file uploads is quite simple, though I have even seen some more modularized versions. If you would like a copy of mine, I would be more than happy to email it to you, but yeah, the standard cgi lib is not all that great, and performance is weak. From cce at clarkevans.com Sun Dec 25 04:45:34 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Sat, 24 Dec 2005 22:45:34 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? Message-ID: <20051225034534.GA88508@prometheusresearch.com> Why is response_headers a list instead of a dict? From RFC 2616 Section 4.2: The order in which header fields with differing field names are received is not significant. However, it is "good practice" to send general-header fields first, followed by request-header or response-header fields, and ending with the entity-header fields.
Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. In other words: (a) order does not matter, (b) it is reasonable to restrict a header field to a single (header_name, header_value) pair. Indeed, according to the specification, a HTTP Proxy could re-arrange headers and condense N headers of the same type by simply concatenating their values with a comma. I'm asking this because it is quite painful (and very much an unnecessary pain) to work with headers in complex WSGI-based middleware applications. Kind Regards, Clark From foom at fuhm.net Sun Dec 25 05:48:39 2005 From: foom at fuhm.net (James Y Knight) Date: Sat, 24 Dec 2005 23:48:39 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <20051225034534.GA88508@prometheusresearch.com> References: <20051225034534.GA88508@prometheusresearch.com> Message-ID: <3BFED61A-97DE-48EB-BC10-152861D988B5@fuhm.net> On Dec 24, 2005, at 10:45 PM, Clark C. Evans wrote: > Why is response_headers a list instead of a dict? > > [ RFC quote ] > > In other words: (a) order does not matter True, order between headers does not matter. > (b) it is reasonable to > restrict a header field to a single (header_name, header_value) pair. Yes, the RFC says that, and I certainly wish it were true, but it's simply not. The RFC lies. The primary example is the Set-Cookie header, which by _definition_ cannot be combined, as it uses an unquoted date which includes a comma. Also, multiple WWW-Authenticate headers should be okay to combine, but I've heard rumors of UAs being confused by that format. 
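The Set-Cookie problem mentioned above is easy to reproduce. The following is a small sketch with made-up cookie values: it joins two Netscape-style Set-Cookie values with a comma, the way RFC 2616 says a recipient may, and then shows that a naive split on commas no longer recovers the two original headers because each "expires" date contains a comma of its own.

```python
# Two separate Set-Cookie headers, each with an unquoted "expires" date.
headers = [
    ("Set-Cookie", "a=1; expires=Wed, 21-Dec-2005 12:00:00 GMT"),
    ("Set-Cookie", "b=2; expires=Thu, 22-Dec-2005 12:00:00 GMT"),
]

# Combine them into one field value, as RFC 2616 section 4.2 permits in general.
combined = ", ".join(value for name, value in headers if name == "Set-Cookie")

# A recipient splitting on commas now sees four fragments instead of two cookies.
pieces = [piece.strip() for piece in combined.split(",")]
```

Here pieces has four entries, and the first is the truncated "a=1; expires=Wed", which is why these headers have to be kept as separate (name, value) pairs.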
WSGI could have spec'd a dictionary of lists of strings, rather than a list of strings, but it did not. You can transform the result into that if you like... James From pje at telecommunity.com Sun Dec 25 06:13:09 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 25 Dec 2005 00:13:09 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <20051225034534.GA88508@prometheusresearch.com> Message-ID: <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> At 10:45 PM 12/24/2005 -0500, Clark C. Evans wrote: >Why is response_headers a list instead of a dict? The short answer is because of "Set-Cookie:" headers, and quoting issues with the 'expires' parameter. The slightly longer answer is that it gives the application more control of the response, which may be important to work around bugs in browsers, caches, and proxies currently deployed in the field. :( From cce at clarkevans.com Sun Dec 25 19:04:29 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Sun, 25 Dec 2005 13:04:29 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> References: <20051225034534.GA88508@prometheusresearch.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> Message-ID: <20051225180429.GA93279@prometheusresearch.com> I'm going to play the devil's advocate here; although I really love WSGI -- I think this particular decision is a wart and will greatly hinder adoption. On Sun, Dec 25, 2005 at 12:13:09AM -0500, Phillip J. Eby wrote: | At 10:45 PM 12/24/2005 -0500, Clark C. Evans wrote: | >Why is response_headers a list instead of a dict? | | The short answer is because of "Set-Cookie:" headers, and quoting issues | with the 'expires' parameter. 
The slightly longer answer is that it | gives the application more control of the response, which may be | important to work around bugs in browsers, caches, and proxies currently | deployed in the field. :( You are, of course, referring to the horribly old Netscape Specification for Set-Cookie, http://wp.netscape.com/newsref/std/cookie_spec.html. I'd like to note that RFC 2109 (1997), and RFC 2965 (2000) have no such problems. Just about every major browser out there supports the max-age parameter instead of "Expires". Doing a quick "unofficial" survey of major websites, 'max-age' usage (RFC 2109) is the most common usage, as it is far easier for server implementations to specify an age in seconds rather than compute a GMT timestamp. More control over the response is fine; but really, this should be in the domain of web-server software -- which will have many more eyes on it and has a greater chance of being correct and handling variants among browsers. For example, Twisted or the Zope community have a much better chance of making WSGI work in practice if they are given the freedom to re-arrange the Headers (splitting or joining as appropriate) to match browsers which commonly visit their site. In this particular case, you've taken control from the writers of the web-server software (who have a much greater chance of getting it right) and given it to framework/application writers -- which have a much larger chance of not reading the specifications correctly or not having enough deployment experience to cover browser quirks. On Sat, Dec 24, 2005 at 11:48:39PM -0500, James Y Knight wrote: | On Dec 24, 2005, at 10:45 PM, Clark C. Evans wrote: | >Why is response_headers a list instead of a dict? | > | >[ RFC quote ] | > | >In other words: (a) order does not matter | | True, order between headers does not matter. Yes, however, the HTTP/1.1 specification explicitly suggests that general headers come first, then request/response headers, followed by entity headers.
It also recommends that headers take a "common form" when sent by servers (that is, in Camel-Dash-Case, except ones like ETag or WWW-Authenticate). I think that server platforms should be able to implement these suggestions so that applications/frameworks don't have to be bothered with such details. | >(b) it is reasonable to | >restrict a header field to a single (header_name, header_value) pair. | | Yes, the RFC says that, and I certainly wish it were true, but it's | simply not. The RFC lies. The primary example is the Set-Cookie | header, which by _definition_ cannot be combined, as it uses an | unquoted date which includes a comma. This seems to be the only use-case for the decision. If it is that important; make it an exception. A small bit of code for 'Set-Cookie', if it is even necessary (I contend that it isn't), is an acceptable price to pay for simpler WSGI applications. | Also, multiple WWW-Authenticate | headers should be okay to combine, but I've heard rumors of UAs being | confused by that format. First, there is _nothing_ preventing a Server (such as Zope or Twisted) handling this case by splitting out comma-separated WWW-Authenticate or Set-Cookie (RFC 2109, or even the _broken_ netscape spec with a very small amount of code) into mutltiple lines. Second, is combination of needing _multiple_ WWW-Authenticate headers on that particular User-Agent a real-live use case? Frankly -- this is programming for Edge Cases; it is a 1% issue and your average Framework/Application developer won't do it, or if they do do it, it will most likely be done incorrectly. It's not like the servers we have to run WSGI apps are closed-source, non-responsive. The Twisted and Zope team (among others) are very quick at making things work. | WSGI could have spec'd a dictionary of lists of strings, rather than | a list of strings, but it did not. You can transform the result into | that if you like... Well, I agree it should have been a dictionary (with lower-case keys). 
I don't think that a list would have been helpful; 90% of the time you're dealing with something that isn't a list. And when it is a list, appending ",mystuff" to the list isn't that hard. Kind Regards, Clark From pje at telecommunity.com Sun Dec 25 20:13:00 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 25 Dec 2005 14:13:00 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <20051225180429.GA93279@prometheusresearch.com> References: <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <20051225034534.GA88508@prometheusresearch.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> At 01:04 PM 12/25/2005 -0500, Clark C. Evans wrote: >More control over the response is fine; but really, this should be >in the domain of web-server software -- which will have much more eyes >on it and has a greater chance of being correct and handling variants >among browsers. For example, Twisted or the Zope community have a much >better chance of making WSGI work in pratice if they are given the >freedom to re-arrange the Headers (splitting or joining as appropiriate) >to match browsers which commonly visit their site. > >In this particular case, you've taken control from the writers of the >web-server software (who have much greater chance of getting it right) >and given it to framework/application writers -- which have a much >larger chance of not reading the specifciations correctly or not having >enough deployment experience to cover browser quirks. WSGI puts this particular power in the application writer's hands, because then *they* can fix a problem. If it's in the server author's hands, the application writer can be screwed, whether the server is open source or not. >I think that server platforms should be able to >implement these suggestions so that applications/frameworks don't have >to be bothered with such details. 
WSGI is not designed - and is definitely not intended! - to encourage writing new web frameworks. >This seems to be the only use-case for the decision. If it is that >important; make it an exception. A small bit of code for 'Set-Cookie', >if it is even necessary (I contend that it isn't), is an acceptable >price to pay for simpler WSGI applications. No, I'm sorry, but it's not. Read the PEP again, which explains why having a nicer API for the application side was never a goal - in fact, it was an explicit *anti*-goal. Having it be ugly and primitive was both necessary and intentional. Ironically, headers are the one use case where I felt we could make an exception to the "crude is better" principle, but was argued down by others. I had originally proposed using an email.Message object to manage headers, since it had all the needed functionality (including the necessary ordering control), but others argued that it's easy enough for a framework to do that itself, and that in any case email.Message had too many distracting non-HTTP-header methods. >Frankly -- this is programming for Edge Cases; it is a 1% issue and your >average Framework/Application developer won't do it, or if they do do >it, it will most likely be done incorrectly. It's not like the servers >we have to run WSGI apps are closed-source, non-responsive. The Twisted >and Zope team (among others) are very quick at making things work. FYI, If I understand correctly, Jim Fulton has stated that Zope isn't going to *have* a server in the future, if they can avoid it. In any case, the point is moot; this isn't a compatible change to the spec, so it would have to wait for a WSGI 2.0. Note that in any case, every framework, application, or middleware is free to invent its own solution for managing headers - and most already had one before WSGI came into being. As written, the WSGI spec allows those existing applications and frameworks to produce the same output that they used to. 
Backward compatibility with field-deployed software was a key criterion for WSGI design decisions. Moving from a non-WSGI interface to WSGI should not alter an application's output unnecessarily. If you want a friendly API for WSGI header management, please see the wsgiref.headers.Headers class, which offers a dictionary-like interface to manipulate a WSGI header list. From cce at clarkevans.com Sun Dec 25 20:21:59 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Sun, 25 Dec 2005 14:21:59 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> References: <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <20051225034534.GA88508@prometheusresearch.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> Message-ID: <20051225192159.GA93982@prometheusresearch.com> Thank you for taking the time to respond, Phillip. On Sun, Dec 25, 2005 at 02:13:00PM -0500, Phillip J. Eby wrote: | WSGI puts this particular power in the application writer's hands, | because then *they* can fix a problem. If it's in the server author's | hands, the application writer can be screwed, whether the server is open | source or not. | Having it be ugly and primitive was both necessary and intentional. Ok. | In any case, the point is moot; this isn't a compatible change to the | spec, so it would have to wait for a WSGI 2.0. Right; it's quite a large change. Also, my sample set was limited to mostly sites that didn't use 'long-lasting' cookies. It seems that Microsoft's SDK still uses 'expires' in their Set-Cookie header [1], despite almost 8 years of it being explicitly removed from the RFC. | If you want a friendly API for WSGI header management, please see the | wsgiref.headers.Headers class, which offers a dictionary-like interface | to manipulate a WSGI header list. I'll have a look at it; thanks.
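For reference, a short sketch of the wsgiref.headers.Headers class mentioned above, as it behaves in the version that ships with current Python. The header names and values are made up for illustration.

```python
from wsgiref.headers import Headers

# A plain WSGI header list, exactly as start_response() expects it.
response_headers = [("Content-Type", "text/plain")]

# Headers wraps the list; changes are written through to response_headers.
headers = Headers(response_headers)

# Dictionary-style assignment replaces any existing header of that name.
headers["Content-Type"] = "text/html; charset=utf-8"

# add_header() appends, so repeated headers such as Set-Cookie stay separate.
headers.add_header("Set-Cookie", "a=1")
headers.add_header("Set-Cookie", "b=2")

# Keyword arguments become quoted header parameters.
headers.add_header("Content-Disposition", "attachment", filename="report.txt")

all_cookies = headers.get_all("Set-Cookie")
```

The application still hands the plain `response_headers` list to `start_response()`, so the server never sees anything but a list of tuples.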
Best, Clark [1] http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wininet/wininet/http_cookies.asp From cce at clarkevans.com Sun Dec 25 20:51:23 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Sun, 25 Dec 2005 14:51:23 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> References: <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <20051225034534.GA88508@prometheusresearch.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> Message-ID: <20051225195123.GA95491@prometheusresearch.com> On Sun, Dec 25, 2005 at 02:13:00PM -0500, Phillip J. Eby wrote: | In any case, the point is moot; this isn't a compatible change to the | spec, so it would have to wait for a WSGI 2.0. In paragraph #3 of the "start_response()" definition, it states that type(response_headers) is ListType. I'm wondering if you'd be willing to modify this to isinstance(response_headers, list)? A similar assertion is not made about the `environ` parameter, only that it is a 'dictionary'. Could a server or middleware provide a special environment handler object (as long as isinstance(environ, dict))? The idea is that these two objects could be customized to provide low-level RFC support and helper methods, yet still be 'list of tuples' and 'dictionary' as required by the WSGI specification. For example: (a) the specialized `environ` could provide attributes which get common HTTP_HEADERs; or raise an error if they do not exist -- this would prevent spelling mistakes. (b) the specialized `headers` could override the list[selector] to take a string argument, doing a lookup and replacement; it could also do HTTP Header checking, etc.
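A rough sketch of what item (b) might look like. The class name and its behaviour are hypothetical and exist only to show the difference between the two type checks; since the spec states that type(response_headers) is ListType, a server applying that check would not accept such an object as written.

```python
class ResponseHeaders(list):
    """Hypothetical list subclass: still a list of (name, value) tuples."""

    def __getitem__(self, key):
        # A string selector looks a header up by name, case-insensitively.
        if isinstance(key, str):
            matches = [v for n, v in self if n.lower() == key.lower()]
            if not matches:
                raise KeyError(key)
            return matches[-1]
        # Integers and slices behave like an ordinary list.
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        # A string selector replaces every header with that name.
        if isinstance(key, str):
            self[:] = [(n, v) for n, v in self if n.lower() != key.lower()]
            self.append((key, value))
        else:
            super().__setitem__(key, value)


headers = ResponseHeaders([("Content-Type", "text/plain")])
headers["Content-Type"] = "text/html"

passes_isinstance = isinstance(headers, list)  # the check being proposed
passes_exact_type = type(headers) is list      # the check in the spec
```

Code that only iterates or indexes by position sees an ordinary list of tuples; only the exact-type check can tell the two apart.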
Of course, the goal of these objects would be to present the _normal_ dict and list interfaces so that intermediate WSGI applications that didn't know about the specialization would remain unaffected. With Python 2.2's __new__ method, this could be done transparently at each level, where the intermediate object "adorns" the underlying native representation.

    def my_start_response(status, response_headers):
        response_headers = ResponseHeaders(response_headers)
        response_headers['My-Header'] = 'some-value'
        response_headers.set_content_disposition(filename="bing", inline=True)
        ...

The ResponseHeaders class in this case would derive from 'list', and be a valid WSGI list-of-tuples; for those that know it is a ResponseHeaders, however, they can use the goodness and type-checking provided. The implementation of the ResponseHeaders() constructor is simple; if the object is already a ResponseHeaders, it returns self -- otherwise, it constructs the wrapper as needed. Kind Regards, Clark From pje at telecommunity.com Mon Dec 26 01:26:58 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 25 Dec 2005 19:26:58 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <20051225195123.GA95491@prometheusresearch.com> References: <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <20051225034534.GA88508@prometheusresearch.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051225190848.0220b018@mail.telecommunity.com> At 02:51 PM 12/25/2005 -0500, Clark C. Evans wrote: >On Sun, Dec 25, 2005 at 02:13:00PM -0500, Phillip J. Eby wrote: >| In any case, the point is moot; this isn't a compatible change to the >| spec, so it would have to wait for a WSGI 2.0. > >In paragraph #3 of the "start_response()" definition, it states that >type(response_headers) is ListType.
I'm wondering if you'd be willing >to modify this to isinstance(response_headers, list)? No. :) See below. >A similar assertion is not made about `environ` parameters, only that >it is a 'dictionary'. From http://www.python.org/peps/pep-0333.html#specification-details : """This object must be a builtin Python dictionary (not a subclass, UserDict or other dictionary emulation),...""" > Could a server or middleware provide a special >environment handler object (as long as isinstance(environ, dict))? No; this is explicitly forbidden. See also Q&A item #1, under: http://www.python.org/peps/pep-0333.html#questions-and-answers A different argument applies to the headers list, but it's even worse in the headers case. There is essentially zero probability that a server is going to be able to make use of any auxiliary methods of a headers object, and it would be crazy for the server to try and introspect to see which of the dozens of possible header extensions *might* exist. The simple solution for code which wants a higher-level interface to either environ or headers is to wrap the raw data structures in its own enhancements - such as a request and response object. This is what maybe 99% of existing applications and frameworks do, so there was no sense in duplicating this in WSGI. Meanwhile, optional features and flexibility are things to be *avoided* in a low-level protocol like this, if at all possible. From foom at fuhm.net Mon Dec 26 17:59:16 2005 From: foom at fuhm.net (James Y Knight) Date: Mon, 26 Dec 2005 11:59:16 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? 
In-Reply-To: <20051225192159.GA93982@prometheusresearch.com> References: <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <20051225034534.GA88508@prometheusresearch.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> <20051225192159.GA93982@prometheusresearch.com> Message-ID: <77DD120D-CC63-4FFA-BC68-01EF57F162EB@fuhm.net> On Dec 25, 2005, at 2:21 PM, Clark C. Evans wrote: > It seems that > Microsoft's SDK still uses 'expires' in their Set-Cookie header [1], > despite almost 8 years of it being expliclty removed from the RFC. Sorry, but, despite the RFC writers best efforts, the newer RFCs are almost universally ignored by servers, frameworks, and browsers. When I looked a few months ago, even mozilla did not support the new cookie RFC. I think Opera is the only browser that does. Netscape cookies are unfortunately still the de facto standard. James From cce at clarkevans.com Tue Dec 27 21:38:21 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Tue, 27 Dec 2005 15:38:21 -0500 Subject: [Web-SIG] Why is response_headers a list instead of a dict? In-Reply-To: <5.1.1.6.0.20051225190848.0220b018@mail.telecommunity.com> References: <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <20051225034534.GA88508@prometheusresearch.com> <5.1.1.6.0.20051224233526.03d0cf60@mail.telecommunity.com> <5.1.1.6.0.20051225135446.0220b2b8@mail.telecommunity.com> <5.1.1.6.0.20051225190848.0220b018@mail.telecommunity.com> Message-ID: <20051227203821.GA28430@prometheusresearch.com> Phillip, Thank you for humoring the discussion (I realize it was covered in the PEP). I've since found a solution which covers my requirements of making header access easier in ``environ`` and ``response_headers`` yet keeping to the spirt of WSGI (but I'll let you be the final judge). 
It involves turning header "definitions" into objects: http://svn.w4py.org/Paste/trunk/paste/httpheaders.py http://svn.w4py.org/Paste/trunk/tests/test_httpheaders.py Anyway, the final result is actually much better than I expected, it is far more modular/extendable than the extensions/wrappers I had started to implement earlier. So, I must thank you for sticking to your policy; despite my complaints earlier, it seems to be a very wise choice. Kind Regars, Clark P.S. The work above is usable; but incomplete in a few minor ways. It will soon be getting concrete (rather than generic) implementations for the more complicated HTTP headers that I work with: Content-Disposition, Cache-Control, Set-Cookie, etc. Suggestions, of course, are very welcome. At this time the module has no dependencies, and so far the rest of Paste does not depend upon it, however, if Ian agrees, much of paste could be re-configured to use this module (especially the fileapp.py module which is one of the motivators). From foom at fuhm.net Wed Dec 28 16:34:51 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 28 Dec 2005 10:34:51 -0500 Subject: [Web-SIG] Is the size argument to the input-stream read method optional? In-Reply-To: <43A972CC.9090204@zope.com> References: <43A1BCE9.8020403@zope.com> <43A1CB1D.7000900@colorstudy.com> <43A972CC.9090204@zope.com> Message-ID: <74365E33-DC78-461D-A880-6B4580548C22@fuhm.net> On Dec 21, 2005, at 10:20 AM, Jim Fulton wrote: > Ian Bicking wrote: > >> Jim Fulton wrote: >> >> >>> The PEP is unclear on this and should be clarified, IMO. >>> >> >> >> My experience in using implementations is many servers do not require >> the read size argument (they don't give a TypeError), but they block >> without it, or if you read past CONTENT_LENGTH. So it should >> probably >> be required in the spec, since it's required in practice. >> > > Does this constitude a decision? Can somebody update the PEP? > I am able and willing to if requested to. 
:) Surely that's a bug in the server, not the spec? Indeterminate length uploads (with transfer-encoding chunked) are allowed by HTTP, after all. The CGI spec explicitly rejects such requests, but WSGI doesn't seem to. James From ianb at colorstudy.com Wed Dec 28 19:14:25 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 28 Dec 2005 12:14:25 -0600 Subject: [Web-SIG] Is the size argument to the input-stream read method optional? In-Reply-To: <74365E33-DC78-461D-A880-6B4580548C22@fuhm.net> References: <43A1BCE9.8020403@zope.com> <43A1CB1D.7000900@colorstudy.com> <43A972CC.9090204@zope.com> <74365E33-DC78-461D-A880-6B4580548C22@fuhm.net> Message-ID: <43B2D601.30007@colorstudy.com> James Y Knight wrote: >>>> The PEP is unclear on this and should be clarified, IMO. >>>> >>> >>> >>> My experience in using implementations is many servers do not require >>> the read size argument (they don't give a TypeError), but they block >>> without it, or if you read past CONTENT_LENGTH. So it should probably >>> be required in the spec, since it's required in practice. >>> >> >> Does this constitude a decision? Can somebody update the PEP? >> I am able and willing to if requested to. :) > > > Surely that's a bug in the server, not the spec? Indeterminate length > uploads (with transfer-encoding chunked) are allowed by HTTP, after > all. The CGI spec explicitly rejects such requests, but WSGI doesn't > seem to. But while it is possible, if an application uses this then it won't be portable, right? I think chunking has been explicitly excluded from WSGI too, as something that should be handled/isolated in the server. Not that I really know much about chunking, except that it was discussed at one point. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From foom at fuhm.net Wed Dec 28 19:36:48 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 28 Dec 2005 13:36:48 -0500 Subject: [Web-SIG] Is the size argument to the input-stream read method optional? 
In-Reply-To: <43B2D601.30007@colorstudy.com> References: <43A1BCE9.8020403@zope.com> <43A1CB1D.7000900@colorstudy.com> <43A972CC.9090204@zope.com> <74365E33-DC78-461D-A880-6B4580548C22@fuhm.net> <43B2D601.30007@colorstudy.com> Message-ID: On Dec 28, 2005, at 1:14 PM, Ian Bicking wrote: >> Surely that's a bug in the server, not the spec? Indeterminate >> length uploads (with transfer-encoding chunked) are allowed by >> HTTP, after all. The CGI spec explicitly rejects such requests, >> but WSGI doesn't seem to. >> > > But while it is possible, if an application uses this then it won't > be portable, right? I think chunking has been explicitly excluded > from WSGI too, as something that should be handled/isolated in the > server. Not that I really know much about chunking, except that it > was discussed at one point. The server handles the unchunking, but the unchunked stream it passes to the client has no content-length. The only way to indicate when the stream is done is via EOF. It doesn't seem a good idea to me for the WSGI spec to disallow chunked uploads. The reason it's disallowed in the CGI spec is that it was added to HTTP after CGI was defined. There's no similar excuse for WSGI. However, I see that in the spec, indeterminate length uploads have already been disallowed implicitly, by not requiring the server to return EOF from reads at the end of the stream: "The server is not required to read past the client's specified Content-Length, and is allowed to simulate an end-of-file condition if the application attempts to read past that point. The application SHOULD NOT attempt to read more data than is specified by the CONTENT_LENGTH variable." If the client cannot depend on an EOF at the end of the stream, it cannot read a stream without a length. I'd much rather it say something like: "The server MUST NOT read past the end of the request, and MUST simulate an end-of-file condition if the application attempts to read past that point. 
Attempting to read from an input stream when no data has been provided MUST result in an end-of-file result (the empty string)." but it doesn't. At least the spec does allow the server to implement read correctly. James From cce at clarkevans.com Thu Dec 29 16:44:08 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Thu, 29 Dec 2005 10:44:08 -0500 Subject: [Web-SIG] WSGI and Content-Type Message-ID: <20051229154408.GA20693@prometheusresearch.com> I'm puzzled why CONTENT_TYPE/CONTENT_LENGTH is listed as an ``environ`` CGI variable when it seems the corresponding HTTP_CONTENT_TYPE/HTTP_CONTENT_LENGTH would work. Is there a reason for this redundancy? Which one should I use? If they differ, which one is correct? Kind Regards, Clark From ianb at colorstudy.com Thu Dec 29 17:27:14 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 29 Dec 2005 10:27:14 -0600 Subject: [Web-SIG] WSGI and Content-Type In-Reply-To: <20051229154408.GA20693@prometheusresearch.com> References: <20051229154408.GA20693@prometheusresearch.com> Message-ID: <43B40E62.90209@colorstudy.com> Clark C. Evans wrote: > I'm puzzled why CONTENT_TYPE/CONTENT_LENGTH is listed as an ``environ`` > CGI variable when it seems the corresponding > HTTP_CONTENT_TYPE/HTTP_CONTENT_LENGTH would work. Is there a reason for > this redundancy? Which one should I use? If they differ, which one is > correct? Probably HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH shouldn't be in there, and they should be ignored if they are in there. CGI translates all headers by adding HTTP_, except for these two particular headers. WSGI is just following CGI on this one.
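A small sketch of the naming rule described above, plus a body read bounded by CONTENT_LENGTH as the spec recommends. The function names `environ_key` and `read_body` are made up for this illustration and are not part of WSGI or any library.

```python
import io


def environ_key(header_name):
    """Map a request header name to its CGI/WSGI environ key."""
    key = header_name.upper().replace("-", "_")
    # The two exceptions: these headers get no HTTP_ prefix.
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key


def read_body(environ):
    """Read at most CONTENT_LENGTH bytes from wsgi.input."""
    # A missing or empty CONTENT_LENGTH is treated as "no body" here.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return environ["wsgi.input"].read(length)


# A hand-built environ for demonstration; a real server supplies this.
environ = {
    environ_key("Content-Type"): "text/plain",
    environ_key("Content-Length"): "5",
    environ_key("User-Agent"): "example",
    "wsgi.input": io.BytesIO(b"hello, trailing bytes the app should not read"),
}
body = read_body(environ)
```

Passing an explicit size to `read()` also avoids the blocking behaviour reported earlier in the thread for servers that do not simulate end-of-file.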
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Thu Dec 29 17:38:27 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 29 Dec 2005 10:38:27 -0600 Subject: [Web-SIG] transaction progress with cgi.FieldStorage In-Reply-To: <43AAE16B.9040006@gmail.com> References: <43AAE16B.9040006@gmail.com> Message-ID: <43B41103.1040308@colorstudy.com> kai wrote: > Hi All, > this is my first post on this list. I am working on a way to monitor the > progress of reading a file upload from wsgi.input. I can currently > monitor the overall transfer and when individual files of a multiple > file upload are completed. The ultimate goal of this is to be able to > display a progress meter when someone is uploading a file. > > To do this I subclassed cgi.FieldStorage but when I finished I had > modified most of the non-trivial methods just to hook in something to > monitor the transfer progress, oops. > > Has anyone else found FieldStorage insufficient for certain tasks? > Is there a general need for a more flexible FieldStorage replacement? Incidentally, one way I've considered implementing this is to simply write the entire request body to a file, and parse it later, probably in the context of whatever framework I'm using (but typical web frameworks don't actually deal well with tracking an upload, hence a custom WSGI application). -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Thu Dec 29 17:51:52 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 29 Dec 2005 10:51:52 -0600 Subject: [Web-SIG] WSGI and Content-Type In-Reply-To: <20051229154408.GA20693@prometheusresearch.com> References: <20051229154408.GA20693@prometheusresearch.com> Message-ID: <43B41428.3080100@colorstudy.com> Clark C. 
Evans wrote: > I'm puzzled why CONTENT_TYPE/CONTENT_LENGTH is listed as an ``environ`` > CGI variable when it seems the corresponding corresponding > HTTP_CONTENT_TYPE/HTTP_CONTENT_LENGTH would work. Is there a reason for > this redundancy? Which one should I use? If they differ, which one is > correct? Incidentally, I've added a check for QUERY_STRING (missing QUERY_STRING causes buggy cgi module behavior, per my previous email) and HTTP_CONTENT_TYPE/LENGTH to paste.lint (it's an error now, but maybe it should be a warning). I'd encourage people to use it to check server and application behavior (it just watches things go by, it doesn't effect the request). -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From cce at clarkevans.com Fri Dec 30 00:31:26 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Thu, 29 Dec 2005 18:31:26 -0500 Subject: [Web-SIG] transaction progress with cgi.FieldStorage In-Reply-To: <43AAE16B.9040006@gmail.com> References: <43AAE16B.9040006@gmail.com> Message-ID: <20051229233126.GA24311@prometheusresearch.com> On Thu, Dec 22, 2005 at 12:24:59PM -0500, kai wrote: | this is my first post on this list. I am working on a way to monitor the | progress of reading a file upload from wsgi.input. I can currently | monitor the overall transfer and when individual files of a multiple | file upload are completed. The ultimate goal of this is to be able to | display a progress meter when someone is uploading a file. 
You could do this in a few stages: #1 Use an async XMLHttpRequest on the client side to POST the file to your file upload servlet; in the URL for the post use a unique identifier, say MY-ID #2 Override make_file /w your own that monitors how much of the file's content has been sent; store that in a global mapping using MY-ID as the key #3 Create a monitor URL on your server that reads the mapping and returns an hour glass or something /w a refresh page #4 When you send your application request; open up an iframe /w refresh setting to that monitor URL (using MY-ID) Although, you've probably already done something similar... | To do this I subclassed cgi.FieldStorage but when I finished I had | modified most of the non-trivial methods just to hook in something to | monitor the transfer progress, oops. | | Has anyone else found FieldStorage insufficient for certain tasks? | Is there a general need for a more flexible FieldStorage replacement? I've found make_file sufficient for all of my needs (so far) Best, Clark From janssen at parc.com Fri Dec 30 20:37:29 2005 From: janssen at parc.com (Bill Janssen) Date: Fri, 30 Dec 2005 11:37:29 PST Subject: [Web-SIG] WSGI for Medusa? Message-ID: <05Dec30.113733pst."58633"@synergy1.parc.xerox.com> If no one has done a WSGI implementation for Medusa, I think I'll take a shot at it this weekend... 
Bill From kai.keliikuli at gmail.com Sat Dec 31 02:56:16 2005 From: kai.keliikuli at gmail.com (kai) Date: Fri, 30 Dec 2005 20:56:16 -0500 Subject: [Web-SIG] transaction progress with cgi.FieldStorage In-Reply-To: <43B41103.1040308@colorstudy.com> References: <43AAE16B.9040006@gmail.com> <43B41103.1040308@colorstudy.com> Message-ID: <43B5E540.8030709@gmail.com> > Incidentally, one way I've considered implementing this is to simply > write the entire request body to a file, and parse it later, probably in > the context of whatever framework I'm using (but typical web frameworks > don't actually deal well with tracking an upload, hence a custom WSGI > application). I put aside my rewrite of FieldStorage and went this route. I'm working on this using lighttpd and the flup wsgi implementation. When I do an upload though I'm seeing a delay before I start getting a progress read it seems like all the data is getting to the server and only then is environ['wsgi.input'] available. I'm looking at this just using a print statement in the loop I use to read in data. So when I upload a 10 MB file. It sits for about 2.5 minutes then bursts the progress read all at once in under a second. I need to investigate more may very well be me doing something silly. An aside on cgi.FieldStorage itself. It reads data using readline instead of reading in blocks of limited size. doing this I think means a file with very long lines, 20MB, 100MB, ... could cause excessive memory consumption. Kai From chad at zetaweb.com Sat Dec 31 06:21:29 2005 From: chad at zetaweb.com (Chad Whitacre) Date: Sat, 31 Dec 2005 00:21:29 -0500 Subject: [Web-SIG] transaction progress with cgi.FieldStorage In-Reply-To: <43B5E540.8030709@gmail.com> References: <43AAE16B.9040006@gmail.com> <43B41103.1040308@colorstudy.com> <43B5E540.8030709@gmail.com> Message-ID: <43B61559.3050304@zetaweb.com> > I need to investigate more may very well be me > doing something silly. Are your prints buffered? 
sys.stdout.flush() chad From chrism at plope.com Sat Dec 31 06:50:48 2005 From: chrism at plope.com (Chris McDonough) Date: Sat, 31 Dec 2005 00:50:48 -0500 Subject: [Web-SIG] transaction progress with cgi.FieldStorage In-Reply-To: <43B5E540.8030709@gmail.com> References: <43AAE16B.9040006@gmail.com> <43B41103.1040308@colorstudy.com> <43B5E540.8030709@gmail.com> Message-ID: > An aside on cgi.FieldStorage itself. It reads data using readline > instead of reading in blocks of limited size. doing this I think means > a file with very long lines, 20MB, 100MB, ... could cause excessive > memory consumption. This was reported and solved a long time ago (but not yet fixed in any Python distro): https://sourceforge.net/tracker/?func=detail&aid=1112549&group_id=5470&atid=105470 From mal at egenix.com Mon Dec 19 00:57:13 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 18 Dec 2005 23:57:13 -0000 Subject: [Web-SIG] [DB-SIG] WSGI thread affinity/interleaving In-Reply-To: <43A5B971.1010408@colorstudy.com> References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> <43A5B971.1010408@colorstudy.com> Message-ID: <43A5F757.2030906@egenix.com> Ian Bicking wrote: > James Y Knight wrote: >> I'm worried about database access. Most DBAPI adapters have >> threadsafety level 2: "Threads may share the module and >> connections.". So with those, at least, it should be fine to move a >> connection between threads, since "share OK" implies "move OK". What exactly do you mean with "move" ? Sharing a connection refers to multiple threads creating cursors on this connection. >> However, no documentation I've found has said anything separately >> about whether it's safe to _move_ a cursor between threads. It seems >> likely to me that it would not be safe, at least in some database >> adapters. Thread level 3 adapters would allow for sharing a cursor meaning that you can call cursor.execute() from different threads.
Given that you usually already have to be careful with sharing
connections, sharing cursors is rather unlikely to work in a
general setting.

>> And if it's not safe, that means a WSGI result iterator
>> cannot use any DBAPI cursor functionality which seems a drag.
>>
>> Does anybody have practical experience with the safety of moving a
>> DBAPI cursor between threads?
>
> I haven't done that, but SQLite (2?) notably doesn't allow you to move a
> connection between threads. I'm not actually sure what problems it
> causes if you do move them -- it may simply be an overzealous warning.
>
> CCing DB-SIG -- people there might know more details.

Sharing cursors is possible with some database drivers and
can be used to e.g. pool cursors with prepared commands.
mxODBC does support this if the ODBC driver is thread-safe
(which it should be if it adheres to the ODBC standard).

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Dec 19 2005)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From mal at egenix.com Mon Dec 19 15:17:51 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 19 Dec 2005 14:17:51 -0000
Subject: [Web-SIG] [DB-SIG] WSGI thread affinity/interleaving
In-Reply-To: <6B850331-4947-4824-84A3-2C04BC32BEA8@fuhm.net>
References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> <43A5B971.1010408@colorstudy.com> <43A5F757.2030906@egenix.com> <6B850331-4947-4824-84A3-2C04BC32BEA8@fuhm.net>
Message-ID: <43A6C10D.5000503@egenix.com>

James Y Knight wrote:
> On Dec 18, 2005, at 6:57 PM, M.-A. Lemburg wrote:
>
>> Ian Bicking wrote:
>>
>>> James Y Knight wrote:
>>>
>>>> I'm worried about database access.
>>>> Most DBAPI adapters have
>>>> threadsafety level 2: "Threads may share the module and
>>>> connections.". So with those, at least, it should be fine to move a
>>>> connection between threads, since "share OK" implies "move OK".
>>>>
>> What exactly do you mean with "move" ? Sharing a
>> connection refers to multiple threads creating cursors
>> on this connection.
>
> I'm asking about moving a cursor, that is, accessing it sequentially
> first from one thread, then later from another thread. This is
> potentially asking less than sharing, that is, accessing it
> simultaneously from two threads.
>
> For example, a simple class without any locking, that only modifies
> itself, would generally be movable between threads, but not sharable.
> Adding a mutex would make it both.

Ok. In that sense, I think "moving" is not really possible
with database connections or cursors: these always rely on
external resources, and those resources may depend on being
called from the same thread context.

Why would you want to "move" cursors or connections around?

--
Marc-Andre Lemburg

From mal at egenix.com Sat Dec 31 13:13:15 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 31 Dec 2005 12:13:15 -0000
Subject: [Web-SIG] [DB-SIG] WSGI thread affinity/interleaving
In-Reply-To:
References: <5.1.1.6.0.20051217171840.01e1ab80@mail.telecommunity.com> <43A5B971.1010408@colorstudy.com> <43A5F757.2030906@egenix.com> <6B850331-4947-4824-84A3-2C04BC32BEA8@fuhm.net> <43A6C10D.5000503@egenix.com>
Message-ID: <43B675D9.4090708@egenix.com>

Guido van Rossum wrote:
> On 12/19/05, M.-A. Lemburg wrote:
>> Ok. In that sense, I think "moving" is not really possible
>> with database connections or cursors: these always rely on
>> external resources and these may be relying on having the
>> same thread context around when being called.
>>
>> Why would you want to "move" cursors or connections around ?
>
> A typical connection (or cursor) caching implementation used from a
> multi-threaded program might easily do this: a resource is created in
> one thread, used for a while, then given back to the cache; when
> another thread requests a resource, it gets one from the cache that
> might have been used previously in a different thread. Keeping a cache
> per thread is a bit cumbersome and reduces the efficacy of the cache
> (if a thread goes away all the resources cached on its behalf must be
> closed instead of being made available to other threads).
>
> I'm not sure I understand what resources a typical DB client library
> might have that are associated with a thread instead of with a
> connection or cursor -- IOW I don't understand why you think moving
> resources between threads would be a problem, as long as only one
> thread "owns" them at any time. IOW if I maintain my own locking, why
> would I still be limited in sharing connections/cursors between
> threads? What am I missing?

All this would be easily possible if the Python cursor object
had full control over the external resources in use.

However, most Python database cursor objects rely on external
libraries and therefore do not have control over where state
is stored.
If you use a resource from a different thread than the one
where it was created, this can cause situations where part
of the state is missing or, worse, a different state is used.

Many database libraries do their own caching at various
levels (network connection, logical connections, cursors).
Not all of them are fully thread-safe, and it's hard to find
out which ones are.

To be on the safe side, you should only use connections from
the thread they were created in. It's still worthwhile to
cache the connections (and even cursors on that connection):

* connecting to a database can take anything from microseconds
  to several seconds;

* preparing statements for execution on a cursor also takes
  time (and in most cases costs a network roundtrip), so
  caching already prepared cursors also makes sense for
  commonly used statements.

The latter is especially useful with bound parameters since
the database will usually only have to prepare the statement
once and can then take any number of parameter sets to execute
the statement with.

--
Marc-Andre Lemburg
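
The advice above (use a connection only from the thread that created
it, but still cache it) can be sketched with thread-local storage. This
is only an illustration under stated assumptions, not code from the
thread: the standard library's sqlite3 module stands in for any DB-API
driver, and the names _local, get_connection and worker are
hypothetical.

```python
import sqlite3
import threading

# Hypothetical per-thread cache: every thread sees its own attributes on _local.
_local = threading.local()


def get_connection(database=":memory:"):
    """Return the calling thread's cached connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        # The connection is created in, and only ever handed to, this thread.
        conn = sqlite3.connect(database)
        _local.conn = conn
    return conn


def worker(results, index):
    """Use the thread's connection twice, then release it before exiting."""
    first = get_connection()
    second = get_connection()  # served from the cache, no reconnect
    # Bound parameters ("?") keep the statement text constant across calls.
    value = first.execute("SELECT ? + 1", (index,)).fetchone()[0]
    results[index] = (first is second, value)
    # A per-thread cache must close its resources when the thread goes away.
    first.close()
    _local.conn = None


if __name__ == "__main__":
    results = {}
    threads = [threading.Thread(target=worker, args=(results, i)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results)
```

The cost Guido describes still applies to this design: connections
cached for a thread have to be closed when that thread exits and cannot
be lent to other threads while idle.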