[Web-SIG] Re: Bill's comments on WSGI draft 1.4

Phillip J. Eby pje at telecommunity.com
Tue Sep 7 02:29:32 CEST 2004


At 11:41 AM 9/6/04 -0700, tony at lownds.com wrote:
>I'm specifically advocating that servers be required to use read() if they
>can't use fileno().

But with what block size?  If the block size is the whole file, why not 
just use:

     return [filelike.read()]

If it's some other block size, why not be explicit?
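
To illustrate what "being explicit" might look like, here is a minimal sketch in which the application chooses the block size itself. The name `read_blocks` and the 8192 default are only illustrative and are not part of the draft; the sketch is written for a current Python 3 and assumes a file-like object opened in binary mode.

```python
import io

def read_blocks(filelike, blksize=8192):
    """Yield successive blocks from filelike until read() returns nothing."""
    while True:
        block = filelike.read(blksize)
        if not block:
            break
        yield block

# An in-memory file stands in for a real open file here.
blocks = list(read_blocks(io.BytesIO(b"abcdefghij"), 4))
```

An application could return `read_blocks(f, blksize)` directly as its iterable, which works on any server at the cost of giving up any fileno()-based fast path.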


>  When an application returns an open file object,
>servers that send it out line by line (i.e., as an iterator) would be far,
>far slower than servers that use fileno(). So that technique wouldn't
>really be portable across WSGI implementations. Using read() would make
>returning an open file a viable technique on all WSGI servers.

Okay, you've convinced me: the fileno() optimization (as it's currently 
specified) needs to be removed, and I need to strip out all mention of 
returning files from the application.  (Except maybe to mention that it's a 
bad idea!)

Instead of using 'fileno' as an extension attribute on the iterable, we'll 
add a 'wsgi.file_wrapper' key, usable as follows by an application:

     return environ['wsgi.file_wrapper'](something,blksize)

The 'file_wrapper' may introspect "something" in order to do a fileno() 
check, or other "I know how to send this kind of object quickly" 
optimizations.  It must return an iterable that the application may then 
return to the server.  The server *must not* assume that the application 
*will* return the iterable; it is perfectly legal to do something like this:

     an_iter = environ['wsgi.file_wrapper'](something,blksize)

     for block in an_iter:
          yield block.replace('\n', '\r\n')

In this case, the application iterates over the wrapper itself and yields 
transformed blocks, so the server never receives the original 
iterator.  In the same way, middleware may transform or ignore data yielded 
by the iterator.  So, in effect, 'file_wrapper' should just wrap the 
original file-like object in an iterator that the server can recognize and 
optimize, in the event that it *actually* is returned by the application.

Here's the simplest possible conforming implementation of 'file_wrapper' 
that works for any modern (1.5.2+) Python, followed by how a server would 
use it:

     class file_wrapper:

         def __init__(self,readable,blocksize=8192):
             self.readable, self.blocksize = readable, blocksize
             self.close = readable.close

         def __getitem__(self,index):
             data = self.readable.read(self.blocksize)
             if data:
                 return data
             raise IndexError

     environ['wsgi.file_wrapper'] = file_wrapper

     result = application(environ, start_response)

     if isinstance(result, file_wrapper):
         # check result.readable for fileno() or other optimizations
         pass
     else:
         # do normal iteration over 'result'
         pass
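
For comparison, the same idea might be sketched on a current Python 3 using the iterator protocol instead of `__getitem__`. This is only a sketch under assumptions: the names `FileWrapper`, `has_real_fileno`, and `send_result` are hypothetical, `write` stands in for whatever the server uses to send bytes, and the "fast" branch merely records that an optimization would be possible rather than performing one.

```python
import io

class FileWrapper:
    """Iterable wrapper that a server can recognize by type."""

    def __init__(self, readable, blocksize=8192):
        self.readable = readable
        self.blocksize = blocksize
        if hasattr(readable, "close"):
            self.close = readable.close

    def __iter__(self):
        return self

    def __next__(self):
        data = self.readable.read(self.blocksize)
        if data:
            return data
        raise StopIteration


def has_real_fileno(obj):
    """True only if obj.fileno() exists and actually returns a descriptor."""
    try:
        obj.fileno()
        return True
    except (AttributeError, OSError, ValueError):
        return False


def send_result(result, write):
    """Hypothetical server-side dispatch over an application's return value."""
    try:
        if isinstance(result, FileWrapper) and has_real_fileno(result.readable):
            # A real server could pass the descriptor to a platform-specific
            # fast path here; this sketch still iterates normally.
            path = "fast"
        else:
            path = "iterate"
        for block in result:
            write(block)
    finally:
        if hasattr(result, "close"):
            result.close()
    return path
```

Because the check is on the type of the returned object, a wrapper that the application or middleware consumed and re-yielded never reaches the optimized branch, which is the behavior described above.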

Unfortunately, this is a lot more boilerplate than I'd like to impose on 
server authors.  But, if we don't, then the same boilerplate is effectively 
imposed on all application/framework/middleware authors who want to return 
file-like objects.

The other hassle here is going to be adjusting the PEP's presentation 
sequence so that this complication doesn't obscure the simplicity of the 
"CGI Gateway" example.  :(

Another option is for the server to check for a 'read()' method as an 
alternative to iterability, but that leaves open the question of 
appropriate block size.  I suppose we could say that this is up to the server.
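
A rough sketch of that read()-based alternative, with the block size left to the server, might look like the following. The names `iterate_result` and `SERVER_BLOCK_SIZE` are hypothetical, the 8192 value is an arbitrary assumption, and the empty-bytes sentinel assumes the file-like object returns bytes (Python 3, binary mode).

```python
import io

SERVER_BLOCK_SIZE = 8192  # assumed server-chosen value, not from the draft

def iterate_result(result, blocksize=SERVER_BLOCK_SIZE):
    """Return an iterator over result, preferring read() when it exists."""
    read = getattr(result, "read", None)
    if callable(read):
        # iter(callable, sentinel) keeps calling read(blocksize) until it
        # returns b"", which signals end of file for a binary stream.
        return iter(lambda: read(blocksize), b"")
    return iter(result)

from_file = list(iterate_result(io.BytesIO(b"abcdef"), 4))
from_list = list(iterate_result([b"x", b"y"]))
```

Note that this puts the introspection in the server's normal result-handling path, so every returned object gets the `read` check rather than only those the application explicitly wrapped.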

But, no matter how the introspection works, it's going to work strongly 
against the appearance of simplicity in the examples.  :(


