BaseHTTPServer weirdness

Tue Sep 12 05:17:10 EDT 2006

Ron Garret wrote:
> In article <3qoNg.820$5i7.307 at newsreading01.news.tds.net>,
>  Kent Johnson <kent at kentsjohnson.com> wrote:
> 
> 
>>Steve Holden wrote:
>>
>>>Ron Garret wrote:
>>>
>>>>In article <mailman.294.1158010161.5279.python-list at python.org>,
>>>> Steve Holden <steve at holdenweb.com> wrote:
>>>>
>>>>
>>>>
>>>>>But basically, you aren't providing a CGI environment, and that's why 
>>>>>cgi.parse() isn't working.
>>>>
>>>>Clearly.  So what should I be doing?  Surely I'm not the first person to 
>>>>have this problem?
>>>>
>>>>I have managed to work around this for now by copying and modifying the 
>>>>code in cgi.parse, but this still feels like a Horrible Hack to me.
>>>>
>>>
>>>Let me get this right. You are aware that CGIHTTPServer module exists. 
>>>But you don't want to use that. Instead you want to use your own code. 
>>>So you have ended up duplicating some of the functionality of the cgi 
>>>library. And it feels like a hack.
>>>
>>>Have I missed anything? :-)
>>
>>Hey, be nice. Wanting to write a request handler that actually handles a 
>>POST request doesn't seem so unreasonable.
>>
>>Except...when there are about a bazillion Python web frameworks to 
>>choose from, why start from BaseHTTPServer? Why not use one of the 
>>simpler frameworks like Karrigell or Snakelets or CherryPy?
> 
> 
> It may come to that.  I just thought that what I'm trying to do is so 
> basic that it ought to be part of the standard library.  I mean, what do 
> people use BaseHTTPServer for if you can't parse form input?
> 
> 
>>Here is the query-handling code from Karrigell's CustomHTTPServer.py, 
>>good at least for a second opinion:
>>
>>     def do_POST(self):
>>         """Begin serving a POST request. The request data must be readable
>>         on a file-like object called self.rfile"""
>>         ctype, pdict = cgi.parse_header(
>>             self.headers.getheader('content-type'))
>>         self.body = cgi.FieldStorage(fp=self.rfile,
>>             headers=self.headers, environ = {'REQUEST_METHOD':'POST'},
>>             keep_blank_values = 1, strict_parsing = 1)
>>         # throw away additional data [see bug #427345]
>>         while select.select([self.rfile._sock], [], [], 0)[0]:
>>             if not self.rfile._sock.recv(1):
>>                 break
>>         self.handle_data()
>>
>>Here is CherryPy's version from CP 2.1:
>>
>>         # Create a copy of headerMap with lowercase keys because
>>         # FieldStorage doesn't work otherwise
>>         lowerHeaderMap = {}
>>         for key, value in request.headerMap.items():
>>             lowerHeaderMap[key.lower()] = value
>>
>>         # FieldStorage only recognizes POST, so fake it.
>>         methenv = {'REQUEST_METHOD': "POST"}
>>         try:
>>             forms = _cpcgifs.FieldStorage(fp=request.rfile,
>>                                       headers=lowerHeaderMap,
>>                                       environ=methenv,
>>                                       keep_blank_values=1)
>>
>>where _cpcgifs.FieldStorage is cgi.FieldStorage with some extra accessors.
> 
> 
> Here's what I actually ended up doing:
> 
> def parse(r):
>   ctype = r.headers.get('content-type')
>   if not ctype: return None
>   ctype, pdict = cgi.parse_header(ctype)
>   if ctype == 'multipart/form-data':
>     return cgi.parse_multipart(r.rfile, pdict)
>   elif ctype == 'application/x-www-form-urlencoded':
>     clength = int(r.headers.get('Content-length'))
>     if maxlen and clength > maxlen:
>       raise ValueError, 'Maximum content length exceeded'
>     return cgi.parse_qs(r.rfile.read(clength), 1)
>   else:
>     return None
> 
> which is copied more or less directly from cgi.py.  But it still seems 
> to me like this (or something like it) ought to be standardized in one 
> of the *HTTPServer.py modules.
> 
> But what do I know?
> 
I wouldn't necessarily say you are wrong here. It's just that the cgi 
module has sort of "just growed", so it isn't conveniently factored for 
reusability in other contexts. Several people (including me) have taken 
a look at it with a view to possible re-engineering and backed away 
because of the difficulty of maintaining compatibility. Python 3K will 
be an ideal opportunity to replace it, but until then it's probably 
going to stay in the same rather messy but working state.
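
For what it's worth, the urlencoded case can be factored out so that it
needs no CGI environment at all. The following is only a rough sketch,
not anything taken from the standard library: it assumes the Python 3
module names (urllib.parse rather than the cgi functions quoted above),
and parse_form and MAX_LEN are hypothetical names made up for
illustration. It does not attempt multipart/form-data.

```python
import io
from urllib.parse import parse_qs

MAX_LEN = 1 << 20  # hypothetical cap on the request body size, in bytes


def parse_form(headers, rfile, maxlen=MAX_LEN):
    """Parse a urlencoded POST body.

    headers is any mapping with a .get() method (for example the
    handler's self.headers) and rfile is a binary file-like object.
    Returns a dict mapping field names to lists of values, or None if
    the content type is not application/x-www-form-urlencoded.
    """
    ctype = headers.get('Content-Type', '')
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    ctype = ctype.split(';', 1)[0].strip().lower()
    if ctype != 'application/x-www-form-urlencoded':
        return None
    clength = int(headers.get('Content-Length', 0))
    if maxlen and clength > maxlen:
        raise ValueError('Maximum content length exceeded')
    # Read exactly Content-Length bytes so the socket is never over-read.
    body = rfile.read(clength).decode('ascii')
    return parse_qs(body, keep_blank_values=True)


if __name__ == '__main__':
    # Stand-ins for self.headers and self.rfile in a request handler.
    demo_headers = {
        'Content-Type': 'application/x-www-form-urlencoded',
        'Content-Length': '10',
    }
    print(parse_form(demo_headers, io.BytesIO(b'a=1&b=&a=2')))
```

Inside a request handler's do_POST this would presumably be called as
parse_form(self.headers, self.rfile).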

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden