Extending and altering httplib to handle bad servers

Michael Ekstrand mekstran at scl.ameslab.gov
Mon Aug 8 10:54:33 EDT 2005


In the course of my current project, I've had to connect to an HTTP
server that isn't fully compliant with the HTTP requirements for
chunked transfer encoding. Rather than sending the end-of-data sentinel
(a zero-length chunk), it just closes the connection, without even
sending the CRLF that should follow the last chunk's data.
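
For illustration, here is a minimal sketch of the framing involved. It is
not httplib's code: read_chunked_tolerant is a hypothetical helper, the
sample bodies are made up, and the sketch targets Python 3 rather than 2.3.
A compliant chunked body ends with a zero-length chunk; the server described
above stops right after the last chunk's data.

```python
import io


def read_chunked_tolerant(fp):
    """Decode a chunked body from a binary file-like object.

    Hypothetical helper, not part of httplib. Returns (data, complete);
    complete is False when the stream ended before the zero-length
    terminating chunk was seen.
    """
    parts = []
    while True:
        size_line = fp.readline()
        if not size_line:
            # EOF where a chunk-size line was expected: peer closed early.
            return b"".join(parts), False
        # Drop any chunk extensions after ';' and parse the hex size.
        size_field = size_line.split(b";", 1)[0].strip()
        try:
            size = int(size_field, 16)
        except ValueError:
            return b"".join(parts), False
        if size == 0:
            # Proper end-of-data sentinel; skip trailer lines up to the
            # blank line (or EOF).
            while fp.readline() not in (b"\r\n", b"\n", b""):
                pass
            return b"".join(parts), True
        chunk = fp.read(size)
        parts.append(chunk)
        if len(chunk) < size:
            # Connection closed in the middle of a chunk.
            return b"".join(parts), False
        if fp.read(2) != b"\r\n":
            # Missing CRLF after the chunk data: the case described above.
            return b"".join(parts), False


# Made-up sample bodies: one compliant, one cut off like the bad server's.
COMPLIANT = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
TRUNCATED = b"5\r\nhello\r\n6\r\n world"

compliant_result = read_chunked_tolerant(io.BytesIO(COMPLIANT))
truncated_result = read_chunked_tolerant(io.BytesIO(TRUNCATED))
```

Returning a "complete" flag alongside the data is one possible design
choice: the caller still gets the bytes, but can tell that the server
never sent a proper end-of-data chunk.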

Because of this, httplib always raised an exception (IncompleteRead)
instead of returning the data it had already read. I have therefore
modified httplib.py to do something reasonable when the server just
closes the connection.
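
For comparison, here is a sketch of getting similar tolerance by subclassing
rather than editing the library file. It assumes Python 3's http.client
(httplib's successor), where the IncompleteRead raised from a chunked read
carries the body decoded so far in its partial attribute. Python 2.3's
httplib does not appear to accumulate the data that way, so this is not a
drop-in replacement for the diff below. TolerantResponse, TolerantConnection,
FakeSocket and the sample response are hypothetical names and data used only
for the demonstration.

```python
import http.client
import io


class TolerantResponse(http.client.HTTPResponse):
    """Return whatever was read if the peer closes mid-body."""

    def read(self, amt=None):
        try:
            return super().read(amt)
        except http.client.IncompleteRead as exc:
            # The server closed without a proper end of data; keep the
            # bytes that did arrive and release the connection.
            self.close()
            return exc.partial


class TolerantConnection(http.client.HTTPConnection):
    # response_class is the hook HTTPConnection uses to build responses.
    response_class = TolerantResponse


class FakeSocket:
    """Stand-in for a socket so the demo needs no network (assumption)."""

    def __init__(self, data):
        self._data = data

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._data)


# Made-up response that ends right after the last chunk's data.
RAW = (
    b"HTTP/1.1 200 OK\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"5\r\nhello\r\n6\r\n world"
)


def demo():
    response = TolerantResponse(FakeSocket(RAW))
    response.begin()  # parse the status line and headers
    return response.read()


body = demo()
```

Only read() is covered here; other read methods on the response would need
the same treatment, and silently swallowing the error hides the fact that
the body may be incomplete.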

So, my questions are (my changes follow, against Python 2.3):

- Did I patch the right place to do Something Reasonable in this case
of server non-compliance?

- Is there a better, more robust way to handle this case, or one that
would also cover other similar cases?

- Is there anything special I should do (besides obviously diff-ing
against CVS) before submitting a patch for this to SourceForge? (it
seems to me that being tolerant of bad servers is something that would
be of general interest.)

Thanks,
Michael

---8<------- BEGIN CONTEXT DIFF ------------
*** /usr/lib/python2.3/httplib.py	2005-05-04 02:08:57.000000000 -0500
--- httplib.py	2005-08-05 10:33:08.000000000 -0500
***************
*** 1,5 ****
--- 1,7 ----
  """HTTP/1.1 client library
  
+ Copyright (c) 2001 Python Software Foundation; All Rights Reserved
+ 
  <intro stuff goes here>
  <other stuff, too>
  
***************
*** 64,69 ****
--- 66,75 ----
  Unread-response                _CS_IDLE           <response_class>
  Req-started-unread-response    _CS_REQ_STARTED    <response_class>
  Req-sent-unread-response       _CS_REQ_SENT       <response_class>
+ 
+ Modified 2005-07-20 by Michael Ekstrand <mekstran at iastate.edu> to deal
+ gracefully with non-compliant systems which just terminate the connection
+ rather than sending the end-of-data chunk in chunked HTTP responses.
  """
  
  import errno
***************
*** 442,448 ****
                  amt -= chunk_left
  
              # we read the whole chunk, get another
!             self._safe_read(2)      # toss the CRLF at the end of the chunk
              chunk_left = None
  
          # read and discard trailer up to the CRLF terminator
--- 448,460 ----
                  amt -= chunk_left
  
              # we read the whole chunk, get another
!             try:
!                 self._safe_read(2)  # toss the CRLF at the end of the chunk
!             except IncompleteRead:
!                 # The server just closed on us, without providing appropriate
!                 # end-of-data things.
!                 self.close()
!                 return value
              chunk_left = None
  
          # read and discard trailer up to the CRLF terminator
---8<--------- END CONTEXT DIFF -------------


