httplib/socket problems reading 404 Not Found response

Patrick Altman paltman at gmail.com
Mon Mar 12 23:07:55 EDT 2007


I am attempting to use a HEAD request against Amazon S3 to check
whether a file exists. If it does, I parse the MD5 hash from the ETag
in the response to verify the contents of the file, so that I do not
spend bandwidth uploading files when it is not necessary.

If the file exists, the HEAD works as expected: I get valid headers
back and can pull the ETag out of the dictionary using
getheader('ETag')[1:-1] (the slice trims off the double quotes in the
string).
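
For illustration, the comparison described above could be sketched roughly
as follows. The helper names (etag_to_md5, local_md5, needs_upload) are
hypothetical and not part of httplib or any S3 library, and the sketch
assumes, as this post does, that the ETag is the hex MD5 of the object's
contents wrapped in double quotes.

```python
import hashlib


def etag_to_md5(etag_header):
    """Trim the surrounding double quotes from an ETag header value."""
    # Same slice as getheader('ETag')[1:-1] in the post.
    return etag_header[1:-1]


def local_md5(path, block_size=65536):
    """Return the hex MD5 digest of a local file, read in blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            digest.update(block)
    return digest.hexdigest()


def needs_upload(etag_header, path):
    """Decide whether the local file should be uploaded.

    etag_header is the raw ETag header value, or None when the remote
    file does not exist. Assumes the ETag is a quoted hex MD5.
    """
    if etag_header is None:
        return True
    return etag_to_md5(etag_header) != local_md5(path)
```

With something like this, needs_upload(response.getheader('ETag'), path)
would skip the upload only when the remote hash matches the local file.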

The problem arises when I send a HEAD request for a file that does not
exist. As expected, Amazon sends back a 404 Not Found response;
however, my test scripts seem to hang. When I run Python under
trace.py, it hangs here:

 --- modulename: httplib, funcname: _read_chunked
httplib.py(536):         assert self.chunked != _UNKNOWN
httplib.py(537):         chunk_left = self.chunk_left
httplib.py(538):         value = ''
httplib.py(542):         while True:
httplib.py(543):             if chunk_left is None:
httplib.py(544):                 line = self.fp.readline()
 --- modulename: socket, funcname: readline
socket.py(321):         data = self._rbuf
socket.py(322):         if size < 0:
socket.py(324):             if self._rbufsize <= 1:
socket.py(326):                 assert data == ""
socket.py(327):                 buffers = []
socket.py(328):                 recv = self._sock.recv
socket.py(329):                 while data != "\n":
socket.py(330):                     data = recv(1)

It eventually completes with an exception here:

  File "C:\Python25\lib\httplib.py", line 509, in read
    return self._read_chunked(amt)
  File "C:\Python25\lib\httplib.py", line 548, in _read_chunked
    chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: ''

For reference, Ethereal captured the following request and response:

HEAD <REMOVED> HTTP/1.1
Host: s3.amazonaws.com
Accept-Encoding: identity
Date: Tue, 13 Mar 2007 02:54:12 GMT
Authorization: AWS <REMOVED>

HTTP/1.1 404 Not Found
x-amz-request-id: E20B4C0D0C48B2EF
x-amz-id-2: <REMOVED>
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Tue, 13 Mar 2007 02:54:16 GMT
Server: AmazonS3
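
Note that the response above advertises Transfer-Encoding: chunked, while
HTTP/1.1 says a response to HEAD carries no message body. The traceback
shows _read_chunked being reached from read(), so one possible workaround,
offered only as a sketch and not as a confirmed fix, is to look at the
status and headers and never call read() on a HEAD response. The function
name head_etag is hypothetical; conn is assumed to be an already-opened
connection object exposing request() and getresponse(), such as an
httplib HTTPConnection, and any Date/Authorization headers are assumed to
be built by the caller.

```python
def head_etag(conn, path, headers=None):
    """Send a HEAD request and return the unquoted ETag.

    Returns None when the server answers 404 Not Found or sends no
    ETag header. The body is never read: a HEAD response has none,
    even if the headers advertise chunked transfer encoding.
    """
    conn.request("HEAD", path, headers=headers or {})
    response = conn.getresponse()
    try:
        if response.status == 404:
            return None
        etag = response.getheader("ETag")
        if not etag:
            return None
        # Trim the surrounding double quotes, as in the post.
        return etag[1:-1]
    finally:
        # Close without calling read(); this is the assumed workaround.
        response.close()
```

Whether this avoids the hang depends on where read() is actually being
called from, which the trace alone does not show.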

Am I doing something wrong?  Is this a known issue?  I am an
experienced developer, but pretty new to Python and dynamic languages
in general.

Thanks,
Patrick



