[Python-Dev] Bug in SimpleHTTPRequestHandler.send_head?

Michael Foord fuzzyman at voidspace.org.uk
Fri Sep 5 14:19:50 CEST 2008


Hello Kim,

Thanks for your post. Python uses Subversion for source control.

Unfortunately, patches posted to this list tend to get lost. Please post
the bug report, along with your comments and patch, to the Python bug tracker:

http://bugs.python.org/

Michael Foord

Kim Gräsman wrote:
> Hi all,
>
> I'm new to this group and the Python language as such. I stumbled on
> it when I joined a project to build a rich network library for C++,
> which in turn uses Python and its CGI HTTP server implementation as
> part of its unit test suite.
>
> We're having a little trouble when serving a text file containing
> Windows line endings (CRLF) -- the resulting content contains Unix
> line endings only (LF). This breaks our tests, because we can't verify
> that the body, as parsed by our HTTP client, is the same as the source
> file we're serving through the Python HTTP server.
>
> I've isolated it to the SimpleHTTPRequestHandler.send_head method in
> SimpleHTTPServer.py:
>
> --
>         ctype = self.guess_type(path)
>         if ctype.startswith('text/'):
>             mode = 'r'
>         else:
>             mode = 'rb'
>         try:
>             f = open(path, mode)
>         except IOError:
>             self.send_error(404, "File not found")
>             return None
> --
>
> The f object is returned from this method, and used with
> shutil.copyfileobj to copy the contents to the output stream.
>
> This is easily fixed by omitting the content-type check entirely, and
> blindly using mode 'rb', and I think that makes sense, because the
> server should not be concerned with the contents of the body, so
> treating it as a binary stream seems right.
>
> This also fixes another issue, where the actual body size differs from
> what's specified in the Content-Length header, because CR characters
> are stripped when the body is served, but Content-Length contains the
> source file's binary size.
>
> I'm not sure which source control system you're using, so I won't try
> to provide a patch, but I believe the code should read:
>
> --
>         if os.path.isdir(path):
>             if not self.path.endswith('/'):
>                 # redirect browser - doing basically what apache does
>                 self.send_response(301)
>                 self.send_header("Location", self.path + "/")
>                 self.end_headers()
>                 return None
>             for index in "index.html", "index.htm":
>                 index = os.path.join(path, index)
>                 if os.path.exists(index):
>                     path = index
>                     break
>             else:
>                 return self.list_directory(path)
>         #patch: removed content-type check
>         try:
>             f = open(path, 'rb')  #patch: always open in binary mode
>         except IOError:
>             self.send_error(404, "File not found")
>             return None
>         self.send_response(200)
>         #patch: content-type check here instead
>         self.send_header("Content-type", self.guess_type(path))
>         fs = os.fstat(f.fileno())
> --
>
> My changes marked with "#patch[...]".
>
> Grateful for any comments!
>
> Best wishes,
> - Kim
>   
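
For anyone who wants to reproduce the mismatch described above outside the
server, here is a minimal sketch. It is not code from SimpleHTTPServer.py: the
helper name read_as_server_would is hypothetical, and the sketch assumes
Python 3, where text mode translates CRLF to LF on read on every platform. In
the Python 2 code quoted above, that translation only happens on Windows.

```python
import os
import tempfile


def read_as_server_would(path, mode):
    """Return the bytes a handler would send after opening `path` in `mode`.

    Hypothetical helper for illustration; not part of SimpleHTTPServer.
    """
    if "b" in mode:
        # Binary mode: bytes come back exactly as stored on disk.
        with open(path, mode) as f:
            return f.read()
    # Text mode: with the default newline handling, "\r\n" is read as "\n".
    with open(path, mode, encoding="ascii") as f:
        return f.read().encode("ascii")


def demo():
    original = b"line one\r\nline two\r\n"
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(original)
        # This is the size a Content-Length header would be based on.
        content_length = os.path.getsize(path)
        text_body = read_as_server_would(path, "r")
        binary_body = read_as_server_would(path, "rb")
    finally:
        os.remove(path)
    return original, content_length, text_body, binary_body


original, content_length, text_body, binary_body = demo()
print("Content-Length:", content_length)
print("text-mode body length:", len(text_body))
print("binary-mode body length:", len(binary_body))
```

The text-mode body comes out shorter than the size on disk because each CR is
dropped, while the binary-mode body matches the file byte for byte. That is
the argument for always opening with 'rb'.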


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/
http://www.trypython.org/
http://www.ironpython.info/
http://www.theotherdelia.co.uk/
http://www.resolverhacks.net/


