[Patches] [ python-Patches-839496 ] SimpleHTTPServer reports wrong content-length for text files

SourceForge.net noreply at sourceforge.net
Mon Jan 10 10:28:06 CET 2005


Patches item #839496, was opened at 2003-11-10 21:42
Message generated for change (Comment added) made by jlgijsbers
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=839496&group_id=5470

Category: Library (Lib)
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Irmen de Jong (irmen)
>Assigned to: Johannes Gijsbers (jlgijsbers)
Summary: SimpleHTTPServer reports wrong content-length for text files

Initial Comment:
(Python 2.3.2 on Windows)

SimpleHTTPServer reports the size of the file on disk
as Content-Length. This works except for text files.
If the content type starts with "text/", it opens the
file in 'text' mode rather than 'binary' mode. At least on
Windows this causes newline translation, which makes
the actual size of the content transmitted *less* than
the reported content-length!

I don't know why SimpleHTTPServer is reading text files
in text mode. The included patch removes this distinction
so all files are opened in binary mode (and, also on
Windows, the actual size transmitted is the same as the
reported content-length).

--Irmen de Jong
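
The mismatch described above can be reproduced on any platform with a
small sketch. This is not the code from SimpleHTTPServer: it assumes
Python 3, where the default text mode translates "\r\n" to "\n" on read
everywhere, which stands in for the Windows text-mode behaviour of
Python 2.3. The helper name sizes_for and the sample file are
hypothetical.

```python
import os
import tempfile

def sizes_for(path):
    """Return (size on disk, bytes read in binary mode, chars read in text mode)."""
    on_disk = os.path.getsize(path)
    with open(path, "rb") as f:
        binary_len = len(f.read())
    # newline=None (the default) enables universal newlines:
    # every "\r\n" in the file is handed back as a single "\n".
    with open(path, "r", newline=None, encoding="ascii") as f:
        text_len = len(f.read())
    return on_disk, binary_len, text_len

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")  # hypothetical sample file
    with open(sample, "wb") as f:
        f.write(b"line one\r\nline two\r\nline three\r\n")
    on_disk, binary_len, text_len = sizes_for(sample)

print("on disk:", on_disk, "binary read:", binary_len, "text read:", text_len)
```

The binary read matches the size on disk, while the text read comes up
one character short per line, which is the gap between the advertised
Content-Length and the data actually sent.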


----------------------------------------------------------------------

>Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2005-01-10 10:28

Message:
Logged In: YES 
user_id=469548

Okay, I've checked in the fix on maint24 and HEAD.

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2005-01-10 01:44

Message:
Logged In: YES 
user_id=129426

Upon re-reading the w3 spec, it seems that we're safe, as
long as the use of CR, LF, or CR+LF is consistent throughout
the text file.
The spec says: "HTTP relaxes this requirement [=the
requirement of being in canonical form] and allows the
transport of text media with plain CR or LF alone
representing a line break when it is done consistently for
an entire entity-body. HTTP applications MUST accept CRLF,
bare CR, and bare LF as being representative of a line break
in text media received via HTTP."

So my patch is safe, I think.

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2004-07-19 22:59

Message:
Logged In: YES 
user_id=129426

Hm, perhaps the easy way out (see my patch) is not the best
solution:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1

It seems it's best to convert text responses to CR LF
format? If we should do this, we must somehow 're-calculate'
the content-length after the CR LF conversion.
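
For illustration only, the alternative considered here (convert to
CR LF, then recompute the length) might look like the sketch below. It
is not part of the submitted patch; the function name to_crlf is
hypothetical and the code assumes Python 3 bytes objects.

```python
def to_crlf(data: bytes) -> bytes:
    """Rewrite every line break (CRLF, bare CR, bare LF) as CRLF."""
    # Collapse all conventions to LF first so existing CRLF pairs
    # are not expanded twice, then expand each LF to CRLF.
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.replace(b"\n", b"\r\n")

body = to_crlf(b"a\nb\r\nc\rd")
# The Content-Length must be taken from the converted body,
# not from the size of the file on disk.
content_length = len(body)
```

The cost of this approach is that the whole body has to be converted
before the header can be written, whereas serving the file in binary
mode lets the on-disk size be used directly.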

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2004-06-06 13:18

Message:
Logged In: YES 
user_id=129426

The attached httptest.zip contains a test scenario. When run
on Windows, it will show the problem.
First start 'startserver.py', and then run 'test.py' from the
same directory.
I get this:

[E:\temp\httptest]python test.py
The reported content-length is: 1047 bytes
The real filesize is: 1047 bytes
The data I actually received from the httpserver is: 1028 bytes
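
The contents of httptest.zip are not reproduced in this message. A
comparable check can be sketched as follows, assuming Python 3.7 or
later, where the handler lives in http.server and accepts a directory
argument. The function name measure and the sample file are
hypothetical, and this is not the attached test.py.

```python
import http.client
import http.server
import os
import tempfile
import threading
from functools import partial

def measure(directory, filename):
    """Serve `directory` on a free local port and fetch `filename` once.

    Returns (reported Content-Length, size on disk, bytes received).
    """
    handler = partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    server = http.server.HTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", "/" + filename)
        response = conn.getresponse()
        reported = int(response.getheader("Content-Length"))
        received = len(response.read())
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    on_disk = os.path.getsize(os.path.join(directory, filename))
    return reported, on_disk, received

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "sample.txt"), "wb") as f:
        f.write(b"line one\r\nline two\r\n")  # CRLF line endings on disk
    reported, on_disk, received = measure(tmp, "sample.txt")

print("The reported content-length is: %d bytes" % reported)
print("The real filesize is: %d bytes" % on_disk)
print("The data I actually received from the httpserver is: %d bytes" % received)
```

With a server that opens every file in binary mode, the three numbers
should agree; the output quoted above shows the third one falling
short.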

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2004-05-31 18:51

Message:
Logged In: YES 
user_id=129426

The attached trivial patch removes the special case for text
files.

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2004-05-13 13:21

Message:
Logged In: YES 
user_id=129426

This bug is also still present in the SimpleHTTPServer.py
from Python 2.3.3 (and in the current CVS version, too).

Is there a reason why it treats text files differently? If
so, then at least the reported content-length must be fixed.

----------------------------------------------------------------------
