[Web-SIG] Implementing File Upload Size Limits

Randy Syring randy at rcs-comp.com
Sat Nov 22 19:06:15 CET 2008


[forgot to copy list]

Graham Dumpleton wrote:
> 2008/11/22 Randy Syring <randy at rcs-comp.com>:
>   
>> I am looking for opinions and thoughts on best practice for limiting file
>> upload size.  I have a few considerations:
>>
>> <snip>
>>     
> If you use Apache/mod_wsgi to host your WSGI application, the best way
> of handling this is to use the Apache LimitRequestBody directive for
> appropriate context. This will result in Apache returning a
> HTTP_REQUEST_ENTITY_TOO_LARGE (413) error response to the client. If
> you need a custom error document for that response type use Apache
> ErrorDocument directive to specify URL of handler which would generate
> it.
>   
Graham,

Thank you for your response.  What you noted above does seem to be the
lowest-level solution possible if you are using Apache.  I suppose using
an error document that is part of the application would at least allow
me to serve a specific page from my application that could detail the
error.  If I wanted to get fancy, each time a form with a file input
element was sent to a user, I could save that form's path in a special
variable in the user's session.  My error page could then look for that
value in the user's session and, if present, load the correct form,
giving the user an error message noting that the uploaded file was too
big.  The downside of that approach is that the form comes back empty.
It might be better to just have the error page give them some details
and encourage them to use the back button, in which case the form's
fields would hopefully still be filled in.
> Except for the custom error document if delegated to the WSGI
> application, doing it this way results in it all being handled by
> Apache/mod_wsgi and your WSGI application will not even be invoked.
> The request body content would also not even be read by Apache at all.
> Do note that whether this avoids the client sending the request body
> input depends on whether the client was expecting a '100 Continue'
> response before it sends the data. I believe most web browsers still
> don't use the '100 Continue' response.
>
> This would be the preferred solution for Apache/mod_wsgi as it is
> handled at lowest levels and guaranteed that request content wouldn't
> be read at that point. It is however taking control out of your
> application.
>   
Hopefully you can clarify something for me.  Let's assume that the client
does not use '100 Continue' but sends data immediately after sending
the headers.  If the server never reads the request content, what does
that mean exactly?  Does the data get transferred over the wire but then
discarded or does the client not get to send the data until the server
reads the request body?  I.e. the client tries to "send" it, but the
content isn't actually transferred across the wire until the server
reads it.  I am just wondering if there is a buffer or queue or
something between the server and the client that allows data to be
transferred even if the server doesn't "read" the request body.  Or, is
it just like a straight pipe where one end (the client) can't push data
through until the other end (the server) reads it.

I agree that it does take control out of the application.  From a
usability perspective, the best solution IMO would be for the user to
get the form back and have a red error message under the input field
indicating the file size uploaded was too big and giving them the max
file size allowed.  However, on second thought, that may not be true.
As noted above, because the entire request body was rejected, the form
loaded would have none of the information they submitted and most users
would probably think they have to fill out the whole form again.
Probably better to just give them a non-form error page and let them use
the back button (or even provide a link that uses javascript to go back)
and in so doing hopefully salvage the time they put into the form.

I suppose, though, that two different kinds of file size limits need to
be thought through.  The first limit would be an application-wide limit
that is set for security/resource reasons.  That, I believe, is what we
have been discussing up to this point.  I am just realizing that it
would also be fine to limit upload sizes at the application level and
give more user-friendly error messages.  So I might decide on a 10MB
application-wide upload limit, but I might also restrict free accounts
and paid accounts to 256k and 5MB respectively.  As long as a user
uploads something less than 10MB, they get a friendly in-line error
message.  If they upload over 10MB, we handle that at the Apache level
and send them to a custom error page.
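
A rough sketch of the application-level half of that idea, in Python.
The names (ACCOUNT_LIMITS, check_upload, format_size) are made up for
illustration; only the 256k / 5MB / 10MB figures come from the
paragraph above, and the 10MB ceiling is assumed to be enforced by
Apache rather than by this code.

```python
# Hypothetical per-account limits, taken from the figures above.
# The 10MB application-wide ceiling is assumed to be enforced by Apache,
# so anything reaching this code is already under that size.
ACCOUNT_LIMITS = {
    "free": 256 * 1024,        # 256k
    "paid": 5 * 1024 * 1024,   # 5MB
}


def format_size(num_bytes):
    """Render a byte count the way the limits are described to users."""
    if num_bytes >= 1024 * 1024:
        return "%dMB" % (num_bytes // (1024 * 1024))
    return "%dk" % (num_bytes // 1024)


def check_upload(account_type, size_in_bytes):
    """Return None if the upload is acceptable, else an in-line error message."""
    limit = ACCOUNT_LIMITS[account_type]
    if size_in_bytes > limit:
        return ("The file you uploaded is too big; the maximum file size "
                "allowed is %s." % format_size(limit))
    return None
```

The returned message is what would be shown under the file input when
the form is redisplayed; uploads over the server-wide limit never get
this far and would land on the custom error page instead.
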
> For Apache/mod_wsgi, if you do not do it this way but instead validate
> content length in the WSGI application and have the WSGI application
> return HTTP_REQUEST_ENTITY_TOO_LARGE (413) error response, then
> whether the request content gets read depends on whether you are using
> embedded mode or daemon mode of mod_wsgi.
>
> If you use embedded mode, so long as your WSGI application doesn't
> read the input and just returns the error response, the request
> content wouldn't be read at all. If you are using daemon mode however,
> then the request content would always be read by Apache child worker
> process, even if client asked for '100 Continue' response. This is
> because the Apache child worker process will always proxy request
> content to the daemon process.
>
>   
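
For reference, validating the content length inside the WSGI
application might look roughly like the sketch below.  The wrapper name
limit_upload_size and the max_bytes parameter are illustrative, not part
of mod_wsgi or any framework.  It only looks at the declared
CONTENT_LENGTH, treats a missing or malformed value as zero, and returns
the 413 response without ever touching wsgi.input.

```python
def limit_upload_size(app, max_bytes):
    """Wrap a WSGI app and reject requests whose declared body is too large."""

    def middleware(environ, start_response):
        # CONTENT_LENGTH may be absent or an empty string for bodiless requests.
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0  # simplification: malformed values are treated as zero

        if length > max_bytes:
            body = b"Request entity too large\n"
            start_response(
                "413 Request Entity Too Large",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            # Return without reading environ["wsgi.input"].
            return [body]

        return app(environ, start_response)

    return middleware
```

Usage would be something like
application = limit_upload_size(application, 10 * 1024 * 1024).
Whether the request content is actually left unread still depends on
the embedded versus daemon mode behaviour described above.
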
That's good to know.  I think at this point I have talked myself into
thinking that there is no good reason to handle it at the application
level, but would appreciate any further feedback you might have.

One other thing, what would be a good upload size limit?  Should it
always be as low as possible?  What might be a good "middle-ground" for
the average web application uploading documents and pictures?

Thank you for taking the time to respond.

--------------------------------------
Randy Syring
RCS Computers & Web Solutions
502-644-4776
http://www.rcs-comp.com

"Whether, then, you eat or drink or
whatever you do, do all to the glory
of God." 1 Cor 10:31





