suggestions please "what should i watch for/guard against" in a file upload situation?

Diez B. Roggisch deets at web.de
Wed Oct 6 18:58:30 EDT 2010


Martin Gregorie <martin at address-in-sig.invalid> writes:

> On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:
>
>> in general, what are things i would want to 'watch for/guard against' in
>> a file upload situation?
>> 
>> i have my file upload working (in the self-made framework @ work without
>> any concession for multipart form uploads), but was told to make sure
>> it's cleansed and cannot do any harm inside the system.
>>
> Off the top of my head, and assuming that you get passed the exact 
> filename that the user entered:
>
> - The user may need to use an absolute pathname to upload a file
>   that isn't in his current directory, so retain only the basename
>   by discarding the rightmost slash and everything to the left of it:
>     /home/auser/photos/my_photo.jpg   ===> my_photo.jpg
>     c:\My Photos\My Photo.jpg         ===> My Photo.jpg
>
> - If your target system doesn't like spaces in names or you want to be
>   on the safe side there, replace spaces in the name with underscores:
>     My Photo.jpg     ===>    My_Photo.jpg
>
> - reject any filenames that could cause the receiving system to do
>   dangerous things, e.g. .EXE or .SCR if the upload target is Windows.
>   This list will be different for each upload target, so make it 
>   configurable.

Erm, this assumes that the files are executed in some way. Why should
they be? It's perfectly fine to upload *anything*, and of course filenames
mean nothing with respect to the actual file contents ("Are you sure you
want to change the extension of this file?").

It might make no sense for the user, because you can't show an exe as a
profile image. But safeguarding against that has nothing to do with the OS.
And even "safe" file formats such as PNG have been attack
vectors, precisely because they are processed client-side in the browser
through some library with security issues.

For serving the files, one could rely on the "file" command or similar
means to determine the MIME type. So far, I've never done that, as
faking the extension to pass for something else doesn't buy you anything
unless there is a documented case of "Internet Explorer ignoring the MIME
type and executing the downloaded file as a program".
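
As a rough illustration of the "similar means" mentioned above, here is a
minimal sketch that guesses a MIME type from the leading bytes of the upload
instead of trusting the extension. The helper name sniff_mime_type and the
small signature table are assumptions made for this example; the table is far
from complete, and the MIME label chosen for executables is only illustrative.

```python
# Hypothetical helper, not part of any library. The table is illustrative only.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"MZ", "application/x-msdownload"),  # "MZ" header of DOS/Windows executables
]


def sniff_mime_type(data, default="application/octet-stream"):
    """Guess a MIME type from the first bytes, ignoring the filename."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    # Unknown content is served as an opaque download.
    return default


# An executable renamed to "photo.png" is still recognised by its content.
print(sniff_mime_type(b"MZ\x90\x00" + b"\x00" * 60))
```

This only decides which Content-Type to send; it says nothing about whether
the content is safe for whatever library ends up parsing it.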


>   You can't assume anything else about the extension. 
>   .py .c .txt and .html are all valid in the operating systems I use
>   and so are their capitalised equivalents. 
>
> - check whether the file already exists. You need
>   rules about what to do if it exists: do you reject the upload,
>   silently overwrite, or alter the name, e.g. by adding a numeric
>   suffix to make the name unique?
>
>      my_photo.jpg  ===>  my_photo-01.jpg

Better, associate the file with the uploader and/or its hash. Use the
name as pure meta-information only.
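
One way to read that suggestion, sketched under assumptions: the file is
written under its SHA-256 digest, and the client-supplied name is kept only as
a metadata field. The names store_upload and clean_display_name, the metadata
layout and the flat store directory are all hypothetical choices for this
example.

```python
import hashlib
import os
import re


def clean_display_name(client_name):
    """Keep only the basename (either slash style); used for display only."""
    base = re.split(r"[\\/]", client_name)[-1]
    return base.replace(" ", "_") or "unnamed"


def store_upload(data, client_name, uploader, store_dir):
    """Write data under its SHA-256 digest and return a metadata record."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(store_dir, digest)
    # Identical content maps to the same path, so it is stored only once
    # and a user-chosen name can never collide with or overwrite a file.
    if not os.path.exists(path):
        with open(path, "wb") as fh:
            fh.write(data)
    return {
        "uploader": uploader,
        "name": clean_display_name(client_name),
        "sha256": digest,
    }
```

Because the on-disk name is derived from the content, the "does the file
already exist" question from the quoted list goes away: two users uploading
my_photo.jpg simply get two metadata records.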

> There's probably something I've forgotten, but that list should get you 
> going.

Dealing with too-large upload requests is, I'd say, much more important, as
careless reading of streams into memory has at least the potential for a
DoS attack.
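
A minimal sketch of a guarded read, assuming the framework hands you a
file-like object for the request body. The names copy_limited and
UploadTooLarge and the default chunk size are made up for this example.

```python
import io


class UploadTooLarge(Exception):
    """Raised when the request body exceeds the configured limit."""


def copy_limited(src, dst, max_bytes, chunk_size=64 * 1024):
    """Copy src to dst chunk by chunk, aborting once max_bytes is exceeded."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        # Check before writing so dst never receives more than max_bytes.
        if total > max_bytes:
            raise UploadTooLarge("upload exceeds %d bytes" % max_bytes)
        dst.write(chunk)
    return total


# Usage with in-memory streams standing in for the request and the target.
body = io.BytesIO(b"x" * 1000)
target = io.BytesIO()
print(copy_limited(body, target, max_bytes=2000))
```

Only one chunk is held in memory at a time. A Content-Length header can be
checked first as a cheap early rejection, but the client controls it, so the
byte count during the copy is what actually enforces the limit.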

Diez


