[issue40763] zipfile.extractall is safe by now

Ama Aje My Fren report at bugs.python.org
Wed May 27 05:25:52 EDT 2020


Ama Aje My Fren <amaajemyfren at gmail.com> added the comment:

On Tue, May 26, 2020 at 2:47 PM Va <report at bugs.python.org> wrote:
>
> What hasn't been handled then?
>

The rules for naming files in Windows are long
(https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file).
It is possible, e.g. under WSL within Windows, to create files that
break these rules. In my case I added a colon (:) to a file name.
In Python for Windows this would fail because the underlying system
API would stop it from happening (and in zip, it will be changed to an
underscore (_)), but it is unclear what would actually happen if such
a name got through. In the old days just trying to open C:\Con\Con
(which did not exist) caused a BSOD.

>
> What is the safe way to use it?
>

The Security message suggests _with care_, to wit: "Never extract
archives from untrusted sources without prior inspection."
There may be no absolutely safe way if the zipfile was crafted
maliciously, just as there are inherent vulnerabilities in using XML
... (https://docs.python.org/3/library/xml.html#xml-vulnerabilities).
If a zipped file had a tree starting at C:\ and extracting it replaced
a DLL in C:\Windows (with the process running as Admin), it could be a
problem. A lot of caveats, I know.

> I think documenting "this function is unsafe" without suggesting a replacement or a safe way to use it isn't very constructive: as a developer, I want to extract a zip archive, but the only function supposed to do the job tells me "this is unsafe". Ok, so what am I supposed to do to be safe?

Does it say that unzipping a file is unsafe? It looks to me like it
says that in special conditions the extraction of a zipped file tree
may be unsafe and it is important to use caution. It is the case in a
lot of programming, is it not, that there are instances of security
vulnerabilities entering ordinary-looking code? It happens in SQL
(https://xkcd.com/327/) and many places within Python's Standard
Library (https://hackernoon.com/10-common-security-gotchas-in-python-and-how-to-avoid-them-e19fbe265e03),
even with something as innocuous as the new-style string format
(https://lucumr.pocoo.org/2016/12/29/careful-with-str-format/).

>
> That's what documentation should tell me, not let me puzzled with doubt.
>

This is an interesting point. What is the scope of Python Library
Documentation? I disagree with your view on scope. In my view the
Library Documentation should focus on what is exposed in the library
for ordinary use. So, for example, implementation details are not
expected to appear in the Documentation (there is no documentation for
zipfile._extract_member(), for instance). It does have a duty of care,
especially regarding well-known gotchas, but it is _not_ security
documentation. I think (this is my view, it is not god-given) that in
many cases it is fair to assume that telling a developer to be careful
with her code is enough insofar as library documentation is concerned.

Thanks.

----------
title: zipfile.extractall is safe by now? -> zipfile.extractall is safe by now

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue40763>
_______________________________________

