[Python-Dev] Remove tempfile.mktemp()

Nathaniel Smith njs at pobox.com
Tue Mar 19 19:33:37 EDT 2019


On Tue, Mar 19, 2019 at 3:43 PM Giampaolo Rodola' <g.rodola at gmail.com> wrote:
>
> On Tue, Mar 19, 2019 at 6:21 PM Guido van Rossum <guido at python.org> wrote:
>
> >> On Tue, Mar 19, 2019 at 10:14 AM Giampaolo Rodola' <g.rodola at gmail.com> wrote:
> >> Technically you cannot make it 100% safe, only less likely to occur.
> > That seems unnecessarily pedantic. A cryptographic random generator, like the one in the secrets module, *is* safe enough for all practical purposes (with a large margin).
> > [...]
> > Hm, the random sequence (implemented in tempfile._RandomNameSequence) is currently derived from the random module, which is not cryptographically secure. Maybe all we need to do is replace its source of randomness with one derived from the secrets module. That seems a one-line change.
>
> Seems to make sense to me. Currently the random string is 8 chars
> long, meaning (if I'm not mistaken) max 40320 possible combinations:
>
>     >>> len(list(itertools.permutations('wuoa736m'))))
>     40320

It's 8 chars taken from a-z0-9_, so there are 37**8 ~= 10**12 possible
names, or ~41 bits.

> We may use 9 chars and get to 362880 and increase "security".
> Regardless, it seems to me the original deprecation warning for
> mktemp() was not really justified and should be removed.

Historically, mktemp variants have caused *tons* of serious security
vulnerabilities. It's not a theoretical issue. See e.g.
https://www.owasp.org/index.php/Insecure_Temporary_File

I believe that if we used sufficient unguessable randomness to
generate the name, then that would resolve the issues and let us
un-deprecate it. Though like Brett I would like to double-check this
with security specialists :-).

This would also simplify the implementation a *lot*, down to just:

def mktemp(suffix='', prefix='tmp', dir=None):
    if dir is None:
        dir = gettempdir()
    return _os.path.join(dir, prefix +
secrets.token_urlsafe(ENTROPY_BYTES) + suffix)

The choice of ENTROPY_BYTES is an interesting question. 16 (= 128
bits) would be a nice "obviously safe" number, and gives 22-byte
filenames. We might be able to get away with fewer, if we had a
plausible cost model for the attack. This is another point where a
security specialist might be helpful :-).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


More information about the Python-Dev mailing list