[issue38656] mimetypes for python 3.7.5 fails to detect matroska video

David K. Hess report at bugs.python.org
Mon Nov 18 08:23:28 EST 2019


David K. Hess <david_k_hess at mac.com> added the comment:

Hi, I'm the author of the commit that's been fingered. Some comments about the behavior being reported....

First, as pointed out by @xtreak, the mimetypes module does use mime types files present on the platform to add to its built-in list of mime types. In this case, "video/x-matroska" and ".mkv" are not found in the mimetypes module and were never there - they are coming from the host OS.

Also, for better or worse, the mimetypes module has an internal "init" function that does more than just instantiate a MimeTypes instance for default use:

https://github.com/python/cpython/blob/5c0c325453a175350e3c18ebb10cc10c37f9595c/Lib/mimetypes.py#L345

It also loads these system files (and Windows Registry entries on Win32) into a fresh MimeTypes instance. So, addressing what @The Compiler is seeing, properly resetting the mimetypes module really involves calling mimetypes.init(). By historical design, instantiating the MimeTypes class directly will not use host OS system mime type files.
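To illustrate the difference described above, here is a minimal sketch comparing a directly constructed MimeTypes instance with the module-level database after mimetypes.init(). The file name "movie.mkv" is only an example, and the results depend on the Python version and on which mime types files exist on the host, so no particular output is claimed:

```python
import mimetypes

# A directly constructed instance is not fed the host OS mime types files
# (on Python versions that include the fix discussed in this issue).
bare = mimetypes.MimeTypes()

# init() rebuilds the module-level database and also reads the files
# listed in mimetypes.knownfiles (plus the Windows Registry on Win32).
mimetypes.init()

name = "movie.mkv"  # example file name, chosen for illustration
bare_guess = bare.guess_type(name)
module_guess = mimetypes.guess_type(name)

# The two results may differ when ".mkv" is only defined by the host OS.
print("bare instance :", bare_guess)
print("module level  :", module_guess)
```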

As to why this commit is causing a change in the observed behavior, the problem that was corrected in this commit was that the mimetypes module had non-deterministic behavior related to initialization. In the original init code, the module-level mime types tables were changed (really corrupted) after first load and you could never reinitialize the module back to a known good state (i.e. to the original module defaults without information from the host OS system).

So, realistically, the behavior currently observed is the correct behavior given the presence and historical nature of the init function. The fact that a fresh MimeTypes instance, without having been init()'d and with no filenames provided, was returning an OS entry prior to this commit is really part of the initialization bug which was fixed.

Regarding the ranger bug, the main thing is you should not use a MimeTypes instance directly unless you run it through the same initializations that the init code does.
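As a rough sketch of what "the same initializations" could look like for code that wants its own instance, the helper below mirrors the steps init() performs: read the Windows Registry on Win32, then read each existing file from mimetypes.knownfiles. The name make_initialized_mimetypes and its extra_files parameter are hypothetical and only for illustration; this is not taken from ranger or from the commit itself.

```python
import mimetypes
import os
import sys


def make_initialized_mimetypes(extra_files=()):
    """Hypothetical helper: build a MimeTypes instance loaded like init()."""
    db = mimetypes.MimeTypes()

    # On Windows, init() also pulls in registry entries.
    if sys.platform == "win32":
        db.read_windows_registry()

    # Read the well-known system files, skipping any that do not exist.
    for path in list(mimetypes.knownfiles) + list(extra_files):
        if os.path.isfile(path):
            db.read(path)

    return db


db = make_initialized_mimetypes()
# Whether ".mkv" resolves depends on the host OS mime types files.
print(db.guess_type("movie.mkv"))
```

Passing the file names to the MimeTypes constructor is another option, since it accepts a sequence of filenames to read.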

Anyway, that's my perspective having waded through all of that during the original BPO. I don't claim it's the correct one but that's where we are at.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue38656>
_______________________________________

