[Python-Dev] mimetypes broken on Windows

Terry Jan Reedy tjreedy at udel.edu
Tue Apr 16 20:00:53 CEST 2013


On 4/15/2013 10:04 PM, Ben Hoyt wrote:
> Hi folks,
>
> The built-in mimetypes module is broken on Windows, and it has been
> since Python 2.7 alpha 1. On all Windows systems I've tried,
> guess_type() returns the wrong mime type for common types like .png and
> .jpg. For example (on Python 2.7.4 and 3.3.1):
>
>  >>> import mimetypes
>  >>> mimetypes.guess_type('f.png')
> ('image/x-png', None)
>  >>> mimetypes.guess_type('f.jpg')
> ('image/pjpeg', None)
>
> These should be 'image/png' and 'image/jpeg', respectively.
>
> There's an open issue for this: http://bugs.python.org/issue15207.
> However, it hasn't gotten any love in the last few months, so per
> r.david.murray's comment, I'm posting it here.
>
> Dave Chambers, who opened the bug, has proposed a fix, which is
> significantly better (i.e., not totally broken for common types).
> However, as I mentioned in http://bugs.python.org/issue15207#msg177030,
> using the Windows registry for this at all is basically a bad idea, because:

The actual mapping is fixed and more or less system-independent, while
the Windows registry is for volatile, system- and user-dependent mappings.

> 1) Important keys like .jpg and .png aren't in the registry anyway.
> 2) Some that do exist are wrong in the Windows registry. This includes
> .zip, which is "application/x-zip-compressed" (at least in my registry)
> but should be "application/zip".
> 3) It makes the first call to guess_type() slow (~100ms), which isn't
> terrible, but with the above concerns, not worth it.
> 4) Perhaps most importantly: the keys in the Windows registry depend on
> what programs you have installed. And users and programs can change
> registry keys at will.

And change what a given key is mapped to.
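
To see which of the types named above a given machine gets wrong, something
like the following could be used. It is only a sketch: the expected values
are the ones Ben gives in his message, and the helper name find_mismatches
and its parameters are made up for illustration, not part of mimetypes.

```python
import mimetypes

# Expected values as stated in the message above.
EXPECTED = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".zip": "application/zip",
}


def find_mismatches(guess=mimetypes.guess_type, expected=EXPECTED):
    """Return {extension: (actual, expected)} for every type that differs.

    Hypothetical helper; `guess` is any callable with the same return
    shape as mimetypes.guess_type, i.e. a (type, encoding) tuple.
    """
    mismatches = {}
    for ext, want in expected.items():
        actual, _encoding = guess("f" + ext)
        if actual != want:
            mismatches[ext] = (actual, want)
    return mismatches


if __name__ == "__main__":
    # Prints nothing when every listed extension maps as expected.
    for ext, (actual, want) in sorted(find_mismatches().items()):
        print("%s: got %r, expected %r" % (ext, actual, want))
```

What it reports depends entirely on the machine it runs on, which is rather
the point of item 4.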

> Obviously one can work around this bug, either by calling
> mimetypes.init(files=[]) before any calls to guess_type, or calling
> init() with your own mime types file. However, "broken out of the box"
> is going to cause a lot of people headaches. :-)
>
> So my proposal is simply to get rid of read_windows_registry()
> altogether, and fall back to the default type mapping in mimetypes.py on
> Windows systems. This is correct and fast, even if not complete. As

I basically agree, but am not sure what to do about backward-compatibility
considerations. But we do not have to reproduce buggy behavior.

> always, folks can use their own mimetypes file if they want.
>
> In summary: the current behaviour is buggy and broken, the behaviour
> proposed in Issue 15207 is problematic, getting this from the Windows
> registry is a bad idea, and we should revert the whole registry thing. :-)
>
> If folks agree with my reasoning above, I can provide a patch to fix
> this, along with a patch to the Windows unit tests.
>
> -Ben
>
> P.S. Kind of proving my point about the fragility of using the registry,
> the Python 2.7.4 unit test test_registry_parsing in test_mimetypes.py
> fails on my machine. It's because I've installed some SQL server, and
> text/plain in my registry is mapped from .sql (instead of .txt), causing
> this:
>
> Traceback (most recent call last):
>    File "C:\python27\lib\test\test_mimetypes.py", line 85, in
> test_registry_parsing
>      eq(self.db.guess_type("foo.txt"), ("text/plain", None))
> AssertionError: Tuples differ: (None, None) != ('text/plain', None)
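
Until this is settled, an application that needs the common types to come
out right can pin them explicitly. The sketch below is a different
workaround from the mimetypes.init(files=[]) call Ben mentions: it uses
mimetypes.add_type() to overwrite whatever mapping was loaded for each
listed extension. The table and the helper name pin_common_types are
illustrative assumptions, not anything from the issue or the proposed patch.

```python
import mimetypes

# Illustrative table: only the types named in this thread.
PINNED_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".zip": "application/zip",
}


def pin_common_types(pinned=PINNED_TYPES):
    """Hypothetical helper: force well-known types for a few extensions."""
    for ext, mime_type in pinned.items():
        # add_type() replaces any existing mapping for this extension,
        # including one that was read from the registry.
        mimetypes.add_type(mime_type, ext)


pin_common_types()
print(mimetypes.guess_type("f.png"))  # ('image/png', None)
```

This only covers the extensions listed in the table; anything else still
comes from whatever mimetypes loaded at initialisation.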



