mimetypes.guess_type broken in windows on py2.7 and python 3.X

Gelonida N gelonida at gmail.com
Wed Sep 26 04:54:34 EDT 2012


Hi,

I'm still experiencing the pleasure of migrating legacy code from Python 
2.6 to 2.7, which I expected to be much less painful.
(In fact, migration on Linux is rather smooth, but Windows is another story.)


Let's take the simple command

import mimetypes
print mimetypes.guess_type('a.jpg')


The result on old Pythons (< 2.7)
is ('image/jpeg', None)

The result on non-Windows platforms
for Python 2.7 / 3.x is the same.

However, the result for 2.7 / 3.x on Windows is now
('image/pjpeg', None)  # pjpeg instead of jpeg

On Windows, many file suffixes will report wrong MIME types.

The problem has been known for about two years:
http://bugs.python.org/issue10551


The main reason is that under Windows the default values are
taken from Python first, and then 'updated' MIME types are
fetched from the Windows registry.
The major problem is that almost ALL Windows PCs have BROKEN MIME 
types in the registry, so the good predefined MIME types are simply 
replaced with broken MS MIME types.
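
To illustrate the override mechanism, here is a minimal sketch. It does not touch the registry at all: the add_type() call merely stands in for an entry loaded later from an external source, and the 'image/pjpeg' value is the one reported above. A private MimeTypes instance is used so that the module-level state is left alone.

```python
import mimetypes

# A private database, so the module-level tables are not modified.
db = mimetypes.MimeTypes()

# What the database reports before anything is overridden.
before = db.guess_type('a.jpg')

# Assumption: this call stands in for a value loaded from the Windows
# registry after the built-in defaults were already in place.
db.add_type('image/pjpeg', '.jpg')

after = db.guess_type('a.jpg')

print(before)
print(after)   # ('image/pjpeg', None): the entry added last wins
```

Whatever is registered last for a suffix is what guess_type() returns, which is why the loading order matters so much here.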


I wonder how many applications that try to migrate to 2.7 / 3.x 
will fail due to this incompatibility in the mimetypes library.


There is a workaround (but first people have to detect the problem and 
then find the workaround).
Add these two lines somewhere in your code BEFORE any other imported 
library might have called a mimetypes function:

import mimetypes
mimetypes.init([])
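
As a complementary, defensive sketch (not a replacement for the two lines above), the types an application depends on can also be pinned explicitly with mimetypes.add_type(), which overrides whatever was loaded earlier for that suffix. The names PINNED_TYPES and pin_mime_types are hypothetical, and the list of suffixes is only an example; this is an assumption about what a given application needs, not a complete table.

```python
import mimetypes

# Hypothetical pin list: extend it with the suffixes your application
# actually relies on.
PINNED_TYPES = {
    '.jpg': 'image/jpeg',
    '.jpeg': 'image/jpeg',
    '.png': 'image/png',
}


def pin_mime_types(pinned=PINNED_TYPES):
    """Re-register known-good types so they win over earlier entries."""
    for ext, mime in pinned.items():
        mimetypes.add_type(mime, ext)


pin_mime_types()
print(mimetypes.guess_type('a.jpg'))   # ('image/jpeg', None)
```

Note that the pins only last until something calls mimetypes.init() again, so this has to run after any such call.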

I still wonder if it wouldn't be better to have the default behaviour of 
2.7 / 3.x on Windows such that users who are not aware of this 
issue do not have their code broken.

My suggestion for Windows would be to have the following default behaviour:

- 1st read the mimetypes from the registry if possible
- 2nd read the Python default mimetypes and override the
    'broken' MS definitions

Only if a user explicitly calls mimetypes.init() would they get 
different behaviour.
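
The ordering suggested above can be sketched with plain dictionaries. Everything here is illustrative: PYTHON_DEFAULTS is a tiny stand-in for Python's built-in table, REGISTRY_TYPES contains assumed example values for what a registry might report, and merge_types is a hypothetical helper, not anything the mimetypes module provides.

```python
# Hypothetical data: a tiny stand-in for Python's built-in table ...
PYTHON_DEFAULTS = {
    '.jpg': 'image/jpeg',
    '.png': 'image/png',
}

# ... and assumed example values for what a Windows registry might report.
REGISTRY_TYPES = {
    '.jpg': 'image/pjpeg',
    '.png': 'image/x-png',
    '.xyz': 'application/x-xyz',
}


def merge_types(registry, defaults):
    """Load registry entries first, then let the defaults override them."""
    merged = dict(registry)    # 1st: whatever the registry provides
    merged.update(defaults)    # 2nd: Python defaults win on conflicts
    return merged


merged = merge_types(REGISTRY_TYPES, PYTHON_DEFAULTS)
print(merged['.jpg'])   # image/jpeg
print(merged['.xyz'])   # application/x-xyz
```

With this order the registry can still contribute suffixes Python does not know about, but it can no longer replace a predefined type.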

The new behaviour breaks portability of Python code between Windows and 
Linux, and I think the aim should be to be as cross-platform as 
possible. At least one of the reasons why I use Python 
is that it allows code to be rather cross-platform.

An alternative suggestion could be to never read the registry or 
/etc/mime.types by default.

What would definitely be rather important is to add a big warning to the 
documentation and a recommendation on how to write cross-platform 
compatible code.

Somebody developing on Linux might not even know that the code will not 
work on Windows just because of this tiny issue.


The unfortunate fact that this issue was not fixed two years ago means 
that some code may meanwhile be out there that relies on the current 
behaviour. However, I'm not sure that anybody relies on the fact that 
code will not work the same way on Windows and on Linux.

Any thoughts?