[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

Tim Golden report at bugs.python.org
Wed Apr 17 14:23:36 CEST 2013


Tim Golden added the comment:

There seems to be a consensus that the current behaviour is undesirable,
indeed "broken" for any meaningful use. 

The critical argument against the current Registry approach is that it
returns unexpected (or outright incorrect) mimetypes for very standard
extensions.

The arguments against reading the Registry at all are:

* That it requires some extra level of privilege to read the appropriate
keys.

* That there is a startup cost to reading the Registry.

* That it can be and is updated by arbitrary programs (typically during
installation) and therefore its values cannot be relied upon.


We have 3.5 proposals on the table:

1) Don't read the registry at all, i.e. revert issue4969 (this is what Ben
Hoyt is advocating). [noregistry]

2) Read the registry *before* reading the standard types (this is not
strongly advocated by anyone).

3) Read the registry but in a different way, mapping from extension to
mimetype rather than vice versa. (This is Dave Chambers' patch from
issue15207). [newregistry]

3a) Look up as per (3) but only on demand. This eliminates any startup cost.
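
A minimal sketch of (3) and (3a) together, assuming the mimetype lives in the "Content Type" value under HKEY_CLASSES_ROOT\<extension>. The function names are hypothetical, the small BUILTIN_TYPES table only stands in for the module's standard types, and letting the built-in table win is just one possible precedence, not something settled in this issue:

```python
import os

try:
    import winreg  # only importable on Windows
except ImportError:
    winreg = None

# Stand-in for the standard types table (hypothetical, heavily abbreviated).
BUILTIN_TYPES = {".txt": "text/plain", ".zip": "application/zip"}


def registry_content_type(ext):
    """Look up a single extension, e.g. '.avi', directly in the registry."""
    if winreg is None:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, ext) as key:
            value, value_type = winreg.QueryValueEx(key, "Content Type")
    except OSError:
        # No such key, no "Content Type" value, or access denied.
        return None
    return value if value_type == winreg.REG_SZ else None


def guess_type_on_demand(filename, registry_lookup=registry_content_type):
    """Consult the built-in table first; touch the registry only on a miss."""
    ext = os.path.splitext(filename)[1].lower()
    if not ext:
        return None
    if ext in BUILTIN_TYPES:
        return BUILTIN_TYPES[ext]
    return registry_lookup(ext)
```

Nothing is read from the registry at import time, so the startup cost disappears; the cost moves to the first lookup of each unknown extension. The registry_lookup parameter exists only so the function can be exercised without a registry.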

I've produced three output files from a simple dump of the mimetypes database. For the purposes of taking this forward, we're really comparing the noregistry and the newregistry variants.

One key issue is what to do when the same key occurs in both sets but with a different value. (Examples include .avi -> video/x-msvideo vs video/avi; and .zip -> application/zip vs application/x-zip-compressed).
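
For anyone who wants to list such clashes from their own dumps, a rough sketch follows. It assumes each dump has been loaded into a dict mapping extension to mimetype; the function name and the sample data are illustrative only (the two clashing pairs are the examples above, not the contents of the actual output files):

```python
def conflicting_types(noregistry, newregistry):
    """Return {ext: (noregistry_type, newregistry_type)} for every extension
    present in both mappings with differing values."""
    shared = noregistry.keys() & newregistry.keys()
    return {
        ext: (noregistry[ext], newregistry[ext])
        for ext in sorted(shared)
        if noregistry[ext] != newregistry[ext]
    }


# Illustrative sample data only.
noregistry = {
    ".avi": "video/x-msvideo",
    ".zip": "application/zip",
    ".txt": "text/plain",
}
newregistry = {
    ".avi": "video/avi",
    ".zip": "application/x-zip-compressed",
    ".txt": "text/plain",
}

for ext, (standard, registry) in conflicting_types(noregistry, newregistry).items():
    print(f"{ext}: {standard} vs {registry}")
```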

And the other key issue is whether the overheads (security, speed) of using the registry outweigh its usefulness in any case.

Could I ask those who would remove the registry use altogether to comment on the newregistry output (generating your own if it helps) to see whether it changes your views at all?

Either approach -- no-registry or new-registry -- is feasible and the code churn is about equal. I'm unsure about compatibility issues: it's hard to see anyone relying on the incorrect mimetypes, but it's certainly possible to see someone relying on the additional (correct) mimetypes.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15207>
_______________________________________

