Refactoring; arbitrary expression in lists

Bengt Richter bokr at oz.net
Thu Jan 13 00:18:57 EST 2005


On Thu, 13 Jan 2005 12:19:06 +1000, Stephen Thorne <stephen.thorne at gmail.com> wrote:

>On Thu, 13 Jan 2005 01:24:29 GMT, Bengt Richter <bokr at oz.net> wrote:
>>     extensiondict = dict(
>>         php = 'application/x-php',
>>         cpp = 'text/x-c-src',
>>         # etcetera
>>         xsl = 'text/xsl'
>>     )
>> 
>>     def detectMimeType(filename):
>>         extension = os.path.splitext(filename)[1].replace('.', '')
           extension = os.path.splitext(filename)[1].replace('.', '').lower() # better

>>         try: return extensiondict[extension]
>>         except KeyError:
>>             basename = os.path.basename(filename)
>>             if "Makefile" in basename: return 'text/x-makefile' # XXX case sensitivity?
>>             raise NoMimeError
>
>Why not use a regexp based approach?
ISTM the dict setup closely reflects the OP's if/elif tests and makes for an efficient substitute
for the functionality when later used for lookup. The regex list is O(n) and the regexes themselves
are at least that, so I don't see a benefit. If you are going to loop through extensionlist, you
might as well write (untested)

     flowerew = filename.lower().endswith
     for ext, mimetype in extensionlist:
         if flowerew(ext): return mimetype
     else:
         if 'makefile' in filename.lower(): return 'text/x-makefile'
     raise NoMimeError

using a lower case extension list including the dot. I think it would run faster
than a regex, and not scare anyone unnecessarily ;-)
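
For anyone who wants to run the loop variant, here is a minimal self-contained sketch. The function name detect_mime_type_by_suffix, the NoMimeError definition, and the '.c' entry are assumptions made for illustration; the other values come from the code quoted in this thread.

```python
class NoMimeError(Exception):
    """Raised when no MIME type can be determined (definition assumed)."""


# Lower-case extensions including the dot, as described above.
# The entries are illustrative, not a complete table.
extensionlist = [
    ('.php', 'application/x-php'),
    ('.cpp', 'text/x-c-src'),
    ('.c', 'text/x-c-src'),
]


def detect_mime_type_by_suffix(filename):
    """Linear scan: compare the lower-cased name against each suffix."""
    lowered = filename.lower()
    for ext, mimetype in extensionlist:
        if lowered.endswith(ext):
            return mimetype
    # No suffix matched; fall back to the makefile check.
    if 'makefile' in lowered:
        return 'text/x-makefile'
    raise NoMimeError(filename)
```

The scan is still O(n) in the number of entries, but each step is a plain string comparison rather than a regex match.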

The dict eliminates the loop, and is easy to understand, so IMO it's a better choice.
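
A runnable sketch of the dict version, for comparison. It follows the quoted code, with the .lower() improvement applied; the NoMimeError definition and the 'c' entry are assumptions added so the example stands alone.

```python
import os


class NoMimeError(Exception):
    """Raised when no MIME type can be determined (definition assumed)."""


# Keys are lower-case extensions without the dot.
# The 'c' entry is an illustrative addition.
extensiondict = dict(
    php='application/x-php',
    cpp='text/x-c-src',
    c='text/x-c-src',
)


def detectMimeType(filename):
    # Single hash lookup on the normalized extension, no loop.
    extension = os.path.splitext(filename)[1].replace('.', '').lower()
    try:
        return extensiondict[extension]
    except KeyError:
        basename = os.path.basename(filename)
        if 'Makefile' in basename:  # case-sensitive, as in the quoted code
            return 'text/x-makefile'
        raise NoMimeError(filename)
```

The lookup cost does not grow with the number of extensions, which is the point of preferring the dict over either loop.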

>extensionlist = [
>(re.compile(r'.*\.php') , "application/x-crap-language"),
>(re.compile(r'.*\.(cpp|c)') , 'text/x-c-src'),
>(re.compile(r'[Mm]akefile') , 'text/x-makefile'),
>]
>for regexp, mimetype in extensionlist:
>  if regexp.match(filename):
>     return mimetype
>
>if you were really concerned about efficiency, you could use something like:
>class SimpleMatch:
>  def __init__(self, pattern): self.pattern = pattern
>  def match(self, subject): return subject[-len(self.pattern):] == self.pattern

I'm not clear on what you are doing here, but if you think you are going to compete
with the timbot's dict efficiency with a casual few lines, I suspect you are PUI ;-)
(Posting Under the Influence ;-)
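
Read literally, the quoted SimpleMatch.match compares the tail of subject against pattern, which for a non-empty pattern gives the same answer as str.endswith. A small check of that reading, with made-up sample strings:

```python
class SimpleMatch:
    """The class as quoted above, reformatted only."""

    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, subject):
        return subject[-len(self.pattern):] == self.pattern


# Sample (pattern, subject) pairs are invented for illustration.
samples = [
    ('.php', 'index.php'),
    ('.php', 'index.html'),
    ('.cpp', 'a.c'),        # subject shorter than the pattern
    ('Makefile', 'Makefile'),
]

# For every non-empty pattern the two approaches agree.
agreement = [
    SimpleMatch(pattern).match(subject) == subject.endswith(pattern)
    for pattern, subject in samples
]
```

The two differ only for an empty pattern, where subject[-0:] is the whole string, so match() is true only for an empty subject while endswith('') is always true. Either way it is still one comparison per list entry, so the loop remains.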

Regards,
Bengt Richter


