Convert month name to month number faster

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Wed Jan 6 07:48:32 EST 2010


On Wed, 06 Jan 2010 12:03:36 +0100, wiso wrote:

> I'm optimizing the inner most loop of my script. I need to convert month
> name to month number. I'm using python 2.6 on linux x64.

According to your own figures below, it takes about a microsecond per 
lookup, at worst, even using a remarkably inefficient technique. Are you 
trying to tell us that this is the bottleneck in your script? I'm sorry, 
I find that implausible. I think you're wasting your time trying to 
optimise something that doesn't need optimising.

Even if you halve the time, and deal with a million data points each time 
you run your script, you will save at most half a second per run. I can see 
from the times you posted that you've spent at least an hour trying to 
optimise this. To make up for that one hour, you will need to run your 
script 7200 times before you see *any* time savings at all.
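
To spell out that arithmetic, here is a rough sketch. The one-hour figure 
is my estimate, the 1.0 second is the slower of your two timings, and the 
variable names are purely illustrative.

```python
# Break-even estimate; all inputs are assumptions taken from the post.
slow_seconds_per_run = 1.0              # to_if timing for 1,000,000 lookups
saved_per_run = slow_seconds_per_run / 2   # best case: the time is halved
seconds_spent_optimising = 60 * 60      # assumed: one hour of effort

# Number of runs needed before the saved time repays the effort.
runs_to_break_even = seconds_spent_optimising / saved_per_run
print("runs to break even: %d" % runs_to_break_even)
```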


> month_dict = {"Jan":1,"Feb":2,"Mar":3,"Apr":4, "May":5, "Jun":6,
> 	   "Jul":7,"Aug":8,"Sep":9,"Oct":10,"Nov":11,"Dec":12}
> 
> def to_dict(name):
>   return month_dict[name]

This leads to a pointless function call. Just use month_dict[name] 
directly instead of calling a function that wraps it.
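
As a minimal sketch of what I mean, assuming the names arrive in a list 
(the sample list here is made up), either of these avoids the extra 
Python-level function call:

```python
month_dict = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
              "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

names = ["Jan", "Dec", "Mar"]  # hypothetical sample input

# Index the dict directly inside a list comprehension.
numbers = [month_dict[name] for name in names]

# Or, if you prefer map, pass the dict's own lookup method
# rather than a wrapper function written in Python.
numbers_via_map = list(map(month_dict.__getitem__, names))

print(numbers)
```

An unknown name raises KeyError in both forms, which plays the same role 
as the ValueError in your if-chain.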



> def to_if(name):
>     if name == "Jan": return 1
>     elif name == "Feb": return 2
>     elif name == "Mar": return 3
>     elif name == "Apr": return 4
>     elif name == "May": return 5
>     elif name == "Jun": return 6
>     elif name == "Jul": return 7
>     elif name == "Aug": return 8
>     elif name == "Sep": return 9
>     elif name == "Oct": return 10
>     elif name == "Nov": return 11
>     elif name == "Dec": return 12
>     else: raise ValueError

That is remarkably awful.

 
> import random
> l = [random.choice(month_dict.keys()) for _ in range(1000000)]
> 
> from time import time
> t = time(); xxx=map(to_dict,l); print time() - t # 0.5 
> t = time(); xxx=map(to_if,l); print time() - t   # 1.0

This is not a reliable way to do timings. You should use the timeit 
module.
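
Something along these lines would do, as a sketch rather than a 
definitive benchmark. The list size, the repeat counts and the helper 
names run_wrapper and run_direct are my own choices, and I have written 
it so that it also runs under Python 3.

```python
import random
import timeit

month_dict = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
              "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

def to_dict(name):
    return month_dict[name]

# Smaller sample than the original million, to keep each repeat short.
month_names = list(month_dict)
names = [random.choice(month_names) for _ in range(100000)]

def run_wrapper():
    # Lookup through the wrapper function.
    return [to_dict(n) for n in names]

def run_direct():
    # Lookup straight from the dict.
    return [month_dict[n] for n in names]

# timeit.repeat runs each callable `number` times per repeat; taking the
# minimum reduces the influence of other processes on the machine.
wrapper_time = min(timeit.repeat(run_wrapper, number=5, repeat=3))
direct_time = min(timeit.repeat(run_direct, number=5, repeat=3))

print("wrapper: %.4f s" % wrapper_time)
print("direct:  %.4f s" % direct_time)
```

Compare the minimum of several repeats, not a single run measured with 
time().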



> is there a faster solution? Maybe something with str.translate?

What makes you think str.translate is even remotely useful for this?




-- 
Steven


