help with making my code more efficient

Larry.Martell at gmail.com Larry.Martell at gmail.com
Thu Dec 20 19:19:03 EST 2012


I have a list of tuples, each containing a tool_id, a time, and a message. I want to select from this list all the elements whose message matches some string, plus all the other elements whose time is within some time difference (tdiff) of any matching message for the same tool.

Here is how I am currently doing this:

# record time for each message matching the specified message for each tool
messageTimes = {}
for row in cdata:   # tool, time, message
    if self.message in row[2]:
        messageTimes[row[0], row[1]] = 1

# now pull out each message that is within the time diff for each matched message
# as well as the matched messages themselves

def determine(tup):
    if self.message in tup[2]:
        return True      # matched message

    for (tool, date_time) in messageTimes:
        if tool == tup[0]:
            if abs(date_time - tup[1]) <= tdiff:
                return True

    return False
        
cdata[:] = [tup for tup in cdata if determine(tup)]

This code works, but it takes far too long to run: when cdata has 600,000 elements (which is typical for my app), it takes 2 hours.
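
The cost appears to come from determine() walking every entry of messageTimes for every row, so the work grows roughly as (number of rows) x (number of matched messages). One possible direction, as a sketch only: group the matching times by tool, sort each group once, and use a binary search to check whether any matching time falls in the window [time - tdiff, time + tdiff]. The name filter_rows and its local names are made up for illustration, and the sketch assumes the times can be sorted and combined with tdiff using + and - (plain numbers, or datetime with timedelta).

```python
from bisect import bisect_left
from collections import defaultdict


def filter_rows(cdata, message, tdiff):
    """Keep rows matching `message`, plus rows within `tdiff` of a match for the same tool.

    Hypothetical helper: rows are assumed to be (tool, time, message) tuples.
    """
    # Collect the times of matching messages per tool, then sort each list once.
    times_by_tool = defaultdict(list)
    for tool, time, msg in cdata:
        if message in msg:
            times_by_tool[tool].append(time)
    for times in times_by_tool.values():
        times.sort()

    def keep(row):
        tool, time, msg = row
        if message in msg:
            return True  # matched message
        times = times_by_tool.get(tool)
        if not times:
            return False  # no matching message for this tool
        # Index of the first match time >= time - tdiff. If that time is also
        # <= time + tdiff, it lies inside the window; otherwise nothing does.
        i = bisect_left(times, time - tdiff)
        return i < len(times) and times[i] <= time + tdiff

    return [row for row in cdata if keep(row)]
```

With something like this, the final line of the current code would become cdata[:] = filter_rows(cdata, self.message, tdiff). Each row then costs one binary search over its own tool's matches instead of a scan over all matches.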

Can anyone give me some suggestions on speeding this up?

TIA!


