improving a huge double-for cycle

km srikrishnamohan at gmail.com
Tue Sep 23 10:02:06 EDT 2008


how abt this ?

N = len(IN)
for k  in range(N):
    for j in range(N):
        if j >= k:                 # or k <= j
            doSomething()

KM
~~~~~~~~~~~~~~~
On Thu, Sep 18, 2008 at 6:27 PM, Tim Chase <python.list at tim.thechases.com>wrote:

> Code: Select all
>>    for i in range(len(IN)): #scan all elements of the list IN
>>      for j in range(len(IN)):
>>        if i <> j:
>>         if IN[i].coordinates[0] == IN[j].coordinates[0]:
>>           if IN[i].coordinates[1] == IN[j].coordinates[1]:
>>              SN.append(IN[i].label)
>>
>>
>> Unfortunately my len(IN) is about 100.000 and the running time about
>> 15h !!!! :(
>>
>> Any idea to improve it?
>>
> [snip]
>
>> I have already tried to group the "if statements" in a single one:
>>
>> Code: Select all
>>    if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
>> if IN[i].coordinates[1] == IN[j].coordinates[1]:
>>
>> but no improvements.
>>
>
> It's like rearranging deck-chairs on the Titanic :)  Yes, it may give a
> speed up, but what's 3 seconds when you're waiting 15hr :)
>
> Not knowing the len(IN[x].coordinates) or their structure, if it's a list
> of len==2, you should be able to just do
>
>  if i <> j and IN[i].coordinates == IN[j].coordinates
>
> or
>
>  if i <> j and IN[i].coordinates[:2] == IN[j].coordinates[:2]
>
> However, again, this is just polish.  The big problem is that you have an
> O(N^2) algorithm that's killing you.
>
> 1) use xrange instead of range to save eating memory with a huge unneeded
> array.
>
> 2) unless you need to append duplicate labels, you know that when I and J
> are swapped, you'll reach the same condition again, so it might be worth
> writing the outer loops to eliminate this scenario, and in the process, but
> just starting at i+1 rather than i, you can forgo the check if "i<>j".
>
> Such changes might look something like
>
>  for i in xrange(len(IN)):
>    for j in xrange(i+1, len(IN)):
>      if IN[i].coordinates == IN[j].coordinates:
>        SN.append(IN[i].label)
>
> If my college algorithms memory serves me sufficiently, this reduces your
> O(N^2) to O(N log N) which will garner you some decent time savings.
>
> -tkc
>
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20080923/f0fe4a68/attachment-0001.html>


More information about the Python-list mailing list