improving a huge double-for cycle

Thu Sep 18 09:30:47 EDT 2008

On Thu, 18 Sep 2008 05:25:02 -0700, Alexzive wrote:

> Hello there :) ,
> 
> I am a python newbie and need to run following code for a task in an
> external simulation programm called "Abaqus" which makes use of python
> to access the mesh (ensamble of nodes with xy coordinates) of a certain
> geometrical model.
> 
> [IN is the starting input containing the nodes to be check, there are
> some double nodes with the same x and y coordinates which need to be
> removed. SN is the output containing such double nodes]
> 
> Code: Select all
>     for i in range(len(IN)): #scan all elements of the list IN
>       for j in range(len(IN)):
>         if i <> j:
>          if IN[i].coordinates[0] == IN[j].coordinates[0]:
>            if IN[i].coordinates[1] == IN[j].coordinates[1]:
>               SN.append(IN[i].label)

Here's a better version of your algorithm, one which avoids the minor 
inefficiencies but keeps the huge inefficiency:

for node1 in IN:
    for node2 in IN:
        if node1 is not node2:
            if node1.coordinates == node2.coordinates:
                SN.append(node1.label)

This assumes that node.coordinates is a type where equality is defined. 
If they are a tuple or list, that should work fine.

But the huge inefficiency is that you are checking each node not once, 
not twice, but 100,000 times! So you have to iterate 10,000,000,000 
times, which is going to be slow no matter what you do. Especially in 
pure Python.

Here's a better idea: iterate over the list once only:

seen = set()
for node in IN:
    coords = tuple(node.coordinates)
    if coords in seen:
        SN.append(node.label)
    else:
        seen.add(coords)

Hope this helps.

-- 
Steven