improving a huge double-for cycle

Thu Sep 18 16:03:38 EDT 2008

On Sep 18, 2:25 pm, Alexzive <zasaconsult... at gmail.com> wrote:
> Hello there :) ,
>
> I am a python newbie and need to run following code for a task in an
> external simulation programm called "Abaqus" which makes use of python
> to access the mesh (ensamble of nodes with xy coordinates) of a
> certain geometrical model.
>
> [IN is the starting input containing the nodes to be check, there are
> some double nodes with the same x and y coordinates which need to be
> removed. SN is the output containing such double nodes]
>
> Code: Select all
>     for i in range(len(IN)): #scan all elements of the list IN
>       for j in range(len(IN)):
>         if i <> j:
>          if IN[i].coordinates[0] == IN[j].coordinates[0]:
>            if IN[i].coordinates[1] == IN[j].coordinates[1]:
>               SN.append(IN[i].label)
>
> Unfortunately my len(IN) is about 100.000 and the running time about
> 15h !!!! :(
>
> Any idea to improve it?
>
> I have already tried to group the "if statements" in a single one:
>
> Code: Select all
>     if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
> if IN[i].coordinates[1] == IN[j].coordinates[1]:

A simple O(N) algorithm:

from collections import defaultdict

d = defaultdict(int)
for x in IN:
    d[x] += 1

SN = [x for (x, c) in d.iteritems() if c > 1]

--
Paul Hankin