improving a huge double-for cycle
Jake Anderson
jake at vapourforge.com
Fri Sep 19 10:30:53 EDT 2008
Bruno Desthuilliers wrote:
> Alexzive a écrit :
>> Hello there :) ,
>>
>> I am a python newbie and need to run following code for a task in an
>> external simulation programm called "Abaqus" which makes use of python
>> to access the mesh (ensamble of nodes with xy coordinates) of a
>> certain geometrical model.
>>
>> [IN is the starting input containing the nodes to be check, there are
>> some double nodes with the same x and y coordinates which need to be
>> removed. SN is the output containing such double nodes]
>>
>> Code: Select all
>> for i in range(len(IN)): #scan all elements of the list IN
>> for j in range(len(IN)):
>> if i <> j:
>> if IN[i].coordinates[0] == IN[j].coordinates[0]:
>> if IN[i].coordinates[1] == IN[j].coordinates[1]:
>> SN.append(IN[i].label)
>>
>>
>>
>> Unfortunately my len(IN) is about 100.000 and the running time about
>> 15h !!!! :(
>>
>> Any idea to improve it?
>
> A couple ones have been submitted. Harald gets a point about the
> redundant tests (even if his solution seems to be broken, cf below) -
> your inner loop should have looked like :
>
> for j in xrange(i+1, len(IN))
>
> Now the obvious winner is pruebono - even unoptimized, using sets seems
> to be *way* faster than even the most optimized corrected version of
> your algorithm.
>
> Here's a quick bench - please everyone doublecheck it to make sure it's ok:
>
<snip code>
> Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):
>
> >>> test_results()
> True
> >>> test_times()
> doubles0 : 1.55667901039
> doubles1 : 0.719144105911
> doubles2 : 0.703393936157
> doubles3 : 0.700654983521
> doubles4 : 0.706257104874
> doubles5 : 0.528184890747
> doubles6 : 0.461633205414
> doubles8 : 0.0134379863739
> doubles9 : 0.0108540058136
> >>>
>
> Not surprisingly, half less iterations makes for half less time.
> Aliasing, as often, proves to be a good optimization too. But obviously,
> using the correct data structure / algorithm combo is the key : simpler
> code, and 115 times faster (143 times with aliasing). If pruebono
> solution's is correct (and it as AFAICT), your 15 hours computation
> should by now take less than 10 minutes...
>
>
Ubuntu 8.04 core2 2.6(i think)
without psycho
doubles0 : 0.610555171967
doubles1 : 0.29314494133
doubles2 : 0.286273956299
doubles3 : 0.281984090805
doubles4 : 0.28240609169
doubles5 : 0.207377910614
doubles6 : 0.156388044357
doubles8 : 0.00533080101013
doubles9 : 0.00458884239197
with psycho
doubles0 : 0.127684116364
doubles1 : 0.069571018219
doubles2 : 0.064826965332
doubles3 : 0.0702300071716
doubles4 : 0.0647261142731
doubles5 : 0.0522589683533
doubles6 : 0.0437579154968
doubles8 : 0.00190806388855
doubles9 : 0.00214099884033
On this small test its a variance between ~6x to 2X still its basically
free so why not ;->
More information about the Python-list
mailing list