[Numpy-discussion] Help using numPy to create a very large multi dimensional array

Bruno Santos bacmsantos at gmail.com
Wed Apr 18 07:04:37 EDT 2007


Finally I was able to read the data, using the command you suggested with
some small changes:
matrix = numpy.array([[float(x) for x in line.split()[1:]] for line in vecfile])
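
For reference, a minimal sketch of that read step written out as a function. It assumes the file layout shown in the quoted message below (a header line starting with '#', then one file name followed by whitespace-separated floats per line). The name load_matrix is only illustrative, and the sample values in the usage note are made up:

```python
import numpy

def load_matrix(lines):
    """Build a 2-D float array from an iterable of text lines.

    Skips blank lines and '#' header lines, and drops the first
    token of each row (the file name).
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        # split() with no argument tolerates repeated spaces and tabs
        rows.append([float(x) for x in line.split()[1:]])
    return numpy.array(rows)

# Made-up sample in the same layout as the real file
sample = [
    "# num rows=2 num columns=3\n",
    "a.txt 0.5 1.0 1.5\n",
    "b.txt 2.0 2.5 3.0\n",
]
matrix = load_matrix(sample)
```

With a real file this would be called as load_matrix(open('name of file')), since a file object iterates over its lines.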

But that doesn't solve my speed problem: the slow step used to take 40
seconds and now takes 1 min 10 seconds :(
The slow step is this loop:
for j in range(0, clust):
    list_j = numpy.asarray(matrix[j])
    for k in range(j+1, clust):
        list_k = numpy.asarray(matrix[k])
        dist = 0
        for e in range(0, columns):
            result = list_j[e] - list_k[e]
            dist += result * result
        if (dist < min):
            ind[0] = j
            ind[1] = k
            min = dist

I also tried list_j = numpy.array, but that only slowed the calculation down
even more.
Does anyone have any idea how I can speed up this step?
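
One possible direction, as a sketch rather than a tested answer to the question: the innermost loop over columns, and the loop over k, can be replaced by whole-array operations, so that only the loop over j runs in Python. The function name closest_pair is only illustrative; it assumes matrix is a 2-D float array with one row per cluster, and it keeps the same "first strictly smaller distance wins" behaviour as the loop above:

```python
import numpy

def closest_pair(matrix):
    """Return ([j, k], squared_distance) for the two closest rows."""
    matrix = numpy.asarray(matrix, dtype=float)
    clust = matrix.shape[0]
    best = numpy.inf
    ind = [-1, -1]
    for j in range(clust - 1):
        # Differences between row j and every later row, all at once
        diff = matrix[j + 1:] - matrix[j]
        # Squared Euclidean distance to each later row
        dist = (diff * diff).sum(axis=1)
        k = int(dist.argmin())
        if dist[k] < best:
            best = dist[k]
            ind = [j, j + 1 + k]
    return ind, best

# Small made-up example: rows 0 and 2 are the closest pair
ind, best = closest_pair([[0.0, 0.0], [3.0, 4.0], [0.5, 0.0], [10.0, 10.0]])
```

The subtraction matrix[j + 1:] - matrix[j] relies on broadcasting: the single row is subtracted from every remaining row without an explicit loop.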


2007/4/18, Christian K. <ckkart at hoc.net>:
>
> Bruno Santos wrote:
> > I tried to use the expression as you said, but I'm not getting the desired
> > result.
> > My text file looks like this:
> >
> > # num rows=115 num columns=2634
> > AbassiM.txt 0.033023 0.033023 0.033023 0.165115 0.462321....0.000000
> > AgricoleW.txt 0.038691 0.038691 0.038691 0.232147 0.541676....0.215300
> > AliR.txt 0.041885 0.041885 0.041885 0.125656 0.586395....0.633580
> > .....
> > ....
> > ....
> > ZhangJ.txt 0.047189 0.047189 0.047189 0.155048 0.613452....0.000000
>
> I guess N.fromfile can't handle non-numeric data. Use something like
> this instead (not tested):
>
> import numpy as N
>
> data = open('name of file').readlines()
>
> data = N.array([[float(x) for x in row.split(' ')[1:]] for row in
> data[1:]])
>
> (the above expression should be one line)
>
> Christian