[SciPy-Dev] scipy improve performance by parallelizing

Sai Rajeshwar rajsai24 at gmail.com
Thu Jul 24 13:34:24 EDT 2014


Hi David,

I tried what you suggested:
----------------------------------------------------------------
for i in xrange(pooled_shape[1]):
    for j in xrange(pooled_shape[2]):
        for k in xrange(pooled_shape[3]):
            for l in xrange(pooled_shape[4]):
                pooled[0][i][j][k][l] = math.tanh((numpy.sum(conv_out[0][i][j][k*3][l*3:(l+1)*3])
                    + numpy.sum(conv_out[0][i][j][k*3+1][l*3:(l+1)*3])
                    + numpy.sum(conv_out[0][i][j][k*3+2][l*3:(l+1)*3])) / 9.0 + b[i][j])


You should get a speed up by accessing the arrays in a more efficient way:

pooled[0, i, j, k, l] = math.tanh((numpy.sum(conv_out[0, i, j, k*3,
l*3:(l+1)*3]) + numpy.sum(conv_out[0, i, j, k*3+1, l*3:(l+1)*3]) +
numpy.sum(conv_out[0, i, j, k*3+2, l*3:(l+1)*3]))/9.0+b[i, j])

In fact:

numpy.sum(conv_out[0, i, j, k*3, l*3:(l+1)*3]) + numpy.sum(conv_out[0, i,
j, k*3+1, l*3:(l+1)*3])

seems equivalent to:

numpy.sum(conv_out[0, i, j, k*3: k*3 +1, l*3:(l+1)*3])

To take the last one into account:

vec = numpy.sum(conv_out[0, i, j, k*3: k*3 + 2, l*3:(l+1)*3], axis=-1)
pooled[0, i, j, k, l] = vec[0] + vec[1] + vec[2] / 9.0

And you can probably get rid of the i and j indexes all together. Something
like this should work (untested):

for k in...
    for l in...
        output = numpy.sum(conv_out[0, :, :, k*3: k*3 +1, l*3:(l+1)*3]), axis=-1)
        output += numpy.sum(conv_out[0, :, :, k*3 + 2, l*3 : (l+1)*3])), axis=-1)/9.0
        output += b
        pooled[0, :, :, k, l] = numpy.tanh(output)
-----------------------------------------------------------------
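
The last suggestion above (dropping the i and j loops) can be sketched as
follows. This is only an illustration under stated assumptions: the array
shapes and the names I, J, K, L are hypothetical, since the thread does not
give the real shapes; b is assumed to have shape (I, J); and range replaces
xrange so that the block runs on Python 3. Each 3x3 window is taken with
three-wide slices on both axes and summed over its last two axes, and the
result is checked against the original four-loop expression.

```python
import math
import numpy as np

# Hypothetical sizes; the real shapes are not stated in the thread.
I, J, K, L = 2, 3, 4, 5
rng = np.random.default_rng(0)
conv_out = rng.standard_normal((1, I, J, 3 * K, 3 * L))
b = rng.standard_normal((I, J))  # assumed shape, inferred from b[i][j]

# Reference: the original four nested loops, element by element.
pooled_ref = np.empty((1, I, J, K, L))
for i in range(I):
    for j in range(J):
        for k in range(K):
            for l in range(L):
                total = (np.sum(conv_out[0][i][j][k*3][l*3:(l+1)*3])
                         + np.sum(conv_out[0][i][j][k*3+1][l*3:(l+1)*3])
                         + np.sum(conv_out[0][i][j][k*3+2][l*3:(l+1)*3]))
                pooled_ref[0][i][j][k][l] = math.tanh(total / 9.0 + b[i][j])

# Vectorised over i and j: only the k and l loops remain.
pooled = np.empty((1, I, J, K, L))
for k in range(K):
    for l in range(L):
        # All (i, j) windows at once; shape (I, J, 3, 3).
        window = conv_out[0, :, :, k*3:k*3 + 3, l*3:(l + 1)*3]
        # Sum each 3x3 window over its last two axes, giving shape (I, J).
        output = window.sum(axis=(-2, -1)) / 9.0 + b
        pooled[0, :, :, k, l] = np.tanh(output)

print(np.allclose(pooled, pooled_ref))
```

Summing over both trailing axes in one call avoids adding partial row sums by
hand, which is where the slice bounds are easy to get wrong.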


for i in xrange(self.pooled_shape[1]):
    for j in xrange(self.pooled_shape[2]):
        for k in xrange(self.pooled_shape[3]):
            for l in xrange(self.pooled_shape[4]):

                # previous version, now commented out:
                # self.pooled[0][i][j][k][l]=math.tanh((numpy.sum(self.conv_out[0][i][j][k*3][l*3:(l+1)*3])+numpy.sum(self.conv_out[0][i][j][k*3+1][l*3:(l+1)*3])+numpy.sum(self.conv_out[0][i][j][k*3+2][l*3:(l+1)*3]))/9.0+self.b[i][j])

                vec = numpy.sum(self.conv_out[0, i, j, k*3: k*3 + 2, l*3:(l+1)*3], axis=-1)
                self.pooled[0, i, j, k, l] = math.tanh((vec[0] + vec[1] + vec[2] )/ 9.0+self.b[i][j])


but it gave the following error:
----------------------------------------------------------------------------------
Traceback (most recent call last):
  File "3dcnn_test.py", line 401, in <module>
    check()
  File "3dcnn_test.py", line 392, in check
    layer1.change_input(numpy.reshape(test_set_x[i],(1,1,9,60,80)))
  File "3dcnn_test.py", line 77, in change_input
    self.pooled[0, i, j, k, l] = math.tanh((vec[0] + vec[1] + vec[2] )/ 9.0+self.b[i][j])
IndexError: index out of bounds
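
A likely cause of this error: the stop index of a slice is exclusive, so
k*3: k*3 + 2 selects only two rows. vec then has length 2, and vec[2] is out
of bounds. The sketch below shows the difference using a hypothetical random
conv_out (its real shape is not given in the thread) and arbitrary index
values; it is an illustration, not a confirmed diagnosis of the code above.

```python
import numpy as np

# Hypothetical stand-in for conv_out; the real array is not shown in the thread.
rng = np.random.default_rng(0)
conv_out = rng.standard_normal((1, 2, 3, 9, 9))
i, j, k, l = 1, 2, 1, 2  # arbitrary in-range indices for illustration

# Slice as written in the failing code: rows k*3 and k*3 + 1 only.
vec_short = np.sum(conv_out[0, i, j, k*3:k*3 + 2, l*3:(l + 1)*3], axis=-1)
print(vec_short.shape)

try:
    vec_short[2]  # there is no third element
except IndexError as err:
    print("IndexError:", err)

# With stop index k*3 + 3 the slice covers all three rows of the window.
vec = np.sum(conv_out[0, i, j, k*3:k*3 + 3, l*3:(l + 1)*3], axis=-1)

# Value from the original three-sum expression, for comparison.
reference = (np.sum(conv_out[0][i][j][k*3][l*3:(l+1)*3])
             + np.sum(conv_out[0][i][j][k*3+1][l*3:(l+1)*3])
             + np.sum(conv_out[0][i][j][k*3+2][l*3:(l+1)*3]))
print(np.isclose(vec[0] + vec[1] + vec[2], reference))
```

Under that assumption, changing the slice to k*3: k*3 + 3 in the loop body
should give vec three elements and match the commented-out expression.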

With regards,

M. Sai Rajeswar
M-tech Computer Technology
IIT Delhi
Cogito Ergo Sum


On Sun, Jul 13, 2014 at 9:08 PM, Daπid <davidmenhur at gmail.com> wrote:

>
> On 13 July 2014 14:28, Sai Rajeshwar <rajsai24 at gmail.com> wrote:
>
>>
>> 2)for i in xrange(pooled_shape[1]):
>>             for j in xrange(pooled_shape[2]):
>>                 for k in xrange(pooled_shape[3]):
>>                     for l in xrange(pooled_shape[4]):
>>
>> pooled[0][i][j][k][l]=math.tanh((numpy.sum(conv_out[0][i][j][k*3][l*3:(l+1)*3])+numpy.sum(conv_out[0][i][j][k*3+1][l*3:(l+1)*3])+numpy.sum(conv_out[0][i][j][k*3+2][l*3:(l+1)*3]))/9.0+b[i][j])
>
>
> You should get a speed up by accessing the arrays in a more efficient way:
>
> pooled[0, i, j, k, l] = math.tanh((numpy.sum(conv_out[0, i, j, k*3,
> l*3:(l+1)*3]) + numpy.sum(conv_out[0, i, j, k*3+1, l*3:(l+1)*3]) +
> numpy.sum(conv_out[0, i, j, k*3+2, l*3:(l+1)*3]))/9.0+b[i, j])
>
> In fact:
>
> numpy.sum(conv_out[0, i, j, k*3, l*3:(l+1)*3]) + numpy.sum(conv_out[0, i,
> j, k*3+1, l*3:(l+1)*3])
>
> seems equivalent to:
>
> numpy.sum(conv_out[0, i, j, k*3: k*3 +1, l*3:(l+1)*3])
>
> To take the last one into account:
>
> vec = numpy.sum(conv_out[0, i, j, k*3: k*3 + 2, l*3:(l+1)*3], axis=-1)
> pooled[0, i, j, k, l] = vec[0] + vec[1] + vec[2] / 9.0
>
> And you can probably get rid of the i and j indexes all together.
> Something like this should work (untested):
>
> for k in...
> for l in...
> output = numpy.sum(conv_out[0, :, :, k*3: k*3 +1, l*3:(l+1)*3]), axis=-1)
> output += numpy.sum(conv_out[0, :, :, k*3 + 2, l*3 : (l+1)*3])),
> axis=-1)/9.0
> output += b
> pooled[0, :, :, k, l] = numpy.tanh(output)
>
> In this case, one of the loops seems a great target for parallelisation.
> Also, Cython should help reduce the loop overhead.