segmentation fault in scipy?

Robert Kern robert.kern at gmail.com
Thu May 11 00:11:25 EDT 2006


conor.robinson at gmail.com wrote:
> I'm using rprop (not dependent on the error function in this case, i.e.
> standard rprop vs. irprop or arprop) for an MLP tanh, sigmoid nnet as
> part of a hybrid model. I guess I was using a little Matlab thinking
> when I wrote the SSE function.  My batches are about 25,000 x 80, so my
> absolute error (difference between net outputs and desired outputs) when
> using *one* output unit is shape(~25000,). Am I wrong to assume
> trace(error*transpose(error)) is the sum of the squared errors, which
> should be shape(1,)?

I'm afraid you're using terminology (and abbreviations!) that I can't follow.
Let me try to restate what's going on and you can correct me as I screw up. You
have a neural net that has 80 output units. You have 25000 observations that you
are using to train the neural net. Each observation vector (and consequently,
each error vector) has 80 elements.
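
In code, the shapes I have in mind look something like this (a sketch
with random stand-in data, sizes taken from your numbers):

  import numpy as np

  n_obs, n_out = 25000, 80
  outputs = np.random.rand(n_obs, n_out)   # network outputs
  targets = np.random.rand(n_obs, n_out)   # desired outputs
  error = outputs - targets                # error matrix
  print(error.shape)                       # (25000, 80)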

Judging by your notation, you are using the matrix subclass of array to change *
to matrix multiplication. In my message you are responding to (btw, please quote
the emails you respond to so we can maintain some context), I gave an answer for
you using regular arrays which have * as elementwise multiplication. The matrix
object's behavior gets in the way of the most natural way to do these
calculations, so I do recommend avoiding the matrix object and learning to use
the dot() function to do matrix multiplication instead. But if you want to
continue using matrix objects, then you can use the multiply() function to do
element-wise multiplication.
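
For example, here is a quick sketch of the four combinations (tiny 2x2
data, just to show the different behaviors; nothing here is specific to
your net):

  import numpy as np

  a = np.array([[1., 2.], [3., 4.]])
  m = np.matrix(a)

  a * a              # elementwise product for plain arrays
  np.dot(a, a)       # matrix product for plain arrays
  m * m              # matrix product for matrix objects
  np.multiply(m, m)  # elementwise product for matrix objects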

The answer I gave also used the wrong name for the result. It seems that you
want the sum of the squared errors across all of the observations. In this case,
you can use axis=None to specify that every element should be summed:

  SSE = sum(multiply(error, error), axis=None)

trace(dot(error, transpose(error))) wastes a *huge* amount of time and memory,
since you would be calculating (if your machine were capable of it) a gigantic
25000 x 25000 matrix and then throwing away all of its off-diagonal elements.
The method I give above wastes only a little memory; there is one temporary
matrix the size of error.
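
To make the comparison concrete, here is a small sketch (tiny stand-in
data; with your sizes, dot(error, transpose(error)) would be that
25000 x 25000 intermediate):

  import numpy as np

  error = np.random.rand(5, 3)   # stand-in for your (25000, 80) error matrix

  # One temporary the size of error, then a full reduction:
  sse_fast = np.sum(np.multiply(error, error), axis=None)

  # Builds a (5, 5) temporary just to read off its diagonal:
  sse_slow = np.trace(np.dot(error, np.transpose(error)))

  assert np.allclose(sse_fast, sse_slow)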

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



