[Numpy-discussion] svd error checking vs. speed

josef.pktd at gmail.com josef.pktd at gmail.com
Sat Feb 15 17:35:53 EST 2014


On Sat, Feb 15, 2014 at 5:12 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> On Sat, Feb 15, 2014 at 5:08 PM, <josef.pktd at gmail.com> wrote:
>>
>> On Sat, Feb 15, 2014 at 4:56 PM, Sebastian Berg
>> <sebastian at sipsolutions.net> wrote:
>> > On Sat, 2014-02-15 at 16:37 -0500, alex wrote:
>> >> Hello list,
>> >>
>> >> Here's another idea resurrection from numpy github comments that I've
>> >> been advised could be posted here for re-discussion.
>> >>
>> >> The proposal would be to make np.linalg.svd more like scipy.linalg.svd
>> >> with respect to input checking.  The argument against the change is
>> >> raw speed; if you know that you will never feed non-finite input to
>> >> svd, then np.linalg.svd is a bit faster than scipy.linalg.svd.  An
>> >> argument for the change could be to avoid issues reported on github
>> >> like crashes, hangs, spurious non-convergence exceptions, etc. from
>> >> the undefined behavior of svd of non-finite input.
>> >>
>> >
>> > +1, unless this is a huge speed penalty, correctness (and decent error
>> > messages) should come first in my opinion, this is python after all. If
>> > this is a noticeable speed difference, a kwarg may be an option (but I
>> > would think about that some more).
>>
>> maybe -1
>>
>> statsmodels is using np.linalg.pinv which uses svd
>> I never heard of any crash (*), and the only time I compared with
>> scipy I didn't like the slowdown.
>> I didn't do any serious timings, just a few examples.
>>
>> (*) not converged, ...
>>
>> pinv(x.T).dot(x) -> pinv(x.T, please_don_t_check=True).dot(y)
>>
>> numbers ?
>
>
> FWIW, I see this spurious "SVD did not converge" warning very frequently with
> ARMA when there is a nan that has crept in. I usually know where to find
> the problem, but I think it'd be nice if this error message was a little
> better.

maybe I'm +1

While we don't see crashes, when I run Alex's example I see a hanging
process at 13% CPU usage. That looks very familiar to me; I see it
reasonably often when I'm debugging code.
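
For reference, the large entry in Alex's matrix C is already inf before svd
ever sees it, because 1e3000 overflows a double. Below is a minimal sketch of
the kind of up-front check under discussion. The wrapper name `checked_svd`
and the error message are made up for illustration; this is not an existing
numpy function, only an approximation of what a finiteness check in front of
np.linalg.svd could look like.

```python
import numpy as np

def checked_svd(a, **kwargs):
    """Hypothetical wrapper: reject inf/nan input before calling svd."""
    a = np.asarray(a)
    # One pass over the data; raises instead of handing LAPACK bad input.
    if not np.isfinite(a).all():
        raise ValueError("array must not contain infs or NaNs")
    return np.linalg.svd(a, **kwargs)

# The matrix from Alex's example: the literal 1e3000 overflows to inf.
C = np.array([[1e3000, 0], [0, 1]])
print(np.isinf(C[0, 0]))

try:
    checked_svd(C)
except ValueError as exc:
    print("rejected:", exc)

# Finite input goes straight through to np.linalg.svd.
u, s, vt = checked_svd(np.array([[1e3, 0], [0, 1]]))
print(s)
```

With a check like this, the non-finite case fails immediately with a message
that names the actual problem, instead of hanging or reporting
non-convergence.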

I never tried to track down where it hangs.
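
On the "numbers ?" question, here is a rough way to see what a finiteness
check costs relative to the svd itself. The array shape and repeat count are
arbitrary choices for illustration, and np.isfinite(x).all() is only one
possible way to do the check. The ratio will depend on the shape, the dtype
and the LAPACK build.

```python
import timeit
import numpy as np

rng = np.random.RandomState(0)
x = rng.standard_normal(size=(200, 50))  # arbitrary test matrix

n = 20  # number of repetitions per measurement

# Average time per call for the decomposition itself.
t_svd = timeit.timeit(
    lambda: np.linalg.svd(x, full_matrices=False), number=n) / n

# Average time per call for the finiteness check alone.
t_chk = timeit.timeit(lambda: np.isfinite(x).all(), number=n) / n

print("svd:   %.3g s per call" % t_svd)
print("check: %.3g s per call" % t_chk)
print("check / svd: %.3g" % (t_chk / t_svd))
```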

Josef

>
> Skipper
>
>>
>>
>> grep: we also use scipy.linalg.pinv in some cases
>>
>> Josef
>>
>>
>> >
>> > - Sebastian
>> >
>> >> """
>> >> [...] the following numpy code hangs until I `kill -9` it.
>> >>
>> >> ```
>> >> $ python runtests.py --shell
>> >> $ python
>> >> Python 2.7.5+
>> >> [GCC 4.8.1] on linux2
>> >> >>> import numpy as np
>> >> >>> np.__version__
>> >> '1.9.0.dev-e3f0f53'
>> >> >>> A = np.array([[1e3, 0], [0, 1]])
>> >> >>> B = np.array([[1e300, 0], [0, 1]])
>> >> >>> C = np.array([[1e3000, 0], [0, 1]])
>> >> >>> np.linalg.svd(A)
>> >> (array([[ 1.,  0.],
>> >>        [ 0.,  1.]]), array([ 1000.,     1.]), array([[ 1.,  0.],
>> >>        [ 0.,  1.]]))
>> >> >>> np.linalg.svd(B)
>> >> (array([[ 1.,  0.],
>> >>        [ 0.,  1.]]), array([  1.00000000e+300,   1.00000000e+000]),
>> >> array([[ 1.,  0.],
>> >>        [ 0.,  1.]]))
>> >> >>> np.linalg.svd(C)
>> >> [hangs forever]
>> >> ```
>> >> """
>> >>
>> >> Alex
>> >> _______________________________________________
>> >> NumPy-Discussion mailing list
>> >> NumPy-Discussion at scipy.org
>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >>


