[SciPy-user] Reference to algorithm for matrix rank

Fri Oct 10 16:45:32 EDT 2008

On Fri, Oct 10, 2008 at 15:13, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
>> You should get the book _Matrix Computations_ by Golub and van Loan.
>> You actually want tol to be relative to S.max(), not an absolute
>> tolerance. I like this:
>>
>>  np.sum(S > (S.max() * np.finfo(M.dtype).eps)
>
> Thanks a lot - I'll have a look.
>
> I saw that matlab does something like this:
>
> eps = np.finfo(S.dtype).eps
> tol = max(M.shape)*eps*S[0]
>
> but I didn't know why the max(M.shape)...

Neither do I. Golub and van Loan define "numerical rank deficiency" as
using tol=eps*S[0] (note that S[0] is the maximum singular value and
thus the 2-norm of the matrix). The thing is, there really isn't one
definition, much like there isn't a single definition of the norm of a
matrix. For example, if your data come from uncertain measurements
with uncertainties greater than floating point epsilon, choosing a
tolerance of about the uncertainty is probably a better idea (the
tolerance may be absolute if the uncertainties are absolute rather
than relative, even). When floating point roundoff is your concern,
then "numerical rank deficiency" is a better concept, but exactly what
the relevant measure of the tolerance is depends on the operations you
intend to do with your matrix. Possibly, the Matlab implementors had
some set of operations in mind. Unfortunately, they don't cite
anything relevant to that point, and my imagination fails to come up
with one.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco