[SciPy-dev] roots, poly, comparison operators, etc.

Sun Feb 17 03:36:39 EST 2002

> > * Array printing is far superior in Matlab and Octave -- they
> > generally always look nice.  We should clean up how arrays are output.
> > Also, the "format long" and "format short", etc options for specifying
> > how arrays are printed are pretty nice.
>
> I agree. May be ipython could be exploited here?

Does it do nice Numeric array printing and have something kin to the format
command?  If so, yes.  That would be good.  Fernando, I think your the person to
ask about this, correct? :-)

>
> In fact, rounding of tiny numbers to zeros could be done only when
> printing (though, personally I wouldn't prefer that either but I am
> just looking for a compromise), not inside calculation routines. In this
> way, no valuable information is lost when using these routines from
> other calculation routines and computation will be even more efficient.
>
> > * On Matlab/Octave, sort() sorts by magnitude of complex and then by
> > angle.  On the other hand ==, >, <, etc. seem to only compare the real
> > part of complex numbers.
> > These seem fine to me.  I know they aren't mathematically correct, but
> > they seem to be pragmatically correct.  I'd like comments on these
> > conventions and what others think.
>
> There is no mathematically correct way to compare complex numbers,
> they just cannot be ordered in an unique and sensible way.
>
> However, in different applications different conventions may be useful or
> reasonable for ordering complex numbers. Whatever is the convention, their
> mathematical correctness is irrelevant and this cannot be used as an
> argument for prefering one convention to another.
>

I'm not sure of your position on whether cmp() for complex values should work.
Are you against it or for it?

My main gripe with no comparison is that, for code to work generically, anywhere
in SciPy that has a comparison operator we have to change:

if( a < b):
   ...do whatever

to

if( a.typcode() in ['F', 'D'] or b.typcode in ['F','D'] and a.real < b.real):
    ... do whatever
else( a < b):
    ... do whatever

or something like that.  I've run into this multiple times within my own
projects and have wished that there was some default behavior.

>
> I would propose providing number of efficient comparison methods for
> complex (or any) numbers that users may use in sort functions as an
> optional argument. For example,
>
> scipy.sort([2,1+2j],cmpmth='abs') -> [1+2j,2]  # sorts by abs value
> scipy.sort([2,1+2j],cmpmth='real') -> [2,1+2j] # sorts by real part
> scipy.sort([2,1+2j],cmpmth='realimag') # sorts by real then by imag
> scipy.sort([2,1+2j],cmpmth='imagreal') # sorts by imag then by real
> scipy.sort([2,1+2j],cmpmth='absangle') # sorts by abs then by angle
> etc.
> scipy.sort([2,1+2j],cmpfunc=<user defined comparison function>)
>
> Note that
>
> scipy.sort([-1,1],cmpmth='absangle') -> [1,-1]
>
> which also demonstrates the arbitrariness of sorting complex numbers.

If we did this, instead of passing a string, we should pass a cmp method in just
as Python's list sort method takes instead of passing a string.  I think this
can be made to be fast efficient for standard cases.

>
> Btw, why do you want to sort the output of roots()? As far as I know,
> there is no order defined for roots of polynomials. May be an optional
> flag could be provided here?

This was in the original code I started with, and I just left it in there.  The
newest CVS does not have the sorting in it.

>
> > * Comparison of IEEE floats is hairy.  It looks to me like Matlab and
> > Octave have chosen to limit precision to 15 digits.  This seems like a
> > reasonable thing to do, for SciPy also, but we'd currently have to
> > limit to 14 digits to deal with problems of imprecise LAPACK routines.
> > Pearu and Fernando are arguing against this, but maybe they wouldn't
> > mind a 15 digit limit for double and 6 for float.  We'd have to modify
> > Numeric to do this internally on comparisons.  There could be a flag
> > that enables exact comparisons.
>
> I hope this issue is solved by fixing roots() and also the accuracy
> of LAPACK routines can be rehabilitated now (I don't claim that their
> output is always accurate, just floating numbers cannot be represented
> accuratelly in computers memory, in principle. It has nothing to do with
> the programs that manipulate these numbers, one can only improve the
> algorithtms to minimize the computational errors).
>
> The deep lesson to learn here is:
>   Fix the computation algorithm.
>   Do not fix the output of the computation.

Agreed -- when possible.  Your solution to the roots() problem was indeed the
right one.  I spent the day digging a stump out of my yard and thinking about
IEEE floating point in SciPy -- what a way to spend a Saturday.  Comparison of
numbers can lead to very subtle bugs that many scientist/engineers spend days to
find.  I'd like SciPy for to mitigate as many of these difficult programming
issues as possible so that programming doesn't get in the way of solving
problems for these people.  Still, the choice to round has some down sides also,
and this may be one of those situations where the solution is as bad as the
problem.  Also, I played a little more with Matlab and Octave and found that
they don't do anything special to help out unsuspecting saps either.  They
maintain standard IEEE floating point behavior with all the warts and benefits
that come with it.

>> a = 1
a =
     1
>> b = 1.0 + 2.22e-16
b =
    1.0000
>> a == b
ans =
     0

For now, well table any thoughts of handling rounding internally and focus on
other things.

>
> Matlab and Octave are programs that are oriented to certain applications,
> namely, to engineering applications where linear algebra is very
> important.
>
> SciPy needs not to choose the same orientation as Matlab, however, it can
> certainly cover Matlab's orientations.
>
> Python is a general purpose language and SciPy could also be a general
> purpose package for scientific computing (whatever are the applications).

It is general purpose -- much of this is thanks to Python's huge library.  I
don't think any of the things we are discussing break SciPy for other
applications.  The fact that floating point precision is 2.22e-16 doesn't break
it currently.  Moving this precision to 1e-15 wouldn't break it either.  Same
thing for complex numbers and comparisons.  Both of these and many others are
just issues that SciPy needs to examine.  They both impact programming in a
significant way, and if we can make changes that simplify programming for the
average scientist/engineer, we should.  Its all a bunch of trade-offs.

Also, I don't really view Matlab and Octave as that narrowly applicable -- they
seem to cover a very wide breadth of Science to me.  They're both excellent
tools that we can learn from.  The things they are missing are what the Python
language and its standard library bring to the table.  If SciPy can manage to be
as complete and usable of a scientific library as these two programs, it will
have succeeded.

eric