[SciPy-User] orthogonal polynomials ?

josef.pktd at gmail.com
Wed May 18 09:16:53 EDT 2011


On Mon, May 16, 2011 at 10:24 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Mon, May 16, 2011 at 8:16 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>> On Mon, May 16, 2011 at 7:47 PM, <josef.pktd at gmail.com> wrote:
>>>
>>> On Mon, May 16, 2011 at 9:29 PM, Charles R Harris
>>> <charlesr.harris at gmail.com> wrote:
>>> >
>>> >
>>> > On Mon, May 16, 2011 at 6:12 PM, <josef.pktd at gmail.com> wrote:
>>> >>
>>> >> On Mon, May 16, 2011 at 5:58 PM, Charles R Harris
>>> >> <charlesr.harris at gmail.com> wrote:
>>> >> >
>>> >> >
>>> >> > On Mon, May 16, 2011 at 12:40 PM, <josef.pktd at gmail.com> wrote:
>>> >> >>
>>> >> >> On Mon, May 16, 2011 at 12:47 PM,  <josef.pktd at gmail.com> wrote:
>>> >> >> > On Mon, May 16, 2011 at 12:29 PM, Charles R Harris
>>> >> >> > <charlesr.harris at gmail.com> wrote:
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> On Mon, May 16, 2011 at 10:22 AM, <josef.pktd at gmail.com> wrote:
>>> >> >> >>>
>>> >> >> >>> On Mon, May 16, 2011 at 11:27 AM, Charles R Harris
>>> >> >> >>> <charlesr.harris at gmail.com> wrote:
>>> >> >> >>> >
>>> >> >> >>> >
>>> >> >> >>> > On Mon, May 16, 2011 at 9:17 AM, Charles R Harris
>>> >> >> >>> > <charlesr.harris at gmail.com> wrote:
>>> >> >> >>> >>
>>> >> >> >>> >>
>>> >> >> >>> >> On Mon, May 16, 2011 at 9:11 AM, <josef.pktd at gmail.com>
>>> >> >> >>> >> wrote:
>>> >> >> >>> >>>
>>> >> >> >>> >>> On Sat, May 14, 2011 at 6:04 PM, Charles R Harris
>>> >> >> >>> >>> <charlesr.harris at gmail.com> wrote:
>>> >> >> >>> >>> >
>>> >> >> >>> >>> >
>>> >> >> >>> >>> > On Sat, May 14, 2011 at 2:26 PM, <josef.pktd at gmail.com>
>>> >> >> >>> >>> > wrote:
>>> >> >> >>> >>> >>
>>> >> >> >>> >>> >> On Sat, May 14, 2011 at 4:11 PM, nicky van foreest
>>> >> >> >>> >>> >> <vanforeest at gmail.com>
>>> >> >> >>> >>> >> wrote:
>>> >> >> >>> >>> >> > Hi,
>>> >> >> >>> >>> >> >
>>> >> >> >>> >>> >> > Might this be what you want:
>>> >> >> >>> >>> >> >
>>> >> >> >>> >>> >> > The first eleven probabilists' Hermite polynomials
>>> >> >> >>> >>> >> > are:
>>> >> >> >>> >>> >> >
>>> >> >> >>> >>> >> > ...
>>> >> >> >>> >>> >> >
>>> >> >> >>> >>> >> > My chromium browser does not seem to paste pngs. Anyway,
>>> >> >> >>> >>> >> > check
>>> >> >> >>> >>> >> >
>>> >> >> >>> >>> >> >
>>> >> >> >>> >>> >> > http://en.wikipedia.org/wiki/Hermite_polynomials
>>> >> >> >>> >>> >> >
>>> >> >> >>> >>> >> > and you'll see that the first polynomial is 1, the second
>>> >> >> >>> >>> >> > is x, and so forth. From my courses on quantum mechanics I
>>> >> >> >>> >>> >> > recall that these polynomials are, with respect to some
>>> >> >> >>> >>> >> > weight function, orthogonal.
>>> >> >> >>> >>> >>
>>> >> >> >>> >>> >> Thanks, I haven't looked at that yet; we should add Wikipedia
>>> >> >> >>> >>> >> links to the scipy.special docs.
>>> >> >> >>> >>> >>
>>> >> >> >>> >>> >> However, I would like to change the last part, "with respect
>>> >> >> >>> >>> >> to some weight function":
>>> >> >> >>> >>> >>
>>> >> >> >>> >>> >> http://en.wikipedia.org/wiki/Hermite_polynomials#Orthogonality
>>> >> >> >>> >>> >>
>>> >> >> >>> >>> >> Instead of Gaussian weights I would like uniform weights on
>>> >> >> >>> >>> >> bounded support. And I have never seen anything about changing
>>> >> >> >>> >>> >> the weight function for the orthogonal basis of this kind of
>>> >> >> >>> >>> >> polynomial.
>>> >> >> >>> >>> >>
>>> >> >> >>> >>> >
>>> >> >> >>> >>> > In numpy 1.6, you can use the Legendre polynomials. They are
>>> >> >> >>> >>> > orthogonal on [-1,1] as has been mentioned, but can be mapped
>>> >> >> >>> >>> > to other domains. For example (running IPython with pylab, so
>>> >> >> >>> >>> > that plot is available):
>>> >> >> >>> >>> >
>>> >> >> >>> >>> > In [1]: from numpy.polynomial import Legendre as L
>>> >> >> >>> >>> >
>>> >> >> >>> >>> > In [2]: for i in range(5): plot(*L([0]*i + [1],
>>> >> >> >>> >>> > domain=[0,1]).linspace())
>>> >> >> >>> >>> >    ...:
>>> >> >> >>> >>> >
>>> >> >> >>> >>> > produces the attached plots.
>>> >> >> >>> >>>
>>> >> >> >>> >>> I'm still on numpy 1.5 so this will have to wait a bit.
>>> >> >> >>> >>>
>>> >> >> >>> >>> > <snip>
>>> >> >> >>> >>> >
>>> >> >> >>> >>> > Chuck
>>> >> >> >>> >>> >
>>> >> >> >>> >>>
>>> >> >> >>> >>>
>>> >> >> >>> >>> as a first application for orthogonal polynomials I was
>>> >> >> >>> >>> trying
>>> >> >> >>> >>> to
>>> >> >> >>> >>> get
>>> >> >> >>> >>> an estimate for a density, but I haven't figured out the
>>> >> >> >>> >>> weighting
>>> >> >> >>> >>> yet.
>>> >> >> >>> >>>
>>> >> >> >>> >>> Fourier polynomials work better for this.
>>> >> >> >>> >>>
>>> >> >> >>> >>
>>> >> >> >>> >> You might want to try Chebyshev then; the Chebyshev polynomials
>>> >> >> >>> >> are essentially cosines and will handle the ends better.
>>> >> >> >>> >> Weighting might also help, as I expect the distribution of the
>>> >> >> >>> >> errors is somewhat Poisson.
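>>> >> >> >>> >>
>>> >> >> >>> >> A quick check of the cosine connection, as a minimal sketch
>>> >> >> >>> >> (using the numpy.polynomial.chebyshev module from numpy >= 1.4):
>>> >> >> >>> >>
>>> >> >> >>> >> import numpy as np
>>> >> >> >>> >> from numpy.polynomial import chebyshev as C
>>> >> >> >>> >>
>>> >> >> >>> >> theta = np.linspace(0, np.pi, 201)
>>> >> >> >>> >> n = 4
>>> >> >> >>> >> Tn = C.Chebyshev([0]*n + [1])   # coefficient vector selects T_n
>>> >> >> >>> >> # T_n(cos(theta)) == cos(n*theta): Chebyshev polynomials are
>>> >> >> >>> >> # cosines after the substitution x = cos(theta)
>>> >> >> >>> >> print np.allclose(Tn(np.cos(theta)), np.cos(n*theta))  # True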
>>> >> >> >>> >>
>>> >> >> >>> >
>>> >> >> >>> > I should mention that all the polynomial fits will give you the
>>> >> >> >>> > same results, but the Chebyshev fits are more numerically stable.
>>> >> >> >>> > The general approach is to overfit, i.e., use more polynomials
>>> >> >> >>> > than needed and then truncate the series, resulting in a faux
>>> >> >> >>> > min/max approximation. Unlike power series, the coefficients of
>>> >> >> >>> > the Chebyshev series will tend to decrease rapidly at some point.
>>> >> >> >>>
>>> >> >> >>> I think I might still have something wrong with the way I use the
>>> >> >> >>> scipy.special polynomials.
>>> >> >> >>>
>>> >> >> >>> For a large sample with 10000 observations, the nice graph is
>>> >> >> >>> Fourier with 20 elements; the second (not so nice) one is
>>> >> >> >>> scipy.special.chebyt with 500 polynomials. The graph for 20
>>> >> >> >>> Chebyshev polynomials looks very similar.
>>> >> >> >>>
>>> >> >> >>> Chebyshev doesn't want to get low enough to adjust to the low part.
>>> >> >> >>>
>>> >> >> >>> (Note: I rescale to [0,1] for Fourier, and [-1,1] for chebyt.)
>>> >> >> >>>
>>> >> >> >>
>>> >> >> >> That certainly doesn't look right. Could you mail me the data
>>> >> >> >> offline? Also, what are you fitting, the histogram, the cdf, or...?
>>> >> >> >
>>> >> >> > The numbers are generated (by a function in scikits.statsmodels).
>>> >> >> > It's all in the script that I posted, except I keep changing things.
>>> >> >> >
>>> >> >> > I'm not fitting anything directly.
>>> >> >> > There is supposed to be a closed-form expression: the estimated
>>> >> >> > coefficient of each polynomial is just the mean of that polynomial
>>> >> >> > evaluated at the data (for an orthonormal basis phi_k, we have
>>> >> >> > c_k = integral of phi_k*f = E[phi_k(X)], which the sample mean
>>> >> >> > estimates). The theory and the descriptions sounded easy, but
>>> >> >> > unfortunately it didn't work out.
>>> >> >> >
>>> >> >> > I was just hoping to get lucky and be able to skip the small print.
>>> >> >> > http://onlinelibrary.wiley.com/doi/10.1002/wics.97/abstract
>>> >> >> > got me started and it has the Fourier case that works.
>>> >> >> >
>>> >> >> > There are lots of older papers that I only skimmed, but I should be
>>> >> >> > able to work out the Hermite case before going back to the general
>>> >> >> > case again with arbitrary orthogonal polynomial bases. (Initially I
>>> >> >> > wanted to ignore the Hermite basis because gaussian_kde works well
>>> >> >> > in that case.)
>>> >> >>
>>> >> >> Just another graph before stopping with this.
>>> >> >>
>>> >> >> chebyt works if I cheat (rescale at the end):
>>> >> >>
>>> >> >> f_hat = (f_hat - f_hat.min())
>>> >> >> fint2 = integrate.trapz(f_hat, grid)
>>> >> >> f_hat /= fint2
>>> >> >>
>>> >> >> The graph is chebyt with 30 polynomials after shifting and scaling;
>>> >> >> 20 polynomials also looks good.
>>> >> >>
>>> >> >> (Hunting for the missing scaling term is left for some other day.)
>>> >> >>
>>> >> >> In any case, there's a recipe for non-parametric density estimation
>>> >> >> with compact support.
>>> >> >>
>>> >> >
>>> >> > Ah, now I see what is going on -- Monte Carlo integration to get the
>>> >> > expansion of the pdf in terms of orthogonal polynomials. So yes, I
>>> >> > think Lagrange polynomials are probably the ones to use unless you use
>>> >> > the weight in the integral. Note that 1.6 also has the Hermite and
>>> >> > Laguerre polynomials. But it seems that for these things it would also
>>> >> > be desirable to have the normalization constants.
>>> >>
>>> >
>>> > Heh, I meant Legendre.
>>> >
>>> >>
>>> >> It's also intended to be used as a density estimator for real data,
>>> >> but the idea is the same.
>>> >>
>>> >> My main problem seems to be that I haven't figured out the
>>> >> normalization (constants) that I'm supposed to use.
>>> >
>>> > The normalization for Legendre functions over an interval of length L
>>> > would be (2/L)*2/(2*i + 1)*(1/n), where i is the degree of the
>>> > polynomial and n is the number of samples.
>>> >
>>> >>
>>> >> Given the Wikipedia page that Andy pointed out, I added an additional
>>> >> weighting term. I still need to shift and rescale, but the graphs look
>>> >> nice. (The last one, promised:) chebyt with 30 polynomials on a sample
>>> >> with 10000 observations.
>>> >>
>>> >> (I don't know yet if I want to use the new polynomials in numpy even
>>> >> though they are much nicer. I just gave up trying to get statsmodels
>>> >> compatible with numpy 1.3 because I'm using the polynomials introduced
>>> >> in 1.4.)
>>> >>
>>> >> (I will also look at Andy's pointer to Gram–Schmidt, because I realized
>>> >> that for nonlinear trend estimation I want orthogonality at discretely
>>> >> evaluated points instead of in the integral.)
>>> >>
>>> >
>>> > QR is Gram-Schmidt. You can use any polynomial basis for the columns.
>>> > There is also a method due to Moler to compute the polynomials using
>>> > the fact that they satisfy a three-term recursion, but QR is simpler.
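>>> >
>>> > For instance, something along these lines (a rough sketch)
>>> > orthonormalizes a polynomial basis with respect to the discrete inner
>>> > product over a set of sample points:
>>> >
>>> > import numpy as np
>>> >
>>> > x = np.linspace(-1, 1, 101)       # evaluation points
>>> > V = np.vander(x, 5)[:, ::-1]      # columns are 1, x, x**2, x**3, x**4
>>> > Q, R = np.linalg.qr(V)            # QR does the Gram-Schmidt step
>>> > # columns of Q are orthonormal wrt the discrete inner product
>>> > # <p, q> = sum_i p(x_i)*q(x_i)
>>> > print np.allclose(np.dot(Q.T, Q), np.eye(5))  # True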
>>> >
>>> > Anne suggested some time ago that I should include the Gauss points and
>>> > weights in the polynomial classes, and I've come to the conclusion she
>>> > was right. Looks like I should include the normalization factors also.
>>>
>>> Since I'm going through this backwards (program first, figure out what
>>> I'm doing later), I always appreciate any of these premade helper
>>> functions.
>>>
>>> A mixture of trial and error and reading Wikipedia: this is the
>>> reweighted Chebyshev T polynomial, orthonormal with respect to the
>>> unweighted integral,
>>>
>>> # polys is a list of the ChtPoly instances defined below
>>> from scipy import integrate
>>> for i, p in enumerate(polys[:5]):
>>>    for j, p2 in enumerate(polys[:5]):
>>>        print i, j, integrate.quad(lambda x: p(x)*p2(x), -1, 1)[0]
>>>
>>> And now the rescaling essentially doesn't have an effect anymore:
>>> f_hat.min()  0.00393264959543
>>> fint2        1.00082015654   (integral of the estimated pdf)
>>>
>>>
>>> import numpy as np
>>> from scipy.special import chebyt
>>>
>>> class ChtPoly(object):
>>>
>>>    def __init__(self, order):
>>>        self.order = order
>>>        self.poly = chebyt(order)
>>>
>>>    def __call__(self, x):
>>>        # dividing by the fourth root of (1 - x**2) and normalizing makes
>>>        # the basis orthonormal with respect to weight 1 on [-1, 1]
>>>        if self.order == 0:
>>>            return np.ones_like(x) / (1 - x**2)**(1/4.) / np.sqrt(np.pi)
>>>        else:
>>>            return (self.poly(x) / (1 - x**2)**(1/4.) / np.sqrt(np.pi)
>>>                    * np.sqrt(2))
>>>
>>> If I can get the same for some of the other polynomials, I would be
>>> happy.
>>>
>>
>> The virtue of the Legendre polynomials here is that you don't need the
>> weight to make them orthogonal. For the Chebyshev I'd be tempted to have a
>> separate weight function w(x) = 1/(1 - x**2)**.5 and do (1/n)\sum_i
>> T_n(x_i)*w(x_i), giving a result in a normal polynomial series. The
>> additional normalization factor due to the interval would then be 2/L in
>> addition to the terms you already have. The singularity in the weight could
>> be a problem so Legendre polynomials might be a safer choice.
>>
>> To simplify this a bit, in the Legendre case you can use legvander(x,
>> deg=n).sum(0) and scale the results with the factors in the preceding post
>> to get the coefficients. The legvander function is a rather small one and
>> you could pull it out and modify it to do the sum on the fly if space is a
>> problem.
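>>
>> Roughly, putting the pieces together (a sketch, assuming numpy >= 1.6;
>> legendre_density is a made-up name, and the 2/(2*j + 1) norm is the one
>> from the correction below):
>>
>> import numpy as np
>> from numpy.polynomial import legendre
>>
>> def legendre_density(x, deg=20):
>>     # x is the sample, already mapped into [-1, 1]
>>     n = len(x)
>>     # sample mean of each Legendre polynomial P_0 ... P_deg
>>     mom = legendre.legvander(x, deg).sum(0) / n
>>     # divide by the squared norm 2/(2*j + 1) of P_j on [-1, 1]
>>     coef = mom * (2*np.arange(deg + 1) + 1) / 2.
>>     return legendre.Legendre(coef)
>>
>> f_hat = legendre_density(np.random.uniform(-1, 1, 10000))
>> grid, vals = f_hat.linspace(201)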
>>
>
> Blush, brown paper bag in order here. Since you are already mapping the data
> points into the relevant interval you can forget the interval length. Just
> use 2/(2*j + 1) as the scaling factor for the Legendre functions.

just an update (no graphs but they look nice)

I haven't tried Legendre again, but after some fighting I figured out
what the orthonormal (wrt weight=1) version of scipy.special.hermite is.

Hermite and chebyt density estimation work well; for Fourier I still
need to rescale by sqrt(pi), and for the orthonormal chebyt polynomial
basis it looks like I have to stay away from the boundary.

latest version
http://bazaar.launchpad.net/~josef-pktd/statsmodels/statsmodels-josef-experimental-030/view/head:/scikits/statsmodels/sandbox/nonparametric/densityorthopoly.py
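
In outline, the estimator does something like this (a simplified sketch
of the approach, not the exact module code; polys is a list of
orthonormal basis functions such as the ChebyTPoly instances below):

import numpy as np

def density_orthopoly(x, polys, grid):
    # coefficient of each basis function is its sample mean, since
    # c_k = E[phi_k(X)] when the phi_k are orthonormal wrt weight 1
    coeffs = [p(x).mean() for p in polys]
    f_hat = sum(c * p(grid) for c, p in zip(coeffs, polys))
    # shift and rescale so the estimate is a proper density
    f_hat -= min(f_hat.min(), 0)
    f_hat /= np.trapz(f_hat, grid)
    return f_hat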

I also wrote some simple helper functions to check orthogonality and
orthonormality.
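
For reference, roughly what the helpers do (a minimal sketch; the real
versions live in the module linked above):

import numpy as np
from scipy import integrate

def inner_cont(polys, lower, upper, weight=None):
    # pairwise inner products <p_i, p_j> over [lower, upper]
    n = len(polys)
    innp = np.zeros((n, n))
    err = np.zeros((n, n))
    for i, p1 in enumerate(polys):
        for j, p2 in enumerate(polys):
            if weight is None:
                fn = lambda x: p1(x)*p2(x)
            else:
                fn = lambda x: p1(x)*p2(x)*weight(x)
            innp[i, j], err[i, j] = integrate.quad(fn, lower, upper)
    return innp, err

def is_orthonormal_cont(polys, lower, upper, atol=1e-14):
    # orthonormal means the inner product matrix is the identity
    r, _ = inner_cont(polys, lower, upper)
    return np.allclose(r, np.eye(len(polys)), atol=atol)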

>>> from scipy.special import chebyt
>>> polys = [chebyt(i) for i in range(4)]
>>> r, e = inner_cont(polys, -1, 1)
>>> r
array([[ 2.        ,  0.        , -0.66666667,  0.        ],
       [ 0.        ,  0.66666667,  0.        , -0.4       ],
       [-0.66666667,  0.        ,  0.93333333,  0.        ],
       [ 0.        , -0.4       ,  0.        ,  0.97142857]])
>>> is_orthonormal_cont(polys, -1, 1, atol=1e-6)
False


>>> polys = [ChebyTPoly(i) for i in range(4)]
>>> r, e = inner_cont(polys, -1, 1)
>>> r
array([[  1.00000000e+00,   0.00000000e+00,  -9.31270888e-14,
          0.00000000e+00],
       [  0.00000000e+00,   1.00000000e+00,   0.00000000e+00,
         -9.47850712e-15],
       [ -9.31270888e-14,   0.00000000e+00,   1.00000000e+00,
          0.00000000e+00],
       [  0.00000000e+00,  -9.47850712e-15,   0.00000000e+00,
          1.00000000e+00]])
>>> is_orthonormal_cont(polys, -1, 1, atol=1e-6)
True


And a check that chebyt is orthogonal with respect to the Chebyshev
weight function, but not orthonormal:

>>> polysc = [chebyt(i) for i in range(4)]
>>> r, e = inner_cont(polysc, -1, 1, weight=lambda x: (1-x*x)**(-1/2.))
>>> r
array([[  3.14159265e+00,   0.00000000e+00,  -2.00130508e-13,
          0.00000000e+00],
       [  0.00000000e+00,   1.57079633e+00,   0.00000000e+00,
          6.31095803e-12],
       [ -2.00130508e-13,   0.00000000e+00,   1.57079633e+00,
          0.00000000e+00],
       [  0.00000000e+00,   6.31095803e-12,   0.00000000e+00,
          1.57079633e+00]])
>>> np.max(np.abs(r - np.diag(np.diag(r))))
6.3109580284604367e-12


Josef


>
> Chuck
>


