[SciPy-User] [Numpy-discussion] Fitting a curve on a log-normal distributed data

Gökhan Sever gokhansever at gmail.com
Tue Nov 17 17:07:17 EST 2009


On Tue, Nov 17, 2009 at 1:37 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Tue, Nov 17, 2009 at 13:28, Gökhan Sever <gokhansever at gmail.com> wrote:
> >
> >
> > On Tue, Nov 17, 2009 at 12:38 PM, <josef.pktd at gmail.com> wrote:
>
> >> If conc were just log-normally distributed, then you would not get any
> >> relationship between conc and size.
> >>
> >> If you have many observations of (conc, size) pairs, then you could
> >> estimate a noisy model
> >> conc = f(size) + u,  where the noise u is, for example, log-normally
> >> distributed, but you would still need an expression for the
> >> non-linear function f.
> >
> > I don't understand why I can't get a relation between size and conc
> > values if conc is log-normally distributed. Could you elaborate on this
> > a bit more? The non-linear relationship part is also confusing me. If,
> > say, to test for a linear relationship between x and y data pairs we
> > just fit a line, then in this case what I am looking for is to fit a
> > log-normal to get a relation between size and conc.
>
> It's a language issue. Your concentration values are not log-normally
> distributed. Your particle sizes are log-normally distributed (maybe).
> The concentration within a range of particle sizes is a measurement that
> is related to the particle size distribution, but you would not say
> that the measurements themselves are log-normally distributed. Josef
> was taking your language at face value.
>
> >> If you want to fit a curve f that has the same shape as the pdf of
> >> the log-normal, then you cannot do it with lognorm.fit, because that
> >> just assumes you have a random sample independent of size.
> >
> > Could you give an example of this?
>
> from scipy import stats
> x = stats.norm.rvs(size=1000)  # an i.i.d. sample of the measured quantity
> stats.norm.fit(x)              # estimates loc and scale from that sample
>
> >> So, it's not clear to me what you really want, or what your sample data
> >> looks like (do you have only one 15-element sample or lots of them).
> >
> > I have many sample points (thousands), each composed of these 15
> > elements. But the whole data set doesn't look much different from the
> > sample I used. Most peaks are around the 3rd-4th channel and decay as
> > shown in the figure.
>
> Do you need to fit a different distribution for each 15-vector? Or are
> all of these measurements supposed to be merged somehow?
>
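
To make the point in the quoted exchange concrete, here is a minimal sketch
of the two different operations. The size/conc arrays are made up, and the
scaled log-normal pdf is only one possible choice for the curve f; neither
is meant to be the actual instrument data or the "right" model.

import numpy as np
from scipy import stats, optimize

size = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1.0, 1.2,
                 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 7.5])   # 15 bin centers (made up)
conc = np.array([5., 40., 90., 70., 45., 30., 20., 14.,
                 9., 6., 4., 2., 1., 0.5, 0.2])         # counts per bin (made up)

# (a) lognorm.fit estimates distribution parameters from an i.i.d. sample of
#     the measured quantity itself; no notion of "size" is involved.
sample = stats.lognorm.rvs(0.5, size=1000)
shape, loc, scale = stats.lognorm.fit(sample)

# (b) Fitting a curve that merely has the shape of a log-normal pdf to the
#     (size, conc) pairs is a nonlinear least-squares problem instead.
def f(x, amp, sigma, scale):
    # abs() keeps the shape and scale valid if the optimizer happens to try
    # negative trial values during the least-squares iterations
    return amp * stats.lognorm.pdf(x, np.abs(sigma), loc=0, scale=np.abs(scale))

params, cov = optimize.curve_fit(f, size, conc, p0=(conc.max(), 1.0, 1.0))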

For my comparison case I will use an hour's worth of data, which consists of
3600 sample points. I will average these points over each minute. This is
because I am comparing data from two different instruments, and by averaging
I am trying to reduce intrinsic measurement error; a point-by-point
comparison is really not easy in my case. So in the end I will have 60
averaged data points, each composed of 15 elements. Later I will use the
same fitting technique to extrapolate the parts that lie outside the
measurement limits.
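
For reference, once that hour of data is in a (3600, 15) array, the
minute-averaging step is a small reshape/mean in NumPy. A minimal sketch,
with random placeholder numbers standing in for the real measurements:

import numpy as np

# Placeholder for one hour of 1 Hz spectra: 3600 samples x 15 size channels.
data = np.random.lognormal(mean=1.0, sigma=0.7, size=(3600, 15))

# Group into 60 one-minute blocks of 60 samples each and average within each
# block, giving 60 minute-averaged 15-element spectra.
minute_avg = data.reshape(60, 60, 15).mean(axis=1)   # shape (60, 15)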



>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco



-- 
Gökhan

