[SciPy-user] [SciPy-dev] New maximum entropy and Monte Carlo packages
David Huard
david.huard at gmail.com
Wed Jan 18 19:55:53 EST 2006
I think that maxent methods are sufficiently useful to deserve a place in
scipy.
On a related topic, is there a policy on what should or should not be
included in scipy? What is the 'grand design' intended for scipy? A select
compilation of the best software, or the largest possible collection of
routines?
David
2006/1/18, Ed Schofield <schofield at ftw.at>:
>
> Hi all,
>
> I recently moved two new packages, maxent and montecarlo, from the
> sandbox into the main SciPy tree. I've now moved them back to the
> sandbox pending further discussion. I'll introduce them here and ask
> for feedback on whether they should be included in the main tree.
>
> The maxent package is for fitting maximum entropy models subject to
> expectation constraints. Maximum entropy models represent the 'least
> biased' models subject to given constraints. When the constraints are
> on the expectations of functionals -- the usual formulation -- maximum
> entropy models take the form of a generalized exponential family. A
> normal distribution, for example, is a maximum entropy distribution
> subject to mean and variance constraints.
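
For reference, the "generalized exponential family" form mentioned above can be written out as follows. This is a sketch of the standard result, not taken from the maxent package itself: f and K are the features and target expectations from the message, while theta (the Lagrange multipliers) and Z (the normalizer) are notation assumed here.

```latex
% Maximize the entropy H(p) = -\int p(x) \log p(x) \, dx
% subject to E_p[f_i(X)] = K_i for i = 1, ..., m and \int p(x) \, dx = 1.
% The solution has the form
p_\theta(x) = \frac{1}{Z(\theta)} \exp\!\Big( \sum_{i=1}^{m} \theta_i f_i(x) \Big),
\qquad
Z(\theta) = \int \exp\!\Big( \sum_{i=1}^{m} \theta_i f_i(x) \Big) \, dx .

% With f_1(x) = x and f_2(x) = x^2 (mean and variance constraints) this is a
% normal density with
\theta_1 = \frac{\mu}{\sigma^2}, \qquad \theta_2 = -\frac{1}{2\sigma^2} .
```
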
>
> The maxent package contains one main module and one module with utility
> functions. Both are entirely in Python. (I have now removed the F2Py
> dependency.) The main module supports fitting models on either small or
> large sample spaces, where 'large' means continuous or otherwise too
> large to iterate over. Maxent models on 'small' sample spaces are
> common in natural language processing; models on 'large' sample spaces
> are useful for channel modelling in mobile communications, spectrum and
> chirp analysis, and (I believe) fluid turbulence. Some simple examples
> are in the examples/ directory. The simplest use is to define a list of
> functions f, an array of desired expectations K, and a sample space, and
> use the commands
>
>
> >>> model = maxent.model(f, samplespace)
> >>> model.fit(K)
>
> You can then retrieve the fitted parameters directly or analyze the
> model in other ways.
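
To make the f / K / sample space description concrete, here is a minimal pure-Python sketch of fitting a maximum entropy model on a small sample space. It does not use the maxent package or reproduce its algorithms: the function names fit_maxent and _probabilities, the fixed step size, and the plain gradient descent on the dual are all assumptions chosen for illustration. The example is the classic die whose mean is constrained to 4.5.

```python
import math


def _probabilities(F, theta):
    """Return p(x) proportional to exp(theta . f(x)) for each row of feature values."""
    scores = [sum(t * v for t, v in zip(theta, row)) for row in F]
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def fit_maxent(features, samplespace, targets, rate=0.1, steps=5000):
    """Hypothetical helper: fit theta so that E_p[f_i] matches targets[i].

    Uses fixed-step gradient descent on the dual log Z(theta) - theta . K,
    whose gradient is E_p[f] - K. The step size is an assumption that suits
    this small example; it is not tuned for general problems.
    """
    theta = [0.0] * len(features)
    # Evaluate every feature once per sample point.
    F = [[f(x) for f in features] for x in samplespace]
    for _ in range(steps):
        p = _probabilities(F, theta)
        for i in range(len(theta)):
            expectation = sum(pj * row[i] for pj, row in zip(p, F))
            theta[i] -= rate * (expectation - targets[i])
    return theta, _probabilities(F, theta)


# A die with one feature f(x) = x and desired expectation K = [4.5].
samplespace = [1, 2, 3, 4, 5, 6]
theta, p = fit_maxent([lambda x: x], samplespace, [4.5])
mean = sum(x * px for x, px in zip(samplespace, p))
```

After fitting, mean should agree with the target 4.5 to within numerical tolerance, and p puts more weight on the higher faces. This brute-force approach only works when the sample space is small enough to enumerate.
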
>
> I've been developing the maxent algorithms and code for about 4 years.
> The code is very well commented and should be straightforward to maintain.
>
>
> The montecarlo package currently does only one thing. It generates
> discrete variates from a given distribution. It does this FAST. On my
> P4 it generates over 10^7 variates per second, even for a sample space
> with 10^6 elements. The algorithm is the compact 5-table lookup sampler
> of Marsaglia. The main module, called 'intsampler', is written in C.
> There is also a simple Python wrapper class around this called
> 'dictsampler' that provides a nicer interface, allowing sampling from a
> distribution with arbitrary hashable objects (e.g. strings) as labels
> instead of {0,1,2,...}. dictsampler has slightly more overhead than
> intsampler, but is also very fast (around 10^6 per second for me with a
> sample space of 10^6 elements labelled with strings). An example of
> using it to sample from this discrete
> distribution:
>
> x        'a'      'b'      'c'
> p(x)   10/180  150/180   20/180
>
> is:
>
>
> >>> table = {'a':10, 'b':150, 'c':20}
> >>> sampler = dictsampler(table)
> >>> sampler.sample(10**4)
> array([b, b, a, ..., b, b, c], dtype=object)
>
> The montecarlo package is very small (and not nearly as impressive as
> Christopher Fonnesbeck's PyMC package), but the functionality that is
> there would be an efficient foundation for many discrete Monte Carlo
> algorithms.
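
For readers who want to experiment with the labelled-sampling idea without the C module, here is a small standard-library sketch. It is not Marsaglia's 5-table lookup method and not the dictsampler implementation: it swaps in a simpler technique, a binary search over cumulative weights, and the class name CumulativeDictSampler and its seed parameter are hypothetical. Expect it to be far slower than the figures quoted above.

```python
import bisect
import itertools
import random


class CumulativeDictSampler:
    """Hypothetical stand-in: sample labels in proportion to their weights."""

    def __init__(self, table, seed=None):
        self.labels = list(table)
        # Running totals of the weights, e.g. [10, 160, 180] for the table below.
        self.cumulative = list(
            itertools.accumulate(table[label] for label in self.labels)
        )
        self.total = self.cumulative[-1]
        self.rng = random.Random(seed)

    def sample(self, n):
        """Draw n labels independently; returns a plain list."""
        last = len(self.labels) - 1
        draws = []
        for _ in range(n):
            u = self.rng.random() * self.total  # uniform on [0, total)
            index = bisect.bisect_right(self.cumulative, u)
            draws.append(self.labels[min(index, last)])  # guard against rounding
        return draws


table = {'a': 10, 'b': 150, 'c': 20}
sampler = CumulativeDictSampler(table, seed=0)
draws = sampler.sample(10**4)
fraction_b = draws.count('b') / len(draws)
```

With these weights, fraction_b should come out close to 150/180. Each draw costs O(log n) in the number of labels, whereas a table-lookup sampler like the one described above aims for constant time per draw.
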
>
> I'm aware of the build issue Travis Brady reported with MinGW not
> defining lrand48(). I can't remember why I used this, but I'll adapt it
> to use lrand() instead and report back.
>
>
> Would these packages be useful? Are there any objections to including
> them?
>
>
> -- Ed