[SciPy-user] [SciPy-dev] New maximum entropy and Monte Carlo packages

David Huard david.huard at gmail.com
Wed Jan 18 19:55:53 EST 2006


I think that maxent methods are sufficiently useful to deserve a place in
scipy.

On a related topic, is there a politic about what should or should not be
included on scipy ? What is the 'grand design' intended for scipy ? A select
compilation of the best software or the largest possible collection of
routines ?

David


2006/1/18, Ed Schofield <schofield at ftw.at>:
>
> Hi all,
>
> I recently moved two new packages, maxent and montecarlo, from the
> sandbox into the main SciPy tree.  I've now moved them back to the
> sandbox pending further discussion.  I'll introduce them here and ask
> for feedback on whether they should be included in the main tree.
>
> The maxent package is for fitting maximum entropy models subject to
> expectation constraints.  Maximum entropy models represent the 'least
> biased' models subject to given constraints.  When the constraints are
> on the expectations of functionals -- the usual formulation -- maximum
> entropy models take the form of a generalized exponential family.  A
> normal distribution, for example, is a maximum entropy distribution
> subject to mean and variance constraints.
>
> The maxent package contains one main module and one module with utility
> functions.  Both are entirely in Python.  (I have now removed the F2Py
> dependency.)  The main module supports fitting models on either small or
> large sample spaces, where 'large' means continuous or otherwise too
> large to iterate over.  Maxent models on 'small' sample spaces are
> common in natural language processing; models on 'large' sample spaces
> are useful for channel modelling in mobile communications, spectrum and
> chirp analysis, and (I believe) fluid turbulence.  Some simple examples
> are in the examples/ directory.  The simplest use is to define a list of
> functions f, an array of desired expectations K, and a sample space, and
> use the commands
>
>
> >>>>>> model = maxent.model(f, samplespace)
> >>>>>> model.fit(K)
> >>>
> >>>
>
> You can then retrieve the fitted parameters directly or analyze the
> model in other ways.
>
> I've been developing the maxent algorithms and code for about 4 years.
> The code is very well commented and should be straightforward to maintain.
>
>
> The montecarlo package currently does only one thing.  It generates
> discrete variates from a given distribution.  It does this FAST.  On my
> P4 it generates over 107 variates per second, even for a sample space
> with 106 elements.  The algorithm is the compact 5-table lookup sampler
> of Marsaglia.  The main module, called 'intsampler', is written in C.
> There is also a simple Python wrapper class around this called
> 'dictsampler' that provides a nicer interface, allowing sampling from a
> distribution with arbitrary hashable objects
> (e.g. strings) as labels instead of {0,1,2,...}.  dictsampler has
> slightly more overhead than intsampler, but is also very fast (around
> 106 per second for me with a sample space of 106 elements labelled
> with strings).  An example of using it to sample from this discrete
> distribution:
>
>     x       'a'       'b'       'c'
>     p(x)    10/180    150/180   20/180
>
> is:
>
>
> >>>>>> table = {'a':10, 'b':150, 'c':20}
> >>>>>> sampler = dictsampler(table)
> >>>>>> sampler.sample(10**4)
> >>>
> >>>
> array([b, b, a, ..., b, b, c], dtype=object)
>
> The montecarlo package is very small (and not nearly as impressive as
> Christopher Fonnesbeck's PyMC package), but the functionality that is
> there would be an efficient foundation for many discrete Monte Carlo
> algorithms.
>
> I'm aware of the build issue Travis Brady reported with MinGW not
> defining lrand48().  I can't remember why I used this, but I'll adapt it
> to use lrand() instead and report back.
>
>
> Would these packages be useful?  Are there any objections to including
> them?
>
>
> -- Ed
>
>
>
>
>
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.net
> http://www.scipy.net/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20060118/ce6572b9/attachment.html>


More information about the SciPy-User mailing list