[SciPy-user] Question about implementation of a directed acyclic graph of formulas and variables

Robert Kern robert.kern at gmail.com
Sun Feb 22 02:46:28 EST 2009


On Sun, Feb 22, 2009 at 01:23, Christopher Mutel <cmutel at gmail.com> wrote:
> Hello all-
>
> I am working on a model that uses a large set of linear equations.
> SciPy provides a set of tools that help very much in my case
> (especially sparse matrix stuff), and I hope it is okay if I ask the
> general SciPy community for advice on a further development of my
> model. I am sure that some of you have already dealt with the
> questions that I am struggling with.
>
> I would like to replace some of the numbers used to construct my
> matrix with a directed acyclic graph of formulas and variables, to
> represent the fact that many model components are not independent of
> one another. This is especially useful when doing Monte Carlo
> analysis, where every element in the set of linear equations has an
> associated uncertainty distribution. In the model I am working on, the
> linear equations represent physical processes in the industrial
> economy, and it makes the model more accurate to say that, for
> example, the NOx production in a boiler is a function of the
> temperature of the boiler, or the fuel consumption of a truck is a
> function of the load. The alternative, which is what I do now, is
> assume these parameters are independently distributed.
>
> My questions are:
>
> 1. To store my graph of references, I need to choose an existing
> python graph implementation. Does anyone have ideas on what would be
> best in my specific case? I only need a graph implementation to ensure
> acyclicity (no circular references), and to allow a way to
> keep track of references so the entire graph can be easily and
> correctly re-calculated. NetworkX seems like tremendous overkill in
> this case.

You probably just want a simple dict mapping nodes to lists of
adjacent nodes (following the direction of the arrows). Then you just
need an implementation of the appropriate algorithms on top of this
data structure. You might find what you need here (we had a similar
problem once):

https://svn.enthought.com/svn/enthought/EnthoughtBase/trunk/enthought/util/graph.py
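
To make the dict-of-lists idea concrete, here is a minimal sketch. It is not
taken from the Enthought module above. It assumes each arrow points from a
formula to the things it references, and the function name topological_order
and the node names are made up for illustration. A depth-first walk both
rejects circular references and yields an order in which everything can be
recalculated.

```python
def topological_order(graph):
    """Return the nodes so that every node comes after the nodes it points to.

    `graph` maps each node to a list of adjacent nodes. Here an arrow is
    assumed to run from a formula to the variables it references, so the
    result is a safe recalculation order. Raises ValueError on a cycle.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    order = []

    def visit(node):
        status = state.get(node, UNSEEN)
        if status == DONE:
            return
        if status == IN_PROGRESS:
            # We came back to a node that is still on the current path.
            raise ValueError("circular reference involving %r" % (node,))
        state[node] = IN_PROGRESS
        for neighbour in graph.get(node, ()):
            visit(neighbour)
        state[node] = DONE
        order.append(node)  # appended only after all its references

    for node in graph:
        visit(node)
    return order


# Hypothetical example: two formulas, each referencing one input variable.
graph = {
    "nox_production": ["boiler_temp"],
    "fuel_consumption": ["truck_load"],
    "boiler_temp": [],
    "truck_load": [],
}
order = topological_order(graph)
```

The walk is recursive, so a very deep chain of references could hit Python's
recursion limit; an explicit stack would avoid that.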

> 2. Is there a "best" way to write a formula? Perhaps there are
> libraries for something like this? I was thinking of a class like:
>
> class Formula(object):
>    formula = "foo"
>    references = [bar1, bar2]
>
> A key point here is that the formula itself must be stored in a SQL
> database, and human-readable (at least to some extent). I am sure that
> there is someone out there who has thought a lot about these types of
> issues, and has a decent solution. I don't think something like SymPy
> would work here, though of course I may be wrong.

I think sympy is probably an excellent option for you, but I'm not
entirely clear on what your formulae look like. Remember that you can
always pickle sympy expressions if they need to get stored in a SQL
database.
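
A rough sketch of what that could look like, assuming sympy is installed. The
table layout, the symbol name and the formula itself are invented for
illustration. It stores the same expression twice: once pickled, and once as
the plain string that str() gives, which covers the human-readable
requirement and can be parsed back with sympy.sympify().

```python
import pickle
import sqlite3

import sympy

# Made-up relationship, purely for illustration.
boiler_temp = sympy.Symbol("boiler_temp")
nox = boiler_temp**2 / 1000 + 5

# Hypothetical table: one row per formula, readable text plus a pickle.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE formula (name TEXT PRIMARY KEY, text TEXT, pickled BLOB)"
)
conn.execute(
    "INSERT INTO formula VALUES (?, ?, ?)",
    ("nox", str(nox), pickle.dumps(nox)),
)

text, blob = conn.execute(
    "SELECT text, pickled FROM formula WHERE name = ?", ("nox",)
).fetchone()

from_text = sympy.sympify(text)   # parse the readable form
from_pickle = pickle.loads(blob)  # restore the pickled form

# The symbols in the expression are the formula's references.
references = sorted(str(s) for s in from_text.free_symbols)

# Evaluate for one sampled input, e.g. inside a Monte Carlo loop.
value = from_text.subs({boiler_temp: 450})
```

Note that sympify() evaluates the string and pickle.loads() can run
arbitrary code, so both should only be used on rows you trust.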

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
