Math package

Robert Kern robert.kern at gmail.com
Sat Jul 29 18:36:07 EDT 2006


diffuser78 at gmail.com wrote:
> I will write the problem a little more clearer so that you guys can
> recommend me better.
> 
> In a graphs of size N ( where, N = 1e9), each node has a degree D=1000.
> i.e There are overall (D*N)/2 edges in the graph. This graph needs to
> be generated randomly using the program.

You will need to specify your desired random generation algorithm a bit better. 
There are lots of ways to do that, and different choices will affect your 
results substantially. They will also affect your *ability* to get results.

> Now my task is to find the shortest distance from each node to every
> other node. And finally I want to find is the average distance from one
> node to another node in the graph. This is an average Erdos number or
> equivalently what degree of seperation exists in the graph.
> 
> I can start with low values of N and D but my ultimate aim is to
> simulate this graph on big values of N and D.

You probably won't be able to get up to N=1e9 and D=1000. The memory 
requirements are just too large even with a better data structure than an 
adjacency matrix (possibly the worst one you could use for problems this size).

However, for smaller graphs, you will probably want to look at the Boost Graph 
Library, as someone else has already mentioned, and LANL's NetworkX package. It 
was written for the statistical study of large networks (though not as large as 
you want).

   https://networkx.lanl.gov/

If you have a large cluster available, you might be able to parallelize your 
algorithms using the Parallel Boost Graph Library. I don't believe that Python 
bindings are available though. Your ability to solve your problem will also 
depend on the structure of the graph that you generated. Some networks 
parallelize better than others. Look at the "Performance" link on the site below.

   http://osl.iu.edu/research/pbgl/

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco




More information about the Python-list mailing list