[Numpy-discussion] Numpy-discussion Digest, Vol 6, Issue 18

Sebastian Haase haase at msg.ucsf.edu
Fri Mar 9 09:58:32 EST 2007


On 3/9/07, James A. Bednar <jbednar at inf.ed.ac.uk> wrote:
> |  From: Robert Kern <robert.kern at gmail.com>
> |  Subject: Re: [Numpy-discussion] in place random generation
> |
> |  Daniel Mahler wrote:
> |  > On 3/8/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
> |
> |  >> Robert thought this might relate to Travis' changes adding
> |  >> broadcasting to the random number generator. It does seem
> |  >> certain that generating small arrays of random numbers has a
> |  >> very high overhead.
> |  >
> |  > Does that mean someone is working on fixing this?
> |
> |  It's not on the top of my list, no.
>
> I just wanted to put in a vote saying that generating a large quantity
> of small arrays of random numbers is quite important in my field, and
> is something that is definitely slowing us down right now.
>
> We often simulate neural networks whose many, many small weight
> matrices need to be initialized with random numbers, and we are seeing
> quite slow startup times (on the order of minutes, even though
> reloading a pickled snapshot of the same simulation once it has been
> initialized takes only a few seconds).
>
> The quality of these particular random numbers doesn't matter very
> much for us, so we are looking for some cheaper way to fill a bunch of
> small matrices with at least passably random values.  But it would of
> course be better if the regular high-quality random number support in
> Numpy were speedy under these conditions...
>
> Jim
>
Hey Jim,

Could you not create all of the many arrays so that they share one large
chunk of contiguous memory?
Like:
1) create a large 1D array
2) create all small arrays in a for loop using
numpy.ndarray(buffer=largeArray, offset=..., shape=..., dtype=...);
you increment the offset (counted in bytes) appropriately during the loop
3) then you can reset all small arrays to new random numbers with one
call that refills the large array (they all have the same statistics
(mean, stddev, type), right?)
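
A minimal sketch of the three steps above, assuming float64 matrices drawn
from a standard normal distribution. The shapes and the names `shapes`,
`large` and `small` are made up for illustration and are not from Jim's
simulator:

```python
import numpy as np

# Hypothetical shapes of the small weight matrices (illustrative only).
shapes = [(3, 3), (2, 5), (4, 4)]
sizes = [int(np.prod(s)) for s in shapes]

# 1) One large, contiguous 1D array that holds every small matrix.
large = np.empty(sum(sizes), dtype=np.float64)

# 2) Build each small array as a view onto the large buffer.
#    The `offset` argument of numpy.ndarray is counted in bytes.
small = []
start = 0  # position in elements
for shape, size in zip(shapes, sizes):
    view = np.ndarray(shape=shape, dtype=large.dtype,
                      buffer=large, offset=start * large.itemsize)
    small.append(view)
    start += size

# 3) Refill all small arrays with a single random-number call.
large[:] = np.random.standard_normal(large.size)
```

Because the small arrays are views and not copies, step 3 can be repeated
whenever new values are needed without rebuilding them. Plain slicing,
`large[start:start + size].reshape(shape)`, should give equivalent views.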


Maybe this helps,
Sebastian Haase


