[Numpy-discussion] histogramdd memory needs

Fri Feb 1 10:08:19 EST 2008

Hi Lars,

[...]

2008/2/1, Lars Friedrich <lfriedri at imtek.de>:
>
>
> 1) How can I tell histogramdd to use another dtype than float64? My bins
> will be very sparsely populated, so an int16 should be sufficient. Without
> normalization, an integer dtype makes more sense to me.

There is no way to request that without modifying the histogramdd function
yourself. The relevant bit of code is the instantiation of hist:

hist = zeros(nbin.prod(), float)

> 2) Is there a way to use another algorithm (at the cost of performance)
> that uses less memory during calculation so that I can generate bigger
> histograms?

You could work through your array block by block. Simply fix the range,
generate a histogram for each slice of, say, 100k data points, and sum them
up at the end.
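
A minimal sketch of the block-by-block idea, assuming a recent NumPy and
data of shape (n_samples, n_dims) with at least one row. The function name
chunked_histogramdd and its parameters are made up for illustration; only
np.histogramdd itself is a real NumPy call. Note that each call still
allocates a full float histogram, so this only caps the temporaries that
grow with the number of samples, not the size of the histogram itself:

```python
import numpy as np

def chunked_histogramdd(data, bins, ranges, chunk_size=100_000, dtype=np.int64):
    """Sum per-block N-d histograms of `data` into an integer array.

    Hypothetical helper. `ranges` must be fixed up front so that every
    block is binned against exactly the same edges.
    """
    data = np.asarray(data)
    total = None
    edges = None
    for start in range(0, data.shape[0], chunk_size):
        block = data[start:start + chunk_size]
        # Same bins and range for every block, hence identical edges.
        counts, edges = np.histogramdd(block, bins=bins, range=ranges)
        if total is None:
            total = np.zeros(counts.shape, dtype=dtype)
        # Unweighted counts are whole numbers, so the cast loses nothing
        # as long as `dtype` is wide enough to hold them.
        total += counts.astype(dtype)
    return total, edges
```

With the range fixed, the summed result should match a single call to
np.histogramdd on the whole array, apart from the dtype.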

The current histogram and histogramdd implementations have the advantage of
being general, that is, they work with uniform or non-uniform bins, but they
are not particularly efficient, at least for a large number of bins (>30).
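
If the bins are uniform, the generality is not needed and the bin index can
be computed arithmetically, counting straight into an integer array. The
following is only a sketch of that alternative technique, not what
histogramdd does internally; uniform_histogramdd_int and its arguments are
hypothetical names, and samples outside the given ranges are simply dropped:

```python
import numpy as np

def uniform_histogramdd_int(data, nbins, ranges, dtype=np.int16):
    """Count samples into uniform N-d bins, directly in an integer array.

    Hypothetical helper. `data` has shape (n_samples, n_dims); `nbins` and
    `ranges` give the bin count and (low, high) limits per dimension.
    """
    data = np.asarray(data, dtype=float)
    dims = tuple(int(n) for n in nbins)
    lo = np.array([r[0] for r in ranges], dtype=float)
    hi = np.array([r[1] for r in ranges], dtype=float)

    # Keep only samples that fall inside the range in every dimension.
    inside = np.all((data >= lo) & (data <= hi), axis=1)
    data = data[inside]

    # Uniform bins: the index is a scaled offset from the lower limit.
    idx = np.floor((data - lo) / (hi - lo) * np.array(dims)).astype(np.intp)
    # Samples exactly on the upper limit belong to the last bin.
    idx = np.minimum(idx, np.array(dims) - 1)

    flat = np.ravel_multi_index(tuple(idx.T), dims)
    hist = np.zeros(int(np.prod(dims)), dtype=dtype)
    # Unbuffered add, so repeated indices are each counted.
    # Caution: a narrow dtype such as int16 wraps silently on overflow.
    np.add.at(hist, flat, 1)
    return hist.reshape(dims)
```

The only full-size array allocated here is the integer histogram itself; the
remaining temporaries scale with the number of samples and could be bounded
by feeding the data in blocks.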

Cheers,

David

> My numpy version is '1.0.4.dev3937'
>
> Thanks,
> Lars
>
>
> --
> Dipl.-Ing. Lars Friedrich
>
> Photonic Measurement Technology
> Department of Microsystems Engineering -- IMTEK
> University of Freiburg
> Georges-Köhler-Allee 102
> D-79110 Freiburg
> Germany
>
> phone: +49-761-203-7531
> fax:   +49-761-203-7537
> room:  01 088
> email: lars.friedrich at imtek.de
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>