[SciPy-user] hash function on arrays
Robert Kern
robert.kern at gmail.com
Tue Oct 9 18:49:15 EDT 2007
Tom Johnson wrote:
> On 10/9/07, Robert Kern <robert.kern at gmail.com> wrote:
>> ('numpy.ndarray', a.shape, a.dtype, a.strides, str(a.flags), buffer(a))
>
> Will this work for arrays defined in different python processes?
>
> I will be storing these hash values (along with the matrices) in a
> database and doing comparisons at some later time, in some other
> python process.
Sorry, no: the hash of the dtype depends on the pointer. For the
natively-supported types on your machine (dtype(float), dtype('=i4'), etc.),
this shouldn't matter, since they appear to be cached so that you always get
the same object back. However, non-native types like byteswapped versions and
custom dtypes give different hashes every time.
If you want to support the non-native types, then you have to expand the dtype
somewhat. If you only need to support byteswapped versions of the usual
datatypes, you can probably just use a.dtype.str. If you need to support simple
record arrays, tuple(a.dtype.descr) will probably work. If you need to support
nested record arrays ... you have some more work ahead of you.
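
A rough sketch of one way to build a key that survives across processes,
assuming a recent Python 3 and NumPy (so a.tobytes() stands in for buffer(a))
and assuming a hashlib digest is acceptable in place of the builtin hash().
The name array_digest is made up for illustration. It only handles the
byteswapped case via a.dtype.str, and it leaves out strides and flags because
tobytes() always emits the elements in C order:

```python
import hashlib

import numpy as np


def array_digest(a):
    """Hypothetical helper: hex digest of an array's shape, dtype and data."""
    a = np.asarray(a)
    # dtype.str spells out byte order, kind and item size ('<i4' vs '>i4'),
    # so it does not depend on the identity of the dtype object.
    # It does not describe the fields of a record dtype.
    header = repr((a.shape, a.dtype.str))
    h = hashlib.sha1()
    h.update(header.encode("ascii"))
    # tobytes() copies the elements out in C order regardless of the
    # array's actual memory layout.
    h.update(a.tobytes())
    return h.hexdigest()
```

For simple record arrays, swapping a.dtype.str for tuple(a.dtype.descr) in the
header would probably be the analogous change; nested record dtypes are not
covered by this sketch.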
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco