[Numpy-discussion] feature request - increment counter on write check

Sebastian Berg sebastian at sipsolutions.net
Fri Sep 11 09:20:28 EDT 2015


On Fr, 2015-09-11 at 13:10 +0000, Daniel Manson wrote:
> Originally posted as issue 6301 on github.
> 
> 
> Presumably any block of code that modifies an ndarray's buffer is
> wrapped in a (thread safe?) check of the writable flag. Would it be
> possible to hold a counter rather than a simple bool flag and then
> increment the counter whenever you test the flag? Hopefully this would
> introduce only a tiny additional overhead, but would permit
> pseudo-hashing to test whether a mutable array has changed since you
> last encountered it.
>   

Just a quick note. This is a bit more complex than it might appear. The
reason is that when a view of the array is changed, you would have to
"notify" the array itself that it has changed. So propagation from top
to bottom does not seem straightforward to me. (The other way is fine,
since on check you could check all parents, but you cannot check all
children.)
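
To illustrate the propagation problem, here is a toy model in pure Python.
It is only a sketch: the class CountedView and its write_count attribute are
hypothetical and are not part of NumPy. A memoryview slice stands in for an
ndarray view, since it shares memory with its parent in the same way.

```python
class CountedView:
    """Toy stand-in for an array that counts writes made through it.

    Hypothetical: NumPy arrays have no such counter.
    """

    def __init__(self, buffer, base=None):
        self.buffer = buffer      # memoryview sharing memory with its base
        self.base = base          # parent object, or None for the owner
        self.write_count = 0      # bumped only by writes through *this* object

    def view(self, start, stop):
        # Slicing a memoryview shares the underlying bytes, like an ndarray view.
        return CountedView(self.buffer[start:stop], base=self)

    def __setitem__(self, index, value):
        self.write_count += 1
        self.buffer[index] = value


owner = CountedView(memoryview(bytearray(6)))
child = owner.view(2, 5)

child[0] = 99                     # write through the view

print(owner.buffer[2])            # 99: the owner's data changed
print(owner.write_count)          # 0: but the owner's counter did not move
print(child.base is owner)        # True: the child can still find its parent
```

The child holds a reference to its parent, so a check made on the child could
walk up the chain of parents. The owner holds no reference to its children, so
it has no way to learn about the write above.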

- Sebastian


> Ideally this should be exposed a bit like Python's __hash__ method,
> let's say __mutablehash__, meaning a hash is returned but be warned
> that the object is mutable.  For an ndarray, X, containing objects
> that themselves have a __mutablehash__ method (e.g. other ndarrays, or
> some user object), the X.__mutablehash__ method will need to do the
> full check over all constituent objects, or simply return None.
> Defining an API of this sort would make it possible to - for example
> - let pandas DataFrames also implement this interface.
> 
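
One way such a protocol could look is sketched below in pure Python. All
names here (mutablehash, Versioned, Container) are hypothetical, chosen only
to illustrate the proposal; nothing like this exists in NumPy or pandas. The
token is built from object identity plus a version number, which is one
possible choice among many.

```python
def mutablehash(obj):
    """Return obj.__mutablehash__() if defined, else None (hypothetical protocol)."""
    method = getattr(type(obj), "__mutablehash__", None)
    if method is None:
        return None
    return method(obj)


class Versioned:
    """Toy mutable object whose token changes on every write."""

    def __init__(self, values):
        self._values = list(values)
        self._version = 1

    def __setitem__(self, index, value):
        self._values[index] = value
        self._version += 1

    def __mutablehash__(self):
        # Identity plus version: differs after any write through __setitem__.
        return (id(self), self._version)


class Container:
    """Toy container: combines its items' tokens, or gives up with None."""

    def __init__(self, items):
        self.items = list(items)

    def __mutablehash__(self):
        tokens = [mutablehash(item) for item in self.items]
        if any(token is None for token in tokens):
            return None           # some item cannot report changes
        return (id(self), tuple(tokens))


x = Versioned([1, 2, 3])
box = Container([x])
before = mutablehash(box)
x[0] = 10                         # mutate a constituent object
after = mutablehash(box)
print(before != after)            # True
print(mutablehash(Container([x, object()])))  # None
```

The container has to visit every item on each call, which is the "full check
over all constituent objects" mentioned above.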
> 
> In terms of use cases, the one I was motivated by was imagining
> improvements to the "variable explorer" in Spyder - roughly speaking,
> this widget's job is to always display an up-to-date summary of
> variables in current scope, e.g. currently it can show max/min and
> shape, but you could imagine also showing graphical summaries of the
> contents of an ndarray.  If the widget could cache summaries and check
> which arrays have really changed it should be much faster/offer more
> features/be simpler internally.  Note that pandas DataFrames are
> relevant here as an example of complex objects, containing ndarrays,
> which would benefit from being able to have their summaries cached.
> 
> 
> A more common/general use case would be as a check in some kind of
> memoization process...
> 
> 
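> import numpy as np
> from numpy.fft import fft, ifft
> 
> # memoize_please and mutablehash are the proposed helpers, not existing ones
> 
> # simple example...
> @memoize_please
> def hasnans(x):
>     return np.any(np.isnan(x))
> 
> 
> # more complex example...
> def convolve_fft(a, b, _cache={}):
>     # _cache is a mutable default on purpose: it persists across calls
>     a_hash = mutablehash(a)
>     b_hash = mutablehash(b)
>     if a_hash not in _cache:
>         _cache[a_hash] = fft(a)
>     if b_hash not in _cache:
>         _cache[b_hash] = fft(b)
>     return ifft(_cache[a_hash] * _cache[b_hash])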
> 
> 
> 
> 
> A quick thought on an implementation detail...
>  
> I'm not sure exactly how to deal with the counter overflowing: perhaps
> if you treated counter==0 to mean not-writable (i.e. that would be the
> new version of the old write flag) then you might get some
> uint-wraparound checking for free (because when it wraps back around
> to zero the buffer ends up becoming locked)?  Alternatively you could
> just say that no guarantee is given of wraparound being caught, though
> that might seriously limit the range of possible uses.
> 
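
The "zero means locked" idea can be sketched as follows. This is a toy model
under stated assumptions, not how NumPy stores its writeable flag: the class
WriteCounter is hypothetical, and an 8-bit counter is emulated with a mask so
that the wraparound is reached quickly.

```python
COUNTER_BITS = 8                          # tiny width so wraparound is easy to see
COUNTER_MASK = (1 << COUNTER_BITS) - 1


class WriteCounter:
    """Hypothetical counter that doubles as the write flag."""

    def __init__(self):
        self.count = 1                    # any nonzero value means writable

    def check_writable(self):
        """Emulate the write check: fail if locked, otherwise bump the counter."""
        if self.count == 0:
            raise ValueError("buffer is read-only (counter is zero)")
        # Unsigned increment: wraps to 0 after reaching the maximum value.
        self.count = (self.count + 1) & COUNTER_MASK


counter = WriteCounter()
writes = 0
try:
    while True:
        counter.check_writable()
        writes += 1
except ValueError:
    pass

print(writes)                             # 255 checks pass before the lock
print(counter.count)                      # 0: the buffer is now locked
```

In this sketch the check that wraps the counter to zero still succeeds, and
every later check fails, so a stale counter value can never be mistaken for a
fresh one. The cost is that the array silently becomes read-only.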
> 
> In summary...
> Hopefully the stuff needed to make __mutablehash__ work could be
> implemented simply by adding a single extra operation to the
> write-check (and maybe changing the footprint of the ndarray slightly
> to accommodate a counter).  But I suspect someone will tell me that
> life is never that simple! 
> 
> 
