proposal for slice hashing

Will Bradshaw defenastrator at gmail.com
Mon May 11 18:09:17 EDT 2020


On Monday, May 11, 2020 at 4:45:55 PM UTC-4, Chris Angelico wrote:
> On Tue, May 12, 2020 at 6:31 AM Will Bradshaw <defenastrator at gmail.com> wrote:
> >
> > On Monday, May 11, 2020 at 4:10:56 PM UTC-4, Chris Angelico wrote:
> > > On Tue, May 12, 2020 at 6:01 AM Will Bradshaw <defenastrator at gmail.com> wrote:
> > > > The only options as of now are:
> > > >     1. use 3 layers of wrappers to pack the slices into a custom type that supports hashing pass this mangled version of the arguments through lru_cache wrapper into a function that reverses the process of the first wrapper and passes through to the underlying implementation. (see below "code for workaround" as example)
> > > >        - This is kinda jank and arguably slow. Though in my case the cost of the calculation operation dwarfs this cost by an several orders of magnitude.
> > > >        - mapping may be unreliable and is a rather long and impenetrable mess.
> > > >     2. implementing my own custom caching for this situation which does not scale well and is a heck of a lot of work.
> > > >     3. implement a special case for slices in the lru_cache function. However, this is just moving the problem into the functools library.
> > > >
> > >
> > > 4. Implement __getitem__ as a wrapper around a caching lookup function
> > > that simply takes the three arguments.
> > >
> > > def __getitem__(self, slice):
> > >     return generate_values(slice.start, slice.stop, slice.step)
> > >
> > > @lru_cache
> > > def generate_values(start, stop, step):
> > >     ...
> > >
> > > Not sure if this makes it easy enough to not worry about the hashability.
> > >
> > > ChrisA
> >
> > does not work in the case of multi dimensional __getitem__ calls in the numpy style which happens to be my case as the number of dimensions in the indexes changes by case and all of the following are supported for each axis: slice, array of indexes, int index, and other custom index object types which I am hesitant to disclose)
> >
> 
> Ah, fair enough. Was just looking for a solution that would work on
> existing Python versions rather than demanding an upgrade to 3.10 :)

I have a solution to the issue in current python, it was given in my first post at the bottom. I implemented a variant of it for my work. Just because there is a hack does not mean a fix should not be implemented.

> Your use-case isn't common, but on the other hand, making slice
> objects hashable doesn't seem like it'd break anything. Obviously
> they'd only be hashable if start/stop/step are all hashable, but
> that's no different from tuples. +0.5.


I'm certain that my use case is not common. However, there are certainly many uncommon use cases in which being able to have a slice in a dictionary key or otherwise hashing a slice could be useful.

> 
> BTW, your third option (moving the problem to functools) might
> actually be easier than you think. You'd ultimately just be changing
> the behaviour of _make_key. Unfortunately there's no easy way to
> replace that function for a specific lru_cache, but that might itself
> be a useful feature. Imagine this:
> 
> @lru_cache(make_key=hashable_slices)
> def __getitem__(self, item):
>     ...

This is a good idea as it would allow for quite a bit of tuning by the user and  allow trade offs to be made between cache lookup complexity and potentially unnecessary re-computes.


More information about the Python-list mailing list