proposal for slice hashing

Chris Angelico rosuav at gmail.com
Mon May 11 16:45:28 EDT 2020


On Tue, May 12, 2020 at 6:31 AM Will Bradshaw <defenastrator at gmail.com> wrote:
>
> On Monday, May 11, 2020 at 4:10:56 PM UTC-4, Chris Angelico wrote:
> > On Tue, May 12, 2020 at 6:01 AM Will Bradshaw <defenastrator at gmail.com> wrote:
> > > The only options as of now are:
> > >     1. use 3 layers of wrappers to pack the slices into a custom type that supports hashing pass this mangled version of the arguments through lru_cache wrapper into a function that reverses the process of the first wrapper and passes through to the underlying implementation. (see below "code for workaround" as example)
> > >        - This is kinda jank and arguably slow. Though in my case the cost of the calculation operation dwarfs this cost by an several orders of magnitude.
> > >        - mapping may be unreliable and is a rather long and impenetrable mess.
> > >     2. implementing my own custom caching for this situation which does not scale well and is a heck of a lot of work.
> > >     3. implement a special case for slices in the lru_cache function. However, this is just moving the problem into the functools library.
> > >
> >
> > 4. Implement __getitem__ as a wrapper around a caching lookup function
> > that simply takes the three arguments.
> >
> > def __getitem__(self, slice):
> >     return generate_values(slice.start, slice.stop, slice.step)
> >
> > @lru_cache
> > def generate_values(start, stop, step):
> >     ...
> >
> > Not sure if this makes it easy enough to not worry about the hashability.
> >
> > ChrisA
>
> does not work in the case of multi dimensional __getitem__ calls in the numpy style which happens to be my case as the number of dimensions in the indexes changes by case and all of the following are supported for each axis: slice, array of indexes, int index, and other custom index object types which I am hesitant to disclose)
>

Ah, fair enough. Was just looking for a solution that would work on
existing Python versions rather than demanding an upgrade to 3.10 :)

Your use-case isn't common, but on the other hand, making slice
objects hashable doesn't seem like it'd break anything. Obviously
they'd only be hashable if start/stop/step are all hashable, but
that's no different from tuples. +0.5.

BTW, your third option (moving the problem to functools) might
actually be easier than you think. You'd ultimately just be changing
the behaviour of _make_key. Unfortunately there's no easy way to
replace that function for a specific lru_cache, but that might itself
be a useful feature. Imagine this:

@lru_cache(make_key=hashable_slices)
def __getitem__(self, item):
    ...

Current implementation has a single global _make_key function that
then gets snapshotted into the closure, so you could do something
hacky as a proof of concept:

old = functools._make_key
functools._make_key = hashable_slices
@lru_cache
def __getitem__(self, item):
    ...
functools._make_key = old

Obviously that's bad code, but it's a great low-effort proof of concept :)

ChrisA


More information about the Python-list mailing list