[issue46864] Deprecate ob_shash in BytesObject

Inada Naoki report at bugs.python.org
Thu Mar 24 03:51:33 EDT 2022


Inada Naoki <songofacandy at gmail.com> added the comment:

> I guess not much difference in benchmarks.
> But if put a bytes object into multiple dicts/sets, and len(bytes_key) is large, it will take a long time. (1 GiB 0.40 seconds on i5-11500 DDR4-3200)
> The length of bytes can be arbitrary, so computing time may be very different.

I think calculating hash() for large bytes is not a common use case.
Rare use cases may not justify adding 8 bytes to a basic type, especially when users expect it to be compact.
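
To put the 8 bytes in proportion, the per-object overhead can be inspected with sys.getsizeof(). This is only a rough sketch: it assumes CPython on a 64-bit build, where the cached hash field (ob_shash) is a Py_hash_t of 8 bytes, and the printed numbers will differ on other builds or implementations.

```python
import sys

# Assumption: CPython. Other implementations may report different sizes.
empty = sys.getsizeof(b"")   # fixed overhead of a bytes object with no payload
one = sys.getsizeof(b"x")    # same overhead plus one payload byte

print(f"empty bytes object: {empty} bytes")
print(f"each payload byte adds: {one - empty} byte(s)")

# Assumption: the cached hash field occupies 8 bytes (64-bit build).
print(f"share of an 8-byte field in an empty object: {8 / empty:.0%}")
```

For short bytes objects the fixed overhead dominates, which is why an extra field matters most for small values.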

Balance is important. A microbenchmark for a specific case doesn't guarantee a good balance.
So I want real-world examples. Do you know of any popular libraries that depend on hash(bytes) performance?
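
For reference, the effect under discussion (a cached hash making a repeated hash() call on the same bytes object cheap) can be observed with a sketch like the one below. The helper name time_hash_twice and the 16 MiB payload size are arbitrary choices for illustration, and the timings depend entirely on the machine and Python build. It is exactly the kind of microbenchmark that does not answer the real-world question by itself.

```python
import time


def time_hash_twice(data: bytes):
    """Hash the same bytes object twice and time each call separately."""
    t0 = time.perf_counter()
    first = hash(data)    # first call has to walk the whole buffer
    t1 = time.perf_counter()
    second = hash(data)   # second call may reuse a cached value, if the build keeps one
    t2 = time.perf_counter()
    return first, second, t1 - t0, t2 - t1


if __name__ == "__main__":
    # Hypothetical payload size; a fresh object, so it has not been hashed yet.
    payload = bytes(16 * 1024 * 1024)
    first, second, cold, warm = time_hash_twice(payload)
    print(f"first hash():  {cold:.6f} s")
    print(f"second hash(): {warm:.6f} s")
    print(f"same value:    {first == second}")
```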


> Is it possible to let code objects use other types? In addition to ob_hash, maybe the extra byte \x00 at the end can be saved.

Of course, it is possible. But it needs a large refactoring around code objects, including the pyc cache file format.
I will try it before 3.13.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue46864>
_______________________________________

