[Python-3000] bytes and dicts (was: PEP 3137: Immutable Bytes and Mutable Buffer)

Jim Jewett jimjjewett at gmail.com
Fri Sep 28 20:33:04 CEST 2007


On 9/28/07, Guido van Rossum <guido at python.org> wrote:
> Well, maybe this is a good enough argument to give up.

Not quite yet... I still see two potential solutions, depending on
whether or not the exclusion is sticky.  Details below.

=========

If the exclusion is sticky, then add (implicit) flags saying "seen a
string" and "seen a bytes".  Similar logic is already there, in that
seeing a non-string key replaces the dict's lookdict function.

The most common case (exact unicode in an exact unicode-only dict)
would stay the same as today, but the other cases would have some
extra type-checking.
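
A rough sketch of the sticky variant, as a toy Python model rather
than the real C implementation.  The class name, the two flag
attributes and the choice of TypeError are all made up for
illustration; the real flags would be implicit in the dict's lookup
function.

```python
class StickyGuardDict:
    """Toy model: once a str key has been seen, bytes keys are refused
    for the rest of the dict's life, and vice versa."""

    def __init__(self):
        self._data = {}
        self.seen_str = False     # hypothetical "seen a string" flag
        self.seen_bytes = False   # hypothetical "seen a bytes" flag

    def __setitem__(self, key, value):
        if isinstance(key, str):
            if self.seen_bytes:
                raise TypeError("dict has held bytes keys; str key refused")
            self.seen_str = True
        elif isinstance(key, bytes):
            if self.seen_str:
                raise TypeError("dict has held str keys; bytes key refused")
            self.seen_bytes = True
        # Keys of any other type are never restricted.
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        # Sticky: removing the last str (or bytes) key does NOT clear
        # the corresponding flag.
        del self._data[key]
```

Note that in this model deleting every str key still leaves the dict
closed to bytes keys; that is what "sticky" means here.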

=========

If the exclusion is based on current contents, then we can add a
count; my concern is that keeping this efficient may be too hacky.

It looks like there is room for exactly one more pointer-sized count
variable before small dicts bleed into a third cache line.  Because of
the exclusion, bytes and strings can never appear in the same dict, so
at least one of the two counts is always zero and a single field can
hold both.  Because dict entries are 3 pointers long, there can never
be more than (PY_SSIZE_T_MAX / 2) entries, so the sign bit can be
repurposed to indicate whether the count refers to strings or bytes.
(count==0 means no bytes or strings;  count==5 means 5 string keys;
count==-32 means 32 bytes keys.)
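
A rough sketch of the counted variant, again as a toy Python model
and not the C implementation.  The class name, the `count` attribute
and the use of TypeError are assumptions for illustration; only the
sign convention comes from the description above.

```python
class CountedGuardDict:
    """Toy model: the exclusion depends only on the current contents."""

    def __init__(self):
        self._data = {}
        # count > 0: that many str keys; count < 0: that many bytes keys;
        # count == 0: neither kind is present.
        self.count = 0

    def __setitem__(self, key, value):
        is_new = key not in self._data
        if isinstance(key, str):
            if self.count < 0:
                raise TypeError("dict currently holds bytes keys")
            if is_new:
                self.count += 1
        elif isinstance(key, bytes):
            if self.count > 0:
                raise TypeError("dict currently holds str keys")
            if is_new:
                self.count -= 1
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]        # KeyError here leaves count untouched
        if isinstance(key, str):
            self.count -= 1        # one fewer str key
        elif isinstance(key, bytes):
            self.count += 1        # one fewer bytes key (moves toward 0)
```

Unlike the sticky flags, the count returns to zero once the last
string or bytes key is deleted, at which point keys of the other kind
are accepted again.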

-jJ

