[Python-Dev] The future of the wchar_t cache

Steve Dower steve.dower at python.org
Mon Oct 22 16:41:23 EDT 2018


On 22Oct2018 1047, Steve Dower wrote:
> On 22Oct2018 1007, Serhiy Storchaka wrote:
>>> 22.10.18 16:24, Steve Dower wrote:
>>> Yes, that's true. But "should reduce ... footprint" is also an 
>>> optimisation that deserves a benchmark by that standard. Also, I'm 
>>> proposing keeping the 'kind' as UCS-2 when the string is created from 
>>> UCS-2 data that is likely to be used as UCS-2. We would not create 
>>> the UCS-1 version in this case, so it's not the same as prefilling 
>>> the cache, but it would cost a bit of memory in exchange for CPU. If 
>>> slicing and concatenation between matching kinds also preserved the 
>>> kind, a lot of path handling code could avoid back-and-forth 
>>> conversions.
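
[To make the proposal concrete, here is a minimal model of PEP 393 
string kinds -- a sketch of the idea, not CPython's actual 
unicodeobject.c code. Today a string is stored in the narrowest unit 
that fits its contents; the proposal is to allow a UCS2-kind string 
whose code points all happen to fit in UCS1, so UTF-16 data arriving 
from the OS need not be narrowed and later re-widened:]

    /* Sketch of PEP 393 kinds and the proposed kind-preserving rule. */
    #include <stdint.h>
    #include <stdio.h>

    typedef enum { KIND_UCS1 = 1, KIND_UCS2 = 2, KIND_UCS4 = 4 } Kind;

    /* Today's canonical rule: the kind is the narrowest that fits. */
    static Kind canonical_kind(const uint32_t *cp, size_t n)
    {
        Kind k = KIND_UCS1;
        for (size_t i = 0; i < n; i++) {
            if (cp[i] > 0xFFFF)
                return KIND_UCS4;
            if (cp[i] > 0xFF)
                k = KIND_UCS2;
        }
        return k;
    }

    /* Proposed rule for slicing/concatenation: preserve the operands'
       kind instead of re-scanning the result to narrow it. */
    static Kind preserved_kind(Kind a, Kind b)
    {
        return a >= b ? a : b;  /* never narrower than either operand */
    }

    int main(void)
    {
        uint32_t path[] = { 'C', ':', '\\', 't', 'm', 'p' };  /* all <= 0xFF */
        printf("canonical kind: %d\n", canonical_kind(path, 6));          /* 1 */
        printf("preserved kind: %d\n", preserved_kind(KIND_UCS2, KIND_UCS2)); /* 2 */
        return 0;
    }
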
>>
>> Oh, I'm afraid this will complicate the whole of unicodeobject.c 
>> (and several other files) considerably, and could introduce a lot of 
>> subtle bugs.
>>
>> For example, when you search for a UCS2 string in a UCS1 string, the 
>> current code returns the result fast, because a UCS1 string can't 
>> contain code points > 0xff, while a UCS2 string must contain at least 
>> one code point > 0xff. And there are many such assumptions.
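
[A sketch of the assumption being described, reusing the Kind enum from 
the sketch above; the real fast path lives in CPython's stringlib 
templates, so this is illustrative only:]

    /* With canonical kinds, a UCS2 needle is guaranteed to contain a
       code point > 0xff that no UCS1 haystack can hold, so the search
       can report "not found" without scanning a single character.  If
       UCS2-kind strings may contain only <= 0xff code points, this
       shortcut is no longer valid. */
    static int search_can_match(Kind needle_kind, Kind haystack_kind)
    {
        return needle_kind <= haystack_kind;
    }
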
> 
> That doesn't change though, as we're only ever expanding the range. So 
> searching for a UCS2 string in a UCS2 string, where neither actually 
> contains any code points > 0xff, is the only case that would be 
> affected, and whether that case occurs more often than the 
> UCS2->UCS1->UCS2 conversion case is something we can measure (but I'd 
> be surprised if substring searches occur more frequently than OS 
> conversions).
> 
> Currently, unicode_compare_eq exits early when the kinds do not match, 
> and that would be a problem (but is also easily fixable). But other 
> string operations already handle mismatched kinds.
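
[Roughly what that fast path looks like today -- a simplified 
paraphrase of unicode_compare_eq, not the actual source, again using 
the Kind enum from the first sketch:]

    #include <string.h>

    /* With canonical kinds, differing kinds imply differing contents,
       and matching kinds allow one memcmp over the raw fixed-width
       data.  If kinds are no longer canonical, the first test becomes
       wrong: two equal strings could be stored at different widths. */
    static int compare_eq_sketch(Kind k1, const void *d1, size_t n1,
                                 Kind k2, const void *d2, size_t n2)
    {
        if (k1 != k2)
            return 0;   /* only safe while kinds are canonical */
        if (n1 != n2)
            return 0;
        return memcmp(d1, d2, n1 * (size_t)k1) == 0;
    }
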

I made the changes (along with a somewhat expensive update to make 
__hash__ produce the same value for UCS1 and UCS2 strings) and it works 
just fine, but the speed difference seems to be fairly trivial. Equality 
time in particular is slower (a highly optimized memcmp vs. a plain-old 
for loop).
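
For illustration of where the slowdown comes from, a minimal sketch of 
a kind-insensitive comparison (not the actual patch; the same 
per-code-point reading is also what makes a width-independent __hash__ 
more expensive):

    #include <stdint.h>

    /* Read position i of a string at that string's own unit width. */
    static uint32_t read_cp(Kind k, const void *data, size_t i)
    {
        switch (k) {
        case KIND_UCS1: return ((const uint8_t  *)data)[i];
        case KIND_UCS2: return ((const uint16_t *)data)[i];
        default:        return ((const uint32_t *)data)[i];
        }
    }

    /* Equality across possibly mismatched kinds: correct, but a plain
       code-point loop instead of one memcmp over equal-width data. */
    static int equal_mixed_kinds(Kind k1, const void *d1,
                                 Kind k2, const void *d2, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            if (read_cp(k1, d1, i) != read_cp(k2, d2, i))
                return 0;
        }
        return 1;
    }
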

That said, I didn't remove the wchar_t cache (though I tried some tricks 
to avoid it), so it's possible that once that's gone we'll see an 
avoidable regression here, but on its own this doesn't contribute much.

Cheers,
Steve
