[Python-Dev] PEP 393 Summer of Code Project

Stefan Behnel stefan_ml at behnel.de
Tue Aug 23 17:18:13 CEST 2011


Antoine Pitrou, 23.08.2011 16:08:
> On Tue, 23 Aug 2011 16:02:54 +0200
> Stefan Behnel wrote:
>> "Martin v. Löwis", 23.08.2011 15:17:
>>>> Has this been considered before? Was there a reason to decide against it?
>>>
>>> I think we simply didn't consider it. An early version of the PEP used
>>> the lower bits for the pointer to encode the kind, in which case it even
>>> stopped being a pointer. Modules are not expected to access this
>>> pointer except through the macros, so it may not matter that much.
>>
>> The difference is that you *could* access them directly in a safe way, if
>> it was a union.
>>
>> So, for an efficient character loop, replicated for performance reasons or
>> for character range handling reasons or whatever, you could just check the
>> string kind and then jump to the loop implementation that handles that
>> type, without using any further macros.
>
> Macros are useful to shield the abstraction from the implementation. If
> you access the members directly, and the unicode object is represented
> differently in some future version of Python (say e.g. with tagged
> pointers), your code doesn't compile anymore.

Even with tagged pointers, you could just provide a macro that unpacks the
pointer to the buffer for a given string kind. I don't think much more is
needed to preserve the abstraction. I don't see a reason to prevent
users from accessing the memory buffer directly, especially not by
(accidental, as I understand it) obfuscation through a void*.
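
To illustrate the kind of dispatch I mean, here is a minimal sketch. All
names in it (sketch_str, sketch_kind, sketch_count) and the struct layout
are made up for illustration and are not the PEP 393 definitions; it only
assumes a kind field stored next to a union of typed buffer pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and layout, for illustration only. */
enum sketch_kind { SKETCH_1BYTE = 1, SKETCH_2BYTE = 2, SKETCH_4BYTE = 4 };

struct sketch_str {
    size_t length;              /* number of code points */
    enum sketch_kind kind;      /* width of one code point in bytes */
    union {
        void     *any;          /* untyped view, as with a plain void* */
        uint8_t  *ucs1;
        uint16_t *ucs2;
        uint32_t *ucs4;
    } data;
};

/* Count occurrences of ch: check the kind once, then run a loop that
 * reads the matching union member directly, with no per-character macro. */
static size_t sketch_count(const struct sketch_str *s, uint32_t ch)
{
    size_t i, n = 0;

    switch (s->kind) {
    case SKETCH_1BYTE:
        for (i = 0; i < s->length; i++)
            n += (s->data.ucs1[i] == ch);
        break;
    case SKETCH_2BYTE:
        for (i = 0; i < s->length; i++)
            n += (s->data.ucs2[i] == ch);
        break;
    case SKETCH_4BYTE:
        for (i = 0; i < s->length; i++)
            n += (s->data.ucs4[i] == ch);
        break;
    }
    return n;
}
```

If the representation changed later (tagged pointers, say), only the way
the typed pointer is obtained would need to go through an unpacking macro;
the per-kind loops themselves could stay as they are.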

Stefan
