[Python-Dev] undesirable unpickle behavior, proposed fix

Jake McGuire jake at youtube.com
Tue Jan 27 21:25:02 CET 2009


On Jan 27, 2009, at 11:40 AM, Martin v. Löwis wrote:
>> Hm. This would change the pickling format though. Wouldn't just
>> interning (short) strings on unpickling be simpler?
>
> Sure - that's what Jake had proposed. However, it is always difficult
> to select which strings to intern - his heuristics (IIUC) is to intern
> all strings that appear as dictionary keys. Whether this is good enough,
> I don't know. In particular, it might intern very large strings that
> aren't identifiers at all.

I may have misunderstood how unpickling works, but I believe that my
patch only interns strings that are keys in a dictionary used to
populate an instance. This is very similar to how instance creation
and modification work in Python now. The only difference is that if you
set an attribute via "inst.__dict__['attribute_name'] = value", then
'attribute_name' will not be automatically interned, but if you pickle
the instance, 'attribute_name' will be interned on unpickling.
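
To make that concrete, here is a rough sketch of the idea. It is not the
patch itself: populate_instance is a hypothetical helper name, and the
real unpickler code is structured differently. It only shows string keys
being interned as they are copied into an instance's __dict__.

```python
import sys


def populate_instance(inst, state):
    """Copy state into inst.__dict__, interning string keys.

    Hypothetical helper for illustration only; not the actual patch.
    """
    inst_dict = inst.__dict__
    for key, value in state.items():
        # Only real str keys are interned; anything else is stored as-is.
        if type(key) is str:
            key = sys.intern(key)
        inst_dict[key] = value


class Point:
    pass


# Build equal but distinct key strings at runtime, roughly the way an
# unpickler creates a fresh string object for each key it reads.
name_a = "".join(["attribute", "_", "name"])
name_b = "".join(["attribute", "_", "name"])

p, q = Point(), Point()
populate_instance(p, {name_a: 1})
populate_instance(q, {name_b: 2})

# After populating, both instances hold the same key object.
key_p = next(iter(p.__dict__))
key_q = next(iter(q.__dict__))
```

With this in place, every unpickled instance shares one string object
per attribute name instead of carrying its own copy.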

There may be cases where users specifically go through __dict__ to  
avoid interning attribute names, but I would be surprised to hear  
about it and very interested in talking to the person who did that.

Creating a new pickle protocol to handle this case seems excessive...

-jake 

