[IronPython] Portable use of pickle.dumps()

Michael Foord fuzzyman at voidspace.org.uk
Fri May 29 17:10:43 CEST 2009


Robert Smallshire wrote:
> Hi Michael,
>
>
>>> I'm trying to get some commercial code for a simple object database we
>>> have written for Python 2.6 to work with IronPython 2.6. In Python 2.6
>>> the return type of pickle.dumps() is str, which is of course a byte
>>> string.  In IronPython 2.6 it is also str, which is of course a
>>> unicode string.  This 'compatibility' is fine until I put those
>>> strings into a database, at which point my interoperability between
>>> CPython and IronPython goes off the rails.
>>>
>> How is this actually a problem?
>>
>> I mean, can you provide a specific example of where a string in
>> IronPython doesn't behave as a byte string in CPython? I'm sure there
>> are such examples, but those may be bugs that the IPy team can fix. In
>> practice I've encountered these problems very rarely.
>
> My opening paragraph may be ambiguously worded - by 'interoperability' I
> didn't mean the ability to run the same code unchanged on CPython and
> IronPython (I have to change the code anyway to use a different database
> adapter) - I meant interoperability between pickles persisted into a
> database from both IronPython and CPython.
>   

So are you telling the database that it is binary data or text?

Is the question how to get from a pickle string in IronPython to a byte 
array that you can pass to the database adapter, without going through an 
explicit encode (which would transform the data)?

(One technique would be to explicitly use pickle protocol 0, which is 
less efficient but only produces ASCII characters for typical data; it 
is actually the default in Python 2.6. Another alternative would be to 
use JSON or YAML instead of pickle.)
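
As a rough illustration of the protocol 0 suggestion, the sketch below 
pickles a small sample value with protocol 0 and with protocol 2 and 
checks which result stays within ASCII. The `sample` value and the 
`is_ascii` helper are made up for this illustration. It is written for 
CPython, where pickle.dumps() returns a byte string; bytearray() may 
well behave differently on IronPython.

```python
import pickle

# Hypothetical sample data, ASCII-only on purpose. Non-ASCII text inside
# the pickled data is not covered by this sketch.
sample = {"name": "fish", "values": (None, 7.2, 7j)}


def is_ascii(pickled):
    """Return True if every byte of the pickle is below 128."""
    return all(b < 128 for b in bytearray(pickled))


text_pickle = pickle.dumps(sample, protocol=0)    # text-based protocol
binary_pickle = pickle.dumps(sample, protocol=2)  # binary protocol

print(is_ascii(text_pickle))
print(is_ascii(binary_pickle))
```

The binary pickle begins with the protocol marker byte 0x80, so it can 
never pass an ASCII-only check, whatever the data.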

Here is an example of getting a byte array from a binary pickle in 
IronPython:

 >>> import pickle
 >>> class A(object):
...     b = 'hello'
...     c = (None, 'fish', 7.2, 7j)
...     a = {1: 2}
...
 >>> p = pickle.dumps(A(), protocol=2)
 >>> p
u'\x80\x02c__main__\nA\nq\x00)\x81q\x01}q\x02b.'
 >>> from System import Array, Byte
 >>> # Map each character to the byte with the same ordinal value.
 >>> a = Array[Byte](tuple(Byte(ord(c)) for c in p))
 >>> a
Array[Byte]((<System.Byte object at 0x0000000000000033 [128]>, 
<System.Byte obje...
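
The ord()-per-character step above maps each character to the byte with 
the same value, which is the same thing a Latin-1 encode does. The 
sketch below simulates that on CPython 3, since that is easier to run 
than IronPython: the pickle is decoded as Latin-1 to stand in for the 
unicode str that IronPython returns. The names `as_text`, `byte_values` 
and `recovered` are illustrative only. It also shows why a UTF-8 encode 
is not a substitute.

```python
import pickle

original = pickle.dumps({1: 2}, protocol=2)  # bytes on CPython 3

# Stand-in for the unicode str IronPython returns: one character per byte.
as_text = original.decode("latin-1")

# The ord()-per-character technique: each code point becomes one byte value.
byte_values = [ord(c) for c in as_text]
recovered = bytes(bytearray(byte_values))

# A UTF-8 encode expands every character above 127 (such as '\x80') into
# two bytes, so the result is no longer the original pickle.
utf8_encoded = as_text.encode("utf-8")

print(recovered == original)
print(utf8_encoded == original)
print(pickle.loads(recovered))
```

This only works because every character in the string has a value 
below 256; a string holding real text beyond Latin-1 would need a 
different treatment.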

I hope this is at least slightly helpful. :-)

Michael



> My basic issue is that the 'str' unavoidably implies certain semantics when
> calling .NET APIs from IronPython. These APIs interpret str as text rather
> than just bytes, which therefore gets transformed by various text encodings,
> such as UTF-8 to UTF-16. Such encodings are undesirable for my pickled data
> since the result is no longer necessarily a valid pickle.   I suppose the
> intention in Python 3.0 is that 'bytes' doesn't carry any semantics with it,
> it's just data, which is why pickle.dumps() in Python 3.0 returns bytes
> rather than str.
>
> I want to push plain old byte arrays into the database from both CPython and
> IronPython, so I can avoid any head-scratching confusion with database
> adapters and/or databases inappropriately encoding or decoding my data.
>
>   
>> For example "data = [ord(c) for c in some_string]" has behaved as 
>> expected many times for me in IronPython (and could help you turn 
>> strings into bytes).
>>     
>
> Thanks. I'll try something based on that.
>
>   
>> Is this a theoretical problem at this stage or an actual problem?
>>     
>
> It's an actual problem with SQLiteParameter.Value from the SQLite ADO.NET
> provider.  I think our original CPython code is a bit sloppy with respect to
> the distinction between text strings and byte arrays, so I'll probably need
> to tighten things up on both sides.
>
> Would you agree that using unicode() and bytes() everywhere and avoiding
> str() gives code that has the same meaning in Python 2.6, IronPython 2.6 and
> Python 3.0?  Do you think this would be a good guideline to follow until we
> can leave Python 2.x behind?
>
> Many thanks,
>
> Rob
>


-- 
http://www.ironpythoninaction.com/



