PyYaml?

Jeremy Bowers jerf at jerf.org
Sun Sep 19 21:11:55 EDT 2004


On Sun, 19 Sep 2004 23:23:18 +0000, Chris S. wrote:

> I don't quite follow your logic. If you load a serialized file, you should
> conceivably already know what classes it should and should not be
> instantiating, and be able to restrict its access accordingly.

In theory, yes. In Java, yes, I would imagine. In Python, not so far. In
fact, note that Bastion and RExec have been removed from modern Pythons
because they were false assurances. Securing Pickle is probably the same
problem as re-writing those modules to work in modern Python. People more
familiar with the internals can give more details about that, though I'd
google the Python dev list before asking anyone.

It is probably not impossible in theory to add this to Python, but it is
surprisingly difficult; it is the sort of thing you have to design into
the language from day one, and even then it is hard.
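
To make the Pickle worry concrete, here is a minimal sketch (written for a
present-day Python 3; the class name Sneaky is made up for illustration). An
object gets to tell pickle which callable to invoke when it is loaded, so the
loader ends up running whatever the file names, not just rebuilding data. I
picked a harmless callable here; nothing in the format forces that.

```python
import pickle

class Sneaky:
    # __reduce__ tells pickle how to rebuild the object: "call this
    # callable with these arguments". Here that is just len("four").
    def __reduce__(self):
        return (len, ("four",))

payload = pickle.dumps(Sneaky())

# Loading does not give back a Sneaky instance; it calls len("four")
# and returns whatever that call returns.
result = pickle.loads(payload)
print(result)
```

The point is only that loading is a form of code execution, which is why
restricting it after the fact is the hard part.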

> I meant language and platform portability. I suppose you'd find this
> aspect attractive for the same reasons you'd use XML, which some have also
> used as a serialization format. Granted, not every language's objects may
> be translatable, but many languages share common data primitives.

You'd be surprised, if you actually tried. (This is technically off-topic,
not directly about Pickle.)

A data type is basically a range of values, plus a set of operations on
those values that each return some value.

So, C has this thing called an "int", right? Surely Python has it too.

But, technically, it doesn't. Compare (this is "test.cpp"):

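#include <iostream>

int main() {
  int a = 1073741824;  // 2 to the 30th
  // With a 32-bit int this multiplication overflows (formally undefined
  // behavior in C++; in practice it usually wraps around).
  std::cout << a * 4 << std::endl;
  return 0;
}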

>> g++ test.cpp  
>> a.out
0

with:

>> python -c "print 1073741824 * 4"
4294967296



Python and C do *not* have the same int "datatype". This matters when you
go to serialize 2**43 and the resulting number works in Python but comes
out as some random-looking value in C++.
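
As a rough sketch of what happens on the receiving end (present-day Python 3,
and assuming the other side stores the number in a 32-bit two's-complement
int; to_int32 and fits_int32 are names I made up for illustration), you can
mimic the wraparound in Python itself, and see that a fixed-width format has
to either mangle the value or refuse it:

```python
import struct

def to_int32(n):
    """Keep the low 32 bits of n, read as a signed two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

def fits_int32(n):
    """True if n can be packed as a 32-bit signed int without loss."""
    try:
        struct.pack("<i", n)
        return True
    except struct.error:
        return False

print(to_int32(1073741824 * 4))   # 0, same as the C++ program above
print(to_int32(3000000000))       # -1294967296
print(to_int32(2 ** 43 + 12345))  # 12345
print(fits_int32(2 ** 43))        # False
```

None of the inputs is unusual from Python's point of view; the trouble only
shows up once a fixed-width type is involved.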

If you really get down to it, languages share far fewer datatypes than
you'd think, and while the ten-mile-high view says "Oh, that shouldn't
matter", I assure you, if you actually tried to design a serialization
protocol you'd rapidly find it matters.



