Portable readable dump format

David Mertz mertz at gnosis.cx
Sun Oct 26 13:23:18 EST 2003


Chris Stiles <chris at example.org> wrote previously:
|Is there a human readable dump format for objects ?  As opposed to
|pickling ?  It doesn't necessarily have to be XML - in fact i'd prefer
|if it wasn't ;) - but ISTR an XML dump library.

Another poster pointed to YAML, which is a good format.  If you want
XML, however, I have gnosis.xml.pickle in Gnosis Utilities.

Some comparisons:

    >>> import gnosis.xml.pickle
    >>> import yaml
    >>> class Something:
    ...     def __init__(self):
    ...         self.tup = (1,2,3)
    ...         self.dct = {1.4:5.6,7.8:9.0}
    ...         self.str = "Spam and eggs"
    ...
    >>> o = Something()
    >>> print gnosis.xml.pickle.dumps(o)
    <?xml version="1.0"?>
    <!DOCTYPE PyObject SYSTEM "PyObjects.dtd">
    <PyObject module="__main__" class="Something" id="2283692">
    <attr name="tup" type="tuple" id="2577828" >
      <item type="numeric" value="1" />
      <item type="numeric" value="2" />
      <item type="numeric" value="3" />
    </attr>
    <attr name="dct" type="dict" id="2829140" >
      <entry>
        <key type="numeric" value="1.3999999999999999" />
        <val type="numeric" value="5.5999999999999996" />
      </entry>
      <entry>
        <key type="numeric" value="7.7999999999999998" />
        <val type="numeric" value="9." />
      </entry>
    </attr>
    <attr name="str" type="string" value="Spam and eggs" />
    </PyObject>
    >>> print yaml.dump(o)
    --- !!__main__.Something
    dct:
        1.4: 5.6
        7.8: 9.0
    str: Spam and eggs
    tup:
        - 1
        - 2
        - 3

Certainly YAML is a lot less verbose (but also, in a way, less
explicit).  The function 'yaml.dump()' is misnamed, however: it returns
a string, so by analogy with 'pickle.dumps()' it should have an 's' at
the end.
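
For reference, the naming convention at issue is the one used by the
standard pickle module, where dump() writes to a file object and
dumps() returns the serialized data.  A minimal sketch using only the
standard library (written for current Python 3, unlike the Python 2
session above; the variable names are just for illustration):

```python
import io
import pickle

data = {"tup": (1, 2, 3), "str": "Spam and eggs"}

# dump() writes the serialized form to a file-like object and returns None.
buf = io.BytesIO()
result = pickle.dump(data, buf)

# dumps() ("dump to string") returns the serialized form directly.
serialized = pickle.dumps(data)

# Either way, the data can be loaded back unchanged.
from_file = pickle.loads(buf.getvalue())
from_string = pickle.loads(serialized)
print(result, from_file == data, from_string == data)
```

A function that hands back the text, as 'yaml.dump()' does in the
session above, matches the dumps() half of that pair.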

However, notice this:

    >>> class NewSomething(object):
    ...     def __init__(self):
    ...         self.tup = (1,2,3)
    ...         self.dct = {1.4:5.6,7.8:9.0}
    ...         self.str = "Spam and eggs"
    ...
    >>> o2 = NewSomething()
    >>> print yaml.dump(o2)
    --- <__main__.NewSomething object at 0x2abbcc>

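For a new-style class, 'yaml.dump()' emits only the instance's repr()
and drops its attributes.  gnosis.xml.pickle does fine here: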

    >>> print gnosis.xml.pickle.dumps(o2)
    <?xml version="1.0"?>
    <!DOCTYPE PyObject SYSTEM "PyObjects.dtd">
    <PyObject module="__main__" class="NewSomething" id="2800588">
    <attr name="tup" type="tuple" id="2576628" >
      <item type="numeric" value="1" />
      <item type="numeric" value="2" />
      <item type="numeric" value="3" />
    </attr>
    <attr name="dct" type="dict" id="2829820" >
      <entry>
        <key type="numeric" value="1.3999999999999999" />
        <val type="numeric" value="5.5999999999999996" />
      </entry>
      <entry>
        <key type="numeric" value="7.7999999999999998" />
        <val type="numeric" value="9." />
      </entry>
    </attr>
    <attr name="str" type="string" value="Spam and eggs" />
    </PyObject>

Moreover, notice this:

    >>> class ListandTuple:
    ...     def __init__(self):
    ...         self.tup = (1,2,3)
    ...         self.lst = [1,2,3]
    ...
    >>> lt = ListandTuple()
    >>> print yaml.dump(lt)
    --- !!__main__.ListandTuple
    lst:
        - 1
        - 2
        - 3
    tup:
        - 1
        - 2
        - 3

YAML is lossy with regard to Python types: the tuple and the list come
out identically.  Some other languages supported by the YAML format do
not have the list/tuple distinction, but since Python does, you usually
want to keep it (which gnosis.xml.pickle does).
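
To illustrate what "keeping the distinction" amounts to, here is a
minimal sketch of a readable dump that records the concrete type name
of each container.  It is not how gnosis.xml.pickle or any YAML library
works; readable_dumps() is a hypothetical helper, written for current
Python 3 with only built-ins:

```python
def readable_dumps(obj, indent=0):
    """Return an indented, type-tagged text dump of obj (hypothetical helper)."""
    pad = "    " * indent
    if isinstance(obj, (list, tuple)):
        # Record the concrete sequence type so list and tuple stay distinct.
        lines = [pad + type(obj).__name__ + ":"]
        for item in obj:
            lines.append(readable_dumps(item, indent + 1))
        return "\n".join(lines)
    if isinstance(obj, dict):
        lines = [pad + "dict:"]
        # Sort by repr() so the output order is stable for mixed key types.
        for key in sorted(obj, key=repr):
            lines.append(pad + "    key " + repr(key) + ":")
            lines.append(readable_dumps(obj[key], indent + 2))
        return "\n".join(lines)
    if hasattr(obj, "__dict__"):
        # Any object with a __dict__: dump its attributes by name.
        lines = [pad + type(obj).__name__ + " instance:"]
        for name in sorted(vars(obj)):
            lines.append(pad + "    " + name + " =")
            lines.append(readable_dumps(vars(obj)[name], indent + 2))
        return "\n".join(lines)
    # Everything else (numbers, strings, None) falls back to repr().
    return pad + repr(obj)


class ListandTuple(object):
    def __init__(self):
        self.tup = (1, 2, 3)
        self.lst = [1, 2, 3]


text = readable_dumps(ListandTuple())
print(text)
```

In the printed text the 'lst' attribute sits under a "list:" header and
'tup' under a "tuple:" header, so a loader could rebuild each with its
original type.  The cost is the extra verbosity noted in the comparison
above.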

Yours, David...

--
mertz@  | The specter of free information is haunting the `Net!  All the
gnosis  | powers of IP- and crypto-tyranny have entered into an unholy
.cx     | alliance...ideas have nothing to lose but their chains.  Unite
        | against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------





