XML overuse? (was Re: Python to XML to Python conversion)
holger krekel
pyth at devel.trillke.net
Tue Jul 16 09:18:30 EDT 2002
Huaiyu Zhu wrote:
> holger krekel <pyth at devel.trillke.net> wrote:
> >Huaiyu Zhu wrote:
> >> Readability for machines does not have to come at the expense of readability
> >> for humans. A few years back I experimented with an indentation based data
> >> format that is:
> >>
> >> - as readable as emacs's outline mode
> >> - reduce to common conventions like this paragraph for simple cases
> >> - allow mixed nested structures of set, sequence, dictionary, and seqdict
> >> - can include binary data
> >> - can handle different encodings/encryptions in different elements
> >> - with average less than 5% bloat, in contrast to XML's over 100% bloat
> >
> >do you have any code or design documents for this?
> >
> >Sounds quite interesting.
>
> The basic idea is quite simple: consider a data structure as a tree; denote
> the type of branching at each node; indent the subtrees. It appears to me
> that indentation is easier to handle than quotes and escapes. Here's a
> simple example:
>
> ...snipped...
>
> OK, hope this makes sense.
It does and it's very interesting. It does sound a lot like
http://yaml.org to me, though (They even have an RFC).
Don't you think YAML might be a superset of your ideas?
Let me add some random thoughts/questions about your/YAML's scheme
(I hope I am not missing something obvious):
- how is a binary data-stream's size determined? What about
open-ended streams? Embedding of arbitrary data-streams
is very useful (IMO).
- somehow your and YAML's scheme remind me of today's wiki techniques.
E.g. Wikis have methods of sequence-detection (bullets ...) and they
have a commitment to readability. Of course, they are generally more
concerned with graphical views than with being a concise persistence scheme.
- Is there a canonical conversion between XML and your scheme/YAML?
Shouldn't be too hard, anyway...
- how do you express external addresses, akin to XPath?
Ideas:
- Mappings are easy, just take the 'key'.
- Sequences are easy (take the sequence number) but not very robust
to deletions and insertions of items.
- tag-names (IDs) which can be associated with any item might be interesting.
Readability is likely to suffer, though.
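
To make the addressing ideas above concrete, here is a minimal Python sketch. It is only an illustration on plain nested dicts and lists, not anything taken from YAML or from Huaiyu's format: the '/'-separated path syntax, the "id" key, and the helper names resolve() and find_by_id() are all assumptions made up for this example.

```python
def resolve(data, path):
    """Follow a '/'-separated path (hypothetical syntax) through nested
    mappings and sequences: mapping steps are keys, sequence steps are
    integer positions."""
    node = data
    for step in (s for s in path.split("/") if s):
        if isinstance(node, dict):
            node = node[step]            # mapping: just take the key
        elif isinstance(node, (list, tuple)):
            node = node[int(step)]       # sequence: take the position
        else:
            raise KeyError("cannot descend into %r at step %r" % (node, step))
    return node


def find_by_id(node, tag):
    """Depth-first search for the first mapping whose (hypothetical)
    'id' entry equals tag. Returns None if nothing matches."""
    if isinstance(node, dict):
        if node.get("id") == tag:
            return node
        children = node.values()
    elif isinstance(node, (list, tuple)):
        children = node
    else:
        return None
    for child in children:
        found = find_by_id(child, tag)
        if found is not None:
            return found
    return None


doc = {
    "project": {
        "members": [
            {"id": "holger", "lang": "python"},
            {"id": "friend", "lang": "perl"},
        ]
    }
}

by_position = resolve(doc, "project/members/1/lang")   # "perl"

# Inserting an item shifts every later position ...
doc["project"]["members"].insert(0, {"id": "newcomer", "lang": "c"})
shifted = resolve(doc, "project/members/1/lang")       # now "python"

# ... while an ID-based lookup still finds the same item.
by_id = find_by_id(doc, "friend")["lang"]              # still "perl"
```

The same path gives a different answer once an item is inserted, which is the fragility of sequence numbers mentioned above; the ID lookup survives the insertion at the cost of an extra field on every addressable item.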
btw, I wonder whether some form of your and/or YAML's ideas should play a
role in the new persistence-SIG. While the actual persistence mappings
are not the focus there, there are certainly some interesting connections
between the two areas.
> If this is still interesting I'll dig the thing
> out. I have documents and code (perl and python) at home, but I'll have to
> ...
This sure is useful, especially for me, since I work with a (perl-)
friend on a project which needs to address the persistence question. And
we want to have it interoperable, simple and fast. I guess looking
at YAML might save you from digging too deep into old hard disks :-)
holger