XML overuse? (was Re: Python to XML to Python conversion)

Jonathan Hogg jonathan at onegoodidea.com
Sun Jul 14 04:18:52 EDT 2002


On 14/7/2002 2:46, in article agql4s$np5pt$1 at ID-125932.news.dfncis.de,
"Christopher Browne" <cbbrowne at acm.org> wrote:

> Furthermore, what I'm trying to have as an underlying "theme" in the
> "music" is that it might very well be easier to build a Lex/Yacc
> grammar using C and link it in than to fight your way through
> designing the XML-based system.  There may be more "elegant" options
> than Lex/Yacc; the underlying theme is that that doesn't prevent the
> "design-the-grammar-from-scratch" approach from being more manageable
> than XML.

Do you find using XML-parsing libraries a fight? Certainly in Python I have
found using the XML libraries to be astonishingly simple.
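
To give a rough idea of what "simple" means here, a minimal sketch using the
standard library's xml.dom.minidom follows. The sample document and its
element names are invented purely for illustration, as is the item_names
helper.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; the element and attribute names are
# made up for this illustration.
SAMPLE = """<inventory>
  <item id="1"><name>widget</name><price>2.50</price></item>
  <item id="2"><name>gadget</name><price>10.00</price></item>
</inventory>"""


def item_names(xml_text):
    """Return the text content of every <name> element in the document."""
    doc = parseString(xml_text)
    # Each <name> element here holds a single text node as its first child.
    return [node.firstChild.data for node in doc.getElementsByTagName("name")]


names = item_names(SAMPLE)
print(names)
```

No grammar had to be designed and no parser had to be written; the only
format-specific knowledge in the code is the tag name being looked up.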

The thing is, using lex and yacc (or whatever your favourite parsing
framework may be) may well be easier if you're already comfortable with
them. But in order to use them you need to design a syntax. The resulting
syntax is useful only to programs that have a parser for that syntax. With
XML, the data can be imported, queried, and translated by anyone or any
program with a basic XML toolkit.

As I tried to show before, with tools like XPath, one can extract useful
information out of any XML file.
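
As a hedged sketch of that kind of query, the example below uses
xml.etree.ElementTree, which implements only a limited subset of XPath (a
full XPath engine needs a separate library). The document, its "category"
attribute, and the variable names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; structure and names are illustrative only.
SAMPLE = """<inventory>
  <item id="1" category="tools"><name>widget</name><price>2.50</price></item>
  <item id="2" category="toys"><name>gadget</name><price>10.00</price></item>
  <item id="3" category="tools"><name>sprocket</name><price>4.75</price></item>
</inventory>"""

root = ET.fromstring(SAMPLE)

# XPath-style query: the <name> of every <item> whose category is "tools".
tool_names = [e.text for e in root.findall("./item[@category='tools']/name")]

# Another query: every <price> element at any depth, summed as floats.
total = sum(float(p.text) for p in root.findall(".//price"))

print(tool_names)
print(total)
```

The queries are written without knowing anything about how the file was
produced, which is the point: someone handed an unfamiliar XML file can
still pull data out of it with off-the-shelf tools.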

And to answer François' earlier point about not being able to use
"standardised" meaningfully with regard to XML: I consider XML to be
"standardised" not because the W3C said so, but because parsing, validating,
querying, and transforming frameworks are available for nearly any language
off-the-shelf, editors support it, and database and data manipulation tools
support it.

I'm afraid Pickle doesn't come close in this regard (and isn't human
readable anyway). CSV is probably closer but it doesn't support complex
enough structure for me. ASN.1 might be a contender but also isn't human
readable and doesn't have the same availability of tools. And certainly, any
random syntax I might come up with will have no support beyond whatever I
write myself.

People like to scoff at so-called "Enterprise" computing, but in large
organisations every new file format that someone comes up with represents a
new maintenance headache. I don't want to have to reverse engineer file
formats and come up with custom parsers every time I need to make two
different systems interoperate.

Jonathan



