XML Considered Harmful

alister alister.ware at ntlworld.com
Tue Sep 21 18:30:40 EDT 2021


On Tue, 21 Sep 2021 14:22:52 -0500, Michael F. Stemper wrote:

> On 21/09/2021 13.49, alister wrote:
>> On Tue, 21 Sep 2021 13:12:10 -0500, Michael F. Stemper wrote:
>> 
>>> On the prolog thread, somebody posted a link to:
>>> <https://dirtsimple.org/2004/12/python-is-not-java.html>
>>>
>>> One thing that it tangentially says is "XML is not the answer."
>>>
>>> I read this page right when I was about to write an XML parser to get
>>> data into the code for a research project I'm working on.
>>> It seems to me that XML is the right approach for this sort of thing,
>>> especially since the data is hierarchical in nature.
>>>
>>> Does the advice on that page mean that I should find some other way to
>>> get data into my programs, or does it refer to some kind of
>>> misuse/abuse of XML for something that it wasn't designed for?
>>>
>>> If XML is not the way to package data, what is the recommended
>>> approach?
>> 
>> 1st, can I say: don't write your own XML parser, there are already a
>> number of existing parsers that should do everything you will need. 
>> This is a wheel that does not need re-inventing.
> 
> I was going to build it on top of xml.etree.ElementTree
> 
So you're not writing a parser, just using one. That's OK.
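
For what it's worth, a minimal sketch of reading nested data with
xml.etree.ElementTree. The tag and attribute names (generator, fuel, name,
price) and the helper read_generators are made up for illustration, not
taken from your project:

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the tag and attribute names are assumptions.
XML_TEXT = """
<generators>
  <generator name="unit1">
    <fuel name="coal" price="2.5"/>
  </generator>
  <generator name="unit2">
    <fuel name="gas" price="4.0"/>
  </generator>
</generators>
"""

def read_generators(text):
    """Return a list of dicts, one per <generator> element."""
    root = ET.fromstring(text.strip())
    result = []
    for gen in root.findall("generator"):
        fuel = gen.find("fuel")
        result.append({
            "name": gen.get("name"),
            "fuel": fuel.get("name"),
            # Attribute values are always strings, so convert explicitly.
            "price": float(fuel.get("price")),
        })
    return result

generators = read_generators(XML_TEXT)
```

The parsing itself is one call; the rest is just walking the tree.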

>> 2nd if you are not generating the data then you have to use whatever
>> data format you are supplied
> 
> It's my own research, so I can give myself the data in any format that I
> like.
> 
>> as far as I can see the main issue with XML is bloat, it tries to do
>> too many things & is a very verbose format, often the quantity of
>> mark-up can easily exceed the data contained within it.
>> 
>> other formats such as JSON & CSV have far less overhead, although again
>> not always suitable.
> 
> I've heard of JSON, but never done anything with it.
The Python json module makes it simple.
JSON was originally invented for JavaScript, and it looks very much like
what the REPL prints for a list/dictionary, but if you are using the standard
library you don't really need to know that, except out of academic interest.
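
A quick sketch of the round trip, assuming nothing beyond the standard
library (the data itself is invented):

```python
import json

# Made-up nested data: dicts and lists, as you would type them at the REPL.
data = {"name": "unit1", "curve": [[0, 10.5], [100, 11.2]]}

text = json.dumps(data, indent=2)   # Python object -> JSON string
restored = json.loads(text)         # JSON string -> Python object
```

One thing to watch: JSON has no tuple type, so tuples are written out as
arrays and come back as lists.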
> 
> How does CSV handle hierarchical data?
It doesn't; if you have hierarchical data, it is not a suitable format.
> For instance, I have
> generators[1], each of which has a name, a fuel and one or more
> incremental heat rate curves. Each fuel has a name, UOM, heat content,
> and price. Each incremental cost curve has a name, and a series of
> ordered pairs (representing a piecewise linear curve).
> 
> Can CSV files model this sort of situation?
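
That structure maps fairly directly onto nested dictionaries and lists, which
is what JSON stores. A rough sketch only: every key name, the numbers, and the
helper fuel_cost_per_unit_heat are my own invention, so adjust to taste:

```python
import json

# All names and numbers below are invented for illustration.
plant = {
    "fuels": {
        "coal": {"uom": "ton", "heat_content": 24.0, "price": 60.0},
    },
    "generators": [
        {
            "name": "unit1",
            "fuel": "coal",  # refers to a key in "fuels"
            "heat_rate_curves": [
                # Each curve is a name plus ordered (x, y) pairs.
                {"name": "curve_a", "points": [[50, 9.8], [100, 10.4]]},
            ],
        },
    ],
}

def fuel_cost_per_unit_heat(plant, generator_name):
    """Price divided by heat content for the named generator's fuel."""
    for gen in plant["generators"]:
        if gen["name"] == generator_name:
            fuel = plant["fuels"][gen["fuel"]]
            return fuel["price"] / fuel["heat_content"]
    raise KeyError(generator_name)

text = json.dumps(plant, indent=2)   # what you would save to a file
loaded = json.loads(text)            # what you would get back
cost = fuel_cost_per_unit_heat(loaded, "unit1")
```

Keeping the fuels in their own table and referring to them by name avoids
repeating the fuel details inside every generator.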
> 
>> As in all such cases it is a matter of choosing the most appropriate
>> tool for the job in hand.
> 
> Naturally. That's what I'm exploring.
> 
> 
> [1] The kind made of tons of iron and copper, filled with oil, and
> rotating at 1800 rpm.





-- 
Riches cover a multitude of woes.
		-- Menander

