enumerate XML tags (keys that will become headers) along with text (values) and write to CSV in one row (as opposed to "stacked" values with one header)

Marko Rauhamaa marko at pacujo.net
Tue Jun 30 12:32:06 EDT 2015


Robert Kern <robert.kern at gmail.com>:

>> <tuple>
>>    <item>tuplemember_1</item>
>>    <item>tuplemember_2</item>
>> ....
>>    <item>tuplemember_n</item>
>> </tuple>
>
> [...]
>
> Yes! Any conforming XML implementation will preserve the order.

And not only the order: the newlines and other whitespace around the
<item>s are also preserved as children of <tuple>.
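
To see this concretely, here is a minimal sketch using xml.dom.minidom
from the standard library (my choice of parser for illustration; the
two-item document is made up):

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<tuple>\n"
    "   <item>tuplemember_1</item>\n"
    "   <item>tuplemember_2</item>\n"
    "</tuple>"
)

# Every child of <tuple>, in document order. Elements are shown by tag
# name, text nodes by their raw character data.
children = [
    child.tagName if child.nodeType == child.ELEMENT_NODE else child.data
    for child in doc.documentElement.childNodes
]
print(children)
```

The whitespace-only text nodes show up as siblings of the <item>
elements, so code that walks the children has to skip them explicitly.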

> And if you are completely blind as to the schema as the OP apparently
> is, and you are simply given a load of XML and told to do "something"
> with it, you don't know if any given collection is meant to be ordered
> or unordered. Of course, the only sensible thing to do is just
> preserve the order given to you as that is what the semantics of XML
> requires of you in the absence of a schema that says otherwise.

XML is an unfortunate creation. You cannot fully parse it without
knowing the schema (or document type). For example, these constructs
might or might not be equivalent:

    <stuff>&aring;</stuff>

    <stuff>å</stuff>

That is because &...; entities are defined in the document type.
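
A small sketch of the point with xml.etree.ElementTree. The entity name
"aring" and both DOCTYPE declarations are made up for illustration; the
same element text resolves differently depending on what the document
type says, and not at all when nothing declares the entity:

```python
import xml.etree.ElementTree as ET

# Same element, two hypothetical document types defining "aring" differently.
as_letter = '<!DOCTYPE stuff [<!ENTITY aring "&#229;">]><stuff>&aring;</stuff>'
as_other = '<!DOCTYPE stuff [<!ENTITY aring "something else">]><stuff>&aring;</stuff>'

letter_text = ET.fromstring(as_letter).text
other_text = ET.fromstring(as_other).text

# With no declaration in sight, the parser cannot resolve the reference.
try:
    ET.fromstring("<stuff>&aring;</stuff>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

print(repr(letter_text), repr(other_text), undeclared_ok)
```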

Similarly, these two constructs might or might not be equivalent:

   <stuff greet="hello"/>

   <stuff greet=" hello "/>

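By default, they would not be, since an undeclared attribute keeps its
surrounding spaces, but the document type can declare the attribute as a
tokenized type (NMTOKEN, say), which strips whitespace around the value.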
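
What a parser actually hands you can be checked directly. The sketch
below uses xml.etree.ElementTree; the ATTLIST declaration is a
hypothetical one I wrote for the example. As far as I understand XML 1.0
attribute-value normalization, an undeclared attribute is treated as
CDATA and keeps its spaces, while a tokenized type such as NMTOKEN has
leading and trailing spaces discarded:

```python
import xml.etree.ElementTree as ET

# No document type: greet is treated as CDATA.
plain = ET.fromstring('<stuff greet=" hello "/>')

# Hypothetical internal subset declaring greet as NMTOKEN (non-CDATA).
typed = ET.fromstring(
    '<!DOCTYPE stuff [<!ATTLIST stuff greet NMTOKEN #IMPLIED>]>'
    '<stuff greet=" hello "/>'
)

plain_greet = plain.get("greet")  # spaces kept
typed_greet = typed.get("greet")  # spaces stripped by the parser
print(repr(plain_greet), repr(typed_greet))
```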

Finally, these two constructs might or might not be equivalent:

   <count> 7 </count>
   <count>7</count>

By default, they wouldn't be, but the document type can declare
whitespace *insignificant* around element contents.
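
The standard library does no schema validation, so the sketch below
fakes the schema with a plain dict (SCHEMA is a hypothetical stand-in,
not a real API). It only illustrates that the two documents differ as
raw text and agree once something outside the document says that
<count> holds an integer:

```python
import xml.etree.ElementTree as ET

padded = ET.fromstring("<count> 7 </count>").text
bare = ET.fromstring("<count>7</count>").text

# Hypothetical schema knowledge supplied by the application: tag -> type.
SCHEMA = {"count": int}

same_as_text = padded == bare
same_as_typed = SCHEMA["count"](padded) == SCHEMA["count"](bare)
print(same_as_text, same_as_typed)
```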

Sigh, S-expressions would have been a much better universal data format.


Marko


