XML Considered Harmful

Chris Angelico rosuav at gmail.com
Thu Sep 23 09:27:13 EDT 2021


On Thu, Sep 23, 2021 at 10:55 PM Mats Wichmann <mats at wichmann.us> wrote:
>
> On 9/22/21 10:31, Dennis Lee Bieber wrote:
>
> >       If you control both the data generation and the data consumption,
> > finding some format  ...
>
> This is really the key.  I rant at people seeming to believe that csv is
> THE data interchange format, and it's about as bad as it gets at that,
> if you have a choice.  xml is noisy but at least (potentially)
> self-documenting, and ought to be able to recover from certain errors.
> The problem with csv is that a substantial chunk of the world seems to
> live inside Excel, and so data is commonly both generated in csv so it
> can be imported into excel and generated in csv as a result of exporting
> from excel, so the parts often are *not* in your control.
>
> Sigh.

The only people who think that CSV is *the* format are people who
habitually live in spreadsheets. People who move data around the
internet, from program to program, are much more likely to assume that
JSON is the sole format. Of course, there is no single ultimate data
interchange format, but JSON is a lot closer to one than CSV is.

(Or to be more precise: any such thing as a "single ultimate data
interchange format" will be so generic that it isn't enough to define
everything. For instance, "a stream of bytes" is a universal data
interchange format, but that's not ultimately a very useful claim.)

ChrisA


More information about the Python-list mailing list