XML Considered Harmful

Chris Angelico rosuav at gmail.com
Thu Sep 23 23:49:04 EDT 2021


On Fri, Sep 24, 2021 at 1:44 PM Dan Stromberg <drsalists at gmail.com> wrote:
>
>
> On Thu, Sep 23, 2021 at 8:12 PM Chris Angelico <rosuav at gmail.com> wrote:
>>
>> One good hybrid is to take a subset of Python syntax (so it still
>> looks like a Python script for syntax highlighting etc), and then
>> parse that yourself, using the ast module. For instance, you can strip
>> out comments, then look for "VARNAME = ...", and parse the value using
>> ast.literal_eval(), which will give you a fairly flexible file format
>> that's still quite safe.
>
>
> Restricting Python with the ast module is interesting, but I don't think I'd want to bet my career on the actual safety of such a thing.  Given that Java bytecode was a frequent problem inside web browsers, imagine all the messiness that could accidentally happen with a subset of Python syntax from untrusted sources.
>
> ast.literal_eval might be a little better - or a list of such, actually.

Uhh, I specifically mention literal_eval in there :) Simple text
parsing followed by literal_eval for the bulk of it is a level of
safety that I *would* bet my career on.
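
For anyone who wants to see the shape of it, here is a minimal sketch of
that approach, not a hardened implementation. The function name
parse_config and the one-assignment-per-line rule are my own assumptions
for illustration; only ast.literal_eval() is the real stdlib piece.

```python
import ast


def parse_config(text):
    """Parse lines of the form NAME = <Python literal> into a dict.

    Hypothetical helper: assumes one assignment per line and skips
    blank lines and whole-line comments.
    """
    config = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # nothing to parse on this line
        # Split on the first "=" only, so values may contain "=" themselves.
        name, sep, value = stripped.partition("=")
        name = name.strip()
        if not sep or not name.isidentifier():
            raise ValueError(f"line {lineno}: expected NAME = value")
        try:
            # literal_eval accepts only literals (strings, numbers, tuples,
            # lists, dicts, sets, booleans, None); it never calls functions.
            config[name] = ast.literal_eval(value.strip())
        except (ValueError, SyntaxError) as exc:
            raise ValueError(f"line {lineno}: value is not a literal") from exc
    return config
```

Anything that isn't a plain literal, such as a function call or an
attribute lookup, gets rejected with ValueError rather than executed.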

> Better still to use JSON or ini format - IOW something designed for the purpose.

It all depends on how human-editable it needs to be. JSON has several
problems in that respect, including some rigidities, and a lack of
support for comments. INI format doesn't have enough data types for
many purposes. YAML might be closer, but it's not for every situation
either.
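
A small illustration of the comment problem, as a sketch only. The sample
text and the helper name loads_or_none are made up for this example, and
treating trailing commas as one of the "rigidities" is my assumption here.

```python
import ast
import json

# Hypothetical hand-edited config: one comment and two trailing commas.
SAMPLE = """{
    "retries": 3,  # bumped after the outage
    "hosts": ["a", "b",],
}"""


def loads_or_none(text):
    """Return the parsed JSON value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


as_json = loads_or_none(SAMPLE)        # JSON has no comment syntax, so this fails
as_literal = ast.literal_eval(SAMPLE)  # Python literal syntax accepts both
```

The same text that the json module rejects is a perfectly good Python
literal, which is the kind of slack a human editor tends to want.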

That's why we have options.

ChrisA

