easy eval() fix?

John Roth newsgroups at jhrothjr.com
Wed Oct 15 21:44:28 EDT 2003


"Geoff Gerrietts" <geoff at gerrietts.net> wrote in message
news:mailman.130.1066265600.2192.python-list at python.org...
> Quoting John Roth (newsgroups at jhrothjr.com):
> >
> > I don't know of a module that does this, but I'm not altogether
> > certain it wouldn't be possible to put something together that would
> > suit what you need in around the same time it took to write the
> > message.
>
> You might be surprised how quickly I type. ;)
>
> > What are the primitive types you need to convert from repr() string
> > format back to their object format?
>
> Literal statements.
>
> A list of integers:
>   [1, 2, 3, 4]
> A list of strings:
>   ['1', '2', '3', '4']
> A string/string dict:
>   {'a': 'b', 'c': 'd'}
>
> Imagine the variations; they are plentiful.
>
> On the other hand, anything that actually performs "operations" is not
> permissible.
>
> An error case, for example:
>   [10 ** (10 ** 10)]
>
> This should not, for instance, choke the box for a day evaluating the
> expression; it should (probably) throw an exception but any scenario
> that does not allow the code to chew CPU time is a win over eval().
>
> Also, eval and exec do all their work inside a namespace where names
> get resolved to bound objects etcetera. That's not desirable. Nor is
> it desirable to permit an object to be called.
>
> What I'm interested in -- what eval seems most used for, certainly in
> this project -- is a general-purpose tool for transforming a string
> containing a literal statement into the Python data structure.
>
> I toyed with using the parser module to do this. I still may try to do
> that, but I don't know enough about AST parse trees to understand why
> so many apparently unrelated symbols show up in the parse tree, and so
> I'm reluctant to start down this road without an ample budget of time
> to come to an understanding of such things.
>
> I don't have that ample budget of time in my project schedule, so I
> thought I would check to see if there was a quick fix available.

Are the strings allowed to contain commas? Are the structures
allowed to contain embedded structures? If neither of those is
true, it's relatively easy to crack the input and build a result.
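For that simple case, a minimal sketch might look like the following. The
names parse_flat_list and convert_item are made up for illustration, and it
assumes a flat list whose strings contain no commas, quotes, or escapes.

```python
def convert_item(item):
    """Turn one token into a str, int, or float (hypothetical helper)."""
    # A quoted token becomes a string with its quotes stripped.
    if len(item) >= 2 and item[0] == item[-1] and item[0] in "'\"":
        return item[1:-1]
    try:
        return int(item)
    except ValueError:
        # float() raises ValueError for anything that is not a number,
        # which rejects names, calls, and operators.
        return float(item)


def parse_flat_list(text):
    """Parse a flat list such as "[1, 2, 3]" without calling eval()."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a bracketed list")
    body = text[1:-1].strip()
    if not body:
        return []
    # Safe only because strings are assumed not to contain commas.
    return [convert_item(item.strip()) for item in body.split(",")]
```

Anything that is not a quoted string or a number, such as 10 ** 10, ends up
in float() and raises ValueError instead of being evaluated.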

If you have to handle strings with embedded commas (or colons)
and also recursive structures, you'll need a finite state machine.
It's not a particularly difficult one to handle since there are only
8 symbols ([, ], {, }, ,, :, ', ") that have to be handled. Everything
else is just a literal symbol that's either the string that comes out
of the FSM, or it can be fed into int() or float().
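As a rough sketch of that idea, here is a small hand-written scanner that
uses recursion to handle nesting. LiteralParser and parse_literal are
hypothetical names, and the sketch assumes strings contain no backslash
escapes and no embedded quotes of their own kind.

```python
class LiteralParser:
    """Hypothetical helper: parses nested lists, dicts, strings, numbers."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # Skip whitespace, then return the next character unconsumed.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(self.text):
            raise ValueError("unexpected end of input")
        return self.text[self.pos]

    def expect(self, ch):
        if self.peek() != ch:
            raise ValueError("expected %r at offset %d" % (ch, self.pos))
        self.pos += 1

    def value(self):
        ch = self.peek()
        if ch == "[":
            return self.sequence()
        if ch == "{":
            return self.mapping()
        if ch in "'\"":
            return self.string(ch)
        return self.number()

    def sequence(self):
        self.expect("[")
        items = []
        while self.peek() != "]":
            items.append(self.value())
            if self.peek() != "]":
                self.expect(",")
        self.pos += 1  # consume "]"
        return items

    def mapping(self):
        self.expect("{")
        result = {}
        while self.peek() != "}":
            key = self.value()
            if isinstance(key, (list, dict)):
                raise ValueError("unhashable dict key")
            self.expect(":")
            result[key] = self.value()
            if self.peek() != "}":
                self.expect(",")
        self.pos += 1  # consume "}"
        return result

    def string(self, quote):
        # Assumption: no escapes, so the string ends at the next quote.
        end = self.text.find(quote, self.pos + 1)
        if end < 0:
            raise ValueError("unterminated string")
        result = self.text[self.pos + 1:end]
        if "\\" in result:
            raise ValueError("backslash escapes are not supported")
        self.pos = end + 1
        return result

    def number(self):
        # Read up to the next delimiter or whitespace, then convert.
        start = self.pos
        while (self.pos < len(self.text)
               and self.text[self.pos] not in ",:]}"
               and not self.text[self.pos].isspace()):
            self.pos += 1
        token = self.text[start:self.pos]
        try:
            return int(token)
        except ValueError:
            return float(token)  # ValueError for names, operators, etc.


def parse_literal(text):
    parser = LiteralParser(text)
    result = parser.value()
    if parser.text[parser.pos:].strip():
        raise ValueError("trailing input at offset %d" % parser.pos)
    return result
```

Since nothing is ever evaluated, an input like [10 ** (10 ** 10)] fails
with ValueError as soon as the scanner meets the first operator.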

It might even be easier to verify that the input string contains
only those special characters, plus string and numeric literals,
and then feed it into eval() the way you're doing now. That would
satisfy the security concern by verifying up front that the input
can't cause any harm.
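One way to sketch that check is with the standard tokenize module: allow
only number tokens, plain string tokens, and the six punctuation characters,
then evaluate. The name checked_eval is made up for illustration. Treat it
as a sketch rather than a guarantee: it rejects negative numbers, and it
still lets harmless subscripts such as [1, 2][0] through.

```python
import io
import tokenize

ALLOWED_OPS = set("[]{},:")
STRUCTURAL = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER)


def checked_eval(text):
    """Evaluate text only if it consists of literals and brackets."""
    text = text.strip()
    try:
        tokens = list(tokenize.generate_tokens(io.StringIO(text).readline))
    except (tokenize.TokenError, SyntaxError) as exc:
        raise ValueError("cannot tokenize input: %s" % exc)
    for tok in tokens:
        if tok.type in STRUCTURAL or tok.type == tokenize.NUMBER:
            continue
        # Plain quoted strings only: a prefix such as f'' could hide code.
        if tok.type == tokenize.STRING and tok.string[0] in "'\"":
            continue
        if tok.type == tokenize.OP and tok.string in ALLOWED_OPS:
            continue
        # Names, operators, parentheses, and comments all land here.
        raise ValueError("disallowed token %r" % tok.string)
    try:
        # No names survive the check, so an empty namespace is enough.
        return eval(text, {"__builtins__": {}}, {})
    except Exception as exc:
        raise ValueError("not a valid literal: %s" % exc)
```

Because names, parentheses, and arithmetic operators never reach eval(),
nothing can be called and [10 ** (10 ** 10)] is refused before any
computation starts.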

John Roth

>
> Thanks,
> --G.





