Converting a hex string to a number

Wed Jul 10 03:11:51 EDT 2002

Matt Gerrans wrote:

>> Using exec or eval without explicit dictionaries _is_ dangerous, if
>> you can't absolutely trust the data.
> 
> Hey Gerhard, can you elaborate a bit on explicit dictionaries or refer me
> to some reference material on this?   (I did some searching on Python.org,
> but only found a few offhand references).

It's not very hard, and probably doesn't deserve much more than
offhand references.  If you just call eval or use the exec statement
with a string or code object argument, e.g.:

    exec somestring

whatever statements somestring includes will trample over your
local and global namespaces.  That's invariably a bad idea, with
sundry downsides even apart from security -- for example, any
function including this statement will slow down badly, because
the compiler knows it doesn't know which variables are local
(the statements in somestring could change things...).  Debugging
becomes near-impossible since anything that happens after the
exec is a mystery -- you can't count on ANY variable or function
name meaning what you think it means any more.  Etc, etc.

The solution is very very simple:

    fakelocals = {}
    exec something in fakelocals

that's it!  Now the statements in 'something' affect dictionary
fakelocals instead of your real locals, there is no huge slowdown,
etc, etc.  This isn't secure, but it's still a darn sight better
than exec without an explicit dictionary (I think Python would be
a better language if exec had a MANDATORY 'in' clause).  Anyway,
whatever names 'something' binds you'll find as items in the
fakelocals dictionary after the exec statement.
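
A minimal sketch of the same idea, assuming Python 3, where exec is
a function and the dictionary is passed as its second argument
rather than after 'in'.  The string 'something' and the names it
binds are made up for illustration:

```python
# Assumption: Python 3 spelling, exec(code, namespace), in place of the
# Python 2 statement form "exec code in namespace".
# The contents of 'something' are invented for this example.
something = "a = 6\nb = 7\nproduct = a * b"

fakelocals = {}
exec(something, fakelocals)

# Whatever the string bound is now an item in the dictionary...
print(fakelocals['product'])

# ...and our own namespace was left alone.
print('product' in globals())
```

The names bound by the string are reachable only through
fakelocals, so nothing after the exec changes meaning behind your
back.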

For relative security, see the rexec module in the library docs
(there's also a howto about it).  It's not solid as a rock, but
still better than what you could kludge up yourself.  You build
up a sandbox and run untrusted code in it -- untrusted code is
subject to a few key limitations on introspection / metaprogramming,
plus any you want to place on what modules it can use, what
entries from module sys, what it can do regarding I/O, etc, etc.

You normally use a try/except statement around rexec uses, since
all attempts at security violation are diagnosed by exceptions
and you normally do want to catch those exceptions and do
something about them.

Things aren't all that different regarding the eval builtin
function rather than the exec statement:

    result = eval(something, fakelocals)

There are somewhat fewer issues with eval than with exec, but it
doesn't take much to bypass the "can only do expressions"
limit, alas.  E.g., the string something could easily rebind any
of your variables, even though it's an expression...:

>>> eval('[x for x in (23,)]')
[23]
>>> x
23

So, it's still better to do:

>>> fakelocals = {}
>>> eval('[x for x in (23,)]', fakelocals)
[23]
>>> fakelocals['x']
23

If you look at fakelocals at this point you'll notice Python
has also inserted in it a key '__builtins__' and under it the
dict of the built-ins module -- to get restricted execution
you have to put a controlled version of built-ins in said
dictionary.  But use rexec instead, it's easier and better.
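
To make the '__builtins__' point concrete, here is a rough sketch,
assuming Python 3 syntax.  The dictionary name 'controlled' and the
choice of which built-ins to allow are hypothetical, and this on its
own should not be taken as a real sandbox:

```python
# Assumption: Python 3.  With no '__builtins__' key in the dictionary,
# eval inserts one for us.
fakelocals = {}
eval('1 + 1', fakelocals)
print(sorted(fakelocals))

# A controlled version of built-ins: only the names we choose to expose.
# 'controlled' and its contents are made up for illustration.
controlled = {'__builtins__': {'abs': abs, 'len': len}}
allowed = eval('len("spam") + abs(-2)', controlled)
print(allowed)

# Anything not in the controlled dict is simply an unknown name.
try:
    eval('open("somefile")', controlled)
    blocked = None
except NameError as err:
    blocked = str(err)
print(blocked)
```

This only limits which built-in names the expression can look up
directly; it does nothing about introspection, which is part of
why a purpose-built facility is the better choice.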

Both with exec and with eval you may also choose to pass TWO
dictionaries -- then, the first is used as fake globals
(including the builtins stuff), the second as fake locals
(for rebinding names, etc).
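
A short sketch of the two-dictionary form, again assuming Python 3
spelling (exec and eval as functions); the names 'rate' and 'total'
are invented for the example:

```python
# Assumption: Python 3, where both exec and eval accept
# (code, globals, locals) as positional arguments.
fakeglobals = {'rate': 3}   # read as globals; also receives '__builtins__'
fakelocals = {}             # receives any names the code binds

exec("total = rate * 14", fakeglobals, fakelocals)
result = eval("total + rate", fakeglobals, fakelocals)

print(fakelocals)                   # the rebinding landed here
print('total' in fakeglobals)       # not in the fake globals
print(result)
```

Keeping the two apart makes it easy to see afterwards exactly what
the code bound, without wading through the builtins entry.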

Alex