[pypy-dev] Re: Mixed modules for both PyPy and CPython
VanL
"news-8a9e0fd91190ca" at northportal.net
Sun Apr 16 01:38:30 CEST 2006
Hello,
holger krekel wrote:
>> Second, comments on py3k list indicated that secure python is difficult
>> because of a) introspection, b) type inference, and c) GIL acquisition.
>
> Hum, this list looks a bit weird to me. Could you state what
> the actual attacks are for which security measures are discussed?
> Or which use cases are people on py3k having in mind?
This is an amalgam of several different posts (and maybe different
threads) but here goes:
In the thread "Will we have a true restricted exec environment for
python 3000," Vineet Jain asked for a restricted mode which would
"1. Limit the memory consumed by the script
2. Limit access to file system and other system resources
3. Limit cpu time that the script will take
4. Be able to specify which modules are available for import."
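To make item 4 concrete, here is a minimal sketch (not from the thread) of an import whitelist built on a replacement `__import__`. The names `ALLOWED_MODULES`, `restricted_import`, and `run_restricted` are hypothetical, and this covers only module availability. It is not a sandbox, for the introspection reasons the replies quoted below describe.

```python
import builtins

# Hypothetical whitelist; a real policy would be chosen by the host application.
ALLOWED_MODULES = {"math", "string"}

def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Import hook that refuses any module outside ALLOWED_MODULES."""
    root = name.split(".")[0]
    # Relative imports (level != 0) are rejected outright to keep the sketch simple.
    if level != 0 or root not in ALLOWED_MODULES:
        raise ImportError(f"import of {name!r} is not permitted")
    return builtins.__import__(name, globals, locals, fromlist, level)

def run_restricted(source):
    """Run source with a reduced builtins namespace and return its globals."""
    safe_builtins = {"__import__": restricted_import, "len": len, "range": range}
    namespace = {"__builtins__": safe_builtins}
    exec(source, namespace)
    return namespace
```

Under these assumptions, `run_restricted("import math")` succeeds while `run_restricted("import os")` raises `ImportError`. Limits on memory, CPU time, and file access (items 1 to 3) need operating-system or interpreter support and are not addressed here.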
In responses to that request, various people commented on the
difficulties of implementing such a restricted mode. On that thread,
several people had the same idea I had, to try to use PyPy for this
purpose. However, it did not look like many people were following both
lists (and thus reasonably familiar with PyPy's execution model).
A) Introspection
Nick Coghlan stated that:
"I'm interested, but I'm also aware of how much work it would be. I'm
disinclined to trust any mechanism which allows the untrusted code to
run in the same process, as the implications of being able to do:
    self.__class__.__mro__[-1].__subclasses__()
are somewhat staggering, and designing an in-process sandbox to cope
with that is a big ask (and demonstrating that the sandbox actually
*achieves* that goal is even tougher)."
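The walk Nick describes can be tried directly. The following is a small illustrative sketch (the class `Harmless` and the helper `reachable_types` are made up for this example). Note that in CPython the method is spelled `__subclasses__()`.

```python
class Harmless:
    """Stand-in for any object handed to untrusted code."""

def reachable_types(obj):
    """Walk from an instance to the root type and list its direct subclasses."""
    root = obj.__class__.__mro__[-1]   # the last MRO entry is always `object`
    return root.__subclasses__()       # every class deriving directly from object

# Starting from an ordinary integer is enough to reach unrelated classes.
names = {cls.__name__ for cls in reachable_types(1)}
```

Here `names` contains `"Harmless"` even though the starting point was only the integer `1`, which is the sense in which any object leads back to everything else loaded in the process.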
Vineet proposed starting a "light" Python subinterpreter, which would
be controlled by the main interpreter.
Nick countered, "But will it allow you to use numbers or strings?
If yes, then you can get to object(), and hence to pretty much whatever
C builtins you want. So it's not enough to try to hide dangerous builtins
like file(), you want to remove them from the light version entirely
(routing all file system and network access requests through the main
application). But if the file objects are gone, what happens to the
Python machinery that relies on them (like import)?
Python's powerful introspection is a severe drawback from a security POV
- it is *really* hard to make a user stay in a box you put them in
without crippling some part of the language as a side effect."
Thus, in CPython, giving code access to one C type effectively opens up
all the C types. In PyPy, however, each type is effectively in its own
box. Further, PyPy already has a structure that can deal with these
sorts of accesses: the flowgraph. Operations in PyPy come about through
traversals of the graph, so certain branches of the graph could be
restricted or proxied out to a trusted interpreter.
B) GIL Acquisition
Another person suggested using the multiple-subinterpreter code that
already exists in CPython to create a restricted-exec interpreter.
MvL noted that GIL acquisition makes that difficult:
"Part of the problem is that it doesn't really work. Some objects *are*
shared across interpreters, such as global objects in extension modules
(extension modules are initialized only once). I believe that the GIL
management code (for acquiring the GIL out of nowhere) breaks if there
are multiple interpreters."
C) Type inference
I tried to find the thread for this one - it's not from the Py3K list -
but I recall someone attempting, a couple of years ago, to make an rexec
version of Python. One of the comments that I recall from that
discussion had to do with understanding what types were being
manipulated. I believe there was an example somewhat like
    # operator.add is trusted
    class A:
        def __add__(self, other):
            ...  # something evil here

    a, b = A(), 1
    a + b  # something evil happens
However, this is a foggy memory that I have so far been unable to
substantiate.
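One way to picture the concern, and a possible guard against it, is the sketch below. It is an illustration only, not a reconstruction of the original rexec discussion. The names `SAFE_TYPES` and `checked_add` are hypothetical, and the `called` flag merely stands in for "something evil".

```python
import operator

# Hypothetical whitelist of exact built-in types considered safe to add.
SAFE_TYPES = (int, float, str)

def checked_add(a, b):
    """Call operator.add only when both operands are exactly whitelisted types."""
    # type(x) is compared exactly, so subclasses that override __add__ are refused.
    if type(a) not in SAFE_TYPES or type(b) not in SAFE_TYPES:
        raise TypeError("operand type not permitted")
    return operator.add(a, b)

class A:
    def __init__(self):
        self.called = False

    def __add__(self, other):
        self.called = True   # stands in for "something evil"
        return other
```

Calling `operator.add(A(), 1)` runs the user-defined `__add__`, so the "trusted" function executes untrusted code, whereas `checked_add(A(), 1)` raises `TypeError` before `__add__` is reached. The cost is that the check has to know the operand types at run time, which is presumably where type inference entered the original discussion.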
Thanks,
VanL