Sandboxed Python: memory limits?

Chris Angelico rosuav at gmail.com
Thu Apr 7 14:59:59 EDT 2011


On Fri, Apr 8, 2011 at 4:36 AM, David Bolen <db3l.net at gmail.com> wrote:
> Just wondering, but rather than spending the energy to cap Python's
> allocations internally, could similar effort instead be directed at
> separating the "other things" the same process is doing?  How tightly
> coupled is it?  If you could split off just the piece you need to
> limit into its own process, then you get all the OS tools at your
> disposal to restrict the resources of that process.
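
For reference, that OS-tools approach would look roughly like this on
POSIX - a minimal sketch, with the worker script name and the 512MB cap
invented for illustration:

import resource
import subprocess

def limit_memory():
    # Runs in the child between fork and exec: cap its address space
    # at 512MB so a runaway allocation fails in the child instead of
    # taking the whole box with it. POSIX only.
    cap = 512 * 1024 * 1024
    resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

# Run the memory-hungry piece as its own process, so the OS (not the
# interpreter) enforces the limit.
proc = subprocess.Popen(
    ["python", "generate_data.py"],  # hypothetical worker script
    stdout=subprocess.PIPE,
    preexec_fn=limit_memory,
)
data, _ = proc.communicate()

Since limit_memory runs only in the child, the cap applies to the
worker alone and the parent process is unaffected.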

Well, what happens is roughly this:

1. The process begins a lengthy operation.
2. Python is called upon to generate data for use in that operation.
3. The C side collects the data Python generated, reformats it, and
stores it in a database (on another machine).
4. C then proceeds to use the data, manipulating it further - lots of
processing that culminates in a second record going into the database.

The obvious way to split it would be to send the data to the database
twice, separately, as described above (the current code optimizes that
down to a single INSERT at the end, keeping everything in RAM until
then). That would work, but it seems like a fair amount of extra effort
(including extra load on our central database server) to achieve what
I'd have thought would be fairly simple.
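
Concretely, the split would turn that one buffered INSERT into two
round-trips, something like this rough Python sketch - the connection,
table, and column names are all hypothetical (the real code is C
driving an embedded Python):

def single_insert(conn, generate, process):
    # Current approach: hold everything in RAM, one INSERT at the end.
    raw = generate()          # Python-generated data
    final = process(raw)      # heavy follow-on processing
    cur = conn.cursor()
    cur.execute("INSERT INTO results (raw, final) VALUES (?, ?)",
                (raw, final))
    conn.commit()

def split_insert(conn, generate, process):
    # Split approach: persist the intermediate data first, then come
    # back with the final result - twice the traffic to the central
    # database server.
    raw = generate()
    cur = conn.cursor()
    cur.execute("INSERT INTO results (raw) VALUES (?)", (raw,))
    row_id = cur.lastrowid    # DB-API optional extension; driver-specific
    conn.commit()
    final = process(raw)
    cur.execute("UPDATE results SET final = ? WHERE id = ?",
                (final, row_id))
    conn.commit()

Hence the extra load: every job would hit the central server twice
instead of once.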

I think it's going to be simplest to use a hardware solution - throw
heaps of RAM at the boxes and then just let them do what they like. We
already have measures to ensure that one client's code can't "be evil"
repeatedly in a loop, so I'll just not worry too much about this
check. (The project's already well past its deadlines - mainly not my
fault! - and if I tell my boss "We'd have to tinker with Python's
internals to do this", he's going to put the kibosh on it in two
seconds flat.)

Thanks for the information, all!

Chris Angelico


