[Python-ideas] Memory limits [was Re: Membership of infinite iterators]

Nick Coghlan ncoghlan at gmail.com
Wed Oct 18 08:43:57 EDT 2017


On 18 October 2017 at 21:38, Steven D'Aprano <steve at pearwood.info> wrote:

> > But should it be fixed in list or in count?
>
> Neither. There are too many other places this can break for it to be
> effective to try to fix each one in place.
>

> e.g. set(xrange(2**64)), or tuple(itertools.repeat([1]))
>

A great many of these constructors call operator.length_hint() these days in
order to make a better guess as to how much memory to pre-allocate, so while a
check at that level still wouldn't intercept everything, it would catch a lot
of these cases.
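
For example (just a quick illustration - the exact hints depend on the
iterator type, and the 0 passed in is merely the default returned when no
hint is available):

    >>> import itertools, operator
    >>> operator.length_hint(range(10))              # len() is defined
    10
    >>> operator.length_hint(iter(range(10)))        # uses __length_hint__
    10
    >>> operator.length_hint(itertools.count(), 0)   # no hint available
    0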


> Rather, I think we should set a memory limit that applies to the whole
> process. Once you try to allocate more memory, you get an MemoryError
> exception rather than have the OS thrash forever trying to allocate a
> terabyte on a 4GB machine.
>
> (I don't actually understand why the OS can't fix this.)
>

Trying to allocate an enormous amount of memory all at once isn't the
problem, as that just fails outright with a MemoryError:

    >>> data = bytes(2**62)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    MemoryError

The machine-killing case is a stream of repeated allocation requests that the
operating system *can* satisfy, but only by paging almost everything else out
of RAM. And that's exactly what "list(infinite_iterator)" entails, since the
interpreter will make an initial guess as to the correct size, and then keep
resizing the allocation to roughly 125% of its previous size each time it
fills up (or so - I didn't check the current overallocation factor).
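
The over-allocation pattern can be observed from pure Python (a rough sketch;
the exact sizes and growth factor vary by CPython version and platform):

    import sys

    items, last = [], 0
    for n in range(1, 2000):
        items.append(None)
        size = sys.getsizeof(items)   # includes the over-allocated slots
        if size != last:              # a new, larger allocation was made
            print(n, size)            # the gaps between reallocations keep widening
            last = size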

Per-process memory quotas *can* help avoid this, but enforcing them
requires that every process run in a resource-controlled sandbox. Hence,
it's not a coincidence that mobile operating systems and container-based
server environments already work that way, and the improved ability to cope
with misbehaving applications is part of why desktop operating systems
would like to follow the lead of their mobile and server counterparts :)
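
(On POSIX systems you can already impose a quota like that from inside the
process via the resource module - a sketch, assuming RLIMIT_AS is available
and that an address-space cap is an acceptable stand-in for "memory used":)

    >>> import resource
    >>> soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    >>> resource.setrlimit(resource.RLIMIT_AS, (2 * 1024**3, hard))
    >>> data = bytes(4 * 1024**3)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    MemoryError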

> So here is my suggestion:
>
> 1. Let's add a function in sys to set the "maximum memory" available,
> for some definition of memory that makes the most sense on your
> platform. Ordinary Python programmers shouldn't have to try to decipher
> the ulimit interface.
>

Historically, one key reason we didn't do that was that the `PyMem_*`
APIs bypassed CPython's memory allocator, so such a limit wouldn't have
been particularly effective.

As of 3.6 though, even bulk memory allocations pass through pymalloc,
making a Python-level memory allocation limit potentially more viable
(since it would pick up almost all of the interpreter's own allocations,
even if it missed those made directly by extension modules):
https://docs.python.org/dev/whatsnew/3.6.html#optimizations
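
As a rough illustration of the kind of accounting such a limit would rely on
(not a proposal for a specific API, just tracemalloc's existing view of the
interpreter's allocations):

    import tracemalloc

    tracemalloc.start()
    data = [None] * 10**6
    # bytes currently allocated (and the peak so far) as seen by the traced allocators
    current, peak = tracemalloc.get_traced_memory()
    print("current: %.1f MiB, peak: %.1f MiB" % (current / 2**20, peak / 2**20))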

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia