GMPY: divm() memory leak revisited

mensanator at aol.com mensanator at aol.com
Sat Nov 5 00:06:58 EST 2005


mensanator at aol.com wrote:
> Since the following discussion took place (unresolved),
>
> <http://groups.google.com/group/comp.lang.python/browse_frm/thread/c3bd08ef3e4c478a/2b54deb522c9b9d9?lnk=st&q=divm()+memory+leak+group:comp.lang.python+author:mensanator&rnum=1&hl=en#2b54deb522c9b9d9>
>
> I've kept it in the back of my mind as I've been
> learning to use the base gmp library in c. Now I believe
> I know what the problem is.
>
> First, divm() is not a gmp function. It is a derived
> function created in gmpy. It is derived from the
> gmp invert() function (which I know from my testing
> does not leak memory). So it's a gmpy specific bug.
>
> Second, I learned from the gmp c library that temporary
> mpz objects must be freed to prevent memory leak. Aha!
> The gmpy source code is probably not freeing some temporary
> mpz it created.
>
> Third, we have the smoking gun:
>
>   divm(a,b,m): returns x such that b*x==a modulo m,
>                or else raises a ZeroDivisionError
>                exception if no such value x exists
>                (a, b and m must be mpz objects,
>                 or else get coerced to mpz)
>
> Of course, to "coerce" means to create temporary variables
> to pass to the gmp library. It would appear that these
> temporary variables are not being freed. Now if I'm right,
> then I can prove this by eliminating the need to coerce
> the operands by passing mpz's to the divm() function.
>
> #
> # if the parameters are already mpz's...
> #
> z = gmpy.mpz(81287570543)
> x = gmpy.mpz(8589934592)
> y = gmpy.mpz(3486784401)
> tot = 0
> while True:
>    n = input('How many more divm: ')
>    if n<=0: break
>    print '%d more...' % n,
> #
> #   ...then they won't need to be coerced
> #
>    for i in xrange(n): gmpy.divm(z,x,y)
>    tot += n
>    print '...total %d' % tot
>
>
> With coercing, I get
>
> C:\Python23\user\the_full_monty>python gmpytest.py
> How many more divm: 10000000
> 10000000 more...Fatal Python error: mp_allocate failure
> abnormal program termination
> peak Commit Charge (K): 792556
>
> Without needing to coerce, the test ran to completion with
> flat memory usage.
>
> Unfortunately, c is still somewhat greek to me, but even so,
> the problem appears obvious.
>
> static PyObject *
> Pygmpy_divm(PyObject *self, PyObject *args)
> {
>     PympzObject *num, *den, *mod, *res;
>     if(!PyArg_ParseTuple(args, "O&O&O&",
>         Pympz_convert_arg, &num,
>         Pympz_convert_arg, &den,
>         Pympz_convert_arg, &mod))
>     {
>         return last_try("divm", 3, 3, args);
>     }
>     if(!(res = Pympz_new()))
>         return NULL;
>     if(mpz_invert(res->z, den->z, mod->z)) { /* inverse exists */
>         mpz_mul(res->z, res->z, num->z);
>         mpz_mod(res->z, res->z, mod->z);
>         if(options.ZM_cb && mpz_sgn(res->z)==0) {
>             PyObject* result;
>             if(options.debug)
>                 fprintf(stderr, "calling %p from %s for %p %p %p %p\n",
>                     options.ZM_cb, "divm", res, num, den, mod);
>             result = PyObject_CallFunction(options.ZM_cb, "sOOOO",
> "divm",
>                                            res, num, den, mod);
>             if(result != Py_None) {
>                 Py_DECREF((PyObject*)res);
>                 return result;
>             }
>         }
>         return (PyObject*)res;
>     } else {
>         PyObject* result = 0;
>         if(options.ZD_cb) {
>             result = PyObject_CallFunction(options.ZD_cb,
>                 "sOOO", "divm", num, den, mod);
>         } else {
>             PyErr_SetString(PyExc_ZeroDivisionError, "not invertible");
>         }
>         Py_DECREF((PyObject*)res);
>         return result;
>     }
> }
>
> Note that 4 PympzObjects get created but only res gets passed to
> Py_DECREF (which seems to be the method by which it's freed, not
> 100% sure about this). But I notice that other functions that
> coerce variables call Py_DECREF on each of the coerced variables:
>
> static PyObject *
> Pygmpy_gcd(PyObject *self, PyObject *args)
> {
>     PympzObject *a, *b, *c;
>     TWO_ARG_CONVERTED("gcd", Pympz_convert_arg,&a,&b);
>     assert(Pympz_Check((PyObject*)a));
>     assert(Pympz_Check((PyObject*)b));
>     if(!(c = Pympz_new())) {
>         Py_DECREF((PyObject*)a); Py_DECREF((PyObject*)b);
>         return NULL;
>     }
>     mpz_gcd(c->z, a->z, b->z);
>     Py_DECREF((PyObject*)a); Py_DECREF((PyObject*)b);
>     return (PyObject*)c;
> }
>
> Here the PympzObject c is not freed because it is returned
> as the result of the function, but the coerced variables a and b
> are.
>
> So the fix may be to simply add
>
> Py_DECREF((PyObject*)num);
> Py_DECREF((PyObject*)den);
> Py_DECREF((PyObject*)mod);
>
> to divm(). (Assuming it's that simple, I could have overlooked
> something.)
>
> Unfortunately, I don't have any means of testing this theory.
>
> I did see the reference to
>
> <http://www.vrplumber.com/programming/mstoolkit>
>
> which I will be trying eventually, but in case I can't get that
> to work I wanted to have this posted in case someone else wants
> to take a crack at it.


I completely forgot to mention the fourth thing I discovered.

The linear congruence algorithm used in divm() is wrong. To wit:

>>> divm(6,12,14)

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in -toplevel-
    divm(6,12,14)
ZeroDivisionError: not invertible

Sure, (12,14) is not invertible, but that is not a requirement for
solving a linear congruence. All that's required is that gcd(12,14)
divides 6, which it obviously does. The divm() function cannot
handle the case when b and m are not coprime. A properly
written linear congruence algorithm will work around the fact
that (12,14) is not invertible. To wit:

>>> linear_congruence(12,14,6)
mpz(4)

The correct algorithm can't solve a problem that's not solvable,
but it shouldn't let non-invertability cause an exception. Here's
the correct algorithm which divm() ought to incorprorate.

from gmpy import *
def linear_congruence(x,y,z):
    #
    # xa == z (mod y)
    #
    g = gcd(x,y)
    d = divmod(z,g)
    if d[1]==0:
        #
        # gcd(x,y) divides z, solution exists
        #
        if g==1:
            #
            # x,y coprime, modular inverse exists
            #
            a = invert(x,y)*z % y
        else:
            #
            # x,y not coprime, no modular inverse
            # ...but wait, if we get here g divides z and also
            # by definition, x & y, so divide x,y,z by g to create
            # a new congruence with x,y now coprime and invert() valid
            #
            x = x/g
            y = y/g
            z = z/g
            a = invert(x,y)*z % y
    else:
        #
        # g doesn't divide z, no solution
        #
        a = -1
    return a




More information about the Python-list mailing list