[SciPy-User] return "full_output" or how to stop throwing away already calculated results

josef.pktd at gmail.com josef.pktd at gmail.com
Tue Mar 9 23:54:15 EST 2010


On Tue, Mar 9, 2010 at 10:37 PM, Rob Falck <robfalck at gmail.com> wrote:
> Returning an object would be my preference as well, it seems more
> pythonic.  Most optimizers should be able to return at a minimum
>
> result.x
> result.objfun
> result.iterations
> result.exit_status
>
> Some optimizers have other useful data to return, such as the Lagrange
> multipliers from slsqp.  Going down that path means probably breaking
> current implementations of the optimizers, but doing it right would be
> worth it, in my opinion.  We should also agree upon names for the
> common attributes.

If we create new functions, as David argued, then the old signature
could still be kept.
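
A minimal sketch of that layout, assuming a toy 1-D solver. The names
`_minimize_core`, `minimize`, `minimize_result` and `Result` are
hypothetical and are not existing scipy.optimize functions; the attribute
names follow Rob's list above. Both public functions call the same
private helper, so the old signature keeps working:

```python
class Result(object):
    """Plain attribute container (hypothetical; not a scipy class)."""
    def __init__(self, **kwds):
        self.__dict__.update(kwds)


def _minimize_core(func, x0, step=0.1, tol=1e-8, maxiter=1000):
    """Private helper: crude 1-D search that halves the step on failure.

    Stands in for a real optimizer; only the return pattern matters here.
    """
    x, fx = x0, func(x0)
    iterations = 0
    while step > tol and iterations < maxiter:
        iterations += 1
        for cand in (x + step, x - step):
            fc = func(cand)
            if fc < fx:
                x, fx = cand, fc
                break
        else:
            step *= 0.5  # neither neighbour improved: refine the step
    exit_status = 0 if step <= tol else 1
    return x, fx, iterations, exit_status


def minimize(func, x0):
    """Old-style signature: returns only the minimizer."""
    return _minimize_core(func, x0)[0]


def minimize_result(func, x0):
    """New function: always returns a Result instance."""
    x, fx, iterations, exit_status = _minimize_core(func, x0)
    return Result(x=x, objfun=fx, iterations=iterations,
                  exit_status=exit_status)


res = minimize_result(lambda x: (x - 2.0) ** 2, 0.0)
```

Existing callers of `minimize` see no change, while new code can read
`res.x`, `res.objfun`, `res.iterations` and `res.exit_status`.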

I was curious how much the call overhead increases. For an essentially
empty function it would be up to 60%. However, since this pattern is
mainly for functions that do heavier work, the overhead will not really
matter. It might add a little to my Monte Carlo or bootstrap runs, but
that should be negligible, and we gain whenever we don't have to redo
calculations.

The main reason I liked the full_output option over always returning a
result instance is that, when full_output is true, additional results
can be calculated that I don't want when I just need a fast, minimal
result.
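
One way to keep that benefit without changing the return type might look
like the sketch below. It assumes a plain least-squares fit; `FitResult`
and `fit_ols` are hypothetical names, not statsmodels or scipy API. The
function always returns the same kind of object, and the expensive
attributes stay None unless full_output is requested:

```python
import numpy as np


class FitResult(object):
    """Hypothetical result holder; expensive fields default to None."""
    def __init__(self, params):
        self.params = params
        self.resid = None
        self.cov_params = None


def fit_ols(y, X, full_output=False):
    """Least-squares fit; extras are computed only on request."""
    params = np.linalg.lstsq(X, y, rcond=None)[0]
    result = FitResult(params)
    if full_output:
        # Extra work the fast path skips: residuals and the
        # covariance matrix of the parameter estimates.
        result.resid = y - X.dot(params)
        nobs, nparams = X.shape
        sigma2 = result.resid.dot(result.resid) / (nobs - nparams)
        result.cov_params = sigma2 * np.linalg.inv(X.T.dot(X))
    return result


# Small exact example: y = 1 + 2*x
X = np.column_stack((np.ones(5), np.arange(5.0)))
y = 1.0 + 2.0 * np.arange(5.0)
fast = fit_ols(y, X)                     # minimal result
full = fit_ols(y, X, full_output=True)   # with the extra calculations
```

Inside a Monte Carlo or bootstrap loop the default call does only the
minimal work, and `full.cov_params` is available when it is needed.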

Josef

>
> On Tue, Mar 9, 2010 at 8:56 PM, David Warde-Farley <dwf at cs.toronto.edu> wrote:
>>
>> On 9-Mar-10, at 4:01 PM, josef.pktd at gmail.com wrote:
>>
>>> in statsmodels, I am switching more to the pattern when
>>> full_output=True (or similar keyword) for a function, then all
>>> intermediate results are returned, attached to a generic class
>>> instance ( or could be a struct in matlab or a bunch for Gael,...)
>>
>> David Cournapeau pointed out (in the thread "anyone to look at #1402?"
>> on the numpy-discussion list) that the wider Python community frowns
>> on the type of the returned value depending on a boolean flag
>> argument, preferring instead different function names that call some
>> common (private) helper function. It provided some validation for my
>> uncomfortable feelings about scipy.optimize methods that do this.
>>
>> I do think it's worth thinking about returning some sort of proxy
>> object that can have attributes set to either 'None' or the
>> appropriate values. It would certainly make code more readable (except
>> in the situation where you use argument unpacking, but even then it
>> isn't obvious to an outsider how crucial that full_output thing is).
>>
>> David
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>
>
>
> --
> - Rob Falck
>
-------------- next part --------------


import numpy as np

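class Store(object):
    """Empty container; results are attached as attributes."""
    pass

def function0():
    # Stand-in for the real work. Uncomment the lines below to time
    # heavier computations instead of an essentially empty function.
    y = 0
#    x = np.random.randn(20,20)
#    y = np.linalg.inv(x)
#    y = np.corrcoef(x)
    return y

def function1():
    # Always return a result instance.
    result = Store()
    result.a = np.arange(5)
    result.b = function0()
    return result

def function2():
    # Baseline: return a plain tuple.
    return np.arange(5), function0()

def function3():
    # Tuple-returning wrapper around the instance-returning function.
    result = function1()
    return result.a, result.b

def function4(full_output=True):
    # The return type depends on the full_output flag.
    if full_output:
        result = Store()
        result.a = np.arange(5)
        result.b = function0()
        return result
    else:
        return np.arange(5), function0()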

if __name__=='__main__':
    from timeit import Timer
    n = 10000
    # Ratios below are relative to function2, the plain-tuple baseline.
    t = Timer("r = function1()", "from __main__ import np,Store,function1")
    t1 = t.timeit(n)
    print('function1', t1)
    t = Timer("r = function2()", "from __main__ import np,Store,function2")
    t2 = t.timeit(n)
    print('function2', t2)
    t = Timer("r = function3()", "from __main__ import np,Store,function3")
    t3 = t.timeit(n)
    print('function3', t3, t3/t2)
    t = Timer("r = function1()", "from __main__ import np,Store,function1")
    t1 = t.timeit(n)  #repeated
    print('function1', t1, t1/t2)
    t = Timer("r = function4()", "from __main__ import np,Store,function4")
    t4a = t.timeit(n)
    print('function4a', t4a, t4a/t2)
    t = Timer("r = function4(full_output=0)",
              "from __main__ import np,Store,function4")
    t4b = t.timeit(n)
    print('function4b', t4b, t4b/t2)

