list comprehension return a list and sum over in loop

Fri Dec 12 04:45:08 EST 2014

On Friday, December 12, 2014 at 5:13:58 PM UTC+8, Peter Otten wrote:
> KK Sasa wrote:
> 
> > On Friday, December 12, 2014 at 3:17:43 PM UTC+8, Mark Lawrence wrote:
> >> On 12/12/2014 06:22, KK Sasa wrote:
> >> > Hi there,
> >> >
> >> > The list comprehension is results = [d2(t[k]) for k in xrange(1000)],
> >> > where d2 is a function returning a list, for example
> >> > [x1,x2,x3,x4]. So "results" is a list of 1000 lists, each of
> >> > length four. What I want is the element-wise sum of the 1000 lists,
> >> > so the result is a single list of length four. Is there an efficient
> >> > way to do this? It is slow in my case. I tried sum(d2(t[k])
> >> > for k in xrange(1000)), but it raised an error: TypeError: unsupported
> >> > operand type(s) for +: 'int' and 'list'. Thanks.
> >> >
> >> 
> >> I think you need something like this
> >> http://stackoverflow.com/questions/19339/a-transpose-unzip-function-in-python-inverse-of-zip
> >> 
> >> I'll let you add the finishing touches if I'm correct :)
> >> 
> >> --
> >> My fellow Pythonistas, ask not what our language can do for you, ask
> >> what you can do for our language.
> >> 
> >> Mark Lawrence
> > 
> > Hi Mark and Yotam,
> >   Thanks for the kind replies. I think I didn't make my problem clear
> >   enough. The slow part is "[d2(t[k]) for k in xrange(1000)]". In
> >   addition, I don't need to construct a list of 1000 lists; my aim is
> >   only to get the sum of all "d2(t[k])". I wonder if there is a method
> >   to sum them up efficiently.
> 
> If that is slow the culprit is probably the d2() function. If so 
> 
> results = [0] * 4
> for k in xrange(1000):
>     for i, v in enumerate(d2(t[k])):
>         results[i] += v
> 
> won't help. Can you tell us what's inside d2()?

Thanks for the replies, Christian and Peter. Actually, d2() is the Hessian function of a simple function (derived using the "ad" package, http://pythonhosted.org//ad/). The package does not support NumPy arrays so far, and I am still not sure how to take a look inside d2(). Because I have to call d2() heavily in an iterative algorithm, efficiency is an issue in my case. The following is an example.

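from scipy import stats
from ad.admath import *  # AD-aware math functions such as exp
from ad import gh  # the gradient and hessian functions generator
# Assumption: seed() and sum() below are NumPy's. sum() is called with axis=0
# and on a plain scalar, both of which the builtin sum rejects. They are
# imported after the star import so that the names are not shadowed.
from numpy import sum
from numpy.random import seed
import time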
people = 1000
range_people = xrange(people)
dim0 = 2; mean0 = [0,0]; cov0 = [[1,0],[0,1]]
seed([1])
t = stats.multivariate_normal.rvs(mean0,cov0,people)
t = t.reshape(people,dim0)
t = t.tolist() # back to list
x = [0, 0, 1, 2]
point = 2
def p(x,t,point,z,obs):    
    d = x[0]
    tau = [0]+[x[1:point]] 
    a = x[point:len(x)]
    at = sum(i*j for i, j in zip(a, t))
    nu = [exp(z[k]*(at-d)-sum(tau[k])) for k in xrange(point)]
    de = sum(nu, axis=0)
    probability = [nu[k]/de for k in xrange(point)]
    return probability[obs]

d1, d2 = gh(p)    
tStart = time.time()
z = range(point)
re = [d2(x,t[k],2,z,1) for k in range_people]
tEnd = time.time()
print "It cost %f sec" % (tEnd - tStart)
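
For reference, here is a minimal sketch of the element-wise summation asked about at the start of the thread. It assumes d2() returns a flat list of four numbers, as in the original question; d2_stub and elementwise_sum are hypothetical names used only for illustration, and d2_stub stands in for the real Hessian function. The TypeError came from sum() starting at the integer 0 and then trying 0 + [x1,x2,x3,x4]. Passing [] as the start value would not help either, because + on lists concatenates rather than adds. The sketch keeps a single running total, so the list of 1000 lists is never built. It will not make d2() itself any faster.

```python
def d2_stub(t_k):
    """Hypothetical stand-in for d2(): returns a flat list of four numbers."""
    return [t_k, 2 * t_k, t_k * t_k, 1]


def elementwise_sum(rows):
    """Add up equal-length sequences position by position.

    rows may be a generator; only one row plus the running total is held
    in memory at any time.
    """
    it = iter(rows)
    try:
        total = list(next(it))  # copy the first row as the initial total
    except StopIteration:
        return []               # nothing to sum
    for row in it:
        for i, v in enumerate(row):
            total[i] += v
    return total


t = list(range(1000))
result = elementwise_sum(d2_stub(t_k) for t_k in t)
```

If the rows are already stored in a list, [sum(col) for col in zip(*rows)] gives the same totals, at the cost of keeping every row in memory.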