[Numpy-discussion] vectorizing

Fri Jun 5 15:53:17 EDT 2009

On Fri, Jun 5, 2009 at 2:07 PM, Brian Blais <bblais at bryant.edu> wrote:
> Hello,
> I have a vectorizing problem that I don't see an obvious way to solve.  What
> I have is a vector like:
> obs=array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2])
> and a matrix
> T=zeros((6,6))
> and what I want in T is a count of all of the transitions in obs, e.g.
> T[1,2]=3 because the sequence 1-2 happens 3 times,  T[3,4]=1 because the
> sequence 3-4 only happens once, etc...  I can do it unvectorized like:
> for o1,o2 in zip(obs[:-1],obs[1:]):
>     T[o1,o2]+=1
>
> which gives the correct answer from above, which is:
> array([[ 0.,  0.,  0.,  0.,  0.,  0.],
>        [ 0.,  0.,  3.,  0.,  0.,  1.],
>        [ 0.,  3.,  0.,  1.,  0.,  0.],
>        [ 0.,  0.,  2.,  0.,  1.,  0.],
>        [ 0.,  0.,  0.,  2.,  0.,  0.],
>        [ 0.,  0.,  0.,  0.,  1.,  0.]])
>
>
> but I thought there would be a better way.  I tried:
> o1=obs[:-1]
> o2=obs[1:]
> T[o1,o2]+=1
> but this doesn't give a count, it just yields 1's at the transition points,
> like:
> array([[ 0.,  0.,  0.,  0.,  0.,  0.],
>        [ 0.,  0.,  1.,  0.,  0.,  1.],
>        [ 0.,  1.,  0.,  1.,  0.,  0.],
>        [ 0.,  0.,  1.,  0.,  1.,  0.],
>        [ 0.,  0.,  0.,  1.,  0.,  0.],
>        [ 0.,  0.,  0.,  0.,  1.,  0.]])
>
> Is there a clever way to do this?  I could write a quick Cython solution,
> but I wanted to keep this as an all-numpy implementation if I can.
>
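
On why the quoted `T[o1,o2]+=1` attempt yields only 1's: in-place addition through a fancy index is buffered, so a repeated (o1, o2) pair is written once rather than accumulated. Newer NumPy releases (1.8 and later, which postdate this thread) provide `np.add.at` for unbuffered accumulation. A minimal sketch, assuming such a version is available; the names `T_buffered` and `T_counts` are only for illustration:

```python
import numpy as np

obs = np.array([1, 2, 3, 4, 3, 2, 1, 2, 1, 2, 1, 5, 4, 3, 2])
o1, o2 = obs[:-1], obs[1:]

# Buffered fancy-index update: each distinct (o1, o2) pair is incremented once,
# no matter how often it repeats.
T_buffered = np.zeros((6, 6))
T_buffered[o1, o2] += 1

# Unbuffered update: every occurrence of a pair adds 1, giving true counts.
T_counts = np.zeros((6, 6))
np.add.at(T_counts, (o1, o2), 1)
```

Here `T_counts` matches the loop result from the question, while `T_buffered` reproduces the all-ones pattern.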

You can use histogram2d or an imitation of it; there was a discussion of
histogram2d on the list a short time ago.

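>>> import numpy as np
>>> obs=np.array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2])
>>> obs2 = obs - 1
>>> # obs2[:-1]*6+6+obs2[1:] is 6*o1 + o2 - 1 for each transition o1 -> o2.
>>> # The leading 0 shifts the counts to flat index 6*o1 + o2; the trailing 0 pads
>>> # to 36 entries (this relies on the largest index being 33, from 5 -> 4).
>>> trans = np.hstack((0,np.bincount(obs2[:-1]*6+6+obs2[1:]),0)).reshape(6,6)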
>>> re = np.array([[ 0.,  0.,  0.,  0.,  0.,  0.],
...         [ 0.,  0.,  3.,  0.,  0.,  1.],
...         [ 0.,  3.,  0.,  1.,  0.,  0.],
...         [ 0.,  0.,  2.,  0.,  1.,  0.],
...         [ 0.,  0.,  0.,  2.,  0.,  0.],
...         [ 0.,  0.,  0.,  0.,  1.,  0.]])
>>> np.all(re == trans)
True

>>> trans
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 3, 0, 0, 1],
       [0, 3, 0, 1, 0, 0],
       [0, 0, 2, 0, 1, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0]])
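
The padding with `hstack` above depends on which transitions happen to occur in `obs`. A less fragile sketch of the same bincount idea, assuming a NumPy version where `np.bincount` accepts `minlength` (added after this thread was written) and that the number of states `n` is known in advance:

```python
import numpy as np

obs = np.array([1, 2, 3, 4, 3, 2, 1, 2, 1, 2, 1, 5, 4, 3, 2])
n = 6  # number of states; labels are assumed to be integers in 0..n-1

# Encode each (from, to) pair as a single index into a flattened n x n table.
flat = obs[:-1] * n + obs[1:]

# minlength pads the counts out to n*n entries, so the reshape always works
# even when the highest-numbered transitions never occur.
trans = np.bincount(flat, minlength=n * n).reshape(n, n)
```

The result is an integer array with `trans[i, j]` equal to the number of i -> j transitions.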


or, using histogram2d directly:

>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]])
>>> re
array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  3.,  0.,  0.,  1.],
       [ 0.,  3.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  2.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  2.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.]])
>>> h
array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  3.,  0.,  0.,  1.],
       [ 0.,  3.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  2.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  2.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.]])

>>> np.all(re == h)
True
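
A note on the bins: with `bins=6` and `range=[[0,5],[0,5]]` each bin is 5/6 wide, and the integer labels 0..5 happen to fall into separate bins (5 lands on the inclusive right edge). A variant that may be easier to reason about passes explicit edges at the half-integers, so each bin is centred on one label. This is a sketch under the assumption that the labels are integers in 0..n-1; `n` and `edges` are names introduced here for illustration:

```python
import numpy as np

obs = np.array([1, 2, 3, 4, 3, 2, 1, 2, 1, 2, 1, 5, 4, 3, 2])
n = 6  # number of states (assumed known)

# Edges at -0.5, 0.5, ..., n - 0.5: exactly one bin per integer label.
edges = np.arange(n + 1) - 0.5

# Rows index the "from" state, columns the "to" state.
h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=[edges, edges])
```

As in the session above, `h` is a float array; use `h.astype(int)` if integer counts are wanted.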