[Numpy-discussion] Run length encoding of an ndarray

Tue Oct 2 08:36:02 EDT 2007

I am trying to do a kind of run-length encoding of a 2D array along one 
axis. I have many arrays of values arranged along two axes, state and 
position; each one is a (180, 30000) uint8 array.

I would like to have a list of tuples like

(state, start_pos, end_pos, values)

where a stretch of cells gets its own tuple only if every cell in it holds 
the same value and the run is at least 10 cells long.

Is there a clever way to do this in NumPy? I was thinking of using 
itertools.groupby() but it would be nicer to have something faster.
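
A minimal sketch of one possible NumPy approach, under a few assumptions 
that are not stated above: runs are found along the position axis within 
each state row, end_pos is exclusive, and runs shorter than the threshold 
are simply dropped. The names long_runs and min_len are made up for 
illustration.

```python
import numpy as np

def long_runs(arr, min_len=10):
    """Return (state, start_pos, end_pos, value) for every run of identical
    values along axis 1 that is at least min_len cells long.

    Assumption: end_pos is exclusive, so row[start_pos:end_pos] is the run.
    """
    arr = np.asarray(arr)
    runs = []
    for state, row in enumerate(arr):
        if row.size == 0:
            continue
        # Positions where a cell differs from the cell before it.
        change = np.flatnonzero(row[1:] != row[:-1]) + 1
        starts = np.concatenate(([0], change))
        ends = np.concatenate((change, [row.size]))  # exclusive end of each run
        keep = (ends - starts) >= min_len
        for s, e in zip(starts[keep], ends[keep]):
            runs.append((state, int(s), int(e), int(row[s])))
    return runs

# Small made-up example: 2 states, 30 positions.
a = np.zeros((2, 30), dtype=np.uint8)
a[0, 5:17] = 7
result = long_runs(a, min_len=10)
```

The Python-level loop only visits the states (180 per array) and the runs 
that survive the length filter; the comparison between neighbouring cells 
is done on whole rows at once.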