[Numpy-discussion] Filling gaps

Thu Feb 12 21:19:32 EST 2009

On Thu, Feb 12, 2009 at 6:04 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Thu, Feb 12, 2009 at 5:52 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
>> On Thu, Feb 12, 2009 at 5:22 PM, A B <python6009 at gmail.com> wrote:
>>> Are there any routines to fill in the gaps in an array? The simplest
>>> would be by carrying the last known observation forward.
>>> 0,0,10,8,0,0,7,0
>>> 0,0,10,8,8,8,7,7
>>
>> Here's an obvious hack for 1d arrays:
>>
>> def fill_forward(x, miss=0):
>>    y = x.copy()
>>    for i in range(x.shape[0]):
>>        if y[i] == miss:
>>            y[i] = y[i-1]
>>    return y
>>
>> Seems to work:
>>
>>>> x
>>   array([ 0,  0, 10,  8,  0,  0,  7,  0])
>>>> fill_forward(x)
>>   array([ 0,  0, 10,  8,  8,  8,  7,  7])
>
> I guess that should be
>
>    for i in range(1, x.shape[0]):
>
> instead of
>
>    for i in range(x.shape[0]):
>
> to avoid replacing the first element of the array, if it is missing,
> with the last.
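
With that change folded in, the loop version would look roughly like the sketch below. The name fill_forward_fixed is made up here only to keep it apart from the original. A leading run of missing values has nothing to carry forward, so it is left as it is.

```python
import numpy as np

def fill_forward_fixed(x, miss=0):
    # Hypothetical name; same loop as fill_forward, but starting at index 1
    # so that y[0] is never overwritten with y[-1].
    y = x.copy()
    for i in range(1, y.shape[0]):
        if y[i] == miss:
            y[i] = y[i - 1]
    return y

x = np.array([0, 0, 10, 8, 0, 0, 7, 0])
print(fill_forward_fixed(x))  # [ 0  0 10  8  8  8  7  7]
```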

For large 1d x arrays, this might be faster:

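import numpy as np

def fill_forward2(x, miss=0):
    y = x.copy()
    # Each pass copies the left neighbor into every missing slot, so a run
    # of k missing values needs k passes.
    while np.any(y == miss):
        idx = np.where(y == miss)[0]
        y[idx] = y[idx-1]
    return y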

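But like the first version, it wraps around: if the first element is
missing, it gets filled from the last element (index 0 - 1 = -1). It
also never terminates if every element of x is missing.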
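
One way to avoid the wrap-around without any Python-level loop might be to carry forward the index of the last good value rather than the value itself. This is only a sketch, not something from earlier in the thread: fill_forward3 is a made-up name, and it assumes a 1d array where missing entries compare equal to miss.

```python
import numpy as np

def fill_forward3(x, miss=0):
    # Hypothetical name. For each position, find the index of the most
    # recent non-missing element, then gather from those indices.
    x = np.asarray(x)
    valid = x != miss
    # Own index where the value is valid, 0 where it is missing ...
    idx = np.where(valid, np.arange(x.shape[0]), 0)
    # ... then a running maximum carries the last valid index forward.
    idx = np.maximum.accumulate(idx)
    return x[idx]

x = np.array([0, 0, 10, 8, 0, 0, 7, 0])
print(fill_forward3(x))  # [ 0  0 10  8  8  8  7  7]
```

Leading missing values point back at index 0 and so stay missing, and an all-missing array comes back unchanged.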

We could speed it up by doing (y == miss) only once per loop. (But I
bet the np.where is the bottleneck.)
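
A rough sketch of the one-comparison-per-pass idea, under the assumption that the mask can just be recomputed at the end of each pass. The name fill_forward2_once is made up, and np.flatnonzero stands in for np.where(...)[0]. I have not timed it. Behavior is meant to match fill_forward2, wrap-around at index 0 included.

```python
import numpy as np

def fill_forward2_once(x, miss=0):
    # Hypothetical name; same algorithm as fill_forward2, but (y == miss)
    # is evaluated once per pass and reused for both the loop test and
    # the index lookup.
    y = x.copy()
    mask = y == miss
    while mask.any():
        idx = np.flatnonzero(mask)
        y[idx] = y[idx - 1]   # idx - 1 is -1 for idx == 0, so this wraps
        mask = y == miss
    return y
```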