[Tutor] Replacing a value in a list

Mats Wichmann mats at wichmann.us
Fri Aug 13 08:51:29 EDT 2021


On 8/13/21 1:14 AM, nzbz xx wrote:
> I came across this problem: Given that -999 is a missing value in a
> dataset, replace this value with the mean of its adjacent values. For
> example [1, 10, -999, 4, 5] should yield 7.
> 
> This is what i got so far. The problem is that i don't know what to do when
> there are consecutive -999 in the list e.g.  [1, 10, -999, -999, -999]

For the problem as stated, you probably want to raise an error. But 
since the problem statement doesn't say, there isn't really an "answer" 
we can give.  It's not actually a Python problem.
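
As a rough sketch of the "raise an error" option, assuming "adjacent"
means the immediate left and right neighbors: the name fill_missing and
the decision to reject a missing value at either end of the list are my
own assumptions, not part of the problem statement.

```python
MISSING = -999  # sentinel value taken from the problem statement


def fill_missing(values, missing=MISSING):
    """Return a copy of values with each sentinel replaced by the
    mean of its two immediate neighbors.

    Raises ValueError when a neighbor is absent (start or end of the
    list) or is itself missing, since the problem statement does not
    say what to do in those cases.
    """
    result = list(values)
    for i, value in enumerate(values):
        if value != missing:
            continue
        if i == 0 or i == len(values) - 1:
            raise ValueError(f"missing value at index {i} lacks a neighbor")
        # Read neighbors from the original list, not the partly filled copy.
        left, right = values[i - 1], values[i + 1]
        if left == missing or right == missing:
            raise ValueError(f"missing value at index {i} has a missing neighbor")
        result[i] = (left + right) / 2
    return result


print(fill_missing([1, 10, -999, 4, 5]))
```

With your second example, [1, 10, -999, -999, -999], this raises
ValueError instead of guessing at a value.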

This is data science... in the real world, collected data may have 
missing points, and what you do about missing points affects your 
conclusions, and the choices are almost certainly colored by an 
understanding of the data. I think that lecture (which I'm not 
qualified to give anyway) probably belongs in an entirely different 
forum.  One of the approaches is to "impute" a value for a missing 
element, and one way to do that is to use a mean of nearest neighbors as 
in your problem. Even when you've chosen that specific approach there 
are variations: the size of window used for nearest-neighbor may smooth 
too much if too large or too little if too small. If the neighbors are 
missing too, you would probably conclude that a nearest-neighbor mean 
was not a good choice for imputation.  Maybe there's a different 
algorithm to try? And just maybe, the dataset is of such low quality 
that you can't draw any reasonable conclusions from it.
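
To make the window-size point concrete, here is a hedged sketch of one
possible variant: average whatever non-missing values fall within
`window` positions on either side. The function name impute_window and
its parameters are hypothetical, and this is only one of many ways to
define a neighborhood.

```python
from statistics import mean

MISSING = -999  # sentinel value taken from the problem statement


def impute_window(values, window=1, missing=MISSING):
    """Return a copy of values with each sentinel replaced by the mean
    of the non-missing values within `window` positions of it.

    Raises ValueError if every value in the window is missing.
    """
    result = list(values)
    for i, value in enumerate(values):
        if value != missing:
            continue
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        # The slice includes index i itself, but that entry is the
        # sentinel, so the filter drops it along with other gaps.
        neighbors = [v for v in values[lo:hi] if v != missing]
        if not neighbors:
            raise ValueError(f"no usable neighbors for index {i}")
        result[i] = mean(neighbors)
    return result


data = [1, 10, -999, 4, 5]
print(impute_window(data, window=1))
print(impute_window(data, window=2))
```

On that list, window=1 fills the gap with 7 (the mean of 10 and 4),
while window=2 fills it with 5 (the mean of 1, 10, 4 and 5), so a wider
window pulls the imputed value toward the overall average.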



