Numpy outlier removal

Joseph L. Casale jcasale at activenetwerx.com
Sun Jan 6 14:44:08 EST 2013


I have a dataset stored as a dict that maps text descriptions to integer values. When
required, I collect the values into a list, create a numpy array from it, and run it through a
simple routine: data[abs(data - mean(data)) < m * std(data)], where m is the number of standard
deviations to include.
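
For reference, the routine can be sketched roughly as below. Only the filtering expression comes from my actual code; the function name reject_outliers and the sample dict are made up for illustration.

```python
import numpy as np

def reject_outliers(data, m=2.0):
    """Keep only the values within m standard deviations of the mean."""
    data = np.asarray(data, dtype=float)
    return data[np.abs(data - np.mean(data)) < m * np.std(data)]

# Hypothetical dataset: text descriptions mapped to integer values.
dataset = {"a": 10, "b": 12, "c": 11, "d": 13, "e": 95}

# Only the values go into the array, so the keys are left behind here.
values = np.array(list(dataset.values()))
filtered = reject_outliers(values, m=1.5)
```

The filtered array carries no information about which keys its values came from.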


The problem is that I lose track of which values were removed, so the original display of the
dataset is misleading when the processed average is returned, as it still includes the removed
key/value pairs.


Anyone know how I can maintain the relationship so that when I exclude a value, it is also
removed from the dict?
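
One possible direction, offered only as a sketch: compute the boolean mask once and apply it to the keys as well as the values, assuming both are taken from the dict in the same order. The name split_outliers and the sample data are hypothetical.

```python
import numpy as np

def split_outliers(dataset, m=2.0):
    """Split a dict into (kept, removed) by distance from the mean."""
    keys = list(dataset)
    # Build the array from the key list so positions line up with keys.
    values = np.array([dataset[k] for k in keys], dtype=float)
    mask = np.abs(values - values.mean()) < m * values.std()
    kept = {k: dataset[k] for k, keep in zip(keys, mask) if keep}
    removed = {k: dataset[k] for k, keep in zip(keys, mask) if not keep}
    return kept, removed

# Hypothetical data: text descriptions mapped to integer values.
sample = {"a": 10, "b": 12, "c": 11, "d": 13, "e": 95}
kept, removed = split_outliers(sample, m=1.5)
```

Returning the removed entries as a second dict would let the display show exactly what was excluded from the average.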

Thanks!
jlc

