[scikit-learn] Supervised anomaly detection in time series

Amita Misra amisra2 at ucsc.edu
Thu Aug 4 19:48:54 EDT 2016


Sub-sampling would remove a lot of information from the negative class.
I have more than 500 samples of the negative class and just 5 samples of
the positive class.

Amita
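
One way to avoid discarding negatives is to keep every sample and reweight
the classes instead. The following is only a minimal sketch under stated
assumptions: the data is synthetic (500 negative and 5 positive windows with
4 made-up features), the classifier and its settings are illustrative, and
it assumes a scikit-learn release that ships sklearn.model_selection.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

# Synthetic stand-in for the windowed accelerometer features (assumption):
# 500 negative windows and 5 positive windows, 4 features each.
rng = np.random.RandomState(0)
X_neg = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
X_pos = rng.normal(loc=3.0, scale=1.0, size=(5, 4))
X = np.vstack([X_neg, X_pos])
y = np.concatenate([np.zeros(500, dtype=int), np.ones(5, dtype=int)])

# class_weight="balanced" weights errors inversely to class frequency,
# so no negative sample has to be thrown away.
clf = SVC(kernel="rbf", class_weight="balanced")

# Stratified folds spread the 5 positives across the 5 test folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
y_pred = cross_val_predict(clf, X, y, cv=cv)

# With so few positives, accuracy is uninformative; look at recall/precision.
recall = recall_score(y, y_pred)
precision = precision_score(y, y_pred)
print("recall on positives:", recall)
print("precision on positives:", precision)
```

With only 5 positives these estimates are very noisy, so they are better
read as a rough sanity check than as a reliable performance figure.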

On Thu, Aug 4, 2016 at 4:43 PM, Nicolas Goix <goix.nicolas at gmail.com> wrote:

> Hi,
>
> Yes, you can use your labeled data (you will need to sub-sample your normal
> class to get similar proportions of normal and abnormal samples) to learn
> your hyper-parameters through CV.
>
> You can also try supervised classification algorithms on sub-samples that
> are not too highly unbalanced.
>
> Nicolas
>
> On Thu, Aug 4, 2016 at 5:17 PM, Amita Misra <amisra2 at ucsc.edu> wrote:
>
>> Hi,
>>
>> I am currently exploring the problem of speed bump detection using
>> accelerometer time series data.
>> I have extracted some features, such as the mean and standard deviation,
>> within a time window.
>>
>> Since the dataset is highly skewed (I have just 5 positive samples for
>> every 300+ samples), I was looking into
>>
>> sklearn.svm.OneClassSVM
>> sklearn.covariance.EllipticEnvelope
>> sklearn.ensemble.IsolationForest
>>
>> but I am not sure how to use them.
>>
>> What I get from the docs is: separate out the positive examples and train
>> using only the negative examples,
>>
>> clf.fit(X_train)
>>
>> and then predict on the positive examples using
>>
>> clf.predict(X_test)
>>
>>
>> I am not sure what role the positive examples in my training dataset then
>> play, or how I can use them to improve my classifier so that I can
>> predict better on new samples.
>>
>>
>> Can we do something like cross-validation to learn the parameters, as in
>> normal binary SVM classification?
>>
>> Thanks,
>> Amita
>>
>> Amita Misra
>> Graduate Student Researcher
>> Natural Language and Dialogue Systems Lab
>> Baskin School of Engineering
>> University of California Santa Cruz
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>
>
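
The workflow discussed in the thread (fit a one-class model on negatives
only, then use the labeled positives to choose hyper-parameters through CV)
could be sketched roughly as follows. This is a sketch under assumptions, not
a recommended recipe: the data is synthetic, the helper cv_auc and the
candidate settings are hypothetical names and values chosen for illustration,
and a recent scikit-learn release is assumed. In scikit-learn, predict on
these estimators returns +1 for inliers and -1 for outliers, and
decision_function is larger for more normal points.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold
from sklearn.svm import OneClassSVM

# Synthetic stand-in data (assumption): 4 features per window.
rng = np.random.RandomState(0)
X_neg = rng.normal(0.0, 1.0, size=(500, 4))  # normal windows
X_pos = rng.normal(3.0, 1.0, size=(5, 4))    # speed-bump windows


def cv_auc(make_model, X_neg, X_pos, n_splits=5):
    """Fit on negative folds only; evaluate on held-out negatives + positives."""
    aucs = []
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, test_idx in kf.split(X_neg):
        # Positives are never seen during fitting.
        model = make_model().fit(X_neg[train_idx])
        X_eval = np.vstack([X_neg[test_idx], X_pos])
        y_eval = np.concatenate(
            [np.zeros(len(test_idx), dtype=int), np.ones(len(X_pos), dtype=int)]
        )
        # Negate so that higher values mean "more anomalous".
        scores = -np.ravel(model.decision_function(X_eval))
        aucs.append(roc_auc_score(y_eval, scores))
    return float(np.mean(aucs))


# Hypothetical candidate settings; the positives only decide between them.
candidates = {
    "OneClassSVM nu=0.01": lambda: OneClassSVM(kernel="rbf", gamma="scale", nu=0.01),
    "OneClassSVM nu=0.1": lambda: OneClassSVM(kernel="rbf", gamma="scale", nu=0.1),
    "IsolationForest": lambda: IsolationForest(n_estimators=100, random_state=0),
}
results = {name: cv_auc(make, X_neg, X_pos) for name, make in candidates.items()}
best = max(results, key=results.get)
for name, auc in results.items():
    print(name, round(auc, 3))
print("selected:", best)
```

Because the same 5 positives are reused in every fold, the selected setting
can easily overfit to them; a few positives held out entirely from model
selection would give a more honest final check.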


-- 
Amita Misra
Graduate Student Researcher
Natural Language and Dialogue Systems Lab
Baskin School of Engineering
University of California Santa Cruz

