[scikit-learn] Specify boosting percentage using Randomoversampling?

Guillaume Lemaître g.lemaitre58 at gmail.com
Tue Jan 10 13:05:49 EST 2017


I will first assume that RandomOverSampling refers to the imbalanced-learn API
(a scikit-learn-contrib project).

The parameter that you are looking for is the ratio parameter. By default,
ratio='auto', which balances the classes, as you described.

The ratio can also be given as a float: the desired number of samples in the
minority class divided by the number of samples in the majority class.

See the documentation for more info:
http://contrib.scikit-learn.org/imbalanced-learn/generated/imblearn.over_sampling.RandomOverSampler.html#imblearn.over_sampling.RandomOverSampler
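To connect this with the Weka-style percentage from your question, here is a
minimal sketch of the arithmetic. The helper name boost_percentage_to_ratio
and the majority class size of 100 are assumptions made up for illustration;
they are not part of imbalanced-learn. The commented-out lines at the end show
how the result would be handed to RandomOverSampler through the ratio
parameter described above.

```python
def boost_percentage_to_ratio(n_minority, n_majority, boost_percent):
    """Convert a Weka-style boost percentage into a minority/majority ratio.

    Hypothetical helper, not part of imbalanced-learn.
    """
    if n_minority <= 0 or n_majority <= 0:
        raise ValueError("class counts must be positive")
    if boost_percent < 0:
        raise ValueError("boost_percent must be non-negative")

    # A 100% boost doubles the minority class, a 50% boost multiplies it by 1.5.
    target_minority = n_minority * (1 + boost_percent / 100.0)
    ratio = target_minority / n_majority

    # Assumption of this sketch: the minority class should not outgrow the
    # majority class, so ratios above 1 are rejected.
    if ratio > 1:
        raise ValueError("boost would make the minority class the largest")
    return ratio


# Example from the question: 10 minority samples boosted by 100% become 20.
# The majority class size (100) is assumed for illustration.
ratio = boost_percentage_to_ratio(n_minority=10, n_majority=100, boost_percent=100)
print(ratio)

# Passing the value on (requires imbalanced-learn and your own X, y):
# from imblearn.over_sampling import RandomOverSampler
# ros = RandomOverSampler(ratio=ratio)
# X_res, y_res = ros.fit_sample(X, y)
```

The helper only does the bookkeeping; the resampling itself is still done by
RandomOverSampler, which draws the extra minority samples at random with
replacement.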

On 10 January 2017 at 18:36, Suranga Kasthurirathne <surangakas at gmail.com>
wrote:

>
> Hi all,
>
> I apologize - I've been looking for this answer all over the internet, and
> it could be that I'm not googling the right terms.
>
> For managing unbalanced datasets, Weka has SMOTE, and scikit has
> randomoversampling.
>
> In Weka, we can ask it to boost by a given percentage (say 100%) so an
> undersampled class with 10 values ends up with 20 values (100% increase)
> after boosting.
>
> In scikit-learn, I can't seem to find a way to do this. The
> RandomOverSampler boosts arbitrarily and seems to try to balance the two
> classes, which may not be realistic in some cases.
>
> Can anyone point me to how I can manage boosting percentage using scikit?
>
> --
> Best Regards,
> Suranga
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>


-- 
Guillaume Lemaitre
INRIA Saclay - Ile-de-France
Equipe PARIETAL
guillaume.lemaitre at inria.fr ---
https://glemaitre.github.io/
