[scikit-learn] Specify boosting percentage using Randomoversampling?
Guillaume Lemaître
g.lemaitre58 at gmail.com
Tue Jan 10 13:05:49 EST 2017
I will first assume that RandomOverSampling refers to RandomOverSampler
from the imbalanced-learn API (a scikit-learn-contrib project).
The parameter you are looking for is the ratio parameter. By default,
ratio='auto', which balances the classes, as you described.
The ratio can also be given as a float: the desired number of samples in
the minority class divided by the number of samples in the majority
class after resampling.
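
To relate this to the Weka-style percentage you mention, here is a rough
NumPy sketch of the arithmetic. It is not imbalanced-learn's implementation;
the helper names percentage_to_ratio and oversample_to_ratio, and the toy
data with 100 majority and 10 minority samples, are assumptions made only
for illustration. The computed value is the kind of float you would pass as
ratio. I believe later imbalanced-learn releases renamed this parameter to
sampling_strategy, so check the documentation of the version you have
installed.

```python
import numpy as np


def percentage_to_ratio(n_minority, n_majority, percent_increase):
    """Convert a percentage boost of the minority class into a
    minority/majority ratio (hypothetical helper, not a library function)."""
    target_minority = n_minority * (1 + percent_increase / 100.0)
    return target_minority / n_majority


def oversample_to_ratio(X, y, minority_label, ratio, seed=0):
    """Randomly duplicate minority samples (with replacement) until
    n_minority / n_majority is approximately `ratio`.
    Binary problems only; a sketch, not imbalanced-learn's code."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == minority_label)
    n_majority = y.size - minority_idx.size
    n_target = int(round(ratio * n_majority))
    # Never remove samples: only add duplicates when the target is larger.
    n_extra = max(n_target - minority_idx.size, 0)
    extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)
    keep = np.concatenate([np.arange(y.size), extra_idx])
    return X[keep], y[keep]


# Toy data (assumed): 100 majority samples (label 0), 10 minority (label 1).
X = np.arange(110).reshape(-1, 1)
y = np.array([0] * 100 + [1] * 10)

# A 100% boost takes the minority class from 10 to 20 samples: 20 / 100.
ratio = percentage_to_ratio(10, 100, percent_increase=100)
X_res, y_res = oversample_to_ratio(X, y, minority_label=1, ratio=ratio)
print(ratio, np.bincount(y_res))
```

With these numbers the majority class keeps its 100 samples and the minority
class grows from 10 to 20, which is the 100% increase from your example.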
Check there for more info:
http://contrib.scikit-learn.org/imbalanced-learn/generated/imblearn.over_sampling.RandomOverSampler.html#imblearn.over_sampling.RandomOverSampler
On 10 January 2017 at 18:36, Suranga Kasthurirathne <surangakas at gmail.com>
wrote:
>
> Hi all,
>
> I apologize - I've been looking for this answer all over the internet, and
> it could be that I'm not googling the right terms.
>
> For managing unbalanced datasets, Weka has SMOTE, and scikit has
> randomoversampling.
>
> In weka, we can ask it to boost by a given percentage (say 100%) so an
> undersampled class with 10 values ends up with 20 values (100% increase)
> after boosting.
>
> In scikit-learn, I can't seem to find a way to do this. The
> RandomOverSampler boosts arbitrarily and seems to try to balance the two
> classes, which may not be realistic in some cases.
>
> Can anyone point me to how I can manage boosting percentage using scikit?
>
> --
> Best Regards,
> Suranga
--
Guillaume Lemaitre
INRIA Saclay - Ile-de-France
Equipe PARIETAL
guillaume.lemaitre at inria.fr ---
https://glemaitre.github.io/