[scikit-learn] Text classification of large dataset
Ranjana Girish
ranjanagirish30 at gmail.com
Wed Dec 27 05:16:34 EST 2017
Hi all,
Thank you for your suggestions.
But I am still getting a *memory error* while doing feature selection:
    from sklearn import feature_selection

    fs = feature_selection.SelectPercentile(feature_selection.chi2, percentile=20)
    documenttermmatrix1 = fs.fit_transform(documenttermmatrix, y1)
*documenttermmatrix* has shape *(1594516, 232832)*.
The type of *documenttermmatrix* is *scipy CSR matrix*.
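For anyone who wants to try the same call without the full data, here is a scaled-down sketch. The matrix size, density, number of classes, and the names X_small, y_small, and X_reduced are made-up stand-ins, not the real data; only the SelectPercentile / chi2 call mirrors the snippet above. It also prints a rough size for the CSR input, which may help when reasoning about memory.

```python
import numpy as np
from scipy import sparse
from sklearn import feature_selection

# Hypothetical stand-in data, far smaller than the real (1594516, 232832) matrix.
rng = np.random.RandomState(0)
X_small = sparse.random(1000, 500, density=0.01, format="csr", random_state=rng)
y_small = rng.randint(0, 5, size=1000)  # 5 made-up class labels

# Approximate memory held by the three CSR arrays of the input.
csr_bytes = X_small.data.nbytes + X_small.indices.nbytes + X_small.indptr.nbytes
print("input:", X_small.shape, X_small.dtype, "approx bytes:", csr_bytes)

# Same selector as in the post: keep the top 20% of features by chi2 score.
fs = feature_selection.SelectPercentile(feature_selection.chi2, percentile=20)
X_reduced = fs.fit_transform(X_small, y_small)

# The output should stay sparse, with fewer columns than the input.
print("output:", type(X_reduced).__name__, X_reduced.shape)
```

If this small version runs fine, comparing the dtype and the number of stored values with those of the real matrix is one way to estimate how much memory the full run would need.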
Am I doing anything wrong?
Is there a better way to do feature selection on a matrix of this size?