[scikit-learn] HashingVectorizer slow in version 0.18

Olivier Grisel olivier.grisel at ensta.org
Tue Oct 11 08:02:47 EDT 2016


I cannot reproduce such a degradation on my machine:

(sklearn-0.17)ogrisel@is146148:~/code/scikit-learn$ python ~/tmp/bench_vectorizer.py
scikit-learn 0.17.1. Numpy 1.11.2. Python 3.5.0 x86_64
Vectorizing 20newsgroup 11314 documents
Vectorization completed in  4.033604383468628  seconds, resulting shape  (11314, 1048576)

(sklearn-0.18) ogrisel@is146148:~/code/scikit-learn$ python ~/tmp/bench_vectorizer.py
scikit-learn 0.18. Numpy 1.11.2. Python 3.5.0 x86_64
Vectorizing 20newsgroup 11314 documents
Vectorization completed in  4.990509510040283  seconds, resulting shape  (11314, 1048576)
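
For anyone who wants to run a comparable measurement, here is a minimal sketch of such a benchmark. It is not the actual bench_vectorizer.py used above, which is not included in this message. It assumes the "train" subset of 20 newsgroups (which has 11314 documents) and a HashingVectorizer with default parameters (whose default n_features is 2**20 = 1048576). The helper name `timed` is purely illustrative.

```python
import platform
import time


def timed(func, *args, **kwargs):
    """Call func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def main():
    # Imported lazily so the timing helper can be reused without scikit-learn.
    import numpy as np
    import sklearn
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import HashingVectorizer

    print("scikit-learn %s. Numpy %s. Python %s %s" % (
        sklearn.__version__, np.__version__,
        platform.python_version(), platform.machine()))

    # Assumption: the training subset; the data is downloaded on first use.
    docs = fetch_20newsgroups(subset="train").data
    print("Vectorizing 20newsgroup %d documents" % len(docs))

    # Assumption: default parameters. HashingVectorizer is stateless,
    # so transform() can be called without fitting first.
    vectorizer = HashingVectorizer()
    X, elapsed = timed(vectorizer.transform, docs)
    print("Vectorization completed in ", elapsed,
          " seconds, resulting shape ", X.shape)
```

Calling main() once in each environment (for example, one virtualenv per scikit-learn version) should give timings that can be compared directly. Note that a single run only gives a rough figure; repeating the call a few times and keeping the best time would reduce noise.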

Which operating system are you using?

Please feel free to open an issue on the tracker anyway.

-- 
Olivier

