[scikit-learn] HashingVectorizer slow in version 0.18

Gabriel Trautmann gabit7 at gmail.com
Tue Oct 11 08:19:24 EDT 2016


Thank you for your response. I have Windows 7 Enterprise 64-bit on an Intel
Xeon E5-2640 CPU, and I see the same problem on two similar machines.

python-3.5.2-amd64.exe - fresh installation

numpy-1.11.2+mkl-cp35-cp35m-win_amd64.whl  - from Christoph Gohlke
scipy-0.18.1-cp35-cp35m-win_amd64.whl
pip install scikit-learn

On the same Python installation, if I downgrade to version 0.17, it is much faster.

pip uninstall scikit-learn
pip install scikit-learn==0.17

I will open an issue after I test on more machines or if someone else can
reproduce the problem.
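
For anyone who wants to try reproducing this, a minimal timing sketch might
look like the following. It is not the bench_vectorizer.py script quoted
below, whose contents are not shown in this thread. The function names
time_transform and run_20newsgroups_benchmark are hypothetical, and the use
of a default HashingVectorizer on the 20 newsgroups training set is an
assumption based on the document count and output shape in the quoted output.

```python
import time


def time_transform(vectorizer, documents):
    """Return (elapsed_seconds, output_shape) for one transform() call."""
    start = time.perf_counter()
    matrix = vectorizer.transform(documents)
    elapsed = time.perf_counter() - start
    return elapsed, matrix.shape


def run_20newsgroups_benchmark():
    """Time HashingVectorizer on the 20 newsgroups training set.

    Imports are local so the timing helper above works without scikit-learn.
    """
    import sklearn
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import HashingVectorizer

    # Downloads the dataset on first use; later runs read the local cache.
    documents = fetch_20newsgroups(subset="train").data
    print("scikit-learn", sklearn.__version__)
    print("Vectorizing", len(documents), "documents")

    # Default settings (an assumption; the original script may differ).
    elapsed, shape = time_transform(HashingVectorizer(), documents)
    print("Vectorization completed in", elapsed, "seconds, resulting shape", shape)
```

Calling run_20newsgroups_benchmark() once under 0.17 and once under 0.18 in
the same Python installation should show whether the slowdown appears on a
given machine. The first call also pays for the dataset download, so only
later runs are comparable.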




On Tue, Oct 11, 2016 at 3:02 PM, Olivier Grisel <olivier.grisel at ensta.org>
wrote:

> I cannot reproduce such a degradation on my machine:
>
> (sklearn-0.17)ogrisel at is146148:~/code/scikit-learn$ python
> ~/tmp/bench_vectorizer.py
> scikit-learn 0.17.1. Numpy 1.11.2. Python 3.5.0 x86_64
> Vectorizing 20newsgroup 11314 documents
> Vectorization completed in  4.033604383468628  seconds, resulting
> shape  (11314, 1048576)
>
> (sklearn-0.18) ogrisel at is146148:~/code/scikit-learn$ python
> ~/tmp/bench_vectorizer.py
> scikit-learn 0.18. Numpy 1.11.2. Python 3.5.0 x86_64
> Vectorizing 20newsgroup 11314 documents
> Vectorization completed in  4.990509510040283  seconds, resulting
> shape  (11314, 1048576)
>
> Which operating system are you using?
>
> Please feel free to open an issue on the tracker anyway.
>
> --
> Olivier