trying to improve my knn algorithm

Peter Otten __peter__ at web.de
Wed Jul 1 16:57:29 EDT 2020


hunter.hammond.dev at gmail.com wrote:

> This is a knn algorithm for articles that I have gotten. Then determines
> which category it belongs to. I am not getting very good results :/

[snip too much code;)]

- Shouldn't the word frequency vectors be normalized? I don't see that in
  your code. Without that the length of the text may overshadow its contents.

- There are probably words that are completely irrelevant. Getting
  rid of these should improve the signal-to-noise ratio.
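
Here is a rough sketch of both ideas, not a drop-in fix for your code. The
names word_vector(), distance() and STOP_WORDS are made up for illustration,
the stop word list is a tiny placeholder, and I am assuming plain
whitespace tokenization and Euclidean distance, which may differ from what
you use. Punctuation is not stripped here.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny stop word list; a real one would be longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def word_vector(text, stop_words=STOP_WORDS):
    """Return a sparse word frequency vector scaled to unit length."""
    # Drop irrelevant words before counting.
    words = [w for w in text.lower().split() if w not in stop_words]
    counts = Counter(words)
    # Euclidean (L2) length of the raw count vector.
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    # Dividing by the length makes long and short texts comparable.
    return {w: c / norm for w, c in counts.items()}


def distance(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = a.keys() | b.keys()
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


short_text = word_vector("the cat sat")
long_text = word_vector("the cat sat " * 50)
# Same word proportions, very different lengths: the distance is close to zero.
print(distance(short_text, long_text))
```

With raw counts the two texts above would be far apart just because one is
fifty times longer; after normalization only the relative word frequencies
matter.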



