Sentiment analysis using sklearn

qrious mittra at juno.com
Sun Jan 28 01:59:00 EST 2018


On Saturday, January 27, 2018 at 5:21:15 PM UTC-8, Dan Stromberg wrote:
> On Sat, Jan 27, 2018 at 1:05 PM, qrious wrote:
> > I am trying to understand how scikit-learn works for sentiment analysis and came across this blog post:
> >
> > https://marcobonzanini.wordpress.com/2015/01/19/sentiment-analysis-with-python-and-scikit-learn
> >
> > The corresponding code is at this location:
> >
> > https://gist.github.com/bonzanini/c9248a239bbab0e0d42e
> >
> > My question: when predicting, why does curr_class on line 44 of the code need a classification (pos or neg) for the test data? After all, isn't that exactly what I am trying to predict? Without an initial value for curr_class, the program fails with a runtime error.
> 
> I'm a real neophyte when it comes to modern AI, but I believe the
> intent is to divide your inputs into "training data" and "test data"
> and "real world data".
> 
> So you create your models using training data including correct
> classifications as part of the input.
> 
> And you check how well your models are doing on inputs they haven't
> seen before with test data, which also is classified in advance, to
> verify how well things are working.
> 
> And then you use real world, as-yet-unclassified data in production,
> after you've selected your best model, to derive a classification from
> what your model has seen in the past.
> 
> So both the training data and test data need accurate labels in
> advance, but the real world data trusts the model to do pretty well
> without further labeling.

Dan, 

Thanks. I was thinking along the same lines: 'So both the training data and test data need accurate labels in advance'. That makes sense to me.

For this part: 'the real world data trusts the model to do pretty well without further labeling', my question is: how do I do this with sklearn library functions? Is there a code example that runs the trained model on actual, unlabeled data that needs a prediction?
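
For illustration, here is a minimal sketch of the three stages Dan describes: train on labeled data, check on labeled test data, then predict on unlabeled data. It is not the code from the blog post or the gist. The toy documents, the variable names, and the choice of a TfidfVectorizer plus LinearSVC pipeline are all assumptions made for the example; only the sklearn calls themselves (make_pipeline, fit, predict, accuracy_score) are real library functions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data, made up for this sketch.
# Training data: documents plus their known labels.
train_docs = [
    "a wonderful, moving film with great acting",
    "great story and a wonderful cast",
    "I loved every minute of it",
    "a terrible, boring waste of time",
    "awful plot and terrible acting",
    "I hated it, boring from start to finish",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# Test data: also labeled, but the labels are used only for scoring,
# never for fitting.
test_docs = ["wonderful acting, I loved it", "boring and awful"]
test_labels = ["pos", "neg"]

# The pipeline turns text into TF-IDF features and feeds them to a linear SVM.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_docs, train_labels)

# Evaluation: compare predictions against the known test labels.
test_accuracy = accuracy_score(test_labels, model.predict(test_docs))
print("accuracy on labeled test data:", test_accuracy)

# "Real world" data: no labels at all. predict() needs only the documents.
unlabeled_docs = ["a great and moving story", "terrible, I hated the plot"]
predictions = model.predict(unlabeled_docs)
for doc, label in zip(unlabeled_docs, predictions):
    print(label, "-", doc)
```

The point of the sketch is that predict() never takes labels. Labels for the test set are needed only by the scoring step (accuracy_score here), so for data that genuinely needs a prediction you skip the scoring and just call predict().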


