[TriZPUG] Python NLTK/data mining/machine learning project of public research data, anyone interested?

Nathan Rice nathan.alexander.rice at gmail.com
Thu Aug 16 22:56:08 CEST 2012


On Thu, Aug 16, 2012 at 4:13 PM, Jesse <jessebikman at gmail.com> wrote:
> I don't know how helpful I'd be, but I'd like to at least check out what
> you're doing. I just started programming in Python last month. When could
> this happen? Are you near Chapel Hill?

I work at UNC.  I could demonstrate some stuff at a hack night.  I'm
still in the planning stages for most of it.  I have the PubMed
extraction code pretty well nailed, and I have a solid outline for the
article disqualification: create a feature vector out of topic and
abstract bigrams, MeSH subject headings and journal, use an SVM
discriminator, and manually generate a ROC curve to determine the
cutoff score.  I'm still very up in the air regarding natural-language
extraction of things like sample size, significance, etc.  If you'd
like to learn more, I would of course be happy to go over my thoughts
on the matter and we can play around with some code.
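
For anyone curious, the disqualification outline above could be
sketched roughly like this.  It is a stdlib-only illustration, not the
actual project code: the function names are hypothetical, the SVM
training step is left out (the scores are assumed to come from some
classifier's decision function), and picking the cutoff by maximizing
TPR - FPR (Youden's J) is just one possible rule.

```python
from collections import Counter


def bigrams(text):
    """Count lowercase word bigrams in a piece of text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))


def article_features(topic, abstract, mesh_headings, journal):
    """Build a sparse feature dict for one article (hypothetical helper).

    Key prefixes keep features from different sources separate.
    """
    features = Counter()
    for pair, count in bigrams(topic).items():
        features["topic:" + " ".join(pair)] += count
    for pair, count in bigrams(abstract).items():
        features["abstract:" + " ".join(pair)] += count
    for heading in mesh_headings:
        features["mesh:" + heading] += 1
    features["journal:" + journal] += 1
    return features


def roc_points(scores, labels):
    """Return (threshold, fpr, tpr) tuples, highest threshold first.

    labels are 1 (keep) or 0 (disqualify); an article counts as a
    predicted positive when its score >= threshold.  Assumes both
    classes are present in labels.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, fp / neg, tp / pos))
    return points


def best_cutoff(scores, labels):
    """Pick the threshold that maximizes tpr - fpr (an assumed rule)."""
    return max(roc_points(scores, labels), key=lambda p: p[2] - p[1])[0]
```

The feature dicts would still need to be turned into whatever vector
format the SVM library expects, and the cutoff should be chosen on
held-out, hand-labeled articles rather than on the training set.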

Nathan

