[Tutor] i want to build my own arabic training corpus data and use the NLTK to deal with

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Fri Aug 12 11:32:12 CEST 2005



On Fri, 12 Aug 2005, enas khalil wrote:

> hi Danny, Thanks for this help It is now ok to tokenize my text but the
> next step i want is to use tagger class to tag my text with own tags.
> how can i start this?

Hi Enas,

I'd strongly recommend that you go through the NLTK tutorials: the
developers of NLTK have put a lot of effort into making an excellent set
of tutorials.  It would be a shame to waste their work.

    http://nltk.sourceforge.net/tutorial/index.html

The tutorial on Tagging there seems to cover exactly what you're asking
about: how to use the tagger classes with your own tags.
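
To give a rough feel for the idea before you read it, here is a minimal
sketch of a "unigram" tagger in plain Python.  This is not NLTK's API,
and the names train_unigram_tagger, tag, and the "UNK" default tag are
all made up for illustration; the tutorial shows the real classes.  It
assumes your training corpus is a list of sentences, each a list of
(word, tag) pairs that you have tagged by hand with your own tag set.
The tokens can be any Unicode strings, so Arabic words work the same way.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sentences):
    """Learn, for each word, the tag it was most often given by hand."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag_name in sentence:
            counts[word][tag_name] += 1
    # Keep only the single most frequent tag for each word.
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(tokens, model, default_tag="UNK"):
    """Tag each token; words never seen in training get default_tag."""
    return [(tok, model.get(tok, default_tag)) for tok in tokens]


# Hypothetical hand-tagged training data with a made-up tag set.
training = [
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
]

model = train_unigram_tagger(training)
result = tag(["the", "dog", "sleeps", "loudly"], model)
print(result)
```

A word the tagger has never seen ("loudly" here) just gets the default
tag, which is one reason a larger hand-tagged corpus matters.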


>  Also for any NLTK further help is there a specific mailing list i could
> go on

I think you're looking for the NLTK forums:

    http://sourceforge.net/forum/?group_id=30982

As a warning: again, read through the tutorials first before jumping in
there.  I suspect that many of the people who work with NLTK are
researchers; they may want to see that you've done your "homework" before
they answer your questions.  In general, the guidelines in:

    http://www.catb.org/~esr/faqs/smart-questions.html

will probably apply here.


Good luck to you.


