[Chennaipy] Chennaipy - Monday Module - 19 Dec 2022
selvi dct
selvi.dct at gmail.com
Mon Dec 19 05:19:58 EST 2022
Date: 19 Dec 2022
Module : nltk
Installation : pip install nltk
About:
Natural Language Toolkit (NLTK) is one of the leading Python platforms for
processing language data. It is a set of language processing libraries and
programs that provide a toolkit for:
- Classification
- Tokenization
- Stemming
- Tagging
- Parsing
- Semantic reasoning
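Stemming, one of the tasks listed above, can be tried without downloading any extra data. The following is a minimal sketch using NLTK's PorterStemmer; the word list is made up for illustration.

```python
from nltk.stem import PorterStemmer

# Hypothetical word list, chosen only for illustration.
words = ["running", "runs", "easily"]

stemmer = PorterStemmer()

# Map each word to its stem. Stems are not always dictionary words.
stems = {word: stemmer.stem(word) for word in words}
print(stems)
```

Here "running" and "runs" both reduce to the stem "run", which is what lets a search or a classifier treat them as the same term.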
Sample:
>>> import nltk
>>> sentence = """At eight o'clock on Thursday morning
... Arthur didn't feel very good."""
# Tokenization splits a text into smaller units called tokens
# (here, words and punctuation marks).
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
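Note that nltk.word_tokenize relies on the Punkt sentence tokenizer data, which has to be fetched separately with nltk.download() (the exact resource name depends on the NLTK version). As a sketch that avoids any download, the rule-based TreebankWordTokenizer can be called directly. It skips sentence splitting, so its results may differ from word_tokenize on text with several sentences.

```python
from nltk.tokenize import TreebankWordTokenizer

sentence = """At eight o'clock on Thursday morning
Arthur didn't feel very good."""

# Regex-based tokenizer; needs no downloaded models.
tokenizer = TreebankWordTokenizer()
tokens = tokenizer.tokenize(sentence)
print(tokens)
```

As in the sample above, the contraction "didn't" is split into "did" and "n't".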
# POS (part-of-speech) tagging labels each token with its part of speech,
# based on the word's definition and its context.
>>> tagged = nltk.pos_tag(tokens)
>>> tagged[0:6]
[('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
('Thursday', 'NNP'), ('morning', 'NN')]
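The tagged tokens can feed the parsing step mentioned earlier. Below is a minimal sketch of noun-phrase chunking with nltk.RegexpParser. The grammar is an assumption made up for illustration, and the tagged pairs are copied from the output above so that no tagger model is needed.

```python
import nltk

# First six (token, tag) pairs, copied from the tagging output above.
tagged = [("At", "IN"), ("eight", "CD"), ("o'clock", "JJ"),
          ("on", "IN"), ("Thursday", "NNP"), ("morning", "NN")]

# Illustrative grammar (an assumption, not part of the original post):
# a noun phrase is any number of adjectives followed by one or more nouns.
grammar = "NP: {<JJ>*<NN.*>+}"
chunker = nltk.RegexpParser(grammar)
tree = chunker.parse(tagged)

# Collect the words of every subtree labelled NP.
noun_phrases = [
    [token for token, tag in subtree.leaves()]
    for subtree in tree.subtrees()
    if subtree.label() == "NP"
]
print(noun_phrases)
```

With this grammar the only chunk found is "Thursday morning", since the adjective-tagged "o'clock" is not followed by a noun.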
Reference:
https://pypi.org/project/nltk/