[Chennaipy] Chennaipy - Monday Module - 19 Dec 2022

selvi dct selvi.dct at gmail.com
Mon Dec 19 05:19:58 EST 2022


Date: 19 Dec 2022


Module : nltk


Installation : pip install nltk


About:

Natural Language Toolkit (NLTK) is one of the leading Python platforms for
working with human language data. It is a suite of language processing
libraries and programs that provides tools for:

- Classification

- Tokenization

- Stemming

- Tagging

- Parsing

- Semantic reasoning
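
As a quick illustration of one item from this list, stemming reduces a word
to a root form by stripping suffixes. A minimal sketch using NLTK's
PorterStemmer, which needs no extra data download; the word list is made up
for illustration:

```python
from nltk.stem import PorterStemmer

# Porter stemming strips common English suffixes by rule.
stemmer = PorterStemmer()

# Hypothetical sample words, chosen only for illustration.
words = ["running", "cats", "connected", "connection"]
stems = [stemmer.stem(word) for word in words]

for word, stem in zip(words, stems):
    print(word, "->", stem)
```

A stem is not always a dictionary word; it is only a shared key that groups
related word forms together.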


Sample:

>>> import nltk
>>> sentence = """At eight o'clock on Thursday morning
... Arthur didn't feel very good."""


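# Tokenization splits a text into smaller units called tokens (here, words
# and punctuation marks). word_tokenize needs the Punkt data; if it is
# missing, run nltk.download('punkt') once ('punkt_tab' on recent NLTK
# releases).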

>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
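
Tokenizers differ in how they split contractions and punctuation. As a
contrast to the output above, here is a minimal sketch using
wordpunct_tokenize, a simpler regex-based tokenizer from nltk.tokenize that
needs no extra data download. The variable names are only illustrative:

```python
from nltk.tokenize import wordpunct_tokenize

text = "Arthur didn't feel very good."

# wordpunct_tokenize splits on runs of word characters and runs of
# punctuation, so the apostrophe in "didn't" becomes its own token.
simple_tokens = wordpunct_tokenize(text)
print(simple_tokens)
# ['Arthur', 'didn', "'", 't', 'feel', 'very', 'good', '.']
```

Here "didn't" is cut into three pieces, while word_tokenize above yields
'did' and "n't". Which behaviour is preferable depends on the task.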


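# POS (part-of-speech) tagging labels each token with its part of speech,
# based on the word's definition and its context. pos_tag needs tagger data;
# if it is missing, run nltk.download('averaged_perceptron_tagger') once
# ('averaged_perceptron_tagger_eng' on recent NLTK releases).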

>>> tagged = nltk.pos_tag(tokens)
>>> tagged[0:6]
[('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
('Thursday', 'NNP'), ('morning', 'NN')]
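
The tags follow the Penn Treebank convention: IN is a preposition, CD a
cardinal number, JJ an adjective, NNP a singular proper noun and NN a
singular common noun. Since the result is a plain list of (word, tag)
tuples, ordinary Python tools work on it. A minimal sketch, with the six
pairs shown above copied in as a literal so that it runs without NLTK data:

```python
from collections import Counter

# The first six (word, tag) pairs from the pos_tag output above.
tagged_sample = [
    ("At", "IN"), ("eight", "CD"), ("o'clock", "JJ"),
    ("on", "IN"), ("Thursday", "NNP"), ("morning", "NN"),
]

# Count how often each tag occurs.
tag_counts = Counter(tag for _, tag in tagged_sample)
print(tag_counts.most_common(1))   # [('IN', 2)]

# Keep only the nouns: Penn Treebank noun tags all start with "NN".
nouns = [word for word, tag in tagged_sample if tag.startswith("NN")]
print(nouns)                       # ['Thursday', 'morning']
```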



Reference:

https://pypi.org/project/nltk/