nltk related issue

Sharan Basappa sharan.basappa at gmail.com
Wed Jun 20 23:40:55 EDT 2018


Folks,

I am trying to run a simple example that uses NLTK.
It fails with a LookupError, and I don't know what is causing it.
Could someone please give me some guidance?

I am using the Enthought Canopy Python distribution.

The following is the code:

inputstring = ' This is an example sent. The sentence splitter will split on sent markers. Ohh really !!'
from nltk.tokenize import sent_tokenize

sentences = sent_tokenize(inputstring)

print sentences


The following is the error report:

LookupError                               Traceback (most recent call last)
D:\Projects\Initiatives\machine learning\programs\nltk_1.py in <module>()
      2 from nltk.tokenize import sent_tokenize
      3 
----> 4 sentences = sent_tokenize(inputstring)
      5 
      6 print sentences
D:\Users\sharanb\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\nltk\tokenize\__init__.pyc in sent_tokenize(text, language)
     94     :param language: the model name in the Punkt corpus
     95     """
---> 96     tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
     97     return tokenizer.tokenize(text)
     98 
D:\Users\sharanb\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\nltk\data.pyc in load(resource_url, format, cache, verbose, logic_parser, fstruct_reader, encoding)
    812 
    813     # Load the resource.
--> 814     opened_resource = _open(resource_url)
    815 
    816     if format == 'raw':
D:\Users\sharanb\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\nltk\data.pyc in _open(resource_url)
    930 
    931     if protocol is None or protocol.lower() == 'nltk':
--> 932         return find(path_, path + ['']).open()
    933     elif protocol.lower() == 'file':
    934         # urllib might not use mode='rb', so handle this one ourselves:
D:\Users\sharanb\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\nltk\data.pyc in find(resource_name, paths)
    651     sep = '*' * 70
    652     resource_not_found = '\n%s\n%s\n%s' % (sep, msg, sep)
--> 653     raise LookupError(resource_not_found)
    654 
    655 
LookupError: 
**********************************************************************
  Resource u'tokenizers/punkt/english.pickle' not found.  Please
  use the NLTK Downloader to obtain the resource:  >>>
  nltk.download()
  Searched in:
    - 'D:\\Users\\sharanb/nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - 'D:\\Users\\sharanb\\AppData\\Local\\Enthought\\Canopy\\edm\\envs\\User\\nltk_data'
    - 'D:\\Users\\sharanb\\AppData\\Local\\Enthought\\Canopy\\edm\\envs\\User\\lib\\nltk_data'
    - 'D:\\Users\\sharanb\\AppData\\Roaming\\nltk_data'
    - u''
********************************************************************** 
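
The message above says the Punkt sentence tokenizer model is not in any of the searched nltk_data directories and suggests fetching it with the NLTK Downloader. A minimal sketch of doing that only when the resource is missing might look like the following. The helper name ensure_resource and its find/download parameters are hypothetical and exist only for illustration. The package identifier 'punkt' is an assumption inferred from the resource path in the error, and newer NLTK releases may use a different package name.

```python
def ensure_resource(resource_path, package, find=None, download=None):
    """Download `package` only if `resource_path` cannot be found.

    Returns True if a download was triggered, False if the resource
    was already available. `find` and `download` default to
    nltk.data.find and nltk.download; they are parameters only so
    that the helper can be tried out with stand-in callables.
    """
    if find is None or download is None:
        # Imported lazily, so NLTK is only required when the defaults are used.
        import nltk
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # nltk.data.find raises LookupError when the resource is absent,
        # which is the same error shown in the traceback above.
        find(resource_path)
    except LookupError:
        download(package)
        return True
    return False
```

Calling ensure_resource('tokenizers/punkt', 'punkt') once before sent_tokenize(inputstring) should then let the original script run, provided the machine can reach the NLTK download server and one of the directories listed under "Searched in:" is writable.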


