TypeError: expected string or Unicode object, NoneType found

subhabangalore at gmail.com subhabangalore at gmail.com
Sat May 19 12:19:08 EDT 2018


I wrote the following small piece of code:

import nltk
from nltk.corpus.reader import TaggedCorpusReader
from nltk.tag import CRFTagger
def NE_TAGGER():
    reader = TaggedCorpusReader('/python27/', r'.*\.pos')
    f1=reader.fileids()
    print "The Files of Corpus are:",f1
    sents=reader.tagged_sents()
    ls=len(sents)
    print "Length of Corpus Is:",ls
    train_data=sents[:300]
    test_data=sents[301:350]
    ct = CRFTagger()
    crf_tagger=ct.train(train_data,'model.crf.tagger')

This code works fine.
Now if I increase the training data size to, say, 500 or 3000 sentences by setting train_data=sents[:500] or
train_data=sents[:3000], it gives me the following error:

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    NE_TAGGER()
  File "C:\Python27\HindiCRFNERTagger1.py", line 20, in NE_TAGGER
    crf_tagger=ct.train(train_data,'model.crf.tagger')
  File "C:\Python27\lib\site-packages\nltk\tag\crf.py", line 185, in train
    trainer.append(features,labels)
  File "pycrfsuite\_pycrfsuite.pyx", line 312, in pycrfsuite._pycrfsuite.BaseTrainer.append (pycrfsuite/_pycrfsuite.cpp:3800)
  File "stringsource", line 53, in vector.from_py.__pyx_convert_vector_from_py_std_3a__3a_string (pycrfsuite/_pycrfsuite.cpp:10738)
  File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string (pycrfsuite/_pycrfsuite.cpp:10633)
TypeError: expected string or Unicode object, NoneType found
>>> 
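
The traceback shows pycrfsuite failing to convert a list of strings because one element is None. One possible cause (an assumption on my part, the traceback does not confirm it) is that some token beyond the first 300 sentences has no tag: NLTK's tagged corpus reader returns a tag of None for a token that lacks the word/tag separator. Below is a minimal sketch for checking this. The helper name find_untagged and the sample data are hypothetical and only stand in for reader.tagged_sents().

```python
def find_untagged(tagged_sents):
    """Return (sentence_index, token_index, word) for every token
    whose word or tag is None."""
    bad = []
    for i, sent in enumerate(tagged_sents):
        for j, (word, tag) in enumerate(sent):
            if word is None or tag is None:
                bad.append((i, j, word))
    return bad


# Tiny hand-made sample standing in for reader.tagged_sents();
# the second sentence contains a token with no tag.
sample = [
    [("Ram", "PER"), ("came", "O")],
    [("Delhi", None), ("is", "O")],
]

problems = find_untagged(sample)

# Keep only sentences in which every token has both a word and a tag.
bad_sentences = {p[0] for p in problems}
clean = [s for i, s in enumerate(sample) if i not in bad_sentences]
```

In the function above this could be called as find_untagged(sents[:500]) before ct.train(...). If it reports any tokens, the corresponding lines in the .pos files would be the first place to look.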

I searched the web for solutions and found the following links:
https://stackoverflow.com/questions/14219038/python-multiprocessing-typeerror-expected-string-or-unicode-object-nonetype-f
or
https://github.com/kamakazikamikaze/easysnmp/issues/50

I also reloaded Python, but that did not help much.

I am using Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:22:17) [MSC v.1500 32 bit (Intel)] on win32

My OS is MS Windows 7.

Could anybody kindly suggest a resolution?


