Case tagging and python
chrispoliquin at gmail.com
Thu Jul 31 16:21:45 EDT 2008
I second the idea of just using the islower(), isupper(), and
istitle() methods.
So, you could have a function - let's call it checkCase() - that
returns a string with the tag you want...
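def checkCase(word):
    # Order matters: a one-letter word such as 'I' is both upper and title case.
    if word.islower():
        tag = 'nocap'
    elif word.isupper():
        tag = 'allcaps'
    elif word.istitle():
        tag = 'cap'
    else:
        # Mixed case ('iPhone') or no letters ('2008'); 'other' is an assumed tag name.
        tag = 'other'
    return tag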
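If you want to see what those three string methods actually return before relying on them, here is a quick sketch. The sample words are just my own picks for illustration, not anything from a real corpus.

```python
# Probe the three str methods on a few sample words (samples are illustrative).
samples = ['hello', 'HELLO', 'Hello', 'I', 'iPhone', "Don't", '2008']

# Map each word to the tuple (islower, isupper, istitle).
results = {w: (w.islower(), w.isupper(), w.istitle()) for w in samples}

for word in samples:
    print(word, results[word])

# Words for which all three checks are False need some fallback tag.
unmatched = [w for w in samples if not any(results[w])]
```

Note that 'I' passes both isupper() and istitle(), so the order of the checks decides its tag, and that words like 'iPhone' or '2008' pass none of the three.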
Then let's take an input file and pass every word through the
function...
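# 'path:to:file' is a placeholder; substitute your real input and output paths.
f = open('path:to:file', 'r')
corpus_text = f.read()
f.close()

tagged_corpus = ''
all_words = corpus_text.split()
for w in all_words:
    tagtext = checkCase(w)
    tagged_corpus = tagged_corpus + ' ' + w + '/' + tagtext

output_file = open('path:to:file', 'w')
output_file.write(tagged_corpus)
output_file.close()  # make sure the tagged text is flushed to disk
print('All Done!')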
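For a big corpus, adding to a string inside the loop can get slow, so one alternative is to build the word/tag pairs and join them once at the end. This is only a sketch: case_tag() and tag_text() are names I made up for illustration, case_tag() is a stand-in for the function above, and the 'other' tag is an assumption for words that match none of the three checks.

```python
def case_tag(word):
    # Hypothetical stand-in for checkCase(); 'other' is an assumed fallback tag.
    if word.islower():
        return 'nocap'
    if word.isupper():
        return 'allcaps'
    if word.istitle():
        return 'cap'
    return 'other'


def tag_text(text):
    # Build every 'word/tag' token first, then join them once with spaces.
    return ' '.join(w + '/' + case_tag(w) for w in text.split())


print(tag_text('The NASA probe landed'))
```

Unlike the loop version, this does not leave a leading space at the start of the output.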
Also, if you're doing natural language processing in Python, you
should get NLTK.