Case tagging and python

chrispoliquin at gmail.com
Thu Jul 31 16:21:45 EDT 2008


I second the idea of just using the islower(), isupper(), and
istitle() methods.
So, you could have a function - let's call it checkCase() - that
returns a string with the tag you want...

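def checkCase(word):
    # Return a tag describing the capitalization of word.
    if word.islower():
        tag = 'nocap'
    elif word.isupper():
        tag = 'allcaps'
    elif word.istitle():
        tag = 'cap'
    else:
        # Mixed case ('iPhone') or no letters ('42') matches none of the
        # tests above; 'other' is an assumed fallback so tag is always set.
        tag = 'other'

    return tag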
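
To see how the three string methods behave on their own, here is a small sketch. The helper name case_flags and the sample words are made up for illustration. Note that some inputs satisfy none of the three tests, and a one-letter word like 'I' satisfies two of them, so the order of the checks matters.

```python
def case_flags(word):
    """Return (islower, isupper, istitle) for word; a hypothetical helper."""
    return (word.islower(), word.isupper(), word.istitle())


# Sample words are illustrative, not taken from any real corpus.
for word in ['hello', 'HELLO', 'Hello', 'iPhone', '42', 'I']:
    # A string with no cased characters (such as '42') makes all three
    # methods return False.
    print(word, case_flags(word))
```

With the checks ordered as islower(), isupper(), istitle(), the word 'I' ends up tagged as all caps rather than capitalized.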

Then let's take an input file and pass every word through the
function...

# 'path:to:file' is a placeholder; substitute the path to your input file.
with open('path:to:file', 'r') as f:
    corpus_text = f.read()

# Split on whitespace, tag each word, and join the results with spaces.
all_words = corpus_text.split()
tagged_corpus = ' '.join(w + '/' + checkCase(w) for w in all_words)

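# Placeholder again; use a different path from the input file so the
# original text is not overwritten.
with open('path:to:file', 'w') as output_file:
    output_file.write(tagged_corpus)
print('All Done!')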
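
If you want to try the tagging step without touching any files, the same idea can be sketched as a function that works on a string. The names check_case and tag_text, the 'other' fallback tag, and the sample sentence are all assumptions made for this example.

```python
def check_case(word):
    """Return a case tag for word ('other' is an assumed fallback tag)."""
    if word.islower():
        return 'nocap'
    if word.isupper():
        return 'allcaps'
    if word.istitle():
        return 'cap'
    return 'other'  # assumption: covers mixed case and tokens with no letters


def tag_text(text):
    """Tag every whitespace-separated token in text as word/tag."""
    return ' '.join(w + '/' + check_case(w) for w in text.split())


print(tag_text('The NASA probe landed'))
```

Keeping the tagging separate from the file handling makes it easy to check a few sentences by hand before running it over a whole corpus.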



Also, if you're doing natural language processing in Python, you
should take a look at NLTK (the Natural Language Toolkit).



