[Tutor] Counting words
Nicole Seitz
nicole.seitz@urz.uni-hd.de
Thu, 4 Apr 2002 20:28:10 +0200
Hi there!
I would like to count words in a file, i.e. I want to know how often a word
occurs.
In "GoTo Python" I found some help, so I could write:
_____________________________________________________
import re, string
reg = re.compile("[\W]")
file = open("someText","r")
text = file.read()
occurences = {}
for word in reg.split(text):
    occurences[word] = occurences.get(word,0)+1
print occurences
for word in occurences.keys():
    print "word:",word,", occurences:",occurences[word]
_____________________________________________________
First question: Can someone explain what's happening in the first for-loop?
I don't understand occurences.get(word,0)+1 .
I know it's counting there, but how?
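For reference, here is a minimal sketch of what that line does, assuming a small made-up word list instead of a file. The name "previous" is only for illustration; the original does the lookup and the addition in a single expression.

```python
# Made-up word list standing in for reg.split(text)
words = ["This", "is", "This", "text"]

occurences = {}
for word in words:
    # get(word, 0) returns the count stored so far for this word,
    # or the default 0 if the word is not yet a key in the dictionary
    previous = occurences.get(word, 0)
    # store the count back, increased by one
    occurences[word] = previous + 1
```

After the loop, occurences["This"] is 2 and the other two words are at 1. Without the default, a plain occurences[word] lookup would raise a KeyError the first time each word is seen.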
(one possible output)
word: of , occurences: 1
word: Some , occurences: 1
word: are , occurences: 1
word: texts , occurences: 1
word: BORING , occurences: 1
word: some , occurences: 1
word: is , occurences: 2
word: boring , occurences: 1
word: This , occurences: 2
word: kind , occurences: 1
word: text , occurences: 1
word: , occurences: 3
Second question:
"Some" and "some" should be recognized as one word, and the same goes for
"BORING" and "boring". I thought of string.lowercase as a possible solution,
but since it doesn't work, I might be wrong. Any idea what to do?
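One possible approach, sketched here under the assumption that folding everything to lowercase is acceptable: string.lowercase is a constant holding the lowercase letters, not a function, so it cannot convert anything. The string method lower() does the conversion. The sample sentence below is made up.

```python
import re

reg = re.compile(r"\W")

# Made-up sample text, not the contents of the original file
text = "Some texts are BORING, some are boring."

occurences = {}
# lower() returns a lowercased copy of the text, so "Some" and "some"
# end up as the same dictionary key
for word in reg.split(text.lower()):
    occurences[word] = occurences.get(word, 0) + 1
```

This only merges the upper- and lowercase spellings; the empty-string entry from the split is still counted.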
Third question:
Last line of output:
Is "\n" recognized as a word?? (My text currently consists of three lines)
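A likely explanation, sketched with a hypothetical three-line sample (the real file is not shown in the post): the newline itself is not stored as a word. Instead, split() returns an empty string wherever two separators sit next to each other, such as a period followed by a newline, and also after a separator at the very end of the text.

```python
import re

# Hypothetical three-line sample; the original file's contents are an assumption
text = "This is some kind of text.\nSome texts are BORING.\nThis is boring."

pieces = re.split(r"\W", text)

# Empty strings come from ".\n" (two separators in a row) and the final "."
empty = [p for p in pieces if p == ""]

# Two ways to avoid the bogus entry: filter out the empty pieces,
# or match runs of word characters directly instead of splitting
words = [p for p in pieces if p]
words_alt = re.findall(r"\w+", text)
```

With this sample, empty holds three items, which would be consistent with the count of 3 in the last output line, and both words and words_alt contain only real words.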
Thanx in advance.
Nicole