counting how often the same word appears in a txt file...But my code only prints the last line entry in the txt file
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Wed Dec 19 06:03:21 EST 2012
On Wed, 19 Dec 2012 02:45:13 -0800, dgcosgrave wrote:
> Hi, I am just starting out with Python... My code below changes the txt
> file into a list, adds the words to an empty dictionary and prints how
> often each word occurs, but it only seems to recognise and print the last
> entry of the txt file. Any help would be great.
>
> tm =open('ask.txt', 'r')
> dict = {}
> for line in tm:
>     line = line.strip()
>     line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
>     line = line.lower()
>     list = line.split(' ')
Note: you should use descriptive names. Since this is a list of WORDS, a
much better name would be "words" rather than "list". Also, list is the
name of a built-in type, and you may run into trouble when you
accidentally re-use it as a name. The same goes for "dict".
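To illustrate the kind of trouble meant here, this is a small sketch (written for Python 3, with a made-up function name) of what happens once "list" has been rebound to your own data and you later try to use the built-in:

```python
def shadowing_demo():
    # Rebinds the name "list" to a list of words, hiding the built-in.
    list = "ask not what".split(" ")
    try:
        # We meant the built-in here, but "list" now refers to our data.
        return list("abc")
    except TypeError as err:
        return "TypeError: " + str(err)


print(shadowing_demo())
```

Renaming the variable to "words" makes the problem go away.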
Apart from that, so far so good. For each line, you generate a list of
words. But that's when it goes wrong, because you don't do anything with
the list of words! The next block of code is *outside* the for-loop, so
it only runs once the for-loop is done. So it only sees the last list of
words.
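A stripped-down sketch of that effect, using made-up sample lines instead of your file (Python 3 syntax):

```python
lines = ["the cat", "sat on", "the mat"]

for line in lines:
    # "words" is rebound on every pass; the earlier lists are discarded.
    words = line.split()

# Not indented, so this runs once, after the loop has finished,
# and only the words from the final line are still around.
seen_after_loop = words[:]
print(seen_after_loop)
```

Whatever you want done for every line has to be indented inside the loop.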
> for word in list:
The problem here is that you lost the indentation. You need to indent the
"for word in list" (better: "for word in words") so that it starts level
with the line above it.
>     if word in dict:
>         count = dict[word]
>         count += 1
>         dict[word] = count
This bit is fine.
> else:
>     dict[word] = 1
But this fails for the same reason! You have lost the indentation.
A little-known fact: Python for-loops take an "else" block too! It's a
badly named statement, but sometimes useful. You can write:
    for value in values:
        do_something_with(value)
        if condition:
            break  # skip to the end of the for...else
    else:
        print "We never reached the break statement"
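If you want to try for...else yourself, here is a small self-contained sketch. The function name and the sample data are invented for illustration, and it uses Python 3 syntax:

```python
def first_negative(values):
    """Describe the first negative number in values, if there is one."""
    for value in values:
        if value < 0:
            result = "found %d" % value
            break  # leaving via break skips the else block
    else:
        # Runs only when the loop finished without hitting break,
        # including the case where values is empty.
        result = "no negative values"
    return result


print(first_negative([3, 1, -4, 1]))
print(first_negative([3, 1, 4]))
```

The else block belongs to the for statement, not to the if inside it.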
So by pure accident, you lined up the "else" statement with the for loop,
instead of what you needed:
    for line in tm:
        ... blah blah blah
        for word in words:
            if word in word_counts:  # better name than "dict"
                ... blah blah blah
            else:
                ...
> for word, count in dict.iteritems():
>     print word + ":" + str(count)
And this bit is okay too.
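Putting the pieces together, one possible corrected version might look like the sketch below. It is written for Python 3, so it differs from your code in a few places: str.maketrans replaces the two-argument translate, items() replaces iteritems(), and print is a function. The function name count_words and the sample lines are my own inventions, and I use split() with no argument so that runs of spaces do not produce empty "words".

```python
import string

# Translation table that deletes every ASCII punctuation character.
STRIP_PUNCTUATION = str.maketrans("", "", string.punctuation)


def count_words(lines):
    """Return a dict mapping each lowercased word to how often it occurs."""
    word_counts = {}
    for line in lines:
        line = line.strip().translate(STRIP_PUNCTUATION).lower()
        words = line.split()  # split on any run of whitespace
        # Indented, so this loop runs once per line, not once at the end.
        for word in words:
            if word in word_counts:
                word_counts[word] += 1
            else:
                word_counts[word] = 1
    return word_counts


sample = [
    "Ask not what your country can do for you,",
    "ask what you can do for your country.",
]
counts = count_words(sample)
for word, count in sorted(counts.items()):
    print(word + ":" + str(count))
```

To run it on your file, pass the open file object instead of the sample list, for example count_words(open('ask.txt')).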
Good luck!
--
Steven