Struggling with sorted dict of word lengths and count

Mon Jun 27 14:17:35 EDT 2011

On Tue, Jun 28, 2011 at 3:00 AM, Cathy James <nambo4jb at gmail.com> wrote:
> def fileProcess(filename = open('input_text.txt', 'r')):
>     for line in filename:
>         for word in line.lower().split( ):#split lines into words and make lower case
>             wordlen = word_length(word)#run function to return length of each word
>             freq[wordlen] = freq.get(wordlen, 0) + 1#increment the stored value if there is one, or initialize
>         print(word, wordlen, freq[wordlen])
>
> fileProcess()

Yep, you're pretty close!

There are a few improvements you could make, but the first one I would
recommend is to change your extremely confusing variable name
('filename' actually holds an open file object, not a name):

def fileProcess(filename = 'input_text.txt'):
    for line in open(filename, 'r'):
        ... continue as before ...

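As well as making your code easier to comprehend, this means that the
file is opened each time the function is called, rather than once when
the def statement is executed. (Default arguments are evaluated when
the def statement is executed, not when the function's called.) In
CPython the file is then closed once the loop finishes and nothing
refers to it any more; use a 'with' block if you want that guaranteed.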
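
To see the default-argument behaviour in isolation, here is a small
sketch. The names (calls, make_default, at_def_time, at_call_time) are
made up for illustration and are not from your code. With
open('input_text.txt') as the default, the same thing happens: the file
is opened once, at def time, and a second call would find it already
read to the end.

```python
calls = []

def make_default():
    # Records each time a default value is computed.
    calls.append("evaluated")
    return len(calls)

def at_def_time(value=make_default()):
    # make_default() already ran once, when this def statement executed.
    return value

def at_call_time(value=None):
    # None is only a placeholder; the real default is computed on
    # every call that does not pass a value.
    if value is None:
        value = make_default()
    return value
```

Calling at_def_time() repeatedly hands back the same stored value,
while each at_call_time() call computes a fresh one.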

The other change you need to make is to move the display into a loop
of its own. Currently you're printing only the last word of each line,
along with its length and the running count for that length, which
isn't terribly useful. Try this:

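for wordlen, wordfreq in sorted(freq.items()):
    print(str(wordlen) + "\t" + str(wordfreq))

This should go after the 'for line in' loop, not inside it, so that it
runs once when all the counting is done. Sorting the items puts the
lengths in ascending order.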
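
Putting the counting loop and the display loop together might look
roughly like this. It is only a sketch: count_word_lengths and
format_report are names I've invented, len(word) stands in for your
word_length() function (which I haven't seen), and the function takes
any iterable of lines so that it can be tried without a file.

```python
def count_word_lengths(lines):
    # lines: any iterable of strings, such as an open file or a list.
    freq = {}
    for line in lines:
        for word in line.lower().split():
            wordlen = len(word)  # assumed stand-in for word_length(word)
            freq[wordlen] = freq.get(wordlen, 0) + 1
    return freq

def format_report(freq):
    # One "length<TAB>count" row per word length, shortest first.
    return [str(wordlen) + "\t" + str(wordfreq)
            for wordlen, wordfreq in sorted(freq.items())]

sample = ["The quick brown fox", "jumps over the lazy dog"]
for row in format_report(count_word_lengths(sample)):
    print(row)
```

Passing open('input_text.txt') instead of the sample list should work
the same way, since a file object also yields one line at a time.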

There are a few other improvements possible (look up
'collections.Counter' for instance), but this should get you on the
right track!
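
For the curious, a Counter-based version could be sketched like so.
Again, count_word_lengths_counter is a made-up name and len(word) is an
assumption standing in for word_length(). Counter does the "get the old
value or start from zero" bookkeeping for you.

```python
from collections import Counter

def count_word_lengths_counter(lines):
    # Counter tallies every length that the generator yields.
    return Counter(len(word)
                   for line in lines
                   for word in line.lower().split())

counts = count_word_lengths_counter(
    ["The quick brown fox", "jumps over the lazy dog"])
for wordlen, wordfreq in sorted(counts.items()):
    print(wordlen, wordfreq, sep="\t")
```

A Counter is still a dict underneath, so sorted(counts.items()) works
just as before, and asking for a length that never occurred gives 0
rather than raising KeyError.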

Chris Angelico
aka Rosuav