Homegrown wordcount
Keith P. Boruff
kboruff at optonline.net
Sun Jun 13 07:24:03 EDT 2004
Grégoire Dooms wrote:
> What's the purpose of stripping the items in the list if you just count
> their number? Isn't this equivalent to
>     words += len(input.split(sep))
>
>>         return words
>>     else:
>>         for item in input:
>>             wordcount(item)
>>
>>     return words
>
>
> Removing the global statement and sep param, you get:
>
> def wordcount(input):
>     if isinstance(input, str):
>         return len(input.split())
>     else:
>         return sum([wordcount(item) for item in input])
>
> --
> Grégoire Dooms
After reading this thread, I decided to embark on a word counting
program of my own. One thing I like to do when learning new programming
languages is to try to emulate some of my favorite UNIX-type programs.
That said, to get the count of words in a string, I merely did the
following:
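# Beginning of program
import re
import sys

# Right now my simple wc program just reads piped data
if sys.stdin.isatty():
    sys.exit("no piped input: pipe some text into this script")
input_data = sys.stdin.read()
# \S+ matches each run of non-whitespace characters, i.e. one word
print("number of words:", len(re.findall(r'\S+', input_data)))
# End of program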
Though I've only done trivial tests on this up to now, the word count of
this script seems to match that of the wc on my system (RH Linux WS). I
ran some big RFC text files through this too.
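One quick sanity check is to compare the regex count against the plain split() count from the quoted message. With no separator argument, split() also breaks on runs of whitespace, so the two should agree on ordinary text. This is only a minimal sketch: count_words_regex and count_words_split are hypothetical helper names, not part of the script above.

```python
import re

def count_words_regex(text):
    """Count maximal runs of non-whitespace characters."""
    return len(re.findall(r'\S+', text))

def count_words_split(text):
    """Count words the way str.split() with no separator does."""
    return len(text.split())

# Sample with leading spaces, a tab, a blank line and repeated spaces.
sample = "  The quick\tbrown fox\n\njumps over   the lazy dog.\n"
print(count_words_regex(sample), count_words_split(sample))
```

If the two numbers ever differ on some input, that input is a good place to start looking for flaws.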
There could be some flaws here; I don't know. I'll have to look at it
better when I get back from the gym. If anyone here finds a problem, I'd
be interested in hearing it.
Like I said, I love using these UNIX-type programs to learn a new
language. It helps me learn things like file I/O, command-line
arguments, string manipulation, etc.
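As an illustration of how file I/O and command-line arguments might come into such an exercise, here is a rough sketch of a wc-like script that takes file names as arguments. It is an assumption about where the script could go next, not the program described above; count_file and main are hypothetical names, and the output layout only loosely imitates wc.

```python
import sys

def count_file(path):
    """Return (lines, words, bytes) for one file, similar in spirit to wc."""
    with open(path, "rb") as handle:
        data = handle.read()
    # Lines are counted as newline bytes; words as whitespace-separated runs.
    return data.count(b"\n"), len(data.split()), len(data)

def main(argv):
    # Each command-line argument after the script name is treated as a file path.
    for path in argv[1:]:
        lines, words, size = count_file(path)
        print(lines, words, size, path)

if __name__ == "__main__":
    main(sys.argv)
```

Reading the file in binary mode keeps the byte count honest and sidesteps text decoding, at the cost of treating only ASCII whitespace as a word separator.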
Keith P. Boruff