Word Counting -- A Novel Approach

Moshe Zadka moshez at math.huji.ac.il
Fri Apr 16 08:40:41 EDT 1999


There was a thread here about counting words when reading input in arbitrary
chunks instead of line by line.

I have a friend who continually reminds me "In Rome, do as the Romans", so 
it seems to me the right way is to count with an object you 'feed()' data
into, like other non-line-based Python parsers (XML, HTML, etc.).

So I wrote a small word counting class, whose interface is:
     * feed: Feed some data into the counter.
     * flush: Force a word break. The next feed will start a new word.
       This is useful, for example, when counting words in multiple
       files, to make sure words are not concatenated across files.
     * items: Will return a list of (word, count) pairs.

(This is an excerpt from the documentation)
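
For readers who want something concrete, here is a minimal sketch of a class
with that interface. It is not the class described in this post: the name
WordCounter, the whitespace-only definition of a word, and the internal
attributes are all assumptions made for illustration.

```python
class WordCounter:
    """Hypothetical sketch of a feed/flush/items word counter."""

    def __init__(self):
        self._counts = {}    # word -> number of occurrences
        self._partial = ""   # unfinished word left over from the last feed

    def feed(self, data):
        """Feed some data into the counter."""
        # Rejoin a word that was split across two chunks.
        data = self._partial + data
        words = data.split()
        # If the data does not end in whitespace, its last word may
        # continue in the next chunk, so hold it back for now.
        if data and not data[-1].isspace():
            self._partial = words.pop()
        else:
            self._partial = ""
        for word in words:
            self._counts[word] = self._counts.get(word, 0) + 1

    def flush(self):
        """Force a word break: count any held-back fragment as a word."""
        if self._partial:
            self._counts[self._partial] = self._counts.get(self._partial, 0) + 1
            self._partial = ""

    def items(self):
        """Return a list of (word, count) pairs counted so far."""
        # A fragment still held back is not included; call flush() first.
        return list(self._counts.items())


counter = WordCounter()
for chunk in ("the qu", "ick the", " end"):
    counter.feed(chunk)
counter.flush()
print(sorted(counter.items()))
```

In this sketch, calling flush() between files is what keeps the last word of
one file from being glued to the first word of the next.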

I will happily mail this class to anyone who wants it.
--
Moshe Zadka <mzadka at geocities.com>. 
QOTD: What fun to me! I'm not signing permanent.




