Counting occurrences of words in a list of strings

John Machin sjmachin at lexicon.net
Wed May 25 01:58:21 EDT 2005


Travers Naran wrote:
> John Machin wrote:
> 
>> 3. If you want to roll your own, start with Gonzalo Navarro's 
>> publications: http://www.dcc.uchile.cl/~gnavarro/subj-multiple.html
> 
> 
> I don't suffer from NMH syndrome.  If ahocorasick does the job, or even 
> count, I'm OK with that.

Do you mean NIH syndrome? Sorry, I should have been clearer, like "if 
you want faster, you will have to roll your own; start with ...". The 
Aho-Corasick algorithm is about 30 years old.  Navarro is part of, and 
summarises the rest of, the state of the art.
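
For reference, here is a minimal pure-Python sketch of the Aho-Corasick idea applied to this thread's problem. It is only an illustration: it is not the API of the ahocorasick module, and the names build_automaton and count_words are made up for this example. It assumes the words are non-empty, and it counts substring matches (including overlapping ones) rather than whole words.

```python
from collections import deque


def build_automaton(words):
    """Build a trie with failure links (hypothetical helper for illustration)."""
    goto = [{}]   # goto[node] maps a character to the next node
    fail = [0]    # fail[node] is the node for the longest proper suffix in the trie
    out = [[]]    # out[node] lists indices of words that end at this node
    for idx, word in enumerate(words):
        node = 0
        for ch in word:
            nxt = goto[node].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[node][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            node = nxt
        out[node].append(idx)

    # Breadth-first pass: nodes at depth 1 keep fail = 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit matches that end at the suffix node.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def count_words(words, strings):
    """Count occurrences of each word across all strings in one pass per string."""
    words = [w for w in dict.fromkeys(words) if w]  # drop duplicates and empties
    goto, fail, out = build_automaton(words)
    counts = dict.fromkeys(words, 0)
    for text in strings:
        node = 0
        for ch in text:
            while node and ch not in goto[node]:
                node = fail[node]
            node = goto[node].get(ch, 0)
            for idx in out[node]:
                counts[words[idx]] += 1
    return counts


result = count_words(["he", "she", "his", "hers"], ["ushers", "his shed"])
```

Each string is scanned once regardless of how many words are searched for, which is the point of the algorithm. Note that str.count skips overlapping matches, so the two approaches can disagree on inputs such as "aa" in "aaa".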

Cheers,
John


