What is the fastest way, in terms of execution speed, to build a histogram of words from multiple huge text files? (Note: by histogram I mean a list of all the words occurring in the texts, together with their frequencies.) Can anyone give me some insight?
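
To make the goal concrete, here is a minimal single-threaded baseline sketch in Python. The function name `word_histogram`, the tokenization rule (lowercased runs of ASCII letters, digits, and apostrophes), and the UTF-8 encoding are assumptions for illustration only; the question is what would run faster than something like this.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, digits, or apostrophes,
# compared case-insensitively. Adjust the pattern for other definitions.
WORD_RE = re.compile(r"[a-z0-9']+")


def word_histogram(paths, encoding="utf-8"):
    """Return a Counter mapping each word to its total frequency across files."""
    counts = Counter()
    for path in paths:
        # Undecodable bytes are replaced rather than raising an error.
        with open(path, encoding=encoding, errors="replace") as fh:
            # Read line by line so a huge file is never held in memory at once.
            for line in fh:
                counts.update(WORD_RE.findall(line.lower()))
    return counts
```

Calling `word_histogram(["a.txt", "b.txt"]).most_common(10)` (file names are placeholders) would list the ten most frequent words with their counts.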