grabbing random words

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sun Sep 24 01:00:27 EDT 2006


On Sat, 23 Sep 2006 04:37:31 -0700, MonkeeSage wrote:

> Another approach would be to just scrape a CS's random (5.75 x 10^30)
> word haiku generator. ;)

That isn't 5.75e30 words; it is the number of possible haikus. There
aren't that many words in all human languages combined.

A standard English working vocabulary is about 800 words in typical daily
use, with about 5000 words that most people can understand. Particularly
well-read people might understand a dozen times that, about 60,000 words.
The total number of words in English is hard to count, but the Oxford
English Dictionary estimates about three quarters of a million words.

http://www.askoxford.com/asktheexperts/faq/aboutenglish/numberwords


Call it a million; and let's say that there are, or have ever been, a
million distinct human languages (which is surely a large overestimate,
even including dialects and pidgins). That gives only a "mere" 10**12
words, about a million million million times fewer than the number of
haikus.
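
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. The variable names are mine, and the inputs are just the round
figures used above (a million words per language, a million languages,
5.75e30 haikus), so treat the result as an order-of-magnitude estimate
and nothing more.

```python
# Back-of-envelope check using the round figures from this post.
# All three inputs are deliberate overestimates or quoted numbers,
# not measured data.
words_per_language = 10**6   # "call it a million" words per language
languages = 10**6            # generous guess at distinct human languages
haikus = 5.75e30             # figure quoted for the haiku generator

total_words = words_per_language * languages
ratio = haikus / total_words  # how many haikus per word

print(total_words)            # 1000000000000, i.e. 10**12
print("%.2e" % ratio)         # on the order of 10**18
```

So even with both inputs inflated, the haikus outnumber the words by
a factor somewhere between 10**18 and 10**19.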

(Note however that there are languages like Finnish which allow you to
stick together words into a single "word" of indefinite length, sort of as
if we could say in English "therearelanguageswhichallowyou" to
"sticktogetherwordsintoasinglewordofindefinitelength". Such languages
might be said to have an infinite number of words, in some sense.)



-- 
Steven.



