Python slow for filter scripts

Bengt Richter bokr at oz.net
Wed Oct 29 22:36:07 EST 2003


On 29 Oct 2003 06:34:22 GMT, William Park <opengeometry at yahoo.ca> wrote:

>Alex Martelli <aleax at aleax.it> wrote:
>> and I'm specifically reading the King James' Bible (an easily
>> available text so you can reproduce my results!) and writing
>
>Can you post URL for the Bible?
>
Try Project Gutenberg, at

    http://www.gutenberg.net/

or their new host at

    http://www.ibiblio.org/gutenberg/

They have a number of bibles in various languages, and a ton (>10,000 e-texts) of other stuff,
also some audio texts, apparently. BTW I read somewhere that the BBC is going to make all their
archives, video and audio, freely available on the net, except where there is some legal reason
they can't. I guess they're a kind of FEF -- Free Entertainment Foundation (thank you British
telly owners ;-)

Apparently a new King James e-text is at (long URL, or use their search for "bible" (w/o quotes)
and go to entry #16):

http://www.ibiblio.org/gutenberg/cgi-bin/sdb/t9.cgi?entry=30&full=yes&ftpsite=http://www.ibiblio.org/gutenberg/

They also have the Koran, BTW. It's interesting to compare word frequencies, e.g., the 20 most frequent
(unless I goofed) in the texts I downloaded:

"C:\Info\Linguistics\Gutenberg\bible\bible11.txt"
  6647: 'LORD'
  6649: 'him'
  6856: 'is'
  6893: 'be'
  6971: 'they'
  7249: 'for'
  7972: 'a'
  8388: 'his'
  8854: 'I'
  8940: 'unto'
  9666: 'he'
  9760: 'shall'
 12353: 'in'
 12592: 'that'
 12846: 'And'
 13429: 'to'
 34472: 'of'
 38891: 'and'
 62135: 'the'

"C:\Info\Linguistics\Gutenberg\koran\koran10.txt"
  1739: 'ye'
  1752: 'with'
  1956: 'And'
  1979: 'for'
  1991: 'who'
  2037: 'be'
  2108: 'not'
  2186: 'that'
  2254: 'shall'
  2366: 'them'
  2575: 'a'
  2644: 'they'
  2799: 'is'
  2900: 'in'
  3320: 'God'
  5144: 'to'
  6855: 'of'
  6896: 'and'
 10982: 'the'
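
For anyone who wants to try this on their own downloads, here is a minimal sketch of
how counts like the ones above could be produced. It is not the script used for these
lists. The tokenizing rule (runs of ASCII letters, case-sensitive, so 'And' and 'and'
are counted separately) and the names top_words, format_report and print_report are
assumptions made for illustration, so the numbers may not match exactly.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters; case is preserved,
# so 'And' and 'and' are counted as different words.
WORD_RE = re.compile(r"[A-Za-z]+")


def top_words(text, n=20):
    """Return the n most frequent words as (count, word) pairs,
    least frequent first, most frequent last."""
    counts = Counter(WORD_RE.findall(text))
    return [(count, word) for word, count in reversed(counts.most_common(n))]


def format_report(pairs):
    """Format (count, word) pairs one per line, e.g. " 12353: 'in'"."""
    return "\n".join(f"{count:6d}: {word!r}" for count, word in pairs)


def print_report(path, n=20):
    """Print the file name followed by its n most frequent words."""
    # latin-1 never fails to decode; the real encoding of a given
    # e-text is an assumption here.
    with open(path, encoding="latin-1") as f:
        text = f.read()
    print(f'"{path}"')
    print(format_report(top_words(text, n)))
```

Calling print_report on a downloaded file such as bible11.txt prints the file name
and then the counts in ascending order, in the same layout as the lists above.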

Both start with the-and-of-to ;-)
(I hope this does not offend anyone ;-)

Regards,
Bengt Richter



