"Newbie" questions - "unique" sorting ?

Cousin Stanley CousinStanley at hotmail.com
Wed Jun 25 12:17:56 EDT 2003


| ...
| I hope you didn't wait over 6 hours to find out !!!
|

John ...

Actually, I did wait ....

Since I'd run the program successfully a number of times,
but on smaller files, I wanted to know just how long
it would take to run to completion ....

The numbers in the output I posted
were an actual copy/paste directly
from the DOS window that I ran it in ...

| Unless I missed something it does lines
| starting ftp, http, BUT not lines that start www .
|

You didn't miss anything ....

The version of url_list.py that you ran
only looks for ....

    [ 'http://' ,
      'https://' ,
      'ftp://' ,
      'news://' ,
      'res://' ,
      'fido://' ]

However, I added a bit of code to url_list.py
to also extract lines starting with www. ...
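
For readers without the zip file handy, here is a rough sketch
of that kind of prefix test. It is not the actual url_list.py
source: the function name extract_url_lines, the sample lines,
and the case-insensitive match are all assumptions, and it uses
modern Python, where str.startswith accepts a tuple of prefixes.

```python
# Hypothetical sketch, not the real url_list.py.
# The prefix list is the one from the post plus 'www.'.
PREFIXES = ('http://', 'https://', 'ftp://',
            'news://', 'res://', 'fido://', 'www.')


def extract_url_lines(lines, prefixes=PREFIXES):
    """Return the stripped lines that start with one of the prefixes."""
    found = []
    for line in lines:
        candidate = line.strip()
        # Assumption: match case-insensitively, so 'FTP://' also counts.
        if candidate.lower().startswith(prefixes):
            found.append(candidate)
    return found


if __name__ == '__main__':
    sample = ['http://example.com/a\n',
              'plain text\n',
              '  www.example.org\n',
              'FTP://example.net/file\n']
    for url in extract_url_lines(sample):
        print(url)
```

Keeping the prefixes in one tuple means that adding www. is
a one-line change rather than a new branch in the loop.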

Download newest versions ....

    http://fastq.com/~sckitching/Python/word_list.zip

Run as before ....

    python url_list.py JF_In.txt JF_URLs.txt

    python word_list.py JF_URLs.txt JF_URLs_Index.txt

| Thank you for such excellent programming.

You're welcome ....

Thanks also to ....

    Erik Max Francis for suggesting
    the lambda sort for Mixed-Case sorting ....

    Kim Petersen for suggesting usage of ....

        dict_words.has_key( this_word ) instead of

        this_word in dict_words.keys()

    which made an incredible difference in processing time ....
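
    The two suggestions can be sketched together roughly as
    follows. This is an illustration only, not the code in
    word_list.py: the name build_word_index and the sample
    words are made up, and it is written for current Python,
    where has_key no longer exists and  word in counts  does
    the same hashed lookup. In the Python of this thread,
    keys() built a full list and  in  scanned it item by item
    for every word, which is why the change mattered so much
    on a large file. The sort uses a key function in place of
    a comparison lambda; the exact lambda from the thread
    isn't shown here, so that part is an assumption too.

```python
def build_word_index(words):
    """Count each distinct word, then sort ignoring case."""
    counts = {}
    for word in words:
        # Hashed lookup on the dict itself, not a scan of a key list.
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    # Mixed-case sort: compare lowercased words first, then fall back
    # to the original spelling so that ties come out in a fixed order.
    return sorted(counts.items(),
                  key=lambda item: (item[0].lower(), item[0]))


if __name__ == '__main__':
    sample = ['banana', 'Apple', 'cherry', 'apple', 'Banana', 'apple']
    for word, count in build_word_index(sample):
        print(word, count)
```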

-- 
Cousin Stanley
Human Being
Phoenix, Arizona





