[Tutor] Re: How can I make this run faster?

Luhmann luhmann_br at yahoo.com
Mon Dec 21 22:40:28 CET 2009


#Here's my try:

vowel_killer_dict = { ord(a): None for a in 'aeiou'}

def devocalize(word):
    return word.translate(vowel_killer_dict)

vowelled = ['him', 'ham', 'hum', 'fun', 'fan']
vowelled = set(vowelled)


devocalise_dict={}


for a in vowelled:
    devocalise_dict[a]= devocalize(a)

    
unvowelled=set(devocalise_dict.values())

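for lex in unvowelled:
    # Use the precomputed devocalized forms instead of calling devocalize() again.
    matches = [word for word in vowelled if devocalise_dict[word] == lex]
    # print() as a function: the dict comprehension above already needs Python 3.
    print(lex, " ".join(matches))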
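
The loop above still scans the whole word set once for every consonantal form. A possible variation, offered only as a sketch, groups the words in a single pass by using each devocalized form as a dictionary key. It assumes Python 3; the helper name group_by_consonants is made up for this illustration and is not part of the original script.

```python
from collections import defaultdict

# Translation table: maps each vowel's code point to None, which deletes it.
vowel_killer_dict = {ord(a): None for a in 'aeiou'}

def devocalize(word):
    return word.translate(vowel_killer_dict)

def group_by_consonants(words):
    # Hypothetical helper: one pass over the words, so devocalize() is
    # called once per distinct word and no second list is ever scanned.
    groups = defaultdict(list)
    for word in set(words):
        groups[devocalize(word)].append(word)
    return groups

vowelled = ['him', 'ham', 'hum', 'fun', 'fan']
groups = group_by_consonants(vowelled)

for lex, forms in groups.items():
    # sorted() only makes the printed order of each group predictable.
    print(lex, " ".join(sorted(forms)))
```

Each word is touched once, so the running time should grow roughly in proportion to the number of words rather than to the number of words times the number of consonantal forms.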




--- On Mon, 21.12.09, Emad Nawfal (عمـ نوفل ـاد) <emadnawfal at gmail.com> wrote:

From: Emad Nawfal (عمـ نوفل ـاد) <emadnawfal at gmail.com>
Subject: [Tutor] How can I make this run faster?
To: "tutor" <Tutor at python.org>
Date: Monday, 21 December 2009, 8:40

Dear Tutors,
The purpose of this script is to see how many vocalized forms map to a single consonantal form. For example, the form "fn" could be fan, fin, fun.

The input is a large list (taken from a file) that has ordinary words. The script creates a devocalized list, then compares the two lists.


The problem: It takes over an hour to process 1 file. The average file size is 400,000 words.

Question: How can I make it run faster? I have a large number of files.

Note: I'm not a programmer, so please avoid very technical terms.


Thank you in anticipation.





def devocalize(word):
    vowels = "aiou"
    return "".join([letter for letter in word if letter not in vowels])
    
    
vowelled = ['him', 'ham', 'hum', 'fun', 'fan'] # input, usually a large list of around 500,000 items


vowelled = set(vowelled)
    
unvowelled = set([devocalize(word) for word in vowelled])


for lex in unvowelled:
    d = {}
    d[lex] = [word for word in vowelled if devocalize(word) == lex]
    

    print lex, " ".join(d[lex])

-- 
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي
"No victim has ever been more repressed and alienated than the truth"


Emad Soliman Nawfal
Indiana University, Bloomington
--------------------------------------------------------



_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


