[Tutor] finding words that contain some letters in their respective order

Emad Nawfal (عماد نوفل) emadnawfal at gmail.com
Sat Jan 24 02:17:21 CET 2009


2009/1/23 Emad Nawfal (عماد نوفل) <emadnawfal at gmail.com>

>
>
> On Fri, Jan 23, 2009 at 8:04 PM, Andre Engels <andreengels at gmail.com> wrote:
>
>> 2009/1/24 Emad Nawfal (عماد نوفل) <emadnawfal at gmail.com>:
>> >
>> >
>> > 2009/1/23 Emad Nawfal (عماد نوفل) <emadnawfal at gmail.com>
>> >>
>> >>
>> >> On Fri, Jan 23, 2009 at 6:57 PM, Andre Engels <andreengels at gmail.com>
>> >> wrote:
>> >>>
>> >>> I made an error in my program... Sorry, it should be:
>> >>>
>> >>> def hasRoot(word, root): # This order I find more logical
>> >>>     loc = 0
>> >>>     for letter in root:
>> >>>         loc = word.find(letter, loc) # I missed the ,loc here...
>> >>>         if loc == -1:
>> >>>             return False
>> >>>         loc += 1 # resume after the match, so a doubled root letter needs two hits
>> >>>     return True
>> >>>
>> >>> # main
>> >>>
>> >>> infile = open("myCorpus.txt").read().split()
>> >>> query = "ktb"
>> >>> outcome = [word for word in infile if hasRoot(word,query)]
>> >>>
>> >>>
>> >>> --
>> >>> André Engels, andreengels at gmail.com
>> >>
>> >>
>> >> Thank you so much.  bktab is a legal Arabic word. I also found the word
>> >> bmktbha in the corpus. I would have missed that.
>> >> Thank you again.
>> >> --
>> >> لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد
>> >> الغزالي
>> >> "No victim has ever been more repressed and alienated than the truth"
>> >>
>> >> Emad Soliman Nawfal
>> >> Indiana University, Bloomington
>> >> http://emnawfal.googlepages.com
>> >> --------------------------------------------------------
>> >
>> > Hi again,
>> > If I want to use a regular expression to find the root ktb in all its
>> > derivations, would this be a good way around it:
>> >
>> > >>> import re
>> > >>> x = re.compile(r"[a-z]*k[a-z]*t[a-z]*b[a-z]*")
>> > >>> text = "hw syktbha ghda wlktab ktb"
>> > >>> re.findall(x, text)
>> > ['syktbha', 'wlktab', 'ktb']
>>
>> Yes, that looks correct, and a regular expression solution is also
>> easier to adapt. For example, the little that I know of Arabic makes me
>> believe that _between_ the letters of a root there may only be vowels.
>> If that's correct, the RE can be changed to
>>
>> "[a-z]*k[aeiou]*t[aeiou]*b[a-z]*"
>
> The letter t does very often occur between the root consonants as well. For
> example, we have akttb, katatib, and for the root fsr you can have astfsr.
>
> Thank you Andre for your helpfulness, and thank you Eugene for suggesting
> the use of regular expressions.
>
>>
>>
>>
>>
>> --
>> André Engels, andreengels at gmail.com
>>
>
>
>
>
Sorry, the last example (fsr and astfsr) was incorrect. Correct examples
would be fqr and aftqr, or slf and astlf.
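
For what it's worth, here is a minimal sketch of how the regular expression could be built for any root, allowing t as well as vowels between the root consonants. The names root_pattern and infix are hypothetical, and the default letter set "aeiout" is only an assumption based on the examples in this thread, not a complete description of Arabic morphology.

```python
import re

def root_pattern(root, infix="aeiout"):
    """Compile a regex for words containing the root letters in order.

    `infix` is the assumed set of letters allowed between root consonants;
    any lowercase letters may come before and after the root.
    """
    between = "[" + re.escape(infix) + "]*"
    core = between.join(re.escape(letter) for letter in root)
    return re.compile(r"\b[a-z]*" + core + r"[a-z]*\b")

text = "hw syktbha ghda wlktab ktb akttb katatib aftqr astlf"

print(root_pattern("ktb").findall(text))
# ['syktbha', 'wlktab', 'ktb', 'akttb', 'katatib']
print(root_pattern("fqr").findall(text))   # ['aftqr']
print(root_pattern("slf").findall(text))   # ['astlf']
```

Passing infix="aeiou" gives the vowels-only pattern, which would then reject akttb. If any letter at all may sit between the consonants, the original [a-z]* version is the one to use.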

