[Tutor] Difficult loop?

Emad Nawfal (عماد نوفل) emadnawfal at gmail.com
Wed Oct 15 23:59:58 CEST 2008

Dear Tutors,
I needed a program to go through a word like "Almuta$r~id", where:

1- a, i, u, and o are the short vowels
2- ~ marks a doubled consonant.
I wanted the program to do the following:

- For each letter of the word that is not a short vowel, print the 5
preceding letters (skipping short vowels), then the letter itself, then the
5 following letters (again skipping short vowels), then the short vowel
that follows the letter.
- If the letter is followed by the double character "~", do the same, but
instead of printing just the vowel following the letter as the last
character, print the "~" + the short vowel
- If the letter is not followed by a vowel, print an underscore as the last
character.
- If there is a "+" sign, ignore it.
- If there are fewer than 5 preceding or following letters, print an
underscore in place of each missing letter.
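
As a rough sketch of the vowel, "~" and "+" rules above, a word can be split into (letter, vowel) pairs with a regular expression. The names PAIR and letter_vowel_pairs are made up for this illustration, and it assumes that ignoring "+" means deleting it before anything else is examined. It is written for Python 3 and only covers the short vowels a, i, u, and o listed in this message.

```python
import re

# One non-vowel character, optionally followed by "~" and/or one short
# vowel. The vowel set (a, i, u, o) is the one given in the message.
PAIR = re.compile(r"([^aiuo~])(~?[aiuo]?)")

def letter_vowel_pairs(word):
    """Return a list of (letter, mark) tuples for one word.

    mark is a short vowel, "~" plus a short vowel, or "_" when the
    letter has no vowel after it.
    """
    # Assumption: "+" is dropped first, so it never separates a letter
    # from the vowel that follows it.
    cleaned = word.replace("+", "")
    pairs = []
    for letter, mark in PAIR.findall(cleaned):
        # A bare "~" with no vowel after it is treated as "no vowel".
        if mark in ("", "~"):
            mark = "_"
        pairs.append((letter, mark))
    return pairs

print(letter_vowel_pairs("mu+ta"))
# [('m', 'u'), ('t', 'a')]
```

Keeping this step separate means the windowing of 5 letters before and after never has to think about vowels at all.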

For example, the word "Almuta$r~id" would be printed as follows:

_  _  _  _  _  A  l  m  t  $  r  _
_  _  _  _  A  l  m  t  $  r  d  _
_  _  _  A  l  m  t  $  r  d  _  u
_  _  A  l  m  t  $  r  d  _  _  a
_  A  l  m  t  $  r  d  _  _  _  a
A  l  m  t  $  r  d  _  _  _  _  ~i
l  m  t  $  r  d  _  _  _  _  _  _
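
A table like the one above could be produced with plain list slicing. This is only a sketch: context_rows is a hypothetical name, the input is assumed to be a list of (letter, vowel) pairs already extracted from the word, and the pairs below are written out by hand for "Almuta$ar~id", the spelling used in the test input of the script further down.

```python
def context_rows(pairs, width=5, pad="_"):
    """Format one row per letter: `width` letters before it, the letter,
    `width` letters after it, then the letter's vowel mark."""
    letters = [letter for letter, mark in pairs]
    # Padding both ends means every window below has the same length.
    padded = [pad] * width + letters + [pad] * width
    rows = []
    for i, (letter, mark) in enumerate(pairs):
        # padded[i + width] is the current letter, so this slice holds
        # width + 1 + width cells centred on it.
        window = padded[i:i + 2 * width + 1]
        rows.append("  ".join(window + [mark]))
    return rows

# Hand-written pairs for "Almuta$ar~id" (an assumption of this example).
example = [("A", "_"), ("l", "_"), ("m", "u"), ("t", "a"),
           ("$", "a"), ("r", "~i"), ("d", "_")]

for row in context_rows(example):
    print(row)
```

The first row printed is "_  _  _  _  _  A  l  m  t  $  r  _" and the last is "l  m  t  $  r  d  _  _  _  _  _  _".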

I took the problem to a friend of mine who is a Haskell programmer. He wrote
the following Python script, which works perfectly, but I'm wondering whether
there is an easier, more Pythonic way to write this:

# Given a transliterated Arabic text like:
# luwnog biyt$ Al+wilAy+At+u ...

# + are to be ignored
# y, A, and w are the long vowels
# shadda is written ~

# produce a table with:
# (letter, vowel, pre5, post5)
# where vowel is either _ (if the letter is not followed by a vowel), a
# vowel, or a shadda with a vowel; pre5 and post5 are the letters
# surrounding the letter


# Initializes a few variables to be used or updated by the main loop below
space = ' '
plus = '+'
dash = '_'
shadda = '~'
vowels = ('a', 'e', 'i', 'o', 'u')
skips = (space,plus) + vowels

# a small input for testing

inputString = "Al+muta$ar~id+i Al+muta$ar~id+i Al+muta$ar~id+i"

# for convenience surround the input with six dashes both ways to make the
# loop below uniform

def makeWord (s): return dash*6 + s + dash*6

# A few utility constants, variables, and functions...

def shiftLeft (context, ch): return context[1:] + (ch,)

def isSkip (ch): return (ch in skips)

def isShadda (ch): return (ch == shadda)

def isVowel (ch): return (ch in vowels)

def nextCh (str,i):
    # Returns (consonant, vowel-or-dash, index to resume from)
    c = str[i]
    try:
        if isSkip(c):
            return nextCh(str,i+1)
        elif isShadda(str[i+1]):
            if isVowel(str[i+2]):
                return (c,str[i+1:i+3],i+3)
            else:
                return (c,dash,i+2)
        elif isVowel(str[i+1]):
            return (c,str[i+1],i+2)
        else:
            return (c,dash,i+1)
    except IndexError:
        # ran off the end of the string while looking ahead
        return (c,dash,i+1)

def advance (str,pre,post,horizon):
    (cc,cv) = post[0]
    (hc,hv,nextHorizon) = nextCh(str,horizon)
    nextPre = shiftLeft(pre,(cc,cv))
    nextPost = shiftLeft(post,(hc,hv))
    return (cc,cv,nextPre,nextPost,nextHorizon)

def printLine (cc,cv,pre,post):
    if cc == dash: return
    simplePre = [c for (c,v) in pre]
    simplePost = [c for (c,v) in post[1:]]
    for c in simplePre: print "%s " % c,
    print "%s " % cc,
    for c in simplePost: print "%s " % c,
    print cv

def processWord (str):
    d = (dash,dash)
    pre = (d,d,d,d,d)
    post = (d,d,d,d,d,d)
    horizon = 6
    while horizon < len(str):
        (cc,cv,nextPre,nextPost,nextHorizon) = advance(str,pre,post,horizon)
        # pre and post still describe the window around cc at this point
        printLine(cc,cv,pre,post)
        pre = nextPre
        post = nextPost
        horizon = nextHorizon

def main ():
    strlist = map(makeWord, inputString.split())
    for word in strlist:
        processWord(word)

main()
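
The shiftLeft helper in the script above rebuilds a tuple on every step. As one possible alternative, collections.deque with maxlen does the same shifting by itself. The sketch below is an assumption-laden illustration rather than a drop-in replacement: windows is a made-up name, it expects the letters with vowels and "+" already removed, and it is written for Python 3.

```python
from collections import deque

def windows(letters, size=5, pad="_"):
    """Yield (previous, current, following) for each letter in turn."""
    # The `size` letters already passed, oldest first.
    pre = deque([pad] * size, maxlen=size)
    # The current letter followed by the `size` letters ahead of it.
    post = deque([pad] * (size + 1), maxlen=size + 1)
    # Trailing pads flush the last real letters through the window.
    for incoming in list(letters) + [pad] * (size + 1):
        current = post[0]
        # Like the script, a letter equal to the pad character is skipped.
        if current != pad:
            yield list(pre), current, list(post)[1:]
        pre.append(current)    # maxlen drops the oldest entry automatically
        post.append(incoming)

for before, current, after in windows("Almt$rd"):
    print("  ".join(before + [current] + after))
```

Because both deques have a fixed maxlen, appending on the right discards the leftmost item, which is the same effect as context[1:] + (ch,).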



لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد
"No victim has ever been more repressed and alienated than the truth"

Emad Soliman Nawfal
Indiana University, Bloomington