[Tutor] about python function +reg expression

Abdirizak abdi a_abdi406@yahoo.com
Sun Mar 23 00:59:01 2003



Hi everyone,

I am working on a function that does the following:
   the function takes a string such as:
       <EQN/>what it is in here<EQN/>
   first it extracts the middle part of the string:
       what it is in here
   then I want to tag each token as follows:
       <W>what</W> <W>it</W> <W>is</W> <W>in</W> <W>here</W>
   and finally it returns:
       <EQN/> <W>what</W> <W>it</W> <W>is</W> <W>in</W> <W>here</W> <EQN/>
My problem:

I am filtering the string with the regular expression shown below, to go

from : <EQN/> what it is in here <EQN/>
to   : what it is in here

Regular expression result:

>>> import re
>>> test = '<EQN/> of a conditioned word <EQN/> '
>>> text = re.compile(r"(?<=<EQN/>).*?(?=<EQN/>)")
>>> X = text.findall(test)
>>> print X
[' of a conditioned word ']

This extracts what I want, so that part is fine.

My problem is how to turn this result into a list such as ['of', 'a', 'conditioned', 'word'], so that I can tag each item with <W>....</W> in a loop as described above.

I have also attached EQNfunc.py to this e-mail for reference.

Simply calling X.split() doesn't work.
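
For reference, a minimal sketch of why the split() call probably fails and one way around it. It assumes the cause is that findall() returns a list of matched strings rather than a single string; the names pattern, tokens and tagged are illustrative and not taken from EQNfunc.py.

```python
import re

test = '<EQN/> of a conditioned word <EQN/> '
pattern = re.compile(r"(?<=<EQN/>).*?(?=<EQN/>)")

# findall() returns a list of every match, here a list with one string.
X = pattern.findall(test)

# X.split() raises AttributeError because lists have no split() method;
# split() has to be called on the string stored inside the list.
tokens = X[0].split()

# Wrap each token in <W>...</W> and join the pieces with single spaces.
tagged = ['<W>%s</W>' % tok for tok in tokens]
print(' '.join(tagged))
```

Indexing with X[0] only handles the first match; a string with several <EQN/> pairs would need a loop over X.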

thanks in advance



Attachment: EQNfunc.py

import re

def EQNfunc(text1):
   """Take a string of the form
          <EQN/>what it is in here<EQN/>
      extract the middle part of the string
          what it is in here
      and tag each token as follows:
          <W>what</W> <W>it</W> <W>is</W> <W>in</W> <W>here</W>
      The tagged tokens are returned between two <EQN/> tags.
   """

   initial_tag = "<EQN/>"
   spacing = " "
   tag_W = "W"

   buf = re.compile(r"(?<=<EQN/>).*?(?=<EQN/>)")
   # findall() returns a list of matched strings, so split() must be
   # called on a matched string, not on the list itself
   found = buf.findall(text1)
   if not found:
       # no <EQN/>...<EQN/> pair: return the input unchanged
       return text1
   temp = found[0].split()

   # wrap every token in <W>...</W>
   for i in range(len(temp)):
       temp[i] = '<%s>%s</%s>' % (tag_W, temp[i], tag_W)

   # join the tagged tokens with single spaces
   joined_text = spacing.join(temp)

   last_result = initial_tag + spacing + joined_text + spacing + initial_tag
   return last_result

#-------------------------------------------


test = '<EQN/> of a conditioned word <EQN/> '
# buf = re.compile(r"(?<=<EQN/>).*?(?=<EQN/>)")
# X = buf.findall(test)

Y = EQNfunc(test)
print Y
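
As a possible variation, the extraction and tagging can be folded into one substitution. The sketch below assumes that a string may contain more than one <EQN/>...<EQN/> span and that each span should be tagged in place; EQN_SPAN, tag_tokens and tag_all_spans are hypothetical names, not part of EQNfunc.py. It uses a capturing group instead of the lookbehind/lookahead pattern, so both <EQN/> tags are consumed by each match.

```python
import re

# Matches one <EQN/> ... <EQN/> span and captures the text in between.
EQN_SPAN = re.compile(r"<EQN/>(.*?)<EQN/>")

def tag_tokens(match):
    """Wrap each whitespace-separated token of the captured text in <W>...</W>."""
    tokens = match.group(1).split()
    tagged = " ".join("<W>%s</W>" % tok for tok in tokens)
    return "<EQN/> %s <EQN/>" % tagged

def tag_all_spans(text):
    """Hypothetical helper: tag the tokens of every <EQN/>...<EQN/> span."""
    # re.sub() calls tag_tokens once per match and inserts its return value.
    return EQN_SPAN.sub(tag_tokens, text)

result = tag_all_spans('<EQN/> of a conditioned word <EQN/> ')
print(result)
```

Text outside the <EQN/> pairs, including the trailing space in the test string, is left as it was.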
