Text Suffix to Prefix Conversion

7stud bbxx789_05ss at yahoo.com
Thu Apr 19 02:40:48 EDT 2007


On Apr 18, 11:08 pm, Steven Bethard <steven.beth... at gmail.com> wrote:
> EMC ROY wrote:
> > Original Sentence: An apple for you.
> > Present:           An<AT0> apple<NN1> for<PRP> you<PNP> .<.>
> > Desire:            <AT0>An <NN1>apple <PRP>for <PNP>you <.>.
> >>> text = 'An<AT0> apple<NN1> for<PRP> you<PNP> .<.>'
> >>> import re
> >>> re.sub(r'(\S+)(<[^>]+>)(\s*)', r'\2\1\3', text)
>
> '<AT0>An <NN1>apple <PRP>for <PNP>you <.>.'

If you end up calling re.sub() repeatedly, e.g. for each line in your
file, then you should compile the regular expression once up front so
that Python doesn't have to look it up or recompile it on every call:

import re

text = 'An<AT0> apple<NN1> for<PRP> you<PNP> .<.>'
myR = re.compile(r'(\S+)(<[^>]+>)(\s*)', r'\2\1\3')
re.sub(myR, r'\2\1\3', text)


Unfortunately, I must be doing something wrong because I can't get
that code to work.  When I run it, I get the error:

Traceback (most recent call last):
  File "2pythontest.py", line 3, in ?
    myR = re.compile(r'(\S+)(<[^>]+>)(\s*)', r'\2\1\3')
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre.py", line 180, in compile
    return _compile(pattern, flags)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre.py", line 225, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre_compile.py", line 496, in compile
    p = sre_parse.parse(p, flags)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre_parse.py", line 668, in parse
    p = _parse_sub(source, pattern, 0)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre_parse.py", line 308, in _parse_sub
    itemsappend(_parse(source, state))
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre_parse.py", line 396, in _parse
    if state.flags & SRE_FLAG_VERBOSE:
TypeError: unsupported operand type(s) for &: 'str' and 'int'
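
One possible reading of the traceback: the last frame evaluates
state.flags & SRE_FLAG_VERBOSE with a 'str' on the left, which suggests
the flags value is a string. re.compile() is documented as
re.compile(pattern, flags), so a second positional argument is taken as
flags, not as a replacement. The sketch below assumes that is the cause;
the names pattern, replacement, failed and result are only for
illustration.

```python
import re

text = 'An<AT0> apple<NN1> for<PRP> you<PNP> .<.>'
pattern = r'(\S+)(<[^>]+>)(\s*)'
replacement = r'\2\1\3'

# Passing the replacement as the second argument puts a str where
# re.compile() expects integer flags, which fails with a TypeError.
try:
    re.compile(pattern, replacement)
    failed = False
except TypeError:
    failed = True

# Compile with the pattern alone; the replacement string goes to sub().
myR = re.compile(pattern)
result = myR.sub(replacement, text)
print(result)
```

If the assumption holds, the replacement string belongs only in the
sub() call and the compile() call takes the pattern by itself.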


Yet, these two examples work without error:

------
import re

text = 'An<AT0> apple<NN1> for<PRP> you<PNP> .<.>'
#myR = re.compile(r'(\S+)(<[^>]+>)(\s*)', r'\2\1\3')
print re.sub(r'(\S+)(<[^>]+>)(\s*)', r'\2\1\3', text)

myR = re.compile(r'(hello)')
text = "hello world"
print re.sub(myR, r"\1XXX", text)

---------output:
<AT0>An <NN1>apple <PRP>for <PNP>you <.>.
helloXXX world
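
Both working calls go through the module-level re.sub(), which accepts
either a pattern string or a compiled pattern as its first argument. A
compiled pattern also has its own sub() method, which is the form
usually seen when looping over lines. A small sketch of that per-line
use, where the name tag_suffix and the second sample line are made up
for illustration and are not from the original data:

```python
import re

# Compile once, outside the loop (tag_suffix is a hypothetical name).
tag_suffix = re.compile(r'(\S+)(<[^>]+>)(\s*)')

# Sample input; the second line is invented for illustration.
lines = [
    'An<AT0> apple<NN1> for<PRP> you<PNP> .<.>',
    'hello<ITJ> world<NN1>',
]

# The method form takes the replacement first, then the text.
converted = [tag_suffix.sub(r'\2\1\3', line) for line in lines]

# The module-level form with a compiled pattern gives the same result.
same = [re.sub(tag_suffix, r'\2\1\3', line) for line in lines]

for line in converted:
    print(line)
```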


Can anyone help?






