[newbie] Strange behavior of the re module

Fred fred at acme.com
Sat Aug 21 00:21:27 EDT 2004


Hi,

	While parsing a bunch of HTML pages using the latest
ActivePython, I ran into something odd in the re module. I
extracted the part that generates the errors (I'm just trying to
substitute one item for another in a string):

--------------------------------
import re

#NOK : doesn't like a single, ending backslash
#stuff = "\colortbl\red0\green0\"
# => SyntaxError: EOL while scanning single-quoted string

#NOK : doesn't like gn0? :-)
stuff="\colortbl\red0\gn0"

# => traceback (most recent call last):
#  File "C:\test.py", line 10, in ?
#    template = re.sub('BLA', stuff, template)
#  File "G:\Python23\lib\sre.py", line 143, in sub
#    return _compile(pattern, 0).sub(repl, string, count)
#  File "G:\Python23\lib\sre.py", line 257, in _subx
#    template = _compile_repl(template, pattern)
#  File "G:\Python23\lib\sre.py", line 244, in _compile_repl
#    raise error, v # invalid expression
#sre_constants.error: bad group name

#OK....
stuff="\colortbl\red0\n0"

template = "BLA"

template = re.sub('BLA', stuff, template)
--------------------------------

=> It appears that the re module isn't very friendly with backslashes,
at least on the Windows platform. Does anyone know why, and what I
could do about it? I can't rewrite the source HTML documents, which
contain backslashes.

Thank you
Fred.
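
For reference, a minimal sketch of what appears to be going on, with two
possible workarounds. It assumes the behavior is not specific to Windows:
in an ordinary string literal Python itself turns "\r" into a carriage
return and "\n" into a newline, and re.sub() then scans the replacement
string for its own escapes, so "\g" is read as the start of a \g<name>
group reference. The names result_func, result_escaped and trailing are
only illustrative and are not part of the original script.

```python
import re

# Raw string: Python leaves the backslashes alone, so "\r" is not turned
# into a carriage return and "\g" stays as two characters.
stuff = r"\colortbl\red0\gn0"

# A raw string still cannot end in a single backslash, so append it separately.
trailing = r"\colortbl\red0\green0" + "\\"

template = "BLA"

# Workaround 1: pass a function as the replacement. Its return value is
# inserted as-is; re does not look for escapes such as \g<name> in it.
result_func = re.sub("BLA", lambda match: stuff, template)

# Workaround 2: double every backslash, so that re's replacement parser
# turns each pair back into a single literal backslash.
result_escaped = re.sub("BLA", stuff.replace("\\", "\\\\"), template)

# The function form also copes with the trailing-backslash string.
result_trailing = re.sub("BLA", lambda match: trailing, template)

print(result_func)
print(result_trailing)
```

The function form is probably the safer choice when the replacement text
comes from files that cannot be edited, since nothing in it gets
interpreted.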


