[Tutor] Regexp question
Kristoffer Erlandsson
krier115@student.liu.se
Mon May 19 10:51:02 2003
On Mon, May 19, 2003 at 05:30:15PM +0300, Ovidiu Bivolaru wrote:
> Hi all,
>
> I'm trying to parse some HTML forms to get the values from "name" and
> "value" attributes and then to add them in a list. I'm encountering a
> problem with the regular expressions and I can't figure out why the
> expression is invalid.
>
> Below is the code that I'm using:
> for lines in buffer.readlines():
>     print lines
>     regexp = 'value="(.*)?"\s*name\s*=\s*"(.*:\d+:\d+)?"'
>     print regexp
>     p = re.search(regexp,lines)
>
> The error message is:
> p = re.search(regexp,lines)
> File "/usr/lib/python2.2/sre.py", line 137, in search
> return _compile(pattern, flags).search(string)
> File "/usr/lib/python2.2/sre.py", line 229, in _compile
> raise error, v # invalid expression
> error: nothing to repeat
>
> Can anybody tell me what is wrong with the regular expression? Also,
> are there any other ways to parse the HTML using functions already
> implemented (e.g. the HTMLParser module)?
>
The first '?' in your regexp shouldn't be there. Since '*' means "0 or
more times", the part "(.*)?" of your regexp says "(any character zero
or more times) zero or one time", to put it in a rough parenthesis
notation. The group can already match the empty string, so making it
optional adds nothing, and that seems to be what "nothing to repeat" is
complaining about. Remove that '?' and the error message goes away.
Note that I haven't checked whether your regexp matches what you intend
it to; I only got rid of the error message.
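
As a rough sketch of the fix, assuming a current Python 3 rather than the
2.2 from your traceback (newer versions may well accept the original
pattern), something like the following should compile and match. The
input line and the names PATTERN and line are made up for illustration;
they are not from your form.

```python
import re

# Corrected pattern: the '?' after the first group has been removed.
# A raw string keeps backslash sequences such as \s and \d intact.
PATTERN = re.compile(r'value="(.*)"\s*name\s*=\s*"(.*:\d+:\d+)?"')

# Hypothetical input line, invented for this example.
line = '<input value="42" name="qty:1:2">'

match = PATTERN.search(line)
if match:
    # group(1) is the "value" attribute, group(2) the "name" attribute.
    value, name = match.group(1), match.group(2)
    print(value, name)
```

Keep in mind that '.*' is greedy, so on a line with several quoted
attributes it may capture more than you expect; '[^"]*' is a common
alternative if that turns out to be a problem.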
HTH,
Kristoffer
--
Kristoffer Erlandsson
E-mail: krier115@student.liu.se
ICQ#: 378225