[Tutor] Regexp question

Kristoffer Erlandsson krier115@student.liu.se
Mon May 19 10:51:02 2003


On Mon, May 19, 2003 at 05:30:15PM +0300, Ovidiu Bivolaru wrote:
> Hi all,
> 
> I'm trying to parse some HTML forms to get the values from "name" and
> "value" attributes and then to add them in a list. I'm encountering a
> problem with the regular expressions and I can't figure out why the
> expression is invalid.
> 
> Below is the code that I'm using:
>     for lines in buffer.readlines():
>       print lines
>       regexp = 'value="(.*)?"\s*name\s*=\s*"(.*:\d+:\d+)?"'
>       print regexp
>       p = re.search(regexp,lines)
> 
> The error message is:
>     p = re.search(regexp,lines)
>   File "/usr/lib/python2.2/sre.py", line 137, in search
>     return _compile(pattern, flags).search(string)
>   File "/usr/lib/python2.2/sre.py", line 229, in _compile
>     raise error, v # invalid expression
> error: nothing to repeat
> 
> Can anybody tell me what is wrong with the regular expression? Also,
> are there any other ways to parse the HTML using functions that are
> already implemented (e.g. the HTMLParser module)?
> 

The first '?' in your regexp shouldn't be there. Since '*' means "zero or
more times", the part "(.*)?" of your regexp says "(any character zero or
more times) zero or one time", to borrow some odd parenthesis notation.
The group can already match the empty string, so making it optional adds
nothing, which appears to be what the "nothing to repeat" error refers
to. Remove that '?' and the error message goes away. Note that I haven't
checked whether your regexp matches what you intend; I only got rid of
the error message.
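
For illustration, here is a minimal sketch of the pattern with that '?'
removed. The sample input line is made up (I don't know what your forms
look like), and the raw string (r'...') is my own addition so that the
backslashes reach the regex engine unchanged. It is written for a current
Python rather than 2.2, so treat it as a sketch, not a drop-in fix.

```python
import re

# Pattern from the question with the '?' after the first group removed.
# The raw string is an addition, not part of the original code.
pattern = re.compile(r'value="(.*)"\s*name\s*=\s*"(.*:\d+:\d+)?"')

# Hypothetical input line, invented for this example.
line = '<input type="hidden" value="42" name="item:1:2">'

match = pattern.search(line)
if match:
    # Group 1 is the "value" attribute, group 2 the "name" attribute.
    value, name = match.groups()
    print(value, name)
```

Note that '.*' is greedy, so on a line with several attributes or several
tags the groups may capture more than you expect.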

HTH,
Kristoffer

-- 
Kristoffer Erlandsson
E-mail:  krier115@student.liu.se
ICQ#:    378225