[Tutor] Regexp question
Ovidiu Bivolaru
ovidiu.bivolaru@ravantivirus.com
Mon May 19 11:00:02 2003
Thanks a lot :) !! That's it !
Regards,
Ovidiu
On Mon, 2003-05-19 at 17:49, Kristoffer Erlandsson wrote:
> On Mon, May 19, 2003 at 05:30:15PM +0300, Ovidiu Bivolaru wrote:
> > Hi all,
> >
> > I'm trying to parse some HTML forms to get the values from "name" and
> > "value" attributes and then to add them in a list. I'm encountering a
> > problem with the regular expressions and I can't figure out why the
> > expression is invalid.
> >
> > Below is the code that I'm using:
> >     for lines in buffer.readlines():
> >         print lines
> >         regexp = 'value="(.*)?"\s*name\s*=\s*"(.*:\d+:\d+)?"'
> >         print regexp
> >         p = re.search(regexp,lines)
> >
> > The error message is:
> > p =3D re.search(regexp,lines)
> > File "/usr/lib/python2.2/sre.py", line 137, in search
> > return _compile(pattern, flags).search(string)
> > File "/usr/lib/python2.2/sre.py", line 229, in _compile
> > raise error, v # invalid expression
> > error: nothing to repeat
> >
> > Can anybody tell me what is wrong with the regular expression? Also,
> > are there any other possibilities to parse the HTML using functions
> > already implemented (i.e. the HTMLParser module)?
> >
>
> The first '?' in your regexp shouldn't be there. Since '*' means "0 or
> more times", the part "(.*)?" of your regexp says "(any character zero or
> more times) zero or one time", to use a somewhat odd parenthesis notation.
> Remove that '?' and the error message goes away. Note that I haven't
> checked whether your regexp matches what you intend; I only got rid of
> the error message.
>
> HTH,
> Kristoffer
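
For anyone reading this in the archive, here is a minimal sketch of the fix described above: the same pattern with the '?' after the first group removed. The sample input line and the extract() helper are assumptions made up for illustration, since the original post does not show the HTML being parsed. The sketch only exercises the corrected pattern; it does not try to reproduce the Python 2.2 traceback.

```python
import re

# Corrected pattern: the '?' that followed the first group "(.*)" is gone.
# A raw string keeps backslash sequences such as \s and \d intact.
PATTERN = re.compile(r'value="(.*)"\s*name\s*=\s*"(.*:\d+:\d+)?"')

# Hypothetical input line (not taken from the original post).
SAMPLE = '<input type="hidden" value="42" name="field:1:2">'


def extract(line):
    """Return (name, value) for one line, or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    value, name = m.groups()
    return name, value


if __name__ == "__main__":
    print(extract(SAMPLE))
```

Note that the pattern still expects the "value" attribute to come directly before "name", and that ".*" is greedy, so lines laid out differently may not match or may capture more than intended.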