Newbie question: strange Re behaviour

Eric Brunel eric.brunel at pragmadev.com
Mon Apr 15 13:26:42 EDT 2002


Patrick.Bussi at space.alcatel.fr wrote:
> 
> Could someone help me understand what's wrong with this very simple regex,
> whose initial purpose was to extract the value of an HTML field. For
> demonstration, I have oversimplified the code below:
> 
> -----start-----
> [pat]$ python
> Python 2.0 (#1, Apr 11 2001, 19:18:08)
> [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux-i386
> Type "copyright", "credits" or "license" for more information.
> >>>
> >>> import re
> >>> s='[^value=].*[^>]'

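That means "one character not in the set ('v', 'a', 'l', 'u', 'e', '='), 
followed by any run of characters (possibly empty), followed by one character 
that is not a '>'". Square brackets define a set of single characters, not a 
literal string, so [^value=] does not mean "anything but the text value=". I 
guess that's not what you want... but it explains why the match ignores the 
leading 'a' (which *is* in the set ('v', 'a', 'l', 'u', 'e', '=')).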
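
To make the character-class behaviour concrete, here is a minimal sketch. The input string is hypothetical (the original test string is not shown in this message), so treat it as an illustration rather than a reproduction of the original session:

```python
import re

# Hypothetical input; chosen so that it starts with a character from the set.
text = 'a value=42>'

pattern = re.compile(r'[^value=].*[^>]')

# match() only tries position 0: 'a' is in the excluded set, so this fails.
anchored = pattern.match(text)
print(anchored)          # None

# search() moves on to the first character outside the set (the space),
# then .* runs greedily and backs off until the last character is not '>'.
found = pattern.search(text)
matched = found.group(0)
print(repr(matched))     # ' value=42'
```

The leading 'a' is never part of the match, and the trailing '>' is dropped because the last position must be something other than '>'.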

Try:
s = "value=(.*)>"
then compile it, search your string with it, and see what group(1) of the 
resulting match object returns...
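
A small sketch of that suggestion, again with made-up input strings. The second pattern, with [^>]* in the group, is not part of the advice above; it is only one possible way to cope with the fact that .* is greedy and runs to the last '>' on the line:

```python
import re

# Hypothetical input, for illustration only.
text = '<input value=42>'

m = re.search(r'value=(.*)>', text)
value = m.group(1) if m else None
print(value)             # 42

# With more than one '>' on the line, the greedy .* captures too much.
text2 = '<input value=42><br>'
greedy = re.search(r'value=(.*)>', text2).group(1)
print(greedy)            # 42><br

# Assumed alternative: stop the group at the first '>'.
narrow = re.search(r'value=([^>]*)>', text2).group(1)
print(narrow)            # 42
```

search() is used here instead of match() because "value=" will usually not be at the very start of the string.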

HTH
-- 
- Eric Brunel <eric.brunel at pragmadev.com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com



