regex confusion
Luther Barnum
Spam_Sucks at rr.com
Tue Dec 9 11:16:42 EST 2003
Maybe you meant:
import re, urllib
rgxPrev = re.compile('.*?a.*?')
url = 'http://nitace.bsd.uchicago.edu:8080/files/share/showdown_example2.html'
s = urllib.urlopen(url).read()
m = re.match(rgxPrev, s)  # changed line: module-level match(pattern, string)
print m
print s.find('a')
The module-level re.match takes two arguments: the pattern and the string.
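
A small sketch of the two call forms, using a made-up string in place of the page contents (the real page is not reproduced here, so `sample` is purely an assumption). As far as I can tell, both forms accept the same compiled pattern and give the same result:

```python
import re

# Hypothetical stand-in for the downloaded page; not the real contents.
sample = "banana"

rgxPrev = re.compile('.*?a.*?')

m_method = rgxPrev.match(sample)        # method on the compiled pattern, one argument
m_function = re.match(rgxPrev, sample)  # module-level function, two arguments

# Compare what each call matched.
same_span = m_method.span() == m_function.span()
print(same_span)
print(m_method.group(0))
```

If the two spans agree on your data as well, the call form is probably not what decides whether m is None.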
"John Hunter" <jdhunter at ace.bsd.uchicago.edu> wrote in message
news:mailman.266.1070985064.16879.python-list at python.org...
>
> In trying to debug why a certain regex wasn't working like I expected
> it to, I came across this strange (to me) behavior. The file I am
> trying to match definitely contains many instances of the letter 'a',
> so I would expect the regex
>
> rgxPrev = re.compile('.*?a.*?')
>
> to match the string contents of the file. But it doesn't. Here is
> a complete example
>
> import re, urllib
> rgxPrev = re.compile('.*?a.*?')
>
> url = 'http://nitace.bsd.uchicago.edu:8080/files/share/showdown_example2.html'
> s = urllib.urlopen(url).read()
> m = rgxPrev.match(s)
> print m
> print s.find('a')
>
> m is None (no match) and the s.find('a') reports an 'a' at index 48.
>
> I read the regex to mean non-greedy match of anything up to an a,
> followed by non-greedy match of anything following an a, which this
> file should match.
>
> Or am I insane?
>
> John Hunter
>
>
> hunter:~/python/projects/poker/data/pokerroom> uname -a
> Linux hunter.paradise.lost 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux
> hunter:~/python/projects/poker/data/pokerroom> python
> Python 2.3.2 (#1, Oct 13 2003, 11:33:15)
> [GCC 3.3.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> Welcome to rlcompleter2 0.95
> for nice experiences hit <tab> multiple times
>
>
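
One possible explanation for the None, offered as an assumption since the page contents are not shown in this thread: by default '.' does not match a newline, and match() only tries the pattern at the start of the string. If the first 'a' sits on a later line, the non-greedy '.*?' cannot reach it from index 0. The sketch below uses a made-up string `s`, and the name `rgxDotAll` is mine, not from the original post:

```python
import re

# Hypothetical text: the first 'a' appears only after two newlines,
# as it might in an HTML page. Not the real page contents.
s = "<html>\n<body>\n<p>a page</p>"

rgxPrev = re.compile('.*?a.*?')
rgxDotAll = re.compile('.*?a.*?', re.DOTALL)  # '.' also matches '\n'

m_match = rgxPrev.match(s)     # tried only at index 0; '.' stops at the first '\n'
m_search = rgxPrev.search(s)   # may begin at any position in the string
m_dotall = rgxDotAll.match(s)  # still starts at index 0, but can cross newlines

print(m_match)
print(m_search is not None)
print(m_dotall is not None)
print(s.find('a'))
```

If that is what is happening with the real file, either search() or the re.DOTALL flag should find the 'a' that s.find('a') reports.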