[Tutor] (no subject)
wong chow cheok
wong_chow_cheok@hotmail.com
Wed, 18 Apr 2001 17:31:19 +0800
Hello all. I have a problem again. I am still trying to extract URLs from
the web, but now I need to extract multiple URLs and not just one. I have
tried using findall(), but all I get is the 'http' and nothing else.
import re

# Verbose pattern: scheme, colon, URL body, then a lookahead.
http_url = r'''
(http|https|ftp|wais|telnet|mailto|gopher|file)
:
[\w.#@&=\,-_~/;:\n]+
(?=([,.:;\-?!\s]))
'''
http_re = re.compile(http_url, re.X)
p = http_re.findall("http:\\www.hotmail.com and http:\\www.my.com")
print(p)
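A likely explanation for the 'http' result, sketched here on the assumption
that the documented re.findall() rule applies: when a pattern contains
capturing groups, findall() returns what the groups matched rather than the
whole match. The sample text and the shortened scheme list below are made up
for illustration and are not the pattern from the post.

```python
import re

# Hypothetical sample text, written with forward slashes as in a real URL.
text = "http://www.hotmail.com and http://www.my.com"

# With a capturing group (...), findall() returns only the group's text.
captured = re.findall(r'(http|https|ftp):\S+', text)
print(captured)

# With a non-capturing group (?:...), findall() returns the whole match.
whole = re.findall(r'(?:http|https|ftp):\S+', text)
print(whole)
```

With more than one capturing group, findall() returns a tuple of groups for
each match, which is why a pattern with two groups never shows the full URL.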
This was very helpful and very confusing. After reading more about it, I am
only more confused by all the symbols. What does (?=...) mean? I read up on
it but am still having trouble using it and understanding how it works. And
I still cannot extract more than one URL. I tried many other ways, but this
is the only one with a bit of progress (the 'http').
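As a rough sketch of what (?=...) does: it is a lookahead, which checks that
the text following the current position matches but does not include that
text in the result. The second half below is one possible reworking of the
pattern, not a definitive fix. It assumes that the two groups can be made
non-capturing, that the ",-_" range was meant as literal characters, that
newlines should not be part of a URL, and that end of text should also count
as a valid stop. The sample strings are made up.

```python
import re

# Lookahead demo: match a word only if a comma follows it.
# The comma is checked but is not part of the match.
words = re.findall(r'\w+(?=,)', "a, b, c")
print(words)

# Reworked pattern (an assumption about the intent, see above).
http_url = r'''
(?:https?|ftp|wais|telnet|mailto|gopher|file)   # scheme, non-capturing
:
[\w.#@&=,\-~/;:]+                               # body of the URL
(?=[,.:;\-?!\s]|$)                              # followed by punctuation, whitespace, or end
'''
http_re = re.compile(http_url, re.X)

# Hypothetical sample text with forward slashes.
urls = http_re.findall("http://www.hotmail.com and http://www.my.com")
print(urls)
```

Because neither group captures, findall() here returns each whole match
instead of just the scheme.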
Well, sorry for being so much trouble, but your help is appreciated. Thank you.