Using regular expressions in internet searches

MyHaz support.services.complaints at gmail.com
Sun Jul 3 15:05:30 EDT 2005


Python would be good for this, but if you just want a chuck-and-rumble
solution, this might do:


bash $ wget -r -e robots=off -l 0 -c -t 3 http://www.cnn.com/
bash $ grep -r "Micheal.*" ./www.cnn.com/*

Or you could do a wget/Python mix, like:

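import os
import re

# Mirror the site; "-e robots=off" tells wget to ignore robots.txt
os.system("wget -r -e robots=off -l 0 -c -t 3 http://www.cnn.com/")
# Note: the lazy ".+?" matches only one character after "iraq "
re_iraq = re.compile("iraq .+?", re.IGNORECASE)

# Search every file in the dirs under ./www.cnn.com/
for dirpath, dirnames, filenames in os.walk("./www.cnn.com/"):
    for name in filenames:
        with open(os.path.join(dirpath, name), errors="replace") as f:
            iraqs = re_iraq.findall(f.read())
        print(iraqs)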
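
If you want to know which file each hit came from, something along these
lines might work. It is only a rough sketch: find_in_tree and the
rest_of_line pattern are names I made up for illustration, and it assumes
the mirrored pages can be read as text. It also shows that the lazy
"iraq .+?" pattern stops after a single character, so you may prefer a
pattern that grabs the rest of the line.

```python
import os
import re


def find_in_tree(root, pattern):
    """Return {path: [matches]} for each file under root that has a match."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # errors="replace" keeps odd bytes from aborting the scan
            with open(path, errors="replace") as f:
                found = pattern.findall(f.read())
            if found:
                hits[path] = found
    return hits


# The pattern from the post: the lazy quantifier takes one character only.
lazy = re.compile(r"iraq .+?", re.IGNORECASE)
# Hypothetical alternative: take everything up to the end of the line.
rest_of_line = re.compile(r"iraq [^\n]*", re.IGNORECASE)

sample = "Iraq war news\nnothing here\n"
lazy_hits = lazy.findall(sample)          # ['Iraq w']
line_hits = rest_of_line.findall(sample)  # ['Iraq war news']
```

After the wget run you would call it as
find_in_tree("./www.cnn.com/", rest_of_line) and loop over the result.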



