[XML-SIG] xml / html parsing for webbot
kentsin
kentsin@sinaman.com
Sun Dec 10 18:00:53 CST 2000
Dear All,
I am learning to build a webbot. I am reading Jeff's webbot code.
I have some difficulties and doubts:
1. xml.dom.walker and xml.dom.writer are missing from the xml package in Python 2.0. What are they used for?
2. I have thought of not building a DOM tree and instead using regular expressions to extract all links. Can somebody tell me, from their experience, how the two approaches compare? Which is better? In particular, I have found some pages, generated by scripts, that contain unmatched tags. How do the two approaches handle them?
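To make question 2 concrete, here is a minimal sketch of the two approaches side by side. It is only an illustration under some assumptions: it uses the re and html.parser modules from the standard library of a current Python rather than the Python 2.0 xml package, it uses an event-style parser instead of a full DOM tree, and the sample page and the names SAMPLE, HREF_RE, LinkCollector, links_by_regex and links_by_parser are made up for this example. It is not taken from Jeff's webbot code.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page: the <b> and the second <a> are never closed,
# and the third link uses upper-case names and an unquoted value.
SAMPLE = (
    '<html><body><p><b>News <a href="/a.html">A</a>\n'
    "<a href='b.html'>B<p><A HREF=c.html>C</A></body>"
)

# Approach 1: a regular expression that looks only at <a ...> start tags.
# It accepts double-quoted, single-quoted, or unquoted href values.
HREF_RE = re.compile(
    r"""<a\s[^>]*?href\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""",
    re.IGNORECASE,
)

def links_by_regex(html):
    # findall returns one tuple per match; exactly one group is non-empty.
    return [dq or sq or bare for dq, sq, bare in HREF_RE.findall(html)]

# Approach 2: an event-driven parser. It reports each start tag as it is
# seen, so unmatched end tags do not prevent links from being collected.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased from HTMLParser.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def links_by_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links

if __name__ == "__main__":
    print(links_by_regex(SAMPLE))
    print(links_by_parser(SAMPLE))
```

On this small sample both functions should find the same three links, because neither depends on the tags being balanced. The regular expression is shorter but has to anticipate each quoting style by hand and would also match text inside comments or scripts; the parser handles quoting and case itself, at the cost of a little more code.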
Regards,
Kent Sin