[XML-SIG] xml / html parsing for webbot

kentsin kentsin@sinaman.com
Sun Dec 10 18:00:53 CST 2000


Dear All,

I am learning to build a webbot. I am reading Jeff's webbot code. 

I have some difficulties and questions:

1. xml.dom.walker and xml.dom.writer are missing from the xml package in Python 2.0. What are they used for?

2. I have thought about not building a DOM tree and instead using regular expressions to extract all the links. Can somebody share, from experience, a comparison of the two approaches? Which is better? In particular, I have found that some script-generated pages contain unmatched tags. How does each approach handle them?
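
To make question 2 concrete, here is a minimal sketch of the two approaches side by side. It assumes Python 3 and its standard html.parser module rather than the Python 2.0 xml package, and it uses an event-based parser instead of a full DOM tree. The sample page, the pattern, and the names links_by_regex, links_by_parser and LinkCollector are all made up for illustration; none of this is Jeff's webbot code.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page with unmatched tags (<p>, <b> and the first <a>
# are never closed), similar to what a script might generate.
SAMPLE = (
    '<html><body><p>First <a href="a.html">A<p><b>bold '
    "<a href='b.html'>B</a></body>"
)

# Approach 1: a regular expression over the raw text.
# It only looks at <a ...> start tags, so unmatched end tags do not matter,
# but it also knows nothing about comments, scripts, or odd quoting.
HREF_RE = re.compile(
    r'<a\s[^>]*?href\s*=\s*["\']?([^"\'\s>]+)',
    re.IGNORECASE,
)


def links_by_regex(text):
    """Return every href value the pattern finds, in document order."""
    return HREF_RE.findall(text)


# Approach 2: an event-based parser. No tree is built; the parser just
# calls handle_starttag for each start tag it recognises.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_by_parser(text):
    """Return the href of every <a> start tag seen by the parser."""
    collector = LinkCollector()
    collector.feed(text)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print(links_by_regex(SAMPLE))
    print(links_by_parser(SAMPLE))
```

In this sketch both functions cope with the unmatched tags, because neither one needs the end tags to line up. The difference shows up elsewhere: the regular expression will also pick up links that appear inside HTML comments, while the parser reports those as comments and skips them.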

Rgs,

KEnt Sin

