Improving the web page download code.
Alister
alister.ware at ntlworld.com
Wed Aug 28 04:58:21 EDT 2013
On Tue, 27 Aug 2013 12:41:10 -0700, mukesh tiwari wrote:
> Hello All,
> I am doing web stuff for the first time in Python, so I am looking for
> suggestions. I wrote this code to download the titles of web pages using
> as few resources (server time, data downloaded) as possible, and it
> should be fast enough. Initially I used BeautifulSoup for parsing, but
> the person who is going to use this code asked me not to use it and
> to use regular expressions instead (the reason given was that
> BeautifulSoup is not fast enough?).
By the time you have written enough REs to reliably parse HTML (I am not
sure that is even strictly possible) you will have re-invented
BeautifulSoup, badly. Unless you are looking for a very explicit section
of data in the page, this is not a good idea.
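
To illustrate the point, here is a minimal sketch (not code from this
thread) that pulls the title out of a page two ways: with the standard
library's html.parser and with a single regular expression. The names
TitleParser, title_with_parser, title_with_regex and TITLE_RE are made up
for the example, and it assumes Python 3. The sample page hides an old
title inside an HTML comment, which is the sort of thing a regex does not
know to skip.

```python
import re
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; ignore any <title> after the first.
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def title_with_parser(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() or None


# Naive pattern: it has no idea about comments, scripts or CDATA.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def title_with_regex(html):
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


page = (
    "<html><head>"
    "<!-- <title>Old</title> -->"
    "<title>Real</title>"
    "</head><body></body></html>"
)

print(title_with_parser(page))  # the parser skips the commented-out title
print(title_with_regex(page))   # the regex matches inside the comment
```

For a single, well-defined element like the title the regex may be good
enough on pages you control, but every extra case (comments, scripts,
odd attributes) is another piece of a parser you end up writing yourself.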