Testing for changes on a web page (was: how to find difference in number of characters)
Stefan Behnel
stefan_ml at behnel.de
Sat Oct 9 08:41:27 EDT 2010
harryos, 09.10.2010 14:24:
> I am trying to determine whether a web page has been updated by x number
> of characters. The Mozilla Firefox plugin 'Update Scanner' has similar
> functionality, where the user can specify x. I think this would be done
> by reading from the same URL at two different times and finding the
> change in body text.
"Number of characters" sounds like a rather useless measure here. I'd
rather apply an XPath, CSS selector or PyQuery expression to the parsed
page and check whether the interesting subtree has changed at all,
potentially disregarding any structural changes by stripping all tags and
normalising the resulting text to ignore whitespace and case differences.
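
A minimal sketch of that idea, assuming the interesting part of the page
is an element with a known id (the id "content" in the usage note is
hypothetical). To stay free of dependencies it uses the standard
library's html.parser instead of XPath, CSS selectors or PyQuery, so the
selection step is much cruder than what a real selector would give you.
The names SubtreeText, normalised_text, fingerprint and has_changed are
made up for illustration.

```python
import hashlib
from html.parser import HTMLParser


class SubtreeText(HTMLParser):
    """Collect the text inside the first element carrying a given id."""

    def __init__(self, target_id):
        super().__init__(convert_charrefs=True)
        self.target_id = target_id
        self.tag = None    # tag name of the target element, once found
        self.depth = 0     # open tags of that name, counting the target
        self.done = False  # True after the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.tag is None:
            if dict(attrs).get("id") == self.target_id:
                self.tag = tag
                self.depth = 1
        elif tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.tag is not None and not self.done and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        # Only text inside the target element is kept; tags are dropped.
        if self.tag is not None and not self.done:
            self.chunks.append(data)


def normalised_text(html, target_id):
    """Text of the chosen subtree, ignoring tags, whitespace and case."""
    parser = SubtreeText(target_id)
    parser.feed(html)
    parser.close()
    if parser.tag is None:
        raise LookupError("no element with id %r" % target_id)
    # Collapse every run of whitespace to one space, then lowercase.
    return " ".join("".join(parser.chunks).split()).lower()


def fingerprint(html, target_id):
    """Short digest of the normalised text, handy for storing between runs."""
    text = normalised_text(html, target_id)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_changed(old_html, new_html, target_id):
    """True if the text of the chosen subtree differs between snapshots."""
    return fingerprint(old_html, target_id) != fingerprint(new_html, target_id)
```

Given two snapshots of the same URL as strings, has_changed(old, new,
"content") would report a change only when the text of that element
differs. Changes elsewhere on the page, or changes only in markup,
spacing or letter case inside it, are ignored.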
Stefan