Fetching a clean copy of a changing web page
John Nagle
nagle at animats.com
Mon Jul 16 01:00:36 EDT 2007
I'm reading the PhishTank XML file of active phishing sites
at "http://data.phishtank.com/data/online-valid/". The file changes
frequently, and it's big (about 10MB right now) and on a busy server.
So once in a while I get a bogus copy of the file because the file
was rewritten while being sent by the server.
Any good way to deal with this, short of reading it twice
and comparing?
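
For reference, the read-twice-and-compare fallback mentioned above
might look roughly like the sketch below. It is illustrative only and
not code from the original post: the names fetch_url and fetch_stable,
the retry limit, the timeout, and the use of SHA-256 digests are all
assumptions. It keeps downloading until two consecutive copies match.

```python
import hashlib
import urllib.request

URL = "http://data.phishtank.com/data/online-valid/"


def fetch_url(url=URL, timeout=60):
    """Download the whole response body and return it as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_stable(fetch=fetch_url, max_attempts=5):
    """Call fetch() until two consecutive results are identical.

    fetch is any zero-argument callable returning bytes (hypothetical
    hook, so the logic can be exercised without touching the network).
    """
    previous = None
    for _ in range(max_attempts):
        data = fetch()
        digest = hashlib.sha256(data).hexdigest()
        if digest == previous:
            # Same bytes twice in a row: assume the copy is not torn.
            return data
        previous = digest
    raise RuntimeError("no two consecutive downloads matched")
```

This costs at least two full downloads per refresh, which is exactly
the overhead the question is trying to avoid, and it may never settle
if the file is rewritten more often than it can be fetched twice.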
John Nagle
More information about the Python-list mailing list