Spider - path conflict [../test.htm,www.nic.nl/index.html]
Skip Montanaro
skip at pobox.com
Fri Apr 1 08:07:12 EST 2005
martijn> I thought I was finished with my own spider... But then there was
martijn> a bug, or in other words a missing part in my code.
martijn> I forgot that people do this in website HTML:
martijn> <a href="http://www.nic.nl/monkey.html">is oke</a>
martijn> <a href="../monkey.html">error</a>
martijn> <a href="../../monkey.html">error</a>
pydoc urlparse.urljoin
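
A minimal sketch of what that function does with the links above. It is written for Python 3, where urljoin lives in urllib.parse rather than the urlparse module. The base URL (the page the links were found on) is a made-up assumption for illustration; only the three hrefs come from the original message.

```python
from urllib.parse import urljoin

# Hypothetical page on which the spider found the links (an assumption,
# not taken from the original message).
base = "http://www.nic.nl/a/b/index.html"

# The three href values quoted above.
links = [
    "http://www.nic.nl/monkey.html",  # already absolute, returned unchanged
    "../monkey.html",                 # one directory up from /a/b/
    "../../monkey.html",              # two directories up from /a/b/
]

# Resolve every href against the URL of the page it appeared on.
resolved = [urljoin(base, link) for link in links]

for link, url in zip(links, resolved):
    print(link, "->", url)
```

The spider would then queue the resolved URLs instead of the raw href strings, so relative and absolute links end up in the same form.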
Skip