Spider - path conflict [../test.htm,www.nic.nl/index.html]

Skip Montanaro skip at pobox.com
Fri Apr 1 08:07:12 EST 2005


    martijn> I thought I was done with my own spider...  But then there was
    martijn> a bug, or in other words a missing part in my code.

    martijn> I forgot that people do this in a website's HTML:
    martijn> <a href="http://www.nic.nl/monkey.html">is oke</a>
    martijn> <a href="../monkey.html">error</a>
    martijn> <a href="../../monkey.html">error</a>

See "pydoc urlparse.urljoin". It resolves a relative href against the URL
of the page the link was found on.
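
A minimal sketch of how urljoin could be applied to the three links quoted
above. The page URL (two directories deep) and the helper name resolve_links
are assumptions made for illustration; they are not from the original post.
On Python 3 the same function lives in urllib.parse, so the import falls
back accordingly.

```python
try:
    from urllib.parse import urljoin  # Python 3 location
except ImportError:
    from urlparse import urljoin      # Python 2 module named in this reply


def resolve_links(page_url, hrefs):
    """Return each href as an absolute URL, resolved against page_url."""
    return [urljoin(page_url, href) for href in hrefs]


# Hypothetical page that sits two directories below the site root.
page_url = "http://www.nic.nl/a/b/index.html"

hrefs = [
    "http://www.nic.nl/monkey.html",  # already absolute, returned unchanged
    "../monkey.html",                 # one directory up from /a/b/
    "../../monkey.html",              # two directories up from /a/b/
]

for href, absolute in zip(hrefs, resolve_links(page_url, hrefs)):
    print("%-32s -> %s" % (href, absolute))
```

With this page URL the relative links resolve to
http://www.nic.nl/a/monkey.html and http://www.nic.nl/monkey.html. When a
link has more ".." segments than the page URL has directories, the result
has differed between Python versions, so that case is worth checking on the
interpreter the spider actually runs on.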

Skip


