urllib/ftpwrapper

Guido van Rossum guido at python.org
Thu May 18 18:02:02 EDT 2000


>    In the urllib.py (Python 1.5.2) there is a class ftpwrapper. In the line
> 610 of the module it has the call self.ftp.voidresp() (method endtransfer).
>    This call hangs on almost all FTP URLs. You noted it too - reread your
> comment in webchecker.py, method safeclose :)
> 
>    To overcome the problem I wrote my version of ftpwrapper (in my URL
> robot) without the call to voidresp. Last night the robot checked a
> database of 3000 URLs, and never failed on FTP URLs.
> 
>    Can you explain why the call is there? Could we just
> remove it from urllib?

Hm...  Tricky...  The voidresp() call is needed in the FTP protocol to
consume the '226 Transfer complete' reply that the server sends on the
control connection after the transfer is complete.  However, you will
only get that reply once the transfer is indeed complete, so
voidresp() blocks until then.  This is what the comment in webchecker
refers to.

If you don't make the voidresp() call, you can't reuse the open FTP
connection for something else.  Since urllib tries to cache FTP
connections, that would be a bad thing...
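
To illustrate why the unread reply matters, here is a toy model of the
control connection.  It is only a sketch and does not use ftplib: the
ToyControlChannel class, the pwd_after_retr() helper and the reply
texts are made up for illustration.  It assumes the server queues one
reply per event and the client reads them strictly in order.

```python
from collections import deque


class ToyControlChannel:
    """Toy stand-in for an FTP control connection (hypothetical, not ftplib)."""

    def __init__(self):
        # Replies the server has sent but the client has not read yet.
        self.pending = deque()

    def send(self, command):
        # Assumed server behaviour: RETR produces a 150 reply when the
        # transfer starts and a 226 reply when it finishes.
        if command.startswith('RETR'):
            self.pending.append('150 Opening data connection')
            self.pending.append('226 Transfer complete')
        elif command == 'PWD':
            self.pending.append('257 "/" is the current directory')

    def getresp(self):
        # Replies come back strictly in the order they were sent.
        return self.pending.popleft()


def pwd_after_retr(consume_226):
    """Return the reply the client sees for PWD after a RETR."""
    chan = ToyControlChannel()
    chan.send('RETR file.txt')
    chan.getresp()              # the 150 reply
    if consume_226:
        chan.getresp()          # the step that voidresp() performs
    chan.send('PWD')
    return chan.getresp()
```

With consume_226 false, the reply read for PWD is the stale 226 line,
so every later command on the cached connection is off by one reply.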

So I believe it's necessary.

I presume that your robot is not reading all the data off the
connection, is that right?  That's probably why it hangs on "almost
all" FTP URLs.  The proper fix would be the same as in webchecker --
read the rest of the data.
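
A minimal sketch of that fix with ftplib, assuming you hold the data
socket returned by transfercmd().  The names finish_transfer() and
read_prefix() are hypothetical helpers, not part of urllib or
webchecker, and the anonymous login and binary mode are assumptions.

```python
import ftplib


def finish_transfer(ftp, conn, blocksize=8192):
    """Drain and close the data socket, then consume the final reply."""
    while True:
        block = conn.recv(blocksize)
        if not block:           # empty read: the server closed the data connection
            break
    conn.close()
    # The transfer is complete now, so the closing reply is available.
    return ftp.voidresp()


def read_prefix(host, path, nbytes):
    """Read up to nbytes of a file, leaving the control connection clean."""
    ftp = ftplib.FTP(host)
    ftp.login()                 # anonymous login (assumption)
    ftp.voidcmd('TYPE I')       # binary mode
    conn = ftp.transfercmd('RETR ' + path)
    data = conn.recv(nbytes)    # may return fewer than nbytes bytes
    finish_transfer(ftp, conn)  # read the rest before moving on
    ftp.quit()
    return data
```

Draining costs bandwidth on large files; the point is only that
voidresp() is called after the data has been read, not before.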

--Guido van Rossum (home page: http://www.python.org/~guido/)



