[issue12368] packaging.pypi.simple.Crawler assumes external download links are ok to follow

Éric Araujo report at bugs.python.org
Sun Jun 19 23:12:20 CEST 2011


Éric Araujo <merwok at netwok.org> added the comment:

Extract from IRC:
<pumazi> hmm... I'm thinking Crawler's follow_externals flag isn't working as expected
[...]
<pumazi> I'm not sure, my assumption of [its] function could be off
[...]
<merwok> “hosts is a list of hosts allowed to be processed if follow_externals is true (default behavior is to follow all hosts), follow_externals enables or disables following external links (default is false, meaning disabled).”
<pumazi> Well, I was assuming it would disable external downloads
<merwok> I think “external links” are external links to be scraped, not download links
<merwok> But I see your misunderstanding
<pumazi> I see, but wouldn't we want the same restrictions on download links?
[...]
<merwok> IIUC, follow_externals can be disabled because it’s guesswork
<merwok> The info obtained from XML-RPC or the simple interface is not guesswork
<merwok> So I think you could want to disable guessing from external links, but I don’t see why you should care about the origin of the download
<pumazi> trust issues I suppose
<merwok> But the same person can upload a malicious file to PyPI as well as on their site
<merwok> Without reading the code, I think this is the rationale.  OTOH, if easy_install and pip can restrict downloads and your user expectations show that it can be needed to restrict downloads, let’s file a bug
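The behaviour under discussion boils down to a host whitelist: the Crawler docstring quoted above says `hosts` limits which hosts are *processed*, while the question is whether the same check should also gate *download* links. A minimal sketch of such a whitelist check, in plain Python (this is a hypothetical helper for illustration, not code from the `packaging` module itself; the `hosts` default of allowing everything is taken from the docstring quoted above):

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def host_allowed(url, hosts=("*",)):
    """Return True if the URL's host matches one of the allowed patterns.

    Mirrors the whitelist semantics described for the Crawler's `hosts`
    parameter (the default "*" allows every host).  Applying the same
    check to download links, as pip and easy_install can, would restrict
    where distribution files may be fetched from.
    """
    hostname = urlparse(url).hostname or ""
    return any(fnmatch(hostname, pattern) for pattern in hosts)


# With the default wildcard, every host is allowed:
host_allowed("http://example.org/dists/spam-1.0.tar.gz")  # True

# Restricting to PyPI-hosted files rejects external download links:
host_allowed("http://example.org/dists/spam-1.0.tar.gz",
             hosts=("pypi.python.org",))  # False
```

Under this reading, setting `follow_externals=False` would skip scraping external pages for *link discovery*, while a download-side check like the one sketched here is what pumazi expected the flag to provide.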

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12368>
_______________________________________


More information about the Python-bugs-list mailing list