[Catalog-sig] Hosting documentation on PyPI

"Martin v. Löwis" martin at v.loewis.de
Wed Aug 6 18:44:56 CEST 2008


> There's an XSS concern if users can upload arbitrary HTML.  Approval
> would address some of that, but it might be better to avoid the issue
> altogether.
> 
> One way to handle that would be to host each package's documentation on
> a different domain.  E.g., package.pypi.python.org.

Can you please elaborate? What is the issue, and how could creating
domains resolve it?
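
For context, the isolation the quoted proposal relies on is presumably
the browser's same-origin policy: a script may only touch pages whose
scheme, host and port all match its own. The following is a minimal
sketch of that comparison, not anything PyPI actually runs; the URLs
and the helper names origin() and same_origin() are made up for
illustration.

```python
from urllib.parse import urlsplit

# Default ports, so that "http://host" and "http://host:80" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return the (scheme, host, port) triple used for same-origin checks."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a, b):
    """True if both URLs share scheme, host and port."""
    return origin(a) == origin(b)

# Hypothetical layout 1: documentation served from the main PyPI host.
shared = same_origin("http://pypi.python.org/pypi",
                     "http://pypi.python.org/docs/foo/index.html")

# Hypothetical layout 2: documentation served from a per-package host.
isolated = same_origin("http://pypi.python.org/pypi",
                       "http://foo.pypi.python.org/index.html")
```

With the first layout, uploaded Javascript runs in the same origin as
PyPI itself; with the second it does not. Note that cookies set with a
Domain attribute covering the parent domain would still be sent to the
per-package hosts, so the isolation also depends on how the session
cookie is scoped.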

Also, what would be the best way to set up the web server to implement
that? Getting a delegation for a pypi.python.org zone onto that machine
should be possible, and I know how to update zone files once an hour.
However, I feel slightly uncomfortable with generating a huge Apache
config with hundreds of virtual hosts, and having Apache restart every
hour.
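
One way to avoid per-package virtual hosts might be a single wildcard
host whose handler derives the documentation directory from the Host
header. Below is a minimal sketch under that assumption; DOCS_ROOT and
docs_dir_for_host() are hypothetical names, and the wildcard DNS
record and web server wiring are not shown.

```python
import os
import re

BASE_DOMAIN = "pypi.python.org"   # zone named in the thread
DOCS_ROOT = "/data/packagedocs"   # hypothetical on-disk location

# A single DNS label: letters, digits and inner hyphens, at most 63 chars.
_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")

def docs_dir_for_host(host_header):
    """Map 'package.pypi.python.org[:port]' to a docs directory, or None."""
    # Drop an optional port, normalise case, and strip a trailing dot.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None
    label = host[:-len(suffix)]
    # Reject nested subdomains and anything that could escape DOCS_ROOT.
    if not _LABEL.fullmatch(label):
        return None
    return os.path.join(DOCS_ROOT, label)
```

With this approach a new package needs no configuration change and no
restart, only a directory under DOCS_ROOT. Apache's mod_vhost_alias
provides a comparable mapping at the server level. Package names
containing characters that are not valid in a DNS label (underscores,
for example) would need a normalisation rule that is not sketched
here.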

> Another option is using an HTML scrubber.  But removing Javascript would
> be unfortunate in this case as there's a lot of good uses of it, so
> multiple domains would be better IMHO.

I'm very skeptical of a scrubber. There will be too many complaints
that it removes legitimate markup.

> If implemented I think all existing packages could be approved, which
> would greatly reduce the approval queue.

I wouldn't mind this starting slowly, say, being experimental until the
end of the year. Currently, python.org doesn't provide any similar
hosting (although the PyPI-generated package pages come close), so there
could be many risks that cause us to pull the plug.

As for "all existing packages could be approved": the existing ones
perhaps, but for new ones, wouldn't there still be a chance of somebody
uploading/linking porn, viruses, whatever?

Most likely it will work out just fine, of course: people have to
leave real email addresses and already interact in a fairly involved
manner, which has kept spambots from registering so far. (I'm sure the
RSS publication would prompt an immediate reaction from the community
should a spammer make it through.)

Regards,
Martin


