[SciPy-User] Central File Exchange for Scipy
Pauli Virtanen
pav at iki.fi
Thu Apr 21 08:37:01 EDT 2011
Thu, 21 Apr 2011 03:07:29 -0500, Jason Grout wrote:
[clip]
>>> So when you say "Hosted software", are you thinking of a PyPi type of
>>> site, where the release tarball might be hosted on the site, rather
>>> than the development repository?
>>
>> Precisely so. The aim would be to make it less hassle to use than PyPi
>> for a relative Python newbie. (Although uploading packages to PyPi is
>> not hugely hassle-ful at the moment, as it is possible to do it using
>> only the web interface.) So there would be a bit of an overlap with
>> PyPi; one could however add some recommendations etc. to push people to
>> use PyPi, if they are willing to jump through some extra hoops.
>
> What extra hoops?
Mainly, for small pieces of code, you might not even want to create a
named Python package. So some form of code hosting seems to be useful ---
whether it is called "snippets" (multi-file) or "hosted projects". (The
wiki-style Cookbook content is then the third category of items that
could be useful to have.)
Re: PyPi usability
You cannot just upload a .py file onto PyPi. PyPi checks that the
uploaded file (i) is a tarball, zip, or egg, and (ii) is named in a
specific way. In addition, (iii) you need to go to a different site.
So there are user experience issues with the web upload. Sure, the upload
is manageable once you practice a bit, and many of the issues are
probably fixable.
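
As a rough illustration of checks (i) and (ii) above, here is a minimal
Python sketch. The accepted suffixes and the name-version pattern are
assumptions made for illustration, not PyPi's actual validation rules,
and looks_uploadable is a hypothetical helper.

```python
import re

# Assumed archive types, following "tarball, zip, or egg" in the text above.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".zip", ".egg")

# Assumed naming convention: <name>-<version>, optionally followed by a
# Python tag as seen on eggs, e.g. "mytool-0.1" or "mytool-0.1-py2.7".
NAME_VERSION = re.compile(r"^[A-Za-z0-9_.]+-\d+(\.\d+)*(-py\d+(\.\d+)*)?$")


def looks_uploadable(filename):
    """Return True if filename passes both illustrative checks."""
    for suffix in ARCHIVE_SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            return NAME_VERSION.match(stem) is not None
    return False  # e.g. a bare .py file is not an archive


print(looks_uploadable("fit_helper.py"))
print(looks_uploadable("fit_helper-0.1.tar.gz"))
```

The point of the sketch is only that a single .py file fails at the
first step, so a newcomer has to learn about packaging before sharing
anything.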
[clip]
>> At the moment, one thing seems clear:
>>
>> - Pointers to externally hosted projects (& semi-automatic
>> import from PyPi)
>
> I just read up more on PyPi, and in particular, read up on their recent
> discussion which led to the disabling of the rating system [1]. I'm not
> convinced that this point is clear. How is our pointing to packages on
> PyPi and elsewhere improving on PyPi? Are there a number of other
> packages out there that are not cataloged on PyPi and which should not
> be cataloged there?
By the "clear" thing, I mean pointing to packages on PyPi (= almost all
externally hosted projects), augmented with community tags etc. Pointing
to packages outside PyPi is not essential --- but would be easy to add.
One argument for allowing "external" links is that packages can be added
to PyPi only by their authors. There is currently a small
number of relevant packages usable from Python that are not on PyPi
(although they should be); for example PyTrilinos. But I guess it should be
possible to browbeat their authors to add a PyPi entry.
> I could see us adding value by having a better tagging system that was
> customized more for scientific software. On the other hand, maybe we
> could just improve the PyPi entries for such software so that a keyword
> search would pull up the packages.
It seems clear to me that this feature would be useful. Also, combining
the PyPi data with smaller code snippets would create a one-stop-shop.
As you can surmise from the discussion you linked to, there is resistance
to adding new community-oriented features to PyPi itself, as some people
feel that such features are out-of-scope for it. Doing it externally also
makes sense from the usability and branding point of view --- a site
called "Python in science" with filtered package selection can be more
convincing and convenient to navigate than browsing
"Topic :: Scientific/Engineering" on PyPi.
The discussion on rating systems there is a useful read --- it's why I
left out any star-based rating systems so far. Just adding an "I use this"
popularity measure probably works around most issues. (The PyPi download
data is not a very reliable measure, as many of the bigger packages host
their files externally.)
***
On improving the entries on PyPi -- the keywords etc. there are editable
only by the original submitter, and this will probably not change, so I
don't think that will be a possible way to go.
The PyPi package classifiers as they are now are assigned solely by the
package authors, more or less at random and from a limited selection, and
are not very reliable. There are several packages in the "Topic ::
Scientific/Engineering" category that don't actually have much focus on
either science or engineering. So, some filtering of PyPi entries would
already be useful.
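
A minimal sketch of what such filtering could look like, assuming each
entry carries its author-assigned classifiers plus a list of
community-assigned tags. The entry layout, the package names, and
filter_entries are all hypothetical and exist only for illustration.

```python
SCI_PREFIX = "Topic :: Scientific/Engineering"


def filter_entries(entries, min_confirmations=1):
    """Keep names of entries that claim the scientific classifier
    and that carry at least min_confirmations community tags."""
    kept = []
    for entry in entries:
        claims_science = any(
            c.startswith(SCI_PREFIX) for c in entry["classifiers"]
        )
        confirmed = len(entry["community_tags"]) >= min_confirmations
        if claims_science and confirmed:
            kept.append(entry["name"])
    return kept


# Hypothetical entries, not real PyPi data.
entries = [
    {"name": "example-solver",
     "classifiers": ["Topic :: Scientific/Engineering :: Mathematics"],
     "community_tags": ["ode", "numerics"]},
    {"name": "example-webtool",
     "classifiers": ["Topic :: Scientific/Engineering"],
     "community_tags": []},
    {"name": "example-logger",
     "classifiers": ["Topic :: System :: Logging"],
     "community_tags": ["logging"]},
]

print(filter_entries(entries))
```

Here the author-assigned classifier is treated only as a first hint, and
the community tags decide whether an entry is actually shown.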
> > But the following are not so clear:
> >
> > - Hosted projects -- how much to overlap with PyPi?
>
> If it's easy to host on PyPi, it seems like we should point people over
> there. We have far fewer users (and infrastructure maintainers) than
> PyPi, and PyPi itself already has the authoritative blessing of Python.
For small contributions, you might not want to use PyPi. Aside from that,
the hosted projects are not really required, provided PyPi is easy enough
to use (which is not true for the setup.py way, but may be true for the
web interface).
> > - Snippets -- the Wiki or the Knol? Or both? How much overlap with
> > hosted projects?
>
> The python snippet repository is:
> http://code.activestate.com/recipes/langs/python/
[clip]
The activestate snippet library seems not to be very actively used for
Scipy et al. at the moment -- there are only ~15 recipes tagged with
"scipy", "numpy", "scientific", "sage", or "science".
Also:
- Since it's not a focused site, relevant code snippets are mixed
with non-relevant ones, including ones written in languages other
than Python.
- As tags are specified by the users, and free-form, stuff will
be lost in the midst of non-relevant content.
- The tagging feature could perhaps be improved --- it appears tags
can only be assigned by the author of the snippet.
- There are some usability problems: e.g. clicking the "Tags" link on
the top takes you away from Python-specific content.
- The search feature is not especially good: it's just Google's site:
search, so it does not explicitly know about tags or metadata.
Other than that, it seems to do a reasonable job.
[clip]
> So: thoughts on the scope of this new project, and how it differentiates
> enough from the existing sites to be useful enough to build and
> maintain?
Focusing on "scientific" content is already differentiation enough,
IMHO. It's mostly a social question of creating a hub for exchanging this
type of content; and also a question of branding. Technically, sure,
there is not so much new under the sun. The first point would just be
to make the implementation slick and useful enough to attract people to a
single place. The second point would be to provide a one-stop-shop for
whatever you need related to Python in science --- which would have the
extra benefit of showcasing that it is doing well, and is a credible tool
for many purposes.
Based on earlier discussions on this list, it seems that at
least the people who chimed in would prefer such a central hub over what
is currently available.
The current situation is, if I want to share something science-related
written in Python, it is not obvious where I should put it so that there
would be some audience. For small contributions, in generic snippet sites
your stuff gets easily lost in the middle of non-relevant content ---
also, I'm not convinced many people (e.g. those on this mailing list)
follow those. The scipy.org/Cookbook is also not very usable, as it's a
generic wiki. For larger contributions, PyPi works (although it's mildly
clumsy to use), but it does not offer much visibility. If I name my
package as scikits.* it goes to scikits.appspot.com, but I guess that's
not very widely used either.
So that's the motivation. The snippet/hosted-projects part alone would
address a part of what is missing, but I think one might as well go and
make a one-stop-shop out of it.
Best,
Pauli