[SciPy-dev] Dataset for examples and license

Anne Archibald peridot.faceted at gmail.com
Tue Apr 24 01:53:45 EDT 2007


On 24/04/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> Hi,
>
>     I would like to know what should be done when including a dataset
> in scipy. For example, during the development of my project pymachine,
> I would like to include some famous data such as the iris and Old
> Faithful datasets for demos of classic machine learning algorithms. R has
> some interesting data, but it is licensed under the GPL, and I am not
> quite sure what the status of the data is with respect to the license.
> Does the GPL also cover raw data?

Not necessarily appropriate for machine learning, and this doesn't
answer your question, but there's lots of astronomy data which is
public (and in fact I think in the public domain as it's a NASA
product).

For inclusion in scipy, supposing the license is fine, if the data is
small (a few kilobytes?) it can go in a test case. (Does scipy *have*
a collection of example code in the distribution? It would be nice...)
If it's bigger (a few megabytes?) it could go on the Wiki; if it's
really big it could probably go on the Wikimedia Commons (though do
they support arbitrary file types?).
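
For the small-data case, a minimal sketch of what "data in a test case" could look like, assuming the data is a few rows of CSV embedded directly in the test module. The values below are invented placeholders, not the real Old Faithful data, and the names SMALL_DATASET_CSV and load_small_dataset are hypothetical, not anything scipy provides:

```python
import csv
import io
import unittest

# Placeholder values invented for illustration; NOT the real Old Faithful data.
SMALL_DATASET_CSV = """eruption_min,waiting_min
2.0,50
4.0,80
3.0,65
"""


def load_small_dataset(text=SMALL_DATASET_CSV):
    """Parse the inline CSV into a list of (eruption, waiting) float pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["eruption_min"]), float(row["waiting_min"]))
            for row in reader]


class TestSmallDataset(unittest.TestCase):
    def test_shape_and_mean(self):
        data = load_small_dataset()
        # Three rows were embedded above.
        self.assertEqual(len(data), 3)
        # Mean waiting time of the placeholder rows: (50 + 80 + 65) / 3.
        mean_wait = sum(w for _, w in data) / len(data)
        self.assertAlmostEqual(mean_wait, 65.0)
```

Keeping the data inline like this only makes sense at the few-kilobyte scale; anything larger would presumably want a separate file.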

Uh, I should say, I'm not a scipy developer, so this is rather my best
guess at what they would permit.

Anne


