[SciPy-Dev] New subpackage: scipy.data

Stefan van der Walt stefanv at berkeley.edu
Thu Mar 29 19:16:17 EDT 2018


On Thu, 29 Mar 2018 18:54:52 -0400, Warren Weckesser wrote:
> Can you summarize the problems that make you regret including the
> data?

- The size of the repository: every clone takes extra time, and that
  for data that isn't needed in most use cases.
  
- Artificial limit on data sizes: we now have a default place to store
  data, but we still need an additional mechanism for larger datasets.
  How do you choose the threshold between what goes in and what is
  too big?
  
- Because these tiny embedded datasets are easily available, they become
  the default for demos.  If data is stored externally, realistic
  examples become more feasible and likely.
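
To illustrate what external storage could look like, here is a minimal
sketch that downloads a dataset on first use, caches it locally, and
verifies a checksum.  It is not an existing SciPy API: `fetch_dataset`,
its parameters, and the cache layout are all hypothetical, and the
caller is assumed to know the expected SHA-256 hash.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_dataset(url, sha256, cache_dir):
    """Return a local path to the dataset, downloading it only if needed.

    Hypothetical helper for illustration; not part of SciPy.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Name the cached file after the last component of the URL.
    target = cache_dir / url.rsplit("/", 1)[-1]

    if not target.exists():
        # First use: download into the cache instead of shipping the
        # file inside the repository.
        with urllib.request.urlopen(url) as response:
            target.write_bytes(response.read())

    # Verify the file so a corrupted or changed copy is not used silently.
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    if digest != sha256:
        target.unlink()
        raise ValueError(f"checksum mismatch for {target.name}")
    return target
```

With something along these lines, the repository only carries a URL and
a hash per dataset, so the size threshold question above no longer
decides what can be used in a demo.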

Best regards
Stéfan

