[SciPy-Dev] New subpackage: scipy.data

Ralf Gommers ralf.gommers at gmail.com
Sun Apr 29 01:58:44 EDT 2018


On Tue, Apr 3, 2018 at 1:06 AM, Daπid <davidmenhur at gmail.com> wrote:

>
>
> On 31 March 2018 at 02:17, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
>>
>>
>> On Fri, Mar 30, 2018 at 12:03 PM, Eric Larson <larson.eric.d at gmail.com>
>> wrote:
>>
>>>> Top-level module for them alone sounds overkill, and I'm not sure if
>>>> discoverability alone is enough.
>>>>
>>>
>>> Fine by me. And if we follow the idea that these should be added
>>> sparingly, we can maintain discoverability without it growing out of
>>> hand by populating the See Also sections of each function.
>>>
>>
>> I agree with this: the 2 images and 1 ECG signal (to be added) that we
>> have don't justify a top-level module. We don't want to grow beyond
>> the absolute minimum of datasets. The package is already very large, which
>> is problematic in certain cases. E.g. numpy + scipy still fits in the AWS
>> Lambda limit of 50 MB, but there's not much margin.
>>
>
> The biggest subpackage is sparse, and there most of the space is taken by
> _sparsetools.cpython-35m-x86_64-linux-gnu.so. According to size -A -d, the
> biggest sections are debug. The same goes for the second biggest, special.
> Can it run without those sections? On preliminary checks, it seems that
> stripping .debug_info and .debug_loc trims the size down from 38 to 3.7 MB,
> and the test suite still passes.
>

Should work. That's a lot more gain than I'd realized. Given that we hardly
ever get useful gdb tracebacks, it may be worth considering doing that for
releases.
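
For anyone who wants to repeat the check above on their own build, here is a
minimal sketch. It assumes GNU binutils (size and strip) are on PATH and that
you run it on a copy of the install, not the original. Note that it uses
strip --strip-debug, which removes all debug sections rather than only the
two named above, and that the helper names are made up for illustration.

```python
import subprocess
from pathlib import Path


def parse_section_sizes(size_output):
    """Parse the text printed by `size -A -d` into {section_name: bytes}."""
    sections = {}
    for line in size_output.splitlines():
        parts = line.split()
        # Section rows have three fields: name, size, address (decimal with -d).
        # The file-name row, the column header and the "Total" row do not match.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            sections[parts[0]] = int(parts[1])
    return sections


def section_sizes(path, size_cmd="size"):
    """Run `size -A -d` on one shared object and return its section sizes."""
    result = subprocess.run(
        [size_cmd, "-A", "-d", str(path)],
        check=True, capture_output=True, text=True,
    )
    return parse_section_sizes(result.stdout)


def find_extension_modules(root):
    """Return all *.so files below root, sorted by path."""
    return sorted(p for p in Path(root).rglob("*.so") if p.is_file())


def strip_debug(path, strip_cmd="strip"):
    """Strip debug sections in place; return (bytes_before, bytes_after)."""
    path = Path(path)
    before = path.stat().st_size
    subprocess.run([strip_cmd, "--strip-debug", str(path)], check=True)
    return before, path.stat().st_size
```

A possible use is to loop over find_extension_modules() on the copied tree,
print the largest entries from section_sizes() for each file, call
strip_debug() on it, and then rerun the test suite against the stripped copy.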


>
> If we really need to trim down the size for installing in things like
> Lambda, could we have a scipy-lite for production environments that is the
> same as scipy but without unnecessary debug info? I imagine tracebacks would
> not be as informative, but that shouldn't matter for production environments.
> My first thought was to remove docstrings, comments, tests, and data, but
> maybe they don't amount to enough to be worth the trouble.
>

Recipes for such things are floating around, and it makes sense to do that.
I'd rather not maintain an official scipy-lite package though, rather just
make choices within scipy that enable third parties to do that.

Ralf
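
Whether removing tests and data is worth the trouble can be measured before
deciding. The following is a rough sketch, not an official recipe: it walks an
installed package and reports how many bytes sit under directories named
"tests". The function name and the choice of "tests" as the only prunable
directory name are assumptions made for this example.

```python
import os
from pathlib import Path


def size_breakdown(package_root, prunable=("tests",)):
    """Return (prunable_bytes, total_bytes) for the files below package_root.

    A file counts as prunable if any directory on its path, relative to
    package_root, has a name listed in `prunable`.
    """
    root = Path(package_root)
    total = 0
    prunable_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_parts = Path(dirpath).relative_to(root).parts
        in_prunable = any(part in prunable for part in rel_parts)
        for name in filenames:
            file_path = Path(dirpath) / name
            if not file_path.is_file():
                continue  # skip broken symlinks and other non-regular entries
            size = file_path.stat().st_size
            total += size
            if in_prunable:
                prunable_bytes += size
    return prunable_bytes, total
```

With scipy installed, calling size_breakdown(Path(scipy.__file__).parent)
gives the two numbers to compare against the 50 MB limit mentioned earlier.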



>
>
> On the topic at hand, I would agree to having a few, small datasets to
> showcase functionality. I think a few kilobytes can go a long way to show
> and benchmark. As far as I can see, a top level module is free: it wouldn't
> add any maintenance burden, and would make them easier to find.
>
> /David.
>