[scikit-learn] 2 million samples dataset caused python and OS crash

Andrew Howe ahowe42 at gmail.com
Wed Jan 6 06:46:36 EST 2021


A core dump generally happens when a process tries to access memory outside
its allocated address space. You haven't said which estimator you were
using, but I'd guess it did something with the dataset that duplicated it
or otherwise expanded it beyond the available memory. The full stack trace
would be helpful.
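
As a rough way to check the memory hypothesis, here is a minimal sketch that
estimates how much RAM a dense array of a given shape would need. The helper
name dense_size_gib and the column count are hypothetical (I don't know what
your transformer produces), so substitute the real number of output columns.

```python
def dense_size_gib(n_rows, n_cols, itemsize=8):
    """Return the size in GiB of a dense n_rows x n_cols array.

    itemsize is the number of bytes per element (8 for float64).
    """
    return n_rows * n_cols * itemsize / 2**30


# 2 million samples, as in the original report. The column count is a
# made-up illustration of what an encoding step might expand the data to.
n_rows = 2_000_000
n_cols = 10_000  # hypothetical, not taken from the report

needed = dense_size_gib(n_rows, n_cols)
print(f"Dense float64 array would need about {needed:.1f} GiB")
```

If the estimate approaches the 128GB installed, a dense copy of the data is a
plausible culprit. Separately, running the script with
`python -X faulthandler` makes CPython print the Python-level traceback when
the process receives a fatal signal, which should show which call crashed.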

Andrew




On Wed, Jan 6, 2021 at 11:02 AM Liu James <icefrog1950 at gmail.com> wrote:

> Hi all,
>
> I'm using a medium-sized dataset, KDD99 IDS
> (https://www.ll.mit.edu/r-d/datasets/1999-darpa-intrusion-detection-evaluation-dataset),
> for model training; the dataset has 2 million samples. When using
> fit_transform(), the OS crashed with the log "Process 13851(python) of user xxx
> dumped core. Stack trace
> .../numpy/core/_multiarray_umath_cpython_36m_x86_64... ".
>
> The hardware: CentOS 8, Intel i9, 128GB RAM; the stack size is set to
> unlimited. The crash is reproducible.
>
> Thanks.
>