[scikit-learn] 2 million samples dataset caused python and OS crash
Liu James
icefrog1950 at gmail.com
Wed Jan 6 06:00:46 EST 2021
Hi all,
I'm using the medium-sized KDD99 IDS dataset
(https://www.ll.mit.edu/r-d/datasets/1999-darpa-intrusion-detection-evaluation-dataset)
for model training; it has 2 million samples. When I call
fit_transform(), the OS crashes with the log message "Process 13851(python) of user xxx
dumped core. Stack trace
.../numpy/core/_multiarray_umath_cpython_36m_x86_64... ".
The system: CentOS 8, Intel i9, 128 GB RAM; the stack size is set to unlimited.
The crash is reproducible.
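
For a core dump like this, a Python-level traceback may help show which call leads into the failing NumPy code. Below is a minimal sketch that assumes the standard-library faulthandler module; the helper name enable_crash_traceback and the log file are hypothetical and not part of the original report.

```python
import faulthandler


def enable_crash_traceback(log_file):
    """Hypothetical helper: write Python tracebacks to log_file on a fatal signal.

    log_file must be an open file backed by a real file descriptor, and it
    must stay open for as long as the handler is enabled.
    """
    # On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, dump the traceback
    # of every thread to log_file before the process dies.
    faulthandler.enable(file=log_file, all_threads=True)
    return faulthandler.is_enabled()
```

The helper would be called once, before fit_transform(), with a file opened for writing. Running the script with "python -X faulthandler" should have a similar effect, with the traceback going to stderr.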
Thanks.