[issue17343] Add a version of str.split which returns an iterator
Paweł Miech
report at bugs.python.org
Fri Feb 26 07:05:12 EST 2021
Paweł Miech <pawelmhm at gmail.com> added the comment:
Making str.split return an iterator sounds like an interesting task. I found this issue because we recently discussed in our project that str.split returns a list, which can increase memory usage for some tasks when there is a large response to parse.
Here is a small script, created by my friend Juancarlo Anez, with an iterator version of str.split. Compared with the default str.split, it uses much less memory. Running it with the memory-profiler tool (https://pypi.org/project/memory-profiler/) produces this output:
3299999
Filename: main.py
Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    24   39.020 MiB   39.020 MiB           1   @profile
    25                                         def generate_string():
    26   39.020 MiB    0.000 MiB           1       n = 100000
    27   49.648 MiB    4.281 MiB      100003       long_string = " ".join([uuid.uuid4().hex.upper() for _ in range(n)])
    28   43.301 MiB   -6.348 MiB           1       print(len(long_string))
    29
    30   43.301 MiB    0.000 MiB           1       z = isplit(long_string)
    31   43.301 MiB    0.000 MiB      100001       for line in z:
    32   43.301 MiB    0.000 MiB      100000           continue
    33
    34   52.281 MiB    0.297 MiB      100001       for line in long_string.split():
    35   52.281 MiB    0.000 MiB      100000           continue
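You can see that the default str.split uses much more memory: the loop over long_string.split() raises usage from 43.301 MiB to 52.281 MiB (about 9 MiB), while the loop over isplit adds no measurable increase.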
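The attached main.py is not reproduced in this message. For readers without the attachment, here is a minimal sketch of what an iterator-based whitespace split could look like. It is an illustration under assumptions, not the attached implementation: the name isplit_sketch is hypothetical, and it uses re.finditer, which for ordinary text should yield the same tokens as str.split() called with no arguments.

```python
import re
from typing import Iterator

# Runs of non-whitespace characters; each run is one token.
_TOKEN = re.compile(r"\S+")


def isplit_sketch(text: str) -> Iterator[str]:
    """Lazily yield whitespace-separated tokens from text.

    Hypothetical helper for illustration only. Tokens are produced
    one at a time, so the full list of substrings is never held in
    memory at once.
    """
    for match in _TOKEN.finditer(text):
        yield match.group()


if __name__ == "__main__":
    sample = "alpha  beta\tgamma\n delta "
    for token in isplit_sketch(sample):
        print(token)
```

The saving comes from never building the intermediate list: each token can be garbage-collected as soon as the loop moves on, whereas str.split() keeps every substring alive until the list itself is released.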
----------
nosy: +Paweł Miech
Added file: https://bugs.python.org/file49837/main.py
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue17343>
_______________________________________