[issue37723] important performance regression on regular expression parsing
yannvgn
report at bugs.python.org
Wed Jul 31 12:28:12 EDT 2019
yannvgn <hi at yannvgn.io> added the comment:
> Indeed, it was not expected that the character set would contain hundreds of thousands of items. What is its size in your real code?
> Could you please show benchmarking results for different implementations and different sizes?
I can't give a precise figure, but sacremoses (a tokenization package), for example, is strongly impacted. See https://github.com/alvations/sacremoses/issues/61#issuecomment-516401853
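
A minimal sketch of how the requested benchmark across sizes could be produced. It assumes the slowdown shows up when compiling a single character class with many distinct members. The helper names (build_charset_pattern, time_compile, run_benchmark), the starting code point and the sizes are illustrative choices, not taken from the tracker or from sacremoses.

```python
import re
import time


def build_charset_pattern(size, start=0x4E00):
    """Return a pattern such as '[...]' holding `size` distinct characters.

    `start` is an arbitrary code point (assumption); with the sizes used
    here the range stays below the surrogate block at 0xD800.
    """
    chars = "".join(chr(start + i) for i in range(size))
    return "[" + re.escape(chars) + "]"


def time_compile(size, repeats=3):
    """Best-of-`repeats` wall-clock time to compile the pattern, in seconds."""
    pattern = build_charset_pattern(size)
    best = float("inf")
    for _ in range(repeats):
        re.purge()  # clear the re module cache so each run really compiles
        t0 = time.perf_counter()
        re.compile(pattern)
        best = min(best, time.perf_counter() - t0)
    return best


def run_benchmark(sizes=(100, 1000, 10000)):
    """Map each character-set size to its best compile time."""
    return {size: time_compile(size) for size in sizes}


if __name__ == "__main__":
    for size, seconds in run_benchmark().items():
        print(f"{size:>7} chars: {seconds:.4f} s")
```

Running the same script under two interpreter versions and comparing how the time grows with the size would show whether compilation scales roughly linearly or much worse.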
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37723>
_______________________________________