[issue32397] textwrap output may change if you wrap a paragraph twice

Andrei Kulakov report at bugs.python.org
Tue Aug 3 13:53:12 EDT 2021


Andrei Kulakov <andrei.avk at gmail.com> added the comment:

Irit: I assume you mean r' \r?\n'. That's a great idea; it's much faster than adding a separate replacement step.

Latest version I came up with is this:

                if re.search(r' \r?\n', text):
                    text = re.sub(r' \r?\n', ' ', text)
                if re.search(r'\r?\n ', text):
                    text = re.sub(r'\r?\n ', ' ', text)

This optimizes the case where there are no newlines, which is likely the most common case for small fragments of text. It may be the less common case for larger fragments, though, where performance matters more, so I'm not sure it's worth it.
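
To make the trade-off concrete, here is a minimal standalone sketch of the guarded form next to an unguarded one. The helper names (collapse_guarded, collapse_unguarded) are hypothetical and are not part of textwrap; the sketch only illustrates that the search() guard changes how much work is done, not the result.

```python
import re

# Hypothetical module-level patterns; the snippet above passes the raw
# pattern strings to re.search()/re.sub() instead.
SPACE_NL = re.compile(r' \r?\n')   # space followed by a newline
NL_SPACE = re.compile(r'\r?\n ')   # newline followed by a space


def collapse_guarded(text):
    """Run each sub() only when search() finds something to replace."""
    if SPACE_NL.search(text):
        text = SPACE_NL.sub(' ', text)
    if NL_SPACE.search(text):
        text = NL_SPACE.sub(' ', text)
    return text


def collapse_unguarded(text):
    """Always run both substitutions, with no search() guard."""
    text = SPACE_NL.sub(' ', text)
    return NL_SPACE.sub(' ', text)


if __name__ == "__main__":
    samples = [
        "abc foo\n bar baz",   # newline followed by a space
        "abc foo \r\nbar",     # space followed by CRLF
        "abc foo\nbar baz",    # newline with no adjacent space
        "no newlines here",
    ]
    for s in samples:
        # Both forms give the same text; only the amount of work differs.
        assert collapse_guarded(s) == collapse_unguarded(s)
        print(repr(s), "->", repr(collapse_guarded(s)))
```

A newline with no space on either side is left untouched by both forms, which is the case the second timing below exercises.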

Timings:
# sub() has to run
2904 (~/opensource/cpython) % ./python.exe -mtimeit 'import textwrap' 'textwrap.wrap("abc foo\n bar baz", 5)'
5000 loops, best of 5: 67.6 usec per loop

# search() runs; but sub() does NOT because there's no adjacent space
2906 (~/opensource/cpython) % ./python.exe -mtimeit 'import textwrap' 'textwrap.wrap("abc foo\nbar baz", 5)'
5000 loops, best of 5: 60.3 usec per loop
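
A rough way to compare the guarded and unguarded forms in isolation, without a patched CPython build, is to time just the substitution step with the timeit module. This is only a sketch: the function names are hypothetical, and the absolute numbers will vary by machine and will not match the figures above, which measure the whole textwrap.wrap() call.

```python
import re
import timeit

SPACE_NL = re.compile(r' \r?\n')
NL_SPACE = re.compile(r'\r?\n ')


def guarded(text):
    # Hypothetical stand-in for the search()-then-sub() snippet.
    if SPACE_NL.search(text):
        text = SPACE_NL.sub(' ', text)
    if NL_SPACE.search(text):
        text = NL_SPACE.sub(' ', text)
    return text


def unguarded(text):
    # Hypothetical stand-in for calling sub() unconditionally.
    text = SPACE_NL.sub(' ', text)
    return NL_SPACE.sub(' ', text)


def best_usec(func, text, number=20000, repeat=5):
    """Best-of-`repeat` time per call, in microseconds."""
    times = timeit.repeat(lambda: func(text), number=number, repeat=repeat)
    return min(times) / number * 1e6


if __name__ == "__main__":
    cases = [
        ("sub() has to run", "abc foo\n bar baz"),
        ("newline, no adjacent space", "abc foo\nbar baz"),
        ("no newlines", "abc foo bar baz"),
    ]
    for label, text in cases:
        print(f"{label}: guarded {best_usec(guarded, text):.3f} usec, "
              f"unguarded {best_usec(unguarded, text):.3f} usec")
```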

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32397>
_______________________________________

