[issue43518] textwrap.shorten does not always respect word boundaries

Andrei Kulakov report at bugs.python.org
Thu Jul 1 20:43:19 EDT 2021


Andrei Kulakov <andrei.avk at gmail.com> added the comment:

Some observations:

 - Just to be clear (because annesylvie implied this is caused by exclamation marks), punctuation at the end of the word is not required to hit this bug:

In [44]: shorten("hello universe", width=7, placeholder="")
Out[44]: 'hello u'

(so for example adding an option to break at the boundary of word/punctuation would not fix this issue)

 - It would be good to fix this, because my guess is that most code calling `shorten` uses the default value of break_long_words, and this issue is easy to miss in testing.

 - My guess is that the goal of shorten is to return a shortened (okay, this much is obvious :) ) but representative snapshot of the text.

 - A user might also expect that it's consistent with TextWrapper, since it's essentially a wrapper around TextWrapper :)

Therefore, if we make a backwards-incompatible change, the following would also be nice to have, perhaps requiring a new arg:

width=5
 1. universe => unive
 2. hi universe => hi
 3. hi universe => hi un
 4. universe => universe # allow longer if can't get width without breaking words

#4 would be consistent with TextWrapper handling of `break_long_words=False`

Some option (perhaps a new arg?) should produce both #1 and #2, the idea being that we remove the Nth word if it doesn't fit, but break the 1st word so that there's still some representation of the text rather than a blank.

#3 would be the existing `break_long_words=True`, respecting width but providing max possible representation.
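
A rough sketch of the behaviours numbered above, to make the differences concrete. The function `shorten_sketch` and its `mode` names are hypothetical, invented here for illustration only; they are not part of textwrap and not a proposed API. Placeholder handling is left out, and treating "hi universe" as "hi" in the #4 mode is an assumption.

```python
def shorten_sketch(text, width, mode="keep_first"):
    """Hypothetical helper, NOT part of textwrap.

    mode="keep_first": drop words that don't fit, but break the 1st word (#1, #2)
    mode="break":      fill the width, breaking a word if needed (#1, #3)
    mode="no_break":   never break a word, even if the result exceeds width (#4)
    """
    if mode not in ("keep_first", "break", "no_break"):
        raise ValueError(f"unknown mode: {mode!r}")
    words = text.split()
    if not words:
        return ""
    if mode == "break":
        # Cut at exactly `width` characters, then drop a trailing space.
        return " ".join(words)[:width].rstrip()
    kept, length = [], 0
    for word in words:
        extra = len(word) + (1 if kept else 0)  # +1 for the joining space
        if length + extra > width:
            break
        kept.append(word)
        length += extra
    if kept:
        return " ".join(kept)
    # The 1st word alone is longer than width.
    return words[0][:width] if mode == "keep_first" else words[0]


print(shorten_sketch("universe", 5, "keep_first"))     # unive     (#1)
print(shorten_sketch("hi universe", 5, "keep_first"))  # hi        (#2)
print(shorten_sketch("hi universe", 5, "break"))       # hi un     (#3)
print(shorten_sketch("universe", 5, "no_break"))       # universe  (#4)
```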

 - Generally speaking, shortening into one line is somewhat different from splitting into multiple lines, so the result is awkward when shortening is done by splitting into lines and keeping the first line.
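
To illustrate the "split into lines and keep the first" point: as far as I can tell from the CPython source, `shorten` collapses whitespace and then fills with a `TextWrapper` limited to `max_lines=1`. The snippet below is a minimal check of that assumption using only the public textwrap API; the variable names are illustrative.

```python
from textwrap import TextWrapper, shorten

text = "hello universe"

# Assumed equivalent of shorten(): normalize whitespace, wrap, keep one line.
wrapper = TextWrapper(width=7, max_lines=1, placeholder="")
first_line = wrapper.fill(" ".join(text.split()))

via_shorten = shorten(text, width=7, placeholder="")

# With break_long_words=False the word that doesn't fit is dropped, not split.
no_break = shorten(text, width=7, placeholder="", break_long_words=False)

print(repr(via_shorten), repr(first_line), repr(no_break))
```

Passing `break_long_words=False` is one way for callers to avoid the mid-word cut today, since `shorten` forwards extra keyword arguments to `TextWrapper`.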

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43518>
_______________________________________

