Confusing textwrap parameters, and request for RE help

DL Neil PythonList at DancesWithMice.info
Tue Mar 24 17:02:50 EDT 2020


On 23/03/20 8:00 AM, Chris Angelico wrote:
> When using textwrap.fill() or friends, setting break_long_words=False
> without also setting break_on_hyphens=False has the very strange
> behaviour that a long hyphenated word will still be wrapped. I
> discovered this as a very surprising result when trying to wrap a
> paragraph that contained a URL, and wanting the URL to be kept
> unchanged:

I dropped textwrap years ago, a policy which likely shows my (and my 
applications') bias.

Today it feels like an anachronism because it comes from the era of 
fixed-width fonts and line-lengths denominated in characters*. The issue 
is that it was designed to re-define 'white space' and to enable the 
conversion of text 'wrapped' in one (fixed) format, to suit another. 
With the arrival/predominance of proportional-width fonts, the skills of 
hyphenation have started to go the way of cursive hand-writing 
[substitute any number of grumpy, old man regrets/favorite complaints, 
here].


Beyond such syntactic concerns, textwrap applies no semantic meaning to 
its text-content. For that we have to move to markup languages (HTML is 
my bias - but (largely) presumes screen presentation). Python seems to 
prefer reST (eg Sphinx). See also, markdown.


However, it's annoying if one is 'stuck' within a spec and some list 
smart-alec comes along saying 'change the tool'...

Your idea of sub-classing (as I'm sure YOU know, textwrap.fill() is but a 
convenience-function around the TextWrapper class) struck me as 
clever-thinking! Could textwrap's 
'final format' be caught just before 'return', enabling a post-process 
to undo anything textwrap has done, and (re-)format the URLs to spec, or 
to treat textwrap's output as a template and 'inject' the URL 
appropriately? If not a sub-class, a decorator?
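A rough sketch of the decorator/'template' variant, in case it helps. 
Everything here is my own assumption rather than anything textwrap 
provides: the crude URL_RE pattern, the protect_urls name, and the trick 
of masking each URL with a run of NUL characters (which textwrap treats 
as one unbreakable word as long as break_long_words=False), then 
injecting the real URLs back afterwards:

```python
import functools
import re
import textwrap

# Assumption: a deliberately crude URL pattern, good enough for a sketch.
URL_RE = re.compile(r"https?://\S+")


def protect_urls(wrap_func):
    """Decorate a textwrap-style function so URLs survive wrapping intact."""

    @functools.wraps(wrap_func)
    def wrapper(text, *args, **kwargs):
        urls = URL_RE.findall(text)
        # Mask each URL with a same-length run of NULs: no hyphens, no
        # whitespace, so textwrap sees a single word of the right width.
        masked = URL_RE.sub(lambda m: "\x00" * len(m.group()), text)
        # The mask must never be split, or re-injection would miscount.
        kwargs["break_long_words"] = False
        wrapped = wrap_func(masked, *args, **kwargs)
        # Treat the wrapped text as a template and put the URLs back in order.
        url_iter = iter(urls)
        return re.sub(r"\x00+", lambda m: next(url_iter), wrapped)

    return wrapper


# Hypothetical name for the decorated convenience function.
fill_keeping_urls = protect_urls(textwrap.fill)
```

It assumes the text contains no NUL characters of its own, and it 
deliberately overrides any break_long_words setting the caller passes.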

My idea (being more simple-minded than you!), would be to partition the 
text (yes, am alluding to the Python str.partition() method):
- textwrap the 'early text',
- treat the URL as a string using the required convention,
- textwrap the 'later text', and
- str.join() the three components/partitions afterwards.
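Those four steps might look something like this. It is a minimal sketch 
only: wrap_around_url is a made-up name, it assumes the URL is already 
known and appears once, and it strips the white space either side of the 
URL because textwrap keeps leading white space at the start of a 
paragraph:

```python
import textwrap


def wrap_around_url(text, url, width=70):
    """Wrap the text before and after `url`, leaving the URL untouched."""
    before, found, after = text.partition(url)
    if not found:
        # No URL present: ordinary wrapping will do.
        return textwrap.fill(text, width=width)
    parts = [
        textwrap.fill(before.strip(), width=width),  # the 'early text'
        url,                                         # kept exactly as given
        textwrap.fill(after.strip(), width=width),   # the 'later text'
    ]
    # Skip empty partitions so a leading or trailing URL adds no blank line.
    return "\n".join(part for part in parts if part)
```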

Both likely 'force' the URL to occupy a line of its own, and thus create 
some odd-looking results!


* that said, my terminal windows and SublimeText are all configured to 
use fixed-width fonts - seems easier on the eyes! Similarly, the 
mainframe legacy of punched-card input is alive-and-well in Python 
'standards', which talk of a 79- or 80-character line-width. Please 
don't take me by the ear and wash-out my mouth with soap and water!
-- 
Regards =dn

