[New-bugs-announce] [issue25760] TextWrapper fails to split 'two-and-a-half-hour' correctly
Samwyse
report at bugs.python.org
Sat Nov 28 21:07:24 EST 2015
New submission from Samwyse:
Single-character words in a hyphenated phrase are not split correctly. The root issue is the wordsep_re class variable. To reproduce, run the following:
>>> import textwrap
>>> textwrap.TextWrapper.wordsep_re.split('two-and-a-half-hour')
['', 'two-', 'and-a', '-half-', 'hour']
It works if 'a' is replaced with two or more alphabetic characters.
>>> textwrap.TextWrapper.wordsep_re.split('two-and-aa-half-hour')
['', 'two-', '', 'and-', '', 'aa-', '', 'half-', 'hour']
The problem is in this part of the pattern: (?=\w+[^0-9\W])
I confess that I don't understand what situation would require such a complicated pattern. Why wouldn't (?=\w) work?
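A minimal sketch of what that lookahead does in isolation. The two patterns below are hypothetical stand-ins (a bare hyphen followed by each lookahead), not the full wordsep_re, so this only shows which hyphens each lookahead accepts. It says nothing about whether the rest of the pattern would also need to change.

```python
import re

# Hypothetical stand-alone patterns: a hyphen followed by each lookahead.
# These are NOT the full TextWrapper.wordsep_re, only the fragment in question.
strict = re.compile(r'-(?=\w+[^0-9\W])')  # lookahead quoted in the report
loose = re.compile(r'-(?=\w)')            # simpler lookahead asked about above

text = 'two-and-a-half-hour'

# Offsets of the hyphens that each pattern accepts as a break point.
strict_hits = [m.start() for m in strict.finditer(text)]
loose_hits = [m.start() for m in loose.finditer(text)]

print(strict_hits)  # [3, 9, 14]
print(loose_hits)   # [3, 7, 9, 14]
```

The stricter lookahead needs at least two word characters after the hyphen, the last of them a non-digit, so the hyphen at offset 7 (the one followed by 'a-') is rejected.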
----------
components: Library (Lib)
messages: 255558
nosy: samwyse
priority: normal
severity: normal
status: open
title: TextWrapper fails to split 'two-and-a-half-hour' correctly
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25760>
_______________________________________