[issue13706] non-ascii fill characters no longer work in formatting

STINNER Victor report at bugs.python.org
Fri Feb 24 01:56:28 CET 2012


STINNER Victor <victor.stinner at gmail.com> added the comment:

> The ps_AF locale fails with an assert after the latest commit:
> ...
> format(13232434234.23423, "n")
> python: Python/formatter_unicode.c:606: fill_number: 
> Assertion `r == spec->n_grouped_digits' failed.
> Aborted

Oh, you found a locale with a non-ASCII decimal point, cool! I failed to find such a locale. The last commit makes Python support non-ASCII decimal points.

Your comment is incorrect: it was already failing before my commit ;-) Example at changeset 548a023c8230:

$ LANG=ps_AF ./python 
Python 3.3.0a0 (default:548a023c8230, Feb 24 2012, 01:48:01) 
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'ps_AF')
'ps_AF'
>>> format(0.1, 'n')
python: Objects/unicodeobject.c:391: _PyUnicode_CheckConsistency: Assertion `maxchar < 128' failed.
Abandon

--

By the way, Python 3.2 also fails to handle a non-ASCII thousands separator or a non-ASCII decimal point:

$ LANG=ps_AF python3
Python 3.2.1 (default, Jul 11 2011, 18:54:42) 
[GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'ps_AF')
'ps_AF'
>>> format(1234, 'n')
'1\Uffffffd9\Uffffffac234'
>>> format(0.1, 'n')
'0\Uffffffd9\Uffffffab1'

D9 AC and D9 AB are the byte strings b'\xD9\xAC' and b'\xD9\xAB', which are the UTF-8 encoded forms of U+066C (Arabic thousands separator) and U+066B (Arabic decimal separator).
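
As a quick check of the byte values above, here is a minimal sketch in Python 3 that decodes the two byte strings as UTF-8 and prints the resulting code points. The variable names are only illustrative; the bytes themselves are the ones seen in the output above.

```python
import unicodedata

# Bytes seen in the Python 3.2 output above (each one shown as a separate,
# wrongly widened character).
thousands_bytes = b"\xD9\xAC"
decimal_bytes = b"\xD9\xAB"

# Decoding them as UTF-8 gives the characters the locale intended.
thousands = thousands_bytes.decode("utf-8")
decimal = decimal_bytes.decode("utf-8")

for ch in (thousands, decimal):
    # Print the code point and its Unicode name.
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
```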

\Uffffffab comes from a bug in a cast from signed char to a 32-bit unsigned integer (Py_UNICODE on Linux): the byte 0xAB is sign-extended instead of zero-extended.
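
The effect of that cast can be reproduced with plain integer arithmetic. This is only a sketch of the arithmetic, not the actual C code in the formatter; the helper names widen_signed and widen_unsigned are hypothetical.

```python
def widen_signed(byte_value):
    """Mimic casting a signed char to a 32-bit unsigned integer."""
    # Reinterpret the byte as a signed 8-bit value (0xAB becomes -85).
    signed = int.from_bytes(bytes([byte_value]), "little", signed=True)
    # Converting a negative value to unsigned 32-bit wraps modulo 2**32.
    return signed & 0xFFFFFFFF


def widen_unsigned(byte_value):
    """Mimic casting an unsigned char to a 32-bit unsigned integer."""
    return byte_value & 0xFF


for b in (0xD9, 0xAC, 0xAB):
    print("%02X -> signed: %08x, unsigned: %08x"
          % (b, widen_signed(b), widen_unsigned(b)))
```

Going through unsigned char keeps the byte value intact, whereas the signed path yields the ffffffd9 / ffffffac / ffffffab values seen in the Python 3.2 output.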

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13706>
_______________________________________

