[New-bugs-announce] [issue7327] format: minimum width: UTF-8 separators and decimal points
Stefan Krah
report at bugs.python.org
Sun Nov 15 11:29:31 CET 2009
New submission from Stefan Krah <stefan-usenet at bytereef.org>:
This issue affects the format functions of float and decimal.
When calculating the padding necessary to reach the minimum width,
UTF-8 separators and decimal points are counted by their byte
lengths rather than by their character lengths. This can lead to
printed representations that are too short.
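The mismatch can be sketched in a few lines. This is only an illustration in Python 3 syntax, not the code used by float or decimal: pad_needed is a hypothetical helper, and the sample string assumes a locale whose thousands separator is a no-break space (two bytes in UTF-8) and whose decimal point is a comma.

```python
# Sketch only: pad_needed is a hypothetical helper, not a CPython function.
NBSP = "\u00a0"  # no-break space; encodes to the two bytes C2 A0 in UTF-8


def pad_needed(text, min_width, count_bytes):
    """Return how many fill characters are needed to reach min_width."""
    length = len(text.encode("utf-8")) if count_bytes else len(text)
    return max(0, min_width - length)


# Assumed sample: a grouped number with three multi-byte separators.
formatted = "-0" + NBSP + "000" + NBSP + "000" + NBSP + "001,5"

byte_len = len(formatted.encode("utf-8"))  # each separator counts as 2
char_len = len(formatted)                  # each separator counts as 1

print(byte_len, char_len)                            # 19 16
print(pad_needed(formatted, 19, count_bytes=True))   # 0: looks wide enough
print(pad_needed(formatted, 19, count_bytes=False))  # 3: actually too short
```

Counting bytes reports that the minimum width of 19 is already met, while the string is only 16 characters wide, so three fill characters are missing.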
Real world example (separator):
>>> import locale
>>> from decimal import Decimal
>>> locale.setlocale(locale.LC_NUMERIC, "cs_CZ.UTF-8")
'cs_CZ.UTF-8'
>>> s = format(Decimal("-1.5"), ' 019.18n')
>>> len(s)
19
>>> len(s.decode('utf-8'))
16
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>>
>>>
>>> s = format(-1.5, ' 019.18n')
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>> len(s.decode('utf-8'))
16
>>>
Constructed example (separator and decimal point):
>>> u = {'decimal_point' : "\xc2\xbf", 'grouping' : [3, 3, 0],
...      'thousands_sep': "\xc2\xb4"}
>>> def get_fmt(x, locale, fmt='n'):
...     return Decimal.__format__(Decimal(x), fmt, _localeconv=locale)
...
>>> s = get_fmt(Decimal("1.5"), u, "020n")
>>> s
'00\xc2\xb4000\xc2\xb4000\xc2\xb4001\xc2\xbf5'
>>> len(s.decode('utf-8'))
16
----------
messages: 95283
nosy: eric.smith, mark.dickinson, skrah
severity: normal
status: open
title: format: minimum width: UTF-8 separators and decimal points
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7327>
_______________________________________