[New-bugs-announce] [issue7327] format: minimum width: UTF-8 separators and decimal points

Stefan Krah report at bugs.python.org
Sun Nov 15 11:29:31 CET 2009


New submission from Stefan Krah <stefan-usenet at bytereef.org>:

This issue affects the format functions of float and decimal.

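When calculating the padding necessary to reach the minimum width,
UTF-8 encoded separators and decimal points are measured by their byte
length rather than by their character count. Each multibyte character
is therefore overcounted, and the printed representation can end up
shorter than the requested width.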
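
To show the miscalculation in isolation, here is a minimal Python 3
sketch. The helper pad_left is hypothetical and is not the code used by
float or decimal; it only contrasts measuring a string by its UTF-8 byte
length with measuring it by its character count.

```python
def pad_left(body: str, min_width: int, count_bytes: bool) -> str:
    """Left-pad body with spaces up to min_width (hypothetical helper).

    count_bytes=True mimics the faulty behaviour: the width of body is
    taken from its UTF-8 encoding instead of its character count.
    """
    measured = len(body.encode("utf-8")) if count_bytes else len(body)
    return " " * max(0, min_width - measured) + body


# U+00A0 (no-break space) is a thousands separator that needs two bytes
# in UTF-8 but is a single character.
body = "1\u00a0234,5"

too_short = pad_left(body, 10, count_bytes=True)
correct = pad_left(body, 10, count_bytes=False)

print(len(too_short), len(correct))
```

With one two-byte separator in the body, the byte-based result is one
character short of the requested width of 10.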


Real world example (separator):

>>> import locale
>>> from decimal import *
>>> locale.setlocale(locale.LC_NUMERIC, "cs_CZ.UTF-8")
'cs_CZ.UTF-8'
>>> s = format(Decimal("-1.5"),  ' 019.18n')
>>> len(s)
19
>>> len(s.decode('utf-8'))
16
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>> 
>>> 
>>> s = format(-1.5,  ' 019.18n')
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>> len(s.decode('utf-8'))
16
>>> 


Constructed example (separator and decimal point):

>>> u = {'decimal_point' : "\xc2\xbf",  'grouping' : [3, 3, 0],
...      'thousands_sep': "\xc2\xb4"}
>>> def get_fmt(x, locale, fmt='n'):
...     return Decimal.__format__(Decimal(x), fmt, _localeconv=locale)
... 
>>> s = get_fmt(Decimal("1.5"), u, "020n")
>>> s
'00\xc2\xb4000\xc2\xb4000\xc2\xb4001\xc2\xbf5'
>>> len(s.decode('utf-8'))
16
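
The lengths reported in both examples can be cross-checked in Python 3,
where bytes and text are distinct types. The byte literals below are
copied from the sessions in this report; the variable names are only
for illustration.

```python
# Byte strings copied verbatim from the two sessions in this report.
real_world = b'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'         # width 19 requested
constructed = b'00\xc2\xb4000\xc2\xb4000\xc2\xb4001\xc2\xbf5'  # width 20 requested

for raw, requested in ((real_world, 19), (constructed, 20)):
    text = raw.decode('utf-8')
    # Count the characters that need more than one byte in UTF-8.
    multibyte = sum(1 for ch in text if len(ch.encode('utf-8')) > 1)
    print(requested, len(raw), len(text), multibyte)
```

In both cases the byte length equals the requested width, and the
shortfall in characters equals the number of two-byte characters
(three separators in the first string, three separators plus one
decimal point in the second).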

----------
messages: 95283
nosy: eric.smith, mark.dickinson, skrah
severity: normal
status: open
title: format: minimum width: UTF-8 separators and decimal points

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7327>
_______________________________________

