Rough draft: Proposed format specifier for a thousands separator

Raymond Hettinger python at rcn.com
Thu Mar 12 12:35:43 EDT 2009


On Mar 12, 7:51 am, prueba... at latinmail.com wrote:
> On Mar 12, 3:30 am, Raymond Hettinger <pyt... at rcn.com> wrote:
>
>
>
> > If anyone here is interested, here is a proposal I posted on the
> > python-ideas list.
>
> > The idea is to make number formatting a little easier with the new
> > format() builtin in Py2.6 and Py3.0:
> > http://docs.python.org/library/string.html#formatspec
>
> > -------------------------------------------------------------
>
> > Motivation:
>
> >     Provide a simple, non-locale aware way to format a number
> >     with a thousands separator.
>
> >     Adding thousands separators is one of the simplest ways to
> >     improve the professional appearance and readability of
> >     output exposed to end users.
>
> >     In the finance world, output with commas is the norm.  Finance
> >     users and non-professional programmers find the locale approach
> >     to be frustrating, arcane and non-obvious.
>
> >     It is not the goal to replace locale or to accommodate every
> >     possible convention.  The goal is to make a common task easier
> >     for many users.
>
> > Research so far:
>
> >     Scanning the web, I've found that thousands separators are
> >     usually one of COMMA, PERIOD, SPACE, or UNDERSCORE.  The
> >     COMMA is used when a PERIOD is the decimal separator.
>
> >     James Knight observed that Indian/Pakistani numbering systems
> >     group by hundreds.   Ben Finney noted that Chinese group by
> >     ten-thousands.
>
> >     Visual Basic and its brethren (like MS Excel) use a completely
> >     different style and have ultra-flexible custom format specifiers
> >     like: "_($* #,##0_)".
>
> > Proposal I (from Nick Coghlan):
>
> >     A comma will be added to the format() specifier mini-language:
>
> >     [[fill]align][sign][#][0][minimumwidth][,][.precision][type]
>
> >     The ',' option indicates that commas should be included in the
> >     output as a thousands separator. As with locales which do not use
> >     a period as the decimal point, locales which use a different
> >     convention for digit separation will need to use the locale
> >     module to obtain appropriate formatting.
>
> >     The proposal works well with floats, ints, and decimals.  It also
> >     allows easy substitution for other separators.  For example:
>
> >         format(n, "6,f").replace(",", "_")
>
> >     This technique is completely general but it is awkward in the one
> >     case where the commas and periods need to be swapped.
>
> >         format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")
>
> > Proposal II (to meet Antoine Pitrou's request):
>
> >     Make both the thousands separator and decimal separator user
> >     specifiable but not locale aware.  For simplicity, limit the
> >     choices to a comma, period, space, or underscore.
>
> >     [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type]
>
> >     Examples:
>
> >         format(1234, "8.1f")    -->     '  1234.0'
> >         format(1234, "8,1f")    -->     '  1234,0'
> >         format(1234, "8T.,1f")  -->     ' 1.234,0'
> >         format(1234, "8T .f")   -->     ' 1 234,0'
> >         format(1234, "8d")      -->     '    1234'
> >         format(1234, "8T,d")    -->     '   1,234'
>
> >     This proposal meets most needs (except for people wanting
> >     grouping for hundreds or ten-thousands), but it comes at the
> >     expense of being a little more complicated to learn and remember.
> >     Also, it makes it more challenging to write custom __format__
> >     methods that follow the format specification mini-language.
>
> >     For the locale module, just the "T" is necessary in a formatting
> >     string since the tool already has procedures for figuring out the
> >     actual separators from the local context.
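
A rough sketch of how the Proposal II examples could be emulated on top
of Proposal I, assuming an interpreter that already accepts the ','
option in format().  The helper name format_grouped and its parameters
are hypothetical, made up for illustration only; this is not the
proposed syntax and not how an implementation would necessarily work.

```python
# Hypothetical helper: emulates the Proposal II examples using only the
# ',' option from Proposal I.  Names and parameters are assumptions.
ALLOWED = (",", ".", " ", "_")


def format_grouped(value, width=0, tsep=None, dsep=".", precision=None):
    """Right-justify value in width, with optional thousands separator.

    tsep=None means no grouping; precision=None means integer ('d') output.
    """
    if tsep is not None and tsep not in ALLOWED:
        raise ValueError("unsupported thousands separator: %r" % (tsep,))
    if dsep not in ALLOWED:
        raise ValueError("unsupported decimal separator: %r" % (dsep,))

    comma = "," if tsep is not None else ""
    if precision is None:
        body = format(value, comma + "d")
    else:
        body = format(value, "%s.%df" % (comma, precision))

    # Translate both characters in a single pass, so swapping comma and
    # period needs no temporary placeholder.
    table = {ord("."): dsep}
    if tsep is not None:
        table[ord(",")] = tsep
    return body.translate(table).rjust(width)


print(repr(format_grouped(1234, 8, None, ".", 1)))  # '  1234.0'
print(repr(format_grouped(1234, 8, ".", ",", 1)))   # ' 1.234,0'
print(repr(format_grouped(1234, 8, ",")))           # '   1,234'
```

Doing the padding last keeps the width correct after the separators
have been substituted.
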
>
> > Comments and suggestions are welcome but I draw the line at supporting
> > Mayan numbering conventions ;-)
>
> > Raymond
>
> As far as I am concerned, the simplest version plus a way to swap
> commas and periods is all that is needed.

Thanks for the feedback.

FWIW, I posted a cleaned-up version of the proposal at
  http://www.python.org/dev/peps/pep-0378/
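
For illustration, here is a minimal sketch of the simplest version plus
a swap, assuming an interpreter that implements the ',' option as
spelled in Proposal I.  The helper swap_separators is a hypothetical
name, not part of the proposal; it uses str.translate so the
comma/period swap happens in one pass instead of three replace() calls.

```python
def swap_separators(text):
    """Exchange commas and periods in an already formatted number."""
    return text.translate(str.maketrans(",.", ".,"))


n = 1234567.891

# Width 15, comma grouping, two decimal places.
grouped = format(n, "15,.2f")
print(repr(grouped))            # '   1,234,567.89'

# Any other separator is a single replace() away.
underscored = format(1234567, ",d").replace(",", "_")
print(underscored)              # 1_234_567

# The one awkward case: commas and periods traded places.
swapped = swap_separators(format(n, ",.2f"))
print(swapped)                  # 1.234.567,89
```
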


Raymond


