[Tutor] String Formatting

Kent Johnson kent_johnson at skillsoft.com
Fri Sep 10 18:08:16 CEST 2004


I'm always up for an optimization challenge :-) I tried several methods. 
Other than the locale-based solution, they all work for non-negative 
integers only. Here is the fastest one I found. It takes about 1/7 the 
time of the locale version (1.05 secs versus 7.31 secs in the timings 
below) and wouldn't be too hard to modify to work for negative integers 
also.

def commafy(val):
     ''' Straightforward approach using string slices '''
     s = str(val)
     start = len(s) % 3

     chunks = []
     if start > 0:
         chunks.append(s[:start])

     for i in range(start, len(s), 3):
         chunks.append(s[i:i+3])

     return ','.join(chunks)
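
As a rough sketch of the negative-integer modification mentioned above: 
passing a negative value straight to commafy() goes wrong because the '-' 
is counted as a digit (str(-100) has length 4, so the result is '-,100'). 
One possible fix is to group the digits of abs(val) and put the sign back 
afterwards. The name commafy_signed is made up for this illustration and is 
not one of the timed versions.

```python
def commafy_signed(val):
    ''' Slice-and-join approach, extended to handle negative integers '''
    sign = ''
    if val < 0:
        sign = '-'
    s = str(abs(val))        # group the digits only, without the sign
    start = len(s) % 3

    chunks = []
    if start > 0:
        chunks.append(s[:start])

    for i in range(start, len(s), 3):
        chunks.append(s[i:i+3])

    return sign + ','.join(chunks)
```

The extra abs() call and string concatenation will cost a little time, so 
the timings quoted in this message do not apply to this variant.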

Note that much of the time of the locale version is spent querying and 
switching locales. If you can set the locale once before doing the 
conversions, it is about 2.4x faster (3.08 secs versus 7.31 secs) than the 
version that sets and restores the locale on every call.
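
If the caller's locale still has to be restored afterwards, the "set it 
once" idea might be sketched like this: switch the locale a single time 
around a whole batch of conversions instead of around each one. This is 
only an illustration, written for Python 3. The names temporary_locale and 
commafy_all are invented here, and it uses locale.format_string because 
locale.format is not available in recent Python versions.

```python
import locale
from contextlib import contextmanager


@contextmanager
def temporary_locale(name):
    ''' Hypothetical helper: switch LC_ALL to name, then restore it '''
    old = locale.setlocale(locale.LC_ALL)      # query only, no change
    locale.setlocale(locale.LC_ALL, name)      # may raise locale.Error
    try:
        yield
    finally:
        locale.setlocale(locale.LC_ALL, old)   # always restore


def commafy_all(values, name):
    ''' Format many integers with a single locale switch '''
    with temporary_locale(name):
        return [locale.format_string('%d', v, grouping=True)
                for v in values]
```

Which locale names are accepted depends on the platform, so 'en', 'en_US' 
or 'en_US.UTF-8' are all assumptions about what is installed. With the 'C' 
locale no grouping is applied at all.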

Here are all the versions I tried and a timing harness to compare them. The 
output I get is this:
commafy1: 1.052021 secs
commafy2: 1.116770 secs
commafy3: 1.067124 secs
commafy4: 2.469994 secs
commafy5: 1.125030 secs
commafyLocale: 7.309395 secs
commafyLocale2: 3.077568 secs

Kent

###############################################
# commafy 1 to 3 are all variations on string slice and join
def commafy1(val):
     ''' Straightforward approach using string slices '''
     s = str(val)
     start = len(s) % 3

     chunks = []
     if start > 0:
         chunks.append(s[:start])

     for i in range(start, len(s), 3):
         chunks.append(s[i:i+3])

     return ','.join(chunks)


def commafy2(val):
     ''' Use list.extend() instead of an append loop '''
     s = str(val)
     start = len(s) % 3

     chunks = []
     if start > 0:
         chunks.append(s[:start])

     chunks.extend([s[i:i+3] for i in range(start, len(s), 3)])

     return ','.join(chunks)


def commafy3(val):
     ''' Use a list comprehension instead of a loop. The initial
         segment is a problem. '''
     s = str(val)
     start = len(s) % 3

     chunks = [s[i:i+3] for i in range(start, len(s), 3)]
     if start > 0:
         chunks.insert(0, s[:start])

     return ','.join(chunks)


def commafy4(val):
     ''' Iterate over the input and insert the commas directly '''
     s = list(str(val))
     s.reverse()
     i = iter(s)
     result = []
     while True:
         try:
             result.append(i.next())
             result.append(i.next())
             result.append(i.next())
             result.append(',')
         except StopIteration:
             if result[-1] == ',': result.pop()
             break
     result.reverse()
     return ''.join(result)


def commafy5(val):
     ''' Use divmod to make the chunks '''
     if val == 0: return '0'

     chunks = []
     while val > 999:
         val, rem = divmod(val, 1000)
         chunks.append(str(rem).zfill(3))
     chunks.append(str(val))
     chunks.reverse()
     return ','.join(chunks)


import locale
def commafyLocale(val):
     ''' Official solution using locale module '''
     oldloc = locale.setlocale(locale.LC_ALL)
     locale.setlocale(locale.LC_ALL, 'en')
     result = locale.format('%d', val, True)
     locale.setlocale(locale.LC_ALL, oldloc)
     return result


locale.setlocale(locale.LC_ALL, 'en')   # This line speeds up the ABOVE version by 30% !
def commafyLocale2(val):
     ''' Use locale module but don't change locale '''
     result = locale.format('%d', val, True)
     return result


# timing test
import timeit
def test(f):
     return [f(10**i) for i in range(11)]

correctAnswer = [
'1',
'10',
'100',
'1,000',
'10,000',
'100,000',
'1,000,000',
'10,000,000',
'100,000,000',
'1,000,000,000',
'10,000,000,000',
]

def timeOne(fn):
     # First run the function and check that it gets the correct results
     actualAnswer = test(fn)
     if actualAnswer != correctAnswer:
         print fn.__name__, 'does not give the correct answer'
         print actualAnswer
         return

     # Now time it
     setup = "from __main__ import test, " + fn.__name__
     stmt = 'test(%s)' % fn.__name__

     t = timeit.Timer(stmt, setup)
     secs = t.timeit(10000)
     print '%s: %f secs' % (fn.__name__, secs)


fnsToTest = [
     commafy1,
     commafy2,
     commafy3,
     commafy4,
     commafy5,
     commafyLocale,
     commafyLocale2,
]

for fn in fnsToTest:
     timeOne(fn)
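

The versions above only handle integers, while the original question asked 
for output like 1,000.00. A minimal sketch of how the slice-and-join idea 
could be stretched to cover that: format the number with a fixed number of 
decimals first, then group only the part before the decimal point. The name 
commafy_amount and its places parameter are made up for this example, and 
it has not been run through the timing harness.

```python
def commafy_amount(amount, places=2):
    ''' Hypothetical helper: comma-group a number with fixed decimals '''
    sign = ''
    if amount < 0:
        sign = '-'
    # Fixed-point text of the magnitude, e.g. '1234567.89'
    text = '%.*f' % (places, abs(amount))
    whole, dot, frac = text.partition('.')

    # Same slicing as commafy1, applied to the integer part only
    start = len(whole) % 3
    chunks = []
    if start > 0:
        chunks.append(whole[:start])
    for i in range(start, len(whole), 3):
        chunks.append(whole[i:i+3])

    return sign + ','.join(chunks) + dot + frac


print('Your Account has %s' % commafy_amount(1000))
```

On Python 2.7 and 3.1 or later, format(amount, ',.2f') produces the same 
kind of output with no helper at all, which is probably the simpler choice 
where those versions are available.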


At 10:34 AM 9/10/2004 -0400, Isr Gish wrote:
>Thanks orbitz,
>
>But...
>
>    >oldloc = locale.setlocale(locale.LC_ALL)
>    >locale.setlocale(locale.LC_ALL, 'en_US')
>    >locale.format('%d', some_num, True)
>    >locale.setlocale(locale.LC_ALL, oldloc)
>    >
>
>I would rather not use this way, it takes about 25 times longer than a 
>regular % format. And I'm calling it about 1,000,000 times.
>
>Isr
>
>-----Original Message-----
>    >From: "orbitz"<orbitz at ezabel.com>
>    >Sent: 9/9/04 10:55:22 PM
>    >To: "Isr Gish"<isrgish at fastem.com>, "tutor at python.org"<tutor at python.org>
>    >Subject: Re: [Tutor] String Formatting
>    >
>    >Easiest way would probably be using locale.format in some code I do:
>    >
>    >oldloc = locale.setlocale(locale.LC_ALL)
>    >locale.setlocale(locale.LC_ALL, 'en_US')
>    >locale.format('%d', some_num, True)
>    >locale.setlocale(locale.LC_ALL, oldloc)
>    >
>    >
>    >Isr Gish wrote:
>    >
>    >>Hi,
>    >>
>    >>How can I format an integer in a format string with a comma, for example
>    >>print 'Your Account has %f' %amount
>    >>That should print:
>    >>Your Account has 1,000.00
>    >>
>    >>Thanks
>    >>Isr
>    >>
>    >>_______________________________________________
>    >>Tutor maillist  -  Tutor at python.org
>    >>http://mail.python.org/mailman/listinfo/tutor
>    >>
>    >>
>    >>
>    >
>


