[Tutor] String Formatting
Kent Johnson
kent_johnson at skillsoft.com
Fri Sep 10 18:08:16 CEST 2004
I'm always up for an optimization challenge :-) I tried several methods.
Other than the locale-based solution, they all work for non-negative
integers only. Here is the fastest one I found. It takes about 1/10 the
time of the locale version and wouldn't be too hard to modify to work for
negative integers also.
def commafy(val):
    ''' Straightforward approach using string slices '''
    s = str(val)
    start = len(s) % 3
    chunks = []
    if start > 0:
        chunks.append(s[:start])
    for i in range(start, len(s), 3):
        chunks.append(s[i:i+3])
    return ','.join(chunks)
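The function above handles non-negative integers only: str(-100) is four characters long, so the slicing would produce '-,100'. One possible way to extend it, as a minimal sketch in Python 3 syntax, is to split off the sign before grouping the digits. The name commafy_signed is hypothetical and is not from the original post.

```python
def commafy_signed(val):
    """Hypothetical variant of commafy() that also accepts negative integers."""
    sign = '-' if val < 0 else ''
    s = str(abs(val))            # group only the digits, never the sign
    start = len(s) % 3
    chunks = []
    if start > 0:
        chunks.append(s[:start])
    for i in range(start, len(s), 3):
        chunks.append(s[i:i+3])
    return sign + ','.join(chunks)

print(commafy_signed(-1234567))
print(commafy_signed(-100))
print(commafy_signed(0))
```

The extra abs() call and string concatenation add a small cost per call, so this variant would need to be timed separately before comparing it with the figures in this post.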
Note that much of the time of the locale version is in accessing and
switching locales. If you can set the locale once before doing the
conversions, it is about 3x faster than the version that sets and restores
the locale.
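The "set the locale once" idea can be sketched as follows for Python 3, where locale.format_string is the available call (the code in this post uses the older locale.format). The locale name 'en_US.UTF-8' and the function name commafy_locale_once are assumptions for illustration. If that locale is not installed, the sketch falls back to the 'C' locale, in which no grouping separators are inserted.

```python
import locale

# Set the locale a single time, up front, instead of on every call.
# Assumption: 'en_US.UTF-8' may not exist on every system, so fall back to 'C'.
try:
    locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
except locale.Error:
    locale.setlocale(locale.LC_ALL, 'C')

def commafy_locale_once(val):
    """Format with grouping, relying on the locale that was set once above."""
    return locale.format_string('%d', val, grouping=True)

formatted = [commafy_locale_once(10**i) for i in range(11)]
print(formatted[-1])
```

Changing the locale affects the whole process, so this approach fits best when the program can commit to one locale for its entire run.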
Here are all the versions I tried and a timing harness to compare them. The
output I get is this:
commafy1: 1.052021 secs
commafy2: 1.116770 secs
commafy3: 1.067124 secs
commafy4: 2.469994 secs
commafy5: 1.125030 secs
commafyLocale: 7.309395 secs
commafyLocale2: 3.077568 secs
Kent
###############################################
# commafy 1 to 3 are all variations on string slice and join
def commafy1(val):
    ''' Straightforward approach using string slices '''
    s = str(val)
    start = len(s) % 3
    chunks = []
    if start > 0:
        chunks.append(s[:start])
    for i in range(start, len(s), 3):
        chunks.append(s[i:i+3])
    return ','.join(chunks)
def commafy2(val):
    ''' Use list.extend() instead of an append loop '''
    s = str(val)
    start = len(s) % 3
    chunks = []
    if start > 0:
        chunks.append(s[:start])
    chunks.extend([s[i:i+3] for i in range(start, len(s), 3)])
    return ','.join(chunks)
def commafy3(val):
    ''' Use a list comprehension instead of a loop. The initial
    segment is a problem. '''
    s = str(val)
    start = len(s) % 3
    chunks = [s[i:i+3] for i in range(start, len(s), 3)]
    if start > 0:
        chunks.insert(0, s[:start])
    return ','.join(chunks)
def commafy4(val):
    ''' Iterate over the input and insert the commas directly '''
    s = list(str(val))
    s.reverse()
    i = iter(s)
    result = []
    while True:
        try:
            result.append(i.next())
            result.append(i.next())
            result.append(i.next())
            result.append(',')
        except StopIteration:
            if result[-1] == ',': result.pop()
            break
    result.reverse()
    return ''.join(result)
def commafy5(val):
    ''' Use divmod to make the chunks '''
    if val == 0: return '0'
    chunks = []
    while val > 999:
        val, rem = divmod(val, 1000)
        chunks.append(str(rem).zfill(3))
    chunks.append(str(val))
    chunks.reverse()
    return ','.join(chunks)
import locale

def commafyLocale(val):
    ''' Official solution using locale module '''
    oldloc = locale.setlocale(locale.LC_ALL)
    locale.setlocale(locale.LC_ALL, 'en')
    result = locale.format('%d', val, True)
    locale.setlocale(locale.LC_ALL, oldloc)
    return result

# This line speeds up the ABOVE version by 30% !
locale.setlocale(locale.LC_ALL, 'en')

def commafyLocale2(val):
    ''' Use locale module but don't change locale '''
    result = locale.format('%d', val, True)
    return result
# timing test
import timeit

def test(f):
    return [f(10**i) for i in range(11)]

correctAnswer = [
    '1',
    '10',
    '100',
    '1,000',
    '10,000',
    '100,000',
    '1,000,000',
    '10,000,000',
    '100,000,000',
    '1,000,000,000',
    '10,000,000,000',
]

def timeOne(fn):
    # First run the function and check that it gets the correct results
    actualAnswer = test(fn)
    if actualAnswer != correctAnswer:
        print fn.__name__, 'does not give the correct answer'
        print actualAnswer
        return

    # Now time it
    setup = "from __main__ import test, " + fn.__name__
    stmt = 'test(%s)' % fn.__name__
    t = timeit.Timer(stmt, setup)
    secs = t.timeit(10000)
    print '%s: %f secs' % (fn.__name__, secs)

fnsToTest = [
    commafy1,
    commafy2,
    commafy3,
    commafy4,
    commafy5,
    commafyLocale,
    commafyLocale2,
]

for fn in fnsToTest:
    timeOne(fn)
At 10:34 AM 9/10/2004 -0400, Isr Gish wrote:
>Thanks orbitz,
>
>But...
>
> >oldloc = locale.setlocale(locale.LC_ALL)
> >locale.setlocale(locale.LC_ALL, 'en_US')
> >locale.format('%d', some_num, True)
> >locale.setlocale(locale.LC_ALL, oldloc)
> >
>
>I would rather not use this approach; it takes about 25 times longer than a
>regular % format, and I'm calling it about 1,000,000 times.
>
>Isr
>
>-----Original Message-----
> >From: "orbitz"<orbitz at ezabel.com>
> >Sent: 9/9/04 10:55:22 PM
> >To: "Isr Gish"<isrgish at fastem.com>, "tutor at python.org"<tutor at python.org>
> >Subject: Re: [Tutor] String Formatting
> >
> >The easiest way would probably be locale.format. In some code I do:
> >
> >oldloc = locale.setlocale(locale.LC_ALL)
> >locale.setlocale(locale.LC_ALL, 'en_US')
> >locale.format('%d', some_num, True)
> >locale.setlocale(locale.LC_ALL, oldloc)
> >
> >
> >Isr Gish wrote:
> >
> >>Hi,
> >>
> >>How can I format an integer in a format string with a comma? For example:
> >>print 'Your Account has %f' %amount
> >>That should print:
> >>Your Account has 1,000.00
> >>
> >>Thanks
> >>Isr
> >>
> >>_______________________________________________
> >>Tutor maillist - Tutor at python.org
> >>http://mail.python.org/mailman/listinfo/tutor
> >>
> >>
> >>
> >
>