[New-bugs-announce] [issue14744] Use _PyUnicodeWriter API in str.format() internals

STINNER Victor report at bugs.python.org
Tue May 8 00:09:21 CEST 2012


New submission from STINNER Victor <victor.stinner at gmail.com>:

Since 7be716a47e9d (issue #14716), str.format() uses the "unicode_writer" API. I propose continuing in this direction to eliminate more temporary buffers.

Python 3.3 (one result per benchmark command listed further down, in the same order):

1000000 loops, best of 3: 0.573 usec per loop
100000 loops, best of 3: 16.4 usec per loop
1000000 loops, best of 3: 0.705 usec per loop
100000 loops, best of 3: 2.61 usec per loop

Python 3.3 + patch (compared to Python 3.3):

1000000 loops, best of 3: 0.516 usec per loop (-10%)
100000 loops, best of 3: 13.2 usec per loop (-20%)
1000000 loops, best of 3: 0.574 usec per loop (-18%)
100000 loops, best of 3: 2.59 usec per loop (-1%)
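
The percentages above can be rechecked from the two result lists. The following is only a small sketch of that arithmetic; the names baseline, patched and relative_change are made up for illustration, and the figures in the message are whole-percent approximations of what it prints.

```python
# Timings in usec per loop, copied from the two result lists above.
baseline = [0.573, 16.4, 0.705, 2.61]   # Python 3.3
patched = [0.516, 13.2, 0.574, 2.59]    # Python 3.3 + patch


def relative_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


changes = [relative_change(o, n) for o, n in zip(baseline, patched)]

for old, new, pct in zip(baseline, patched, changes):
    print("%.3g -> %.3g usec: %+.1f%%" % (old, new, pct))
```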

--

If this patch is accepted, it becomes possible to go even deeper: for example, use _PyUnicodeWriter in long_to_decimal_string().

--

Benchmark Python 3 / Python 2 bytes:

python -m timeit -s 'fmt="{0}.{1}.{2}"' 'fmt.format("http", "client", "HTTPConnection")'
python -m timeit -s 'fmt="{0:s}"*100' 'fmt.format("ABCDEF")'
python -m timeit -s 'fmt=" [line {0:2d}] "' 'fmt.format(5)'
python -m timeit -s 'fmt="x={} y={} z={}"' 'fmt.format(12345, 12.345, 12.345+2j)'

Benchmark Python 2 unicode:

python -m timeit -s 'fmt=u"{0}.{1}.{2}"' 'fmt.format(u"http", u"client", u"HTTPConnection")'
python -m timeit -s 'fmt=u"{0:s}"*100' 'fmt.format(u"ABCDEF")'
python -m timeit -s 'fmt=u" [line {0:2d}] "' 'fmt.format(5)'
python -m timeit -s 'fmt=u"x={} y={} z={}"' 'fmt.format(12345, 12.345, 12.345+2j)'
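
For convenience, the Python 3 commands can also be run in a single process with the timeit module. This is only a sketch, not the script used for the numbers in this message: the BENCHMARKS list and run_benchmarks() are hypothetical names, and the default loop count is an arbitrary choice, whereas "python -m timeit" picks the loop count automatically.

```python
import timeit

# (setup, statement) pairs copied from the Python 3 command lines above.
BENCHMARKS = [
    ('fmt="{0}.{1}.{2}"', 'fmt.format("http", "client", "HTTPConnection")'),
    ('fmt="{0:s}"*100', 'fmt.format("ABCDEF")'),
    ('fmt=" [line {0:2d}] "', 'fmt.format(5)'),
    ('fmt="x={} y={} z={}"', 'fmt.format(12345, 12.345, 12.345+2j)'),
]


def run_benchmarks(number=10000, repeat=3):
    """Return the best time in usec per loop for each benchmark."""
    results = []
    for setup, stmt in BENCHMARKS:
        # timeit.repeat returns one total time (in seconds) per repetition.
        totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
        results.append(min(totals) / number * 1e6)
    return results


if __name__ == "__main__":
    for (setup, stmt), usec in zip(BENCHMARKS, run_benchmarks()):
        print("%-60s %.3g usec per loop" % (stmt, usec))
```

Absolute timings depend on the machine and the build, so only the ratios between interpreters measured on the same machine are meaningful.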

Python 2.7 bytes:

1000000 loops, best of 3: 0.393 usec per loop
100000 loops, best of 3: 9.72 usec per loop
1000000 loops, best of 3: 0.337 usec per loop
1000000 loops, best of 3: 1.56 usec per loop

Python 2.7 unicode (wide build):

1000000 loops, best of 3: 0.443 usec per loop
100000 loops, best of 3: 10.3 usec per loop
1000000 loops, best of 3: 0.785 usec per loop
100000 loops, best of 3: 2.48 usec per loop

Python 3.2 wide:

1000000 loops, best of 3: 0.457 usec per loop
100000 loops, best of 3: 10.5 usec per loop
1000000 loops, best of 3: 0.538 usec per loop
100000 loops, best of 3: 2.36 usec per loop
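
To compare the builds side by side, the result lists in this message can be expressed as ratios to one reference build. The sketch below uses Python 3.2 wide as the reference, which is an arbitrary choice made here for illustration; the timings dict and the ratios() helper are hypothetical names. Note that the Python 2.7 bytes row measures a different string type.

```python
# Timings in usec per loop, copied from the result lists in this message.
# Each list follows the order of the four benchmark commands.
timings = {
    "Python 2.7 bytes": [0.393, 9.72, 0.337, 1.56],
    "Python 2.7 wide": [0.443, 10.3, 0.785, 2.48],
    "Python 3.2 wide": [0.457, 10.5, 0.538, 2.36],
    "Python 3.3": [0.573, 16.4, 0.705, 2.61],
    "Python 3.3 + patch": [0.516, 13.2, 0.574, 2.59],
}

# Reference build chosen for illustration only.
reference = timings["Python 3.2 wide"]


def ratios(values, ref):
    """Return each timing divided by the matching reference timing."""
    return [v / r for v, r in zip(values, ref)]


table = {name: ratios(values, reference) for name, values in timings.items()}

for name, row in table.items():
    print("%-20s %s" % (name, "  ".join("%.2fx" % r for r in row)))
```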

----------
components: Interpreter Core
files: format_writer.patch
keywords: patch
messages: 160176
nosy: haypo, loewis, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Use _PyUnicodeWriter API in str.format() internals
versions: Python 3.3
Added file: http://bugs.python.org/file25490/format_writer.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14744>
_______________________________________

