[issue15573] Support unknown formats in memoryview comparisons

Stefan Krah report at bugs.python.org
Tue Aug 14 12:07:35 CEST 2012


Stefan Krah added the comment:

Here is a patch implementing by-value comparisons for all format strings
understood by the struct module. It is slightly longer than promised, because
larger arrays need a cached unpacking object to get acceptable performance.
The fast path for identical single-element native format strings is
unchanged.

The new comparison rules are stated in the memoryview docs.
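
As a rough illustration of what by-value comparison means here, the following
sketch assumes an interpreter with this change applied (3.3 or later). The
variable names are only for illustration and do not come from the patch.

```python
import array

# Two exporters with different struct formats but equal element values.
a = array.array('I', [1, 2, 3, 4, 5])
b = array.array('d', [1.0, 2.0, 3.0, 4.0, 5.0])

x = memoryview(a)
y = memoryview(b)

# With by-value comparison the formats may differ, as long as the
# unpacked elements compare equal.
same_values = (x == y)

# A single differing element makes the views unequal.
c = array.array('d', [1.0, 2.0, 3.0, 4.0, 5.5])
different_values = (x == memoryview(c))

print(same_values, different_values)
```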


For Georg's benefit, here are the memoryobject.c changes and the reasons why
I think the patch can go into 3.3:

  o cmp_structure() is split into cmp_format() and cmp_shape(), with
    unchanged semantics.

  o The new section "unpack using the struct module" is largely identical
    to existing parts of _testbuffer.c:

      - struct_get_unpacker()  ==> see _testbuffer.c:ndarray_as_list()

      - struct_unpack_single() ==> see base case in _testbuffer.c:unpack_rec()

  o The new code is called only in what was previously the default case of
    unpack_cmp().

  o The new code has 100% coverage.
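
The idea of caching an unpacking object can be sketched in Python. This is
not the C code from the patch: compare_by_value() is a hypothetical helper,
it handles only one-dimensional contiguous buffers with native formats, and
it only shows why building the struct.Struct object once per operand (rather
than once per element) matters for larger arrays.

```python
import struct
from array import array


def compare_by_value(a, b):
    """Compare two 1-D buffers element by element (illustrative only)."""
    va, vb = memoryview(a), memoryview(b)
    if va.ndim != 1 or vb.ndim != 1 or va.shape != vb.shape:
        return False

    # Build each unpacker once and reuse it for every element.
    unpack_a = struct.Struct(va.format).unpack_from
    unpack_b = struct.Struct(vb.format).unpack_from

    # Byte views of the underlying memory, so offsets are in bytes.
    raw_a, raw_b = va.cast('B'), vb.cast('B')

    for i in range(va.shape[0]):
        x = unpack_a(raw_a, i * va.itemsize)
        y = unpack_b(raw_b, i * vb.itemsize)
        if x != y:  # tuples of unpacked Python values
            return False
    return True


print(compare_by_value(array('B', [1, 2, 3]), array('b', [1, 2, 3])))
print(compare_by_value(array('d', [1.0, 2.0]), array('f', [1.0, 2.0])))
```

Unpacking every element into Python objects is much slower than a raw memory
comparison, which is consistent with the gap between the identical-format and
different-format timings below.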



Performance:
============

Identical format, bytes:
------------------------

$ ./python -m timeit -n 1000 -s "import array; x = array.array('B', [1]*10000); y = array.array('B', [1]*10000);" "x == y"
1000 loops, best of 3: 116 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('B', [1]*10000); y = array.array('B', [1]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 49.1 usec per loop


Identical format, double:
-------------------------

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('d', [1.0]*10000);" "x == y"
1000 loops, best of 3: 319 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('d', [1.0]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 65.7 usec per loop


Different format ('B', 'b'):
----------------------------

$ ./python -m timeit -n 100 -s "import array; x = array.array('B', [1]*10000); y = array.array('b', [1]*10000);" "x == y"
100 loops, best of 3: 131 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('B', [1]*10000); y = array.array('b', [1]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 3.42 msec per loop


Different format ('d', 'f'):
----------------------------

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('f', [1.0]*10000);" "x == y"
1000 loops, best of 3: 315 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('f', [1.0]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 3.59 msec per loop

----------
Added file: http://bugs.python.org/file26800/issue15573-struct.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15573>
_______________________________________

