[Cryptography-dev] hash.SHA256 cpu expensive per byte versus byte-string?

Frank Siebenlist frank.siebenlist at gmail.com
Mon Jul 11 23:42:26 EDT 2016


I ran into some unexpected timing results while using pyca/cryptography’s hashes.SHA256,
and I’m wondering whether something is wrong with the discrepancy I see between two different hashing approaches.

When I hash a single byte-string of 10 million bytes, it takes two to three orders of magnitude less time than when I loop over the same bytes and hash them one at a time.

Please look at the following bare-bones snippet:

—
from __future__ import absolute_import, division, print_function
import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.backends import default_backend
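# Two independent hash contexts: d1 for the one-shot update, d2 for the per-byte loop
d1 = hashes.Hash(algorithm=hashes.SHA256(),backend=default_backend())
d2 = d1.copy()
# Total number of bytes to hash
n = 10000000
print('n:', n)
# b is a single byte; bs is that same byte repeated n times
b = b'a'
ba = bytearray(n*b'a')
bs = bytes(ba)
# Case 1: hash all n bytes with a single update() call
s = time.time()
d1.update(bs)
t = time.time() - s
print('ba: ', t)
print(d1.finalize())
# Case 2: hash the same n bytes with n separate one-byte update() calls
s = time.time()
for i in range(n):
    d2.update(b)
t = time.time() - s
print('b: ', t)
print(d2.finalize())
#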
—

The output is:

—
/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/bin/python3.5 /Users/franksiebenlist/git/pyvate23/src/pyvate/messagedigest_tst.py
n: 10000000
ba:  0.027185916900634766
b'\x01\xf4\xa8|\x04\xb4\n\xf5\x9a\xad\xc0\xe8\x12)5\tp\x9c\x9a\x87c\xa6\x0b\x7f\x9e\x1903"\xf8\xb0<'
b:  15.677960872650146
b'\x01\xf4\xa8|\x04\xb4\n\xf5\x9a\xad\xc0\xe8\x12)5\tp\x9c\x9a\x87c\xa6\x0b\x7f\x9e\x1903"\xf8\xb0<'

Process finished with exit code 0
—

Results for python 2 and 3 are similar.

I understand that the loop may involve a few more object creations and casts, but more than 500 times slower (about 15.7 s versus 0.027 s)… that was an unexpected surprise.
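
One way to check whether the gap comes from per-call overhead rather than from the hashing itself is to keep the total byte count fixed and vary only the size of each update() call. The following is a minimal sketch under assumptions: it uses the standard library's hashlib.sha256 as a stand-in for pyca/cryptography so that it runs without extra packages, and time_chunked_sha256 is a hypothetical helper name, not part of either library. Absolute timings will differ from the numbers above.

```python
import hashlib
import time


def time_chunked_sha256(total_bytes, chunk_size):
    """Hash total_bytes copies of b'a' in chunk_size pieces.

    Returns (elapsed_seconds, hex_digest). Hypothetical helper for
    illustration; hashlib stands in for pyca/cryptography here.
    """
    digest = hashlib.sha256()
    chunk = b"a" * chunk_size
    full_chunks, remainder = divmod(total_bytes, chunk_size)
    start = time.perf_counter()
    for _ in range(full_chunks):
        digest.update(chunk)
    if remainder:
        # Feed the leftover bytes so every run hashes the same input.
        digest.update(chunk[:remainder])
    elapsed = time.perf_counter() - start
    return elapsed, digest.hexdigest()


if __name__ == "__main__":
    n = 1000000  # smaller than the 10 million above to keep the run short
    for size in (1, 64, 4096, n):
        elapsed, hexdigest = time_chunked_sha256(n, size)
        print("chunk size %8d: %.4f s  %s" % (size, elapsed, hexdigest[:16]))
```

Every row should show the same digest prefix, since only the call pattern changes. If the time falls sharply as the chunk size grows, the cost is dominated by the number of update() calls rather than by the number of bytes hashed.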

Comments?
Observations?

Thanks, Frank.


