[Cryptography-dev] Destroying keys and secrets?

Wed Feb 21 10:51:19 EST 2018

> On Feb 20, 2018, at 11:00 , cryptography-dev-request at python.org wrote:
> Date: Mon, 19 Feb 2018 17:14:25 -0800
> From: Paul Kehrer <paul.l.kehrer at gmail.com>
> To: cryptography-dev at python.org
> Subject: Re: [Cryptography-dev] Cryptography-dev Digest, Vol 54, Issue
> 	2
> Message-ID:
> 	<CABj5TKTv8cP6XGdCrABxF6qXJNZoipufe6_xy2y_N+R95KSuTA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Access to Python's memory (via side channel or dumping as root) is not part
> of pyca/cryptography's threat model at this time so we don't attempt to
> protect against it. Making it part of our threat model would be difficult
> due in part to the reasons you stated above as well as the difficulty in
> writing tests to prevent regression, but let's talk about what CPython does
> in this case.

Paul,

	Thank you for your enlightening comments above and below.

What Am I Really Trying To Do?

This function starts life as an AWS Lambda function that I also intend to support on Google Cloud and MS Azure. This august group of engineers obviously knows of my concern, but I nonetheless wish to emphasize the problem. Perhaps I can also contribute to its solution? While I have some control over the function’s environment, I don’t have total control. For example, while I can hope the host OS has enough entropy in its pool to support os.urandom(), the AWS recommendation, with which I concur, is that I get my random_bytes from the AWS KMS service and stir them up with an HKDF. As in:

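import os
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

backend = default_backend()
random_bytes = os.urandom(512//8)  # <= Should come from AWS KMS random bytes service.

# Derive a SECP384R1 private key using SHA-512, _salt512, and guid (both bytes, defined elsewhere).
hkdf = HKDF(algorithm=hashes.SHA512(), length=384//8, salt=_salt512, info=guid, backend=backend)
secret_bytes = hkdf.derive(random_bytes)
secret_int = int.from_bytes(secret_bytes, byteorder='big')
private_key = ec.derive_private_key(secret_int, ec.SECP384R1(), backend)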

As I understand it, each of the cloud vendors offers a high entropy source of random bytes. Upon completion of this function, I want to scrub RAM of key material. I will never get another chance to address the existence of this key material. There is no goodbye kiss from the cloud function execution environments. These are one-shot function calls. If I am to ensure that this crypto toxic waste doesn’t come back to bite my service, then I must dispose of it in the function context that created it. 
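
To make the scrubbing goal concrete, here is a minimal, best-effort sketch using only the standard library. The helper name scrub is hypothetical, not part of pyca/cryptography. It only works because a bytearray is mutable; the immutable bytes object returned by os.urandom(), and any int or bytes copies made later, cannot be overwritten this way and remain until the allocator reuses their memory.

```python
import os


def scrub(buf: bytearray) -> None:
    """Overwrite a mutable buffer with zeros in place (best effort)."""
    for i in range(len(buf)):
        buf[i] = 0


# Hold the secret in a mutable bytearray rather than in immutable bytes.
# Note: the temporary bytes object from os.urandom() is itself a copy
# that this sketch cannot scrub.
secret = bytearray(os.urandom(384 // 8))
try:
    pass  # ... use the secret here ...
finally:
    scrub(secret)  # runs even if the work above raises
```

This clears only the one buffer the function owns. It says nothing about copies made inside library calls, which is the heart of my concern.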

In other words, pyca/cryptography is likely to be used in a much more dynamic environment than heretofore. Considering that pyca/cryptography has largely succeeded in building a civilized interface to crypto routines, helping folks implement good secret/key hygiene seems in scope for the project’s goals. Considering Amazon’s embrace of pyca/cryptography for their Python Lambda functions and AWS Encryption SDK, I am unlikely to be the sole user.

> int.from_bytes will unfortunately make a copy (to a Python integer). That
> int will then be copied into a BN via _int_to_bn (
> https://github.com/pyca/cryptography/blob/master/src/cryptography/hazmat/backends/openssl/backend.py#L317-L346)
> when you call derive_private_key. It will actually be converted twice (a
> thing we should fix) (
> https://github.com/pyca/cryptography/blob/master/src/cryptography/hazmat/backends/openssl/backend.py#L1383-L1419).
> Although the resulting BNs will themselves be zeroed as freed, this means a
> secret scalar bytestring created in Python will be resident in memory no
> less than 5 times (3 byte strings, 2 numbers).
> 
> Obviously the next logical question is why you'd provide a Python integer
> when we're just going to convert it back to big endian bytes anyway.
> Disregarding the memory clearing issue it's also inefficient. When
> originally designing some of the APIs we made a mistake and chose integers
> instead of big endian bytes (see: numbers classes). We have not yet added
> alternate APIs to potentially enable us to deprecate numbers because the
> improvement in efficiency probably isn't worth the pain of trying to
> convert the huge number of users of those classes.
> 
> ec.derive_private_key_from_bytes(secret_bytes, ec.SECP384R1(), backend)
> could potentially be a way to do this specific operation while reducing the
> number of copies (to zero in Python and 2-3 in OpenSSL, although the latter
> are zeroed), but without tests that can detect non-required copies of
> secret material it would be extremely hard to prevent regression in the
> long term as the code is updated.

	Thank you for the above excellent exposition of the state of data copying in pyca/cryptography. Considering the appropriate "Sturm und Drang" that accompanied Meltdown, Spectre, and other exploits around key material, I think many developers have an interest in making the changes to their code that allow them to ensure precise key material lifetimes. Of course, with modern internet survey tools, the pyca/cryptography team can easily ask their users if they would make these changes.
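
As a small illustration of the Python-side copies described in the quoted text, the following standard-library sketch shows that each conversion between bytes and int produces a new, immutable object. The function name python_side_copies is hypothetical, and the sketch does not model the BN copies made inside OpenSSL.

```python
def python_side_copies(secret_bytes: bytes):
    """Return the extra objects created by a bytes -> int -> bytes round trip."""
    # int.from_bytes builds a separate, immutable Python int.
    secret_int = int.from_bytes(secret_bytes, byteorder="big")
    # Converting back builds yet another bytes object.
    round_trip = secret_int.to_bytes(len(secret_bytes), byteorder="big")
    return secret_int, round_trip


original = bytes(range(1, 49))  # 48 placeholder bytes, not a real secret
as_int, as_bytes = python_side_copies(original)

# Same value, but distinct objects in memory.
assert as_bytes == original
assert as_bytes is not original
```

Because these objects are immutable, dropping the references only releases them; as far as I know, a release build of CPython does not zero the memory when it is freed.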

	To your point about regression testing the copying of key material: I place this in the category of expertise of the library developer. There are plenty of similar cryptographic issues that the pyca/cryptography team has had to tackle through discipline and intra-project communication. Obviously, because your team already clears and then frees items passed to and from OpenSSL, you know how to do this. How can I help?

	I also note that ec.derive_private_key() has a paired routine where pyca/cryptography takes responsibility for the secret_bytes: ec.generate_private_key(). If it manages its secret copies carefully, I might consider using it instead of providing my own secret_bytes. It would, of course, depend upon pyca/cryptography’s high entropy data source (os.urandom()?). As this is in the hazmat section of the library, perhaps a discussion in the documentation about digital toxic/hazardous waste is appropriate? How can I help?

> Given your chosen constraints have you considered deriving a key in
> subprocess, serializing it, and reading it from stdout in the parent
> process? By doing this you'd have a precisely defined intermediate object
> lifetime and the only secret in the parent process's memory would be a
> single DER or PEM bytestring containing the EC key.

	That is one excellent way to achieve my goal in a normal execution environment. All of the industry trends lead me to believe that environments like AWS Lambda will host a high percentage of crypto operations. Due to their small scope and statelessness, they lend themselves to focused and highly tuned solutions. They are a natural place for the cryptographic core functions of many systems to reside. Put all of your keys in one basket and then watch that basket very carefully.
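
For reference, the subprocess pattern suggested above might be sketched as follows. This is only an outline under stated assumptions: CHILD_SOURCE and derive_in_subprocess are hypothetical names, and the child here is a stand-in that emits random bytes. A real child would derive the EC key with pyca/cryptography and write its DER or PEM serialization to stdout.

```python
import subprocess
import sys

# Stand-in child program (an assumption for illustration): it emits 48
# random bytes as hex. A real child would derive and serialize the key.
CHILD_SOURCE = (
    "import os, sys\n"
    "sys.stdout.write(os.urandom(48).hex())\n"
)


def derive_in_subprocess() -> bytes:
    """Run the child and return whatever it serialized to stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        check=True,  # raise if the child exits with an error
    )
    return bytes.fromhex(result.stdout.decode("ascii"))


serialized = derive_in_subprocess()
```

All intermediate objects live and die inside the child process, so the parent holds only the serialized result. What happens to the child's freed pages is up to the operating system, not Python.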

	Thank you for taking the time to examine my concerns. 

Anon,
Andrew
____________________________________
Andrew W. Donoho
Donoho Design Group, L.L.C.
andrew.donoho at gmail.com, +1 (512) 666-7596, twitter.com/adonoho

No risk, no art.
	No art, no reward.
		-- Seth Godin