From frank.siebenlist at gmail.com  Fri Jul  1 12:11:10 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Fri, 1 Jul 2016 09:11:10 -0700
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
Message-ID: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>

Many times you will have two parties with a shared symmetric key that
they will use to communicate authenticated and private messages to
each other. If you have multiple keys, then you somehow have to match
the key to the received message based on the context, the sender, or
some key identifier that both parties associate with the used key.

I'm looking for a good symmetric key identifier to use without the
need for context or any pre-shared key-identifier. Some standardized
way to derive a key-id from the key itself, such that both parties can
derive it independently without any pre-shared key specific knowledge.
Of course that key identifier shouldn't reveal anything that could
compromise the key itself.

I haven't been able to find a well-established way to achieve this (yet)...

One possible solution could be to just take the sha256 of the key.
As long as the key is truly random... that should be ok (?).
It could conflict with possible derived keys that are generated that way.

Or maybe using one of the available KDFs?
Those should be one-way-functions that wouldn't leak anything(?)
Maybe use a well-known nonce to avoid any possible collisions with derived-keys.

Any suggestions? Anything I missed?

Regards, Frank.

From _ at lvh.io  Fri Jul  1 12:51:12 2016
From: _ at lvh.io (lvh)
Date: Fri, 1 Jul 2016 11:51:12 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
Message-ID: <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>

Hi Frank,

> On Jul 1, 2016, at 11:11 AM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
> 
> snip snip key identifiers

This is why some key derivation functions and PRFs have "purpose" or "info" fields, yes; including BLAKE2 and HKDF. Deriving a lesser key (which might just be a keyid) is a perfectly valid strategy from objcap practice. I'm doing something similar in the scheme of a larger semiprivate key scheme using libsodium. You probably do want something that explicitly supports that instead of just implicitly picking a particular nonce or whatever - I'm not sure which nonce you're referring to, I don't think the systems you mentioned take one. TL;DR: make the derivation completely distinct based on what you're deriving and why you're deriving it :)
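
A minimal sketch of purpose-separated derivation, assuming Python's standard hashlib.blake2b and its "person" (personalization) parameter stand in for the "purpose" field. The helper name derive and the purpose strings are made up for illustration; they do not come from libsodium, caesium, or any standard.

```python
import hashlib
import os

def derive(key: bytes, purpose: bytes, size: int = 16) -> bytes:
    """Keyed BLAKE2b over an empty message, personalized by purpose.

    purpose must be at most 16 bytes (the BLAKE2b personalization size).
    """
    return hashlib.blake2b(key=key, person=purpose, digest_size=size).digest()

key = os.urandom(32)

# Hypothetical purpose labels: one output is meant to be shown as an
# identifier, the other is meant to stay secret as a subkey.
key_id = derive(key, b"example-kid-v1")
subkey = derive(key, b"example-enc-v1", 32)
```

Because the purpose label is an input to the hash, the identifier and the subkey come out as unrelated values even though both derive from the same key.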

You might also want to look at the related concept of NMR and key-wrap, which might let you solve the problem at a slightly different part of your protocol; essentially giving you a protected key with associated data about that key. It's not entirely clear what the people standardizing GCM-SIV want to do exactly (other than "not TLS", I don't think they've said), but this is the obvious choice, especially given GCM-SIV's separate code path for tiny messages and the historical linking of the two from a crypto design perspective.

I am also writing NMR stuff on the side in libsodium/caesium, but that focuses mostly on being a Fernet replacement, rather than a keywrap, using secretbox (which makes it easy because big nonce space). Pretty sure I can translate it to the AEAD schemes, but the security proof gets iffier. Which reminds me: we should talk about Clojure bindings to libsodium some time :)


lvh

From frank.siebenlist at gmail.com  Fri Jul  1 13:54:44 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Fri, 1 Jul 2016 10:54:44 -0700
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
Message-ID: <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>

Hi lvh,

Guess you're the "lvh" who is responsible for "lvh/caesium" ;-).  Good
to see that you've reanimated that project! Believe you were kind of
distracted for a while, which "forced" me to play around with
"franks42/naclj"... which has been on life support for about a year
now, because my new job consumes even my playtime.

As part of that "franks42/naclj" effort, I suggested standardizing
the derivation of a kid from the two curve25519 public keys. However,
I recognize that you do not always have DH keys available when you
have a bare symmetric key, so I suggested a scheme based on blake2. I
wrote up some rationale for those choices here:
"https://github.com/franks42/naclj/blob/master/Keys%2C%20IDs%2C%20and%20URNs.md",
but it never got much traction on the libsodium list,... and then I got
distracted.

Now I'm faced again with similar key-management issues, which could
benefit from such key-derived kids, so I'm trying again.

In summary, your suggestions all resonate very well, but... there are
too many of them. Let's just pick one identifier derivation mechanism
for symmetric keys, document it, implement it, use it!

Groetjes, Frank.

> _______________________________________________
> Cryptography-dev mailing list
> Cryptography-dev at python.org
> https://mail.python.org/mailman/listinfo/cryptography-dev

From _ at lvh.io  Fri Jul  1 18:53:12 2016
From: _ at lvh.io (lvh)
Date: Fri, 1 Jul 2016 17:53:12 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
Message-ID: <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>


> On Jul 1, 2016, at 12:54 PM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
> 
> Hi lvh,
> 
> Guess you're the "lvh" who is responsible for "lvh/caesium" ;-).

Yup. I'm also a founding member of PyCA and the resident cryptographer, which is why I'm on this list :-)

> Good to see that you've reanimated that project! Believe you were kind of
> distracted for a while, which "forced" me to play around with
> "franks42/naclj"... which has been on life support for about a year
> now, because my new job consumes even my playtime.

It did what I needed it to do at the time, so I didn't fix what wasn't broken ;-) I don't recall anyone reaching out or filing issues. Once someone did ask questions and contributed code, I was happy to merge/review/cut new releases/do new development. More dev is happening now to scratch my own itch :)

Currently I'm doing a lot of work around NMR as mentioned before, and API design around e.g. different byte buffer types, so that for example you can efficiently dump a nonce and a ciphertext in the same buffer, or derive multiple keys in one iteration of BLAKE2, etc. Also a bunch of work around e.g. pinning and verification of the produced binding and related benchmarking :)

I invite you to look at caesium again, because some of the criticisms you make in naclj's README no longer apply (e.g. caesium no longer uses kalium and instead binds libsodium directly, albeit for a different reason than what naclj mentions). Because the binding is done in Clojure, it can do all sorts of metaprogramming including binding every permutation of a particular method for various byte types in addition to the inspection mentioned above, e.g.: https://github.com/lvh/caesium/blob/master/src/caesium/binding.clj#L56-L62

Do you intend to continue to develop naclj, or is it effectively retired?

> As part of that "franks42/naclj" effort, I suggested to standardize
> the derivation of a kid from the two curve25519 public keys. However,
> I recognize that you do not always have any DH-keys available when you
> have a bare symmetric key,

Is that scheme documented anywhere? I wonder what the use case is for two curve25519 pubkeys - the "obvious" case would seem to easily degenerate to the shared symmetric secret (after doing a DH exchange).

> so I suggested a scheme based on blake2. I
> wrote up some rationale for those choices here:
> "https://github.com/franks42/naclj/blob/master/Keys%2C%20IDs%2C%20and%20URNs.md",
> but never got much traction on the libsodium list,... and then I got
> distracted.
> 
> Now I'm faced again with similar key-management issues, which could
> benefit from such key-derived kid's - so I try again.
> 
> In summary, your suggestions all resonate very well, but... there are
> too many of them. Let's just pick one identifier derivation mechanism
> for symmetric keys, document it, implement it, use it!

I think there are a few problems preventing this from happening right now, including:

- Historically, cryptographers have not researched key wrap anywhere near as much as other schemes. I think the only reason it's en vogue now is the interest in NMR, which at least a handful of cryptographers (Rogaway, Krovetz, and humbly, myself) care about now, and is incidentally a related problem.
- People want subtly different things for their protocols, further reducing interest. Do you just want to identify a key? That's fine, but a problem many protocols dodge. Do you want to ship a key to someone who already has a secret or asymmetric key? AEAD (including NMR AEAD in particular, so key wrap) and just asymmetric encryption (a la non-PFS TLS or GPG) is probably where you're going to land.
- How does this fit in a grander protocol and what is that protocol trying to accomplish?
- How is the key identifier authenticated? What prevents Mallory from just modifying the key id bytes to effectively deny service? E.g. if I'm doing this to make sure I can rotate keys effectively, how do I auth that? Ideally without replacing an unrotatable secret key with another unrotatable secret key :D (Effective key rotation for KEKs is definitely something I care about.)
- When keys are being sent alongside messages, how do we make this not a footgun for e.g. key selection attacks? (Granted, harder for EdDSA, but I want protocols to be correct for arbitrary schemes :)). PyCA cares about recipes being not footguns. That's a mixed bag: on the one hand, it means we can give safe advice, on the other hand, it does mean that all we have is Fernet...

Overall, I think this is a reasonable idea for some protocols, but I think we need to be extremely clear about what that is, who it's for, and how to use it.


lvh



From _ at lvh.io  Fri Jul  1 18:56:33 2016
From: _ at lvh.io (lvh)
Date: Fri, 1 Jul 2016 17:56:33 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
Message-ID: <74CD0FB6-06AD-4601-BEEE-34F27BCCDBB6@lvh.io>

Esprit de l'escalier: there's also the difference between public-parameter hashes and a PRF, and BLAKE2 will do both for you. So, are you trying to identify a key in such a way that Eve cannot detect the key being reused (but Bob shares a key with you), or is that OK?
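
One way to read that distinction, sketched with Python's standard hashlib.blake2b. This is an illustrative interpretation rather than a worked-out scheme: the function names, the "example-kid" label, and the idea of sending a fresh nonce next to a per-message tag are all assumptions made for the example.

```python
import hashlib
import hmac
import os

def static_kid(key: bytes) -> bytes:
    # Unkeyed hash of the key: the same bytes appear on every message,
    # so an observer can tell when a key is being reused.
    return hashlib.blake2b(key, digest_size=16, person=b"example-kid").digest()

def per_message_kid(key: bytes, nonce: bytes) -> bytes:
    # Keyed BLAKE2b (used as a PRF) over a fresh nonce: the tag changes
    # from message to message for the same key.
    return hashlib.blake2b(nonce, key=key, digest_size=16).digest()

def find_key(keys, nonce: bytes, tag: bytes):
    # The receiver cannot index a table by the tag; it tries each key it holds.
    for candidate in keys:
        if hmac.compare_digest(per_message_kid(candidate, nonce), tag):
            return candidate
    return None

keys = [os.urandom(32) for _ in range(3)]
nonce = os.urandom(16)
tag = per_message_kid(keys[1], nonce)
found = find_key(keys, nonce, tag)
```

The trade-off is that the static identifier supports a direct table lookup, while the per-message variant costs one PRF call per candidate key, which is only practical for a small key set.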

lvh

From frank.siebenlist at gmail.com  Sat Jul  2 19:52:27 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Sat, 2 Jul 2016 16:52:27 -0700
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
Message-ID: <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>

Hi Laurens,

I'm afraid that I have not been very good at explaining my use case, because the questions you ask point at more complicated solutions than I thought were necessary.

The aim is to find the most convenient symmetric key identifier to embed in a cipher message that would require the minimum amount of key management.

What is best depends on the context. Sometimes it's easy because there is only one key, or the security context is so unambiguous that associating the right key is trivial. Other times it's a bit more challenging. We have an existing application with tens of long-lived keys, and the current key-management complicates key-rotation and upgrades to modern algos and such.

If both Alice and Bob can generate key identifiers (kids) directly from the key that they share, i.e. derive them from the symmetric key itself, then there is no need to exchange or agree upon a name for that key, as it would be kind of a "true name" (read Vinge if you haven't ;-) ). The parties only have to agree on the key identifier derivation method.

For example, Alice and Bob could agree to name their symmetric keys by taking the sha256 of the key's bytes, base64url-encoding the hash, and representing it as a urn, like "urn:s256:V2jyhd8tX-19vpEhyrDzIHgUYyDA5MS1Qi71iw1SUP0". This would allow both parties to maintain their own key-db with (kid, key) associations. Embedding the kid in the exchanged cipher messages would allow both parties to easily find the key to decrypt the received message.
(very much like we often use the hash of the public key (or pk-cert) to identify the private key to decrypt)
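
As a rough sketch of that naming convention in Python (standard library only): the function name sha256_kid and the dict-based key-db are illustrative assumptions, and the base64 padding is stripped on the assumption that the 43-character example above is unpadded.

```python
import base64
import hashlib
import os

def sha256_kid(key: bytes) -> str:
    """Name a key as urn:s256:<unpadded base64url of SHA-256(key)>."""
    digest = hashlib.sha256(key).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "urn:s256:" + encoded

# Each party keeps its own (kid -> key) table, filled in independently.
shared_key = os.urandom(32)
alice_db = {sha256_kid(shared_key): shared_key}
bob_db = {sha256_kid(shared_key): shared_key}

# Alice embeds the kid in her message; Bob uses it to find the key.
kid_in_message = sha256_kid(shared_key)
key_for_decryption = bob_db[kid_in_message]
```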

The kid embedded in the cipher message is no more than a "hint". It could be signed as part of the whole cipher message, but its integrity can only be confirmed after the message is decrypted and authenticated. Changing the kid in a cipher message results in DoS, but so would flipping any other bit in that message.
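
To illustrate the "hint" behaviour, here is a deliberately simplified sketch that only authenticates (HMAC-SHA256 over kid plus payload) and leaves encryption out; in a real design the kid would more likely be bound as associated data of an AEAD. The message layout, the 16-byte truncated kid, and the names kid_of, seal and open_message are all assumptions made for the example.

```python
import hashlib
import hmac
import os

def kid_of(key: bytes) -> bytes:
    # Hypothetical kid: truncated SHA-256 over a label and the key.
    return hashlib.sha256(b"example-kid:" + key).digest()[:16]

def seal(key: bytes, payload: bytes) -> bytes:
    kid = kid_of(key)
    tag = hmac.new(key, kid + payload, hashlib.sha256).digest()
    return kid + tag + payload          # layout: 16-byte kid, 32-byte tag, payload

def open_message(key_db: dict, message: bytes):
    kid, tag, payload = message[:16], message[16:48], message[48:]
    key = key_db.get(kid)               # the kid is used only to pick a candidate key
    if key is None:
        return None                     # unknown kid: message cannot be processed
    expected = hmac.new(key, kid + payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None

k1, k2 = os.urandom(32), os.urandom(32)
db = {kid_of(k1): k1, kid_of(k2): k2}
message = seal(k1, b"hello")
```

Swapping in the kid of another key the receiver holds makes the lookup succeed but the tag check fail, so a modified kid ends in rejection, the same outcome as a flipped bit elsewhere in the message.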

In its simplest form, I believe that the kid-derivation could be a sha2 of the key as long as the key is "truly" random. The only concern may be that some use a simple hash of the key for key derivation...(?). To avoid any of those usage collisions, you could define the convention of prepending the key with some publicly known constant, like b'pre-kid-constant' or fancier.
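
A small sketch of the prefix convention, using the placeholder constant from the paragraph above; the function name prefixed_kid is made up for the example.

```python
import hashlib

KID_PREFIX = b"pre-kid-constant"    # placeholder constant, not a standard value

def prefixed_kid(key: bytes) -> bytes:
    # Hash the public constant followed by the key bytes.
    return hashlib.sha256(KID_PREFIX + key).digest()

demo_key = bytes(32)                # all-zero key, for illustration only
kid = prefixed_kid(demo_key)
plain_hash = hashlib.sha256(demo_key).digest()
```

The kid no longer equals the plain sha256 of the key, so it cannot coincide with a subkey that some other component derives by hashing the key directly.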

If one believes that a simple sha2 hash is only borderline secure enough (?), then maybe use a CMAC or HMAC, where you use the key on the key-value itself, and the resulting tag would constitute the identifier. (I did something like that in franks42/naclj with blake2)
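
The HMAC variant could look roughly like this with Python's standard hmac module. The first function follows the description literally (the key is used to tag its own bytes); the second, with a fixed public label as the message, is an alternative added here as an assumption and not something proposed in the thread.

```python
import hashlib
import hmac

def hmac_kid_self(key: bytes) -> bytes:
    # HMAC-SHA256 keyed with the key, over the key bytes themselves.
    return hmac.new(key, key, hashlib.sha256).digest()

def hmac_kid_label(key: bytes, label: bytes = b"example-kid-label") -> bytes:
    # Alternative: HMAC-SHA256 keyed with the key, over a public label.
    return hmac.new(key, label, hashlib.sha256).digest()

demo_key = bytes(range(32))         # fixed demo key, for illustration only
kid_a = hmac_kid_self(demo_key)
kid_b = hmac_kid_label(demo_key)
```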

Or use HKDF, with maybe a kid-derivation specific constant for the salt, a kid-specific info value, and a sufficient length of the resulting key, i.e. identifier, that makes everybody happy.

Maybe something based on HKDF would be best (?).
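
For concreteness, an HKDF-based derivation might be sketched as follows. HKDF (RFC 5869) is written out with the standard hmac module so the example is self-contained; the cryptography package also ships an HKDF class in cryptography.hazmat.primitives.kdf.hkdf. The salt, info value, and 16-byte output length are placeholder assumptions, not agreed values.

```python
import hashlib
import hmac

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF with HMAC-SHA256 (RFC 5869): extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by a block of zero bytes.
    prk = hmac.new(salt or bytes(32), ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

KID_SALT = b"example-kid-salt"      # hypothetical kid-specific salt
KID_INFO = b"example-kid-info-v1"   # hypothetical kid-specific info value

def hkdf_kid(key: bytes, length: int = 16) -> bytes:
    return hkdf_sha256(key, KID_SALT, KID_INFO, length)

demo_kid = hkdf_kid(bytes(range(32)))
```

Both parties only need to agree on the salt, the info value, and the output length to compute the same identifier from the shared key.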

Hope this additional explanation helps.

Thanks, Frank.

PS. Don't believe I will resurrect that franks42/naclj - I'll add a note about deprecation and send them to your effort - it was a good experience learning about Curve/Ed25519 and the nacl/libsodium code though - also trying to keep all data structures as immutable as possible was a good exercise.



From frank.siebenlist at gmail.com  Mon Jul  4 14:24:54 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Mon, 4 Jul 2016 11:24:54 -0700
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
Message-ID: <CAJGvTrTzfa+-mr2x697NyWJvzcrySPivWZYDOxJLnrqbqTybiw@mail.gmail.com>

To make it a little more real, please look at this gist:

https://gist.github.com/franks42/b8b28049adcdf4504271238391c3525b

which implements a key identifier generation based on HKDF.

Any security concerns with such an approach?

Better alternatives?

Thanks, Frank.


On Sat, Jul 2, 2016 at 4:52 PM, Frank Siebenlist
<frank.siebenlist at gmail.com> wrote:
> Hi Laurens,
>
> I'm afraid that I have not been very good in explaining my use case, because the questions you ask point at more complicated solutions than I thought were necessary.
>
> The aim is to find the most convenient symmetric key identifier to embed in a cipher message that would require the minimum amount of key management.
>
> What is best depends on the context. Sometimes it's easy because there is only one key, or the security context is so unambiguous that associating the right key is trivial. Other times it's a bit more challenging. We have an existing application with tens of long-lived keys, and the current key-management complicates key-rotation and upgrades to modern algos and such.
>
> If both Alice and Bob can generate key identifiers (kid's) from the key that they share directly, like derive if from the symmetric key, then there is no need to exchange or agree upon a name for that key as it would be kind of a "true name" (read Vinge if you haven't ;-) ). The parties only have to agree on the key identifier derivation method.
>
> For example, if Alice and Bob agree to name their symmetric keys by taking the sha256 of that key's bytes, base64url encode the hash, and represent it as a urn, like "urn:s256:V2jyhd8tX-19vpEhyrDzIHgUYyDA5MS1Qi71iw1SUP0". This would allow both parties to maintain their own key-db with (kid, key) associations. Embedding the kid in the exchanged cipher messages would allow both parties to easily find the key to decrypt the received message.
> (very much like we often use the hash of the public key (or pk-cert) to identify the private key to decrypt)
>
> The kid embedded in the cipher message is no more than a ?hint?. It could be signed as part of the whole cipher message, but its integrity can only be confirmed after the message is decrypted&authenticated. Changing the kid in a cipher message results in DoS, but so would flipping any other bit in that message.
>
> In its most simple form, I believe that the kid-derivation could be a sha2 of the key as long as the key is "truly" random. The only concern may be that some use a simple hash of the key for key derivation...(?). To avoid any of those usage collisions, you could define the convention of pre-pending the key with some publicly know constant, like b'pre-kid-constant' or fancier.
>
> If one believes that a simple sha2 hash is only borderline enough secure (?), then maybe use a CMAC or HMAC, where you use the key on the key-value itself, and the resulting tag would constitute the identifier. (I did something like that in franks42/naclj with blake2)
>
> Or use HKDF, with maybe a kid-derivation specific constant for the salt, a kid-specific info value, and a sufficient length of the resulting key, i.e. identifier, that makes everybody happy.
>
> Maybe something based on HKDF would be best (?).
>
> Hope this additional explanation helps.
>
> Thanks, Frank.
>
> PS. Don?t believe I will resurrect that franks42/naclj - I?ll add a note about depreciation and send them to your effort - it was a good experience learning about Curve/Ed25519 and the nacl/libsodium code though - also trying to keep all data structures as immutable as possible was a good exercise.
>
>
> On Fri, Jul 1, 2016 at 3:53 PM, lvh <_ at lvh.io> wrote:
>>
>>> On Jul 1, 2016, at 12:54 PM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
>>>
>>> Hi lvh,
>>>
>>> Guess you're the "lvh" who is responsible for "lvh/caesium" ;-).
>>
>> Yup. I?m also a founding member of PyCA and the resident cryptographer, which is why I?m on this list :-)
>>
>>> Good to see that you've reanimated that project! Believe you were kind of
>>> distracted for awhile, which "forced" me to play around with
>>> "franks42/naclj"... which has been on live-support for about a year
>>> now, because my new job consumes even my playtime.
>>
>> It did what I needed it to do at the time, so I didn't fix what wasn't broken ;-) I don't recall anyone reaching out or filing issues. Once someone did ask questions and contributed code, I was happy to merge/review/cut new releases/do new development. More dev is happening now to scratch my own itch :)
>>
>> Currently I'm doing a lot of work around NMR as mentioned before, and API design around e.g. different byte buffer types, so that for example you can efficiently dump a nonce and a ciphertext in the same buffer, or derive multiple keys in one iteration of BLAKE2, etc. Also a bunch of work around e.g. pinning and verification of the produced binding and related benchmarking :)
>>
>> I invite you to look at caesium again, because some of the criticisms you make in naclj's README no longer apply (e.g. caesium no longer uses kalium and instead binds libsodium directly, albeit for a different reason than what naclj mentions). Because the binding is done in Clojure, it can do all sorts of metaprogramming including binding every permutation of a particular method for various byte types in addition to the inspection mentioned above, e.g.: https://github.com/lvh/caesium/blob/master/src/caesium/binding.clj#L56-L62
>>
>> Do you intend to continue to develop naclj, or is it effectively retired?
>>
>>> As part of that "franks42/naclj" effort, I suggested to standardize
>>> the derivation of a kid from the two curve25519 public keys. However,
>>> I recognize that you do not always have any DH-keys available when you
>>> have a bare symmetric key,
>>
>> Is that scheme documented anywhere? I wonder what the use case is for two curve25519 pubkeys - the "obvious" case would seem to easily degenerate to the shared symmetric secret (after doing a DH exchange).
>>
>>> so I suggested a scheme based on blake2. I
>>> wrote up some rationale for those choices here:
>>> "https://github.com/franks42/naclj/blob/master/Keys%2C%20IDs%2C%20and%20URNs.md",
>>> but never got much traction on the libsodium list,... and then I got
>>> distracted.
>>>
>>> Now I'm faced again with similar key-management issues, which could
>>> benefit from such key-derived kid's - so I try again.
>>>
>>> In summary, your suggestions all resonate very well, but... there are
>>> too many of them. Let's just pick one identifier derivation mechanism
>>> for symmetric keys, document it, implement it, use it!
>>
>> I think there are a few problems preventing this from happening right now, including:
>>
>> - Historically, cryptographers have not researched key wrap anywhere near as much as other schemes. I think the only reason it's en vogue now is the interest in NMR, which at least a handful of cryptographers (Rogaway, Krovetz, and humbly, myself) care about now, and is incidentally a related problem.
>> - People want subtly different things for their protocols, further reducing interest. Do you just want to identify a key? That's fine, but a problem many protocols dodge. Do you want to ship a key to someone who already has a secret or asymmetric key? AEAD (including NMR AEAD in particular, so key wrap) and just asymmetric encryption (a la non-PFS TLS or GPG) is probably where you're going to land.
>> - How does this fit in a grander protocol and what is that protocol trying to accomplish?
>> - How is the key identifier authenticated? What prevents Mallory from just modifying the key id bytes to effectively deny service? E.g. if I'm doing this to make sure I can rotate keys effectively, how do I auth that? Ideally without replacing an unrotatable secret key with another unrotatable secret key :D (Effective key rotation for KEKs is definitely something I care about.)
>> - When keys are being sent alongside messages, how do we make this not a footgun for e.g. key selection attacks? (Granted, harder for EdDSA, but I want protocols to be correct for arbitrary schemes :)). PyCA cares about recipes being not footguns. That's a mixed bag: on the one hand, it means we can give safe advice, on the other hand, it does mean that all we have is Fernet...
>>
>> Overall, I think this is a reasonable idea for some protocols, but I think we need to be extremely clear about what that is, who it's for, and how to use it.
>>
>>
>> lvh
>>
>>> Groetjes, Frank.
>>>
>>> On Fri, Jul 1, 2016 at 9:51 AM, lvh <_ at lvh.io> wrote:
>>>> Hi Frank,
>>>>
>>>>> On Jul 1, 2016, at 11:11 AM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
>>>>>
>>>>> snip snip key identifiers
>>>>
>>>> This is why some key derivation functions and PRFs have "purpose" or "info" fields, yes; including BLAKE2 and HKDF. Deriving a lesser key (which might just be a keyid) is a perfectly valid strategy from objcap practice. I'm doing something similar in the scheme of a larger semiprivate key scheme using libsodium. You probably do want something that explicitly supports that instead of just implicitly picking a particular nonce or whatever - I'm not sure which nonce you're referring to, I don't think the systems you mentioned take one. TL;DR: make the derivation completely distinct based on what you're deriving and why you're deriving it :)
>>>>
>>>> You might also want to look at the related concept of NMR and key-wrap, which might let you solve the problem at a slightly different part of your protocol; essentially giving you a protected key with associated data about that key. It's not entirely clear what the people standardizing GCM-SIV want to do exactly (other than "not TLS", I don't think they've said), but this is the obvious choice, especially given GCM-SIV's separate code path for tiny messages and the historical linking of the two from a crypto design perspective.
>>>>
>>>> I am also writing NMR stuff on the side in libsodium/caesium, but that focuses mostly on being a Fernet replacement, rather than a keywrap, using secretbox (which makes it easy because big nonce space). Pretty sure I can translate it to the AEAD schemes, but the security proof gets iffier. Which reminds me: we should talk about Clojure bindings to libsodium some time :)
>>>>
>>>>
>>>> lvh
>>>> _______________________________________________
>>>> Cryptography-dev mailing list
>>>> Cryptography-dev at python.org
>>>> https://mail.python.org/mailman/listinfo/cryptography-dev

From _ at lvh.io  Tue Jul  5 10:39:08 2016
From: _ at lvh.io (lvh)
Date: Tue, 5 Jul 2016 09:39:08 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <CAJGvTrTzfa+-mr2x697NyWJvzcrySPivWZYDOxJLnrqbqTybiw@mail.gmail.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
 <CAJGvTrTzfa+-mr2x697NyWJvzcrySPivWZYDOxJLnrqbqTybiw@mail.gmail.com>
Message-ID: <E1F5B1F7-9FD6-4ADB-B98A-3D7695479EA4@lvh.io>

My apologies for the delay in replying; I've been busy taking time off and spending 4th of July weekend with my family. I'll write a reply to this soon, it's just that it will probably be a long one ;)

> On Jul 4, 2016, at 1:24 PM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
> 
> To make it a little more real, please look at this gist:
> 
> https://gist.github.com/franks42/b8b28049adcdf4504271238391c3525b
> 
> which implements a key identifier generation based on HKDF.
> 
> Any security concerns with such an approach?
> 
> Better alternatives?
> 
> Thanks, Frank.
> 
> 
> 
> On Sat, Jul 2, 2016 at 4:52 PM, Frank Siebenlist
> <frank.siebenlist at gmail.com> wrote:
>> Hi Laurens,
>> 
>> I'm afraid that I have not been very good in explaining my use case, because the questions you ask point at more complicated solutions than I thought were necessary.
>> 
>> The aim is to find the most convenient symmetric key identifier to embed in a cipher message that would require the minimum amount of key management.
>> 
>> What is best depends on the context. Sometimes it's easy because there is only one key, or the security context is so unambiguous that associating the right key is trivial. Other times it's a bit more challenging. We have an existing application with tens of long-lived keys, and the current key-management complicates key-rotation and upgrades to modern algos and such.
>> 
>> If both Alice and Bob can generate key identifiers (kid's) from the key that they share directly, like derive it from the symmetric key, then there is no need to exchange or agree upon a name for that key as it would be kind of a "true name" (read Vinge if you haven't ;-) ). The parties only have to agree on the key identifier derivation method.
>> 
>> For example, if Alice and Bob agree to name their symmetric keys by taking the sha256 of that key's bytes, base64url encode the hash, and represent it as a urn, like "urn:s256:V2jyhd8tX-19vpEhyrDzIHgUYyDA5MS1Qi71iw1SUP0". This would allow both parties to maintain their own key-db with (kid, key) associations. Embedding the kid in the exchanged cipher messages would allow both parties to easily find the key to decrypt the received message.
>> (very much like we often use the hash of the public key (or pk-cert) to identify the private key to decrypt)
>> 
>> The kid embedded in the cipher message is no more than a "hint". It could be signed as part of the whole cipher message, but its integrity can only be confirmed after the message is decrypted&authenticated. Changing the kid in a cipher message results in DoS, but so would flipping any other bit in that message.
>> 


From _ at lvh.io  Wed Jul  6 13:13:43 2016
From: _ at lvh.io (lvh)
Date: Wed, 6 Jul 2016 12:13:43 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
Message-ID: <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>

Hi, 

> On Jul 2, 2016, at 6:52 PM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
> The aim is to find the most convenient symmetric key identifier to embed in a cipher message that would require the minimum amount of key management.
> 
> What is best depends on the context. Sometimes it's easy because there is only one key, or the security context is so unambiguous that associating the right key is trivial. Other times it's a bit more challenging. We have an existing application with tens of long-lived keys, and the current key-management complicates key-rotation and upgrades to modern algos and such.

I'm assuming these keys are symmetric and don't live in an HSM (since you seem to be able to perform arbitrary computation with them)?

> If both Alice and Bob can generate key identifiers (kid's) from the key that they share directly, like derive it from the symmetric key, then there is no need to exchange or agree upon a name for that key as it would be kind of a "true name" (read Vinge if you haven't ;-) ). The parties only have to agree on the key identifier derivation method.
> 
> For example, if Alice and Bob agree to name their symmetric keys by taking the sha256 of that key's bytes, base64url encode the hash, and represent it as a urn, like "urn:s256:V2jyhd8tX-19vpEhyrDzIHgUYyDA5MS1Qi71iw1SUP0". This would allow both parties to maintain their own key-db with (kid, key) associations. Embedding the kid in the exchanged cipher messages would allow both parties to easily find the key to decrypt the received message.
> (very much like we often use the hash of the public key (or pk-cert) to identify the private key to decrypt)
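
A minimal sketch of the SHA-256 naming convention described in the quoted paragraph, using only the Python standard library. The function name sha256_kid and the placeholder key are made up for illustration; the "urn:s256:" prefix and the unpadded base64url form are taken from the 43-character example in the quote.

```python
import base64
import hashlib


def sha256_kid(key: bytes) -> str:
    """Derive a URN-style key identifier from the raw key bytes."""
    digest = hashlib.sha256(key).digest()
    # base64url with the trailing '=' padding stripped (32 bytes -> 43 chars)
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "urn:s256:" + b64


# Both parties can compute the same identifier independently.
shared_key = bytes(range(32))  # placeholder key, for illustration only
print(sha256_kid(shared_key))
```

Each side would then store (kid, key) pairs in its own key-db and look the key up by the kid found in a received message.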

For some more context on "it depends what you want to accomplish, and generic schemes are hard"; Bob and Alice may also want to have key ids that only work for _them_ - e.g. Bob and Alice's static DH keys are used to generate a shared secret used for an AD key wrap scheme.

> The kid embedded in the cipher message is no more than a "hint". It could be signed as part of the whole cipher message, but its integrity can only be confirmed after the message is decrypted&authenticated. Changing the kid in a cipher message results in DoS, but so would flipping any other bit in that message.

Does a failed decryption cause Bob to reject the message, or just try all the other keys? If so, what's the benefit over just giving keys names, like sequence numbers or even strings?

> In its most simple form, I believe that the kid-derivation could be a sha2 of the key as long as the key is "truly" random. The only concern may be that some use a simple hash of the key for key derivation...(?). To avoid any of those usage collisions, you could define the convention of prepending the key with some publicly known constant, like b'pre-kid-constant' or fancier.

SHA2's problems are a little less obvious when inputs are fixed length, but keys aren't always - I'd recommend a SHA3-era hash like BLAKE2b or SHA-3 itself to not have to worry about that part at all :)
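
As a rough sketch of what that could look like with Python's hashlib.blake2b (in the standard library since 3.6): the personalization label b"kid-v1" and the 16-byte output length are assumptions picked for illustration, not values anyone in this thread has agreed on. The personalization parameter plays the role of the "publicly known constant" from the quoted paragraph, without having to prepend anything to the key by hand.

```python
import hashlib

KID_PERSON = b"kid-v1"  # hypothetical label; BLAKE2b allows at most 16 bytes


def blake2b_kid(key: bytes, size: int = 16) -> str:
    """Hash the key under a kid-only personalization; return it as hex."""
    h = hashlib.blake2b(key, digest_size=size, person=KID_PERSON)
    return h.hexdigest()


shared_key = bytes(range(32))  # placeholder key, for illustration only
print(blake2b_kid(shared_key))
```

Because the personalization is mixed into the hash parameters, the result differs from a plain BLAKE2b hash of the same key, so a kid cannot collide with a value someone derived by simply hashing the key.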

> If one believes that a simple sha2 hash is only borderline secure enough (?), then maybe use a CMAC or HMAC, where you use the key on the key-value itself, and the resulting tag would constitute the identifier. (I did something like that in franks42/naclj with blake2)

What's the key used to compute the MAC? (In this case, I think what you _really_ want is AD key wrapping schemes, including GCM-SIV's tiny mode).

> Or use HKDF, with maybe a kid-derivation specific constant for the salt, a kid-specific info value, and a sufficient length of the resulting key, i.e. identifier, that makes everybody happy.

I'd probably go with BLAKE2b if this is _all_ you're trying to do, but I think what you might really want is key wrap :)
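
For comparison, the HKDF variant from the quoted paragraph can be sketched with nothing but hmac and hashlib, following the extract-then-expand construction of RFC 5869. The salt and info constants and the 16-byte identifier length are hypothetical placeholders, and this is not the code from the gist; in practice the HKDF class in cryptography.hazmat.primitives.kdf.hkdf would be the more natural choice on this list.

```python
import hashlib
import hmac

KID_SALT = b"kid-derivation-salt"  # hypothetical public constant
KID_INFO = b"symmetric-key-id"     # hypothetical purpose label


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) over HMAC-SHA256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output is too long")
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step: T(n) = HMAC(PRK, T(n-1) | info | n)
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hkdf_kid(key: bytes, length: int = 16) -> str:
    """Derive a hex key identifier that is only ever used as an identifier."""
    return hkdf_sha256(key, KID_SALT, KID_INFO, length).hex()
```

The info value is what keeps the identifier separate from any other key derived from the same input with a different purpose.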

> Hope this additional explanation helps.

A little :) Is this for encryption at rest, with multiple recipients, where the recipients are assumed to already have all of the keys?

> PS. Don't believe I will resurrect that franks42/naclj - I'll add a note about deprecation and send them to your effort - it was a good experience learning about Curve/Ed25519 and the nacl/libsodium code though - also trying to keep all data structures as immutable as possible was a good exercise.

It definitely has. I'm working on a blog post (series of blog posts) on crypto API design, particularly in the context of libsodium and the JVM's plethora of byte types.


lvh



From frank.siebenlist at gmail.com  Wed Jul  6 14:22:05 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Wed, 6 Jul 2016 11:22:05 -0700
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
 <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>
Message-ID: <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>

Thanks for the detailed scrutiny!

Comments/answers in-line:

> ...
>> The aim is to find the most convenient symmetric key identifier to embed in a cipher message that would require the minimum amount of key management.
>>
>> What is best depends on the context. Sometimes it's easy because there is only one key, or the security context is so unambiguous that associating the right key is trivial. Other times it's a bit more challenging. We have an existing application with tens of long-lived keys, and the current key-management complicates key-rotation and upgrades to modern algos and such.
>
> I'm assuming these keys are symmetric and don't live in an HSM (since you seem to be able to perform arbitrary computation with them)?


Correct - you need access to the key's bytes for the key identifier
scheme I'm looking for.


>> If both Alice and Bob can generate key identifiers (kid's) directly from the key that they share, e.g. derive them from the symmetric key, then there is no need to exchange or agree upon a name for that key, as it would be a kind of "true name" (read Vinge if you haven't ;-) ). The parties only have to agree on the key identifier derivation method.
>>
>> For example, Alice and Bob could agree to name their symmetric keys by taking the sha256 of the key's bytes, base64url-encoding the hash, and representing it as a urn, like "urn:s256:V2jyhd8tX-19vpEhyrDzIHgUYyDA5MS1Qi71iw1SUP0". This would allow both parties to maintain their own key-db with (kid, key) associations. Embedding the kid in the exchanged cipher messages would allow both parties to easily find the key to decrypt the received message.
>> (very much like we often use the hash of the public key (or pk-cert) to identify the private key to decrypt)
>
> For some more context on "it depends what you want to accomplish, and generic schemes are hard"; Bob and Alice may also want to have key ids that only work for _them_ - e.g. Bob and Alice's static DH keys are used to generate a shared secret used for an AD key wrap scheme.


You're right - Alice may name the key she shares with Bob: "Bob's
key", while Bob may name the same key: "Alice's key" on his end. They
can/should use whatever name is easiest to construct the cipher
messages that they want to exchange with each other. However, the key
identifier that they embed inside of the cipher message cannot be a
local nickname, but should be one that both parties can use, like the
key identifier that I'm looking for.


>> The kid embedded in the cipher message is no more than a "hint". It could be signed as part of the whole cipher message, but its integrity can only be confirmed after the message is decrypted & authenticated. Changing the kid in a cipher message results in DoS, but so would flipping any other bit in that message.
>
> Does a failed decryption cause Bob to reject the message, or just try all the other keys? If so, what's the benefit over just giving keys names, like sequence numbers or even strings?


What you do with a failed decryption is an interesting question, but
I'm not sure why it's relevant for the key identifier scheme...
(if you loop through all the keys and find one that decrypts the
message even though it doesn't match the kid... could be fishy...).

You could use any key identifier you want, as long as both Alice and
Bob will know how to find the right key for that kid.
When you use uuids, or arbitrary names/strings, though, you require
Alice and Bob to agree on the (identifier, key) separately, before the
cipher message can be decrypted.
However, when you use an "intrinsic" identifier, like the one I'm
proposing, then both Alice and Bob can generate those kid's for all
the keys that they have and share, without any separate agreement -
they only have to agree on the kid-derivation method. That observation
is probably the main selling point.
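
A minimal sketch of such an "intrinsic" kid, assuming the sha256 +
base64url + urn convention mentioned earlier in this thread. The
function names, the prefix constant and the fixed demo keys are
hypothetical; real keys must be random.

```python
import base64
import hashlib

# Hypothetical public constant for domain separation, so the kid cannot
# collide with a key that somebody derives as a plain hash of the key.
KID_PREFIX = b"pre-kid-constant"


def derive_kid(key: bytes) -> str:
    """Derive a key identifier from the key bytes alone."""
    digest = hashlib.sha256(KID_PREFIX + key).digest()
    # base64url without padding, wrapped in a urn-like label.
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "urn:s256:" + b64


def build_key_db(keys):
    """Map kid -> key for every key a party holds."""
    return {derive_kid(k): k for k in keys}


# Alice and Bob each index their own keys; no identifiers are exchanged.
alice_db = build_key_db([b"\x01" * 32, b"\x02" * 32])
bob_db = build_key_db([b"\x02" * 32, b"\x03" * 32])
shared_kid = derive_kid(b"\x02" * 32)
print(shared_kid in alice_db and shared_kid in bob_db)
```

Both parties end up with the same kid for the key they share, while
keys held by only one party simply have no entry on the other side.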


>> In its simplest form, I believe that the kid-derivation could be a sha2 of the key as long as the key is "truly" random. The only concern may be that some use a simple hash of the key for key derivation...(?). To avoid any of those usage collisions, you could define the convention of prepending a publicly known constant to the key, like b'pre-kid-constant' or fancier.
>
> SHA2's problems are a little less obvious when inputs are fixed length, but keys aren't always - I'd recommend a SHA3-era hash like BLAKE2b or SHA-3 itself to not have to worry about that part at all :)
>
>> If one believes that a simple sha2 hash is only borderline secure enough (?), then maybe use a CMAC or HMAC, where you use the key on the key-value itself, and the resulting tag would constitute the identifier. (I did something like that in franks42/naclj with blake2)
>
> What's the key used to compute the MAC? (In this case, I think what you _really_ want is AD key wrapping schemes, including GCM-SIV's tiny mode).


For that blake2 scheme that I used in franks42/naclj, the
authentication-key is the key itself - you hash the key and use that
same key to provide additional integrity protection. Pretty sure
HMAC-like schemes were never meant for that purpose... but it doesn't
hurt...


>> Or use HKDF, with maybe a kid-derivation specific constant for the salt, a kid-specific info value, and a sufficient length of the resulting key, i.e. identifier, that makes everybody happy.
>
> I'd probably go with BLAKE2b if this is _all_ you're trying to do, but I think what you might really want is key wrap :)


Love blake2, but it's not available in plain-vanilla pyca/cryptography...

Any concerns with using HKDF for this as I suggested in the gist?
https://gist.github.com/franks42/b8b28049adcdf4504271238391c3525b
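
For readers without the gist at hand, here is a rough stdlib-only
sketch of the HKDF idea (RFC 5869 extract-then-expand with
HMAC-SHA256). It is not the code from the gist; the salt, info and
output length below are hypothetical placeholders for whatever
constants the parties agree on.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF per RFC 5869 using HMAC-SHA256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    if not salt:
        salt = b"\x00" * hash_len  # RFC 5869 default salt
    # Extract: concentrate the input keying material into a PRK.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Hypothetical kid-specific constants (assumptions, not from the gist).
KID_SALT = b"kid-derivation-salt"
KID_INFO = b"symmetric-key-id-v1"


def derive_kid_hkdf(key: bytes, length: int = 16) -> str:
    """Derive a hex kid from the key with kid-specific salt and info."""
    return hkdf_sha256(key, KID_SALT, KID_INFO, length).hex()


print(derive_kid_hkdf(b"\x02" * 32))
```

Because salt and info are specific to kid derivation, the output is
separated from anything else derived from the same key with different
info values.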


Now comes my question about this "key wrap" that you so obviously try
to promote as a solution ;-)...
If I understand it correctly, key-wrap schemes also require a second
kek-like key, which we do not have...
How would that work?


>> Hope this additional explanation helps.
>
> A little :) Is this for encryption at rest, with multiple recipients, where the recipients are assumed to already have all of the keys?

Cipher messages at rest or in flight - both use cases apply - any time
you have to find the key to decrypt/verify a message through a key
identifier sent along with that message.
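
As an illustration of the kid acting as a lookup "hint", here is a
small sketch that only authenticates (HMAC-SHA256), since the Python
stdlib has no AEAD. The message layout, the 16-byte kid truncation and
all names are assumptions made for this example; a real design would
use an AEAD and would likely derive separate subkeys for the kid and
the MAC.

```python
import hashlib
import hmac

KID_LEN = 16  # hypothetical truncated kid length
TAG_LEN = 32  # HMAC-SHA256 tag length


def derive_kid(key: bytes) -> bytes:
    # Hypothetical kid: hash of a public prefix plus the key, truncated.
    return hashlib.sha256(b"pre-kid-constant" + key).digest()[:KID_LEN]


def seal(key: bytes, payload: bytes) -> bytes:
    """Build kid || tag || payload; the tag also covers the kid."""
    kid = derive_kid(key)
    tag = hmac.new(key, kid + payload, hashlib.sha256).digest()
    return kid + tag + payload


def open_message(key_db: dict, message: bytes) -> bytes:
    """Look the key up by kid, then verify the tag before trusting anything."""
    kid = message[:KID_LEN]
    tag = message[KID_LEN:KID_LEN + TAG_LEN]
    payload = message[KID_LEN + TAG_LEN:]
    key = key_db.get(kid)
    if key is None:
        # Unknown or tampered kid: reject rather than try every key.
        raise KeyError("unknown key identifier")
    expected = hmac.new(key, kid + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return payload


key = bytes(range(32))  # demo key only; use random keys in practice
key_db = {derive_kid(key): key}
msg = seal(key, b"hello")
print(open_message(key_db, msg))
```

A modified kid leads to a rejected message, which matches the
denial-of-service observation above: flipping any other bit fails the
tag check in the same way.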

Multiple recipients - sure - they face the same issue of finding the
right key to decrypt - although you may use the individually shared
keys as kek's but those scenarios are probably distracting...

Yes, sending and receiving parties must have a shared (symmetric) key
to make this work - through what key-exchange mechanism this was
achieved is not important for this scheme to work.

Regards, Frank.


>
>> PS. Don't believe I will resurrect that franks42/naclj - I'll add a note about deprecation and send them to your effort - it was a good experience learning about Curve/Ed25519 and the nacl/libsodium code though - also trying to keep all data structures as immutable as possible was a good exercise.
>
> It definitely has. I'm working on a blog post (series of blog posts) on crypto API design, particularly in the context of libsodium and the JVM's plethora of byte types.

Crypto API design is complicated and has been screwed up many times -
in my experience API design should probably not be left to the
cryptographers as they live on a different planet ;-) - looking
forward to that blog post!



From _ at lvh.cc  Wed Jul  6 18:20:22 2016
From: _ at lvh.cc (Laurens Van Houtven)
Date: Wed, 6 Jul 2016 17:20:22 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
 <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>
 <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>
Message-ID: <5AAC30C0-1042-4DDF-A6AD-75C2DE0AED6C@lvh.cc>

Hi,

Sent from my iPhone

> On Jul 6, 2016, at 13:22, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
>> For some more context on "it depends what you want to accomplish, and generic schemes are hard"; Bob and Alice may also want to have key ids that only work for _them_ - e.g. Bob and Alice's static DH keys are used to generate a shared secret used for an AD key wrap scheme.
> 
> You're right - Alice may name the key she shares with Bob: "Bob's
> key", while Bob may name the same key: "Alice's key" on his end. They
> can/should use whatever name is easiest to construct the cipher
> messages that they want to exchange with each other. However, the key
> identifier that they embed inside of the cipher message cannot be a
> local nickname, but should be one that both parties can use, like the
> key identifier that I'm looking for.

Key wrap is symmetric, deterministic encryption; the resulting key ids are local only to the context of that key, not to an identity.

>>> The kid embedded in the cipher message is no more than a "hint". It could be signed as part of the whole cipher message, but its integrity can only be confirmed after the message is decrypted & authenticated. Changing the kid in a cipher message results in DoS, but so would flipping any other bit in that message.
>> 
>> Does a failed decryption cause Bob to reject the message, or just try all the other keys? If so, what's the benefit over just giving keys names, like sequence numbers or even strings?
> 
> What you do with a failed decryption is an interesting question, but
> I'm not sure why it's relevant for the key identifier scheme...
> (if you loop through all the keys and find one that decrypts the
> message even though it doesn't match the kid... could be fishy...).

I'm not sure; how you use it has relevant consequences for how efficient you can make the scheme.

> You could use any key identifier you want, as long as both Alice and
> Bob will know how to find the right key for that kid.
> When you use uuids, or arbitrary names/strings, though, you require
> Alice and Bob to agree on the (identifier, key) separately, before the
> cipher message can be decrypted.
> However, when you use an "intrinsic" identifier, like the one I'm
> proposing, then both Alice and Bob can generate those kid's for all
> the keys that they have and share, without any separate agreement -
> they only have to agree on the kid-derivation method. That observation
> is probably the main selling point.

>>> If one believes that a simple sha2 hash is only borderline secure enough (?), then maybe use a CMAC or HMAC, where you use the key on the key-value itself, and the resulting tag would constitute the identifier. (I did something like that in franks42/naclj with blake2)
>> 
>> What's the key used to compute the MAC? (In this case, I think what you _really_ want is AD key wrapping schemes, including GCM-SIV's tiny mode).
> 
> 
> For that blake2 scheme that I used in franks42/naclj, the
> authentication-key is the key itself - you hash the key and use that
> same key to provide additional integrity protection. Pretty sure
> HMAC-like schemes were never meant for that purpose... but it doesn't
> hurt...

Do you have a proof of security for that? I'm in a car and don't have my notebook, but it seems like it'd be pretty easy to build a secure PRF for which that is not OK; I'm thinking CBC-MAC style vulns for example. Doing this securely (with a real key) is what key wrap tries to solve.

>>> Or use HKDF, with maybe a kid-derivation specific constant for the salt, a kid-specific info value, and a sufficient length of the resulting key, i.e. identifier, that makes everybody happy.
>> 
>> I'd probably go with BLAKE2b if this is _all_ you're trying to do, but I think what you might really want is key wrap :)
> 
> Love blake2, but it's not available in plain-vanilla pyca/cryptography...

You should go fix that! ;)

> Any concerns with using HKDF for this as I suggested in the gist?
> https://gist.github.com/franks42/b8b28049adcdf4504271238391c3525b

Seems fine; will give it a more thorough review when I get back.

> Now comes my question about this "key wrap" that you so obviously try
> to promote as a solution ;-)...

No horse in this race; it's just that the deterministic encryption folks used to encrypt 16/32 bytes at a time and call what they do "key wrap" and it sounded a lot like what you want.

Here's where I would get started:
http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/siv/siv.pdf

> If I understand it correctly, key-wrap schemes also require a second
> kek-like key, which we do not have...
> How would that work?

See above; with a real key; perhaps not exactly what you're looking for.

> Cipher messages at rest or in flight - both use cases apply - any time
> you have to find the key to decrypt/verify a message through a key
> identifier sent along with that message.
> 
> Multiple recipients - sure - they face the same issue of finding the
> right key to decrypt - although you may use the individually shared
> keys as kek's but those scenarios are probably distracting...
> 
> Yes, sending and receiving parties must have a shared (symmetric) key
> to make this work - through what key-exchange mechanism this was
> achieved is not important for this scheme to work.

Right. The reason I'm being so persistent is similar to why a lot of cryptographers dislike PAKE -- it's not that it's bad or hard to do -- it just seems like a weird problem to have. To quote Glyph, it sounded a bit like a jackhammer problem :)

In short: HKDF and BLAKE2 seem like what you want :)
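
A small sketch of the BLAKE2 variant, assuming Python 3.6+ where
hashlib ships blake2b. The personalization string and output size are
hypothetical; the point is only that the "purpose" of the derivation
is an explicit input.

```python
import hashlib

# Hypothetical personalization string (BLAKE2b allows at most 16 bytes).
KID_PERSON = b"key-id-v1"


def derive_kid_blake2b(key: bytes, digest_size: int = 16) -> str:
    """Hash the key with a kid-specific personalization and return hex."""
    h = hashlib.blake2b(key, digest_size=digest_size, person=KID_PERSON)
    return h.hexdigest()


demo_key = b"\x02" * 32  # demo key only; use random keys in practice
kid = derive_kid_blake2b(demo_key)
# Same key, different purpose: the output is unrelated to the kid.
other = hashlib.blake2b(demo_key, digest_size=16, person=b"subkey-v1").hexdigest()
print(kid != other)
```

BLAKE2b is not subject to length extension the way plain SHA-2 is,
which is why variable-length keys are less of a worry here.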

lvh



From _ at lvh.io  Thu Jul  7 07:43:59 2016
From: _ at lvh.io (lvh)
Date: Thu, 7 Jul 2016 06:43:59 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
 <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>
 <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>
Message-ID: <81D49326-13EE-4150-A91A-18146990135F@lvh.io>

Hi,

Apologies in advance for the late and possibly duplicated message. It was originally sent from my iPhone, from the wrong e-mail address, which made the mailing list manager unhappy.

> On Jul 6, 2016, at 13:22, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
>> For some more context on "it depends what you want to accomplish, and generic schemes are hard"; Bob and Alice may also want to have key ids that only work for _them_ - e.g. Bob and Alice's static DH keys are used to generate a shared secret used for an AD key wrap scheme.
> 
> You're right - Alice may name the key she shares with Bob: "Bob's
> key", while Bob may name the same key: "Alice's key" on his end. They
> can/should use whatever name is easiest to construct the cipher
> messages that they want to exchange with each other. However, the key
> identifier that they embed inside of the cipher message cannot be a
> local nickname, but should be one that both parties can use, like the
> key identifier that I'm looking for.

Key wrap is symmetric, deterministic encryption; the resulting key ids are local only to the context of that key, not to an identity.

>>> The kid embedded in the cipher message is no more than a "hint". It could be signed as part of the whole cipher message, but its integrity can only be confirmed after the message is decrypted & authenticated. Changing the kid in a cipher message results in DoS, but so would flipping any other bit in that message.
>> 
>> Does a failed decryption cause Bob to reject the message, or just try all the other keys? If so, what's the benefit over just giving keys names, like sequence numbers or even strings?
> 
> What you do with a failed decryption is an interesting question, but
> I'm not sure why it's relevant for the key identifier scheme...
> (if you loop through all the keys and find one that decrypts the
> message even though it doesn't match the kid... could be phishy...).

I'm not sure; how you use it has relevant consequences for how efficient you can make the scheme.

> You could use any key identifier you want, as long as both Alice and
> Bob will know how to find the right key for that kid.
> When you use uuids, or arbitrary names/strings, though, you require
> Alice and Bob to agree on the (identifier, key) separately, before the
> cipher message can be decrypted.
> However, when you use an "intrinsic" identifier, like the one I'm
> proposing, then both Alice and Bob can generate those kid's for all
> the keys that they have and share, without any separate agreement -
> they only have to agree on the kid-derivation method. That observation
> is probably the main selling point.

>>> If one believes that a simple sha2 hash is only borderline enough secure (?), then maybe use a CMAC or HMAC, where you use the key on the key-value itself, and the resulting tag would constitute the identifier. (I did something like that in franks42/naclj with blake2)
>> 
>> What's the key used to compute the MAC? (In this case, I think what you _really_ want is AD key wrapping schemes, including GCM-SIV's tiny mode).
> 
> 
> For that blake2 scheme that I used in franks42/naclj, the
> authentication-key is the key itself - you hash the key and use that
> same key to provide additional integrity protection. Pretty sure
> HMAC-like schemes were never meant for that purpose... but it doesn't
> hurt...

Do you have a proof of security for that? I'm in a car and don't have my notebook, but it seems like it'd be pretty easy to build a secure PRF for which that is not OK; I'm thinking CBC-MAC style vulns for example. Doing this securely (with a real key) is what key wrap tries to solve.

>>> Or use HKDF, with maybe a kid-derivation specific constant for the salt, a kid-specific info value, and a sufficient length of the resulting key, i.e. identifier, that makes everybody happy.
>> 
>> I'd probably go with BLAKE2b if this is _all_ you're trying to do, but I think what you might really want is key wrap :)
> 
> Love blake2, but it's not available in plain-vanilla pyca/cryptography...

You should go fix that! ;)

> Any concerns with using HKDF for this as I suggested in the gist?
> https://gist.github.com/franks42/b8b28049adcdf4504271238391c3525b

Seems fine; will get it a more thorough review when I get back.

> Now comes my question about this "key wrap" that you so obviously try
> to promote as a solution ;-)...

No horse in this race; it's just that the deterministic encryption folks used to encrypt 16/32 bytes at a time and call what they do "key wrap" and it sounded a lot like what you want.

Here's where I would get started:
http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/siv/siv.pdf

> If I understand it well, key-wrap schemes also requires a second
> kek-like key, which we do not have...
> How would that work?

See above; with a real key; perhaps not exactly what you're looking for.

> Cipher messages in rest or in flight - both use cases apply - any time
> you have to find the key to decrypt/verify a message through a key
> identifier send along with that message.
> 
> Multiple recipients - sure - they face the same issue of finding the
> right key to decrypt - although you may use the individually shared
> keys as kek's but those scenarios are probably distracting...
> 
> Yes, sending and receiving parties must have a shared (symmetric) key
> to make this work - through what key-exchange mechanism this was
> achieved is not important for this scheme to work.

Right. The reason I'm being so persistent is similar to why a lot of cryptographers dislike PAKE -- it's not that it's bad or hard to do -- it just seems like a weird problem to have. To quote Glyph, it sounded a bit like a jackhammer problem :)

In short: HKDF and BLAKE2 seem like what you want :)
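
For what it's worth, here is a minimal sketch of both options, using only the Python standard library so it runs without extra dependencies. The salt, info and personalization strings and the 16-byte length are made-up placeholders (not taken from the gist), and hkdf_sha256 is a small hand-rolled RFC 5869 implementation written for illustration, not the pyca/cryptography API. hashlib.blake2b needs Python 3.6 or later.

```python
import hashlib
import hmac

# Hypothetical domain-separation constants; both parties only need to
# agree on these once, not per key.
KID_SALT = b"example-kid-salt"
KID_INFO = b"example-kid-v1"
KID_LEN = 16  # bytes of identifier to keep


def hkdf_sha256(ikm, salt, info, length):
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def kid_hkdf(key):
    """Key id derived from the key itself via HKDF."""
    return hkdf_sha256(key, KID_SALT, KID_INFO, KID_LEN)


def kid_blake2b(key):
    """Key id as an unkeyed, personalized BLAKE2b hash of the key."""
    return hashlib.blake2b(key, digest_size=KID_LEN,
                           person=b"example-kid").digest()


# Each party builds the same kid -> key table from the keys it holds.
keys = [bytes([i]) * 32 for i in range(3)]  # stand-ins for random keys
table = {kid_hkdf(k): k for k in keys}
assert table[kid_hkdf(keys[1])] == keys[1]
```

Both parties can compute the table independently, so the kid in a message only serves as a lookup hint; the message still has to authenticate under the key that was found.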

lvh

From simo at redhat.com  Thu Jul  7 08:22:33 2016
From: simo at redhat.com (Simo Sorce)
Date: Thu, 07 Jul 2016 08:22:33 -0400
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <5AAC30C0-1042-4DDF-A6AD-75C2DE0AED6C@lvh.cc>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
 <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>
 <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>
 <5AAC30C0-1042-4DDF-A6AD-75C2DE0AED6C@lvh.cc>
Message-ID: <1467894153.3121.158.camel@redhat.com>

On Wed, 2016-07-06 at 17:20 -0500, Laurens Van Houtven wrote:
> 
> Right. The reason I'm being so persistent is similar to why a lot of
> cryptographers dislike PAKE -- it's not that it's bad or hard to do --
> it just seems like a weird problem to have. To quote Glyph, it sounded
> a bit like a jackhammer problem :)

Sorry for the OT, I find PAKE very useful and we have a draft[1] to get
a variant (SPAKE) in the Kerberos protocol.
Do you have any reference to documents describing this "dislike" ?
I'd like to know more about it.

Simo.

[1]
https://www.ietf.org/archive/id/draft-mccallum-kitten-krb-spake-preauth-00.txt

-- 
Simo Sorce * Red Hat, Inc * New York


From _ at lvh.io  Thu Jul  7 08:36:12 2016
From: _ at lvh.io (lvh)
Date: Thu, 7 Jul 2016 07:36:12 -0500
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <1467894153.3121.158.camel@redhat.com>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
 <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>
 <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>
 <5AAC30C0-1042-4DDF-A6AD-75C2DE0AED6C@lvh.cc>
 <1467894153.3121.158.camel@redhat.com>
Message-ID: <96D84BA1-AEA6-4C9C-B323-A9036B968AB4@lvh.io>


> On Jul 7, 2016, at 7:22 AM, Simo Sorce <simo at redhat.com> wrote:
> On Wed, 2016-07-06 at 17:20 -0500, Laurens Van Houtven wrote:
>> 
>> Right. The reason I'm being so persistent is similar to why a lot of
>> cryptographers dislike PAKE -- it's not that it's bad or hard to do --
>> it just seems like a weird problem to have. To quote Glyph, it sounded
>> a bit like a jackhammer problem :)
> 
> Sorry for the OT, I find PAKE very useful and we have a draft[1] to get
> a variant (SPAKE) in the Kerberos protocol.
> Do you have any reference to documents describing this "dislike" ?
> I'd like to know more about it.


Nope. I don't share those opinions of PAKE, regardless; but I do agree that it's a solution to a very specific problem. If you want a reasonable way to go from a low-entropy shared secret to a high-entropy one, then you probably want SPAKE2.


lvh


From simo at redhat.com  Thu Jul  7 08:51:14 2016
From: simo at redhat.com (Simo Sorce)
Date: Thu, 07 Jul 2016 08:51:14 -0400
Subject: [Cryptography-dev] "intrinsic" symmetric key identifier?
In-Reply-To: <96D84BA1-AEA6-4C9C-B323-A9036B968AB4@lvh.io>
References: <CAJGvTrQiwH8msDBxj-xkBL4AALE6-RRkexfj=EO5=2h+Ky_aew@mail.gmail.com>
 <749A9BBD-6191-46E6-BF2B-4134FC37A27B@lvh.io>
 <CAJGvTrRFuuhU=N4vBL04WEkjrFFgPdkrg0X+xmMPLN6zMO09+A@mail.gmail.com>
 <0D20123F-F12B-4234-84DF-E9AECE6E31C9@lvh.io>
 <5471F776-D4B0-4941-9FAB-96341A530A55@gmail.com>
 <F6805153-0611-4829-AE0E-89A9B4A2F0B3@lvh.io>
 <CAJGvTrSt64=wsCg+Dm-nMMNgSYjUY5OG8KXmE0FUM10EUGXf-Q@mail.gmail.com>
 <5AAC30C0-1042-4DDF-A6AD-75C2DE0AED6C@lvh.cc>
 <1467894153.3121.158.camel@redhat.com>
 <96D84BA1-AEA6-4C9C-B323-A9036B968AB4@lvh.io>
Message-ID: <1467895874.3121.159.camel@redhat.com>

On Thu, 2016-07-07 at 07:36 -0500, lvh wrote:
> > On Jul 7, 2016, at 7:22 AM, Simo Sorce <simo at redhat.com> wrote:
> > On Wed, 2016-07-06 at 17:20 -0500, Laurens Van Houtven wrote:
> >> 
> >> Right. The reason I'm being so persistent is similar to why a lot of
> >> cryptographers dislike PAKE -- it's not that it's bad or hard to do --
> >> it just seems like a weird problem to have. To quote Glyph, it sounded
> >> a bit like a jackhammer problem :)
> > 
> > Sorry for the OT, I find PAKE very useful and we have a draft[1] to get
> > a variant (SPAKE) in the Kerberos protocol.
> > Do you have any reference to documents describing this "dislike" ?
> > I'd like to know more about it.
> 
> 
> Nope. I don't share those opinions of PAKE, regardless; but I do agree
> that it's a solution to a very specific problem. If you want a
> reasonable way to go from a low-entropy shared secret to a
> high-entropy one, then you probably want SPAKE2.

Yes, we are using SPAKE2, thanks.
Simo.

-- 
Simo Sorce * Red Hat, Inc * New York


From frank.siebenlist at gmail.com  Mon Jul 11 23:42:26 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Mon, 11 Jul 2016 20:42:26 -0700
Subject: [Cryptography-dev] hash.SHA256 cpu expensive per byte versus
 byte-string?
Message-ID: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>

I ran into some unexpected timing issues while using pyca/cryptography's hashes.SHA256,
and I'm wondering if there is something wrong with the timing discrepancy I see between two different hashing approaches.

When I hash a single byte-string of 10 million bytes, it seems to take 2-3 orders of magnitude less time than when I loop over the bytes and hash them one by one.

Please look at the following bare-bones snippet:

---
from __future__ import absolute_import, division, print_function
import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.backends import default_backend

# Two independent SHA-256 contexts that start from the same (empty) state.
d1 = hashes.Hash(algorithm=hashes.SHA256(), backend=default_backend())
d2 = d1.copy()

n = 10000000
print('n:', n)

# b is a single byte; bs is that same byte repeated n times.
b = b'a'
ba = bytearray(n * b'a')
bs = bytes(ba)

# Case 1: hash all n bytes with a single update() call.
s = time.time()
d1.update(bs)
t = time.time() - s
print('ba: ', t)
print(d1.finalize())

# Case 2: hash the same n bytes with n one-byte update() calls.
s = time.time()
for i in range(n):
    d2.update(b)
t = time.time() - s
print('b: ', t)
print(d2.finalize())
---

The output is:

---
/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/bin/python3.5 /Users/franksiebenlist/git/pyvate23/src/pyvate/messagedigest_tst.py
n: 10000000
ba:  0.027185916900634766
b'\x01\xf4\xa8|\x04\xb4\n\xf5\x9a\xad\xc0\xe8\x12)5\tp\x9c\x9a\x87c\xa6\x0b\x7f\x9e\x1903"\xf8\xb0<'
b:  15.677960872650146
b'\x01\xf4\xa8|\x04\xb4\n\xf5\x9a\xad\xc0\xe8\x12)5\tp\x9c\x9a\x87c\xa6\x0b\x7f\x9e\x1903"\xf8\xb0<'

Process finished with exit code 0
---

Results for python 2 and 3 are similar.

I understand that there may be a few more object-creations and casts involved in the looping, but 500 times slower... that was an unexpected surprise.

Comments?
Observations?

Thanks, Frank.


From _ at lvh.io  Tue Jul 12 11:07:13 2016
From: _ at lvh.io (lvh)
Date: Tue, 12 Jul 2016 10:07:13 -0500
Subject: [Cryptography-dev] hash.SHA256 cpu expensive per byte versus
 byte-string?
In-Reply-To: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>
References: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>
Message-ID: <CA55C8DC-22B6-45C2-B1F7-3EC7C8FA3D05@lvh.io>

Hi,

> On Jul 11, 2016, at 10:42 PM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:

<snipsnip>

> I understand that there may be a few more object-creations and casts involved in the looping, but 500 times slower... that was an unexpected surprise.

As expected. You get both massively increased C call overhead and the worst case for SHA-256, because you only fill a block every 512/8 == 64 updates. Alas, openssl speed doesn't distinguish between the same message size fed in different chunk sizes, but you can at least clearly see the performance multiplier for larger messages.

lvh

From frank.siebenlist at gmail.com  Tue Jul 12 13:49:07 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Tue, 12 Jul 2016 10:49:07 -0700
Subject: [Cryptography-dev] hash.SHA256 cpu expensive per byte versus
 byte-string?
In-Reply-To: <CA55C8DC-22B6-45C2-B1F7-3EC7C8FA3D05@lvh.io>
References: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>
 <CA55C8DC-22B6-45C2-B1F7-3EC7C8FA3D05@lvh.io>
Message-ID: <CAJGvTrTkxteqXmhQRPUdzJzUJLu-ix0VoDesZ0Jg0hps46e-FQ@mail.gmail.com>

After I sent my message yesterday evening, I was also wondering about
that 512bit (64byte) block-size of sha256, and if that would add to
the observed slowness.
The following output shows time as a function of byte-chunk size
(1,2,8,32,64,128,256 bytes)

b:  12.111763954162598
b2:  5.806451082229614
b8:  1.4664850234985352
b32:  0.37551307678222656
b64:  0.20229697227478027
b128:  0.11141395568847656
b256:  0.06758689880371094
8388608  bs:  0.020879030227661133

Time seems to go down in proportion to the increase in chunk size, and
there is no perceived "speed boost" when we cross the 64-byte
threshold.
Time seems to be only linearly related to the number of Python-to-C calls.

And again, I can understand that the overhead is proportional to the
number of Python-to-C calls, but it's just the factor of 500 (2-3
orders of magnitude) that (unpleasantly) surprised me. It requires one
to optimize the size of the byte-string passed to update() when you have
many bytes to hash. For example, if you read from a file or socket,
don't update() 1 byte at a time while you read from the stream, but
fill up a (big) buffer first and pass that buffer.
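
As a sketch of that advice, assuming a file-like binary stream and an arbitrary 1 MiB buffer size (the function name and default are made up for illustration, and hashlib stands in for any hash object with an update() method):

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream, calling update() once per large chunk."""
    digest = hashlib.sha256()
    # iter() with a sentinel keeps reading until read() returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


data = b"a" * (3 * (1 << 20) + 17)  # not a multiple of the chunk size
assert sha256_of_stream(io.BytesIO(data)) == hashlib.sha256(data).hexdigest()
```

The digest does not depend on how the input is split; only the number of update() calls changes.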

-Frank.

PS. I haven't looked at the sha256 C-code, but I can imagine that when
you pass update() one byte at a time, it will fill up some
64-byte buffer, and when that buffer is full, it will churn/hash that
block. Adding a byte to the buffer is all low-level fast code in
C, while the churning would use significantly more CPU cycles... hard
to fathom that you would see much slower performance when you pass a
single byte at a time in C...



From frank.siebenlist at gmail.com  Thu Jul 14 01:23:31 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Wed, 13 Jul 2016 22:23:31 -0700
Subject: [Cryptography-dev] hash.SHA256 cpu expensive per byte versus
 byte-string?
In-Reply-To: <CAJGvTrTkxteqXmhQRPUdzJzUJLu-ix0VoDesZ0Jg0hps46e-FQ@mail.gmail.com>
References: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>
 <CA55C8DC-22B6-45C2-B1F7-3EC7C8FA3D05@lvh.io>
 <CAJGvTrTkxteqXmhQRPUdzJzUJLu-ix0VoDesZ0Jg0hps46e-FQ@mail.gmail.com>
Message-ID: <CAJGvTrS3idfSbRXXm7Mfyz1Dt=23+VdEkAbpTXOfL0JruMNThA@mail.gmail.com>

Python's native hashing module (hashlib) shows similar results:
- about the same time when passed the 8MB blob in one go
(probably expected as both use openssl)
- substantial overhead when looping over small chunks (up to 100 times)
- except that it's about 6 times faster per single byte.

n: 8388608
b:  1.958238124847412
b2:  1.0818939208984375
b8:  0.2987058162689209
b32:  0.10640311241149902
b64:  0.06242084503173828
b128:  0.04123806953430176
b256:  0.03258681297302246
8388608  bs:  0.02389383316040039

Guess hashlib used some better optimization on the C-calls (?).
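
For anyone who wants to reproduce the shape of these numbers, a rough sketch of such a measurement follows. It is not the script that produced the figures above: the function name, the 1 MiB input and the use of memoryview slices are assumptions made for illustration, and absolute timings will differ per machine.

```python
import hashlib
import time


def time_chunked(data, chunk_size):
    """Hash data in chunk_size pieces; return (seconds, hex digest)."""
    digest = hashlib.sha256()
    view = memoryview(data)  # slicing a memoryview avoids copying
    start = time.perf_counter()
    for offset in range(0, len(data), chunk_size):
        digest.update(view[offset:offset + chunk_size])
    return time.perf_counter() - start, digest.hexdigest()


if __name__ == "__main__":
    data = b"a" * (1 << 20)  # 1 MiB keeps the one-byte case short
    for size in (1, 2, 8, 32, 64, 128, 256, len(data)):
        seconds, _ = time_chunked(data, size)
        print("chunk %8d: %.4f s" % (size, seconds))
```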

This is my last update on this observation.
Conclusion is "so be it", and using bigger chunks for hashing gives
(much) better performance.

-Frank.


From _ at lvh.io  Thu Jul 14 09:57:57 2016
From: _ at lvh.io (lvh)
Date: Thu, 14 Jul 2016 08:57:57 -0500
Subject: [Cryptography-dev] hash.SHA256 cpu expensive per byte versus
 byte-string?
In-Reply-To: <CAJGvTrS3idfSbRXXm7Mfyz1Dt=23+VdEkAbpTXOfL0JruMNThA@mail.gmail.com>
References: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>
 <CA55C8DC-22B6-45C2-B1F7-3EC7C8FA3D05@lvh.io>
 <CAJGvTrTkxteqXmhQRPUdzJzUJLu-ix0VoDesZ0Jg0hps46e-FQ@mail.gmail.com>
 <CAJGvTrS3idfSbRXXm7Mfyz1Dt=23+VdEkAbpTXOfL0JruMNThA@mail.gmail.com>
Message-ID: <CCCB16D5-4D52-4FCC-A4FA-F41C7CBEB57D@lvh.io>

Hi Frank,

> On Jul 14, 2016, at 12:23 AM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
> 
> Python's native hashing module (hashlib), shows similar results:
> - about the same time when passed the 8MB blob in one go
> (probably expected as both use openssl)
> - substantial overhead when looping over small chunks (up to 100 times)
> - except that it's about 6 times faster per single byte..

The perf by chunk is a consequence of how SHA256 works. The higher perf for many calls is a consequence of extension modules vs cffi.

lvh



From frank.siebenlist at gmail.com  Thu Jul 14 12:01:24 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Thu, 14 Jul 2016 09:01:24 -0700
Subject: [Cryptography-dev] hash.SHA256 cpu expensive per byte versus
 byte-string?
In-Reply-To: <CCCB16D5-4D52-4FCC-A4FA-F41C7CBEB57D@lvh.io>
References: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>
 <CA55C8DC-22B6-45C2-B1F7-3EC7C8FA3D05@lvh.io>
 <CAJGvTrTkxteqXmhQRPUdzJzUJLu-ix0VoDesZ0Jg0hps46e-FQ@mail.gmail.com>
 <CAJGvTrS3idfSbRXXm7Mfyz1Dt=23+VdEkAbpTXOfL0JruMNThA@mail.gmail.com>
 <CCCB16D5-4D52-4FCC-A4FA-F41C7CBEB57D@lvh.io>
Message-ID: <CAJGvTrTZHPBZveAfx+DX=gqSiShm+KDkDkwfpmzqDp_v7ENW3w@mail.gmail.com>

> The perf by chunk is a consequence of how SHA256 works.


I politely disagree...

Having chunks smaller or larger than SHA256's 64-byte block size
doesn't seem to affect the timing results in any noticeable way.
If you do not fill up SHA256's block-size buffer with update(), it
simply returns, and there is only the overhead of the function call.

Unless I misunderstand the inner workings...
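
To make that argument concrete, here is a toy model of the buffering I have in mind. It is only a counting model under my assumption of how a 64-byte block buffer behaves, not OpenSSL's actual code, and the class and function names are invented for this example.

```python
class BlockBufferModel:
    """Toy model of a 64-byte block buffer; it does no real hashing."""

    BLOCK_SIZE = 64

    def __init__(self):
        self.buffered = 0      # bytes waiting for a full block
        self.compressions = 0  # full blocks processed so far
        self.calls = 0         # number of update() calls

    def update(self, data):
        self.calls += 1
        self.buffered += len(data)
        # Process every complete block; keep the remainder buffered.
        full_blocks, self.buffered = divmod(self.buffered, self.BLOCK_SIZE)
        self.compressions += full_blocks


def feed(total, chunk_size):
    """Feed total bytes in chunk_size pieces; return (calls, compressions)."""
    model = BlockBufferModel()
    for _ in range(total // chunk_size):
        model.update(b"a" * chunk_size)
    return model.calls, model.compressions


for size in (1, 8, 64, 256):
    print(size, feed(4096, size))
```

In this model the number of block compressions is the same for every chunk size; only the number of update() calls changes, which matches the timings scaling with the call count.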

Regards, Frank.



From donald at stufft.io  Thu Jul 14 12:17:01 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 14 Jul 2016 12:17:01 -0400
Subject: [Cryptography-dev] hash.SHA256 cpu expensive per byte versus
 byte-string?
In-Reply-To: <CAJGvTrS3idfSbRXXm7Mfyz1Dt=23+VdEkAbpTXOfL0JruMNThA@mail.gmail.com>
References: <E61DB184-7873-43AA-B8D2-E73FC31C00DD@gmail.com>
 <CA55C8DC-22B6-45C2-B1F7-3EC7C8FA3D05@lvh.io>
 <CAJGvTrTkxteqXmhQRPUdzJzUJLu-ix0VoDesZ0Jg0hps46e-FQ@mail.gmail.com>
 <CAJGvTrS3idfSbRXXm7Mfyz1Dt=23+VdEkAbpTXOfL0JruMNThA@mail.gmail.com>
Message-ID: <A6DC4C53-EEBD-41AF-9730-3752BB884AC8@stufft.io>


> On Jul 14, 2016, at 1:23 AM, Frank Siebenlist <frank.siebenlist at gmail.com> wrote:
> 
> Guess hashlib used some better optimization on the C-calls (?).
> 
> This is my last update on this observation.
> Conclusion is "so be it", and using bigger chunks for hashing gives
> (much) better performance.


I believe this is going to be due to the overhead of CFFI on CPython. Every time we call a C function via CFFI there is some marshaling and such that goes on, so when you call update() a whole lot of times (one per byte) there's a whole lot of marshaling and crossing the C boundary going on.

In contrast, hashlib is written using the C-EXT API in CPython, which means that it integrates directly into the internals of CPython and doesn't need to pay that marshaling cost.

In terms of safety, CFFI is far superior to directly writing C in the C-EXT API; it's also more portable since it utilizes a pluggable backend approach, and on PyPy it tends to be much faster since it offers introspection that the JIT can take advantage of.

The downside is, putting a bunch of CFFI calls in a hot loop on CPython can be slower than C-EXTs.
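
If the small writes can't be avoided at the source, one way to keep them out of the hot loop is to batch them in Python before crossing into C. The wrapper below is a hypothetical sketch, not part of cryptography or hashlib; the class name and the 64 KiB flush size are arbitrary. It wraps a hashlib object to stay self-contained; an object such as pyca/cryptography's hashes.Hash exposes finalize() rather than digest(), so the last method would need adapting.

```python
import hashlib


class BufferedHasher:
    """Collect small writes and pass them to the hash in large batches."""

    def __init__(self, hasher, flush_size=64 * 1024):
        self._hasher = hasher
        self._flush_size = flush_size
        self._pending = bytearray()

    def update(self, data):
        self._pending += data
        if len(self._pending) >= self._flush_size:
            self._flush()

    def _flush(self):
        if self._pending:
            self._hasher.update(bytes(self._pending))
            self._pending.clear()  # bytearray.clear() needs Python 3.3+

    def digest(self):
        self._flush()
        return self._hasher.digest()


buffered = BufferedHasher(hashlib.sha256())
for _ in range(100000):
    buffered.update(b"a")
assert buffered.digest() == hashlib.sha256(b"a" * 100000).digest()
```

Whether this is a net win depends on how expensive the underlying update() call is; it would need measuring for a given setup.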

--
Donald Stufft


From frank.siebenlist at gmail.com  Mon Jul 18 16:12:04 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Mon, 18 Jul 2016 13:12:04 -0700
Subject: [Cryptography-dev] Fernet NG or alternative simple, high-level,
 encrypted message module?
In-Reply-To: <1467137494.3121.8.camel@redhat.com>
References: <948A93A3-0FBC-41CB-937E-DFF2A7FBEC9D@gmail.com>
 <CABj5TKQRNAv8p1X-fy8C5pSHKdGSk+m89Fmz8Wn3OLpThX8ggw@mail.gmail.com>
 <1465823210.3498.49.camel@redhat.com>
 <CAJGvTrSp2jUZ-h-EptXEAb8E1nAiGZTGE0A7kziXAysj5ZzEYQ@mail.gmail.com>
 <1467137494.3121.8.camel@redhat.com>
Message-ID: <CAJGvTrS=c1wXoqUrdumPeXL+vUT2BHzOXNRfeenSx4MSkOHdLg@mail.gmail.com>

Did this discussion about jwcrypto integration in pyca/cryptography happen?

Anything we can do to help/facilitate this?

Thanks, Frank.

On Tue, Jun 28, 2016 at 11:11 AM, Simo Sorce <simo at redhat.com> wrote:
> I see no problem matching the license, let's discuss if this merge can
> be done and I will change the license as we start working on it for
> real.
>
> Simo.
>
> On Tue, 2016-06-14 at 09:19 -0700, Frank Siebenlist wrote:
>> Hi Simo - if you could accommodate the jwcrypto-license to match
>> pyca/cryptography's... that would be fantastic and generous!!! -
>> Thanks, Frank.
>>
>> On Mon, Jun 13, 2016 at 6:06 AM, Simo Sorce <simo at redhat.com> wrote:
>> > On Sun, 2016-06-12 at 23:51 -0400, Paul Kehrer wrote:
>> >> In general I'm in favor of pulling jwcrypto (or something like it)
>> >> into cryptography. The obstacles are going to be figuring out the
>> >> licensing (cryptography is Apache2/BSD dual licensed and any code
>> >> contributed to it needs to be available under those licenses),
>> >> discussing what (if any) API changes need to be made to fit in with
>> >> the API design of the hazmat layer, and general "make the code style
>> >> match cryptography".
>> >
>> > Jwcrypto author here,
>> > from my POV we can discuss license/API/style adjustments needed, just
>> > let me know in which form you want to have this discussion.
>> >
>> > Simo.
>> >
>> > --
>> > Simo Sorce * Red Hat, Inc * New York
>> >
>> > _______________________________________________
>> > Cryptography-dev mailing list
>> > Cryptography-dev at python.org
>> > https://mail.python.org/mailman/listinfo/cryptography-dev
>
>
> --
> Simo Sorce * Red Hat, Inc * New York
>

From simo at redhat.com  Wed Jul 20 05:02:15 2016
From: simo at redhat.com (Simo Sorce)
Date: Wed, 20 Jul 2016 05:02:15 -0400
Subject: [Cryptography-dev] Fernet NG or alternative simple, high-level,
 encrypted message module?
In-Reply-To: <CAJGvTrS=c1wXoqUrdumPeXL+vUT2BHzOXNRfeenSx4MSkOHdLg@mail.gmail.com>
References: <948A93A3-0FBC-41CB-937E-DFF2A7FBEC9D@gmail.com>
 <CABj5TKQRNAv8p1X-fy8C5pSHKdGSk+m89Fmz8Wn3OLpThX8ggw@mail.gmail.com>
 <1465823210.3498.49.camel@redhat.com>
 <CAJGvTrSp2jUZ-h-EptXEAb8E1nAiGZTGE0A7kziXAysj5ZzEYQ@mail.gmail.com>
 <1467137494.3121.8.camel@redhat.com>
 <CAJGvTrS=c1wXoqUrdumPeXL+vUT2BHzOXNRfeenSx4MSkOHdLg@mail.gmail.com>
Message-ID: <1469005335.21393.27.camel@redhat.com>

On Mon, 2016-07-18 at 13:12 -0700, Frank Siebenlist wrote:
> Did this discussion about jwcrypto integration in pyca/cryptography
> happen?

Not yet.

> Anything we can do to help/facilitate this?

Jumpstart it with an issue on GitHub?

Simo.

> Thanks, Frank.
> 
> On Tue, Jun 28, 2016 at 11:11 AM, Simo Sorce <simo at redhat.com> wrote:
> > 
> > I see no problem matching the license, let's discuss if this merge
> > can
> > be done and I will change the license as we start working on it for
> > real.
> > 
> > Simo.
> > 
> > On Tue, 2016-06-14 at 09:19 -0700, Frank Siebenlist wrote:
> > > 
> > > Hi Simo - if you could accommodate the jwcrypto-license to match
> > > pyca/cryptography's... that would be fantastic and generous!!! -
> > > Thanks, Frank.
> > > 
> > > On Mon, Jun 13, 2016 at 6:06 AM, Simo Sorce <simo at redhat.com>
> > > wrote:
> > > > 
> > > > On Sun, 2016-06-12 at 23:51 -0400, Paul Kehrer wrote:
> > > > > 
> > > > > In general I'm in favor of pulling jwcrypto (or something
> > > > > like it)
> > > > > into cryptography. The obstacles are going to be figuring out
> > > > > the
> > > > > licensing (cryptography is Apache2/BSD dual licensed and any
> > > > > code
> > > > > contributed to it needs to be available under those
> > > > > licenses),
> > > > > discussing what (if any) API changes need to be made to fit
> > > > > in with
> > > > > the API design of the hazmat layer, and general "make the
> > > > > code style
> > > > > match cryptography".
> > > > Jwcrypto author here,
> > > > from my POV we can discuss license/API/style adjustments
> > > > needed, just
> > > > let me know in which form you want to have this discussion.
> > > > 
> > > > Simo.
> > > > 
> > > > --
> > > > Simo Sorce * Red Hat, Inc * New York
> > > > 
> > 
> > --
> > Simo Sorce * Red Hat, Inc * New York
> > 

From frank.siebenlist at gmail.com  Wed Jul 20 12:37:49 2016
From: frank.siebenlist at gmail.com (Frank Siebenlist)
Date: Wed, 20 Jul 2016 09:37:49 -0700
Subject: [Cryptography-dev] Fernet NG or alternative simple, high-level,
 encrypted message module?
In-Reply-To: <1469005335.21393.27.camel@redhat.com>
References: <948A93A3-0FBC-41CB-937E-DFF2A7FBEC9D@gmail.com>
 <CABj5TKQRNAv8p1X-fy8C5pSHKdGSk+m89Fmz8Wn3OLpThX8ggw@mail.gmail.com>
 <1465823210.3498.49.camel@redhat.com>
 <CAJGvTrSp2jUZ-h-EptXEAb8E1nAiGZTGE0A7kziXAysj5ZzEYQ@mail.gmail.com>
 <1467137494.3121.8.camel@redhat.com>
 <CAJGvTrS=c1wXoqUrdumPeXL+vUT2BHzOXNRfeenSx4MSkOHdLg@mail.gmail.com>
 <1469005335.21393.27.camel@redhat.com>
Message-ID: <CAJGvTrQRTazCTiJOcvRRQ9o08Rich=7KaVoYWmCnBr11PWDSbA@mail.gmail.com>

Issue 57: Adopt jwcrypto as a jose/jwk/jwe/jws hazmat module

https://github.com/pyca/cryptography/issues/3050

On Wed, Jul 20, 2016 at 2:02 AM, Simo Sorce <simo at redhat.com> wrote:
> On Mon, 2016-07-18 at 13:12 -0700, Frank Siebenlist wrote:
>> Did this discussion about jwcrypto integration in pyca/cryptography
>> happen?
>
> Not yet.
>
>> Anything we can do to help/facilitate this?
>
> Jumpstart it with an issue on GitHub?
>
> Simo.
>
>> Thanks, Frank.
>>
>> On Tue, Jun 28, 2016 at 11:11 AM, Simo Sorce <simo at redhat.com> wrote:
>> >
>> > I see no problem matching the license, let's discuss if this merge
>> > can
>> > be done and I will change the license as we start working on it for
>> > real.
>> >
>> > Simo.
>> >
>> > On Tue, 2016-06-14 at 09:19 -0700, Frank Siebenlist wrote:
>> > >
>> > > Hi Simo - if you could accommodate the jwcrypto-license to match
>> > > pyca/cryptography's... that would be fantastic and generous!!! -
>> > > Thanks, Frank.
>> > >
>> > > On Mon, Jun 13, 2016 at 6:06 AM, Simo Sorce <simo at redhat.com>
>> > > wrote:
>> > > >
>> > > > On Sun, 2016-06-12 at 23:51 -0400, Paul Kehrer wrote:
>> > > > >
>> > > > > In general I'm in favor of pulling jwcrypto (or something
>> > > > > like it)
>> > > > > into cryptography. The obstacles are going to be figuring out
>> > > > > the
>> > > > > licensing (cryptography is Apache2/BSD dual licensed and any
>> > > > > code
>> > > > > contributed to it needs to be available under those
>> > > > > licenses),
>> > > > > discussing what (if any) API changes need to be made to fit
>> > > > > in with
>> > > > > the API design of the hazmat layer, and general "make the
>> > > > > code style
>> > > > > match cryptography".
>> > > > Jwcrypto author here,
>> > > > from my POV we can discuss license/API/style adjustments
>> > > > needed, just
>> > > > let me know in which form you want to have this discussion.
>> > > >
>> > > > Simo.
>> > > >
>> > > > --
>> > > > Simo Sorce * Red Hat, Inc * New York
>> > > >
>> >
>> > --
>> > Simo Sorce * Red Hat, Inc * New York
>> >