What's so funny? WAS Re: rotor replacement

Sat Jan 29 21:32:05 EST 2005

"Martin v. Löwis" <martin at v.loewis.de> writes:
> Apparently, people disagree on what precisely the API should be. E.g.
> cryptkit has
> 
> obj = aes(key)
> obj.encrypt(data)

I don't disagree about the API.  The cryptkit way is better than the
ECB example I gave, but the ECB example shows it's possible to do it
in one call.

> I think I would prefer explicit encrypt/decrypt methods over a
> direction parameter. Whether or not selection of mode is a separate
> parameter, or a separate method, might be debatable

I prefer separate methods too; however, if it were done with a
direction flag instead, it wouldn't really cause a problem.  As long
as the functionality is there, I can use it.

> I would trust my intuition more for a single function than for an
> entire API. In this specific proposal, I think I would trust my
> intuition and reject the ECB function because of the direction argument.

As an experienced user of a lot of these packages, I can tell you I've
seen it done both ways and I have a slight preference for separate
calls, but it really doesn't matter one way or the other, and it's not
worth getting into a debate about it or having a committee design the
API and worry about such trivial issues.

BTW, the main reason to reject the example ECB function is that
creating a key object ("key schedule") from a string can take
significant computation (sort of like compiling a regexp) so the ECB
function for some ciphers would have to cache the object like the
regexp module does.  Yuck.

The direction flag question would normally be between:

    key = aes.key(key_data)
    ciphertext = key(plaintext, "e")

or

    key = aes.key(key_data)
    ciphertext = key.encrypt(plaintext)

FWIW, another way to do it, also sometimes preferable, is:

    key = aes.ecb(key_data, "e")  # "e" for encryption, or "d" for decryption
    ciphertext = key(plaintext)

I think the module I proposed did it this last way, but I haven't
looked at it in a while.
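For comparison, all three variants can sit on top of the same key
object.  The sketch below is only an illustration under loud
assumptions: the Key class and the ecb function are hypothetical, and
the "cipher" is a byte-wise shift chosen so that encryption and
decryption differ.  It has no security value.

```python
class Key:
    """Hypothetical key object wrapping a toy byte-shift 'cipher'."""

    def __init__(self, key_data: bytes):
        if not key_data:
            raise ValueError("key must not be empty")
        self._k = key_data

    def _shift(self, data: bytes, sign: int) -> bytes:
        k = self._k
        return bytes((b + sign * k[i % len(k)]) % 256
                     for i, b in enumerate(data))

    # Variant 2: separate methods.
    def encrypt(self, plaintext: bytes) -> bytes:
        return self._shift(plaintext, +1)

    def decrypt(self, ciphertext: bytes) -> bytes:
        return self._shift(ciphertext, -1)

    # Variant 1: direction flag passed on every call.
    def __call__(self, data: bytes, direction: str) -> bytes:
        if direction == "e":
            return self.encrypt(data)
        if direction == "d":
            return self.decrypt(data)
        raise ValueError("direction must be 'e' or 'd'")


def ecb(key_data: bytes, direction: str):
    """Variant 3: direction fixed once, when the key is set up."""
    key = Key(key_data)
    if direction == "e":
        return key.encrypt
    if direction == "d":
        return key.decrypt
    raise ValueError("direction must be 'e' or 'd'")
```

Each variant is a few lines of glue over the others, which is why the
choice between them matters so little in practice.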

The point is that when faced with yet another crypto package, I don't
get too concerned about which simple API variant it uses to do such
a basic operation.  I just care that the operation is available.  I
look in the docs to find that package's particular API for that
operation, and I do what the docs say.

I should make it clear that this module is like Python's low-level
"thread" module in that you have to know what you're doing in order to
use it directly without instantly getting in deep trouble.  Most
applications would instead use it indirectly through one or more
intermediate layers.  

> I fully understand what you desire - to include the module "as a
> battery". What makes this decision difficult is that you fail to
> understand that I don't want included batteries so much that I
> would accept empty or leaking batteries.

I do understand that, and the prospect of empty or leaking batteries
is vitally important when considering whether to include a particular
battery.  But for the purposes of an included-battery discussion, the
characteristics of NON-included batteries are not relevant, given that
we know they exist.

> >>http://sourceforge.net/projects/cryptkit/  ...> 
> > I've examined that module, I wouldn't consider it
> > ideal for the core (besides AES, it has some complicated additional
> > functions that aren't useful to most people)
> 
> Ok, that would be a problem. If this is a simple removal of functions
> that you'ld request (which functions?), 

OK.  First you have to decide whether you want a general crypto
toolkit, or just an AES module.  I've been concentrating on just an
AES module (or rather, a generic block cipher module with AES and DES)
since I figure that creates fewer areas of controversy, etc.  I think
it's too early to standardize a fancy toolkit.  Once there's block
ciphers, we can think about adding more stuff afterwards.

For that module, I'd say remove everything except AES and maybe
SHA256, and ask that DES be added.  SHA256 is possibly useful, but
isn't really part of an encryption package; it can be separated out
like the existing sha and md5 modules.  Also, it should be brought
into PEP 247 compliance if it's not already.
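As a reminder of what PEP 247 compliance looks like from the user's
side, here is a short sketch of the interface (update, digest,
hexdigest, copy, digest_size).  It assumes today's hashlib module,
which postdates this message, purely as a convenient object that
follows the same conventions.  It is not the cryptkit code.

```python
import hashlib

# Incremental hashing: feed the data in pieces.
h = hashlib.sha256()
h.update(b"hello, ")

# copy() forks the running state so a common prefix is hashed once.
fork = h.copy()

h.update(b"world")
fork.update(b"there")

# The incremental result matches hashing the whole string in one go.
one_shot = hashlib.sha256(b"hello, world").digest()
same = (h.digest() == one_shot)

# digest_size is the output length in bytes (32 for SHA256).
size = h.digest_size
```

A hash that follows this interface can be swapped in wherever code
already expects the sha or md5 style of object.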

Rationale: I'd get rid of the entropy module now that os.urandom is
available.  Having the OS provide entropy is much better than trying
to do it in user code.  I'd get rid of the elliptic curve stuff unless
there's some widely used standard or protocol that needs that
particular implementation.  Otherwise, if I want ECC in a Python
program, I'd just do it on characteristic-p curves in pure Python
using Python longs.  (Bryan's package uses characteristic-2 curves,
which means the arithmetic is all boolean operations that are more
efficient on binary CPUs, especially small ones.  But that means the
module has to be written in C, because doing all those boolean
operations in Python is quite slow.  It would be like trying to do
multi-precision arithmetic in Python with Python ints instead of
longs).  Once there's a widely accepted standard for ECC like there is
for AES, then I'd want the stdlib to have an implementation of the
standard, but right now there are just a lot of incompatible,
nonstandard approaches running around.
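To show how little code the characteristic-p approach needs, here is a
minimal sketch of point addition and scalar multiplication in pure
Python.  The assumptions are all mine: the curve y^2 = x^3 + 2x + 2
over GF(17) with base point (5, 1) is a toy example far too small for
real use, the names ec_add and ec_mul are made up, and the code is not
constant-time.  It relies on today's int (which absorbed the old long
type) and on pow(x, -1, p), which needs Python 3.8 or later.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(P_MOD).  Illustration only.
P_MOD, A, B = 17, 2, 2
G = (5, 1)  # a point on the curve; None represents the point at infinity


def ec_add(p1, p2):
    """Add two points on the toy curve."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the negative of p1
    if p1 == p2:
        # Tangent slope for point doubling.
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        # Chord slope for two distinct points.
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    s %= P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def ec_mul(k, point):
    """Compute k * point by double-and-add."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result
```

With a real curve only the constants change; the modulus is then a few
hundred bits long, and the built-in long arithmetic does the heavy
lifting.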

If SHA256 is accepted then SHA512/SHA384 (these are basically the
same) might as well also be.  Not many people are using any of these
hash functions right now.  Usage will increase over time (they are US
federal standards like AES), and they'll probably be worth adding
eventually.  I'm indifferent to whether they're added now.

I think I'd include RC4 under the "toolkit" approach, if it's not
already there.  I'd also include a pair of functions like the ones in
p3.py, i.e. an utterly minimal API like:

    ciphertext = encrypt_string(key_string, plaintext)
    plaintext = decrypt_string(key_string, ciphertext)

that does both encryption and authentication, for key and data strings
of arbitrary size.  This would be what people should use instead of
the rotor module.  It would be about a 10 line Python function that
calls the block cipher API.  The block cipher API itself is intended
for more advanced users.
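Here is a minimal sketch of what such a pair of functions could look
like.  It is an illustration under stated assumptions, not the p3.py
code and not a vetted design: since the stdlib has no block cipher to
call, it substitutes a keystream built from SHA256 in counter mode,
and it authenticates with HMAC over the nonce and ciphertext.  The
helper names _derive_keys and _keystream are made up.

```python
import hashlib
import hmac
import os


def _derive_keys(key_string: bytes, nonce: bytes):
    # Separate per-message keys for encryption and authentication.
    enc = hmac.new(key_string, b"enc" + nonce, hashlib.sha256).digest()
    mac = hmac.new(key_string, b"mac" + nonce, hashlib.sha256).digest()
    return enc, mac


def _keystream(enc_key: bytes, length: int) -> bytes:
    # Stand-in for a block cipher in counter mode (an assumption).
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = enc_key + counter.to_bytes(8, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_string(key_string: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)  # fresh random value for every message
    enc_key, mac_key = _derive_keys(key_string, nonce)
    body = _xor(plaintext, _keystream(enc_key, len(plaintext)))
    tag = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def decrypt_string(key_string: bytes, ciphertext: bytes) -> bytes:
    if len(ciphertext) < 16 + 32:
        raise ValueError("ciphertext too short")
    nonce, body, tag = ciphertext[:16], ciphertext[16:-32], ciphertext[-32:]
    enc_key, mac_key = _derive_keys(key_string, nonce)
    expected = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    # Check the authentication tag before decrypting anything.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return _xor(body, _keystream(enc_key, len(body)))
```

The caller never sees modes, nonces, or padding, which is the point of
keeping this layer separate from the block cipher API.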

> I may have made exceptions to this rule in the past, e.g. when the
> proposal is to simply wrap an existing C API in a Python module
> (like shadow passwords). In this case, both the interface and
> the implementation are straight-forward, and I expect no surprises.

I'd be happy with an AES module that simply wrapped a C API and that
should be pretty much surprise-free.  It would be about like the SHA
module in terms of complexity.  What I proposed tries to be a bit more
Pythonic but I can live without that.

> For an AES module (or most other modules), I do expect surprises.

Well, the hmac module was added in 2.2 or 2.3, without any fuss.  It's
written in Python and is somewhat slow, though.  What kind of
development process should it take to replace it in the stdlib with a
C module with the exact same interface?

I think you're imagining a basic AES module to be more complicated
than it really is, because you're not so familiar with this type of
module.  Also, possibly because I'm making it sound like a lot of work
to write.  But that work is just for the C version, and assumes that
it's me personally writing it.  What little experience I've had with
Python's C API has been painful, so I figure on having to spend
considerable time wrestling with it.  Someone more adept with the C
API could probably implement the module with less effort than I'd need.

> I have said many times that I am in favour of including an AES
> implementation in the Python distribution, e.g. in

Oh, ok.  Earlier you said you wanted user feedback before you could
conclude that there was reason to want an AES module at all.

> What I cannot promise is to include *your* AES implementation,
> not without getting user feedback first. 

That's fine.  But I think it's reasonable to actually approach some
users and say "this module is being considered for addition to the
core--could you try plugging it into your applications that now use
other external modules, and say whether you think it will fill your
needs".

That's impossible if consideration doesn't even start until testing is
complete.  All one could say then is "here's yet another crypto
module, that does less than the toolkit you're already using, could
you please temporarily drop whatever you're doing and update your
programs to switch from your old module that works, to a new module
that MIGHT work?".  If the new module is accepted into the core, of
course, it becomes worth retrofitting the existing toolkits to use it.

> The whole notion of creating the module from scratch just to include
> it in the core strikes me as odd - when there are so many AES
> implementations out there already that have been proven useful to users.

We discussed this already.  Here are three possible contexts for
designing a crypto module:

   1) You're designing it to support some specific application you're
      working on.  The design will reflect the needs of that application
      and might not be so great for a wider range of applications.
   2) You're writing a general purpose module for distribution outside
      the core (example: mxCrypto).  You'll include lots of different
      functions, pre-built implementations of a variety of protocols, etc.
      You might include bindings that try to be compatible with other
      packages, etc.  Maybe this can get added to the core someday,
      like numarray, but for now, that's a rather big step.
   3) You're designing to add basic functionality to the core.  Here,
      you try to pick a general purpose API not slanted towards a
      particular app, and provide just some standard building blocks
      that other stuff can be built around.  This is more like the
      math module, which just does basics: sqrt, sin, cos, etc., with no
      attempt at the stuff in a package like numarray.  But if there's
      a standard like FIPS 80 and if there's good reason to implement
      a big subset of it (which there is), then you may as well
      implement the standard completely unless there's a good reason
      not to (which there isn't; the less important operations are a
      few dozen lines of code total).

I think context #3 gets you something better suited for the core, and
none of the existing crypto modules were written that way.  The same
is in fact true for many of the non-crypto modules, which seem to have
been written in context #1 and would have been better under context #3.

Also, there's the plain fact that none of the authors of the existing
crypto modules have offered to contribute them.  So somebody had to
step up and do something.

> > what is your own subjective estimate of the probability?
> Eventually, with hard work, I estimate the chances at, say, 90%. 

Hmm, this is very very interesting.  I am highly confident that all
the purely technical questions (i.e. everything about the API and the
code quality, etc.) can converge to a consensus-acceptable solution
without much hassle.  I had thought there were insurmountable
obstacles of a nontechnical nature, mainly caused by legal issues, and
that these are beyond any influence that I might have by writing or
releasing anything.