Strategy to Verify Python Program is POST'ing to a web server.

Sat Jun 18 16:40:14 EDT 2011

On Sat, Jun 18, 2011 at 1:26 PM, Chris Angelico <rosuav at gmail.com> wrote:
> SSL certificates are good, but they can be stolen (very easily if the
> client is open source). Anything algorithmic suffers from the same
> issue.

This is only true if you distribute your app with one built-in
certificate, which does indeed seem like a bad idea.  When you know
your user base though, especially if this is a situation with a small
number of deployments, then you can distribute a unique certificate to
each client, signed by your CA.  Not knowing what kind of statistics
the OP is trying to collect, we really don't know if this client will
be running in one place or thousands.
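
As a rough sketch of the per-client certificate idea, the client side
could build its TLS context with the standard ssl module like this.
The function name and the file-path parameters are made up for
illustration; nothing here comes from the OP's code.

```python
import ssl


def make_client_context(ca_file=None, cert_file=None, key_file=None):
    """Build a TLS context that verifies the server and, if given,
    presents this deployment's own certificate.

    All three paths are hypothetical placeholders.
    """
    # Trust only our own CA when ca_file is given; otherwise fall back
    # to the system trust store.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file is not None:
        # The unique, CA-signed certificate issued to this deployment.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

The resulting context could then be handed to something like
urllib.request.urlopen(url, data=payload, context=ctx) for the POST.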

Even if there will be thousands of deployments, you could generate an
RSA key-pair on the client similar to how an ssh client does, and use
that to sign the data.  Then you can at least track which client each
submission came from (storing the public key and IP address), and then
remove submissions as necessary if you detect abuse.
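
Here is a minimal sketch of the bookkeeping half of that, i.e. tracking
submissions per public key and throwing them out later.  It assumes the
signature has already been verified before record() is called; that
step needs a third-party crypto library since the standard library has
no RSA.  The class and method names are invented for the example.

```python
import hashlib
from collections import defaultdict


class SubmissionLog:
    """Hypothetical server-side store keyed by public-key fingerprint."""

    def __init__(self):
        # fingerprint -> list of (ip_address, payload) tuples
        self._by_key = defaultdict(list)

    @staticmethod
    def fingerprint(public_key_bytes):
        # A SHA-256 digest of the encoded public key identifies the client.
        return hashlib.sha256(public_key_bytes).hexdigest()

    def record(self, public_key_bytes, ip_address, payload):
        # Assumes the caller has already checked the signature on payload.
        fp = self.fingerprint(public_key_bytes)
        self._by_key[fp].append((ip_address, payload))
        return fp

    def purge(self, fp):
        # Drop everything one client sent; returns how many were removed.
        return len(self._by_key.pop(fp, []))

    def all_payloads(self):
        return [p for entries in self._by_key.values() for _, p in entries]
```

Of course an abuser can just generate a fresh key-pair, so this only
makes cleanup easier; it doesn't prevent ballot stuffing.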

> In the example you gave, there's no solution. Someone could easily
> spoof it and stuff the ballot. But if you make that more difficult
> than the survey is worth, then you can largely trust your data.
>
> You could go a long way toward it, though, by
> using something ridiculously complex, such as:
>
> * Client connects via SSL to host, using a known certificate.
> * Server verifies certificate, and sends client some Python code to execute.
> * Client verifies the server's certificate (vital!).
> * Client executes the code it's given, and based on the result, plus
> some other data, sends the server a hash value.
> * Server executes the same code it gave the client, knows the data it
> was working with, and calculates the equivalent hash.
> * If the two hashes match, the client is deemed to be valid.

An authentication process that involves the client executing code
supplied by the server creates a single point of failure (the server
is compromised or a man-in-the-middle attack is under way) by which
arbitrary code could get executed on the client.  Yikes!  It's ok to
execute server-supplied code in a sandbox (e.g. JavaScript), but I
would never want to use software that sends me code over the network
to be executed directly on my system (unless that's the express
purpose of the software, like celery).  Besides, it seems that all
you've accomplished is verifying that the client can execute Python
code and you've made it a bit less convenient to attack.

With client authentication enabled, the TLS handshake really does
verify that the client holds the private key for a certificate signed
by your CA.  If you can get signed
certs to each deployment, that is spectacular security that will serve
you well.  The above sounds a bit like you're trying to create a new
cipher based on exchanged code that gets executed.  I encourage you to
not reinvent the wheel, and stick with the ciphers that are already
standard in the SSL/TLS handshake.
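
For what it's worth, asking the standard ssl module to demand a client
certificate on the server side takes only a few lines.  This is a
sketch under assumptions: the function name and the paths are
placeholders, and a real server would wrap its listening socket with
the returned context.

```python
import ssl


def make_server_context(ca_file=None, cert_file=None, key_file=None):
    """Build a server-side TLS context that rejects clients without a
    certificate signed by our CA.  Paths are hypothetical placeholders.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Fail the handshake unless the client presents a valid certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        # The CA that signed the per-deployment client certificates.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file is not None:
        # The server's own certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

After a successful handshake, getpeercert() on the wrapped socket
returns the client's certificate details, which is how the server can
tell deployments apart.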

If you cannot uniquely authenticate each client (either through a
signed cert or by having the user supply credentials interactively),
then you'll have to accept that you cannot trust the submitted data
100%, and just take measures to mitigate abuse.

Michael