How to protect Python source from modification

Steve M sjmaster at gmail.com
Mon Sep 12 14:22:06 EDT 2005


This is a heck of a can of worms. I've been thinking about these sorts
of things for a while now. I can't write out broad, well-structured
advice at the moment, but here are some things that come to mind.

1. Based on your description, don't trust the client. Therefore,
"security", whatever that amounts to, basically has to happen on the
server. The server should be designed with the expectation that any
input is possible, from slightly tweaked variants of the normal
messages to a robotic client that spews the most horrible ill-formed
junk frequently and in large volumes. It is the server's job to decide
what it should do. For example, consider a website that has a form for
users to fill out. The form has javascript, which executes on the
client, that helps to validate the data by refusing to submit the form
unless the user has filled in required fields, etc. This is client-side
validation (analogous to authentication). It is trivial for an attacker
to force the form to submit without filling in required fields. Now if
the server didn't bother to do its own validation but just inserted a
new record into the database with whatever came in from the form
submission, on the assumption that the client-side validation was
sufficient, this would constitute a serious flaw. (If you wonder then
why bother putting in client-side validation at all - two reasons are
that it enhances the user experience and that it reduces the average
load on the server.)
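
As a rough sketch of the server-side check described above: the field
names and the function name here are made up for illustration, and a
real application would have its own rules. The point is only that the
server re-checks everything, whatever the client-side javascript did.

```python
# Hypothetical required fields for the form; not taken from any real application.
REQUIRED_FIELDS = ("name", "email")


def validate_submission(form):
    """Return a list of error strings; an empty list means the input is acceptable."""
    # The client may send anything at all, so check the overall shape first.
    if not isinstance(form, dict):
        return ["submission is not a mapping of fields"]
    errors = []
    for field in REQUIRED_FIELDS:
        value = form.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append("missing required field: " + field)
    # A deliberately crude plausibility check, only to show the idea.
    email = form.get("email")
    if isinstance(email, str) and email.strip() and "@" not in email:
        errors.append("email does not look like an address")
    return errors
```

The server would insert a record only when the returned list is empty,
and reject the submission otherwise.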

2. If you're moving security and business logic to the server you have
to decide how to implement that. It is possible to rely solely on the
RDBMS, e.g. PostgreSQL. This has many consequences for deployment as
well as development. For example, if you need to restrict actions based
on user, you will have a different PgSQL user for every business user,
and who is allowed to modify what will be a matter of PgSQL
configuration. PostgreSQL is mature, robust and well developed, so you
can rely on things to work as you tell them to. On the other hand, you
(and your clients?) must be very knowledgeable about the database
system to control your application. You have to be able to describe
permissions in terms of the database. They have to be able to add new
users to PgSQL for every new business user, and be able to adjust
permissions if those change. You have to write code in the RDBMS
procedural language, which, well, I don't know a lot about, but I'm
not too thrilled about the idea. Far more appealing is to write code in
Python. There are lots of other consequences as well.
Imagine, in contrast, that user authentication is done in Python. In this
scenario, you can have just a single PgSQL user for the application
that has all access, and the Python always uses that database user but
decides internally whether a given action is permitted based on the
business user. Of course, in this case you have to come up with your own
security model, which I'd imagine isn't trivial. You could also improve
security by combining the approaches, e.g. have 3 database users for 3
different business "roles" with different database permissions, and
then in Python you can decide which role applies to a business user and
use the corresponding database user to send commands to the database.
That could help to mitigate the risks of a flaw in the Python code.
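
A minimal sketch of the combined approach, assuming three roles: the
role names, database user names and the user table below are all
invented for the example. Python decides which role a business user
has, and the role decides which database user the connection is made
as.

```python
# Hypothetical mapping from business role to database login.
ROLE_TO_DB_USER = {
    "clerk": "app_readonly",
    "manager": "app_readwrite",
    "admin": "app_admin",
}

# Hypothetical table of business users and their roles. In practice this
# would live somewhere the application controls, not in source code.
USER_ROLES = {"alice": "manager", "bob": "clerk"}


def db_user_for(business_user):
    """Return the database user to connect as for this business user."""
    role = USER_ROLES.get(business_user)
    if role is None:
        # Unknown users get no database access at all.
        raise PermissionError("unknown business user: %r" % (business_user,))
    return ROLE_TO_DB_USER[role]
```

Even if the Python code has a bug that lets "bob" attempt a write, the
database itself would still refuse it, provided the read-only database
user really has no write permissions.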

3. You should therefore have a layer of Python that runs on the server
and mediates between client and database. Here you can put
authentication, validation and other security. You can also put all
business logic. It receives all input with the utmost suspicion and
only if everything is in order will it query the database and send
information to the client. There is little or no UI stuff in this
layer. To this end, you should check out Dabo at www.dabodev.com. This
is an exciting Python project that I haven't used much but am really
looking forward to using when I have the chance, and as it becomes more
developed. My impression is that it is usable right now. They
basically provide a framework for a lot of stuff you seem to have done
by hand, and it can give you some great ideas about how to structure
your program. You may even decide to port your program to Dabo.
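
To make the idea of a mediating layer concrete, here is a small sketch.
It does not use Dabo, and it uses the standard sqlite3 module with an
in-memory database as a stand-in for PostgreSQL so that it runs on its
own. The permission table, the "items" table and the function name are
assumptions made up for the example.

```python
import sqlite3

# Hypothetical permission table: which business users may perform which actions.
PERMISSIONS = {"alice": {"read", "add"}, "bob": {"read"}}


def handle_request(conn, user, action, payload):
    """Check the request first; touch the database only if everything is in order."""
    # Reject anything that does not even have the expected types.
    if not isinstance(user, str) or not isinstance(action, str):
        return {"ok": False, "error": "malformed request"}
    # Authorization: unknown users and unlisted actions are both refused.
    if action not in PERMISSIONS.get(user, set()):
        return {"ok": False, "error": "not permitted"}
    if action == "add":
        name = payload.get("name") if isinstance(payload, dict) else None
        if not isinstance(name, str) or not name.strip():
            return {"ok": False, "error": "invalid name"}
        # Parameterized query: client input is never pasted into the SQL text.
        conn.execute("INSERT INTO items (name) VALUES (?)", (name.strip(),))
        conn.commit()
        return {"ok": True}
    if action == "read":
        rows = conn.execute("SELECT name FROM items ORDER BY id").fetchall()
        return {"ok": True, "items": [row[0] for row in rows]}
    return {"ok": False, "error": "unknown action"}


# Stand-in database for the sketch; a real server would connect to its RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
```

There is no UI code here at all: the client only ever sees the
dictionaries this function returns, and never talks to the database
directly.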



