[Tutor] hacking 101
Erik Price
erikprice@mac.com
Sun, 31 Mar 2002 21:25:22 -0500
On Sunday, March 31, 2002, at 04:56 PM, Remco Gerlich wrote:
> At every place where you get user input, *in any form*, try to think of
> the weirdest form it could take, the syntax you don't expect. Question
> your assumptions - if it's user input, they're not held to your
> assumptions.
>
> Input length is part of the input (but not as critical in Python as it
> often is in C). Input timing is part of the input. Etc.
>
> Get into the mindset of someone who sees a set of rules for a game, and
> tries to figure out how to cheat at it.
>
> Focus at user input. Everything that comes from outside is suspect.
> Trace the path of this data through your code.
I haven't yet had a chance to use Python for web stuff (I'm still
learning the basics), but Remco sums up the attitude I take when
programming apps in PHP. For security, you need a robust server like
Apache, configured so that it cannot easily be taken advantage of. That
is really beyond the topic of programming and more the domain of system
administration. For the code itself, you write what I call
"error-checking" functions to ensure that user input conforms to certain
criteria, or that your program knows what to do if it doesn't. For
instance, in the content management site I'm developing for my employer,
every single user input is checked via regexes to make sure that it's
appropriate. The app won't accept letters or punctuation when it's
expecting an integer. If the user enters them, I display an error
message and re-display the user's values. Really, my security is
designed more to protect the user from making a mistake than to keep out
an intruder, but it serves the same purpose. I check for as many
possibilities as I can think of. And while it would be nice to be able
to accept "two-thousand-two" OR "2002" as input, it's really outside the
scope of the app I am making to be this flexible. I try to make sure
that my forms use as many non-textual inputs as possible, limiting the
user to making choices rather than producing their own input.
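
To make the integer check concrete, here is roughly what such an
"error-checking" function could look like in Python. It's only a sketch
under assumptions: parse_year, handle_form, the "year" field name and
the four-digit limit are all made up for illustration and are not taken
from the PHP app described above.

```python
import re

# Assumption: a "year" field that must be 1 to 4 ASCII digits.
_YEAR_RE = re.compile(r"[0-9]{1,4}")


def parse_year(raw):
    """Return the input as an int, or None if it is not plain digits."""
    if not isinstance(raw, str):
        return None
    text = raw.strip()
    # fullmatch() requires the whole string to match, so "2002abc" fails.
    if _YEAR_RE.fullmatch(text) is None:
        return None
    return int(text)


def handle_form(fields):
    """Validate a dict of submitted fields (hypothetical form handler)."""
    year = parse_year(fields.get("year", ""))
    if year is None:
        # Hand the original values back so the form can be re-displayed.
        return {"error": "Please enter the year as digits, e.g. 2002.",
                "values": dict(fields)}
    return {"error": None, "values": {"year": year}}
```

Returning None instead of raising keeps the "what to do if it doesn't
conform" decision in one place, the form handler, which can show the
message and the user's original values again.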
Of course, on this last note, be very careful -- just because the form
only displays three choices doesn't mean that those are the only choices
that the user has! For instance, perhaps the user isn't even using a
web browser to communicate with the script! Perhaps they've telnetted
in and are submitting some POST data that I wasn't prepared for. If
this is the case, then I am in trouble if I expected that the user would
enter nothing but 1, 2, or 3 for instance. Remember that all HTTP data
is plaintext unless you've encrypted it, so even a password form is
pretty much wide open to the internet if you're not using SSL. Make
sure that once your data has safely made the trip from the client to
the server, it stays secure there. FreeBSD seems like a reasonably
secure box, but it still needs to be maintained: shut down unnecessary
services, make sure you have a firewall, don't give access to any parts
of the box that aren't needed, and if your database is shared, hash
passwords with md5() rather than storing them in plain text, etc.
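
Two of the points above can be sketched in Python: re-checking a
"fixed" choice on the server, and not keeping passwords readable in a
shared database. Treat this as an illustration only. ALLOWED_CHOICES,
parse_choice, hash_password and check_password are hypothetical names,
and the sketch uses a salted PBKDF2 hash from the standard hashlib
module rather than a bare md5(), which is a substitution and not what
the paragraph above describes.

```python
import hashlib
import hmac
import os

# Assumption: the form offers exactly these three options.
ALLOWED_CHOICES = {"1", "2", "3"}


def parse_choice(raw):
    """Return the choice if the form really offered it, else None."""
    if isinstance(raw, str) and raw in ALLOWED_CHOICES:
        return raw
    return None


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) for storage instead of the password itself."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def check_password(password, salt, expected, iterations=200_000):
    """Re-hash the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

The membership test does not care how the request was produced, so
hand-typed POST data with a choice of "4" is rejected the same way a
browser typo would be.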
There's quite a bit to learn in this topic, and I'm sorry I don't have
any Python-specific advice. But I'm sure there are Python equivalents
to anything that can be done in PHP, like session management and the
htmlentities() function (translates special characters in user data into
HTML entities so that the input can't 'escape' with quotes or > etc). Really,
I've never seen a hard-and-fast list of what to watch for. Just get
into web development and learn as you go. Everyone makes mistakes with
this, and I'm no exception.
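
As one example of such an equivalent, current Python versions ship
html.escape() in the standard library. It is narrower than
htmlentities(): it only rewrites &, <, > and (with quote=True) the two
quote characters, which is the part that matters for keeping input from
breaking out of the markup. render_comment below is a made-up helper
just to show it in use.

```python
import html


def render_comment(user_text):
    """Wrap user-supplied text in a paragraph with markup neutralized."""
    # quote=True also escapes " and ', so the result is safe to place
    # inside a quoted attribute value as well as in element content.
    safe = html.escape(user_text, quote=True)
    return '<p class="comment">' + safe + "</p>"


print(render_comment('<script>alert("hi")</script>'))
```

Escaping at output time like this complements the input checks rather
than replacing them.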
Erik