Experiences/guidance on teaching Python as a first programming language

Chris Angelico rosuav at gmail.com
Wed Dec 18 03:51:26 EST 2013


On Wed, Dec 18, 2013 at 7:18 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> You want to know why programs written in C are so often full of security
> holes? One reason is "undefined behaviour". The C language doesn't give a
> damn about writing *correct* code, it only cares about writing
> *efficient* code. Consequently, one little error, and does the compiler
> tell you that you've done something undefined? No, it turns your validator
> into a no-op -- silently:

I disagree about undefined behaviour causing a large proportion of
security holes. Maybe it produces some, but it's more likely to
produce crashes or inoperative codde. The example you cite is a rare
one where it's actually security code; that's why it's a security
vulnerability. If the same operation were flagging, say, that this
value needed to be saved to disk, then the same bug would result in
stuff not getting saved on that platform, which isn't a security risk.

Here's something from CERN about C and security:
https://security.web.cern.ch/security/recommendations/en/codetools/c.shtml

Apart from the last one (file system atomicity, not a C issue at all),
every single issue on that page comes back to one thing: fixed-size
buffers and functions that treat a char pointer as if it were a
string. In fact, that one fundamental issue - the buffer overrun -
comes up directly when I search Google for 'most common security holes
in c code' (second hit, Wikipedia "Buffer overflow" page). Here's
another page listing security concerns:

http://www.makelinux.net/alp/085

First entry: Buffer overruns. Second: File system races. Third:
Improper quoting of shell commands.

Not one of the above pages, nor any other that I came across as I was
skimming, mentioned anything involving undefined behaviour. Every one
of them is issues with properly-defined behaviour - unless you count
specific details of memory layout. If you know that automatic
variables are stored on the stack, then you can blow some buffer and
overwrite the return value. But that's the *attacker* depending on
undefined behaviour, not the *programmer*, who simply has a bug in his
code (something that's able to write more to the buffer than there's
room for).

Python is actually *worse* than C in this respect. I know this
particular one is reasonably well known now, but how likely is it that
you'll still see code like this:

def create_file():
    f = open(".....", "w")
    f.write(".......")
    f.write(".......")
    f.write(".......")

Looks fine, is nice and simple, does exactly what it should. And in
(current versions of) CPython, this will close the file before the
function returns, so it'd be perfectly safe to then immediately read
from that file. But that's undefined behaviour. Python does not
guarantee that this will work, so this might work for years and then
break when it's run under Jython, or it might even work in Jython too,
but there's just one specific situation where the file's read
immediately after being written, and that one case fails. I think
that's used at least as often as any of C's pieces of undefined
behaviour.

ChrisA



More information about the Python-list mailing list