Why Is Escaping Data Considered So Magical?

Fri Jun 25 20:43:51 EDT 2010

In article <mailman.2117.1277511935.32709.python-list at python.org>,
 Ian Kelly <ian.g.kelly at gmail.com> wrote:

> On Fri, Jun 25, 2010 at 5:17 PM, Nobody <nobody at nowhere.com> wrote:
> > To be fair, it isn't actually limited to web developers. I've seen the
> > following in scientific code written in C (or, more likely, ported to C
> > from Fortran) for Unix:
> >
> >        sprintf(buff, "rm -f %s", filename);
> >        system(buff);
> 
> Tsk, tsk.  And it's so easy to fix, too:
> 
>     #define BUFSIZE 1000000
>     char buff[BUFSIZE];
>     if (snprintf(buff, BUFSIZE, "rm -f %s", filename) >= BUFSIZE) {
>         printf("No buffer overflow for you!\n");
>     } else {
>         system(buff);
>     }
> 
> There, that's much more secure.

I recently fixed a bug in some production code.  The programmer was 
careful to use snprintf() to avoid buffer overflows.  The only problem 
is, he wrote something along the lines of:

snprintf(buf, strlen(foo), foo);

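I'm sure the code got reviewed originally, and it was probably looked at 
dozens of times over the years.  The size argument is the length of the 
source string, not the size of buf, so it protects nothing (and it drops 
the last character, and foo doubles as the format string).  Nobody 
caught the problem until we ran a static code analysis tool (Coverity) 
over it.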

To bring this back to something remotely Python related, the point of 
all this is that security is hard.  A lot of the security best practices 
(such as "don't compose SQL queries on the fly with externally tainted 
strings") exist because they address ways that people have gotten burned 
in the past.  It is foolish to think that you're smarter than everybody 
else and have thought of every possibility to avoid getting burned by 
doing the things that have gotten other people in trouble.
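
For the SQL case, here is a minimal sketch of what the "don't compose 
queries on the fly" advice looks like in practice, using the standard 
library's sqlite3 module.  The users table and the find_user / 
find_user_unsafe names are made up for illustration; the unsafe version 
is there only for contrast.

```python
import sqlite3

def find_user(conn, name):
    # The ? placeholder hands name to the driver as data, never as SQL text.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

def find_user_unsafe(conn, name):
    # For contrast only: the input is spliced straight into the statement.
    cur = conn.execute("SELECT id, name FROM users WHERE name = '%s'" % name)
    return cur.fetchall()

# Hypothetical in-memory table, just enough to run the two functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

hostile = "nobody' OR '1'='1"
print(find_user(conn, hostile))         # the whole string is compared as a name
print(find_user_unsafe(conn, hostile))  # the quote ends the literal early
```

With the placeholder the hostile string matches no rows; with string 
formatting the WHERE clause becomes always true and every row comes 
back.  Other DB-API drivers use different placeholder styles, so check 
the paramstyle of whichever one you use.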