Why Is Escaping Data Considered So Magical?

Owen Jacobson angrybaldguy at gmail.com
Sun Jun 27 21:49:11 EDT 2010


On 2010-06-26 22:33:57 -0400, Lawrence D'Oliveiro said:

> In message <2010062522560231540-angrybaldguy at gmailcom>, Owen Jacobson wrote:
> 
>> It's not hard. It's just begging for a visit from the fuckup fairy.
> 
> That’s the same fallacious argument I pointed out earlier.

In the sense that "using correct manual escaping leads to SQL injection 
vulnerabilities", yes, that's a fallacious argument on its own. 
However, as sites like BUGTRAQ amply demonstrate, generating SQL 
through string manipulation is a risky development practice[0]. You can 
continue to justify your choice to do so however you want, and you may 
even be the One True Developer capable of getting it absolutely right 
under all circumstances, but I'd still reject patches that introduced a 
SQLString-like function and ask that you resubmit them using the 
database API's parameterization tools instead.

Assuming for the sake of discussion that your SQLString function 
perfectly captures the transformation required to turn an arbitrary str 
into a MySQL string literal. How do you address the following issues?

1. Other (possibly inexperienced) developers reading your source who 
may not have the skills to correctly implement the same transform 
correctly learn from your programs that writing your own query munger 
is okay.
1a. Other (possibly inexperienced) developers decide to copy and paste 
your function without fully understanding how it works, in tandem with 
any of the other issues below. (If you think this is rare, I invite you 
to visit stackoverflow or roseindia some time.)

2. MySQL changes the quoting and escaping rules to address a 
bug/feature request/developer whim, introducing a new set of corner 
cases into your function and forcing you to re-learn the escaping and 
quoting rules. (For people using DB API parameters, this is a matter of 
upgrading the DB adapter module to a version that supports the modified 
rules.)

3. You decide to switch from MySQL to a more fully-featured RDBMS, 
which may have different quoting and escaping rules around string 
literals.
3a. *Someone else* decides to port your program to a different RDBMS, 
and may not understand that SQLString implements MySQL's quoting and 
escaping rules only.

4. MySQL AB finally get off their collective duffs and adds real 
parameter separation to the MySQL wire protocol, and implements real 
prepared statements to massive speed gains in scenarios that are 
relevant to your interests; string-based query construction gets left 
out in the cold.
4a. As with case 3, except that instead of the rules changing when you 
move to a new RDBMS, it's the relative performance of submitting new 
queries versus reusing a parameterized query that changes.

On top of the obvious issue of completely avoiding quoting bugs, using 
query parameters rather than escaping and string manipulation neatly 
saves you from having to address any of these problems (and a multitude 
of others) -- the DB API implementation will handle things for you, and 
you are propagating good practice in an easy-to-understand form.

I am honestly at a loss trying to understand your position. There is a 
huge body of documentation out there about the weaknesses of 
string-manipulation-based approaches to query construction, and the use 
of query parameters is so compellingly the Right Thing that I have a 
very hard time comprehending why anyone would opt not to use it except 
out of pure ignorance of their existence. Generating executable code -- 
including SQL -- from untrusted user input introduces an large 
vulnerability surface for very little benefit.

You don't handle function parameters by building up python-language 
strs containing the values as literals and eval'ing them, do you?

-o

[0] If you want to be *really* pedantic, string-manipulation-based 
query construction is strongly correlated with the occurrence of SQL 
injection vulnerabilities and bugs, which is in turn not strongly 
correlated with very many other practices. Happy?




More information about the Python-list mailing list