how to replace and string in a "SELECT ... IN ()"

Michael Mabin d3vvnull at gmail.com
Sat Sep 27 19:52:44 EDT 2008


I'm exhausted, so I'll just shut up about this after a few final words.
1.  "edits" is used in data warehousing to describe data scrubbing or
filtering of fields in records that are used as input sources for loading
into data warehouses. It's a term that goes way back to batch processing on
the mainframe, so it's been used this way for a long time. Sometimes we use
'validation' interchangeably with 'edit' but our use of validation usually
involves a check of input data against business rules and not type or range
checking which we consider 'edits'.  So maybe you're not used to hearing the
term used this way, but it is a variation perhaps of what you're used to.

2.  The data warehousing world is quite different from the web world in many
respects.  I appreciate the fact that web application developers must always
be mindful of the 'barbarians at the gates' but our batch environments are
locked down, especially our databases.  We have secured batch IDs. Our
programs are run using an automated schedule.  Our databases are secured to
begin with so not even our batch IDs have alter, create or drop permissions
on the database.  There's no way this exploit would ever enter this
environment because those servers aren't accessible through the web or
online.

3.  When you say 'in python we don not fear that' it's curious, because it
assumes that all Python users must think as you do.  And yet, clearly I
don't think as you do.  I really like this language and praise the
Netherlands each day I wake up that they gave us someone as brilliant as
Guido to make programming fun and not such a chore.  But I think there are
valid reasons for taking different approaches to security, data validation
and dare I say it -- field editing -- despite the claims of many who say
that in Python - TOOWTDI.

4. It's also curious that I was obliged to state my assumptions about batch
programming vs. web programming.  There is nothing in the OP to suggest that
the problem was related to web programming.  It's a question about how to
interpolate the values in a list with format specifiers in a SQL statement.
 Still, if this list is not the result of user input but was derived within
the body of the program, which by itself has no malicious intent, are we
still going to code this as if it were input by a user that might be a
hacker?  This seems retarded and paranoid to me.  And where in that post
does it say that the list is from web input?  Although to be fair maybe most
of the Python community is doing web development.

Thanks for the stimulating and educational discussion.

On Sat, Sep 27, 2008 at 4:30 PM, Tino Wildenhain <tino at wildenhain.de> wrote:

> Hi,
>
> Michael Mabin wrote:
>
>> If the inputs are edited prior to the construction of the string and these
>> fields are used for more than one update then it's not an exploit.  It's
>> simply a matter not repeating yourself when coding.
>>
>
> In python we do not fear that.
>
>  In this particular case too, we're talking about a list of integers that
>> gets inserted into a string.  If the list is validated prior to its
>>
>
> Its a list, if it indeed has integers in it is uncertain. It is so very
> easy to check that that it doesnt even make sense to write such lengthy
> emails about how bad you want to avoid it. Just do it. And even more so
> if you are telling others how to do things make sure they do not so easy
> shoot themselfes in their feet.
>
>  insertion into an SQL statement then there is no exploit.  If I write a
>> batch program (not a web program) that retrieves this list of integers from
>> other sources and validates the data prior to using it in an SQL statement,
>> that should be sufficient.
>>
>
> This might be well true but if you have a look at your original
> contribution you see that all these your asumtions are just not in.
>
>  As far as wrong and right is concerned. I think it's more about doing what
>> is appropriate according to the circumstances.  As a rule you
>>
>
> If its easy to do, why not just doing it correctly (or robust) in all
> circumstances to just avoid overlooking a case?
>
>  should only code what is appropriate for the circumstances.  If it's
>> appropriate to code more simply without introducing unnecessary complexity
>> you should do so.
>>
>
> But you did not tell us about your asumtations about the circumstances.
>
>  I work in the data warehousing ETL world, where we have to perform field
>> edits or transformations to load source data into databases.  If I'm
>>
>
> Thats wrong. You do not "edit" fields. You have a validating type path and
> _always_ the database is most authoritative about what it accepts.
> Any other concept is just wrong and outright dangerous. There are
> many examples of how this works out (just check bugtraq)
>
>  already performing edits on these fields and if these fields are going to
>> be used for more updates downstream, it's wasteful to perform them again
>> when I build the SQL insert with the list and execute it.
>>
>
> I still don't know what you mean by "edit" ;) If you mean filter out
> special chars with for example replace("bad stuff","good stuff") check
> your idea again, this is not going to work. (google for default permit)
>
>  Finally, whatever happened to the practice of granting appropriate
>> privileges to IDs that perform database operations?  Shouldn't the person
>> acting in the capacity of DBA ensure that the user updating or retrieving
>> data from the database does not have DROP, ALTER, or CREATE privileges on
>> that database?
>>
>
> This of course is another layer which should be added - but you would
> not need to - you edited the fields, right? ;)
>
> Sorry, it was not meant to put you to the wall but you insist so much
> on your still dangerous solution while top posting the hell out of
> this thread I just could not ignore it ;)
>
> T.
>



-- 
| _ | * | _ |
| _ | _ | * |
| *  | * | * |
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20080927/b3f2d71a/attachment-0001.html>


More information about the Python-list mailing list