Taint (like in Perl) as a Python module: taint.py

Johann C. Rocholl jcrocholl at googlemail.com
Mon Feb 5 17:13:04 EST 2007


The following is my first attempt at adding a taint feature to Python
to prevent os.system() from being called with untrusted input. What do
you think of it?

# taint.py - Emulate Perl's taint feature in Python
# Copyright (C) 2007 Johann C. Rocholl <johann at rocholl.net>
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


"""
Emulate Perl's taint feature in Python

This module replaces all functions in the os module (except stat) with
wrappers that will raise an Exception called TaintError if any of the
parameters is a tainted string.

All strings are tainted by default, and you have to call untaint on a
string to create a safe string from it.

Stripping, zero-filling, and changes to lowercase or uppercase don't
taint a safe string.

If you combine strings with + or join or replace, the result will be a
tainted string unless all its parts are safe.

It is probably a good idea to run some checks on user input before you
call untaint() on it. The safest way is to design a regex that matches
legal input only. A regex that tries to match illegal input is very
hard to prove complete.

You can run the following examples with the command
    python taint.py -v
to test if this module works as designed.

>>> unsafe = 'test'
>>> tainted(unsafe)
True
>>> os.system(unsafe)
Traceback (most recent call last):
TaintError
>>> safe = untaint(unsafe)
>>> tainted(safe)
False
>>> os.system(safe)
256
>>> safe + unsafe
u'testtest'
>>> safe.join([safe, unsafe])
u'testtesttest'
>>> tainted(safe + unsafe)
True
>>> tainted(safe + safe)
False
>>> tainted(unsafe.join([safe, safe]))
True
>>> tainted(safe.join([safe, unsafe]))
True
>>> tainted(safe.join([safe, safe]))
False
>>> tainted(safe.replace(safe, unsafe))
True
>>> tainted(safe.replace(safe, safe))
False
>>> tainted(safe.capitalize()) or tainted(safe.title())
False
>>> tainted(safe.lower()) or tainted(safe.upper())
False
>>> tainted(safe.strip()) or tainted(safe.rstrip()) or tainted(safe.lstrip())
False
>>> tainted(safe.zfill(8))
False
>>> tainted(safe.expandtabs())
True
"""

import os
import types


class TaintError(Exception):
    """
    This exception is raised when you try to call a function in the os
    module with a string parameter that isn't a SafeString.
    """
    pass


class SafeString(unicode):
    """
    A string class that you must use for parameters to functions in
    the os module.
    """

    def __add__(self, other):
        """Create a safe string if the other string is also safe."""
        if tainted(other):
            return unicode.__add__(self, other)
        return untaint(unicode.__add__(self, other))

    def join(self, sequence):
        """Create a safe string if all components are safe."""
        for element in sequence:
            if tainted(element):
                return unicode.join(self, sequence)
        return untaint(unicode.join(self, sequence))

    def replace(self, old, new, *args):
        """Create a safe string if the replacement text is also safe."""
        if tainted(new):
            return unicode.replace(self, old, new, *args)
        return untaint(unicode.replace(self, old, new, *args))

    def strip(self, *args):
        return untaint(unicode.strip(self, *args))

    def lstrip(self, *args):
        return untaint(unicode.lstrip(self, *args))

    def rstrip(self, *args):
        return untaint(unicode.rstrip(self, *args))

    def zfill(self, *args):
        return untaint(unicode.zfill(self, *args))

    def capitalize(self):
        return untaint(unicode.capitalize(self))

    def title(self):
        return untaint(unicode.title(self))

    def lower(self):
        return untaint(unicode.lower(self))

    def upper(self):
        return untaint(unicode.upper(self))


# Alias to the constructor of SafeString,
# so that untaint('abc') gives you a safe string.
untaint = SafeString


def tainted(param):
    """
    Check if a string is tainted.
    If param is a list or tuple, all elements will be checked;
    if it is a dict, all values will be checked.
    """
    if isinstance(param, (tuple, list)):
        for element in param:
            if tainted(element):
                return True
        return False
    elif isinstance(param, dict):
        return tainted(param.values())
    elif isinstance(param, (str, unicode)):
        return not isinstance(param, SafeString)
    else:
        return False


def wrapper(function):
    """Create a new function that checks its parameters first."""
    def check_first(*args, **kwargs):
        """Check all parameters for unsafe strings, then call."""
        if tainted(args) or tainted(kwargs):
            raise TaintError
        return function(*args, **kwargs)
    return check_first


def install_wrappers(module, innocent):
    """
    Replace each function in the given module with a wrapper that
    checks the parameters first, except if the name of the function
    is in the innocent list.
    """
    # items() returns a list in Python 2, so the module's __dict__ can
    # safely be modified inside the loop.
    for name, function in module.__dict__.items():
        if name in innocent:
            continue
        if isinstance(function, (types.FunctionType,
                                 types.BuiltinFunctionType)):
            module.__dict__[name] = wrapper(function)


install_wrappers(os, innocent=['stat'])


if __name__ == '__main__':
    import doctest
    doctest.testmod()
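

As a minimal sketch of the whitelist advice in the module docstring, the
helper below untaints a string only if it matches a regex that describes
legal input. The names LEGAL_FILENAME and untaint_filename are
hypothetical and not part of taint.py, the pattern is just one possible
definition of a legal file name, and the sketch assumes Python 3, where
str stands in for unicode.

```python
import re


class SafeString(str):
    """Stand-in for SafeString from taint.py (str instead of unicode)."""


untaint = SafeString

# Hypothetical whitelist: 1 to 64 characters from a small safe set, with
# no leading '-' or '.', so the result can't look like a command-line
# option or a hidden file. \Z anchors at the very end of the string, so
# a trailing newline is rejected too.
LEGAL_FILENAME = re.compile(r'[A-Za-z0-9_][A-Za-z0-9_.-]{0,63}\Z')


def untaint_filename(text):
    """Return a SafeString if text is a legal file name, else raise."""
    if LEGAL_FILENAME.match(text) is None:
        raise ValueError('illegal file name: %r' % (text,))
    return untaint(text)


print(type(untaint_filename('report-2007.txt')).__name__)  # SafeString
```

Matching legal input keeps the check fail-safe: anything the pattern
does not describe, such as 'x; rm -rf /', never reaches untaint()
because the ValueError is raised first.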



