[Python-Dev] Internal namespace proposal

Thu Jul 27 17:18:01 CEST 2006

Armin Rigo wrote:
> Hi David,
> 
> Your proposal is too vague to be useful.  In Python I would not feel
> that any compiler-enforced restrictions are going to be too restrictive,
> and so I believe that your approach is not viable, but I cannot give you
> many concrete examples of why before you come up with a more concrete
> specification.

The intention was not to require the restrictions to be compiler-enforced;
only to *allow* them to be compiler-enforced.

Code like this, for example:

  def someMethod(self, x):
      if self == x:
          foo(x._internal)

should not have to work.

> More importantly, this is going to depend critically on a restricted
> interpreter in the first place, in the sense of Brett, but the safety of
> your proposal depends on the restricted interpreter forbidding many
> operations that it would not otherwise forbid for its original goal.
> For example, in Brett's use case there is no need to prevent reading the
> 'func_globals' attribute of function objects, but if that's allowed,
> then accessing any _attribute of any module is easy.

I disagree that there is no need to prevent reading func_globals.
func_globals is clearly incompatible with capability security (as are
func_dict and func_closure; also the other function attributes should be
read-only). Functions in a capability language should be opaque.

I don't see that there is any problem with the proposal depending on a
restricted interpreter to prevent access via loopholes such as func_globals,
since that is the main intended context of its use. Remember that Brett's
document stated that protection could only be obtained at interpreter
granularity, rather than object granularity, primarily because objects
have no way to prevent access to their private state.

My intention in describing the basic idea of enforcing the PEP 8 convention
for internal attributes/methods, was to get precisely this kind of feedback
on potential problems. I have already obtained useful feedback (including
yours), and will prepare a more concrete proposal based on it.

> About special methods: how do built-in functions like str(), int(), and
> so on, know in which context they are called? Surely you don't propose
> that '2+3' should be invalid because it accesses the hidden attribute '2
> .__add__' ?

This and other examples have convinced me that names starting and ending with
double underscores should not automatically be considered internal. There
are a few such names that should be internal (e.g. __dict__), but it is
reasonable to treat those as special cases.

> How would you formulate a rule in term on Python's attribute look-up
> algorithm to prevent the following trivial attack? :
> 
> x.py:
>     
>     # supposedly secure?
> 
>     _hidden = [1,2,3]
> 
>     class A:
>         def __init__(self):
>             self._authorized = ...
>         def read(self):
>             if not self._authorized:
>                 raise Forbidden
>             return _hidden
> 
> attack.py:
> 
>     import x
>     class B(x.A):
>         def __init__(self):
>             self._authorized = True
> 
>     b = B()
>     print b.read()   # => [1,2,3]

Inheritance should be defined as though the code of inherited methods and
attributes were copied into the subclass (with global accesses updated to
point to the original module). IOW, B acts as though it is defined like this:

attack.py:

  class B(x.A):
      def __init__(self):
          self._authorized = True
      def read(self):
          if not self._authorized:
              raise Forbidden
          return x._hidden

Since x._hidden is not accessible from attack.py, the attack fails.

> On any real-life example I'm sure that hacks like overriding selected
> methods on the instance itself would allow an attacker to confuse the
> remaining methods enough to leak hidden information.

Yes, Java was subject to many attacks of this type. However, a code-copying
semantics for inheritance prevents all of them, by ensuring that a class
cannot do anything by inheritance that it could not do without it.

> Here is a metaclass attack against the rule "self._attr is only allowed
> if syntactically inside the class definition of the exact class of
> self":
> 
>     class SupposedlySecure(object):
>         _hidden = [1,2,3]
> 
>     class MetaAttack(type):
>         def read(self):
>             return self._hidden   # seen as an instance attribute
> 
>     class Attack(SupposedlySecure):
>         __metaclass__ = MetaAttack
> 
>     print Attack.read()

Metaclasses are a reflective feature; almost all such features would have
to be limited in restricted interpreters.

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>