[Python-bugs-list] [ python-Bugs-471893 ] Security review of pickle/marshal docs

noreply@sourceforge.net noreply@sourceforge.net
Fri, 09 Nov 2001 01:21:59 -0800


Bugs item #471893, was opened at 2001-10-16 15:42
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=471893&group_id=5470

Category: Documentation
Group: Feature Request
Status: Open
Resolution: None
Priority: 4
Submitted By: Tim Peters (tim_one)
Assigned to: Jeremy Hylton (jhylton)
Summary: Security review of pickle/marshal docs

Initial Comment:
Paul Rubin points out that the security implications 
of using marshal and/or pickle aren't clear from the 
docs.  Assigning to Jeremy as he's more sensitive to 
such issues than I am; maybe Barry would like to get 
paranoid too <wink>.

A specific example:  the pickle docs say that pickle 
doesn't support code objects, and "at least this 
avoids the possibility of smuggling Trojan horses into 
a program".  However,

1) The marshal docs don't mention this vulnerability 
at all.

while

2) The pickle docs don't spell out possible dangers 
due to things pickle does that marshal doesn't (like 
importing modules, and running class constructors).

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-11-09 01:21

Message:
Logged In: NO 

Well, Guido and Tim agree with you that it's not a pickle
bug.  I still feel it is one, because its docs currently
make you think you can securely load untrusted pickles, and
because it's a natural, non-obscure thing to want to do
(see pyro and the cookie module), but whatever.  If it's
not a code bug then I feel it's a significant functionality
shortcoming in the python library.

Pyro uses pickle to serialize data for RPC calls over the
internet.  A malicious client could make a hostile pickle
take over the server.  The cookie module lets web
applications store user session state in browser cookies.
Its SerialCookie and SmartCookie classes let you put
arbitrary Python objects into the user session, and
serializes them when pickle.  Again, a malicious client
can make a hostile pickle, send it in a cookie header to
the http server, and take over the server when the 
application unpickles the cookie.

The current documentation for the pickle module makes it
very clear to me that the doc writer thought it was safe
to unpickle untrusted cookies.  If pickle wasn't designed
for that, then there was a communication failure between
the designer and the doc writer.

Yes, I'm willing to help with a PEP for fixing this
situation.

Paul


----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2001-11-08 09:37

Message:
Logged In: YES 
user_id=31392

I don't think of the issue you describe as a bug in the
code.  You're suggesting a new feature for pickle.  As far
as I can tell, the original design requirements for pickle
did not include the ability to securely load a pickle from
an untrusted source.

It may be a legitimate feature request, but it's too late to
make it into Python 2.2.  I suggest we look at the design
issues for Python 2.3 and decide if it's a feature we want
to support.  I imagine a PEP may be necessary to lay out the
issues and the solution.  Do you want to have a hand in that
PEP?

I still don't understand what it means that Pyro and cookie
were bit by a bug.  It sounds like they were using pickle in
ways that pickle was not intended to support.  A careful
analysis of how those two applications use pickle would be
helpful for generating requirements.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-11-07 15:54

Message:
Logged In: NO 

IMO it's a code bug that you can't unpickle strings from
untrusted sources.  Pyro and the cookie module are examples
of programs that got bitten by this bug.  Whether it's
really a bug is a matter of opinion--I had a big email
exchange with Guido and Tim about it, and they felt it
was enough to fix the pickle documentation.

Pickle has the same problem as cPickle, but with pickle
you can subclass the pickler and override the method that
unpickles class objects, and work around the (IMO) bug.
The workaround doesn't help cPickle since cPickle can't
be subclassed.  See bug #467384 for some related discussion.

Paul


----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2001-11-07 14:02

Message:
Logged In: YES 
user_id=31392

What's the code bug?  Your last message has a lot of gloom
and doom <wink>, but I'm not sure what specific problem
you'd like to see fixed.  Are you saying that something
needs to be fixed in cPickle and not in pickle?

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-11-07 12:08

Message:
Logged In: NO 

Irmen de Jong points out that the standard cookie module
uses pickling for serial and smart cookies.  The 2.1.1
cookie module docs explicitly say not to use those
classes because of the security hole--that they're provided
for backward compatibility only (but with what?!  Any
server that uses those classes on open ports needs to be
changed right away).

Irmen's library, http://pyro.sourceforge.net, also uses
unpickle insecurely (he's aware of this now and is figuring
out a fix).

IMO this is really a code bug rather than a documentation
bug, and should be fixed in the code rather than the docs.
Documenting the bug rather than fixing it leaves a
deficiency in the Python library: obvious uses of pickling,
like Pyro and the cookie module, can't be implemented
using cPickle and have to resort to a slower Python
deserializer, or use marshal and have possible compatibility
problems between versions (and not be able to serialize
class instances).

Paul



----------------------------------------------------------------------

Comment By: paul rubin (phr)
Date: 2001-10-16 16:06

Message:
Logged In: YES 
user_id=72053

Certainly anyone unserializing potentially malicious data
with pickle, marshal, or anything else, should check the
results before doing anything dangerous with them (like
executing code).  However, unpickle can potentially do
damage before it even returns, by creating loading modules
and creating initialized class instances.  So pickle.loads
should never be used on untrusted strings, except possibly
in a rexec wrapper (as proposed by Tim).  Another
possibility (also by Tim) is to override the load_inst
method of the Pickler class, though I don't think you can
do that for cPickle.

A sample exploit for unpickle can be found at
<http://www.nightsong.com/phr/python/pickletest.py>.
Unpickling the test string runs penguin.__init__ contrary
to the doc's saying no initialization unless there's a
__getinitargs__ method in the class def.

The "exploding penguin" class is artificial, but
applications are vulnerable if there's an unsafe
constructor anywhere in any class of the application or in
the python library (example: the NNTP constructor opens an
IP connection to an arbitrary address, so a malicious
imported string can send a message through your firewall
when you import it).


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=471893&group_id=5470