Multiprocessing.connection magic

Chris Torek nospam at torek.net
Fri Jun 3 03:03:51 EDT 2011


In article <mailman.2417.1307082948.9059.python-list at python.org>
Claudiu Popa  <cpopa at bitdefender.com> wrote:
>Hello guys,
>      While working on a dispatcher using the
>  multiprocessing.connection.Listener module I stumbled upon some
>  sort of magic trick that amazed me. How is this possible, and
>  what is the multiprocessing library doing in the background for
>  this to work?

Most of Python's sharing routines (including multiprocessing
"send", in this case) use the pickle routines to package data
for transport between processes.

Thus, you can "see the magic" pretty simply:

>  Client, Python 2.6
>
>  >>> from multiprocessing.connection import Client
>  >>> client = Client(("localhost", 8080))
>  >>> import shutil
>  >>> client.send(shutil.copy)

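Here I just use pickle.dumps() to return (and print, since we are
in the interpreter) a string representation of what client.send()
will send (the connection may use a different pickle protocol on
the wire, but it encodes the same module-and-name reference):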

   >>> import pickle
   >>> import shutil
   >>> pickle.dumps(shutil.copy)
   'cshutil\ncopy\np0\n.'
   >>>
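For readers on Python 3, a rough equivalent of the session above is
sketched here.  It assumes protocol 0 (the text protocol that produced
the string shown above) is requested explicitly, since newer Pythons
default to a binary protocol; the helper name describe_pickle is made
up for illustration.

```python
import pickle
import pickletools
import shutil


def describe_pickle(obj):
    """Return the protocol-0 pickle of obj (hypothetical helper)."""
    # protocol=0 is requested explicitly: it is the human-readable
    # text format; the default protocol in Python 3 is a binary one.
    return pickle.dumps(obj, protocol=0)


data = describe_pickle(shutil.copy)
print(data)            # a short byte string naming the module and function
pickletools.dis(data)  # disassemble the pickle, one opcode per line

# The payload is only a reference by name, not the function's code.
assert b"shutil" in data and b"copy" in data
```

The disassembly makes it easy to see that nothing but a module name
and an attribute name travels between the processes.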

>  Server, 3.2
>  >>> from multiprocessing.connection import Listener
>  >>> listener = Listener(("localhost", 8080))
>  >>> con = listener.accept()
>  >>> data = con.recv()
>  >>> data
>  <function copy at 0x024611E0>
>  >>> help(data)
>  Help on function copy in module shutil:
[snip]

On this end, the (different) version of Python simply unpickles the
byte stream.  Starting a new Python session (to get rid of any
previous imports):

    $ python
    ...
    >>> import pickle
    >>> pickle.loads('cshutil\ncopy\np0\n.')
    <function copy at 0x86ef0>
    >>> help(_)
    Help on function copy in module shutil:
    ...

Notice that the pickle holds only the module name and the function
name, not the function's code.  The real magic is in the unpickler,
which has figured out how to access shutil.copy without importing
shutil into the global namespace:

    >>> shutil
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'shutil' is not defined
    >>>
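A minimal sketch of the same observation in Python 3 form: the
unpickler imports the module on its own and hands back the function,
but binds nothing in the caller's namespace.  The byte string is the
protocol-0 pickle shown earlier, written as a bytes literal; load_copy
is a name invented for this example.

```python
import pickle
import sys


def load_copy():
    """Unpickle a by-name reference to shutil.copy (illustrative helper)."""
    # Same payload as in the session above, as a Python 3 bytes literal.
    return pickle.loads(b"cshutil\ncopy\np0\n.")


func = load_copy()
print(func.__module__, func.__name__)

# The module was imported behind the scenes and is cached in sys.modules...
print("shutil" in sys.modules)
# ...while this script itself never executed "import shutil".
print("shutil" in globals())
```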

but we can expose that magic as well, by feeding pickle.loads()
a "bad" string:

    >>> pickle.loads('cNotAModule\nfunc\np0\n.')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 1374, in loads
        return Unpickler(file).load()
      File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 858, in load
        dispatch[key](self)
      File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 1090, in load_global
        klass = self.find_class(module, name)
      File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 1124, in find_class
        __import__(module)
    ImportError: No module named NotAModule
    >>>
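The traceback suggests what find_class does: import the named module,
then look the name up in it.  The following is only a rough sketch of
that idea in modern Python, not the library's actual code; find_global
is a hypothetical name, and importlib is used here in place of the
bare __import__ call seen in the traceback.

```python
import importlib


def find_global(module_name, attr_name):
    """Resolve a (module, name) pair roughly the way an unpickler would."""
    # Import the module by name; this raises ImportError if it is missing.
    module = importlib.import_module(module_name)
    # Fetch the attribute; this raises AttributeError if the name is absent.
    return getattr(module, attr_name)


copy = find_global("shutil", "copy")
print(copy)

try:
    find_global("NotAModule", "func")
except ImportError as exc:
    print("import failed:", exc)
```

Seen this way, whoever writes the pickle chooses which module the
receiver imports and which callable it looks up.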

Note the rather total lack of security here -- in the receiver, by
doing con.recv(), you are trusting the sender not to send you a
"dangerous" or invalid pickle-data-stream.  This is why the documentation
includes the following:

    Warning: The Connection.recv() method automatically unpickles
    the data it receives, which can be a security risk unless you
    can trust the process which sent the message.

    Therefore, unless the connection object was produced using Pipe()
    you should only use the recv() and send() methods after performing
    some sort of authentication. See Authentication keys.

(i.e., do that :-) -- see the associated section on authentication)
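As a sketch of what that authentication looks like in Python 3, both
ends can pass the same authkey, and the connection is only handed back
once the handshake succeeds.  The key value, the port choice (0 lets
the OS pick a free port), and the helper names are assumptions for
illustration.  Note that this only authenticates the peer; recv()
still unpickles whatever an authenticated peer sends.

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"example-shared-secret"  # placeholder; never hard-code a real key


def serve_once(listener):
    """Accept one authenticated connection and echo one message back."""
    # accept() performs the authkey handshake and raises
    # multiprocessing.AuthenticationError if the peer's key is wrong.
    with listener.accept() as conn:
        conn.send(("echo", conn.recv()))


def round_trip(message):
    """Start a listener on a free local port, send one message, return the reply."""
    with Listener(("localhost", 0), authkey=AUTHKEY) as listener:
        server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
        server.start()
        # listener.address holds the actual (host, port) chosen by the OS.
        with Client(listener.address, authkey=AUTHKEY) as conn:
            conn.send(message)
            reply = conn.recv()
        server.join(timeout=5)
    return reply


if __name__ == "__main__":
    print(round_trip("hello"))
```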
-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html


