[IPython-dev] [parallel] issue when executing 'import pandas as pd' on engine

Francesco Montesano franz.bergesund at gmail.com
Mon Sep 16 15:28:49 EDT 2013


These scripts of mine can be a few hundred lines long, and they depend on
two or three other modules that I have put together.

If the problem is where I think it is, I might be able to write a
standalone program (maybe in a notebook). Stay tuned :D
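
A rough idea of what such a standalone check could look like, as a sketch
only: it runs each statement string separately in a fresh namespace and
records which one fails, so the offending statement is easy to spot. It runs
locally with plain `exec` and does not touch IPython.parallel or any engine.
`run_statements` is a hypothetical helper name, the statement list uses
standard-library modules as stand-ins for numpy and pandas, and the last
string deliberately refers to an undefined `plt` to show how such a NameError
would be reported.

```python
def run_statements(statements, namespace=None):
    """Execute each statement string on its own and collect failures.

    Returns the namespace that was populated and a list of
    (statement, exception name, message) tuples for statements that raised.
    """
    if namespace is None:
        namespace = {}
    failures = []
    for stmt in statements:
        try:
            # Each string is executed in the same shared namespace,
            # so names defined by earlier statements stay visible.
            exec(stmt, namespace)
        except Exception as exc:
            failures.append((stmt, type(exc).__name__, str(exc)))
    return namespace, failures


# Stand-in statements: two valid imports and one use of an undefined name.
statements = ["import os", "import json as js", "plt.figure()"]
namespace, failures = run_statements(statements)

for stmt, exc_name, message in failures:
    print("%r failed: %s(%s)" % (stmt, exc_name, message))
```

Swapping the stand-in strings for the real import list would show whether any
single statement fails on its own outside the parallel machinery.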

Fra





2013/9/16 MinRK <benjaminrk at gmail.com>

> Can you please share your whole code?
>
>
> On Mon, Sep 16, 2013 at 9:03 AM, Francesco Montesano <
> franz.bergesund at gmail.com> wrote:
>
>> Dear all,
>>
>> I'm having an issue with IPython parallel (v1.1.0, under Python 2.7).
>>
>> Some time ago I built a number of Python scripts to manipulate
>> catalogues.
>> I can have either thousands of small files or a few, possibly huge, files.
>> So I've written my scripts such that I can choose from the command line
>> whether to run them serially or in parallel.
>>
>> Typically my codes have the following structure:
>>
>> import os
>> import numpy as np
>> import pandas as pd
>>
>> def parse(...):  # argparse
>>     ....
>>
>> def to_do(fname, ...):  # function(s) that do what I need
>>     ....
>>
>> if __name__ == '__main__':
>>     args = parse(...)
>>     if not args.parallel:
>>         for fn in file_name_list:
>>             to_do(fn, ...)
>>     else:  # execute in parallel
>>         parallel_env = Lbv()  # custom class that initialises a load-balanced view
>>         imports = ['import numpy as np', 'import pandas as pd']
>>         # execute the above strings on all engines (direct view)
>>         parallel_env.exec_on_engine(imports)
>>         # execute in parallel
>>         runs = [parallel_env.apply(to_do, os.path.abspath(fn), ...)
>>                 for fn in file_name_list]
>>
>>
>>
>> I built the whole parallel dispatching against IPython 0.13 and
>> everything worked fine.
>> But today I tried to run one of my scripts with parallel enabled and got
>> the following error:
>>
>>
>>   File "code.py", line XXX, in <module>
>>     parallel_env.exec_on_engine(imports)
>>   File "XXX/ipython_parallel.py", line 86, in exec_on_engine
>>     e.raise_exception()
>>   File "XXX.local/lib/python2.7/site-packages/IPython/parallel/error.py", line 199, in raise_exception
>>     raise RemoteError(en, ev, etb, ei)
>> RemoteError: NameError(name 'plt' is not defined)
>>
>>
>> The only thing that uses matplotlib is pandas, and modifying the list to
>>
>>     imports = ['import numpy as np', 'import matplotlib.pyplot as plt',
>>                'import pandas as pd']
>>
>>
>> seems to solve the problem (although at least in one case the first call
>> of my code crashed with the error and the second went through).
>>
>> If I run my code without IPython parallel, I don't have any
>> problem with 'plt'.
>>
>> I guess that this is a bug. But I still haven't understood how to debug
>> what happens on the engines, so I can't give more details.
>> Any clues?
>>
>> If needed, I can upload my 'ipython_parallel.py' module to gist/GitHub.
>>
>> Cheers,
>>
>> Fra
>>
>> _______________________________________________
>> IPython-dev mailing list
>> IPython-dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>
>>