From aatsnps at gmail.com  Tue May 17 20:20:42 2022
From: aatsnps at gmail.com (AJ D)
Date: Tue, 17 May 2022 17:20:42 -0700
Subject: [C++-sig] Multiple Stateless Embedded Python Interpreters with BOOST PYTHON

Hello Python Experts,

I have a question about embedding a Python interpreter in my application.

I have an application where:

- I need to embed a Python interpreter.
- I run different blocks of Python code on this embedded interpreter.
- Each block should not see any of the changes made by the previous block of
  Python code, except for instances of one particular class: if such instances
  are created in the first block, they should be visible from the second and
  subsequent blocks.
- In other words, for the most part I need a fresh Python interpreter for each
  block of Python code, with the exception that certain class instances must be
  shared across all blocks.

I am using Boost.Python at the moment, since we didn't want to deal with
lower-level details like reference counting in the Python C API.

I wanted some advice on the best (production-quality) way to have multiple
Python interpreters selectively preserve some state and forget everything
else. I did some research and came up with the following four approaches, and
would like to hear opinions in this forum for/against them:

1. Invoke Py_Initialize() / Py_Finalize() multiple times
   a. The Boost.Python documentation states that Py_Finalize() should not be
      called:
      https://www.boost.org/doc/libs/1_62_0/libs/python/doc/html/tutorial/tutorial/embedding.html
   b. So it looks like that flow isn't officially supported.

2. Use multiple Python sub-interpreters
   a. The Boost.Python documentation doesn't seem to mention sub-interpreters
      at all, which could mean they are not supported.
   b.
I tried it anyway and keep getting the error message below, which probably
      indicates that Boost.Python is trying to re-register the to-Python
      converters the second and subsequent times. Once again, this doesn't look
      like an officially supported flow:

      frozen importlib:219: RuntimeWarning: to-Python converter for MyClass
      already registered; second conversion method ignored.

3. Create the initial Python interpreter in the main process, and for each
   Python block fork off a new child process and execute the block there.
   a. This works fine as long as the Python blocks don't write anything back
      to the database.
   b. When a Python block does write back to the database, the change has to
      be propagated to the main/parent process.
      i. In a multi-threaded environment, it also has to be propagated to all
         active child processes, since eventual consistency isn't acceptable
         in my application domain.
   c. Sharing of instances across processes can be done with shared-memory
      IPC, using pickle/dill for serialization.

4. Manually manage interpreter state
   a. Create just one interpreter.
   b. Save its initial state in some global, e.g.:
      initial_state = dir()
   c. Use a lock to allow only one block to run at a time (acceptable for my
      application domain).
   d. After a block of Python code runs, and before running the next block,
      clean up the interpreter:
      i. Walk over globals() and pop off every symbol that is not present in
         the previously saved initial_state. That way any additions made by
         the previous block are no longer reachable and should be garbage
         collected.
   e. To share state across blocks, we can choose not to pop certain symbols,
      based on a condition such as being an instance of a pre-defined class.

Any comments / pros / cons on these approaches? Is there a fifth approach I
haven't thought of? Any pointers are appreciated.
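To make the IPC piece of approach 3 concrete, here is a rough pure-Python
sketch (not my actual C++/Boost.Python code) of serializing a shared instance
with pickle and handing it over through a block of shared memory. SharedThing
is a made-up stand-in for the class whose instances must survive across
blocks:

```python
# Sketch of approach 3's sharing mechanism: pickle a shared instance and
# pass the bytes through POSIX shared memory. SharedThing is hypothetical.
import pickle
from multiprocessing import shared_memory

class SharedThing:
    """Stand-in for the class shared across all Python blocks."""
    def __init__(self, value):
        self.value = value

obj = SharedThing(42)
payload = pickle.dumps(obj)

# Parent writes the pickled bytes into a named shared-memory segment.
shm = shared_memory.SharedMemory(create=True, size=len(payload))
shm.buf[:len(payload)] = payload

# ... a child process would attach by name and deserialize ...
child_view = shared_memory.SharedMemory(name=shm.name)
copy = pickle.loads(bytes(child_view.buf[:len(payload)]))
print(copy.value)  # → 42

child_view.close()
shm.close()
shm.unlink()
```

In the real application the attach-by-name step would happen in the forked
child, and propagating database writes back would need the reverse path plus
some invalidation protocol for the other children.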
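And to make approach 4 concrete, here is a rough pure-Python sketch of the
cleanup I have in mind: snapshot the initial names, run a block, then prune
everything the block added except instances of the shared class. Again,
SharedThing and the helper names are just illustrative stand-ins:

```python
# Sketch of approach 4: one interpreter, prune per-block globals, but keep
# instances of a designated shared class. All names here are hypothetical.

class SharedThing:
    """Stand-in for the class whose instances must survive across blocks."""
    def __init__(self, value):
        self.value = value

def make_sandbox():
    # The environment every block starts from; record its initial names.
    env = {"SharedThing": SharedThing}
    return env, set(env)

def run_block(code, env, initial_names):
    exec(code, env)
    # Remove everything the block added, except SharedThing instances.
    for name in list(env):
        if (name not in initial_names and name != "__builtins__"
                and not isinstance(env[name], SharedThing)):
            del env[name]

env, initial = make_sandbox()
run_block("shared = SharedThing(42)\ntemp = 'scratch'", env, initial)
run_block("result = shared.value", env, initial)  # 'shared' survived
print("shared" in env, "temp" in env)  # → True False
```

This runs the blocks in a dedicated dict rather than the real module
globals(), which seems safer than mutating globals() in place, but I don't
know whether the same trick composes cleanly with Boost.Python's exec/eval
wrappers.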
Thanks,
AD