[Python-ideas] Updated PEP 432: Simplifying the CPython update sequence
Barry Warsaw
barry at python.org
Sat Jan 5 22:42:20 CET 2013
Hi Nick,
PEP 432 is looking very nice. It'll be fun to watch the implementation come
together. :)
Some comments...
The startup sequence:
> * Pre-Initialization - no interpreter available
> * Initialization - interpreter partially available
What about "Initializing"?
> * Initialized - full interpreter available, __main__ related metadata
> incomplete
> * Main Execution - optional state, __main__ related metadata populated,
> bytecode executing in the __main__ module namespace
What is "optional" about this state? Maybe it should be called "Operational"?
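As a rough illustration of the four phases quoted above, the sequence can be modelled as an ordered enumeration. This is only a sketch: `StartupPhase`, its members, and `interpreter_fully_available` are names made up for this example and are not part of PEP 432 or the CPython API.

```python
from enum import IntEnum


class StartupPhase(IntEnum):
    """Hypothetical model of the phases listed in the PEP (not a real API)."""
    PRE_INITIALIZATION = 0  # no interpreter available
    INITIALIZING = 1        # interpreter partially available
    INITIALIZED = 2         # full interpreter, __main__ metadata incomplete
    MAIN_EXECUTION = 3      # bytecode executing in the __main__ namespace


def interpreter_fully_available(phase: StartupPhase) -> bool:
    # The full interpreter exists from INITIALIZED onwards.
    return phase >= StartupPhase.INITIALIZED
```

Because the members are ordered, "how far along are we?" becomes a simple comparison rather than a set of separate boolean flags.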
> ... separate system Python (spython) executable ...
I love the idea, but I'm not crazy about the name. What about
`python-minimal`? (Yes, it's deliberately longer. Symlinks ftw. :)
> <TBD: Did I miss anything?>
What about sys.implementation?
> as it failed to be updated for the virtual environment support added in
> Python 3.3 (detailed in PEP 420).
venv is defined in PEP 405 (there are two cases of mis-referencing).
Note that there may be other important build time settings on some platforms.
An example is Debian/Ubuntu, where we define the multiarch triplet in the
configure script, and pass that through Makefile(.pre.in) to sysmodule.c for
exposure as sys.implementation._multiarch.
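For readers who want to check whether their own build exposes such a value, here is a minimal sketch. The helper name `multiarch_triplet` is made up for this example, and the fallback to the `MULTIARCH` config variable is an assumption about how a given build records the setting; both lookups may simply return nothing.

```python
import sys
import sysconfig


def multiarch_triplet():
    """Return the multiarch triplet if this build exposes one, else None."""
    # _multiarch is a private, build-dependent attribute, so look it up defensively.
    triplet = getattr(sys.implementation, "_multiarch", None)
    if triplet is None:
        # Assumption: some builds record the same value as a config variable.
        triplet = sysconfig.get_config_var("MULTIARCH")
    return triplet or None


print(multiarch_triplet())
```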
> For a command executed with -c, it will be the string "-c"
> For explicitly requested input from stdin, it will be the string "-"
Wow, I couldn't believe it but it's true! That seems crazy useless. :)
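The behaviour quoted above is easy to confirm from a child interpreter. In this sketch, `argv0_for` and `PROBE` are names invented for the example; it only relies on `subprocess.run` and `sys.executable` from the standard library.

```python
import subprocess
import sys


def argv0_for(args, stdin_text=None):
    """Run a child interpreter and return the sys.argv[0] it reports."""
    result = subprocess.run(
        [sys.executable, *args],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


PROBE = "import sys; print(sys.argv[0])"

print(repr(argv0_for(["-c", PROBE])))            # code passed with -c
print(repr(argv0_for(["-"], stdin_text=PROBE)))  # code read from stdin via "-"
```

The two calls print `'-c'` and `'-'` respectively, so neither value tells the running code anything about what is actually being executed.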
> Embedding applications must call Py_SetArgv themselves. The CPython logic
> for doing so is part of Py_Main() and is not exposed separately. However,
> the runpy module does provide roughly equivalent logic in runpy.run_module
> and runpy.run_path.
As I've mentioned before on the python-porting mailing list, this is actually
more difficult than it seems because main() takes char*s but Py_SetArgv() and
Py_SetProgramName() take wchar_t*s.
Maybe Python's own conversion could be refactored to make this easier either
as part of this PEP or after the PEP is implemented.
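To show why the conversion needs care, here is a Python-level analogue of the round trip: byte strings as main() receives them are decoded to text and must survive re-encoding unchanged. This is a sketch, not the C conversion itself. It assumes UTF-8 as a stand-in for the locale encoding, and `decode_arg` and `encode_arg` are hypothetical helpers.

```python
def decode_arg(raw: bytes, encoding: str = "utf-8") -> str:
    # Undecodable bytes become lone surrogates instead of raising an error.
    return raw.decode(encoding, "surrogateescape")


def encode_arg(text: str, encoding: str = "utf-8") -> bytes:
    # The same handler turns those surrogates back into the original bytes.
    return text.encode(encoding, "surrogateescape")


raw_argv = [b"python", b"caf\xe9.py"]  # second item is not valid UTF-8
wide_argv = [decode_arg(arg) for arg in raw_argv]

# The round trip must be lossless, or the original file name is gone.
assert [encode_arg(arg) for arg in wide_argv] == raw_argv
```

An embedding application that does a naive strict decode would fail on the second argument, which is the sort of detail that makes a shared, exposed conversion helper attractive.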
> int Py_ReadConfiguration(PyConfig *config);
> The config argument should be a pointer to a Python dictionary. For any
> supported configuration setting already in the dictionary, CPython will
> sanity check the supplied value, but otherwise accept it as correct.
So why not define this to take a PyObject* or a PyDictObject* ?
(also: the Py_Config struct members need the correct concrete type pointers,
e.g. PyDictObject*)
> Alternatively, settings may be overridden after the Py_ReadConfiguration
> call (this can be useful if an embedding application wants to adjust a
> setting rather than replace it completely, such as removing sys.path[0]).
How will setting something after Py_ReadConfiguration() is called change a
value such as sys.path? Or is this the reason why you pass a Py_Config to
Py_EndInitialization()?
(also, see the type typo <wink> in the definition of Py_EndInitialization())
Also, I suggest taking the opportunity to change the sense of flags such as
no_site and dont_write_bytecode. I find it much more difficult to reason that
"dont_write_bytecode = 0" means *do* write bytecode, rather than
"write_bytecode = 1". I.e. positives are better than double-negatives.
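A small sketch of what the positive sense might read like from Python. `PositiveFlags` and `current_positive_flags` are hypothetical names; only `sys.dont_write_bytecode` and `sys.flags.no_site` are existing attributes, and the sketch merely inverts them.

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class PositiveFlags:
    """Hypothetical view of two existing flags with the sense inverted."""
    write_bytecode: bool
    import_site: bool


def current_positive_flags() -> PositiveFlags:
    return PositiveFlags(
        # sys.dont_write_bytecode and sys.flags.no_site are the negative forms.
        write_bytecode=not sys.dont_write_bytecode,
        import_site=not sys.flags.no_site,
    )


flags = current_positive_flags()
if flags.write_bytecode:
    print("bytecode files will be written")
```

The condition reads as a plain statement of what will happen, with no negation to unwind.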
> sys.argv[0] may not yet have its final value
> it will be -m when executing a module or package with CPython
Gosh, wouldn't it be nice if this could have a more useful value?
> Initial thought is that hiding the various options behind a single API would
> make that API too complicated, so 3 separate APIs is more likely:
+1
> The interpreter state will be updated to include details of the
> configuration settings supplied during initialization by extending the
> interpreter state object with an embedded copy of the Py_CoreConfig and
> Py_Config structs.
Couldn't it just have a dict with all the values from both structs collapsed
into it?
> For debugging purposes, the configuration settings will be exposed as a
> sys._configuration simple namespace
I suggest un-underscoring the name and making it public. It might be useful
for purposes other than debugging.
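The two suggestions above (one dict holding all the settings, exposed publicly) could look roughly like this from the Python side. It is a sketch only: the setting names and values are placeholders for illustration, not the PEP's final field list, and `configuration` is a hypothetical name.

```python
from types import SimpleNamespace

# Illustrative values only; the field names are placeholders.
core_settings = {"ignore_environment": False, "use_hash_seed": False}
main_settings = {"dont_write_bytecode": False, "no_site": False}

# Collapse both groups into one mapping, then expose it with attribute access.
merged = {**core_settings, **main_settings}
configuration = SimpleNamespace(**merged)

print(configuration.no_site)
print(sorted(vars(configuration)))
```

A single mapping avoids having to remember which of the two structs a given setting lives in.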
> Is Py_IsRunningMain() worth keeping?
Perhaps. Does it provide any additional information beyond Py_IsInitialized()?
> Should the answers to Py_IsInitialized() and Py_RunningMain() be exposed via
> the sys module?
I can't think of a use case.
> Is the Py_Config struct too unwieldy to be practical? Would a Python
> dictionary be a better choice?
Although I see why you've spec'd it this way, I don't like having *two* config
structures (Py_CoreConfig and Py_Config). Having a dictionary for the latter
would probably be fine, and in fact you could copy the Py_Config values into
it (when possible during the init sequence) and expose it in the sys module.
> Would it be better to manage the flag variables in Py_Config as Python
> integers so the struct can be initialized with a simple memset(&config, 0,
> sizeof(*config))?
Would we even notice the optimization?
> A System Python Executable
This should probably at least mention Christian's idea of the -I flag (which I
think hasn't been PEP'd yet). We can bikeshed about the name of the
executable later. :)
Cheers,
-Barry