[issue36521] Consider removing docstrings from co_consts in code objects

Inada Naoki report at bugs.python.org
Wed Sep 29 21:12:57 EDT 2021


Inada Naoki <songofacandy at gmail.com> added the comment:

> I'm not in favor of (c) co_doc either any more (for the reasons you state). I would go for (b), a CO_DOCSTRING flag plus co_consts[0]. I expect that co_consts sharing to be a very minor benefit, but you could easily count this with another small change to the count script.

OK. Let's reject (c).
I expect that CO_DOCSTRING benefit is much more minor than co_consts sharing. I will compare (b) with (d).

> Nested function creation could perhaps become a fraction faster if we didn't copy the docstring into the function object, leaving it func_doc NULL, making func.__doc__ a property that falls back on co_consts[0] if the flag is set.

Copying the docstring is way faster than creating annotations. So I don't think nested function creation time is main issue.

> I expect lazy docstrings to be in the distant future (I experimented quite a bit with different marshal formats to support this and it wasn't easy at all) but I don't want to exclude it.

Since code object is immutable/hashable, removing docstring from code object makes this idea easy.

For example, we can store long docstrings in some db (e.g. sqlite, dbm) in the __pycache__ directory and store its id to func.__doc__. When func.__doc__ is accessed, it can load the docstring from db.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36521>
_______________________________________


More information about the Python-bugs-list mailing list