[Python-Dev] Rough idea for adding introspection information for builtins

Nick Coghlan ncoghlan at gmail.com
Tue Mar 19 15:34:30 CET 2013


On Tue, Mar 19, 2013 at 3:00 AM, Larry Hastings <larry at hastings.org> wrote:
> Why not require it to be there already? Maybe more like
>
>     def foo(arg, b=3, *, kwonly='a'):
>          ...
>
> (i.e. using Ellipsis instead of pass, so that it's clear that it's not an
> empty function but one the implementation of which is hidden)
>
> I like this notion. The groups notation and '/' will still cause the
> parser to choke and require special handling, but OTOH, they have
> deliberately been chosen as potentially acceptable notations for
> providing the same features in actual Python function declarations.
>
>
> I don't see the benefit of including the "def foo" and ":\n    ...".  The
> name doesn't help; inspect.Signature pointedly does not contain the name of
> the function, so it's irrelevant to this purpose.  And why have unnecessary
> boilerplate?

Also, we can already easily produce the extended form through:

    "def {}{}:\n    ...".format(f.__name__, inspect.signature(f))

So, agreed, capturing just the signature info is fine.
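
For illustration, a minimal sketch of that round trip, using the example function quoted above (nothing here is specific to Argument Clinic; it only relies on inspect.signature and str.format):

```python
import inspect

def foo(arg, b=3, *, kwonly='a'):
    ...

# The signature object carries no name, so the name is supplied
# separately from f.__name__ when rebuilding the "def" form.
stub = "def {}{}:\n    ...".format(foo.__name__, inspect.signature(foo))
print(stub)
# def foo(arg, b=3, *, kwonly='a'):
#     ...
```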

> Let me restate what we're talking about.  We're debating what types of data
> should be permissible to use for a datum that so far is not only unused, but
> is required to be unused.  PEP 8 states "The Python standard library will
> not use function annotations".  I don't know who among us has any experience
> using function annotations--or, at least, for their intended purpose.  It's
> hard to debate what are reasonable vs unreasonable restrictions on data we
> might be permitted to specify in the future for uses we don't know about.
> Restricting it to Python's rich set of safe literal values seems entirely
> reasonable; if we get there and need to relax the restriction, we can do so
> there.
>
> Also, you and I discussed this evening whether there was a credible attack
> vector here.  I figured, if you're running an untrustworthy extension, it's
> already game over.  You suggested that a miscreant could easily edit static
> data on a trusted shared library without having to recompile it to achieve
> their naughtiness.  I'm not sure I necessarily buy it, I just wanted to
> point out you were the one making the case for restricting it to
> ast.literal_eval.  ;-)

IIRC, I was arguing against allowing *pickle* because you can't audit
that just by looking at the generated source code. OTOH, I'm a big fan
of locking this kind of thing down by default and letting people make
the case for additional permissiveness, so I agree it's best to start
with literals only.

Here's a thought, though: instead of doing an Argument Clinic specific
hack, let's instead design a proper whitelist API for ast.literal_eval
that lets you accept additional constructs.

As a general sketch, the long if/elif chain in ast.literal_eval could
be replaced by:

    for converter in converters:
        ok, converted = converter(node)
        if ok:
            return converted
    raise ValueError('malformed node or string: ' + repr(node))

The _convert function would need to be lifted out and made public as
"ast.convert_node", so conversion functions could recurse
appropriately.

Both ast.literal_eval and ast.convert_node would accept a keyword-only
"allow" parameter: an iterable of callables, each returning an
(ok, converted) 2-tuple, that whitelists additional expressions beyond
those normally allowed. So, assuming we don't add it by default, you
could allow empty sets by doing:

    _empty_set = ast.dump(ast.parse("set()").body[0].value)
    def convert_empty_set(node):
        if ast.dump(node) == _empty_set:
            return True, set()
        return False, None

    ast.literal_eval(some_str, allow=[convert_empty_set])
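
Since neither the "allow" parameter nor ast.convert_node exists in the ast module today, the idea can only be tried out with stand-ins. The following is a rough sketch under that assumption: convert_node and literal_eval_allow are hypothetical names defined here, not ast APIs, and the fallback simply delegates to the existing ast.literal_eval. It whitelists frozenset() rather than set(), because ast.literal_eval has accepted set() on its own since Python 3.9.

```python
import ast

def convert_node(node, allow=()):
    """Hypothetical stand-in for the proposed ast.convert_node."""
    # Give each extra converter a chance to claim the node first.
    for converter in allow:
        ok, converted = converter(node)
        if ok:
            return converted
    # Recurse through containers by hand so that the extra converters
    # are also consulted for nested elements.
    if isinstance(node, ast.Tuple):
        return tuple(convert_node(elt, allow) for elt in node.elts)
    if isinstance(node, ast.List):
        return [convert_node(elt, allow) for elt in node.elts]
    if isinstance(node, ast.Set):
        return {convert_node(elt, allow) for elt in node.elts}
    if isinstance(node, ast.Dict):
        if None in node.keys:  # "{**other}" has no key node to convert
            raise ValueError('malformed node or string: ' + repr(node))
        return {convert_node(k, allow): convert_node(v, allow)
                for k, v in zip(node.keys, node.values)}
    # Everything else is left to the stock literal rules, which raise
    # ValueError for anything that is not a plain literal.
    return ast.literal_eval(node)

def literal_eval_allow(source, allow=()):
    """Hypothetical equivalent of ast.literal_eval(source, allow=...)."""
    tree = ast.parse(source.strip(), mode="eval")
    return convert_node(tree.body, tuple(allow))

_empty_frozenset = ast.dump(ast.parse("frozenset()", mode="eval").body)

def convert_empty_frozenset(node):
    # Same shape as convert_empty_set above: an (ok, converted) 2-tuple.
    if ast.dump(node) == _empty_frozenset:
        return True, frozenset()
    return False, None

result = literal_eval_allow("[1, frozenset(), (2, -3)]",
                            allow=[convert_empty_frozenset])
```

Without the converter, literal_eval_allow("frozenset()") falls through to ast.literal_eval and raises ValueError, so the default stays locked down and each extra construct has to be opted into explicitly.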

This is quite powerful as a general tool to allow constrained
execution, since it could be used to whitelist builtins that accept
parameters, as well as to process class and function header lines
without executing their bodies. In the case of Argument Clinic, that
would mean writing a converter for the FunctionDef node.
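
To make the FunctionDef case a little more concrete, here is a rough sketch of reading a function header without executing anything. It is only an approximation of the idea: signature_from_header is a hypothetical helper, it works on a whole "def" statement rather than as an "allow" converter, it restricts defaults to what ast.literal_eval accepts, and it ignores annotations as well as the '/' and group notations discussed earlier in the thread.

```python
import ast
import inspect

def signature_from_header(source):
    """Build an inspect.Signature from a 'def' statement without running it."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a single function definition")
    args = func.args
    P = inspect.Parameter
    params = []

    positional = args.posonlyargs + args.args
    # Defaults line up with the last len(defaults) positional parameters.
    first_default = len(positional) - len(args.defaults)
    for i, arg in enumerate(positional):
        if i < len(args.posonlyargs):
            kind = P.POSITIONAL_ONLY
        else:
            kind = P.POSITIONAL_OR_KEYWORD
        default = P.empty
        if i >= first_default:
            # Defaults are evaluated as literals only, never executed.
            default = ast.literal_eval(args.defaults[i - first_default])
        params.append(P(arg.arg, kind, default=default))

    if args.vararg:
        params.append(P(args.vararg.arg, P.VAR_POSITIONAL))

    # kw_defaults holds None for keyword-only parameters with no default.
    for arg, node in zip(args.kwonlyargs, args.kw_defaults):
        default = P.empty if node is None else ast.literal_eval(node)
        params.append(P(arg.arg, P.KEYWORD_ONLY, default=default))

    if args.kwarg:
        params.append(P(args.kwarg.arg, P.VAR_KEYWORD))

    return inspect.Signature(params)

sig = signature_from_header("def foo(arg, b=3, *, kwonly='a'):\n    ...")
print(sig)  # (arg, b=3, *, kwonly='a')
```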

> I certainly don't agree that "remove the slash and reparse" is more
> complicated than "add a new parameter metaphor to the Python language".
> Adding support for it may be worth doing--don't ask me, I'm still nursing my
> "positional-only arguments are part of Python and forever will be" Kool-aid.
> I'm just dealing with cold harsh reality as I understand it.
>
> As for handling optional argument groups, my gut feeling is that we're
> better off not leaking it out of Argument Clinic--don't expose it in this
> string we're talking about, and don't add support for it in the
> inspect.Parameter object.  I'm not going to debate range(), the syntax of
> which predates one of our release managers.  But I suggest option groups are
> simply a misfeature of the curses module.  There are some other possible
> uses in builtins (I forgot to dig those out this evening) but so far we're
> talking adding complexity to an array of technologies (this representation,
> the parser, the Parameter object) to support a handful of uses of something
> we shouldn't have done in the first place, for consumers who I think won't
> care and won't appreciate the added conceptual complexity.

Agreed on both points, but this should be articulated in the PEP.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

