[Distutils] Q about best practices now (or near future)

Sat Jul 20 08:10:13 CEST 2013

On 20 July 2013 01:47, PJ Eby <pje at telecommunity.com> wrote:
> On Fri, Jul 19, 2013 at 9:10 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Right, I think the reasonable near term solutions are for pip to either:
>>
>> 1. generate zc.buildout style wrappers with absolute paths to avoid
>> the implied runtime dependency
>> 2. interpret use of script entry points as an implied dependency on
>> setuptools and install it even if not otherwise requested
>>
>> Either way, pip would need to do something about its *own* command
>> line script, which heavily favours option 1
>
> Option 1 also would address some or all of the startup performance complaint.
>
> It occurs to me that it might actually be a good idea *not* to put the
> script wrappers in the standard entry points file, even if that's what
> setuptools does right now: if lots of packages use that approach,
> it'll slow down the effective indexing for code that's scanning
> multiple packages for something like a sqlalchemy adapter.
>
> (Alternately, we could use something like
> 'exports-some.group.name.json' so that each export group is a separate
> file; this would keep scripts separate from everything else, and
> optimize plugin searches falling in a particular group.  In fact, the
> files needn't have any contents; it'd be okay to just parse the main
> .json for any distribution that has exports in the group you're
> looking for.  i.e., the real purpose of the separation of entry points
> was always just to avoid loading metadata for distributions that don't
> have the kind of exports you're looking for.  In the old world, few
> distributions exported anything, so just identifying whether a
> distribution had exports was sufficient.  In the new world, more and
> more distributions over time will have some kind of export, so knowing
> *which* exports they have will become more important.)

A not-so-quick sketch of my current thinking:

Two new fields in PEP 426: commands and exports

Like the core dependency metadata, both get generated files:
pydist-commands.json and pydist-exports.json

(As far as the performance concern goes, I think longer term we'll
probably move to a richer installation database format that includes
an SQLite cache file managed by the installers. But near term, I like
the idea of being able to check "has commands or not" and "has exports
or not" with a single stat call for the appropriate file.)
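
To make the "single stat call" idea concrete, here is a minimal
sketch. The helper names are made up for illustration, and it assumes
the generated files sit directly inside each distribution's metadata
directory, which the draft doesn't pin down yet:

```python
import os

# File names as proposed above; their location is an assumption.
COMMANDS_FILE = "pydist-commands.json"
EXPORTS_FILE = "pydist-exports.json"


def has_commands(metadata_dir):
    """Return True if the distribution declares any commands."""
    # os.path.isfile boils down to one stat() of the candidate path
    return os.path.isfile(os.path.join(metadata_dir, COMMANDS_FILE))


def has_exports(metadata_dir):
    """Return True if the distribution declares any exports."""
    return os.path.isfile(os.path.join(metadata_dir, EXPORTS_FILE))
```

Nothing gets opened or parsed unless the file exists, so distributions
without commands or exports cost one failed stat each.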

Rather than using the "module.name:qualified.name" format (as the PEP
currently does for the install_hooks), "export specifiers" would be
defined as a mapping with the following subfields:

    * module
    * qualname (as per PEP 3155)
    * extra

Both qualname and extra would be optional. "extra" indicates that the
export is only present if that extra is installed.
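
As a rough sketch of how a consumer might resolve such a mapping to an
object (resolve_export is a hypothetical helper name, and handling of
"extra" is deliberately left out):

```python
import importlib


def resolve_export(spec):
    """Resolve an export specifier mapping to the object it names.

    "module" is required and "qualname" is optional; without a
    qualname the module object itself is returned.
    """
    obj = importlib.import_module(spec["module"])
    qualname = spec.get("qualname")
    if qualname:
        # A PEP 3155 qualname is a dotted path inside the module
        for attr in qualname.split("."):
            obj = getattr(obj, attr)
    return obj


# Structured equivalent of the "os.path:join" string form
join = resolve_export({"module": "os.path", "qualname": "join"})
```

Splitting on "." handles nested names like "SomeClass.method", which
is the main thing the qualname subfield buys over a plain attribute
name.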

The top level commands field would have three subfields:
"wrap_console", "wrap_gui" and "prebuilt". The wrap_console and
wrap_gui subfields would both be maps of command names to export
specifiers (i.e. requests for an installer to generate the appropriate
wrappers), while prebuilt would be a mapping of command names to paths
relative to the scripts directory (as strings).
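
For illustration, a commands field along those lines might look like
the following, written as a Python dict rather than JSON. The project
name, command names and paths are all invented, and skipping wrappers
whose extra isn't installed is my reading of the "extra" subfield
rather than settled behaviour:

```python
# Hypothetical commands field for an imaginary "mytool" distribution
commands = {
    "wrap_console": {
        "mytool": {"module": "mytool.cli", "qualname": "main"},
    },
    "wrap_gui": {
        "mytool-gui": {
            "module": "mytool.gui",
            "qualname": "main",
            "extra": "gui",
        },
    },
    "prebuilt": {
        # command name -> path relative to the scripts directory
        "mytool-helper": "mytool-helper.sh",
    },
}


def wrappers_to_generate(commands, installed_extras=()):
    """Yield (kind, name, spec) for each wrapper an installer creates."""
    for kind in ("wrap_console", "wrap_gui"):
        for name, spec in sorted(commands.get(kind, {}).items()):
            extra = spec.get("extra")
            # Assumption: a command tied to an extra is only wrapped
            # when that extra has been installed
            if extra is None or extra in installed_extras:
                yield kind, name, spec
```

Prebuilt entries never go through wrapper generation, so the helper
ignores them.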

Note that given that Python 2.7+ and 3.2+ can execute packages with a
__main__ submodule, the export specifier for a command entry *may*
just be the module component and it should still work.
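
A sketch of what a generated wrapper could do in the module-only case,
using the standard library's runpy (run_module_command is a
hypothetical name, and real wrappers would need more care around exit
codes):

```python
import runpy


def run_module_command(spec):
    """Run a module-only export specifier like "python -m <module>"."""
    if spec.get("qualname"):
        raise ValueError("expected a module-only export specifier")
    # For a package, runpy executes its __main__ submodule;
    # alter_sys=True makes sys.argv[0] look as it would under -m
    return runpy.run_module(
        spec["module"], run_name="__main__", alter_sys=True
    )
```

run_module returns the resulting module globals, which is handy for
testing but would normally be ignored by a wrapper script.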

The exports field is just a rebranded and slightly rearranged
entry_points structure: the top level keys in the hash map are "export
groups" (defined in the same way as metadata extensions are defined)
and the individual entries in each export group are arbitrary keys
(meaning determined by the export group) mapping to export specifiers.
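
Here is a sketch of a plugin search over that structure. It assumes
pydist-exports.json holds the exports mapping itself at its top level
and lives in each distribution's metadata directory; both are guesses
on my part, as are the function name and the group name in the usage
comment:

```python
import json
import os


def iter_group_exports(metadata_dirs, group):
    """Yield (metadata_dir, name, spec) for every export in a group."""
    for metadata_dir in metadata_dirs:
        path = os.path.join(metadata_dir, "pydist-exports.json")
        try:
            with open(path) as f:
                exports = json.load(f)
        except OSError:
            # No exports file: this distribution exports nothing
            continue
        for name, spec in sorted(exports.get(group, {}).items()):
            yield metadata_dir, name, spec


# Usage, with a made-up group name:
#   for dist, name, spec in iter_group_exports(dirs, "example.plugins"):
#       ...
```

This still parses every exports file to find one group, which is the
cost PJ's per-group file idea (or an installer-managed cache) would
avoid.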

With this change, I may even move the current top level
"install_hooks" field inside the "exports" field. Even if it stays at
the top level, the values will become export specifiers rather than
using the entry points string format.
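
Converting existing values should be mechanical. A minimal sketch for
the "module.name:qualified.name" form (the function name is
hypothetical, and it ignores the "[extras]" suffix that setuptools
entry point strings can carry):

```python
def specifier_from_string(value):
    """Convert "module.name:qualified.name" to an export specifier."""
    module, sep, qualname = value.partition(":")
    spec = {"module": module.strip()}
    # The qualname part is optional, so only add it when present
    if sep and qualname.strip():
        spec["qualname"] = qualname.strip()
    return spec
```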

Not sure when I'll get that tidied up and incorporated into a new
draft of PEP 426, but I think it covers everything.

For those wondering about my dividing line between "custom string
format" and "structured data": the custom string formats in PEP 426
should be limited to things that are likely to be passed as command
line arguments (like requirement specifiers and their assorted
components), or those where using structured data would be
extraordinarily verbose (like environment markers). If I have any
custom string formats still in there that don't fit either of those
categories, then let me know and I'll see if I can replace them with
structured data.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia