From chris.barker at noaa.gov Wed Feb 1 11:07:15 2017 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 1 Feb 2017 08:07:15 -0800 Subject: [Distutils] install questions and help requested. ---pyautogui In-Reply-To: References: Message-ID: <8960273004026282209@unknownmsgid> This is really a list for discussing development of distribution tools, rather than help on basic usage. But; > > >>> pip install pyautogui > SyntaxError: invalid syntax This looks like you are trying to run pip at the Python prompt. Pip is designed to be run st a system command line ("DOS prompt") Try the same command there. -CHB > > Sincerely, > Michael G. Strain Jr. > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From ncoghlan at gmail.com Mon Feb 6 06:17:05 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 Feb 2017 12:17:05 +0100 Subject: [Distutils] Adding the "Description-Content-Type" metadata field Message-ID: Hi folks, Marc Abramowitz has prepared a PR for the Core Metadata section of the specifications page [1] that adds a new "Description-Content-Type" field: https://github.com/pypa/python-packaging-user-guide/pull/258 The draft text has now reached the point where I'm prepared to accept it, so this thread offers folks one last chance to provide feedback before we make it official. Full text of the new subsection ========================================= Description-Content-Type ~~~~~~~~~~~~~~~~~~~~~~~~ A string containing the format of the distribution's description, so that tools can intelligently render the description. Historically, PyPI supported descriptions in plain text and `reStructuredText (reST) `_, and could render reST into HTML. However, it is common for distribution authors to write the description in `Markdown `_ (`RFC 7763 `_) as many code hosting sites render Markdown READMEs, and authors would reuse the file for the description. PyPI didn't recognize the format and so could not render the description correctly. This resulted in many packages on PyPI with poorly-rendered descriptions when Markdown is left as plain text, or worse, was attempted to be rendered as reST. This field allows the distribution author to specify the format of their description, opening up the possibility for PyPI and other tools to be able to render Markdown and other formats. The format of this field is the same as the ``Content-Type`` header in HTTP (e.g.: `RFC 1341 `_). Briefly, this means that it has a ``type/subtype`` part and then it can optionally have a number of parameters: Format:: Description-Content-Type: /; charset=[; = ...] The ``type/subtype`` part has only a few legal values: - ``text/plain`` - ``text/x-rst`` - ``text/markdown`` The ``charset`` parameter can be used to specify whether the character set in use is UTF-8, ASCII, etc. If ``charset`` is not provided, then it is recommended that the implementation (e.g.: PyPI) treat the content as UTF-8. Other parameters might be specific to the chosen subtype. For example, for the ``markdown`` subtype, there is a ``variant`` parameter that allows specifying the variant of Markdown in use, such as: - ``CommonMark`` for `CommonMark` `_ - ``GFM`` for `GitHub Flavored Markdown (GFM) `_ - ``Original`` for `Gruber's original Markdown syntax `_ Example:: Description-Content-Type: text/plain; charset=UTF-8 Example:: Description-Content-Type: text/x-rst; charset=UTF-8 Example:: Description-Content-Type: text/markdown; charset=UTF-8; variant=CommonMark Example:: Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM Example:: Description-Content-Type: text/markdown; charset=UTF-8; variant=Original If a ``Description-Content-Type`` is not specified or it's set to an unrecognized value, then the assumed content type is ``text/x-rst; charset=UTF-8``. If the ``charset`` is not specified or it's set to an unrecognized value, then the assumed ``charset`` is ``UTF-8``. If the subtype is ``markdown`` and ``variant`` is not specified or it's set to an unrecognized value, then the assumed ``variant`` is ``CommonMark``. ========================================= [1] https://packaging.python.org/specifications/#core-metadata Regards, Nick. P.S. I know I still need to update https://www.pypa.io/en/latest/specifications/ to reflect the ability to make small backwards compatible adjustments to the specifications without a PEP, so I'll get that sorted today, since I've been talking about it for approximately forever. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Feb 6 07:14:04 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 Feb 2017 13:14:04 +0100 Subject: [Distutils] pypa.io PR to document the actual current spec update process Message-ID: Hi folks, The "Specifications" section in the pypa.io developer's manual had fallen behind the process we've actually been using in recent times, so I've finally submitted a PR to bring it up to date: https://github.com/pypa/pypa.io/pull/19 General questions about the change are best asked on the list, while detailed comments on the specific wording in the PR are best submitted through GitHub. Cheers, Nick. P.S. See https://github.com/pypa/pypa.io/issues/11 for some additional background -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vmittal05 at gmail.com Mon Feb 6 14:27:49 2017 From: vmittal05 at gmail.com (varun mittal) Date: Tue, 7 Feb 2017 00:57:49 +0530 Subject: [Distutils] bdist_deb always creates 'all' architecture package for me Message-ID: Hi all I am totally new to debian package building. Need to create a deb package from source, for Ubuntu. The package would contain mostly python code and a singular C file. My control file in 'debian' directory reads 'any' for Architecture. But running bdist_deb always creates _all.deb package. How to control that ? I tried forcing it to 'amd64' too, but didn't succeed Thanks n regards Mittal -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Tue Feb 7 06:29:30 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Tue, 07 Feb 2017 11:29:30 +0000 Subject: [Distutils] Indexing modules in Python distributions Message-ID: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> For a variety of reasons, I would like to build an index of what modules/packages are contained in which distributions ('packages') on PyPI. For instance: - Identifying requirements by static analysis of code: 'import zmq' -> requires pyzmq - Finding corresponding packages from different packaging systems: pyzmq on PyPI corresponds to pyzmq in conda, and python[3]-zmq in Debian repositories. This is an oversimplification, but importable module names provide a common basis to compare packages. I'd like a tool that could pick between different ways of installing a given module. People often assume that the import name is the same as the name on PyPI. This is true in the vast majority of cases, but there's no requirement that they are the same, and there are cases where they're not - pyzmq is one example. The metadata field 'Provides' is, according to PEP 314, intended for this purpose, but the standard packaging tools don't make it easy to use, and consequently very few packages specify it. I have started putting together a tool to index wheels. It reads a .whl file, finds modules inside it, and tries to identify namespace packages. It's still quite rough, but it worked with the wheels I tried. https://github.com/takluyver/wheeldex Is this something that other people are interested in? One thing I'm trying to work out at the moment is how the data would be accessed: as a web service that tools can query online, or more like Linux packaging, where tools download and cache a list to do lookups locally. Or both? There's also, of course, the question of how the index would be built and updated. Thanks, Thomas From steve.dower at python.org Tue Feb 7 09:38:46 2017 From: steve.dower at python.org (Steve Dower) Date: Tue, 7 Feb 2017 06:38:46 -0800 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> Message-ID: I'm interested, and potentially in a position to provide funded infrastructure for this (though perhaps not as soon as you'd like, since things can move slowly at my end). My personal preference would be to download a full list. This is slow moving data that will gzip nicely, and my uses (in IDE) will require many tentative queries. I can also see value in a single-query API, but keep it simple - the value here is in the data, not the lookup. As far as updates go, most packaging systems should have some sort of release notification or update feed, so the work is likely going to be in hooking up to those and turning it into a scan task. Cheers, Steve Top-posted from my Windows Phone -----Original Message----- From: "Thomas Kluyver" Sent: ?2/?7/?2017 3:30 To: "distutils-sig at python.org" Subject: [Distutils] Indexing modules in Python distributions For a variety of reasons, I would like to build an index of what modules/packages are contained in which distributions ('packages') on PyPI. For instance: - Identifying requirements by static analysis of code: 'import zmq' -> requires pyzmq - Finding corresponding packages from different packaging systems: pyzmq on PyPI corresponds to pyzmq in conda, and python[3]-zmq in Debian repositories. This is an oversimplification, but importable module names provide a common basis to compare packages. I'd like a tool that could pick between different ways of installing a given module. People often assume that the import name is the same as the name on PyPI. This is true in the vast majority of cases, but there's no requirement that they are the same, and there are cases where they're not - pyzmq is one example. The metadata field 'Provides' is, according to PEP 314, intended for this purpose, but the standard packaging tools don't make it easy to use, and consequently very few packages specify it. I have started putting together a tool to index wheels. It reads a .whl file, finds modules inside it, and tries to identify namespace packages. It's still quite rough, but it worked with the wheels I tried. https://github.com/takluyver/wheeldex Is this something that other people are interested in? One thing I'm trying to work out at the moment is how the data would be accessed: as a web service that tools can query online, or more like Linux packaging, where tools download and cache a list to do lookups locally. Or both? There's also, of course, the question of how the index would be built and updated. Thanks, Thomas _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From Christopher.Wilcox at microsoft.com Tue Feb 7 11:49:14 2017 From: Christopher.Wilcox at microsoft.com (Chris Wilcox) Date: Tue, 7 Feb 2017 16:49:14 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> Message-ID: Thanks for cc-ing me Steve. I may be able to help jump-start this a bit and provide a platform for this to run on. I deployed a small service that scans PyPI to figure out statistics on Python 2 vs Python 3 support using PyPI Classifiers. The source is on GitHub: https://github.com/crwilcox/PyPI-Gatherer. It watches the PyPI updates feed and refreshes entries for packages as they show up as modified. It should be possible to add your lib, query, and add an additional row or two to the result. I am happy to work together on this. Also, the data is stored in an Azure Table Storage which has rest endpoints (and a Python SDK) that makes getting the published data straight-forward. Here is an example of using the data provided by the service. This is a Jupyter Notebook analysing Python 3 Adoption: https://notebooks.azure.com/chris/libraries/pypidataanalysis Thanks. Chris From: Steve Dower [mailto:steve.dower at python.org] Sent: Tuesday, 7 February, 2017 6:39 To: Thomas Kluyver ; distutils-sig at python.org Cc: Chris Wilcox Subject: RE: [Distutils] Indexing modules in Python distributions I'm interested, and potentially in a position to provide funded infrastructure for this (though perhaps not as soon as you'd like, since things can move slowly at my end). My personal preference would be to download a full list. This is slow moving data that will gzip nicely, and my uses (in IDE) will require many tentative queries. I can also see value in a single-query API, but keep it simple - the value here is in the data, not the lookup. As far as updates go, most packaging systems should have some sort of release notification or update feed, so the work is likely going to be in hooking up to those and turning it into a scan task. Cheers, Steve Top-posted from my Windows Phone ________________________________ From: Thomas Kluyver Sent: ?2/?7/?2017 3:30 To: distutils-sig at python.org Subject: [Distutils] Indexing modules in Python distributions For a variety of reasons, I would like to build an index of what modules/packages are contained in which distributions ('packages') on PyPI. For instance: - Identifying requirements by static analysis of code: 'import zmq' -> requires pyzmq - Finding corresponding packages from different packaging systems: pyzmq on PyPI corresponds to pyzmq in conda, and python[3]-zmq in Debian repositories. This is an oversimplification, but importable module names provide a common basis to compare packages. I'd like a tool that could pick between different ways of installing a given module. People often assume that the import name is the same as the name on PyPI. This is true in the vast majority of cases, but there's no requirement that they are the same, and there are cases where they're not - pyzmq is one example. The metadata field 'Provides' is, according to PEP 314, intended for this purpose, but the standard packaging tools don't make it easy to use, and consequently very few packages specify it. I have started putting together a tool to index wheels. It reads a .whl file, finds modules inside it, and tries to identify namespace packages. It's still quite rough, but it worked with the wheels I tried. https://github.com/takluyver/wheeldex Is this something that other people are interested in? One thing I'm trying to work out at the moment is how the data would be accessed: as a web service that tools can query online, or more like Linux packaging, where tools download and cache a list to do lookups locally. Or both? There's also, of course, the question of how the index would be built and updated. Thanks, Thomas _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Wed Feb 8 13:14:38 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Wed, 08 Feb 2017 18:14:38 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> Message-ID: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Thanks Steve, Chris, On Tue, Feb 7, 2017, at 04:49 PM, Chris Wilcox wrote: > I may be able to help jump-start this a bit and provide a platform for > this to run on. I deployed a small service that scans PyPI to figure > out statistics on Python 2 vs Python 3 support using PyPI Classifiers. > The source is on GitHub: https://github.com/crwilcox/PyPI-Gatherer. > It watches the PyPI updates feed and refreshes entries for packages as > they show up as modified. It should be possible to add your lib, > query, and add an additional row or two to the result. I am happy to > work together on this. Also, the data is stored in an Azure Table > Storage which has rest endpoints (and a Python SDK) that makes getting > the published data straight-forward. I had a quick look through this, and it does look like it should provide a useful framework for scanning PyPI and updating the results. :-) What I'm proposing differs in that it would need to download files from PyPI - basically all of them, if we're thorough about it. I imagine that's going to involve a lot of data transfer. Do we know what order of magnitude we're talking about? Is it so large that we should be thinking of running the scanner in the same data centre as the file storage? Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Feb 8 18:06:28 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 8 Feb 2017 17:06:28 -0600 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Message-ID: On Wednesday, February 8, 2017, Thomas Kluyver wrote: > Thanks Steve, Chris, > > On Tue, Feb 7, 2017, at 04:49 PM, Chris Wilcox wrote: > > I may be able to help jump-start this a bit and provide a platform for > this to run on. I deployed a small service that scans PyPI to figure out > statistics on Python 2 vs Python 3 support using PyPI Classifiers. The > source is on GitHub: https://github.com/crwilcox/PyPI-Gatherer. It > watches the PyPI updates feed and refreshes entries for packages as they > show up as modified. It should be possible to add your lib, query, and add > an additional row or two to the result. I am happy to work together on > this. Also, the data is stored in an Azure Table Storage which has rest > endpoints (and a Python SDK) that makes getting the published data > straight-forward. > > > I had a quick look through this, and it does look like it should provide a > useful framework for scanning PyPI and updating the results. :-) > > What I'm proposing differs in that it would need to download files from > PyPI - basically all of them, if we're thorough about it. I imagine that's > going to involve a lot of data transfer. Do we know what order of magnitude > we're talking about? Is it so large that we should be thinking of running > the scanner in the same data centre as the file storage? > So, IIUC, you're looking to emit ((URL, release, platform), namespaces_odict) for each new and all existing packages; by uncompressing every package and running every setup.py (hopefully in a container)? https://github.com/python/pypi-salt/blob/master/provisioning/salt/roots/pillar/top.sls https://github.com/python/pypi-salt/blob/master/provisioning/salt/roots/pillar/warehouse-deploys/warehouse-dev.sls https://github.com/python/pypi-salt/blob/master/provisioning/salt/roots/salt/warehouse/web.sls - https://github.com/pypa/warehouse/blob/master/warehouse/packaging/search.py - elasticsearch_dsl - https://github.com/pypa/warehouse/blob/master/warehouse/packaging/models.py - SQLAlchemy - https://github.com/pypa/warehouse/blob/master/warehouse/celery.py - celery - https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/json.py - namespaces are useful metadata (worth adding to the spec) - https://github.com/pypa/interoperability-peps/issues/31 - JSONLD - https://github.com/python/psf-salt/blob/master/pillar/prod/top.sls - https://github.com/python/psf-salt/blob/master/pillar/prod/roles.sls - One CI project (container FROM python: (debian)) per python package with additional metadata per project? - conda-forge solves for this case - and then how to post the extra metadata (build artifact) back from the CI build and mark the task as done Could this (namespace extraction) be added to 'setup.py build' for the future? > > Thomas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Wed Feb 8 21:15:29 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Thu, 09 Feb 2017 02:15:29 +0000 Subject: [Distutils] GSoC 2017 - Working on pip Message-ID: Hello Everyone! Ralf Gommers suggested that I put this proposal here on this list, for feedback and for seeing if anyone would be willing to mentor me. So, here it is. ----- My name is Pradyun Gedam. I'm currently a first year student VIT University in India. I would like to apply for GSoC 2017 under PSF. I currently have a project in mind - the "pip needs a dependency resolver" issue [1]. I would like to take on this specific project but am willing to do some other project as well. For some background, around mid 2016, I started contributing to pip. The first issue I tackled was #59 [2] - a request for upgrade command and an upgrade-all command that has been open for over 5.5 years. Over the months following that, I've have had the opportunity to work with and understand multiple parts of pip's codebase while working on this issue and a few others. This search on GitHub issues [3] also provides a good summary of what work I've done on pip. [2]: https://github.com/pypa/pip/issues/988 [2]: https://github.com/pypa/pip/issues/59 [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg Eagerly-waiting-for-a-response-ly, Pradyun Gedam -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Thu Feb 9 05:33:27 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Thu, 09 Feb 2017 10:33:27 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Message-ID: <1486636407.1437380.875436168.70CE2BE5@webmail.messagingengine.com> On Wed, Feb 8, 2017, at 11:06 PM, Wes Turner wrote: > So, IIUC, > you're looking to emit > ((URL, release, platform), namespaces_odict) > for each new and all existing packages; > by uncompressing every package and running every setup.py (hopefully > in a container)? Something like that, yes. For packages that publish wheels, we can analyse those directly without needing to run setup.py. Of course there are many packages with only sdists published. > Could this (namespace extraction) be added to 'setup.py build' for > the future? Potentially. As I mentioned, there is a place in the metadata to put this information - the 'Provides' field. However, relying on package uploaders would take a long time to build up decent coverage of the available packages, so I'm inclined to focus on scanning PyPI, similar to the tool Chris already showed. Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From xav.fernandez at gmail.com Thu Feb 9 07:51:58 2017 From: xav.fernandez at gmail.com (Xavier Fernandez) Date: Thu, 9 Feb 2017 13:51:58 +0100 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: Message-ID: That's would be a great news :) On Thu, Feb 9, 2017 at 3:15 AM, Pradyun Gedam wrote: > Hello Everyone! > > Ralf Gommers suggested that I put this proposal here on this list, for > feedback and for seeing if anyone would be willing to mentor me. So, here > it is. > > ----- > > My name is Pradyun Gedam. I'm currently a first year student VIT > University in India. > > I would like to apply for GSoC 2017 under PSF. > > I currently have a project in mind - the "pip needs a dependency resolver" > issue [1]. I would like to take on this specific project but am willing to > do some other project as well. > > For some background, around mid 2016, I started contributing to pip. The > first issue I tackled was #59 [2] - a request for upgrade command and an > upgrade-all command that has been open for over 5.5 years. Over the months > following that, I've have had the opportunity to work with and understand > multiple parts of pip's codebase while working on this issue and a few > others. This search on GitHub issues [3] also provides a good summary of > what work I've done on pip. > > [2]: https://github.com/pypa/pip/issues/988 > [2]: https://github.com/pypa/pip/issues/59 > [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg > > Eagerly-waiting-for-a-response-ly, > Pradyun Gedam > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 9 09:20:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Feb 2017 15:20:19 +0100 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Message-ID: On 8 February 2017 at 19:14, Thomas Kluyver wrote: > What I'm proposing differs in that it would need to download files from PyPI > - basically all of them, if we're thorough about it. I imagine that's going > to involve a lot of data transfer. Do we know what order of magnitude we're > talking about? Is it so large that we should be thinking of running the > scanner in the same data centre as the file storage? Last time I asked Donald about doing things like this, he noted that a full mirror is ~215 GiB. That was a year or two ago so I assume the number has gone up since then, but it should still be in the same order of magnitude. >From an ecosystem resilience point of view, there's also a lot to be said for having copies of the full PyPI bulk artifact store in both AWS S3 (which is where the production PyPI data lives) and in Azure :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From donald at stufft.io Thu Feb 9 09:53:00 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 9 Feb 2017 09:53:00 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: Message-ID: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> I?ve never done it before, but I?m happy to provide mentoring on this. > On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: > > Hello Everyone! > > Ralf Gommers suggested that I put this proposal here on this list, for feedback and for seeing if anyone would be willing to mentor me. So, here it is. > > ----- > > My name is Pradyun Gedam. I'm currently a first year student VIT University in India. > > I would like to apply for GSoC 2017 under PSF. > > I currently have a project in mind - the "pip needs a dependency resolver" issue [1]. I would like to take on this specific project but am willing to do some other project as well. > > For some background, around mid 2016, I started contributing to pip. The first issue I tackled was #59 [2] - a request for upgrade command and an upgrade-all command that has been open for over 5.5 years. Over the months following that, I've have had the opportunity to work with and understand multiple parts of pip's codebase while working on this issue and a few others. This search on GitHub issues [3] also provides a good summary of what work I've done on pip. > > [2]: https://github.com/pypa/pip/issues/988 > [2]: https://github.com/pypa/pip/issues/59 > [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg > > Eagerly-waiting-for-a-response-ly, > Pradyun Gedam > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu Feb 9 17:18:22 2017 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 9 Feb 2017 22:18:22 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Message-ID: <20170209221822.GS12827@yuggoth.org> On 2017-02-08 18:14:38 +0000 (+0000), Thomas Kluyver wrote: [...] > What I'm proposing differs in that it would need to download files from > PyPI - basically all of them, if we're thorough about it. I imagine > that's going to involve a lot of data transfer. Do we know what order of > magnitude we're talking about? [...] The crowd I run with uses https://pypi.org/project/bandersnatch/ to maintain a full PyPI mirror for our project's distributed CI system, and du says the current aggregate size is 488GiB. Also if you want to initialize a full mirror this way, plan for it to take several days to populate. -- Jeremy Stanley From pradyunsg at gmail.com Fri Feb 10 13:20:03 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Fri, 10 Feb 2017 18:20:03 +0000 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: Yay! Thank you so much for a prompt and positive response! I'm pretty excited and looking forward to this. On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: I?ve never done it before, but I?m happy to provide mentoring on this. On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: Hello Everyone! Ralf Gommers suggested that I put this proposal here on this list, for feedback and for seeing if anyone would be willing to mentor me. So, here it is. ----- My name is Pradyun Gedam. I'm currently a first year student VIT University in India. I would like to apply for GSoC 2017 under PSF. I currently have a project in mind - the "pip needs a dependency resolver" issue [1]. I would like to take on this specific project but am willing to do some other project as well. For some background, around mid 2016, I started contributing to pip. The first issue I tackled was #59 [2] - a request for upgrade command and an upgrade-all command that has been open for over 5.5 years. Over the months following that, I've have had the opportunity to work with and understand multiple parts of pip's codebase while working on this issue and a few others. This search on GitHub issues [3] also provides a good summary of what work I've done on pip. [2]: https://github.com/pypa/pip/issues/988 [2]: https://github.com/pypa/pip/issues/59 [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg Eagerly-waiting-for-a-response-ly, Pradyun Gedam _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Feb 10 13:59:32 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 10 Feb 2017 12:59:32 -0600 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: >From the discussion on https://github.com/pypa/pip/issues/988#issuecomment-279033079: - https://github.com/ContinuumIO/pycosat (picosat) - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c - https://github.com/ContinuumIO/pycosat/tree/master/examples - https://github.com/enthought/sat-solver (MiniSat) - https://github.com/enthought/sat-solver/tree/master/simplesat/tests - https://github.com/enthought/sat-solver/blob/master/requirements.txt (PyYAML, enum34) Is there a better way than SAT? On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam wrote: > Yay! Thank you so much for a prompt and positive response! I'm pretty > excited and looking forward to this. > > On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: > > I?ve never done it before, but I?m happy to provide mentoring on this. > > On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: > > Hello Everyone! > > Ralf Gommers suggested that I put this proposal here on this list, for > feedback and for seeing if anyone would be willing to mentor me. So, here > it is. > > ----- > > My name is Pradyun Gedam. I'm currently a first year student VIT > University in India. > > I would like to apply for GSoC 2017 under PSF. > > I currently have a project in mind - the "pip needs a dependency resolver" > issue [1]. I would like to take on this specific project but am willing to > do some other project as well. > > For some background, around mid 2016, I started contributing to pip. The > first issue I tackled was #59 [2] - a request for upgrade command and an > upgrade-all command that has been open for over 5.5 years. Over the months > following that, I've have had the opportunity to work with and understand > multiple parts of pip's codebase while working on this issue and a few > others. This search on GitHub issues [3] also provides a good summary of > what work I've done on pip. > > [2]: https://github.com/pypa/pip/issues/988 > [2]: https://github.com/pypa/pip/issues/59 > [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg > > Eagerly-waiting-for-a-response-ly, > Pradyun Gedam > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > > ? > > Donald Stufft > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Fri Feb 10 14:33:47 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 14:33:47 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: Yes, don't use a SAT solver. It requires all metadata from all packages (~30MB uncompressed) and gives hard to predict results in some cases. Also the lack of fixed dependencies is a substantial problem for a SAT solver. Overall, we think it makes more sense to use a simple backtracking dependency resolution algorithm. Sebastien Awwad (CCed) has been looking at a bunch of data around the speed and other tradeoffs of the different algos. Sebastien: Sometime next week, can you write it up in a way that is suitable for sharing? Justin On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: > From the discussion on https://github.com/pypa/pip/ > issues/988#issuecomment-279033079: > > > - https://github.com/ContinuumIO/pycosat (picosat) > - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) > - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c > - https://github.com/ContinuumIO/pycosat/tree/master/examples > - https://github.com/enthought/sat-solver (MiniSat) > - https://github.com/enthought/sat-solver/tree/master/ > simplesat/tests > - https://github.com/enthought/sat-solver/blob/master/ > requirements.txt (PyYAML, enum34) > > > Is there a better way than SAT? > > On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam > wrote: > >> Yay! Thank you so much for a prompt and positive response! I'm pretty >> excited and looking forward to this. >> >> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >> >> I?ve never done it before, but I?m happy to provide mentoring on this. >> >> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >> >> Hello Everyone! >> >> Ralf Gommers suggested that I put this proposal here on this list, for >> feedback and for seeing if anyone would be willing to mentor me. So, here >> it is. >> >> ----- >> >> My name is Pradyun Gedam. I'm currently a first year student VIT >> University in India. >> >> I would like to apply for GSoC 2017 under PSF. >> >> I currently have a project in mind - the "pip needs a dependency >> resolver" issue [1]. I would like to take on this specific project but am >> willing to do some other project as well. >> >> For some background, around mid 2016, I started contributing to pip. The >> first issue I tackled was #59 [2] - a request for upgrade command and an >> upgrade-all command that has been open for over 5.5 years. Over the months >> following that, I've have had the opportunity to work with and understand >> multiple parts of pip's codebase while working on this issue and a few >> others. This search on GitHub issues [3] also provides a good summary of >> what work I've done on pip. >> >> [2]: https://github.com/pypa/pip/issues/988 >> [2]: https://github.com/pypa/pip/issues/59 >> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >> >> Eagerly-waiting-for-a-response-ly, >> Pradyun Gedam >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> >> >> ? >> >> Donald Stufft >> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastienawwad at gmail.com Fri Feb 10 14:53:41 2017 From: sebastienawwad at gmail.com (Sebastien Awwad) Date: Fri, 10 Feb 2017 19:53:41 +0000 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: While there may be some clever way of delivering changes to dependency metadata that consumes less bandwidth (The first delivery for a client will be somewhat large, but versioned metadata information plus compressed deltas could move clients from one version of the full metadata set to another, for example?), the larger problem, I think, is the lack of fixed dependencies. Even with a moderately small percentage of distributions having variable immediate dependencies, this expands out substantially when you consider all distributions that depend on those and so on, meaning that the full set of installed distributions when you run `pip install xyz==a.b.c` is surprisingly variable. In a series of install attempts run over about 400,000 of the package versions on PyPI last year, I found that simply changing the version of Python employed in an otherwise identical virtual environment results in pip installing different packages or package versions, for 16% of the distributions. If dependencies were knowable in static metadata, there would be a decent case for SAT solving. I'll try to get back to a write-up after the current rush on my main project subsides. On Fri, Feb 10, 2017 at 2:34 PM Justin Cappos wrote: > Yes, don't use a SAT solver. It requires all metadata from all packages > (~30MB uncompressed) and gives hard to predict results in some cases. > Also the lack of fixed dependencies is a substantial problem for a SAT > solver. Overall, we think it makes more sense to use a simple backtracking > dependency resolution algorithm. > > Sebastien Awwad (CCed) has been looking at a bunch of data around the > speed and other tradeoffs of the different algos. Sebastien: Sometime > next week, can you write it up in a way that is suitable for sharing? > > > Justin > > On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: > > From the discussion on > https://github.com/pypa/pip/issues/988#issuecomment-279033079: > > > - https://github.com/ContinuumIO/pycosat (picosat) > - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) > - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c > - https://github.com/ContinuumIO/pycosat/tree/master/examples > - https://github.com/enthought/sat-solver (MiniSat) > - > https://github.com/enthought/sat-solver/tree/master/simplesat/tests > - > https://github.com/enthought/sat-solver/blob/master/requirements.txt (PyYAML, > enum34) > > > Is there a better way than SAT? > > On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam > wrote: > > Yay! Thank you so much for a prompt and positive response! I'm pretty > excited and looking forward to this. > > On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: > > I?ve never done it before, but I?m happy to provide mentoring on this. > > On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: > > Hello Everyone! > > Ralf Gommers suggested that I put this proposal here on this list, for > feedback and for seeing if anyone would be willing to mentor me. So, here > it is. > > ----- > > My name is Pradyun Gedam. I'm currently a first year student VIT > University in India. > > I would like to apply for GSoC 2017 under PSF. > > I currently have a project in mind - the "pip needs a dependency resolver" > issue [1]. I would like to take on this specific project but am willing to > do some other project as well. > > For some background, around mid 2016, I started contributing to pip. The > first issue I tackled was #59 [2] - a request for upgrade command and an > upgrade-all command that has been open for over 5.5 years. Over the months > following that, I've have had the opportunity to work with and understand > multiple parts of pip's codebase while working on this issue and a few > others. This search on GitHub issues [3] also provides a good summary of > what work I've done on pip. > > [2]: https://github.com/pypa/pip/issues/988 > [2]: https://github.com/pypa/pip/issues/59 > [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg > > Eagerly-waiting-for-a-response-ly, > Pradyun Gedam > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > > ? > > Donald Stufft > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Feb 10 15:52:03 2017 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Feb 2017 15:52:03 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: > Yes, don't use a SAT solver. It requires all metadata from all packages > (~30MB uncompressed) and gives hard to predict results in some cases. > I doubt there exists an algorithm where this is not the case. Also the lack of fixed dependencies is a substantial problem for a SAT > solver. Overall, we think it makes more sense to use a simple backtracking > dependency resolution algorithm. > As soon as you want to deal with version ranges and ensure consistency of the installed packages, backtracking stops being simple rather quickly. I agree lack of fixed dependencies is an issue, but I doubt it is specific to a SAT solver. SAT solvers have been used successfully in many cases now: composer (php), dnf (Red Hat/Fedora), conda or our own packages manager at Enthought in python, 0install. I would certainly be interested in seeing a proper comparison with other algorithms. David > Sebastien Awwad (CCed) has been looking at a bunch of data around the > speed and other tradeoffs of the different algos. Sebastien: Sometime > next week, can you write it up in a way that is suitable for sharing? > > Justin > > On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: > >> From the discussion on https://github.com/pypa/pip/is >> sues/988#issuecomment-279033079: >> >> >> - https://github.com/ContinuumIO/pycosat (picosat) >> - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) >> - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c >> - https://github.com/ContinuumIO/pycosat/tree/master/examples >> - https://github.com/enthought/sat-solver (MiniSat) >> - https://github.com/enthought/sat-solver/tree/master/simplesa >> t/tests >> - https://github.com/enthought/sat-solver/blob/master/requirem >> ents.txt (PyYAML, enum34) >> >> >> Is there a better way than SAT? >> >> On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam >> wrote: >> >>> Yay! Thank you so much for a prompt and positive response! I'm pretty >>> excited and looking forward to this. >>> >>> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >>> >>> I?ve never done it before, but I?m happy to provide mentoring on this. >>> >>> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >>> >>> Hello Everyone! >>> >>> Ralf Gommers suggested that I put this proposal here on this list, for >>> feedback and for seeing if anyone would be willing to mentor me. So, here >>> it is. >>> >>> ----- >>> >>> My name is Pradyun Gedam. I'm currently a first year student VIT >>> University in India. >>> >>> I would like to apply for GSoC 2017 under PSF. >>> >>> I currently have a project in mind - the "pip needs a dependency >>> resolver" issue [1]. I would like to take on this specific project but am >>> willing to do some other project as well. >>> >>> For some background, around mid 2016, I started contributing to pip. The >>> first issue I tackled was #59 [2] - a request for upgrade command and an >>> upgrade-all command that has been open for over 5.5 years. Over the months >>> following that, I've have had the opportunity to work with and understand >>> multiple parts of pip's codebase while working on this issue and a few >>> others. This search on GitHub issues [3] also provides a good summary of >>> what work I've done on pip. >>> >>> [2]: https://github.com/pypa/pip/issues/988 >>> [2]: https://github.com/pypa/pip/issues/59 >>> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >>> >>> Eagerly-waiting-for-a-response-ly, >>> Pradyun Gedam >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >>> >>> ? >>> >>> Donald Stufft >>> >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Feb 10 16:03:33 2017 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Feb 2017 16:03:33 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 3:52 PM, David Cournapeau wrote: > > > On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: > >> Yes, don't use a SAT solver. It requires all metadata from all packages >> (~30MB uncompressed) and gives hard to predict results in some cases. >> > > I doubt there exists an algorithm where this is not the case. > > Also the lack of fixed dependencies is a substantial problem for a SAT >> solver. Overall, we think it makes more sense to use a simple backtracking >> dependency resolution algorithm. >> > > As soon as you want to deal with version ranges and ensure consistency of > the installed packages, backtracking stops being simple rather quickly. > > I agree lack of fixed dependencies is an issue, but I doubt it is specific > to a SAT solver. SAT solvers have been used successfully in many cases now: > composer (php), dnf (Red Hat/Fedora), conda or our own packages manager at > Enthought in python, 0install. > > I would certainly be interested in seeing a proper comparison with other > algorithms. > I don't have experience implementing non SAT dependency solvers, but I suspect that whatever algorithm you end up using, the "core" is the simple part, and tweaking heuristics will be the hard, developer-time consuming part. David > > David > > >> Sebastien Awwad (CCed) has been looking at a bunch of data around the >> speed and other tradeoffs of the different algos. Sebastien: Sometime >> next week, can you write it up in a way that is suitable for sharing? >> >> Justin >> >> On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: >> >>> From the discussion on https://github.com/pypa/pip/is >>> sues/988#issuecomment-279033079: >>> >>> >>> - https://github.com/ContinuumIO/pycosat (picosat) >>> - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) >>> - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c >>> - https://github.com/ContinuumIO/pycosat/tree/master/examples >>> - https://github.com/enthought/sat-solver (MiniSat) >>> - https://github.com/enthought/sat-solver/tree/master/simplesa >>> t/tests >>> - https://github.com/enthought/sat-solver/blob/master/requirem >>> ents.txt (PyYAML, enum34) >>> >>> >>> Is there a better way than SAT? >>> >>> On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam >>> wrote: >>> >>>> Yay! Thank you so much for a prompt and positive response! I'm pretty >>>> excited and looking forward to this. >>>> >>>> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >>>> >>>> I?ve never done it before, but I?m happy to provide mentoring on this. >>>> >>>> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >>>> >>>> Hello Everyone! >>>> >>>> Ralf Gommers suggested that I put this proposal here on this list, for >>>> feedback and for seeing if anyone would be willing to mentor me. So, here >>>> it is. >>>> >>>> ----- >>>> >>>> My name is Pradyun Gedam. I'm currently a first year student VIT >>>> University in India. >>>> >>>> I would like to apply for GSoC 2017 under PSF. >>>> >>>> I currently have a project in mind - the "pip needs a dependency >>>> resolver" issue [1]. I would like to take on this specific project but am >>>> willing to do some other project as well. >>>> >>>> For some background, around mid 2016, I started contributing to pip. >>>> The first issue I tackled was #59 [2] - a request for upgrade command and >>>> an upgrade-all command that has been open for over 5.5 years. Over the >>>> months following that, I've have had the opportunity to work with and >>>> understand multiple parts of pip's codebase while working on this issue and >>>> a few others. This search on GitHub issues [3] also provides a good summary >>>> of what work I've done on pip. >>>> >>>> [2]: https://github.com/pypa/pip/issues/988 >>>> [2]: https://github.com/pypa/pip/issues/59 >>>> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >>>> >>>> Eagerly-waiting-for-a-response-ly, >>>> Pradyun Gedam >>>> >>>> _______________________________________________ >>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>> >>>> >>>> >>>> ? >>>> >>>> Donald Stufft >>>> >>>> >>>> _______________________________________________ >>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>> >>>> >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Fri Feb 10 16:22:56 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 16:22:56 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 3:52 PM, David Cournapeau wrote: > > > On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: > >> Yes, don't use a SAT solver. It requires all metadata from all packages >> (~30MB uncompressed) and gives hard to predict results in some cases. >> > > I doubt there exists an algorithm where this is not the case. > Okay, so there was a discussion about the pros and cons (including algorithms like backtracking dependency resolution which do not require all metadata) a while back on the mailing list: https://mail.python.org/pipermail/distutils-sig/2015-April/026157.html (I believe you may have seen this before because you replied to a message further down in the thread.) > Also the lack of fixed dependencies is a substantial problem for a SAT >> solver. Overall, we think it makes more sense to use a simple backtracking >> dependency resolution algorithm. >> > > As soon as you want to deal with version ranges and ensure consistency of > the installed packages, backtracking stops being simple rather quickly. > Can you explain why you think this is true? I agree lack of fixed dependencies is an issue, but I doubt it is specific > to a SAT solver. SAT solvers have been used successfully in many cases now: > composer (php), dnf (Red Hat/Fedora), conda or our own packages manager at > Enthought in python, 0install. > > I would certainly be interested in seeing a proper comparison with other > algorithms. > Sure, there are different tradeoffs which make sense in different domains. Certainly, if you have a relatively small set of packages with statically defined dependencies and already are distributing all package metadata to clients, a SAT solver will be faster at resolving complex dependency issues. We can provide the data we gathered (maybe others provide get some data too?) and then the discussion will be more grounded with numbers. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Fri Feb 10 16:28:03 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 16:28:03 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: So, there aren't "heuristics" to tweak here. The algorithm just encodes the rules for trying package combinations (usually, latest version first) and then backtracks to a previous point when an unresolvable conflict is found. This is quite different from something like a SAT solver where it does use heuristics to come up with a matching scenario quickly. I don't think developers need to tweak heuristics in either case. You just pick your SAT solver and it has reasonable heuristics built in, right? Thanks, Justin On Fri, Feb 10, 2017 at 4:03 PM, David Cournapeau wrote: > > > On Fri, Feb 10, 2017 at 3:52 PM, David Cournapeau > wrote: > >> >> >> On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: >> >>> Yes, don't use a SAT solver. It requires all metadata from all packages >>> (~30MB uncompressed) and gives hard to predict results in some cases. >>> >> >> I doubt there exists an algorithm where this is not the case. >> >> Also the lack of fixed dependencies is a substantial problem for a SAT >>> solver. Overall, we think it makes more sense to use a simple backtracking >>> dependency resolution algorithm. >>> >> >> As soon as you want to deal with version ranges and ensure consistency of >> the installed packages, backtracking stops being simple rather quickly. >> >> I agree lack of fixed dependencies is an issue, but I doubt it is >> specific to a SAT solver. SAT solvers have been used successfully in many >> cases now: composer (php), dnf (Red Hat/Fedora), conda or our own packages >> manager at Enthought in python, 0install. >> >> I would certainly be interested in seeing a proper comparison with other >> algorithms. >> > > I don't have experience implementing non SAT dependency solvers, but I > suspect that whatever algorithm you end up using, the "core" is the simple > part, and tweaking heuristics will be the hard, developer-time consuming > part. > > David > >> >> David >> >> >>> Sebastien Awwad (CCed) has been looking at a bunch of data around the >>> speed and other tradeoffs of the different algos. Sebastien: Sometime >>> next week, can you write it up in a way that is suitable for sharing? >>> >>> Justin >>> >>> On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner >>> wrote: >>> >>>> From the discussion on https://github.com/pypa/pip/is >>>> sues/988#issuecomment-279033079: >>>> >>>> >>>> - https://github.com/ContinuumIO/pycosat (picosat) >>>> - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c >>>> (C) >>>> - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c >>>> - https://github.com/ContinuumIO/pycosat/tree/master/examples >>>> - https://github.com/enthought/sat-solver (MiniSat) >>>> - https://github.com/enthought/sat-solver/tree/master/simplesa >>>> t/tests >>>> - https://github.com/enthought/sat-solver/blob/master/requirem >>>> ents.txt (PyYAML, enum34) >>>> >>>> >>>> Is there a better way than SAT? >>>> >>>> On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam >>>> wrote: >>>> >>>>> Yay! Thank you so much for a prompt and positive response! I'm pretty >>>>> excited and looking forward to this. >>>>> >>>>> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >>>>> >>>>> I?ve never done it before, but I?m happy to provide mentoring on this. >>>>> >>>>> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >>>>> >>>>> Hello Everyone! >>>>> >>>>> Ralf Gommers suggested that I put this proposal here on this list, for >>>>> feedback and for seeing if anyone would be willing to mentor me. So, here >>>>> it is. >>>>> >>>>> ----- >>>>> >>>>> My name is Pradyun Gedam. I'm currently a first year student VIT >>>>> University in India. >>>>> >>>>> I would like to apply for GSoC 2017 under PSF. >>>>> >>>>> I currently have a project in mind - the "pip needs a dependency >>>>> resolver" issue [1]. I would like to take on this specific project but am >>>>> willing to do some other project as well. >>>>> >>>>> For some background, around mid 2016, I started contributing to pip. >>>>> The first issue I tackled was #59 [2] - a request for upgrade command and >>>>> an upgrade-all command that has been open for over 5.5 years. Over the >>>>> months following that, I've have had the opportunity to work with and >>>>> understand multiple parts of pip's codebase while working on this issue and >>>>> a few others. This search on GitHub issues [3] also provides a good summary >>>>> of what work I've done on pip. >>>>> >>>>> [2]: https://github.com/pypa/pip/issues/988 >>>>> [2]: https://github.com/pypa/pip/issues/59 >>>>> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >>>>> >>>>> Eagerly-waiting-for-a-response-ly, >>>>> Pradyun Gedam >>>>> >>>>> _______________________________________________ >>>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>>> >>>>> >>>>> >>>>> ? >>>>> >>>>> Donald Stufft >>>>> >>>>> >>>>> _______________________________________________ >>>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>> >>>> >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Fri Feb 10 16:36:14 2017 From: donald at stufft.io (Donald Stufft) Date: Fri, 10 Feb 2017 16:36:14 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> > On Feb 10, 2017, at 2:53 PM, Sebastien Awwad wrote: > > If dependencies were knowable in static metadata, there would be a decent case for SAT solving. I'll try to get back to a write-up after the current rush on my main project subsides. The differences between backtracking and SAT solvers and such is perhaps a bit of of my depth, but just FWIW when installing from Wheel it?s basically just waiting on a new API to get this information in a static form. Installing from sdist still has the problem (and likely will forever) but I think it?s not *unreasonable* to say that using wheels is what you need to do to get fast dep solving and if people aren?t providing wheels it will be slow(er?). ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Feb 10 16:58:15 2017 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Feb 2017 16:58:15 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 4:28 PM, Justin Cappos wrote: > So, there aren't "heuristics" to tweak here. The algorithm just encodes > the rules for trying package combinations (usually, latest version first) > and then backtracks to a previous point when an unresolvable conflict is > found. > > This is quite different from something like a SAT solver where it does use > heuristics to come up with a matching scenario quickly. > > I don't think developers need to tweak heuristics in either case. You > just pick your SAT solver and it has reasonable heuristics built in, right? > Right, so there are 2 set of heuristics: the heuristics to make SAT solvers more efficient, and heuristics to make it more useful as a dependency resolution algorithm. I am only interested in the 2nd set of heuristics here. So for SAT solvers at least, you need heuristics to tweak the search space toward something more likely solutions (from a dependency POV). E.g. composer will favor already installed packages if they match the problem. That's also why it is rather hard to use a SAT solver as a black box and then wrap it to resolve dependencies, and you instead want to have access to the SAT solver "internals". Don't you need the same kind of heuristics to make backtracking actually useful ? I agree comparing on actual problems is the best way to move this discussion forward, to compare speed, solution quality, feasibility in pip's/pypi context. If you have access to "scenarios", I would be happy to run our own SAT solver on it to compare solver's output. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Fri Feb 10 17:04:57 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 17:04:57 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: I think the difference Sebastien is trying to say is that you need info from *all* pieces of static metadata. Not just that from the packages you will end up installing. Backtracking dependency resolution will be much more like the wheel model. If one does not backtrack (which is true most of the time), it only needs the metadata from the things you end up install. Justin On Fri, Feb 10, 2017 at 4:36 PM, Donald Stufft wrote: > > On Feb 10, 2017, at 2:53 PM, Sebastien Awwad > wrote: > > If dependencies were knowable in static metadata, there would be a decent > case for SAT solving. I'll try to get back to a write-up after the current > rush on my main project subsides. > > > > The differences between backtracking and SAT solvers and such is perhaps a > bit of of my depth, but just FWIW when installing from Wheel it?s basically > just waiting on a new API to get this information in a static form. > Installing from sdist still has the problem (and likely will forever) but I > think it?s not *unreasonable* to say that using wheels is what you need to > do to get fast dep solving and if people aren?t providing wheels it will be > slow(er?). > > ? > Donald Stufft > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bussonniermatthias at gmail.com Fri Feb 10 17:06:57 2017 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Fri, 10 Feb 2017 14:06:57 -0800 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: Hi all, Assuming that all the requirements are wheels and coming from PyPI. Installed using a recent pip. How often do you think the resolution will be the same for all clients, and mostly be "pull everything from latest" ? If so, would it make sense to pre-compute thing on PyPI/warehouse at package publication time, and provide a resolution "hint" as an API endpoint ? If this "hint" is correct, it should avoid clientside work most of time. And the resolution can probably be efficiently updated as you only have to re-solve by looking as the dependees of previous version. -- M On Fri, Feb 10, 2017 at 1:36 PM, Donald Stufft wrote: > > On Feb 10, 2017, at 2:53 PM, Sebastien Awwad > wrote: > > If dependencies were knowable in static metadata, there would be a decent > case for SAT solving. I'll try to get back to a write-up after the current > rush on my main project subsides. > > > > The differences between backtracking and SAT solvers and such is perhaps a > bit of of my depth, but just FWIW when installing from Wheel it?s basically > just waiting on a new API to get this information in a static form. > Installing from sdist still has the problem (and likely will forever) but I > think it?s not *unreasonable* to say that using wheels is what you need to > do to get fast dep solving and if people aren?t providing wheels it will be > slow(er?). > > ? > Donald Stufft > > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > From ncoghlan at gmail.com Sat Feb 11 02:35:16 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2017 08:35:16 +0100 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: On 10 Feb 2017 23:05, "Justin Cappos" wrote: I think the difference Sebastien is trying to say is that you need info from *all* pieces of static metadata. Not just that from the packages you will end up installing. Backtracking dependency resolution will be much more like the wheel model. If one does not backtrack (which is true most of the time), it only needs the metadata from the things you end up install. This is key for PyPI I think - for the yum -> dnf transition, one of the biggest still unsolved problems is the increase in the amount of metadata that needs to be transferred (although the file lists used for install-by-filename are a big contributing factor to that). You can fairly readily see this in Docker container builds that rely on dnf - even on a fast connection, you may spend more than a minute downloading dependency metadata. It would take a *lot* of server round trips for per-package metadata retrieval to start comparing to bulk download times for the metadata for 90k+ packages. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokoproject at gmail.com Sat Feb 11 01:57:41 2017 From: gokoproject at gmail.com (John Wong) Date: Sat, 11 Feb 2017 01:57:41 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 5:06 PM, Matthias Bussonnier < bussonniermatthias at gmail.com> wrote: > Hi all, > > Assuming that all the requirements are wheels and coming from PyPI. > Installed using a recent pip. > > How often do you think the resolution will be the same for all > clients, and mostly be "pull everything from latest" ? I don't think there is anyway around not precompute IMO. But perhaps I am complicating things. what about non-pypi packages like git source? -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Feb 12 19:12:24 2017 From: cournape at gmail.com (David Cournapeau) Date: Sun, 12 Feb 2017 19:12:24 -0500 Subject: [Distutils] PyCon Colombia 2017 keynote on packaging Message-ID: Hi, I was invited to give a talk at PyCon Colombia 2017, and I did it on packaging. I thought people here would be interested to know about it. I insisted on the need for packaging to get software into as many hands as possible, gave a history of the packaging ecosystem, advised people to use packaging.python.org suggestions, and mentioned the manylinux effort. I tried to be as objective as possible there and mention the key people involved. I also talked a bit about what can still be improved, and focused on 3 aspects, none of which are new nor particularly insightful for people here: infrastructure for automatic wheel building, better decoupling of packaging and build, and maybe more controversially, the need for tools to remove python from the equation. https://speakerdeck.com/cournape/python-packaging-in-2017 David -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Feb 13 05:01:36 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Feb 2017 11:01:36 +0100 Subject: [Distutils] PyCon Colombia 2017 keynote on packaging In-Reply-To: References: Message-ID: On 13 Feb 2017 1:20 am, "David Cournapeau" wrote: Hi, I was invited to give a talk at PyCon Colombia 2017, and I did it on packaging. I thought people here would be interested to know about it. Thanks for the heads up! I also talked a bit about what can still be improved, and focused on 3 aspects, none of which are new nor particularly insightful for people here: infrastructure for automatic wheel building, better decoupling of packaging and build, and maybe more controversially, the need for tools to remove python from the equation. https://speakerdeck.com/cournape/python-packaging-in-2017 Yeah, I think that's a good summary of where things are right now, and where we'd like to go next. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From radomir at dopieralski.pl Mon Feb 13 07:17:28 2017 From: radomir at dopieralski.pl (Radomir Dopieralski) Date: Mon, 13 Feb 2017 13:17:28 +0100 Subject: [Distutils] Trove classifiers for MicroPython? In-Reply-To: <20161020114316.0b587052@ghostwheel> References: <20161020114316.0b587052@ghostwheel> Message-ID: <20170213131728.340bf9dd@ghostwheel> Is this the right place to ask for this? It has been over four months already, and there is no action on this. Perhaps there is some more official way to request this that I am missing? On Thu, 20 Oct 2016 11:43:16 +0200 Radomir Dopieralski wrote: > Hello everyone, > > I'm not sure this is the right place to write to propose new trove > classifiers for PyPi -- if it's not, what would be the right place? > If this is it, then please read below. > > The MicroPython project is quickly growing and becoming more mature, > and as that happens, the number of 3rd-party libraries for it grows. > Many of those libraries get uploaded to PyPi, as you can check by > searching for "micropython". MicroPython has even its own version of > "pip", called "upip", that can be used to install those libraries. > > However, there is as of yet no way to mark that a library is written > for that particular flavor of Python, as there are no trove > classifiers for it. I would like to propose adding a number of > classifiers to amend that situation: > > For the MicroPython itself: > > Programming Language :: Python :: Implementation :: MicroPython > > For the hardware it runs on: > > Operating System :: Baremetal > Environment :: Microcontroller > Environment :: Microcontroller :: PyBoard > Environment :: Microcontroller :: ESP8266 > Environment :: Microcontroller :: Micro:bit > Environment :: Microcontroller :: WiPy > Environment :: Microcontroller :: LoPy > Environment :: Microcontroller :: OpenMV > > I'm not sure if the latter makes sense, but it would certainly be > nice to be able to indicate in a machine-parseable way on which > platforms the code works. > > What do you think? -- Radomir Dopieralski From thomas at kluyver.me.uk Mon Feb 13 12:25:37 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Mon, 13 Feb 2017 17:25:37 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <20170209221822.GS12827@yuggoth.org> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> <20170209221822.GS12827@yuggoth.org> Message-ID: <1487006737.1298666.879579024.66649E72@webmail.messagingengine.com> Thanks. So the current size is about 0.5 TB, and presumably if people are maintaining full mirrors, PyPI itself can cope with that much outgoing bandwidth being used. Steve & Chris: does downloading & scanning that volume of data sound like something you'd want to do on Azure? Does anyone there have some time to put in to move this forwards? Thomas On Thu, Feb 9, 2017, at 10:18 PM, Jeremy Stanley wrote: > On 2017-02-08 18:14:38 +0000 (+0000), Thomas Kluyver wrote: > [...] > > What I'm proposing differs in that it would need to download files from > > PyPI - basically all of them, if we're thorough about it. I imagine > > that's going to involve a lot of data transfer. Do we know what order of > > magnitude we're talking about? > [...] > > The crowd I run with uses https://pypi.org/project/bandersnatch/ to > maintain a full PyPI mirror for our project's distributed CI system, > and du says the current aggregate size is 488GiB. Also if you want > to initialize a full mirror this way, plan for it to take several > days to populate. > -- > Jeremy Stanley > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From donald at stufft.io Mon Feb 13 12:35:37 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 13 Feb 2017 12:35:37 -0500 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1487006737.1298666.879579024.66649E72@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> <20170209221822.GS12827@yuggoth.org> <1487006737.1298666.879579024.66649E72@webmail.messagingengine.com> Message-ID: > On Feb 13, 2017, at 12:25 PM, Thomas Kluyver wrote: > > Thanks. So the current size is about 0.5 TB, and presumably if people > are maintaining full mirrors, PyPI itself can cope with that much > outgoing bandwidth being used. > Yea, PyPI does something like 16TB a day of bandwidth :) ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Mon Feb 13 15:51:09 2017 From: jim at jimfulton.info (Jim Fulton) Date: Mon, 13 Feb 2017 15:51:09 -0500 Subject: [Distutils] Announcing experimental wheel support in Buildout Message-ID: I've just released zc.buildout 2.8.0 and the buildout.wheel extension. If you have zc.buildout 2.8.0 or later, and you include: extensions = buildout.wheel In the buildout section of your buildout configuration, then buildout should be able to install distributions as wheels. This allowed me to install numpy using buildout, which wasn't possible before. This is a someone experimental version, which uses humpty to convert wheels to eggs. humpty in term uses uses distlib which seems to mishandle wheel metadata. (For example, it chokes if there's extra distribution meta and makes it impossible for buildout to install python-dateutil from a wheel.) Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Tue Feb 14 13:10:06 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 18:10:06 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> Message-ID: <2019192621.7718748.1487095806195@mail.yahoo.com> >?humpty in term uses uses distlib which seems to mishandle wheel> metadata. (For example, it chokes if there's extra distribution meta and > makes it impossible for buildout to install python-dateutil from a wheel.) I looked into the "mishandling". It's that the other tools don't adhere to [the current state of] PEP 426 as closely as distlib does. For example, wheel writes JSON metadata to metadata.json in the .dist-info directory, whereas PEP 426 calls for that data to be in pydist.json. The non-JSON metadata in the wheel (the METADATA file) does not strictly adhere to any of the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible fields). I can change distlib to look for metadata.json, and relax the rules to be more liberal regarding which fields to accept, but adhering to the PEP isn't mishandling things, as I see it. Work on distlib has slowed right down since around the time when PEP 426 was deferred indefinitely, and there seems to be little interest in progressing via metadata or other standardisation - we have to go by what the de facto tools (setuptools, wheel) choose to do. It's not an ideal situation, and incompatibilities can crop up, as you've seen. Regards, Vinay Sajip -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Feb 14 13:15:59 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Feb 2017 10:15:59 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <2019192621.7718748.1487095806195@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 10:10 AM, Vinay Sajip via Distutils-SIG wrote: >> humpty in term uses uses distlib which seems to mishandle wheel >> metadata. (For example, it chokes if there's extra distribution meta and >> makes it impossible for buildout to install python-dateutil from a wheel.) > > I looked into the "mishandling". It's that the other tools don't adhere to > [the current state of] PEP 426 as closely as distlib does. For example, > wheel writes JSON metadata to metadata.json in the .dist-info directory, > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > metadata in the wheel (the METADATA file) does not strictly adhere to any of > the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > fields). > > I can change distlib to look for metadata.json, and relax the rules to be > more liberal regarding which fields to accept, but adhering to the PEP isn't > mishandling things, as I see it. I thought the current status was that it's called metadata.json exactly *because* it's not standardized, and you *shouldn't* look at it? It's too bad that the JSON thing didn't work out, but I think we're better off working on better specifying the one source of truth everything already uses (METADATA) instead of bringing in *new* partially-incompatible-and-poorly-specified formats. -n -- Nathaniel J. Smith -- https://vorpus.org From jim at jimfulton.info Tue Feb 14 13:36:47 2017 From: jim at jimfulton.info (Jim Fulton) Date: Tue, 14 Feb 2017 13:36:47 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <2019192621.7718748.1487095806195@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 1:10 PM, Vinay Sajip wrote: > > humpty in term uses uses distlib which seems to mishandle wheel > > metadata. (For example, it chokes if there's extra distribution meta and > > makes it impossible for buildout to install python-dateutil from a > wheel.) > > I looked into the "mishandling". It's that the other tools don't adhere to > [the current state of] PEP 426 as closely as distlib does. For example, > wheel writes JSON metadata to metadata.json in the .dist-info directory, > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > metadata in the wheel (the METADATA file) does not strictly adhere to any > of the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > fields). > > I can change distlib to look for metadata.json, and relax the rules to be > more liberal regarding which fields to accept, but adhering to the PEP > isn't mishandling things, as I see it. > Fair enough. Notice that I said "seems to". :-] I suppose whether to be strict or not depends on use case. In my case, I was just trying to install a wheel as an egg, so permissive is definately what *I* want. Other use cases might want to be more strict. > > Work on distlib has slowed right down since around the time when PEP 426 > was deferred indefinitely, and there seems to be little interest in > progressing via metadata or other standardisation - we have to go by what > the de facto tools (setuptools, wheel) choose to do. It's not an ideal > situation, and incompatibilities can crop up, as you've seen. > Nope. Honestly, though, I wish there was *one* *library* that defined the standard, which was the case for setuptools for a while (yeah, I know, the warts, really, I know) because I really don't think there's a desire to innovate or a reason for competition at this level. In the case of wheel, perhaps it makes sense for that implementation to be authoritative. Thanks. Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Tue Feb 14 13:38:01 2017 From: dholth at gmail.com (Daniel Holth) Date: Tue, 14 Feb 2017 18:38:01 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: I would accept a pull request to stop generating metadata.json in bdist_wheel. On Tue, Feb 14, 2017 at 1:16 PM Nathaniel Smith wrote: > On Tue, Feb 14, 2017 at 10:10 AM, Vinay Sajip via Distutils-SIG > wrote: > >> humpty in term uses uses distlib which seems to mishandle wheel > >> metadata. (For example, it chokes if there's extra distribution meta and > >> makes it impossible for buildout to install python-dateutil from a > wheel.) > > > > I looked into the "mishandling". It's that the other tools don't adhere > to > > [the current state of] PEP 426 as closely as distlib does. For example, > > wheel writes JSON metadata to metadata.json in the .dist-info directory, > > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > > metadata in the wheel (the METADATA file) does not strictly adhere to > any of > > the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > > fields). > > > > I can change distlib to look for metadata.json, and relax the rules to be > > more liberal regarding which fields to accept, but adhering to the PEP > isn't > > mishandling things, as I see it. > > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? > > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Tue Feb 14 14:40:17 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 14 Feb 2017 19:40:17 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On 14 February 2017 at 18:36, Jim Fulton wrote: > I wish there was *one* *library* that defined the standard packaging should be that library, but it doesn't cover metadata precisely because that PEP 426 hasn't been accepted (it doesn't try to cover the historical metadata 1.x standards, or "de facto" standards that aren't backed by a PEP AIUI). Paul From vinay_sajip at yahoo.co.uk Tue Feb 14 14:40:56 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 19:40:56 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: <108655704.7820975.1487101256733@mail.yahoo.com> > Nope. Honestly, though, I wish there was *one* *library* that defined the standard, > which was the case for setuptools for a while (yeah, I know, the warts, really, I know) > because I really don't think there's a desire to innovate or a reason for competition > at this level. In the case of wheel, perhaps it makes sense for that implementation to > be authoritative. The problem, to me, is not whether it is authoritative - it's more that it's ad hoc, just like setuptools in some areas. For example, the decision to use "metadata.json" rather than "pydist.json" is arbitrary, and could change in the future, and anyone who relies on how things work now will have to play catch-up when that happens. That's sometimes just too much work for volunteer activity - dig into what the problem is, put through a fix (for now), rinse and repeat - all the while, little or no value is really added. In theory this is an "infrastructure" area where a single blessed implementation might be OK, but these de facto tools don't do everything one wants, so interoperability remains important. There's no reason why we shouldn't look to innovate even in this area - there's some talk of a GSoC project now to look at dependency resolution for pip - something that I had sort-of working in the distil tool long ago (as a proof of concept) [1]. We've gotten so used to how pip and setuptools work, and because they are "good enough", there is a real failure of imagination to see how things might be done better. Regards, Vinay Sajip [1] https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements From jim at jimfulton.info Tue Feb 14 15:10:03 2017 From: jim at jimfulton.info (Jim Fulton) Date: Tue, 14 Feb 2017 15:10:03 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <108655704.7820975.1487101256733@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <108655704.7820975.1487101256733@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 2:40 PM, Vinay Sajip wrote: > > Nope. Honestly, though, I wish there was *one* *library* that defined > the standard, > > which was the case for setuptools for a while (yeah, I know, the warts, > really, I know) > > because I really don't think there's a desire to innovate or a reason > for competition > > at this level. In the case of wheel, perhaps it makes sense for that > implementation to > > be authoritative. > > The problem, to me, is not whether it is authoritative - it's more that > it's ad hoc, just like > setuptools in some areas. For example, the decision to use "metadata.json" > rather than > "pydist.json" is arbitrary, and could change in the future, and anyone who > relies on how things > work now will have to play catch-up when that happens. Unless they depend on a public API provided by the wheel package. Of course, you could argue that the name of a file could be part of the API. In many ways, depending and building on a working implementation is better that drafting a standard from scratch. Packaging has moved forward largely by people who built things pragmatically that worked and solved every-day problems: setuptools/easy_install, buildout, pip, wheel... > That's sometimes just too much work for > volunteer activity - dig into what the problem is, put through a fix (for > now), rinse and > repeat - all the while, little or no value is really added. > > In theory this is an "infrastructure" area where a single blessed > implementation might be OK, > I think so. > but these de facto tools don't do everything one wants, so > interoperability remains important. > Or collaboration to improve the tool. That *should* have worked for setuptools, but sadly didn't, for various reasons. > There's no reason why we shouldn't look to innovate even in this area - > there's some talk of a > GSoC project now to look at dependency resolution Yay! (I saw that.) > for pip Gaaaa. Why can't this be in a library? (Hopefully it will be.) - something that I had sort-of working > in the distil tool long ago (as a proof of concept) [1]. Almost is a hard sell. If this was usable as a library, I'd be interested in trying to integrate it with buildout. If it worked, many buildout users would be greatful. Perhaps the GSoC project could use it as a reference or starting point. We've gotten so used to how pip and setuptools work, and because they are "good enough", there is a real > failure of imagination > to see how things might be done better. > I think there is a failure of energy. Packaging should largely be boring and most people don't want to work on it. I certainly don't, even though I have. But you picked a good example. There are major differences (I almost said competition) between pip and buildout. They provide two different models (traditional Python system installs vs Java-like component/path installs) that address different use cases. IMO, these systems should complement each other and build on common foundations. Maybe there are more cases for innovation at lower levels than I'm aware of. Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Tue Feb 14 15:21:12 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 20:21:12 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: <425841221.7853973.1487103672849@mail.yahoo.com> > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? Well, it was work-in-progress-standardised according to PEP 426 (since sometimes implementations have to work in parallel with working out the details of specifications). Given that PEP 426 wasn't done and dusted but being progressed, I would have thought it perfectly acceptable to use "pydist.json", as the only things that would be affected would be packaging tools working to the PEP. > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. When you say "everything already uses", do you mean setuptools and wheel? If nobody else is allowed to play, that's one thing. But otherwise, there need to be standards for interoperability. The METADATA file, now - exactly which standard does it follow? The one in the dateutil wheel that Jim referred to doesn't appear to conform to any of the metadata PEPs. It was rejected by old metadata code in distlib (which came of out the Python 3.3 era "packaging" package - not to be confused with Donald's of the same name - which is strict in its interpretation of those earlier PEPs). The METADATA format (key-value) is not really flexible enough for certain things which were in PEP 426 (e.g. dependency descriptions), and for these JSON seems a reasonable fit. There's no technical reason why "the JSON thing didn't work out", as far as I can see - it was just given up on for a more incremental approach (which has got no new PEPs other than 440, AFAICT). I understand that social reasons are often more important than technical reasons when it comes to success or failure of an approach; I'm just not sure that in this case, it wasn't given up on too early. Regards, Vinay Sajip From wes.turner at gmail.com Tue Feb 14 15:28:41 2017 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 14 Feb 2017 14:28:41 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 12:15 PM, Nathaniel Smith wrote: > On Tue, Feb 14, 2017 at 10:10 AM, Vinay Sajip via Distutils-SIG > wrote: > >> humpty in term uses uses distlib which seems to mishandle wheel > >> metadata. (For example, it chokes if there's extra distribution meta and > >> makes it impossible for buildout to install python-dateutil from a > wheel.) > > > > I looked into the "mishandling". It's that the other tools don't adhere > to > > [the current state of] PEP 426 as closely as distlib does. For example, > > wheel writes JSON metadata to metadata.json in the .dist-info directory, > > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > > metadata in the wheel (the METADATA file) does not strictly adhere to > any of > > the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > > fields). > > > > I can change distlib to look for metadata.json, and relax the rules to be > > more liberal regarding which fields to accept, but adhering to the PEP > isn't > > mishandling things, as I see it. > > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? > > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. > JSON-LD https://www.google.com/search?q=python+package+metadata+jsonld https://www.google.com/search?q="pep426jsonld" PEP426 (Deferred) Switching to a JSON compatible format https://www.python.org/dev/peps/pep-0426/#switching-to-a-json-compatible-format PEP 426: Define a JSON-LD context as part of the proposal https://github.com/pypa/interoperability-peps/issues/31 This doesn't work with JSON-LD 1.0: ```json releases = { "v0.0.1": {"url": ... }, "v1.0.0": {"url": ...}, } This does work with JSON-LD 1.0: ```json releases = [ {"version": "v0.0.1", "url": ...}, {"version": "v1.0.0", "url": ...}, ] ... Then adding custom attributes could be as easy as defining a URI namespace and additional attribute names; because {distutils, setuptools, pip, pipenv(?)} only need to validate the properties necessary for the relevant packaging operation. Without any JSON-LD normalization, these aren't equal: {"url": "#here"} {"schema:url": "#here"} {"http://schema.org/url", "#here"} This is the JSON downstream tools currently have/want to consume (en masse, for SAT solving, etc): https://pypi.python.org/pypi/ipython/json - It's a graph. - JSON-LD is for graphs. - There are normalizations and signatures for JSON-LD (ld-signatures != JWS) - Downstream tools need not do anything with the @context. ("JSON-LD unaware") - Downstream tools which generate pydist.jsonld should validate schema in tests Downstream tools: - https://github.com/pypa/pip/issues/988 "Pip needs a dependency resolver" (-> JSON) - https://github.com/pypa/warehouse/issues/1638 "API to get checksums" (-> JSON) Q: How do we get this (platform and architecture-specific) metadata to warehouse, where it can be hosted? A JSONLD entrypoint in warehouse (for each project, for every project, for {my_subset}): https://pypi.python.org/pypi/ipython/jsonld > I would accept a pull request to stop generating metadata.json in bdist_wheel. What about a pull request to start generating metadata.jsonld or pydist.jsonld instead? - [ ] "@context": { }, - [ ] "@graph": { }, # optional #PEP426JSONLD > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Feb 14 16:21:36 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Feb 2017 13:21:36 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Feb 14, 2017 12:21, "Vinay Sajip" wrote: > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? Well, it was work-in-progress-standardised according to PEP 426 (since sometimes implementations have to work in parallel with working out the details of specifications). Given that PEP 426 wasn't done and dusted but being progressed, I would have thought it perfectly acceptable to use "pydist.json", as the only things that would be affected would be packaging tools working to the PEP. > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. When you say "everything already uses", do you mean setuptools and wheel? If nobody else is allowed to play, that's one thing. But otherwise, there need to be standards for interoperability. The METADATA file, now - exactly which standard does it follow? The one in the dateutil wheel that Jim referred to doesn't appear to conform to any of the metadata PEPs. It was rejected by old metadata code in distlib (which came of out the Python 3.3 era "packaging" package - not to be confused with Donald's of the same name - which is strict in its interpretation of those earlier PEPs). That's why I said we need to fix the standards to bring them back in sync with reality. I'm not arguing that there's no problem, I'm saying that replacing one serialization format with another won't actually address the problem, but does cause new complications. The METADATA format (key-value) is not really flexible enough for certain things which were in PEP 426 (e.g. dependency descriptions), and for these JSON seems a reasonable fit. There's no technical reason why "the JSON thing didn't work out", as far as I can see - it was just given up on for a more incremental approach (which has got no new PEPs other than 440, AFAICT). I understand that social reasons are often more important than technical reasons when it comes to success or failure of an approach; I'm just not sure that in this case, it wasn't given up on too early. The technical problem with PEP 426 is that unless you want to throw away pypi and start over, all tools need to understand the old METADATA files regardless. So it still needs to be specified, all the same code needs to be kept around, etc. Plus the most pressing issues are like "what does the field actually mean", which is totally independent of the serialization format. If there are particular fields that need more structured data, then there are options: we could have fields in METADATA whose values are JSON, or a sidecar file that supplements the main METADATA file with extra information. But adding a new way to specify fields like Name and Version really doesn't help anybody. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Tue Feb 14 16:36:41 2017 From: donald at stufft.io (Donald Stufft) Date: Tue, 14 Feb 2017 16:36:41 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> > On Feb 14, 2017, at 1:15 PM, Nathaniel Smith wrote: > > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. > TBH I don?t think we?re going to stick with METADATA forever and it?s likely we, at some point, get to a JSON representation for this information but that is not today. We have far more pressing issues to deal with besides whether things are in one format or another. Yes, we still have a fair amount of behavior that is defined as ?whatever setuptools/distutils does?, but we?re slowly trying to break away from that. WRT to ?standard implementations? versus ?standards?, the idea of a ?standard implementation? being the source of truth and no longer needing to do all the work to define standards is a nice idea, but I think it is an idea that is never actually going to work out as well as real standardization. There is *always* going to be a need for tools that aren?t the blessed tools to interact with these items. Even if you can authoritatively say that this one Python implementation is the only implementation that any Python program will ever need, there is still the problem that people need to consume this information in languages that aren?t Python. Another problem there is it becomes incredibly difficult to know what is something that is supported as an actual feature and what is something that just sort of works because of that way that something was implemented. My goal with the packaging library is to more or less strictly implement accepted PEPs (and while I will make in progress PRs for PEPs that are currently being worked on, I won?t actually land a PR until the PEP is accepted). The only other real code there is extra utilities that make the realities of working with the specified PEPs easier (for example, we have a Version object which implements PEP 440 versions, but we also have a LegacyVersion object that implements what setuptools used to do). This not only gives us the benefit of a single implementation for people who just want to use that single blessed implementation, but it gives us the benefit of standards. This has already been useful in the packaging library where an implementation defect caused versions to get parsed slightly wrong, and we had the extensively documented PEP 440 to declare what the expected behavior was. I do not think the problem is "We've gotten so used to how pip and setuptools work, and because they are "good enough", there is a real failure of imagination to see how things might be done better.?. The hard work of doing this isn?t in writing an implementation that achieves it for 80% of projects, it?s for doing it in a way that achieves it for 95% of projects. Managing backwards compatibility is probably the single most important thing we can do here. There are almost 800,000 files on PyPI that someone can download and install, telling all of them they need to switch to some new system or things are going to break for them is simply not tenable. That being said, I don?t think there is anything stopping us from getting to a better point besides time and effort. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Feb 14 17:26:17 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Feb 2017 14:26:17 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> Message-ID: On Tue, Feb 14, 2017 at 1:36 PM, Donald Stufft wrote: > WRT to ?standard implementations? versus ?standards?, the idea of a > ?standard implementation? being the source of truth and no longer needing to > do all the work to define standards is a nice idea, but I think it is an > idea that is never actually going to work out as well as real > standardization. There is *always* going to be a need for tools that aren?t > the blessed tools to interact with these items. Even if you can > authoritatively say that this one Python implementation is the only > implementation that any Python program will ever need, there is still the > problem that people need to consume this information in languages that > aren?t Python. Another even more fundamental reason that standards are important is to document semantics. Like, distlib or packaging or whatever can expose the "provides" field, but what does that actually mean? As a user of distlib/packaging, how should I change what I'm doing when I see that field? As a package author when should I set it? (I'm intentionally picking an example where the answer is "well the PEP says something about this but in reality it was never implemented and maybe has some security issues and no-one really knows" :-).) A "standard implementation" can abstract away some things, but by definition these are mostly the boring bits... -n -- Nathaniel J. Smith -- https://vorpus.org From vinay_sajip at yahoo.co.uk Tue Feb 14 18:01:48 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 23:01:48 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <1372092083.7912708.1487113308989@mail.yahoo.com> > The technical problem with PEP 426 is that unless you want to throw away pypi and start over, > all tools need to understand the old METADATA files regardless. It might not be as bad as that. For example, that IMO was the mistake behind the original concept of distutils2 - it was never going to fly as it required everyone to switch over to distutils2's way of doing things, and wouldn't be able to deal with old releases etc. For a time, I maintained a pretty extensive parallel set of metadata, based on just the data passed to setup() by packages using distutils/setuptools. This included not just the data for installation but even the data for package build, where it was purely declarative at the arguments-to-setup() level. Where a package didn't do completely bespoke things in setup() - like create new files, move files around etc. then the parallel set of metadata would allow installation of even old releases, without executing any setuptools code at all. I've not had the bandwidth to keep working on distlib and the metadata (example [1]), and the volume of new stuff going onto PyPI meant I didn't have time to keep on top of it. But the approach had some promise, in my view, and certainly showed that purely declarative packages (which didn't use e.g. custom build and install distutils/setuptools commands) could be installed using a completely different tool [than distutils/setuptools] without package authors having to change anything (beyond staying purely declarative). The distil documentation [2] shows installing a number of distributions (existing releases) from PyPI with better dependency resolution than pip does now, and without "throwing away PyPI". Anyway, I guess it's water under the bridge. Regards, Vinay Sajip [1] https://www.red-dove.com/pypi/projects/J/Jinja2/package-2.7.3.json [2] https://distil.readthedocs.io/en/0.1.0/installing.html#installing-distributions From vinay_sajip at yahoo.co.uk Tue Feb 14 18:12:57 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 23:12:57 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> Message-ID: <1966892435.7938940.1487113977182@mail.yahoo.com> > Managing backwards compatibility is probably the single most important thing we can do here. > There are almost 800,000 files on PyPI that someone can download and install, telling all > of them they need to switch to some new system or things are going to break for them is > simply not tenable. I agree. But if packaging is going at some point to break out of allowing completely bespoke code to run at installation time (i.e. executable code like a free-for-all setup.py, vs. something declarative and thus more restrictive) then IMO you have to sacrifice 100% backwards compatibility. See my comment in my other post about the ability to install old releases - I made that a goal of my experiments with the parallel metadata, to not require anything other than a declarative setup() in order to be able to install stuff using just the metadata, so that nobody has to switch anything in a big-bang style, but could transition over to a newer system at their leisure. Regards, Vinay Sajip From ncoghlan at gmail.com Wed Feb 15 06:33:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 12:33:41 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <425841221.7853973.1487103672849@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 14 February 2017 at 21:21, Vinay Sajip via Distutils-SIG wrote: > > >> I thought the current status was that it's called metadata.json >> exactly *because* it's not standardized, and you *shouldn't* look at >> it? > > > Well, it was work-in-progress-standardised according to PEP 426 (since > sometimes implementations have to work in parallel with working out the > details of specifications). Given that PEP 426 wasn't done and dusted > but being progressed, I would have thought it perfectly acceptable to > use "pydist.json", as the only things that would be affected would be > packaging tools working to the PEP. I asked Daniel to *stop* using pydist.json, since wheel was emitting a point-in-time snapshot of PEP 426 (which includes a lot of potentially-nice-to-have things that nobody has actually implemented so far, like the semantic dependency declarations and the enhancements to the extras syntax), rather than the final version of the spec. >> It's too bad that the JSON thing didn't work out, but I think we're >> better off working on better specifying the one source of truth >> everything already uses (METADATA) instead of bringing in *new* >> partially-incompatible-and-poorly-specified formats. > > When you say "everything already uses", do you mean setuptools and wheel? > If nobody else is allowed to play, that's one thing. But otherwise, there > need to be standards for interoperability. The METADATA file, now - exactly > which standard does it follow? The one in the dateutil wheel that Jim > referred to doesn't appear to conform to any of the metadata PEPs. It was > rejected by old metadata code in distlib (which came of out the Python 3.3 > era "packaging" package - not to be confused with Donald's of the same name - > which is strict in its interpretation of those earlier PEPs). > > The METADATA format (key-value) is not really flexible enough for certain > things which were in PEP 426 (e.g. dependency descriptions), and for these > JSON seems a reasonable fit. The current de facto standard set by setuptools and bdist_wheel is: - dist-info/METADATA as defined at https://packaging.python.org/specifications/#package-distribution-metadata - dist-info/requires.txt runtime dependencies as defined at http://setuptools.readthedocs.io/en/latest/formats.html#requires-txt - dist-info/setup_requires.txt build time dependencies as defined at http://setuptools.readthedocs.io/en/latest/formats.html#setup-requires-txt The dependency fields in METADATA itself unfortunately aren't really useful for anything. There's definitely still a place for a pydist.json created by going through PEP 426, comparing it to what bdist_wheel already does to populate metadata.json, and either changing the PEP to match the existing practice, or else agreeing that we prefer what the PEP recommends, that we want to move in that direction, and that there's a definite commitment to implement the changes in at least setuptools and bdist_wheel (plus a migration strategy that allows for reasonably sensible consumption of old metadata). Such an update would necessarily be a fairly ruthless process, where we defer everything that can possibly be deferred. I already made one pass at that when I split out the metadata extensions into PEP 459, but at least one more such pass is needed before we can sign off on the spec as metadata 2.0 - even beyond any "open for discussion" questions, there are still things in there which were extracted and standardised separately in PEP 508. > There's no technical reason why "the JSON thing > didn't work out", as far as I can see - it was just given up on for a more > incremental approach (which has got no new PEPs other than 440, AFAICT). Yep, it's a logistical problem rather than a technical problem per se - new metadata formats need software publisher adoption to ensure the design is sensible before we commit to them long term, but software publishers are understandably reluctant to rely on new formats that limit their target audience to folks running the latest versions of the installation tools (outside constrained cases where the software publisher is also the main consumer of that software). For PEP 440 (version specifiers) and PEP 508 (dependency specifiers), this was handled by focusing on documenting practices that people already used (and checking existing PyPI projects for compatibility), rather than trying to actively change those practices. For pyproject.toml (e.g. enscons), the idea is to provide a setup.py shim that can take care of bootstrapping the new approach for the benefit of older tools that assume the use of setup.py (similar to what was done with setup.cfg and d2to1). The equivalent for PEP 426 would probably be legacy-to-pydist and pydist-to-legacy converters that setuptools, bdist_wheel and other publishing tools can use to ship legacy metadata alongside the standardised format (and I believe Daniel already has at least the former in order to generate metadata.json in bdist_wheel). With PEP 426 as currently written, a pydist-to-legacy converter isn't really feasible, since pydist proposes new concepts that can't be readily represented in the old format. > I understand that social reasons are often more important than technical reasons > when it comes to success or failure of an approach; I'm just not sure that > in this case, it wasn't given up on too early. I think of PEP 426 as "deferred indefinitely pending specific practical problems to provide clearer design constraints" rather than abandoned :) There are two recent developments that I think may provide those missing design constraints and hence motivation to finalise a metadata 2.0 specification: 1. the wheel-to-egg support in humpty (and hence zc.buiidout). That makes humpty a concrete non-traditional installer that would benefit from both a modernised standard metadata format, as well as common tools both to convert legacy metadata to the agreed modern format and to convert the modern format back to the legacy format for inclusion in the generated egg files (as then humpty could just re-use the shared tools, rather than having to maintain those capabilities itself). 2. the new pipenv project to provide a simpler alternative to the pip+virtualenv+pip-tools combination for environment management in web service development (and similar layered application architectures). As with the "install vs setup" split in setuptools, pipenv settled on an "only two kinds of requirement (deployment and development)" model for usability reasons, but it also distinguishes abstract dependencies stored in Pipfile from pinned concrete dependencies stored in Pipfile.lock. If we put those together with the existing interest in automating generation of policy compliant operating system distribution packages, it makes it easier to go through the proposed semantic dependency model in PEP 426 and ask "How would we populate these fields based on the metadata that projects *already* publish?". - "run requires": straightforward, as these are the standard dependencies used in most projects. Not entirely clear how to gently (or strongly!) discourage dependency pinning when publishing to PyPI (although the Pipfile and Pipfile.lock model used in pipenv may help with this) - "meta requires": not clear at all, as this was added to handle cases like PyObjC, where the main package is just a metapackage that makes a particular set of versioned subpackages easy to install. This may be better modeled as a separate "integrates" field, using a declaration syntax more akin to that used for Pipfile.lock rather than that used for normal requirements declarations. - "dev requires": corresponds to "dev-packages" in pipenv - "build requires": corresponds to "setup_requires" in setuptools, "build-system.requires" + any dynamic build dependencies in PEP 518 - "test requires": corresponds to "test" extra in https://packaging.python.org/specifications/#provides-extra-multiple-use The "doc" extra in https://packaging.python.org/specifications/#provides-extra-multiple-use would map to "build requires", but there's potential benefit to redistributors in separating it out, as we often split the docs out from the built software components (since there's little reason to install documentation on headless servers that are only going to be debugged remotely). The main argument against "test requires" and "doc requires" is that the extras system already works fine for those - "pip install MyProject[test]" and "pip install MyProject[doc]" are both already supported, so metadata 2.0 just needs to continue to reserve those as semantically significant extras names. "dev" requires could be handled the same way - anything you actually need to *build* an sdist or wheel archive from a source repository should be in "setup_requires" (setuptools) or "build-system.requires" (pyproject.toml), so "dev" would just be a conventional extra name rather than a top level field. That just leaves "build_requires", which turns out to interact awkwardly with the "extras" system: if you write "pip install MyProject[test]", does it install all the "test" dependencies, regardless of whether they're listed in run_requires or build_requires? If yes: then why are run_requires and build_requires separate? If no: then how do you request installation of the "test" build extra? Or are build extras prohibited entirely? That suggests that perhaps "build" should just be a conventional extra as well, and considered orthogonal to the other conventional extras. (I'm sure this idea has been suggested before, but I don't recall who suggested it or when) And if build, test, doc, and dev are all handled as extras, then the top level name "run_requires" no longer makes sense, and the field name should go back to just being "requires". Under that evaluation, we'd be left with only the following top level fields defined for dependency declarations: - "requires": list where entries are either a string containing a PEP 508 dependency specifier or else a hash map contain a "requires" key plus "extra" or "environment" fields as qualifiers - "integrates": replacement for "meta_requires" that only allows pinned dependencies (i.e. hash maps with "name" & "version" fields, or direct URL references, rather than a general PEP 508 specifier as a string) For converting old metadata, any concrete dependencies that are compatible with the "integrates" field format would be mapped that way, while everything else would be converted to "requires" entries. The semantic differences between normal runtime dependencies and "dev", "test", "doc" and "build" requirements would be handled as extras, regardless of whether you were using the old metadata format or the new one. Going the other direction would be similarly straightforward since (excluding extensions) the set of required conceptual entities has been reduced back to the set that already exists in the current metadata formats. While "requires" and "integrates" would be distinct fields in pydist.json, the decomposed fields in the latter would map back to their string-based counterparts in PEP 508 when converted to the legacy metadata formats. Cheers, Nick. P.S. I'm definitely open to a PR that amends the PEP 426 draft along these lines. I'll get to it eventually myself, but there are some other things I see as higher priority for my open source time at the moment (specifically the C locale handling behaviour of Python 3.6 in Fedora 26 and the related upstream proposal for Python 3.7 in PEP 538) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Wed Feb 15 06:58:51 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 15 Feb 2017 03:58:51 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan wrote: > - "requires": list where entries are either a string containing a PEP > 508 dependency specifier or else a hash map contain a "requires" key > plus "extra" or "environment" fields as qualifiers > - "integrates": replacement for "meta_requires" that only allows > pinned dependencies (i.e. hash maps with "name" & "version" fields, or > direct URL references, rather than a general PEP 508 specifier as a > string) What's accomplished by separating these? I really think we should strive to have fewer more orthogonal concepts whenever possible... -n -- Nathaniel J. Smith -- https://vorpus.org From dc_isar at yahoo.co.uk Tue Feb 14 00:27:15 2017 From: dc_isar at yahoo.co.uk (Chitra Dewan) Date: Tue, 14 Feb 2017 05:27:15 +0000 (UTC) Subject: [Distutils] Python installation not working References: <2038221091.6741722.1487050035224.ref@mail.yahoo.com> Message-ID: <2038221091.6741722.1487050035224@mail.yahoo.com> Hello, I am beginner in Python?I am facing problems in installing Python 3.5 ?on my windows vista x32 machine.I downloaded?python-3.5.2.exe from Python.org. It is downloaded as an exe. When I try to install it via ?"Run as administrator" , nothing happens. ?Same behavior with 3.6 version? kindly advise? ?Regards & Thanks, Chitra Dewan -------------- next part -------------- An HTML attachment was scrubbed... URL: From VenkatRamReddy.k at hcl.com Tue Feb 14 01:48:12 2017 From: VenkatRamReddy.k at hcl.com (Venkat Ram Reddy K) Date: Tue, 14 Feb 2017 06:48:12 +0000 Subject: [Distutils] py2exe package for 2.7 Message-ID: Hi Good Afternoon, This is Venkat from HCL Technologies. Actually I have created executable file(test.exe) by using py2exe package on python 2.7 version on Windows. After that I have ran my application from the path C:\Python27\dist\test.exe, It was executed and working properly. But the problem is, when I have copied test.exe to other folder(other than "C:\Python27\dist\")and tried to run the test.exe, it is not executing. Could you please help me in resolving the issue. Thanks, Venkat. ::DISCLAIMER:: ---------------------------------------------------------------------------------------------------------------------------------------------------- The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents (with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates. Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of authorized representative of HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any email and/or attachments, please check them for viruses and other defects. ---------------------------------------------------------------------------------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Feb 15 08:00:59 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 15 Feb 2017 07:00:59 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 5:33 AM, Nick Coghlan wrote: > On 14 February 2017 at 21:21, Vinay Sajip via Distutils-SIG > wrote: > > > > > >> I thought the current status was that it's called metadata.json > >> exactly *because* it's not standardized, and you *shouldn't* look at > >> it? > > > > > > Well, it was work-in-progress-standardised according to PEP 426 (since > > sometimes implementations have to work in parallel with working out the > > details of specifications). Given that PEP 426 wasn't done and dusted > > but being progressed, I would have thought it perfectly acceptable to > > use "pydist.json", as the only things that would be affected would be > > packaging tools working to the PEP. > > I asked Daniel to *stop* using pydist.json, since wheel was emitting a > point-in-time snapshot of PEP 426 (which includes a lot of > potentially-nice-to-have things that nobody has actually implemented > so far, like the semantic dependency declarations and the enhancements > to the extras syntax), rather than the final version of the spec. > Would you send a link to the source for this? > > >> It's too bad that the JSON thing didn't work out, but I think we're > >> better off working on better specifying the one source of truth > >> everything already uses (METADATA) instead of bringing in *new* > >> partially-incompatible-and-poorly-specified formats. > > > > When you say "everything already uses", do you mean setuptools and wheel? > > If nobody else is allowed to play, that's one thing. But otherwise, there > > need to be standards for interoperability. The METADATA file, now - > exactly > > which standard does it follow? The one in the dateutil wheel that Jim > > referred to doesn't appear to conform to any of the metadata PEPs. It was > > rejected by old metadata code in distlib (which came of out the Python > 3.3 > > era "packaging" package - not to be confused with Donald's of the same > name - > > which is strict in its interpretation of those earlier PEPs). > > > > The METADATA format (key-value) is not really flexible enough for certain > > things which were in PEP 426 (e.g. dependency descriptions), and for > these > > JSON seems a reasonable fit. > > The current de facto standard set by setuptools and bdist_wheel is: > > - dist-info/METADATA as defined at > https://packaging.python.org/specifications/#package-distribution-metadata > - dist-info/requires.txt runtime dependencies as defined at > http://setuptools.readthedocs.io/en/latest/formats.html#requires-txt > - dist-info/setup_requires.txt build time dependencies as defined at > http://setuptools.readthedocs.io/en/latest/formats.html#setup-requires-txt > > The dependency fields in METADATA itself unfortunately aren't really > useful for anything. > Graph: Nodes and edges. > > There's definitely still a place for a pydist.json created by going > through PEP 426, comparing it to what bdist_wheel already does to > populate metadata.json, and either changing the PEP to match the > existing practice, or else agreeing that we prefer what the PEP > recommends, that we want to move in that direction, and that there's a > definite commitment to implement the changes in at least setuptools > and bdist_wheel (plus a migration strategy that allows for reasonably > sensible consumption of old metadata). > Which function reads metadata.json? Which function reads pydist.json? > > Such an update would necessarily be a fairly ruthless process, where > we defer everything that can possibly be deferred. I already made one > pass at that when I split out the metadata extensions into PEP 459, > but at least one more such pass is needed before we can sign off on > the spec as metadata 2.0 - even beyond any "open for discussion" > questions, there are still things in there which were extracted and > standardised separately in PEP 508. > > > There's no technical reason why "the JSON thing > > didn't work out", as far as I can see - it was just given up on for a > more > > incremental approach (which has got no new PEPs other than 440, AFAICT). > > Yep, it's a logistical problem rather than a technical problem per se > - new metadata formats need software publisher adoption to ensure the > design is sensible before we commit to them long term, but software > publishers are understandably reluctant to rely on new formats that > limit their target audience to folks running the latest versions of > the installation tools (outside constrained cases where the software > publisher is also the main consumer of that software). > An RDFS Vocabulary contains Classes and Properties with rdfs:ranges and rdfs:domains. There are many representations for RDF: RDF/XML, Turtle/N3, JSONLD. RDF is implementation-neutral. JSONLD is implementation-neutral. > > For PEP 440 (version specifiers) and PEP 508 (dependency specifiers), > this was handled by focusing on documenting practices that people > already used (and checking existing PyPI projects for compatibility), > rather than trying to actively change those practices. > > For pyproject.toml (e.g. enscons), the idea is to provide a setup.py > shim that can take care of bootstrapping the new approach for the > benefit of older tools that assume the use of setup.py (similar to > what was done with setup.cfg and d2to1). > > The equivalent for PEP 426 would probably be legacy-to-pydist and > pydist-to-legacy converters that setuptools, bdist_wheel and other > publishing tools can use to ship legacy metadata alongside the > standardised format (and I believe Daniel already has at least the > former in order to generate metadata.json in bdist_wheel). With PEP > 426 as currently written, a pydist-to-legacy converter isn't really > feasible, since pydist proposes new concepts that can't be readily > represented in the old format. > pydist-to-legacy would be a lossy transformation. > > > I understand that social reasons are often more important than technical > reasons > > when it comes to success or failure of an approach; I'm just not sure > that > > in this case, it wasn't given up on too early. > > I think of PEP 426 as "deferred indefinitely pending specific > practical problems to provide clearer design constraints" rather than > abandoned :) > Is it too late to request lowercased property names without dashes? If we're (I'm?) going to create @context URIs, compare: https://schema.python.org/v1#Provides-Extra { "@context": { "default": "https://schema.python.org/#", "schema": "http://schema.org/", # "name": "http://schema.org/name", # "url": "http://schema.org/url", # "verstr": # "extra": # "requirements" # "requirementstr" }, "@typeof": [ "py:PythonPackage"], "name": "IPython", "url": ["https://pypi.python.org/pypi/IPython", "https://pypi.org/project/ IPython"], "Provides-Extra": [ {"@typeof": "Requirement", "name": "notebook", "extra": ["notebook"], "requirements": [], #TODO "requirementstr": "extra == 'notebook'" }, {"name": "numpy", "extra": ["test"], "requirements": #TODO, "requirementstr": "python_version >= \"3.4\" and extra == 'test'" }, ... ] } > There are two recent developments that I think may provide those > missing design constraints and hence motivation to finalise a metadata > 2.0 specification: > > 1. the wheel-to-egg support in humpty (and hence zc.buiidout). That > makes humpty a concrete non-traditional installer that would benefit > from both a modernised standard metadata format, as well as common > tools both to convert legacy metadata to the agreed modern format and > to convert the modern format back to the legacy format for inclusion > in the generated egg files (as then humpty could just re-use the > shared tools, rather than having to maintain those capabilities > itself). class PackageMetadata def __init__(): self.data = collections.OrderedDict() @staticmethod def read_legacy() def read_metadata_json() def read_pydist_json() def read_pyproject_toml() def read_jsonld() def to_legacy(): def to_metadata_json() def to_pydist_json() def to_pyproject_toml() def to_jsonld() @classmethod def Legacy() def MetadataJson() def PydistJson() def PyprojectToml() def Jsonld(cls, *args, **kwargs) obj = cls(*args, **kwargs) obj.read_jsonld(*args, **kwargs) return obj @classmethod def from(cls, path, format='legacy|metadatajson|pydistjson|pyprojecttoml|jsonld'): # or this ... for maximum reusability, we really shouldn't need an adapter registry here; > 2. the new pipenv project to provide a simpler alternative to the > pip+virtualenv+pip-tools combination for environment management in web > service development (and similar layered application architectures). > As with the "install vs setup" split in setuptools, pipenv settled on > an "only two kinds of requirement (deployment and development)" model > for usability reasons, but it also distinguishes abstract dependencies > stored in Pipfile from pinned concrete dependencies stored in > Pipfile.lock. > Does the Pipfile/Pipfile.lock distinction overlap with 'integrates' as a replacement for meta_requires? > > If we put those together with the existing interest in automating > generation of policy compliant operating system distribution packages, > Downstream OS packaging could easily (and without permission) include extra attributes (properties specified with full URIS) in JSONLD metadata. > it makes it easier to go through the proposed semantic dependency > model in PEP 426 and ask "How would we populate these fields based on > the metadata that projects *already* publish?". > See 'class PackageMetadata' > > - "run requires": straightforward, as these are the standard > dependencies used in most projects. Not entirely clear how to gently > (or strongly!) discourage dependency pinning when publishing to PyPI > (although the Pipfile and Pipfile.lock model used in pipenv may help > with this) > - "meta requires": not clear at all, as this was added to handle cases > like PyObjC, where the main package is just a metapackage that makes a > particular set of versioned subpackages easy to install. This may be > better modeled as a separate "integrates" field, using a declaration > syntax more akin to that used for Pipfile.lock rather than that used > for normal requirements declarations. > - "dev requires": corresponds to "dev-packages" in pipenv > - "build requires": corresponds to "setup_requires" in setuptools, > "build-system.requires" + any dynamic build dependencies in PEP 518 > - "test requires": corresponds to "test" extra in > https://packaging.python.org/specifications/#provides-extra-multiple-use > > The "doc" extra in > https://packaging.python.org/specifications/#provides-extra-multiple-use > would map to "build requires", but there's potential benefit to > redistributors in separating it out, as we often split the docs out > from the built software components (since there's little reason to > install documentation on headless servers that are only going to be > debugged remotely). > > The main argument against "test requires" and "doc requires" is that > the extras system already works fine for those - "pip install > MyProject[test]" and "pip install MyProject[doc]" are both already > supported, so metadata 2.0 just needs to continue to reserve those as > semantically significant extras names. > > "dev" requires could be handled the same way - anything you actually > need to *build* an sdist or wheel archive from a source repository > should be in "setup_requires" (setuptools) or "build-system.requires" > (pyproject.toml), so "dev" would just be a conventional extra name > rather than a top level field. > > That just leaves "build_requires", which turns out to interact > awkwardly with the "extras" system: if you write "pip install > MyProject[test]", does it install all the "test" dependencies, > regardless of whether they're listed in run_requires or > build_requires? > > If yes: then why are run_requires and build_requires separate? > If no: then how do you request installation of the "test" build extra? > Or are build extras prohibited entirely? > > That suggests that perhaps "build" should just be a conventional extra > as well, and considered orthogonal to the other conventional extras. > (I'm sure this idea has been suggested before, but I don't recall who > suggested it or when) > > And if build, test, doc, and dev are all handled as extras, then the > top level name "run_requires" no longer makes sense, and the field > name should go back to just being "requires". > Under that evaluation, we'd be left with only the following top level > fields defined for dependency declarations: > > - "requires": list where entries are either a string containing a PEP > 508 dependency specifier or else a hash map contain a "requires" key > plus "extra" or "environment" fields as qualifiers > +1 > - "integrates": replacement for "meta_requires" that only allows > pinned dependencies (i.e. hash maps with "name" & "version" fields, or > direct URL references, rather than a general PEP 508 specifier as a > string) > Pipfile.lock? What happens here when something is listed in both requires and integrates? Where/do these get merged on the "name" attr as a key, given a presumed namespace URI prefix (https://pypi.org/project/)? > > For converting old metadata, any concrete dependencies that are > compatible with the "integrates" field format would be mapped that > way, while everything else would be converted to "requires" entries. > What heuristic would help identify compatibility with the integrates field? > The semantic differences between normal runtime dependencies and > "dev", "test", "doc" and "build" requirements would be handled as > extras, regardless of whether you were using the old metadata format > or the new one. > +1 from me. I can't recall whether I've used {"dev", "test", "doc", and "build"} as extras names in the past; though I can remember thinking "wouldn't it be more intuitive to do it [that way]" Is this backward compatible? Extras still work as extras? > > Going the other direction would be similarly straightforward since > (excluding extensions) the set of required conceptual entities has > been reduced back to the set that already exists in the current > metadata formats. While "requires" and "integrates" would be distinct > fields in pydist.json, the decomposed fields in the latter would map > back to their string-based counterparts in PEP 508 when converted to > the legacy metadata formats. > > Cheers, > Nick. > > P.S. I'm definitely open to a PR that amends the PEP 426 draft along > these lines. I'll get to it eventually myself, but there are some > other things I see as higher priority for my open source time at the > moment (specifically the C locale handling behaviour of Python 3.6 in > Fedora 26 and the related upstream proposal for Python 3.7 in PEP 538) > I need to find a job; my time commitment here is inconsistent. I'm working on a project (nbmeta) for generating, displaying, and embedding RDFa and JSONLD in Jupyter notebooks (w/ _repr_html_() and an OrderedDict) which should refresh the JSONLD @context-writing skills necessary to define the RDFS vocabulary we could/should have at https://schema.python.org/ . - [ ] JSONLD PEP (<- PEP426) - [ ] examples / test cases - I've referenced IPython as an example package; are there other hard test cases for python packaging metadata conversion? (i.e. one that uses every feature of each metadata format)? - [ ] JSONLD @context - [ ] class PackageMetadata - [ ] wheel: (additionally) generate JSONLD metadata - [ ] schema.python.org: master, gh-pages (or e.g. " https://www.pypa.io/ns#") - [ ] warehouse: add a ./jsonld view (to elgacy?) https://github.com/pypa/interoperability-peps/issues/31 > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Feb 15 08:27:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 14:27:10 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 12:58, Nathaniel Smith wrote: > On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan wrote: >> - "requires": list where entries are either a string containing a PEP >> 508 dependency specifier or else a hash map contain a "requires" key >> plus "extra" or "environment" fields as qualifiers >> - "integrates": replacement for "meta_requires" that only allows >> pinned dependencies (i.e. hash maps with "name" & "version" fields, or >> direct URL references, rather than a general PEP 508 specifier as a >> string) > > What's accomplished by separating these? I really think we should > strive to have fewer more orthogonal concepts whenever possible... It's mainly a matter of incorporating https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core data model, as this distinction between abstract development dependencies and concrete deployment dependencies is incredibly important for any scenario that involves publisher-redistributor-consumer chains, but is entirely non-obvious to folks that are only familiar with the publisher-consumer case that comes up during development-for-personal-and-open-source-use. One particular area where this is problematic is in the widespread advice "always pin your dependencies" which is usually presented without the all important "for application or service deployment" qualifier. As a first approximation: pinning-for-app-or-service-deployment == good, pinning-for-local-testing == good, pinning-for-library-or-framework-publication-to-PyPI == bad. pipenv borrows the Ruby solution to modeling this by having Pipfile for abstract dependency declarations and Pipfile.lock for concrete integration testing ones, so the idea here is to propagate that model to pydist.json by separating the "requires" field with abstract development dependencies from the "integrates" field with concrete deployment dependencies. In the vast majority of publication-to-PyPi cases people won't need the "integrates" field, since what they're publishing on PyPI will just be their abstract dependencies, and any warning against using "==" will recommend using "~=" or ">=" instead. But there *are* legitimate uses of pinning-for-publication (like the PyObjC metapackage bundling all its subcomponents, or when building for private deployment infastructure), so there needs to be a way to represent "Yes, I'm pinning this dependency for publication, and I'm aware of the significance of doing so" Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Wed Feb 15 09:11:47 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 15 Feb 2017 06:11:47 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 5:27 AM, Nick Coghlan wrote: > On 15 February 2017 at 12:58, Nathaniel Smith wrote: >> On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan wrote: >>> - "requires": list where entries are either a string containing a PEP >>> 508 dependency specifier or else a hash map contain a "requires" key >>> plus "extra" or "environment" fields as qualifiers >>> - "integrates": replacement for "meta_requires" that only allows >>> pinned dependencies (i.e. hash maps with "name" & "version" fields, or >>> direct URL references, rather than a general PEP 508 specifier as a >>> string) >> >> What's accomplished by separating these? I really think we should >> strive to have fewer more orthogonal concepts whenever possible... > > It's mainly a matter of incorporating > https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core > data model, as this distinction between abstract development > dependencies and concrete deployment dependencies is incredibly > important for any scenario that involves > publisher-redistributor-consumer chains, but is entirely non-obvious > to folks that are only familiar with the publisher-consumer case that > comes up during development-for-personal-and-open-source-use. Maybe I'm just being dense but, umm. I don't know what any of these words mean :-). I'm not unfamiliar with redistributors; part of my confusion is that this is a concept that AFAIK distro package systems don't have. Maybe it would help if you have a concrete example of a scenario where they would benefit from having this distinction? > One particular area where this is problematic is in the widespread > advice "always pin your dependencies" which is usually presented > without the all important "for application or service deployment" > qualifier. As a first approximation: > pinning-for-app-or-service-deployment == good, > pinning-for-local-testing == good, > pinning-for-library-or-framework-publication-to-PyPI == bad. > > pipenv borrows the Ruby solution to modeling this by having Pipfile > for abstract dependency declarations and Pipfile.lock for concrete > integration testing ones, so the idea here is to propagate that model > to pydist.json by separating the "requires" field with abstract > development dependencies from the "integrates" field with concrete > deployment dependencies. What's the benefit of putting this in pydist.json? I feel like for the usual deployment cases (a) going straight from Pipfile.lock -> venv is pretty much sufficient, with no need to put this into a package, but (b) if you really do want to put it into a package, then the natural approach would be to make an empty wheel like "my-django-app-deploy.whl" whose dependencies were the contents of Pipfile.lock. There's certainly a distinction to be made between the abstract dependencies and the exact locked dependencies, but to me the natural way to model that distinction is by re-using the distinction we already have been source packages and binary packages. The build process for this placeholder wheel is to "compile down" the abstract dependencies into concrete dependencies, and the resulting wheel encodes the result of this compilation. Again, no new concepts needed. > In the vast majority of publication-to-PyPi cases people won't need > the "integrates" field, since what they're publishing on PyPI will > just be their abstract dependencies, and any warning against using > "==" will recommend using "~=" or ">=" instead. But there *are* > legitimate uses of pinning-for-publication (like the PyObjC > metapackage bundling all its subcomponents, or when building for > private deployment infastructure), so there needs to be a way to > represent "Yes, I'm pinning this dependency for publication, and I'm > aware of the significance of doing so" Why can't PyObjC just use regular dependencies? That's what distro metapackages have done for decades, right? -n -- Nathaniel J. Smith -- https://vorpus.org From dholth at gmail.com Wed Feb 15 09:24:41 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Feb 2017 14:24:41 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: Wheel puts everything important in METADATA, except entry_points.txt. The requirements expressed there under 'Requires-Dist' are reliable, and the full METADATA format is documented in the pre-JSON revision of PEP 426. At runtime, once pkg_resources parses it, *.egg-info and *.dist-info look identical, because it's just a different way to represent the same data. Wheel's version of METADATA exists as the simplest way to add the critical 'extras' feature to distutils2-era *.dist-info/METADATA, necessary to losslessly represent setuptools packages in a more PEP-standard way. I could have completely redesigned the METADATA format instead of extending it, but then I would have run out of time and wheel would not exist. This function converts egg-info metadata to METADATA https://bitbucket.org/pypa/wheel/src/54ddbcc9cec25e1f4d111a142b8bfaa163130a61/wheel/metadata.py?at=default&fileviewer=file-view-default#metadata.py-240 This one converts to the JSON format. It looks like it might work with PKG-INFO or METADATA. https://bitbucket.org/pypa/wheel/src/54ddbcc9cec25e1f4d111a142b8bfaa163130a61/wheel/metadata.py?at=default&fileviewer=file-view-default#metadata.py-98 On Wed, Feb 15, 2017 at 8:27 AM Nick Coghlan wrote: > On 15 February 2017 at 12:58, Nathaniel Smith wrote: > > On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan > wrote: > >> - "requires": list where entries are either a string containing a PEP > >> 508 dependency specifier or else a hash map contain a "requires" key > >> plus "extra" or "environment" fields as qualifiers > >> - "integrates": replacement for "meta_requires" that only allows > >> pinned dependencies (i.e. hash maps with "name" & "version" fields, or > >> direct URL references, rather than a general PEP 508 specifier as a > >> string) > > > > What's accomplished by separating these? I really think we should > > strive to have fewer more orthogonal concepts whenever possible... > > It's mainly a matter of incorporating > https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core > data model, as this distinction between abstract development > dependencies and concrete deployment dependencies is incredibly > important for any scenario that involves > publisher-redistributor-consumer chains, but is entirely non-obvious > to folks that are only familiar with the publisher-consumer case that > comes up during development-for-personal-and-open-source-use. > > One particular area where this is problematic is in the widespread > advice "always pin your dependencies" which is usually presented > without the all important "for application or service deployment" > qualifier. As a first approximation: > pinning-for-app-or-service-deployment == good, > pinning-for-local-testing == good, > pinning-for-library-or-framework-publication-to-PyPI == bad. > > pipenv borrows the Ruby solution to modeling this by having Pipfile > for abstract dependency declarations and Pipfile.lock for concrete > integration testing ones, so the idea here is to propagate that model > to pydist.json by separating the "requires" field with abstract > development dependencies from the "integrates" field with concrete > deployment dependencies. > > In the vast majority of publication-to-PyPi cases people won't need > the "integrates" field, since what they're publishing on PyPI will > just be their abstract dependencies, and any warning against using > "==" will recommend using "~=" or ">=" instead. But there *are* > legitimate uses of pinning-for-publication (like the PyObjC > metapackage bundling all its subcomponents, or when building for > private deployment infastructure), so there needs to be a way to > represent "Yes, I'm pinning this dependency for publication, and I'm > aware of the significance of doing so" > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Feb 15 09:55:53 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 15:55:53 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 14:00, Wes Turner wrote: > On Wed, Feb 15, 2017 at 5:33 AM, Nick Coghlan wrote: >> I asked Daniel to *stop* using pydist.json, since wheel was emitting a >> point-in-time snapshot of PEP 426 (which includes a lot of >> potentially-nice-to-have things that nobody has actually implemented >> so far, like the semantic dependency declarations and the enhancements >> to the extras syntax), rather than the final version of the spec. > > Would you send a link to the source for this? It came up when Vinay reported a problem with the way bdist_wheel was handling combined extras and environment marker definitions: https://bitbucket.org/pypa/wheel/issues/103/problem-with-currently-generated >> - dist-info/METADATA as defined at >> https://packaging.python.org/specifications/#package-distribution-metadata >> - dist-info/requires.txt runtime dependencies as defined at >> http://setuptools.readthedocs.io/en/latest/formats.html#requires-txt >> - dist-info/setup_requires.txt build time dependencies as defined at >> http://setuptools.readthedocs.io/en/latest/formats.html#setup-requires-txt >> >> The dependency fields in METADATA itself unfortunately aren't really >> useful for anything. > > Graph: Nodes and edges. Unfortunately, it's not that simple, since: - dependency declarations refer to time dependent node *sets*, not to specific edges - node resolution is not only time dependent, but also DNS and client configuration dependent - this is true even for "pinned" dependencies due to the way "==" handles post-releases and local build IDs - the legacy module based declarations are inconsistently populated and don't refer to nodes by a useful name - the new distribution package based declarations refer to nodes by a useful name, but largely aren't populated By contrast, METADATA *does* usefully define nodes in the graph, while requires.txt and setup_requires.txt can be used to extract edges when combined with suitable additional data sources (primarily a nominated index server or set of index servers to use for dependency specifier resolution). >> There's definitely still a place for a pydist.json created by going >> through PEP 426, comparing it to what bdist_wheel already does to >> populate metadata.json, and either changing the PEP to match the >> existing practice, or else agreeing that we prefer what the PEP >> recommends, that we want to move in that direction, and that there's a >> definite commitment to implement the changes in at least setuptools >> and bdist_wheel (plus a migration strategy that allows for reasonably >> sensible consumption of old metadata). > > Which function reads metadata.json? Likely eventually nothing, since anything important that it contains will be readable from either pydist.json or from the other legacy metadata files. > Which function reads pydist.json? Eventually everything, with tools falling back to dynamically generating it from legacy metadata formats as a transition plan to handle component releases made with older toolchains. > An RDFS Vocabulary contains Classes and Properties with rdfs:ranges and > rdfs:domains. > > There are many representations for RDF: RDF/XML, Turtle/N3, JSONLD. > > RDF is implementation-neutral. JSONLD is implementation-neutral. While true, both of these are still oriented towards working with a *resolved* graph snapshot, rather than a deliberately underspecified graph description that requires subsequent resolution within the node set of a particular index server (or set of index servers). Just incorporating the time dimension is already messy, even before accounting for the fact that the metadata carried with along the artifacts is designed to be independent of the particular server that happens to be hosting it. Tangent: if anyone is looking for an open source stack for working with distributed graph storage manipulation from Python, the combination of http://janusgraph.org/ and https://pypi.org/project/gremlinpython/ is well worth a look ;) >> The equivalent for PEP 426 would probably be legacy-to-pydist and >> pydist-to-legacy converters that setuptools, bdist_wheel and other >> publishing tools can use to ship legacy metadata alongside the >> standardised format (and I believe Daniel already has at least the >> former in order to generate metadata.json in bdist_wheel). With PEP >> 426 as currently written, a pydist-to-legacy converter isn't really >> feasible, since pydist proposes new concepts that can't be readily >> represented in the old format. > > pydist-to-legacy would be a lossy transformation. Given appropriate use of the "extras" system and a couple of new METADATA fields, it doesn't have to be, at least for the initial version - that's the new design constraint I'm proposing for everything that isn't defined as a metadata extension. The rationale being that if legacy dependency metadata can be reliably generated from the new format, that creates an incentive for *new* tools to adopt it ("generate the new format, get the legacy formats for free"), while also offering a clear migration path for existing publishing tools (refactor their metadata generation to produce the new format only, then derive the legacy metadata files from that) and consumption tools (consume the new fields immediately, look at consuming the new files later). >> > I understand that social reasons are often more important than technical >> > reasons >> > when it comes to success or failure of an approach; I'm just not sure >> > that >> > in this case, it wasn't given up on too early. >> >> I think of PEP 426 as "deferred indefinitely pending specific >> practical problems to provide clearer design constraints" rather than >> abandoned :) > > Is it too late to request lowercased property names without dashes? That's already the case in PEP 426 as far as I know. > class PackageMetadata > def __init__(): > self.data = collections.OrderedDict() > @staticmethod > def read_legacy() > def read_metadata_json() > def read_pydist_json() > def read_pyproject_toml() > def read_jsonld() > > def to_legacy(): > def to_metadata_json() > def to_pydist_json() > def to_pyproject_toml() > def to_jsonld() > > @classmethod > def Legacy() > def MetadataJson() > def PydistJson() > def PyprojectToml() > def Jsonld(cls, *args, **kwargs) > obj = cls(*args, **kwargs) > obj.read_jsonld(*args, **kwargs) > return obj > > @classmethod > def from(cls, path, > format='legacy|metadatajson|pydistjson|pyprojecttoml|jsonld'): > # or this > > > ... for maximum reusability, we really shouldn't need an adapter registry > here; I'm not really worried about the Python API at this point, I'm interested in the isomorphism of the data formats to help streamline the migration (as that's the current main problem with PEP 426). But yes, just as packaging grew "LegacyVersion" *after* PEP 440 defined the strict forward looking semantics, it will likely grow some additional tools for reading and converting the legacy formats once there's a clear pydist.json specification to document the semantics of the translated fields. >> 2. the new pipenv project to provide a simpler alternative to the >> pip+virtualenv+pip-tools combination for environment management in web >> service development (and similar layered application architectures). >> As with the "install vs setup" split in setuptools, pipenv settled on >> an "only two kinds of requirement (deployment and development)" model >> for usability reasons, but it also distinguishes abstract dependencies >> stored in Pipfile from pinned concrete dependencies stored in >> Pipfile.lock. > > Does the Pipfile/Pipfile.lock distinction overlap with 'integrates' as a > replacement for meta_requires? Somewhat - the difference is that where the concrete dependencies in Pipfile.lock are derived from the abstract dependencies in Pipfile, the separation in pydist.json would be a declaration of "Yes, I really did mean to publish this with a concrete dependency, it's not an accident". >> If we put those together with the existing interest in automating >> generation of policy compliant operating system distribution packages, > > > Downstream OS packaging could easily (and without permission) include extra > attributes (properties specified with full URIS) in JSONLD metadata. We can already drop arbitrary files into dist-info directories if we really want to, but in practice that extra metadata tends to end up in the system level package database rather than in the Python metadata. >> - "integrates": replacement for "meta_requires" that only allows >> pinned dependencies (i.e. hash maps with "name" & "version" fields, or >> direct URL references, rather than a general PEP 508 specifier as a >> string) > > > Pipfile.lock? > > What happens here when something is listed in both requires and integrates? Simplest would be to treat it the same way that tools treat mentioning the same component in multiple requirements entries (since that's really what you'd be doing). > Where/do these get merged on the "name" attr as a key, given a presumed > namespace URI prefix (https://pypi.org/project/)? For installation purposes, they'd be combined into a single requirements set. >> For converting old metadata, any concrete dependencies that are >> compatible with the "integrates" field format would be mapped that >> way, while everything else would be converted to "requires" entries. > > What heuristic would help identify compatibility with the integrates field? PEP 440 version matching (==), arbitrary equality (===), and direct references (@...), with the latter being disallowed on PyPI (but fine when using a private index server). >> The semantic differences between normal runtime dependencies and >> "dev", "test", "doc" and "build" requirements would be handled as >> extras, regardless of whether you were using the old metadata format >> or the new one. > > +1 from me. > > I can't recall whether I've used {"dev", "test", "doc", and "build"} as > extras names in the past; though I can remember thinking "wouldn't it be > more intuitive to do it [that way]" > > Is this backward compatible? Extras still work as extras? Yeah, this is essentially the way Provide-Extra ended up being documented in https://packaging.python.org/specifications/#provides-extra-multiple-use That already specifies the expected semantics for "test" and "doc", so it would be a matter of adding "dev" and "build" (as well as surveying PyPI for components that already defined those extras) >> P.S. I'm definitely open to a PR that amends the PEP 426 draft along >> these lines. I'll get to it eventually myself, but there are some >> other things I see as higher priority for my open source time at the >> moment (specifically the C locale handling behaviour of Python 3.6 in >> Fedora 26 and the related upstream proposal for Python 3.7 in PEP 538) > > I need to find a job; my time commitment here is inconsistent. Yeah, I assume work takes precedence for everyone, which is why I spend time needling redistributors and major end users about the disparity between "level of use" and "level of investment" when it comes to the upstream Python packaging ecosystem. While progress on that front isn't particularly visible yet, the nature of the conversations are changing in a good > I'm working on a project (nbmeta) for generating, displaying, and embedding > RDFa and JSONLD in Jupyter notebooks (w/ _repr_html_() and an OrderedDict) > which should refresh the JSONLD @context-writing skills necessary to define > the RDFS vocabulary we could/should have at https://schema.python.org/ . I'm definitely open to ensuring the specs are RDF/JSONLD friendly, especially as some of the characteristics of that are beneficial in other kinds of mappings as well (e.g. lists-of-hash-maps-with-fixed-key-names are easier to work with than hash-maps-with-data-dependent-key-names for a whole lot of reasons). > - [ ] JSONLD PEP (<- PEP426) > - [ ] examples / test cases > - I've referenced IPython as an example package; are there other hard > test cases for python packaging metadata conversion? (i.e. one that uses > every feature of each metadata format)? PyObjC is my standard example for legitimate version pinning in a public project (it's a metapackage where each release just depends on particular versions of the individual components) django-mezzanine is one I like as a decent example of a reasonably large dependency tree for something that still falls short of a complete application setuptools is a decent example for basic use of environment markers I haven't found great examples for defining lots of extras or using complex environment marker options (but I also haven't really gone looking) > - [ ] JSONLD @context > - [ ] class PackageMetadata > - [ ] wheel: (additionally) generate JSONLD metadata > - [ ] schema.python.org: master, gh-pages (or e.g. > "https://www.pypa.io/ns#") > > - [ ] warehouse: add a ./jsonld view (to elgacy?) This definitely won't be an option for the legacy service, but it could be an interesting addition to Warehouse. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Feb 15 09:58:01 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 14:58:01 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 14:11, Nathaniel Smith wrote: >> It's mainly a matter of incorporating >> https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core >> data model, as this distinction between abstract development >> dependencies and concrete deployment dependencies is incredibly >> important for any scenario that involves >> publisher-redistributor-consumer chains, but is entirely non-obvious >> to folks that are only familiar with the publisher-consumer case that >> comes up during development-for-personal-and-open-source-use. > > Maybe I'm just being dense but, umm. I don't know what any of these > words mean :-). I'm not unfamiliar with redistributors; part of my > confusion is that this is a concept that AFAIK distro package systems > don't have. Maybe it would help if you have a concrete example of a > scenario where they would benefit from having this distinction? I'm also finding this discussion bafflingly complex. I understand that distributions need a way to work with Python packages, but the issues involved seem completely divorced from the basic process of a user using pip to install a package with the dependencies it needs to work in their program. The package metadata standardisation process seems to be falling foul of a quest for perfection. Is there no 80% solution that covers the bulk of use cases (which, in my mind, are all around some user wanting to say "pip install" to grab some stuff off PyPI to build his project)? Or is the 80% solution precisely what we have at the moment, in which case can't we standardise what we have, and *then* look to extend to cover the additional requirements? I'm sure I'm missing something - but honestly, I'm not sure what it is. If I write something to go on PyPI, I assume that makes me a "publisher"? IMO, my audience is people who use my software (the "consumers" in your terms, I guess). While I'd be pleased if a distributor like Ubuntu or Fedora or Anaconda wanted to include my package in their distribution, I wouldn't see them as my end users - so while I'd be OK with tweaking my code/metadata to accommodate their needs, it's not a key goal for me. And learning all the metadata concepts related to packaging my project for distributors wouldn't appeal to me at all. I'd be happy for the distributions to to that and send me PRs, but the burden should be on them to do that. The complexities we're debating here seem to be based on the idea that *I* should understand the distributor's role in order to package my code "correctly". I'm not at all sure I agree with that. Maybe this is all a consequence of Python now being used in "big business", and the old individual developer scratching his or her own itch model is gone. And maybe that means PyPI is no longer a suitable place for such "use at your own risk" code But if that's the case, maybe we need to acknowledge that fact, before we end up with people getting the idea that "Python packaging is too complex for the average developer". Because it's starting to feel that way :-( Paul From vinay_sajip at yahoo.co.uk Wed Feb 15 10:31:04 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Wed, 15 Feb 2017 15:31:04 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <159421251.8715496.1487172664114@mail.yahoo.com> > the full METADATA format is documented in the pre-JSON revision of PEP 426. Can you confirm which exact revision in the PEPs repo you mean? I could guess at 0451397. That version does not refer to a field "Requires" (rather, the more recent "Requires-Dist"). Your conversion function reads the existing PKG-INFO, updates the Metadata-Version, and adds "Provides-Dist" and "Requires-Dist". It does not check whether the result conforms to that version of the PEP. For example, in the presence of "Requires" in PKG-INFO, you add "Requires-Dist", possibly leading to an ambiguity, because they sort of mean the same thing but could contain conflicting information (for example, different version constraints). The python-dateutils wheel which Jim referred to contained both "Requires" and "Requires-Dist" fields in its METADATA file, and, faced with a metadata set with both fields, the old packaging code used by distlib to handle the different metadata versions raised a "Unknown metadata set" error. In the face of ambiguity, it's refusing the temptation to guess :-) If the conversion function adds "Requires-Dist" but doesn't remove "Requires", I'm not sure it conforms to that version of the PEP. Regards, Vinay Sajip From dholth at gmail.com Wed Feb 15 10:40:47 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Feb 2017 15:40:47 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <159421251.8715496.1487172664114@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> <159421251.8715496.1487172664114@mail.yahoo.com> Message-ID: IIUC PEP 345, the predecessor of PEP 426, replaced Requires with Requires-Dist because the former was never very well specified, easier to re-name the field rather than redefine it. bdist_wheel's egg-info conversion assumes the only useful requirements are in the setuptools requires.txt. It would make sense to go ahead and delete the obsolete fields, I'm sure they were overlooked because they are not common in the wild. >From PEP 345: - Deprecated fields: - Requires (in favor of Requires-Dist) - Provides (in favor of Provides-Dist) - Obsoletes (in favor of Obsoletes-Dist) On Wed, Feb 15, 2017 at 10:31 AM Vinay Sajip wrote: > > the full METADATA format is documented in the pre-JSON revision of PEP > 426. > > Can you confirm which exact revision in the PEPs repo you mean? I could > guess at > 0451397. That version does not refer to a field "Requires" (rather, the > more recent > "Requires-Dist"). Your conversion function reads the existing PKG-INFO, > updates the > Metadata-Version, and adds "Provides-Dist" and "Requires-Dist". It does > not check > whether the result conforms to that version of the PEP. For example, in > the presence > of "Requires" in PKG-INFO, you add "Requires-Dist", possibly leading to an > ambiguity, > because they sort of mean the same thing but could contain conflicting > information > (for example, different version constraints). The python-dateutils wheel > which Jim > referred to contained both "Requires" and "Requires-Dist" fields in its > METADATA > file, and, faced with a metadata set with both fields, the old packaging > code used > by distlib to handle the different metadata versions raised a "Unknown > metadata set" > error. In the face of ambiguity, it's refusing the temptation to guess :-) > > If the conversion function adds "Requires-Dist" but doesn't remove > "Requires", I'm not > sure it conforms to that version of the PEP. > > Regards, > > Vinay Sajip > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Feb 15 10:41:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 16:41:48 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:11, Nathaniel Smith wrote: > On Wed, Feb 15, 2017 at 5:27 AM, Nick Coghlan wrote: >> It's mainly a matter of incorporating >> https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core >> data model, as this distinction between abstract development >> dependencies and concrete deployment dependencies is incredibly >> important for any scenario that involves >> publisher-redistributor-consumer chains, but is entirely non-obvious >> to folks that are only familiar with the publisher-consumer case that >> comes up during development-for-personal-and-open-source-use. > > Maybe I'm just being dense but, umm. I don't know what any of these > words mean :-). I'm not unfamiliar with redistributors; part of my > confusion is that this is a concept that AFAIK distro package systems > don't have. Maybe it would help if you have a concrete example of a > scenario where they would benefit from having this distinction? It's about error messages and nudges in the UX: if PyPI rejects version pinning in "requires" by default, then that creates an opportunity to nudge people towards using "~=" or ">=" instead (as in the vast majority of cases, that will be a better option than pinning-for-publication). The inclusion of "integrates" then adds back the support for legitimate version pinning use cases in pydist.json in a way that makes it clear that it is a conceptually distinct operation from a normal dependency declaration. >> pipenv borrows the Ruby solution to modeling this by having Pipfile >> for abstract dependency declarations and Pipfile.lock for concrete >> integration testing ones, so the idea here is to propagate that model >> to pydist.json by separating the "requires" field with abstract >> development dependencies from the "integrates" field with concrete >> deployment dependencies. > > What's the benefit of putting this in pydist.json? I feel like for the > usual deployment cases (a) going straight from Pipfile.lock -> venv is > pretty much sufficient, with no need to put this into a package, but > (b) if you really do want to put it into a package, then the natural > approach would be to make an empty wheel like > "my-django-app-deploy.whl" whose dependencies were the contents of > Pipfile.lock. My goal with the split is to get to a state where: - exactly zero projects on PyPI use "==" or "===" in their requires metadata (because PyPI explicitly prohibits it) - the vast majority of projects on PyPI *don't* have an "integrates" section - those projects that do have an `integrates` section have a valid reason for it (like PyObjC) For anyone making the transition from application and web service development to library and framework development, the transition from "always pin exact versions of your dependencies for deployment" to "when publishing a library or framework, only rule out the combinations that you're pretty sure *won't* work" is one of the trickiest to deal with as current tools *don't alert you to the fact that there's a difference to be learned*. Restricting what can go into requires creates an opportunity to ask users whether they're publishing a pre-integrated project or not: if yes, then they add the "integrates" field and put their pinned dependencies there; if not, then they relax the "==" constraints to "~=" or ">=". Either way, PyPI will believe your answer, it's just refusing the temptation to guess that using "==" or "===" in the requires section is sufficient to indicate that you're deliberately publishing a pre-integrated project. > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. Source vs binary isn't where the distinction applies, though. For example, it's legitimate for PyObjC to have pinned dependencies even when distributed in source form, as it's a metapackage used solely to integrate the various PyObjC subprojects into a single "release". >> In the vast majority of publication-to-PyPi cases people won't need >> the "integrates" field, since what they're publishing on PyPI will >> just be their abstract dependencies, and any warning against using >> "==" will recommend using "~=" or ">=" instead. But there *are* >> legitimate uses of pinning-for-publication (like the PyObjC >> metapackage bundling all its subcomponents, or when building for >> private deployment infastructure), so there needs to be a way to >> represent "Yes, I'm pinning this dependency for publication, and I'm >> aware of the significance of doing so" > > Why can't PyObjC just use regular dependencies? That's what distro > metapackages have done for decades, right? If PyObjC uses regular dependencies then there's no opportunity for PyPI to ask "Did you really mean that?" when people pin dependencies in "requires". That makes it likely we'll end up with a lot of unnecessarily restrictive "==" constraints in PyPI packages ("Works on my machine!"), which creates problems when attempting to auto-generate distro packages from upstream ones. The distro case isn't directly analagous, since there are a few key differences: - open publication platform rather than a pre-approved set of package maintainers - no documented packaging policies with related human review & approval processes - a couple of orders magnitude difference in the number of packages involved - at least in RPM, you can have a spec file with no source tarball, which makes it obvious it's a metapackage Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Feb 15 10:48:44 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 16:48:44 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:58, Paul Moore wrote: > On 15 February 2017 at 14:11, Nathaniel Smith wrote: >>> It's mainly a matter of incorporating >>> https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core >>> data model, as this distinction between abstract development >>> dependencies and concrete deployment dependencies is incredibly >>> important for any scenario that involves >>> publisher-redistributor-consumer chains, but is entirely non-obvious >>> to folks that are only familiar with the publisher-consumer case that >>> comes up during development-for-personal-and-open-source-use. >> >> Maybe I'm just being dense but, umm. I don't know what any of these >> words mean :-). I'm not unfamiliar with redistributors; part of my >> confusion is that this is a concept that AFAIK distro package systems >> don't have. Maybe it would help if you have a concrete example of a >> scenario where they would benefit from having this distinction? > > I'm also finding this discussion bafflingly complex. I understand that > distributions need a way to work with Python packages, but the issues > involved seem completely divorced from the basic process of a user > using pip to install a package with the dependencies it needs to work > in their program. As simple as I can make it: * pinning dependencies when publishing to PyPI is presumptively bad * PyPI itself (not client tools) should warn you that it's a bad idea * however, there are legitimate use cases for pinning in PyPI packages * so there should be a way to do it, but it should involve telling PyPI "I am an integration project, this is OK" Most people should never touch the "integrates" field, they should just change "==" to "~=" or ">=" to allow for future releases of their dependencies. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Feb 15 10:49:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 15:49:44 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:41, Nick Coghlan wrote: > My goal with the split is to get to a state where: > > - exactly zero projects on PyPI use "==" or "===" in their requires > metadata (because PyPI explicitly prohibits it) > - the vast majority of projects on PyPI *don't* have an "integrates" section > - those projects that do have an `integrates` section have a valid > reason for it (like PyObjC) So how many projects on PyPI currently have == or === in their requires? I've never seen one (although my sample size isn't large - but it does cover major packages in a large-ish range of application areas). I'm curious as to how major this problem is in practice. I (now) understand the theoretical argument for the proposal. Paul From freddyrietdijk at fridh.nl Wed Feb 15 10:50:18 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Wed, 15 Feb 2017 16:50:18 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > Maybe it would help if you have a concrete example of a scenario where they would benefit from having this distinction? In the Nix package manager (source distribution with binary substitutes) and Nixpkgs package set we typically require the filename and hash of a package. In our expressions we typically pass an url (that includes the name), and the hash. The url is only needed when the file isn't in our store. This is convenient, because if an url is optional this allows you to pre-fetch or work with mirrors. All we care about is that we get the file, not how it is provided. This applies for source archives, but behind the scenes also for binary substitutes. With Nix, functions build a package, and dependencies are passed as function arguments with names that typically, but not necessarily, resemble the dependency name. Now, a function that builds a package, a package builder, only needs to be provided with abstract dependencies; it just needs to know what it should look for, "we need 'a' numpy, 'a' scipy, 'a compiler that has a certain interface and can do this job'", etc.. Version numbers can help in order to fail prematurely, but generally only bounds, not a pinned value. Its up to another tool to provide the builder with the actual packages, the concrete dependencies to the builder. And this tool might fetch it from PyPI, or from GitHub, or... The same goes for building, distributing and installing Python packages. Setuptools shouldn't bother with versions (except the constraints in case of libraries) or wherever a source comes from but just build or fail. Pip should just fetch/resolve and pass concrete dependencies to whatever builder (Setuptools, Flit), or whatever environment (virtualenv) needs it. It's quite frustrating as a downstream having to deal with packages where versions are pinned unnecessarily and therefore I've also requested on the Setuptools tracker a flag that ignores constraints [1] (though I fear I would have to pull up my sleeves for this one :) ) . [1] https://github.com/pypa/setuptools/issues/894 On Wed, Feb 15, 2017 at 3:11 PM, Nathaniel Smith wrote: > On Wed, Feb 15, 2017 at 5:27 AM, Nick Coghlan wrote: > > On 15 February 2017 at 12:58, Nathaniel Smith wrote: > >> On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan > wrote: > >>> - "requires": list where entries are either a string containing a PEP > >>> 508 dependency specifier or else a hash map contain a "requires" key > >>> plus "extra" or "environment" fields as qualifiers > >>> - "integrates": replacement for "meta_requires" that only allows > >>> pinned dependencies (i.e. hash maps with "name" & "version" fields, or > >>> direct URL references, rather than a general PEP 508 specifier as a > >>> string) > >> > >> What's accomplished by separating these? I really think we should > >> strive to have fewer more orthogonal concepts whenever possible... > > > > It's mainly a matter of incorporating > > https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core > > data model, as this distinction between abstract development > > dependencies and concrete deployment dependencies is incredibly > > important for any scenario that involves > > publisher-redistributor-consumer chains, but is entirely non-obvious > > to folks that are only familiar with the publisher-consumer case that > > comes up during development-for-personal-and-open-source-use. > > Maybe I'm just being dense but, umm. I don't know what any of these > words mean :-). I'm not unfamiliar with redistributors; part of my > confusion is that this is a concept that AFAIK distro package systems > don't have. Maybe it would help if you have a concrete example of a > scenario where they would benefit from having this distinction? > > > One particular area where this is problematic is in the widespread > > advice "always pin your dependencies" which is usually presented > > without the all important "for application or service deployment" > > qualifier. As a first approximation: > > pinning-for-app-or-service-deployment == good, > > pinning-for-local-testing == good, > > pinning-for-library-or-framework-publication-to-PyPI == bad. > > > > pipenv borrows the Ruby solution to modeling this by having Pipfile > > for abstract dependency declarations and Pipfile.lock for concrete > > integration testing ones, so the idea here is to propagate that model > > to pydist.json by separating the "requires" field with abstract > > development dependencies from the "integrates" field with concrete > > deployment dependencies. > > What's the benefit of putting this in pydist.json? I feel like for the > usual deployment cases (a) going straight from Pipfile.lock -> venv is > pretty much sufficient, with no need to put this into a package, but > (b) if you really do want to put it into a package, then the natural > approach would be to make an empty wheel like > "my-django-app-deploy.whl" whose dependencies were the contents of > Pipfile.lock. > > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. > > > In the vast majority of publication-to-PyPi cases people won't need > > the "integrates" field, since what they're publishing on PyPI will > > just be their abstract dependencies, and any warning against using > > "==" will recommend using "~=" or ">=" instead. But there *are* > > legitimate uses of pinning-for-publication (like the PyObjC > > metapackage bundling all its subcomponents, or when building for > > private deployment infastructure), so there needs to be a way to > > represent "Yes, I'm pinning this dependency for publication, and I'm > > aware of the significance of doing so" > > Why can't PyObjC just use regular dependencies? That's what distro > metapackages have done for decades, right? > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Wed Feb 15 11:07:15 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Wed, 15 Feb 2017 16:07:15 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> <159421251.8715496.1487172664114@mail.yahoo.com> Message-ID: <1487174835.595075.881975160.671E3576@webmail.messagingengine.com> On Wed, Feb 15, 2017, at 03:40 PM, Daniel Holth wrote: > It would make sense to go ahead and delete the obsolete fields, I'm > sure they were overlooked because they are not common in the wild. > > From PEP 345: > * Deprecated fields: > * Requires (in favor of Requires-Dist) > * Provides (in favor of Provides-Dist) For reference, packages made with flit do use 'Provides' to indicate the name of the importable module or package that the distribution installs. This seems to me to be something worth exposing - in another thread, we're discussing downloading and scanning packages to get this information. But I accept that it's not very useful while only a tiny minority of packages do it. Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Feb 15 11:14:23 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 16:14:23 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:50, Freddy Rietdijk wrote: > It's quite frustrating as a downstream having to deal with packages where > versions are pinned unnecessarily and therefore I've also requested on the > Setuptools tracker a flag that ignores constraints [1] (though I fear I > would have to pull up my sleeves for this one :) ) . Sort of repeating my earlier question, but how often does this happen in reality? (As a proportion of the packages you deal with). And how often is it that a simple request/PR to the package author to remove the explicit version requirements is rejected? (I assume your first response is to file an issue with upstream?) If you *do* get in a situation where the package explicitly requires certain versions of its dependencies, and you ignore those requirements, then presumably you're taking responsibility for supporting a combination that the upstream author doesn't support. How do you handle that? I'm not trying to pick on your process here, or claim that distributions are somehow doing things wrongly. But I am trying to understand what redistributors' expectations are of package authors. Nick said he wants to guide authors away from explicit version pinning. That's fine, but is the problem so big that the occasional bug report to offending projects saying "please don't pin exact versions" is insufficient guidance? Paul From freddyrietdijk at fridh.nl Wed Feb 15 11:55:22 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Wed, 15 Feb 2017 17:55:22 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > Sort of repeating my earlier question, but how often does this happen in reality? >From a quick check in our repo we have patched about 1% of our packages to remove the constraints. We have close to 2000 Python packages. We don't necessarily patch all the constraints, only when they collide with the version we would like the package to use so the actual percentage is likely higher. Larger applications that have many dependencies that are fixed have been kept out of Nixpkgs for now. Their fixed dependencies means we likely need multiple versions of packages. While Nix can handle that, it means more maintenance. We have a tool that can take e.g. a requirements.txt file and generate expressions, but it won't help you much with bug-fix releases when maintainers don't update their pinned requirements. > And how often is it that a simple request/PR to the package author to remove the explicit version requirements is rejected? That's hard to say. If I look at what packages I've contributed to Nixpkgs, then in my experience this is something that is typically dealt with by upstream when asked. > If you *do* get in a situation where the package explicitly requires certain versions of its dependencies, and you ignore those requirements, then presumably you're taking responsibility for supporting a combination that the upstream author doesn't support. How do you handle that? Typical situations are bug-fix releases. So far I haven't encountered any issues with using other versions, but like I said, larger applications that pin their dependencies have been mostly kept out of Nixpkgs. If we do encounter issues, then we have to find a solution. The likeliest situation is that an application requires a different version, an in that case we would then have an expression/package of that version specifically for that application. We don't have a global site-packages so we can do that. > Nick said he wants to guide authors away from explicit version pinning. That's fine, but is the problem so big that the occasional bug report to offending projects saying "please don't pin exact versions" is insufficient guidance? The main problem I see is that it limits in how far you can automatically update to newer versions and thus release bug/security fixes. Just one inappropriate pin is sufficient to break dependency solving. On Wed, Feb 15, 2017 at 5:14 PM, Paul Moore wrote: > On 15 February 2017 at 15:50, Freddy Rietdijk > wrote: > > It's quite frustrating as a downstream having to deal with packages where > > versions are pinned unnecessarily and therefore I've also requested on > the > > Setuptools tracker a flag that ignores constraints [1] (though I fear I > > would have to pull up my sleeves for this one :) ) . > > Sort of repeating my earlier question, but how often does this happen > in reality? (As a proportion of the packages you deal with). And how > often is it that a simple request/PR to the package author to remove > the explicit version requirements is rejected? (I assume your first > response is to file an issue with upstream?) > > If you *do* get in a situation where the package explicitly requires > certain versions of its dependencies, and you ignore those > requirements, then presumably you're taking responsibility for > supporting a combination that the upstream author doesn't support. How > do you handle that? > > I'm not trying to pick on your process here, or claim that > distributions are somehow doing things wrongly. But I am trying to > understand what redistributors' expectations are of package authors. > Nick said he wants to guide authors away from explicit version > pinning. That's fine, but is the problem so big that the occasional > bug report to offending projects saying "please don't pin exact > versions" is insufficient guidance? > > Paul > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 15 12:01:31 2017 From: brett at python.org (Brett Cannon) Date: Wed, 15 Feb 2017 17:01:31 +0000 Subject: [Distutils] Python installation not working In-Reply-To: <2038221091.6741722.1487050035224@mail.yahoo.com> References: <2038221091.6741722.1487050035224.ref@mail.yahoo.com> <2038221091.6741722.1487050035224@mail.yahoo.com> Message-ID: This actually isn't the right place to ask for installation help, Chitra (this list is about how to package up Python projects). For general support questions you should email python-list. On Wed, 15 Feb 2017 at 05:11 Chitra Dewan via Distutils-SIG < distutils-sig at python.org> wrote: > Hello, > > I am *beginner in Python* > I am facing problems in installing Python 3.5 on my windows vista x32 > machine. > I downloaded python-3.5.2.exe from Python.org. It is downloaded as an > exe. When I try to install it via "Run as administrator" , nothing > happens. Same behavior with 3.6 version > > kindly advise > > > Regards & Thanks, Chitra Dewan > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Wed Feb 15 12:01:55 2017 From: jim at jimfulton.info (Jim Fulton) Date: Wed, 15 Feb 2017 12:01:55 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 11:55 AM, Freddy Rietdijk wrote: > > Sort of repeating my earlier question, but how often does this happen > in reality? > > From a quick check in our repo we have patched about 1% of our packages to > remove the constraints. We have close to 2000 Python packages. We don't > necessarily patch all the constraints, only when they collide with the > version we would like the package to use so the actual percentage is likely > higher. > > Larger applications that have many dependencies that are fixed have been > kept out of Nixpkgs for now. Their fixed dependencies means we likely need > multiple versions of packages. While Nix can handle that, it means more > maintenance. We have a tool that can take e.g. a requirements.txt file and > generate expressions, but it won't help you much with bug-fix releases when > maintainers don't update their pinned requirements. > I suppose this isn't a problem for Java applications, which use jar files and per-application class paths. Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Feb 15 12:57:39 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 17:57:39 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: Thanks for your reply, it was very helpful. On 15 February 2017 at 16:55, Freddy Rietdijk wrote: > Larger applications that have many dependencies that are fixed have been > kept out of Nixpkgs for now. I notice here (and in a few other places) you talk about "Applications". From what I understand of Nick's position, applications absolutely should pin their dependencies - so if I'm understanding correctly, those applications will (and should) continue to pin exact versions. As regards automatic packaging of new upstream versions (of libraries rather than applications), I guess if you get upstream to remove the pinned versions, this problem goes away. > The main problem I see is that it limits in how far you can automatically update to newer versions and thus release bug/security fixes. Just one inappropriate pin is sufficient to break dependency solving. I'm not sure I follow this. Suppose we have foo 1.0 depending on bar. If foo 1.0 has doesn't pin bar (possibly because you reported to them that they shouldn't) then foo 1.1 isn't going to suddenly add the pin back. So you can update foo fine. And you can update bar because there's no pin. So yes, while "one inappropriate pin" can cause a problem, getting upstream to fix that is a one-off cost, not an ongoing issue. So, in summary, * I agree that libraries pinning dependencies too tightly is bad. * Distributions can easily enough report such pins upstream when the library is initially packaged, so there's no ongoing cost here (just possibly a delay before the library can be packaged). * Libraries can legitimately have appropriate pins (typically to ranges of versions). So distributions have to be able to deal with that. * Applications *should* pin precise versions. Distributions have to decide whether to respect those pins or remove them and then take on support of the combination that upstream doesn't support. * But application pins should be in a requirements.txt file, so ignoring version specs is pretty simple (just a script to run against the requirements file). * Because Python doesn't support multiple installed versions of packages, conflicting requirements *will* be a problem that distros have to solve themselves (the language response is "use a venv"). Nick is suggesting that the requirement metadata be prohibited from using exact pins, but there's alternative metadata for "yes, I really mean an exact pin". To me: 1. This doesn't have any bearing on *application* pins, as they aren't in metadata. 2. Distributions still have to be able to deal with libraries having exact pins, as it's an explicitly supported possibility. 3. You can still manage (effectively) exact pins without being explicit - foo >1.6,<1.8 pretty much does it. And that doesn't even have to be a deliberate attempt to break the system, it could be a genuine attempt to avoid known issues, that just got too aggressive. So we're left with additional complexity for library authors to understand, for what seems like no benefit in practice to distribution builders. The only stated benefit of the 2 types of metadata is to educate library authors of the benefits of not pinning versions - and it seems like a very sweeping measure, where bug reports from distributions seem like they would be a much more focused and just as effective approach. Paul From njs at pobox.com Wed Feb 15 13:10:08 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 15 Feb 2017 10:10:08 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Feb 15, 2017 07:41, "Nick Coghlan" wrote: >> pipenv borrows the Ruby solution to modeling this by having Pipfile >> for abstract dependency declarations and Pipfile.lock for concrete >> integration testing ones, so the idea here is to propagate that model >> to pydist.json by separating the "requires" field with abstract >> development dependencies from the "integrates" field with concrete >> deployment dependencies. > > What's the benefit of putting this in pydist.json? I feel like for the > usual deployment cases (a) going straight from Pipfile.lock -> venv is > pretty much sufficient, with no need to put this into a package, but > (b) if you really do want to put it into a package, then the natural > approach would be to make an empty wheel like > "my-django-app-deploy.whl" whose dependencies were the contents of > Pipfile.lock. My goal with the split is to get to a state where: - exactly zero projects on PyPI use "==" or "===" in their requires metadata (because PyPI explicitly prohibits it) - the vast majority of projects on PyPI *don't* have an "integrates" section - those projects that do have an `integrates` section have a valid reason for it (like PyObjC) For anyone making the transition from application and web service development to library and framework development, the transition from "always pin exact versions of your dependencies for deployment" to "when publishing a library or framework, only rule out the combinations that you're pretty sure *won't* work" is one of the trickiest to deal with as current tools *don't alert you to the fact that there's a difference to be learned*. Restricting what can go into requires creates an opportunity to ask users whether they're publishing a pre-integrated project or not: if yes, then they add the "integrates" field and put their pinned dependencies there; if not, then they relax the "==" constraints to "~=" or ">=". Ah-hah, this does make sense as a problem, thanks! However, your solution seems very odd to me :-). If the goal is to put an "are you sure/yes I'm sure" UX barrier between users and certain version settings, then why make a distinction that every piece of downstream software has to be aware of and ignore? Pypi seems like a funny place in the stack to be implementing this. It would be much simpler to implement this feature at the build system level, like e.g. setuptools could require that dependencies that you think are over strict be specified in an install_requires_yes_i_really_mean_it= field, without requiring any metadata changes. Basically it sounds like you're saying you want to extend the metadata so that it can represent both broken and non-broken packages, so that both can be created, passed around, and checked for. And I'm saying, how about instead we do that checking when creating the package in the first place. (Of course I can't see any way to do any of this that won't break existing sdists, but I guess you've already decided you're OK with that. I guess I should say that I'm a bit dubious that this is so important in the first place; I feel like there are lots of legitimate use cases for == dependencies and lots of kinds of linting we might want to apply to try and improve the level of packaging quality.) Either way, PyPI will believe your answer, it's just refusing the temptation to guess that using "==" or "===" in the requires section is sufficient to indicate that you're deliberately publishing a pre-integrated project. > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. Source vs binary isn't where the distinction applies, though. For example, it's legitimate for PyObjC to have pinned dependencies even when distributed in source form, as it's a metapackage used solely to integrate the various PyObjC subprojects into a single "release". ?? So that means that some packages have a loosely specified source that compiles down to a more strictly specified binary, and some have a more strictly specified source that compiles down to an equally strictly specified binary. That's... an argument in favor of my way of thinking about it, isn't it? That it can naturally express both situations? My point is that *for the cases where there's an important distinction between Pipfile and Pipfile.lock*, we already have a way to think about that distinction without introducing new concepts. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Wed Feb 15 13:15:04 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Feb 2017 18:15:04 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: I also get a little frustrated with this kind of proposal "no pins" which I read as "annoy the publisher to try to prevent them from annoying the consumer". As a free software publisher I feel entitled to annoy the consumer, an activity I will indulge in inversely proportional to my desire for users. Who is the star? It should be possible to publish applications to pypi. Much of the packaging we have is completely web application focused, these applications are not usually published at all. On Wed, Feb 15, 2017 at 12:58 PM Paul Moore wrote: > Thanks for your reply, it was very helpful. > > On 15 February 2017 at 16:55, Freddy Rietdijk > wrote: > > Larger applications that have many dependencies that are fixed have been > > kept out of Nixpkgs for now. > > I notice here (and in a few other places) you talk about > "Applications". From what I understand of Nick's position, > applications absolutely should pin their dependencies - so if I'm > understanding correctly, those applications will (and should) continue > to pin exact versions. > > As regards automatic packaging of new upstream versions (of libraries > rather than applications), I guess if you get upstream to remove the > pinned versions, this problem goes away. > > > The main problem I see is that it limits in how far you can > automatically update to newer versions and thus release bug/security fixes. > Just one inappropriate pin is sufficient to break dependency solving. > > I'm not sure I follow this. Suppose we have foo 1.0 depending on bar. > If foo 1.0 has doesn't pin bar (possibly because you reported to them > that they shouldn't) then foo 1.1 isn't going to suddenly add the pin > back. So you can update foo fine. And you can update bar because > there's no pin. So yes, while "one inappropriate pin" can cause a > problem, getting upstream to fix that is a one-off cost, not an > ongoing issue. > > So, in summary, > > * I agree that libraries pinning dependencies too tightly is bad. > * Distributions can easily enough report such pins upstream when the > library is initially packaged, so there's no ongoing cost here (just > possibly a delay before the library can be packaged). > * Libraries can legitimately have appropriate pins (typically to > ranges of versions). So distributions have to be able to deal with > that. > * Applications *should* pin precise versions. Distributions have to > decide whether to respect those pins or remove them and then take on > support of the combination that upstream doesn't support. > * But application pins should be in a requirements.txt file, so > ignoring version specs is pretty simple (just a script to run against > the requirements file). > * Because Python doesn't support multiple installed versions of > packages, conflicting requirements *will* be a problem that distros > have to solve themselves (the language response is "use a venv"). > > Nick is suggesting that the requirement metadata be prohibited from > using exact pins, but there's alternative metadata for "yes, I really > mean an exact pin". To me: > > 1. This doesn't have any bearing on *application* pins, as they aren't > in metadata. > 2. Distributions still have to be able to deal with libraries having > exact pins, as it's an explicitly supported possibility. > 3. You can still manage (effectively) exact pins without being > explicit - foo >1.6,<1.8 pretty much does it. And that doesn't even > have to be a deliberate attempt to break the system, it could be a > genuine attempt to avoid known issues, that just got too aggressive. > > So we're left with additional complexity for library authors to > understand, for what seems like no benefit in practice to distribution > builders. The only stated benefit of the 2 types of metadata is to > educate library authors of the benefits of not pinning versions - and > it seems like a very sweeping measure, where bug reports from > distributions seem like they would be a much more focused and just as > effective approach. > > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Wed Feb 15 14:44:42 2017 From: donald at stufft.io (Donald Stufft) Date: Wed, 15 Feb 2017 14:44:42 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > On Feb 15, 2017, at 1:15 PM, Daniel Holth wrote: > > I also get a little frustrated with this kind of proposal "no pins" which I read as "annoy the publisher to try to prevent them from annoying the consumer". As a free software publisher I feel entitled to annoy the consumer, an activity I will indulge in inversely proportional to my desire for users. Who is the star? > > It should be possible to publish applications to pypi. Much of the packaging we have is completely web application focused, these applications are not usually published at all. > I haven?t fully followed this thread, and while the recommendation is and will always be to use the least strict version specifier that will work for your application, I am pretty heavily -1 on mandating that people do not use ``==``. I am also fairly heavily -1 on confusing the data model even more by making two sets of dependencies, one that allows == and one that doesn?t. I don?t think that overly restrictive pins is that common of a problem (if anything, we?re more likely to have too loose of pins, due to the always-upgrade nature of pip and the difficulty of exhaustivly testing every possible version combination). In cases where this actively harms the end user (effectively when there is a security issue or a conflict) we can tell the user about it (theoretically, not in practice yet) but beyond that, this is best handled by opening individual issues up on each individual repository, just like any other packaging issue with that project. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From brunson at brunson.com Wed Feb 15 18:37:55 2017 From: brunson at brunson.com (Eric Brunson) Date: Wed, 15 Feb 2017 23:37:55 +0000 Subject: [Distutils] Python installation not working In-Reply-To: References: <2038221091.6741722.1487050035224.ref@mail.yahoo.com> <2038221091.6741722.1487050035224@mail.yahoo.com> Message-ID: <0100015a4423b70e-e0bdb844-3db1-4e91-a5b7-f0afb2b1501f-000000@email.amazonses.com> help at python.org is also set up to provide this kind of assistance. On Wed, Feb 15, 2017 at 10:05 AM Brett Cannon wrote: > This actually isn't the right place to ask for installation help, Chitra > (this list is about how to package up Python projects). For general support > questions you should email python-list. > > On Wed, 15 Feb 2017 at 05:11 Chitra Dewan via Distutils-SIG < > distutils-sig at python.org> wrote: > > Hello, > > I am *beginner in Python* > I am facing problems in installing Python 3.5 on my windows vista x32 > machine. > I downloaded python-3.5.2.exe from Python.org. It is downloaded as an > exe. When I try to install it via "Run as administrator" , nothing > happens. Same behavior with 3.6 version > > kindly advise > > > Regards & Thanks, Chitra Dewan > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Thu Feb 16 01:15:49 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Wed, 15 Feb 2017 22:15:49 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <03BB249B-13CA-40C3-8990-E3BD5A3FB715@twistedmatrix.com> > On Feb 15, 2017, at 11:44 AM, Donald Stufft wrote: > > >> On Feb 15, 2017, at 1:15 PM, Daniel Holth > wrote: >> >> I also get a little frustrated with this kind of proposal "no pins" which I read as "annoy the publisher to try to prevent them from annoying the consumer". As a free software publisher I feel entitled to annoy the consumer, an activity I will indulge in inversely proportional to my desire for users. Who is the star? >> >> It should be possible to publish applications to pypi. Much of the packaging we have is completely web application focused, these applications are not usually published at all. >> > > > > I haven?t fully followed this thread, and while the recommendation is and will always be to use the least strict version specifier that will work for your application, I am pretty heavily -1 on mandating that people do not use ``==``. I am also fairly heavily -1 on confusing the data model even more by making two sets of dependencies, one that allows == and one that doesn?t. I hope I'm not repeating a suggestion that appears up-thread, but, if you want to distribute an application with pinned dependencies, you could always released 'foo-lib' with a lenient set of dependencies, and 'foo-app' which depends on 'foo-lib' but pins the transitive closure of all dependencies with '=='. Your CI system could automatically release a new 'foo-app' every time any dependency has a new release and a build against the last release of 'foo-app' passes. -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From freddyrietdijk at fridh.nl Thu Feb 16 04:15:57 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Thu, 16 Feb 2017 10:15:57 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > I notice here (and in a few other places) you talk about "Applications". From what I understand of Nick's position, applications absolutely should pin their dependencies - so if I'm understanding correctly, those applications will (and should) continue to pin exact versions. Application developers typically don't test against all combinations of dependency versions and it also doesn't really make sense for them. Therefore it is understandable from their point of view to pin their dependencies. However, should they pin to a certain major/minor version, or to a patch version? In my opinion they best pin to minor versions. That should be sufficient to guarantee the app works. Let the distributions take care of providing the latest patch version so that it remains safe. And that means indeed specifying >1.6,<1.8 (or actually >=1.7,<1.8), and not ==1.7 or ==1.7.3. The same goes for the meta-packages. On Wed, Feb 15, 2017 at 6:57 PM, Paul Moore wrote: > Thanks for your reply, it was very helpful. > > On 15 February 2017 at 16:55, Freddy Rietdijk > wrote: > > Larger applications that have many dependencies that are fixed have been > > kept out of Nixpkgs for now. > > I notice here (and in a few other places) you talk about > "Applications". From what I understand of Nick's position, > applications absolutely should pin their dependencies - so if I'm > understanding correctly, those applications will (and should) continue > to pin exact versions. > > As regards automatic packaging of new upstream versions (of libraries > rather than applications), I guess if you get upstream to remove the > pinned versions, this problem goes away. > > > The main problem I see is that it limits in how far you can > automatically update to newer versions and thus release bug/security fixes. > Just one inappropriate pin is sufficient to break dependency solving. > > I'm not sure I follow this. Suppose we have foo 1.0 depending on bar. > If foo 1.0 has doesn't pin bar (possibly because you reported to them > that they shouldn't) then foo 1.1 isn't going to suddenly add the pin > back. So you can update foo fine. And you can update bar because > there's no pin. So yes, while "one inappropriate pin" can cause a > problem, getting upstream to fix that is a one-off cost, not an > ongoing issue. > > So, in summary, > > * I agree that libraries pinning dependencies too tightly is bad. > * Distributions can easily enough report such pins upstream when the > library is initially packaged, so there's no ongoing cost here (just > possibly a delay before the library can be packaged). > * Libraries can legitimately have appropriate pins (typically to > ranges of versions). So distributions have to be able to deal with > that. > * Applications *should* pin precise versions. Distributions have to > decide whether to respect those pins or remove them and then take on > support of the combination that upstream doesn't support. > * But application pins should be in a requirements.txt file, so > ignoring version specs is pretty simple (just a script to run against > the requirements file). > * Because Python doesn't support multiple installed versions of > packages, conflicting requirements *will* be a problem that distros > have to solve themselves (the language response is "use a venv"). > > Nick is suggesting that the requirement metadata be prohibited from > using exact pins, but there's alternative metadata for "yes, I really > mean an exact pin". To me: > > 1. This doesn't have any bearing on *application* pins, as they aren't > in metadata. > 2. Distributions still have to be able to deal with libraries having > exact pins, as it's an explicitly supported possibility. > 3. You can still manage (effectively) exact pins without being > explicit - foo >1.6,<1.8 pretty much does it. And that doesn't even > have to be a deliberate attempt to break the system, it could be a > genuine attempt to avoid known issues, that just got too aggressive. > > So we're left with additional complexity for library authors to > understand, for what seems like no benefit in practice to distribution > builders. The only stated benefit of the 2 types of metadata is to > educate library authors of the benefits of not pinning versions - and > it seems like a very sweeping measure, where bug reports from > distributions seem like they would be a much more focused and just as > effective approach. > > Paul > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Feb 17 03:56:04 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2017 09:56:04 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 Feb 2017 23:28, "Paul Moore" wrote: So, in summary, * I agree that libraries pinning dependencies too tightly is bad. * Distributions can easily enough report such pins upstream when the library is initially packaged, so there's no ongoing cost here (just possibly a delay before the library can be packaged). No, we can't easily do this. libraries.io tracks more than *two million* open source projects. Debian is the largest Linux distribution, and only tracks 50k packages. That means it is typically going to be *app* developers that run into the problem of inappropriately pinned dependencies and So if we rely on a manual "publish with pinned dependencies", "get bug report from redistributor or app developer", "republish with unpinned dependencies", we'll be in a situation where: - the affected app developer or redistributor is going to have a negative experience with the project - the responsible publisher is either going to have a negative interaction with an end user or redistributor, or else they'll just silently move on to find an alternative library - we relinquish any control of the tone used when the publisher is alerted to the problem By contrast, if we design the metadata format such that *PyPI* can provide a suitable error message, then: - publishers get alerted to the problem *prior* to publication - end users and redistributors are unlikely to encounter the problem directly - we retain full control over the tone of the error notification * Libraries can legitimately have appropriate pins (typically to ranges of versions). So distributions have to be able to deal with that. * Applications *should* pin precise versions. Distributions have to decide whether to respect those pins or remove them and then take on support of the combination that upstream doesn't support. * But application pins should be in a requirements.txt file, so ignoring version specs is pretty simple (just a script to run against the requirements file). Applications also get packaged as sdists and wheel files, so pydist.json does need to handle that case. Nick is suggesting that the requirement metadata be prohibited from using exact pins, but there's alternative metadata for "yes, I really mean an exact pin". To me: 1. This doesn't have any bearing on *application* pins, as they aren't in metadata. 2. Distributions still have to be able to deal with libraries having exact pins, as it's an explicitly supported possibility. 3. You can still manage (effectively) exact pins without being explicit - foo >1.6,<1.8 pretty much does it. And that doesn't even have to be a deliberate attempt to break the system, it could be a genuine attempt to avoid known issues, that just got too aggressive. People aren't going to do the last one accidentally, but they *will* use "==" when transferring app development practices to library development. So we're left with additional complexity for library authors to understand, for what seems like no benefit in practice to distribution builders. - We'll get more automated conversions with pyp2rpm and similar tools that "just work" without human intervention - We'll get fewer negative interpersonal interactions between upstream publishers and downstream redistributors It won't magically make everything all sunshine and roses, but we're currently at a point where about 70% of pyp2rpm conversions fail for various reasons, so every little bit helps :) The only stated benefit of the 2 types of metadata is to educate library authors of the benefits of not pinning versions - and it seems like a very sweeping measure, where bug reports from distributions seem like they would be a much more focused and just as effective approach. We've been playing that whack-a-mole game for years, and it sucks enormously for both publishers and redistributors from a user experience perspective. More importantly though, it's already failing to scale adequately, hence the rise of technologies like Docker, Flatpak, and Snappy that push more integration and update responsibilities back to application and service developers. The growth rates on PyPI mean we can expect those scalability challenges to get *worse* rather than better in the coming years. By pushing this check down into the tooling infrastructure, the aim would be to make the automated systems take on the task of being the "bad guy", rather than asking humans to do it later. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Feb 17 04:56:28 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 17 Feb 2017 01:56:28 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Fri, Feb 17, 2017 at 12:56 AM, Nick Coghlan wrote: > By contrast, if we design the metadata format such that *PyPI* can provide a > suitable error message, then: But all these benefits you're describing also work if you s/PyPI/setuptools/, no? And that doesn't require any metadata PEPs or global coordination, you could send them a PR this afternoon if you want. -n -- Nathaniel J. Smith -- https://vorpus.org From p.f.moore at gmail.com Fri Feb 17 05:08:14 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Feb 2017 10:08:14 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 17 February 2017 at 08:56, Nick Coghlan wrote: > - we retain full control over the tone of the error notification I tried to formulate a long response to this email, and got completely bogged down. So I'm going to give a brief[1] response for now and duck out until the dust settles. By "we" above, I assume you mean distutils-sig/PyPA. As part of that group, I find the complexities of how distributions package stuff up, and the expected interactions of the multitude of parties involved in the model you describe, completely baffling. That's fine normally (as a Windows developer, I don't typically interact with Linux distributions) but when it comes to being part of distutils-sig/PyPA in terms of how we present things like this, I feel a responsibility to understand (and by proxy, represent users who are similarly unaware of distro practices, etc). I understand (somewhat) the motivations behind this distinction between "requires" and "integrates"[2] but I think we need to come up with a much more straightforward explanation - geared towards library authors who don't understand (and probably aren't that interested in) the underling issues - before we standardise anything. Because otherwise, we'll be rehashing this debate over and over as library authors get errors they don't understand, and come asking. Paul [1] Yes, this was as brief as I could manage :-( [2] As a data point, I couldn't even think of the right terms to use here without scanning back over the email thread to look for them. That indicates to me that the concepts are anything but intuitive :-( From fungi at yuggoth.org Fri Feb 17 08:18:51 2017 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Feb 2017 13:18:51 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <20170217131851.GR12827@yuggoth.org> On 2017-02-17 09:56:04 +0100 (+0100), Nick Coghlan wrote: [...] > So if we rely on a manual "publish with pinned dependencies", "get bug > report from redistributor or app developer", "republish with unpinned > dependencies", we'll be in a situation where: > > - the affected app developer or redistributor is going to have a negative > experience with the project > - the responsible publisher is either going to have a negative interaction > with an end user or redistributor, or else they'll just silently move on to > find an alternative library > - we relinquish any control of the tone used when the publisher is alerted > to the problem > > By contrast, if we design the metadata format such that *PyPI* can provide > a suitable error message, then: > > - publishers get alerted to the problem *prior* to publication > - end users and redistributors are unlikely to encounter the problem > directly > - we retain full control over the tone of the error notification [...] It seems like the same could be said of many common mistakes which can be identified with some degree of certainty through analysis of the contents being uploaded. Why not also scan for likely security vulnerabilities with a static analyzer and refuse offending uploads unless the uploader toggles the magic "yes I really mean it" switch? Surely security issues are even greater downstream risks than simple dependency problems. (NB: I'm not in favor of that either, just nudging an example in the reductio ad absurdum direction.) -- Jeremy Stanley From ronaldoussoren at mac.com Mon Feb 20 02:51:17 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 20 Feb 2017 08:51:17 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <3F63BAFC-7457-4D5F-B9B4-3D7CA4638E89@mac.com> > On 15 Feb 2017, at 15:11, Nathaniel Smith wrote: > > >> In the vast majority of publication-to-PyPi cases people won't need >> the "integrates" field, since what they're publishing on PyPI will >> just be their abstract dependencies, and any warning against using >> "==" will recommend using "~=" or ">=" instead. But there *are* >> legitimate uses of pinning-for-publication (like the PyObjC >> metapackage bundling all its subcomponents, or when building for >> private deployment infastructure), so there needs to be a way to >> represent "Yes, I'm pinning this dependency for publication, and I'm >> aware of the significance of doing so" > > Why can't PyObjC just use regular dependencies? That's what distro > metapackages have done for decades, right? PyObjC is conceptually a single project that is split in multiple PyPI distributions to make it easier to install only the parts you need (and can install, PyObjC wraps macOS frameworks including some that may not be available on the OS version that you?re running). The project is managed as a single entity and updates will always release new versions of all PyPI packages for the project. ?pip install pyobjc==3.1? should install that version, and should not result in a mix of versions if you use this to downgrade (which could happen if the metapackage used ?>=? to specify the version of the concrete packages). BTW. I?m not sure if my choice to split PyObjC in a large collection of PyPI packages is still the right choice with current state of the packaging landscape. Ronald (the PyObjC maintainer) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ganwilliam at outlook.com Wed Feb 22 13:21:20 2017 From: ganwilliam at outlook.com (William Gan) Date: Wed, 22 Feb 2017 18:21:20 +0000 Subject: [Distutils] pip install error Message-ID: Good day, I got your email from the Installing Python Modules page in the Python 3.6.0 documentation. I encountered an error when trying to install a package in the Python IDLE shell: >>> python -m pip install numpy SyntaxError: invalid syntax I am using Windows 10. I have not been able to find any solution to this. Could you please help? Alternatively, please direct me to a web group for help. Many thanks. Gan william -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at timgolden.me.uk Wed Feb 22 13:36:36 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Wed, 22 Feb 2017 18:36:36 +0000 Subject: [Distutils] pip install error In-Reply-To: References: Message-ID: <8f8cbd22-d141-d538-7a75-9bcacc5dbb93@timgolden.me.uk> On 22/02/2017 18:21, William Gan wrote: > I got your email from the Installing Python Modules page in the Python > 3.6.0 documentation. You might do better ask this kind of question on the Tutor list: https://mail.python.org/mailman/listinfo/tutor > I encountered an error when trying to install a package in the Python > IDLE shell: > >>>> python -m pip install numpy > > SyntaxError: invalid syntax However, the answer here is straightforward enough: the python/pip command is an operating system command, ie to be run from the Command Prompt, not from the Python prompt. So: * Press the Windows Key or otherwise invoke Windows' search mode * Start typing: cmd * An icon for the "Command Prompt" should appear. Click it. * In the window which appears, at the command prompt, type the command you used above: python -mpip install numpy TJG From dholth at gmail.com Wed Feb 22 13:37:03 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 22 Feb 2017 18:37:03 +0000 Subject: [Distutils] pip install error In-Reply-To: References: Message-ID: import subprocess; subprocess.Popen("python -m pip install sqlalchemy", stdout=subprocess.PIPE).communicate() On Wed, Feb 22, 2017 at 1:28 PM William Gan wrote: > Good day, > > > > I got your email from the Installing Python Modules page in the Python > 3.6.0 documentation. > > > > I encountered an error when trying to install a package in the Python IDLE > shell: > > >>> python -m pip install numpy > > SyntaxError: invalid syntax > > > > I am using Windows 10. > > > > I have not been able to find any solution to this. Could you please help? > Alternatively, please direct me to a web group for help. > > > > Many thanks. > > Gan william > > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ganwilliam at outlook.com Wed Feb 22 13:44:20 2017 From: ganwilliam at outlook.com (William Gan) Date: Wed, 22 Feb 2017 18:44:20 +0000 Subject: [Distutils] pip install error In-Reply-To: References: Message-ID: Hello Tim and Daniel, Many thanks for the very quick response. I forgot to mention earlier that I did try on the Command Prompt. However, then I went into Python by entering the command ?py? and tried the pip install there. Following Tim?s email, I tried at user prompt and it worked. Noted suggestion on Tutor List. I have since subscribed to it. Many thanks again. Have a great day! From: Daniel Holth [mailto:dholth at gmail.com] Sent: Thursday, February 23, 2017 2:37 AM To: William Gan ; distutils-sig at python.org Subject: Re: [Distutils] pip install error import subprocess; subprocess.Popen("python -m pip install sqlalchemy", stdout=subprocess.PIPE).communicate() On Wed, Feb 22, 2017 at 1:28 PM William Gan > wrote: Good day, I got your email from the Installing Python Modules page in the Python 3.6.0 documentation. I encountered an error when trying to install a package in the Python IDLE shell: >>> python -m pip install numpy SyntaxError: invalid syntax I am using Windows 10. I have not been able to find any solution to this. Could you please help? Alternatively, please direct me to a web group for help. Many thanks. Gan william _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 03:03:36 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:03:36 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 Feb 2017 23:40, "Nathaniel Smith" wrote: On Feb 15, 2017 07:41, "Nick Coghlan" wrote: Ah-hah, this does make sense as a problem, thanks! However, your solution seems very odd to me :-). If the goal is to put an "are you sure/yes I'm sure" UX barrier between users and certain version settings, then why make a distinction that every piece of downstream software has to be aware of and ignore? Pypi seems like a funny place in the stack to be implementing this. It would be much simpler to implement this feature at the build system level, like e.g. setuptools could require that dependencies that you think are over strict be specified in an install_requires_yes_i_really_mean_it= field, without requiring any metadata changes. If you're publishing to a *private* index server then version pinning should be allowed by default and you shouldn't get a warning. It's only when publishing to PyPI as a *public* index server that overly restrictive dependencies become a UX problem. The simplest way of modelling this that I've come up with is a boolean "allow pinned dependencies" flag - without the flag, "==" and "===" would emit warnings or errors when releasing to a public index server, with it they wouldn't trigger any complaints. Basically it sounds like you're saying you want to extend the metadata so that it can represent both broken and non-broken packages, so that both can be created, passed around, and checked for. And I'm saying, how about instead we do that checking when creating the package in the first place. Build time isn't right, due to this being a perfectly acceptable thing to do when building solely for private use. It's only you make the "I'm going to publish this for the entire community to use" that the intent needs to be clarified (as at that point you're switching from "I'm solving to my own problems" to "My problems may be shared by other people, and I'd like to help them out if I can"). (Of course I can't see any way to do any of this that won't break existing sdists, but I guess you've already decided you're OK with that. I guess I should say that I'm a bit dubious that this is so important in the first place; I feel like there are lots of legitimate use cases for == dependencies and lots of kinds of linting we might want to apply to try and improve the level of packaging quality.) Either way, PyPI will believe your answer, it's just refusing the temptation to guess that using "==" or "===" in the requires section is sufficient to indicate that you're deliberately publishing a pre-integrated project. > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. Source vs binary isn't where the distinction applies, though. For example, it's legitimate for PyObjC to have pinned dependencies even when distributed in source form, as it's a metapackage used solely to integrate the various PyObjC subprojects into a single "release". ?? So that means that some packages have a loosely specified source that compiles down to a more strictly specified binary, and some have a more strictly specified source that compiles down to an equally strictly specified binary. That's... an argument in favor of my way of thinking about it, isn't it? That it can naturally express both situations? My point is that *for the cases where there's an important distinction between Pipfile and Pipfile.lock*, we already have a way to think about that distinction without introducing new concepts. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 03:18:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:18:55 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 18:03, Nick Coghlan wrote: > > > On 15 Feb 2017 23:40, "Nathaniel Smith" wrote: > > On Feb 15, 2017 07:41, "Nick Coghlan" wrote: > > > > Ah-hah, this does make sense as a problem, thanks! > > However, your solution seems very odd to me :-). > > If the goal is to put an "are you sure/yes I'm sure" UX barrier between > users and certain version settings, then why make a distinction that every > piece of downstream software has to be aware of and ignore? Pypi seems like > a funny place in the stack to be implementing this. It would be much > simpler to implement this feature at the build system level, like e.g. > setuptools could require that dependencies that you think are over strict > be specified in an install_requires_yes_i_really_mean_it= field, without > requiring any metadata changes. > > > If you're publishing to a *private* index server then version pinning > should be allowed by default and you shouldn't get a warning. > > It's only when publishing to PyPI as a *public* index server that overly > restrictive dependencies become a UX problem. > > The simplest way of modelling this that I've come up with is a boolean > "allow pinned dependencies" flag - without the flag, "==" and "===" would > emit warnings or errors when releasing to a public index server, with it > they wouldn't trigger any complaints. > > Basically it sounds like you're saying you want to extend the metadata so > that it can represent both broken and non-broken packages, so that both can > be created, passed around, and checked for. And I'm saying, how about > instead we do that checking when creating the package in the first place. > > > Build time isn't right, due to this being a perfectly acceptable thing to > do when building solely for private use. It's only you make the "I'm going > to publish this for the entire community to use" that the intent needs to > be clarified (as at that point you're switching from "I'm solving to my own > problems" to "My problems may be shared by other people, and I'd like to > help them out if I can"). > And TIL that Ctrl-Enter is Gmail's keyboard shortcut for sending an email :) > (Of course I can't see any way to do any of this that won't break existing > sdists, but I guess you've already decided you're OK with that. I guess I > should say that I'm a bit dubious that this is so important in the first > place; I feel like there are lots of legitimate use cases for == > dependencies and lots of kinds of linting we might want to apply to try and > improve the level of packaging quality.) > > Existing sdists won't have pydist.json, so none of this will apply. > > Either way, PyPI will believe your answer, it's just refusing the > temptation to guess that using "==" or "===" in the requires section > is sufficient to indicate that you're deliberately publishing a > pre-integrated project. > > > There's certainly a distinction to be made between the abstract > > dependencies and the exact locked dependencies, but to me the natural > > way to model that distinction is by re-using the distinction we > > already have been source packages and binary packages. The build > > process for this placeholder wheel is to "compile down" the abstract > > dependencies into concrete dependencies, and the resulting wheel > > encodes the result of this compilation. Again, no new concepts needed. > > Source vs binary isn't where the distinction applies, though. For > example, it's legitimate for PyObjC to have pinned dependencies even > when distributed in source form, as it's a metapackage used solely to > integrate the various PyObjC subprojects into a single "release". > > > ?? So that means that some packages have a loosely specified source that > compiles down to a more strictly specified binary, and some have a more > strictly specified source that compiles down to an equally strictly > specified binary. That's... an argument in favor of my way of thinking > about it, isn't it? That it can naturally express both situations? > > Why are you bringing source vs binary into this? That has *nothing* to do with the problem space, which is about the large grey area between "definitely doesn't work" (aka "we tested this combination and it failed") and "will almost certainly work" (aka "we tested this specific combination of dependencies and it passed"). When publishing a software *component* (such as a library or application), the most important information to communicate to users is the former (i.e. the combinations you know *don't* work), while for applications & services you typically want to communicate *both* (i.e. the combinations you know definitively *don't* work, *and* the specific combinations you tested). While you do need to do at least one build to actually run the tests, once you have those results, the metadata is just as applicable to the original source artifact as it is to the specific built binary. > My point is that *for the cases where there's an important distinction > between Pipfile and Pipfile.lock*, we already have a way to think about > that distinction without introducing new concepts. > > Most software components won't have a Pipfile or Pipfile.lock, as that's an application & service oriented way of framing the distinction. However, as Daniel said in his reply, we *do* want people to be able to publish applications and services like sentry or supervisord to PyPI, and we also want to allow people to publish metapackages like PyObjC. The problem I'm trying to address is that we *don't* currently give publishers a machine readable way to say definitively "This is a pre-integrated application, service or metapackage" rather than "This is a component intended for integration into a larger application, service or metapackage". I'm not a huge fan of having simple boolean toggles in metadata definitions (hence the more elaborate idea of two different kinds of dependency declaration), but this may be a case where that's a good way to go, since it would mean that services and tools that care can check it (with a recommendation in the spec saying that public index servers SHOULD check it), while those that don't care would continue to have a single unified set of dependency declarations to work with. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 03:33:57 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:33:57 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <20170217131851.GR12827@yuggoth.org> References: <425841221.7853973.1487103672849@mail.yahoo.com> <20170217131851.GR12827@yuggoth.org> Message-ID: On 17 February 2017 at 23:18, Jeremy Stanley wrote: > On 2017-02-17 09:56:04 +0100 (+0100), Nick Coghlan wrote: > [...] > > So if we rely on a manual "publish with pinned dependencies", "get bug > > report from redistributor or app developer", "republish with unpinned > > dependencies", we'll be in a situation where: > > > > - the affected app developer or redistributor is going to have a negative > > experience with the project > > - the responsible publisher is either going to have a negative > interaction > > with an end user or redistributor, or else they'll just silently move on > to > > find an alternative library > > - we relinquish any control of the tone used when the publisher is > alerted > > to the problem > > > > By contrast, if we design the metadata format such that *PyPI* can > provide > > a suitable error message, then: > > > > - publishers get alerted to the problem *prior* to publication > > - end users and redistributors are unlikely to encounter the problem > > directly > > - we retain full control over the tone of the error notification > [...] > > It seems like the same could be said of many common mistakes which > can be identified with some degree of certainty through analysis of > the contents being uploaded. Why not also scan for likely security > vulnerabilities with a static analyzer and refuse offending uploads > unless the uploader toggles the magic "yes I really mean it" switch? > Surely security issues are even greater downstream risks than simple > dependency problems. (NB: I'm not in favor of that either, just > nudging an example in the reductio ad absurdum direction.) > Most of the other potential checks are about forming an opinion about software quality, rather than attempting to discern publisher intent. Now, we could ask all package developers "Is this an application, service, or metapackage?", but then we'd have to get into a detailed discussion of what those terms mean, and help them decide whether or not any of them apply to what they're doing. It would also be a complete waste of their time if they're not attempting to pin any dependencies in the first place, or if they're not publishing the component to a public index server. Alternatively, we can defer asking any question at all until they do something where the difference matters: attempting to pin a dependency to a specific version when publishing to a public index server. At that point, there is an ambiguity in intent as there are multiple reasons somebody could be doing that: - they're actually publishing an application, service, or metapackage, so dependency pinning is entirely reasonable - they've carried over habits learned in application and service development into component and framework publishing - they've carried over habits learned in other ecosystems that encourage runtime version mixing (e.g. npm/js) into their Python publishing So the discussion in this thread has convinced me that a separate "allow_pinned_dependencies" flag is a much better way to model this than attempting to define different dependency types, but I still want to include it in the metadata model :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 03:37:06 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 08:37:06 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 08:18, Nick Coghlan wrote: > I'm not a huge fan of having simple boolean toggles in metadata definitions > (hence the more elaborate idea of two different kinds of dependency > declaration), but this may be a case where that's a good way to go, since it > would mean that services and tools that care can check it (with a > recommendation in the spec saying that public index servers SHOULD check > it), while those that don't care would continue to have a single unified set > of dependency declarations to work with. While boolean metadata may not be ideal in the general case, I think it makes sense here. If you want to make it more acceptable, maybe make it Package-Type, with values "application" or "library". On a related but tangential point, can I make a plea for using simpler language when documenting this (and even when discussing it)? The term "pre-integrated application" means very little to me in any practical sense beyond "application", and it brings a whole load of negative connotations - I deal with Java build processes on occasion, and the whole terminology there ("artifacts", "deployment units", ...) makes for a pretty hostile experience for the newcomer. I'd like to avoid Python packaging going down that route - even if the cost is a little vagueness in terms. Paul From ncoghlan at gmail.com Thu Feb 23 03:44:47 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:44:47 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 18:37, Paul Moore wrote: > On 23 February 2017 at 08:18, Nick Coghlan wrote: > > I'm not a huge fan of having simple boolean toggles in metadata > definitions > > (hence the more elaborate idea of two different kinds of dependency > > declaration), but this may be a case where that's a good way to go, > since it > > would mean that services and tools that care can check it (with a > > recommendation in the spec saying that public index servers SHOULD check > > it), while those that don't care would continue to have a single unified > set > > of dependency declarations to work with. > > While boolean metadata may not be ideal in the general case, I think > it makes sense here. If you want to make it more acceptable, maybe > make it Package-Type, with values "application" or "library". > That gets us back into the world of defining what the various package types mean, and I really don't want to go there :) Instead, I'm thinking in terms of a purely capability based field: "allow_pinned_dependencies", with the default being "False", but actually checking the field also only being a SHOULD for public index servers and a MAY for everything else. That would be enough for downstream tooling to pick up and say "I should treat this as a multi-component module rather than as an individual standalone component", *without* having to inflict the task of understanding the complexities of multi-tier distribution systems onto all component publishers :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 03:53:52 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 08:53:52 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 08:44, Nick Coghlan wrote: > That gets us back into the world of defining what the various package types > mean, and I really don't want to go there :) And yet I still don't understand what's wrong with "application", "library", and "metapackage" (the latter saying to me "complex thing that I don't need to understand"). Those terms are clear enough - after all, they are precisely the ones we've always used when debating "should you pin or not"? Sure, there's a level of judgement involved - but it's precisely the *same* judgement as we're asking authors to make when asking"should I pin", just using the underlying distinction directly. > Instead, I'm thinking in terms of a purely capability based field: > "allow_pinned_dependencies", with the default being "False", but actually > checking the field also only being a SHOULD for public index servers and a > MAY for everything else. How would the user see this? As a magic flag they have to set to "yes" so that they can pin dependencies? Because if that's the situation, I'd imagine a lot of authors just cargo-culting "add this flag to get my package to upload" without actually thinking about the implications. (They'll search Stack Overflow for the error message, so putting what it's for in the docs won't help...) Paul From njs at pobox.com Thu Feb 23 06:49:53 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 23 Feb 2017 03:49:53 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Thu, Feb 23, 2017 at 12:44 AM, Nick Coghlan wrote: > On 23 February 2017 at 18:37, Paul Moore wrote: >> >> On 23 February 2017 at 08:18, Nick Coghlan wrote: >> > I'm not a huge fan of having simple boolean toggles in metadata >> > definitions >> > (hence the more elaborate idea of two different kinds of dependency >> > declaration), but this may be a case where that's a good way to go, >> > since it >> > would mean that services and tools that care can check it (with a >> > recommendation in the spec saying that public index servers SHOULD check >> > it), while those that don't care would continue to have a single unified >> > set >> > of dependency declarations to work with. >> >> While boolean metadata may not be ideal in the general case, I think >> it makes sense here. If you want to make it more acceptable, maybe >> make it Package-Type, with values "application" or "library". > > > That gets us back into the world of defining what the various package types > mean, and I really don't want to go there :) > > Instead, I'm thinking in terms of a purely capability based field: > "allow_pinned_dependencies", with the default being "False", but actually > checking the field also only being a SHOULD for public index servers and a > MAY for everything else. > > That would be enough for downstream tooling to pick up and say "I should > treat this as a multi-component module rather than as an individual > standalone component", *without* having to inflict the task of understanding > the complexities of multi-tier distribution systems onto all component > publishers :) I'm still not sure I understand what you're trying to do, but this feels like you're trying to have it both ways... if you don't want to define what the different package types mean, and it's purely a capability-based field, then surely that means that downstream tooling *can't* make assumptions about what kind of package type it is based on the field? ISTM that from the point of view of downstream tooling, "allow_pinned_dependencies" carries literally no information, because all it means is "this package is on a public server and its Requires-Dist field has an == in it", which are things we already know. I can see how this would help your goal of educating uploaders about good package hygiene, but not how it helps downstream distributors. (Here's an example I've just run into that involves a == dependency on a public package: I have a library that needs to access some C API calls on Windows, but not on other platforms. The natural way to do this is to split out the CFFI code into its own package, _mylib_windows_helper or whatever, that has zero public interface, and have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == 'nt'". That way I can distribute one pure-Python wheel + one binary wheel and everything just works. But there's no sense in which this is an "integrated application" or anything, it's just a single library that usually ships in one .whl but sometimes ships in 2 .whls.) ((In actual fact I'm currently not building the package this way because setuptools makes it extremely painful to actually maintain that setup. Really I need the ability to build two wheels out of a single source package. Since we don't have that, I'm instead using CFFI's slow and semi-deprecated ABI mode, which lets me call C functions from a pure Python package. But what I described above is really the "right" solution, it's just tooling limitations that make it painful.)) -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Thu Feb 23 07:32:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 22:32:42 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 18:53, Paul Moore wrote: > On 23 February 2017 at 08:44, Nick Coghlan wrote: > > That gets us back into the world of defining what the various package > types > > mean, and I really don't want to go there :) > > And yet I still don't understand what's wrong with "application", > "library", and "metapackage" (the latter saying to me "complex thing > that I don't need to understand"). Those terms are clear enough - > after all, they are precisely the ones we've always used when debating > "should you pin or not"? > > Sure, there's a level of judgement involved - but it's precisely the > *same* judgement as we're asking authors to make when asking"should I > pin", just using the underlying distinction directly. > Thinking about it further, I may be OK with that, especially since we can point to concrete examples. component: a library or framework used to build Python applications. Users will mainly interact with the component via a Python API. Examples: requests, numpy, pytz application: an installable client application or web service. Users will mainly interact with the service via either the command line, a GUI, or a network interface. Examples: ckan (network), ansible (cli), spyder (GUI) metapackage: a package that collects specific versions of other components into a single installable group Example: PyObjC And then we'd note in the spec that public index servers SHOULD warn when components use pinned dependencies, while other tools MAY warn about that case. Going down that path would also end up addressing this old RFE for the packaging user guide: https://github.com/pypa/python-packaging-user-guide/issues/100 > > Instead, I'm thinking in terms of a purely capability based field: > > "allow_pinned_dependencies", with the default being "False", but actually > > checking the field also only being a SHOULD for public index servers and > a > > MAY for everything else. > > How would the user see this? As a magic flag they have to set to "yes" > so that they can pin dependencies? Because if that's the situation, > I'd imagine a lot of authors just cargo-culting "add this flag to get > my package to upload" without actually thinking about the > implications. (They'll search Stack Overflow for the error message, so > putting what it's for in the docs won't help...) > Pre-answering questions on SO can work incredibly well, though :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From freddyrietdijk at fridh.nl Thu Feb 23 08:04:58 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Thu, 23 Feb 2017 14:04:58 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > Here's an example I've just run into that involves a == dependency on a public package: I have a library that needs to access some C API calls on Windows, but not on other platforms. The natural way to do this is to split out the CFFI code into its own package, _mylib_windows_helper or whatever, that has zero public interface, and have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == 'nt'". You have a public library, that, depending on the platform, depends on a public (helper) library that has no public interface? That doesn't sound good to me. If you don't want to implement a public interface then it should just be included in the main library because it is in the end a requirement of the library. It's a pity you can't have a universal wheel but so be it. Choosing to depend on an exact version of a package that has no public interfance is in my opinion the wrong solution. As I stated before, though perhaps not explicitly, I cannot think of *any* good reason that one uses == in `install_requires`. Something like `>= 1.7, < 1.8` should be sufficient. In the CFFI case that should be sufficient unless you change your function signatures in a maintenance release (which is bad). And in case of a metapackage like PyObjC this should also be sufficient because it will downgrade dependencies when downgrading the metapackage while still giving you the latest maintenance releases of the dependencies. Regarding 'application', 'library', and 'metapackage'. In Nixpkgs we distinguish Python libraries and applications. Applications are available for 1 version of the interpreter, whereas libraries are available for all (supported) interpreter versions. It's nice if it were more explicit on say PyPI whether a package is a library or an application. There are difficult cases though, e.g., `ipython`. Is that an application or a library? As user I would argue that it is an application, however, it should be available for each version of the interpreter and that's why we branded it a library. Metapackages. `jupyter` is a metapackage. We had to put it with the rest of the Python libraries for the same reason as we put `ipython` there. >From a distributions' point of view I don't see why you would want to have them mentioned separately. On Thu, Feb 23, 2017 at 12:49 PM, Nathaniel Smith wrote: > On Thu, Feb 23, 2017 at 12:44 AM, Nick Coghlan wrote: > > On 23 February 2017 at 18:37, Paul Moore wrote: > >> > >> On 23 February 2017 at 08:18, Nick Coghlan wrote: > >> > I'm not a huge fan of having simple boolean toggles in metadata > >> > definitions > >> > (hence the more elaborate idea of two different kinds of dependency > >> > declaration), but this may be a case where that's a good way to go, > >> > since it > >> > would mean that services and tools that care can check it (with a > >> > recommendation in the spec saying that public index servers SHOULD > check > >> > it), while those that don't care would continue to have a single > unified > >> > set > >> > of dependency declarations to work with. > >> > >> While boolean metadata may not be ideal in the general case, I think > >> it makes sense here. If you want to make it more acceptable, maybe > >> make it Package-Type, with values "application" or "library". > > > > > > That gets us back into the world of defining what the various package > types > > mean, and I really don't want to go there :) > > > > Instead, I'm thinking in terms of a purely capability based field: > > "allow_pinned_dependencies", with the default being "False", but actually > > checking the field also only being a SHOULD for public index servers and > a > > MAY for everything else. > > > > That would be enough for downstream tooling to pick up and say "I should > > treat this as a multi-component module rather than as an individual > > standalone component", *without* having to inflict the task of > understanding > > the complexities of multi-tier distribution systems onto all component > > publishers :) > > I'm still not sure I understand what you're trying to do, but this > feels like you're trying to have it both ways... if you don't want to > define what the different package types mean, and it's purely a > capability-based field, then surely that means that downstream tooling > *can't* make assumptions about what kind of package type it is based > on the field? ISTM that from the point of view of downstream tooling, > "allow_pinned_dependencies" carries literally no information, because > all it means is "this package is on a public server and its > Requires-Dist field has an == in it", which are things we already > know. I can see how this would help your goal of educating uploaders > about good package hygiene, but not how it helps downstream > distributors. > > (Here's an example I've just run into that involves a == dependency on > a public package: I have a library that needs to access some C API > calls on Windows, but not on other platforms. The natural way to do > this is to split out the CFFI code into its own package, > _mylib_windows_helper or whatever, that has zero public interface, and > have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == > 'nt'". That way I can distribute one pure-Python wheel + one binary > wheel and everything just works. But there's no sense in which this is > an "integrated application" or anything, it's just a single library > that usually ships in one .whl but sometimes ships in 2 .whls.) > > ((In actual fact I'm currently not building the package this way > because setuptools makes it extremely painful to actually maintain > that setup. Really I need the ability to build two wheels out of a > single source package. Since we don't have that, I'm instead using > CFFI's slow and semi-deprecated ABI mode, which lets me call C > functions from a pure Python package. But what I described above is > really the "right" solution, it's just tooling limitations that make > it painful.)) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 08:13:31 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 23:13:31 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 23:04, Freddy Rietdijk wrote: > > Here's an example I've just run into that involves a == dependency on > a public package: I have a library that needs to access some C API > calls on Windows, but not on other platforms. The natural way to do > this is to split out the CFFI code into its own package, > _mylib_windows_helper or whatever, that has zero public interface, and > have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == > 'nt'". > > You have a public library, that, depending on the platform, depends on a > public (helper) library that has no public interface? That doesn't sound > good to me. If you don't want to implement a public interface then it > should just be included in the main library because it is in the end a > requirement of the library. It's a pity you can't have a universal wheel > but so be it. Choosing to depend on an exact version of a package that has > no public interfance is in my opinion the wrong solution. > > As I stated before, though perhaps not explicitly, I cannot think of *any* > good reason that one uses == in `install_requires`. Something like `>= 1.7, > < 1.8` should be sufficient. In the CFFI case that should be sufficient > unless you change your function signatures in a maintenance release (which > is bad). And in case of a metapackage like PyObjC this should also be > sufficient because it will downgrade dependencies when downgrading the > metapackage while still giving you the latest maintenance releases of the > dependencies. > > Regarding 'application', 'library', and 'metapackage'. In Nixpkgs we > distinguish Python libraries and applications. Applications are available > for 1 version of the interpreter, whereas libraries are available for all > (supported) interpreter versions. It's nice if it were more explicit on say > PyPI whether a package is a library or an application. There are difficult > cases though, e.g., `ipython`. Is that an application or a library? As user > I would argue that it is an application, however, it should be available > for each version of the interpreter and that's why we branded it a library. > That sounds pretty similar to the distinction in Fedora as well, which has been highlighted by the Python 3 migration effort: libraries emit both Python 2 & 3 RPMs from their source RPM (and will for as long as Fedora and the library both support Python 2), while applications just switch from depending on Python 2 to depending on Python 3 instead. > Metapackages. `jupyter` is a metapackage. We had to put it with the rest > of the Python libraries for the same reason as we put `ipython` there. From > a distributions' point of view I don't see why you would want to have > them mentioned separately. > >From a distro point of view, explicit upstream metapackages would provide a hint saying "these projects should be upgraded as a unit rather than independently". We're free to ignore that hint if we want to, but doing so means we get to keep the pieces if they break rather than just being able to report the problem back upstream :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 08:27:23 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 13:27:23 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 12:32, Nick Coghlan wrote: > component: a library or framework used to build Python applications. Users > will mainly interact with the component via a Python API. Examples: > requests, numpy, pytz Sorry to nitpick, but why is "component" better than "library"? People typically understand that "library" includes "framework" in this context. OTOH someone who's written a new library won't necessarily know that in this context (and *only* this context) we want them to describe it as a "component". (As far as I know, we don't use the term "component" anywhere else in the Python ecosystem currently). This feels to me somewhat like the failed attempts to force a distinction between "package" and "distribution". In the end, people use the terms they are comfortable with, and work with a certain level of context-dependent ambiguity. Of course, if the goal here is to raise the barrier for entry to PyPI by expecting people to have to understand this type of concept and the implications before uploading, then that's fair. It's not something I think we should be aiming for personally, but I can see that organisations who want to be able to rely on the quality of what's available on PyPI would be in favour of a certain level of self-selection being applied. Personally I view PyPI as more of a public resource, like github, where it's up to the consumer to assess quality - so to me this is a change of focus. But YMMV. Paul From p.f.moore at gmail.com Thu Feb 23 08:42:23 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 13:42:23 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 13:04, Freddy Rietdijk wrote: >> Here's an example I've just run into that involves a == dependency on >> a public package: I have a library that needs to access some C API >> calls on Windows, but not on other platforms. The natural way to do >> this is to split out the CFFI code into its own package, >> _mylib_windows_helper or whatever, that has zero public interface, and >> have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == >> 'nt '". > > You have a public library, that, depending on the platform, depends on a > public (helper) library that has no public interface? That doesn't sound > good to me. If you don't want to implement a public interface then it should > just be included in the main library because it is in the end a requirement > of the library. It's a pity you can't have a universal wheel but so be it. > Choosing to depend on an exact version of a package that has no public > interfance is in my opinion the wrong solution. The helper library is only public in the sense that it's published on PyPI. I'd describe it as an optional helper. If PyPI had a way of marking such libraries as "only allow downloading to satisfy a dependency" then I'd say mark it that way - but we don't. Requiring non-universal (and consequently version-dependent) wheels for platforms that don't need them seems like a cure that's worse than the disease. Personally, I find Nathaniel's example to be a compelling reason for wanting to specify exact dependencies for something that's not an "application". As an author, it's how I'd prefer to bundle a package like this. And IMO, if distributions prefer that I don't do that, I'd say it's up to them to explain what they want me to do, and how it'll benefit me and my direct users. At the moment all I'm seeing is "you should" and "it's the wrong solution" - you may be right, but surely it's obvious that you need to explain*why* your view is correct? Or at a minimum, if there is no direct benefit to me, why I, as an author, should modify my preferred development model to make things easier for you. Not all packages published on PyPI need or want to be bundled into OS distributions[1]. Paul [1] OTOH, the bulk of this discussion is currently about theoretical cases anyway. Maybe it would be worth everyone (myself included) taking a deep breath, and refocusing on actual cases where there is a problem right now (I don't know if anyone can identify such cases - I know I can't). Asking directly of the authors of such packages "would you be OK with the following proposal" would likely be very enlightening. From ncoghlan at gmail.com Thu Feb 23 08:47:01 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 23:47:01 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 22:32, Nick Coghlan wrote: > On 23 February 2017 at 18:53, Paul Moore wrote: > >> On 23 February 2017 at 08:44, Nick Coghlan wrote: >> > That gets us back into the world of defining what the various package >> types >> > mean, and I really don't want to go there :) >> >> And yet I still don't understand what's wrong with "application", >> "library", and "metapackage" (the latter saying to me "complex thing >> that I don't need to understand"). Those terms are clear enough - >> after all, they are precisely the ones we've always used when debating >> "should you pin or not"? >> >> Sure, there's a level of judgement involved - but it's precisely the >> *same* judgement as we're asking authors to make when asking"should I >> pin", just using the underlying distinction directly. >> > > Thinking about it further, I may be OK with that, especially since we can > point to concrete examples. > > component: a library or framework used to build Python applications. > Users will mainly interact with the component via a Python API. Examples: > requests, numpy, pytz > Slight amendment here to use the term "library" rather than the generic component (freeing up the latter for its usual meaning in referring to arbitrary software components). I also realised that we need a separate category to cover packages like "pip" itself, and I chose "tool" based on the name of the field in pyproject.toml: ============ library: a software component used to build Python applications. Users will mainly interact with the component via a Python API. Libraries are essentially dynamic plugins for a Python runtime. Examples: requests, numpy, pytz tool: a software utility used to develop and deploy Python libraries, applications, and scripts. Users will mainly interact with the component via the command line, or a GUI. Examples: pip, pycodestyle, gunicorn, jupyter application: an installable client application or web service. Users will mainly interact with the service via either the command line, a GUI, or a network interface. While they may expose Python APIs to end users, the fact they're written in Python themselves is technically an implementation detail, making it possible to use them without even being aware that Python exists. Examples: ckan (network), ansible (cli), spyder (GUI) metapackage: a package that collects specific versions of other components into a single installable group. Example: PyObjC ============ I think a package_type field with those possible values would cover everything I was worried about when I came up with the idea of the separate "integrates" field, and it seems like it would be relatively straightforward to explain to newcomers. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From encukou at gmail.com Thu Feb 23 09:24:02 2017 From: encukou at gmail.com (Petr Viktorin) Date: Thu, 23 Feb 2017 15:24:02 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> Message-ID: <313f03c2-a6e4-eaed-2c57-92bd82be48a8@gmail.com> On 02/23/2017 02:47 PM, Nick Coghlan wrote: > > ============ > library: a software component used to build Python applications. > Users will mainly interact with the component via a Python API. > Libraries are essentially dynamic plugins for a Python runtime. > Examples: requests, numpy, pytz Assuming frameworks are included, it woud be useful to add e.g. "django" to the examples. > tool: a software utility used to develop and deploy Python > libraries, applications, and scripts. Users will mainly interact with > the component via the command line, or a GUI. Examples: pip, > pycodestyle, gunicorn, jupyter > application: an installable client application or web service. Users > will mainly interact with the service via either the command line, a > GUI, or a network interface. While they may expose Python APIs to end > users, the fact they're written in Python themselves is technically an > implementation detail, making it possible to use them without even being > aware that Python exists. Examples: ckan (network), ansible (cli), > spyder (GUI) > metapackage: a package that collects specific versions of other > components into a single installable group. Example: PyObjC > ============ From p.f.moore at gmail.com Thu Feb 23 09:28:03 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 14:28:03 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 13:47, Nick Coghlan wrote: > > Slight amendment here to use the term "library" rather than the generic > component (freeing up the latter for its usual meaning in referring to > arbitrary software components). I also realised that we need a separate > category to cover packages like "pip" itself, and I chose "tool" based on > the name of the field in pyproject.toml: > > ============ > library: a software component used to build Python applications. Users > will mainly interact with the component via a Python API. Libraries are > essentially dynamic plugins for a Python runtime. Examples: requests, numpy, > pytz > tool: a software utility used to develop and deploy Python libraries, > applications, and scripts. Users will mainly interact with the component via > the command line, or a GUI. Examples: pip, pycodestyle, gunicorn, jupyter > application: an installable client application or web service. Users > will mainly interact with the service via either the command line, a GUI, or > a network interface. While they may expose Python APIs to end users, the > fact they're written in Python themselves is technically an implementation > detail, making it possible to use them without even being aware that Python > exists. Examples: ckan (network), ansible (cli), spyder (GUI) > metapackage: a package that collects specific versions of other > components into a single installable group. Example: PyObjC > ============ > > I think a package_type field with those possible values would cover > everything I was worried about when I came up with the idea of the separate > "integrates" field, and it seems like it would be relatively straightforward > to explain to newcomers. Yeah, that looks good. I'd assume that: (1) The field is optional. (2) The field is 99% for information only, with the only imposed semantics being that PyPI can reject use of == constraints in install_requires unless the type is explicitly "application" or "metapackage". Specifically, I doubt people will make a firm distinction between "tool" and "library". In many cases it'll be a matter of opinion. Is py.test a tool or a library? It has a command line interface after all. I'd also drop "used to develop and deploy Python libraries, applications, and scripts" - why does what it's used for affect its category? I can think of examples I think of as "tools" that are general purpose (e.g. youtube-dl) but I'd expect you to claim they are "applications". But unless they pin their dependencies (which youtube-dl doesn't AFAIK) the distinction is irrelevant. So I prefer to leave it to the author to decide, rather than force an artificial split. Thanks for taking the time to address my concerns! Paul From p.f.moore at gmail.com Thu Feb 23 09:28:31 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 14:28:31 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <313f03c2-a6e4-eaed-2c57-92bd82be48a8@gmail.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <313f03c2-a6e4-eaed-2c57-92bd82be48a8@gmail.com> Message-ID: On 23 February 2017 at 14:24, Petr Viktorin wrote: > On 02/23/2017 02:47 PM, Nick Coghlan wrote: >> >> >> ============ >> library: a software component used to build Python applications. >> Users will mainly interact with the component via a Python API. >> Libraries are essentially dynamic plugins for a Python runtime. >> Examples: requests, numpy, pytz > > > Assuming frameworks are included, it woud be useful to add e.g. "django" to > the examples. +1 From thomas at kluyver.me.uk Thu Feb 23 09:49:07 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Thu, 23 Feb 2017 14:49:07 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <1487861347.1340695.890514816.4132E607@webmail.messagingengine.com> On Thu, Feb 23, 2017, at 02:28 PM, Paul Moore wrote: > I'd also drop "used to develop and deploy Python libraries, > applications, and scripts" - why does what it's used for affect its > category? Things for working on & with Python code often have installation requirements a bit different from other applications. E.g. pip installs (or used to) with aliases specific to the Python version it runs on, so pip, pip3 and pip-3.5 could all point to the same command. Clearly it wouldn't make sense to do that for youtube-dl. I'm not sure about 'tool' as a name for this category, but they often do require different handling to general applications. Thomas From ncoghlan at gmail.com Thu Feb 23 10:09:29 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2017 01:09:29 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 24 February 2017 at 00:28, Paul Moore wrote: > Specifically, I doubt people will make a firm distinction between > "tool" and "library". In many cases it'll be a matter of opinion. Is > py.test a tool or a library? It has a command line interface after > all. I'd also drop "used to develop and deploy Python libraries, > applications, and scripts" - why does what it's used for affect its > category? I can think of examples I think of as "tools" that are > general purpose (e.g. youtube-dl) but I'd expect you to claim they are > "applications". But unless they pin their dependencies (which > youtube-dl doesn't AFAIK) the distinction is irrelevant. So I prefer > to leave it to the author to decide, rather than force an artificial > split. > The difference is that: * tool = you typically want at least one copy per Python interpreter (like a library) * application = you typically only want one copy per system It may be clearer to make the former category "devtool", since it really is specific to tools that are coupled to the task of Python development. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 10:11:10 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 15:11:10 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <1487861347.1340695.890514816.4132E607@webmail.messagingengine.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> <1487861347.1340695.890514816.4132E607@webmail.messagingengine.com> Message-ID: On 23 February 2017 at 14:49, Thomas Kluyver wrote: > On Thu, Feb 23, 2017, at 02:28 PM, Paul Moore wrote: >> I'd also drop "used to develop and deploy Python libraries, >> applications, and scripts" - why does what it's used for affect its >> category? > > Things for working on & with Python code often have installation > requirements a bit different from other applications. E.g. pip installs > (or used to) with aliases specific to the Python version it runs on, so > pip, pip3 and pip-3.5 could all point to the same command. Clearly it > wouldn't make sense to do that for youtube-dl. > > I'm not sure about 'tool' as a name for this category, but they often do > require different handling to general applications. Point taken, but in the absence of a behavioural difference, why not let the author decide? If I wrote "grep in Python", I'd call it a tool, not an application. The author of pyline (https://pypi.python.org/pypi/pyline) describes it as a "tool". For me, command line utilities are typically called tools. Applications tend to have (G)UIs. I don't think we should repurpose existing terms. And unless we're planning on enforcing different behaviour, I don't think we need to try to dictate at all. If we were to add a facility to create versioned names (rather than just having a special-case hack for pip) then I could imagine restricting it to certain package types - although I can't imagine why we would bother doing so - but let's not worry about that until it happens. Or maybe we'd want to insist that pip only allow build tools to have a certain package type (setuptools, flit, ...) but again, why bother? What's the gain? Paul From p.f.moore at gmail.com Thu Feb 23 10:27:01 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 15:27:01 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 15:09, Nick Coghlan wrote: > The difference is that: > > * tool = you typically want at least one copy per Python interpreter (like a > library) > * application = you typically only want one copy per system > > It may be clearer to make the former category "devtool", since it really is > specific to tools that are coupled to the task of Python development. Ah, OK. That's a good distinction, but I'd avoid linking it to "used for developing Python code". I wouldn't call pyline something used for developing Python code, although you'd want to install it to the (possibly multiple) Python versions you want to use in your one-liners. OTOH, I'd agree you want copies of Jupyter per interpreter, although I'd call Jupyter an application, not a development tool. There's a lot of people who would view Jupyter as an application with a built in Python interpreter rather than the other way around. And do you want to say that Jupyter cannot pin dependencies because it's a "tool" rather than an "application"? Maybe we should keep the package type neutral on this question, and add a separate field to denote one per system vs one per interpreter? But again, without proposed behaviour tied to the value, I'm inclined not to care. (And not to add metadata that no-one will bother using). Paul From donald at stufft.io Thu Feb 23 10:46:06 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 10:46:06 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > On Feb 23, 2017, at 6:49 AM, Nathaniel Smith wrote: > > (Here's an example I've just run into that involves a == dependency on > a public package: I have a library that needs to access some C API > calls on Windows, but not on other platforms. The natural way to do > this is to split out the CFFI code into its own package, > _mylib_windows_helper or whatever, that has zero public interface, and > have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == > 'nt'". That way I can distribute one pure-Python wheel + one binary > wheel and everything just works. But there's no sense in which this is > an "integrated application" or anything, it's just a single library > that usually ships in one .whl but sometimes ships in 2 .whls.) > > ((In actual fact I'm currently not building the package this way > because setuptools makes it extremely painful to actually maintain > that setup. Really I need the ability to build two wheels out of a > single source package. Since we don't have that, I'm instead using > CFFI's slow and semi-deprecated ABI mode, which lets me call C > functions from a pure Python package. But what I described above is > really the "right" solution, it's just tooling limitations that make > it painful.)) Another way of handling this is to just publish a universal wheel and a Windows binary wheel. Pip will select the more specific one (the binary one) over the universal wheel when it is available. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 11:04:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2017 02:04:26 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 24 February 2017 at 01:27, Paul Moore wrote: > On 23 February 2017 at 15:09, Nick Coghlan wrote: > > The difference is that: > > > > * tool = you typically want at least one copy per Python interpreter > (like a > > library) > > * application = you typically only want one copy per system > > > > It may be clearer to make the former category "devtool", since it really > is > > specific to tools that are coupled to the task of Python development. > > Ah, OK. That's a good distinction, but I'd avoid linking it to "used > for developing Python code". I wouldn't call pyline something used for > developing Python code, although you'd want to install it to the > (possibly multiple) Python versions you want to use in your > one-liners. OTOH, I'd agree you want copies of Jupyter per > interpreter, although I'd call Jupyter an application, not a > development tool. There's a lot of people who would view Jupyter as an > application with a built in Python interpreter rather than the other > way around. And do you want to say that Jupyter cannot pin > dependencies because it's a "tool" rather than an "application"? > It provides a frame for a discussion between publishers and redistributors on how publishers would like their software to be treated. Marking it as an application is saying "Treat it as a standalone application, and don't try to integrate it with anything else" Marking it as a library is saying "Treat it as a Python component that expects to be integrated into a larger application" Marking it as a metapackage is saying "Treat this particular set of libraries as a coherent whole, and don't try to mix-and-match other versions" Marking it as a devtool is saying "This doesn't export a stable Python API (except maybe to plugins), but you should treat it as a library anyway" Redistributors may *ask* a publisher to reclassify their project as a library or a devtool (and hence also avoid pinning their dependencies in order to make integration easier), but publishers will always have the option of saying "No, we want to you to treat it as an application, and we won't help your end users if we know you're overriding our pinned dependencies and the issue can't be reproduced outside your custom configuration". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Feb 23 12:14:42 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 12:14:42 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > On Feb 23, 2017, at 11:04 AM, Nick Coghlan wrote: > > Redistributors may *ask* a publisher to reclassify their project as a library or a devtool (and hence also avoid pinning their dependencies in order to make integration easier), but publishers will always have the option of saying "No, we want to you to treat it as an application, and we won't help your end users if we know you're overriding our pinned dependencies and the issue can't be reproduced outside your custom configuration". This whole discussion feels like trying to overcomplicate something that?s already not a simple to solve a problem that I don?t think is really that widespread. My estimation is that 99% of people who are currently using ``==`` will just immediately switch over to using whatever flag we provide that allows them to still do that. Adding a ?do the thing I asked for? detritus to the project seems like a bad idea. It?s not really any different than if a project say, only released Wheels. While we want to encourage projects to release sdists (and to not ping versions) trying to enforce that isn?t worth the cost. Like most packaging issues, I think that it?s best solved by opening up issues on the offending project?s issue tracker. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Thu Feb 23 12:41:58 2017 From: dholth at gmail.com (Daniel Holth) Date: Thu, 23 Feb 2017 17:41:58 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: Another way to look at the problem is that it is just too hard to override what the package says. For example in buildout you can provide a patch for any package that does not do exactly what you want, and it is applied during installation. This could include patching the dependencies. On Thu, Feb 23, 2017 at 12:15 PM Donald Stufft wrote: On Feb 23, 2017, at 11:04 AM, Nick Coghlan wrote: Redistributors may *ask* a publisher to reclassify their project as a library or a devtool (and hence also avoid pinning their dependencies in order to make integration easier), but publishers will always have the option of saying "No, we want to you to treat it as an application, and we won't help your end users if we know you're overriding our pinned dependencies and the issue can't be reproduced outside your custom configuration". This whole discussion feels like trying to overcomplicate something that?s already not a simple to solve a problem that I don?t think is really that widespread. My estimation is that 99% of people who are currently using ``==`` will just immediately switch over to using whatever flag we provide that allows them to still do that. Adding a ?do the thing I asked for? detritus to the project seems like a bad idea. It?s not really any different than if a project say, only released Wheels. While we want to encourage projects to release sdists (and to not ping versions) trying to enforce that isn?t worth the cost. Like most packaging issues, I think that it?s best solved by opening up issues on the offending project?s issue tracker. ? Donald Stufft _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Thu Feb 23 13:19:20 2017 From: steve.dower at python.org (Steve Dower) Date: Thu, 23 Feb 2017 10:19:20 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> Message-ID: <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> On 23Feb2017 0914, Donald Stufft wrote: > >> On Feb 23, 2017, at 11:04 AM, Nick Coghlan > > wrote: >> >> Redistributors may *ask* a publisher to reclassify their project as a >> library or a devtool (and hence also avoid pinning their dependencies >> in order to make integration easier), but publishers will always have >> the option of saying "No, we want to you to treat it as an >> application, and we won't help your end users if we know you're >> overriding our pinned dependencies and the issue can't be reproduced >> outside your custom configuration". > > > This whole discussion feels like trying to overcomplicate something > that?s already not a simple to solve a problem that I don?t think is > really that widespread. My estimation is that 99% of people who are > currently using ``==`` will just immediately switch over to using > whatever flag we provide that allows them to still do that. Adding a ?do > the thing I asked for? detritus to the project seems like a bad idea. > > It?s not really any different than if a project say, only released > Wheels. While we want to encourage projects to release sdists (and to > not ping versions) trying to enforce that isn?t worth the cost. Like > most packaging issues, I think that it?s best solved by opening up > issues on the offending project?s issue tracker. +1. This has been my feeling the entire time I spent catching up on the thread just now. As soon as "user education" becomes a requirement, we may as well do the simplest and least restrictive metadata possible and use the education to help people understand the impact of their decisions. Cheers, Steve From xav.fernandez at gmail.com Thu Feb 23 14:56:20 2017 From: xav.fernandez at gmail.com (Xavier Fernandez) Date: Thu, 23 Feb 2017 20:56:20 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: +1 also. This whole double requirement feels over-complicated for what seems like a rather small usecase: it would be interesting to have a few stats on the number of packages concerned by this pinning (maybe just scan all the last uploaded wheels of each package ?). And if one needs to classify packages type, why not add a new high level trove classifier ? Le 23 f?vr. 2017 19:19, "Steve Dower" a ?crit : On 23Feb2017 0914, Donald Stufft wrote: > > On Feb 23, 2017, at 11:04 AM, Nick Coghlan > > wrote: >> >> Redistributors may *ask* a publisher to reclassify their project as a >> library or a devtool (and hence also avoid pinning their dependencies >> in order to make integration easier), but publishers will always have >> the option of saying "No, we want to you to treat it as an >> application, and we won't help your end users if we know you're >> overriding our pinned dependencies and the issue can't be reproduced >> outside your custom configuration". >> > > > This whole discussion feels like trying to overcomplicate something > that?s already not a simple to solve a problem that I don?t think is > really that widespread. My estimation is that 99% of people who are > currently using ``==`` will just immediately switch over to using > whatever flag we provide that allows them to still do that. Adding a ?do > the thing I asked for? detritus to the project seems like a bad idea. > > It?s not really any different than if a project say, only released > Wheels. While we want to encourage projects to release sdists (and to > not ping versions) trying to enforce that isn?t worth the cost. Like > most packaging issues, I think that it?s best solved by opening up > issues on the offending project?s issue tracker. > +1. This has been my feeling the entire time I spent catching up on the thread just now. As soon as "user education" becomes a requirement, we may as well do the simplest and least restrictive metadata possible and use the education to help people understand the impact of their decisions. Cheers, Steve _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Feb 23 15:42:59 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 23 Feb 2017 12:42:59 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Feb 23, 2017 7:46 AM, "Donald Stufft" wrote: On Feb 23, 2017, at 6:49 AM, Nathaniel Smith wrote: (Here's an example I've just run into that involves a == dependency on a public package: I have a library that needs to access some C API calls on Windows, but not on other platforms. The natural way to do this is to split out the CFFI code into its own package, _mylib_windows_helper or whatever, that has zero public interface, and have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == 'nt'". That way I can distribute one pure-Python wheel + one binary wheel and everything just works. But there's no sense in which this is an "integrated application" or anything, it's just a single library that usually ships in one .whl but sometimes ships in 2 .whls.) ((In actual fact I'm currently not building the package this way because setuptools makes it extremely painful to actually maintain that setup. Really I need the ability to build two wheels out of a single source package. Since we don't have that, I'm instead using CFFI's slow and semi-deprecated ABI mode, which lets me call C functions from a pure Python package. But what I described above is really the "right" solution, it's just tooling limitations that make it painful.)) Another way of handling this is to just publish a universal wheel and a Windows binary wheel. Pip will select the more specific one (the binary one) over the universal wheel when it is available. Thanks, I was wondering about that :-). Still, I don't really like this solution in this case, because if someone did install the universal wheel on Windows it would be totally broken, yet there'd be no metadata to warn them. (This is a case where the binary isn't just an acceleration module, but is providing crucial functionality.) Even if pip wouldn't do this automatically, it's easy to imagine cases where it would happen. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthony at xtfx.me Thu Feb 23 16:51:29 2017 From: anthony at xtfx.me (C Anthony Risinger) Date: Thu, 23 Feb 2017 15:51:29 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: On Thu, Feb 23, 2017 at 1:56 PM, Xavier Fernandez wrote: > +1 also. > This whole double requirement feels over-complicated for what seems like a > rather small usecase: it would be interesting to have a few stats on the > number of packages concerned by this pinning (maybe just scan all the last > uploaded wheels of each package ?). > FWIW, an application packaging tool a wrote several years ago used to hit into dependency solver problems quite often. The tool leaned on distlib (which is a pretty nice library, but strict, as noted in OP) because there is/was no interface to pip. IIRC we upstreamed a few patches related to this and for sure carried some local patches. The distlib solver would bind up from impossible constraints, yet every time, pip found a way to "power through" the exact same configuration despite blatantly incompatible metadata at times. I never looked into it further on pip's side (though probably someone here can confirm/deny this) but I suspect poor metadata is more widespread than pip makes visible. I had a dump from 2014 of the distlib data at red-dove.com and I ran a quick script against it: https://gist.github.com/anthonyrisinger/f9140191009fb1ec1434cb0585a4a75c total_projects: 41228 total_projects_eq: 182 % affected: 0.44% total_files: 285248 total_files_eq: 1276 % affected: 0.45% total_reqs: 642447 total_reqs_bare: 460080 % affected: 71.61% I know the distlib data (from 2014 at least) is imperfect, but this would suggest not many projects use "==" explicitly. Maybe the bigger problem is that 75% of requirements have no version specifier at all. I know for us this specifically contributed to our solver problems because distlib was eager about choosing a version for such requirements, even though a later package might fulfill the requirement. Maybe this has since changed, but we needed to patch it at the time [1]. We really have to figure out this distribution stuff friends. Existing files, new metadata files, old PEPs, new PEPs... it's looking a bit "broken windows theory" for the principal method used to share Python with the world, and the outsized lens though which Python is perceived. Maybe this means hard or opinionated decisions, but I really can't stress enough how much of a drag it is to an otherwise reasonably solid Python experience. There is a real perception that it's more trouble than it's worth, especially with many other good options at the table. [1] https://github.com/anthonyrisinger/zippy/commit/1c5d34d89805c47188a18cfbe17cfc39a9cb4480#diff-a533aaf4eec84e7c5d85d2129e10514fR1168 -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Feb 23 17:21:28 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 17:21:28 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> > On Feb 23, 2017, at 4:51 PM, C Anthony Risinger wrote: > > The distlib solver would bind up from impossible constraints, yet every time, pip found a way to "power through" the exact same configuration despite blatantly incompatible metadata at times. I never looked into it further on pip's side (though probably someone here can confirm/deny this) but I suspect poor metadata is more widespread than pip makes visible. <1% of projects or files using == suggests to me that there is very few people using == incorrectly. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthony at xtfx.me Thu Feb 23 17:31:37 2017 From: anthony at xtfx.me (C Anthony Risinger) Date: Thu, 23 Feb 2017 16:31:37 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> Message-ID: On Thu, Feb 23, 2017 at 4:21 PM, Donald Stufft wrote: > > On Feb 23, 2017, at 4:51 PM, C Anthony Risinger wrote: > > The distlib solver would bind up from impossible constraints, yet every > time, pip found a way to "power through" the exact same configuration > despite blatantly incompatible metadata at times. I never looked into it > further on pip's side (though probably someone here can confirm/deny this) > but I suspect poor metadata is more widespread than pip makes visible. > > > <1% of projects or files using == suggests to me that there is very few > people using == incorrectly. > Yeah I'm pretty sure the bigger problem was version-less reqs eagerly selecting a version (eg. latest) incompatible with later requirements provided by a different package, but then treating them as hard reqs by that point. I'll defer to you on how pip deals with things today. I'll try to resurface a concrete example. I know for certain pip at that time (circa 2015) was capable of installing a set of packages where the dependency information was not solvent, because I pointed it out to my team (I actually think python-dateutil was involved for that one, mentioned in another post). I would agree though, "==" is way way less widespread than no version at all. -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Feb 23 17:32:51 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 17:32:51 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> Message-ID: <80EEA344-23DD-4DB6-A19C-7932197867FC@stufft.io> > On Feb 23, 2017, at 5:31 PM, C Anthony Risinger wrote: > > Yeah I'm pretty sure the bigger problem was version-less reqs eagerly selecting a version (eg. latest) incompatible with later requirements provided by a different package, but then treating them as hard reqs by that point. I'll defer to you on how pip deals with things today. > > I'll try to resurface a concrete example. I know for certain pip at that time (circa 2015) was capable of installing a set of packages where the dependency information was not solvent, because I pointed it out to my team (I actually think python-dateutil was involved for that one, mentioned in another post). > Yea, pip doesn?t really have a dep solver. It?s mechanism for selecting which version to install is? not smart. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Feb 23 21:28:05 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 23 Feb 2017 20:28:05 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <80EEA344-23DD-4DB6-A19C-7932197867FC@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> <80EEA344-23DD-4DB6-A19C-7932197867FC@stufft.io> Message-ID: On Thursday, February 23, 2017, Donald Stufft wrote: > > On Feb 23, 2017, at 5:31 PM, C Anthony Risinger > wrote: > > Yeah I'm pretty sure the bigger problem was version-less reqs eagerly > selecting a version (eg. latest) incompatible with later requirements > provided by a different package, but then treating them as hard reqs by > that point. I'll defer to you on how pip deals with things today. > > I'll try to resurface a concrete example. I know for certain pip at that > time (circa 2015) was capable of installing a set of packages where the > dependency information was not solvent, because I pointed it out to my team > (I actually think python-dateutil was involved for that one, mentioned in > another post). > > > > Yea, pip doesn?t really have a dep solver. It?s mechanism for selecting > which version to install is? not smart. > "Pip needs a dependency resolver" https://github.com/pypa/pip/issues/988 - {Enthought, Conda,}: SAT solver (there are many solutions) - easy_install: - pip: > > ? > Donald Stufft > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 23:30:45 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2017 14:30:45 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> Message-ID: On 24 February 2017 at 08:21, Donald Stufft wrote: > > On Feb 23, 2017, at 4:51 PM, C Anthony Risinger wrote: > > The distlib solver would bind up from impossible constraints, yet every > time, pip found a way to "power through" the exact same configuration > despite blatantly incompatible metadata at times. I never looked into it > further on pip's side (though probably someone here can confirm/deny this) > but I suspect poor metadata is more widespread than pip makes visible. > > > <1% of projects or files using == suggests to me that there is very few > people using == incorrectly. > And if it does become a more notable problem in the future then a metadata independent way of dealing with it would be to add a warning to twine suggesting replacing "==" with "~=" (as well as an off switch to say "Don't bug me about that"). So I think the upshot of all this is that the entire semantic dependency structure in PEP 426 should be simplified to: 1. A single "dependencies" list that allows any PEP 508 dependency specifier 2. A MAY clause permitting tools to warn about the use of `==` and `===` 3. A MAY clause permitting tools to prohibit the use of direct references 4. A conventional set of "extras" with pre-defined semantics ("build", "dev", "doc", "test") That gives us an approach that's entirely compatible with the current 1.x metadata formats (so existing tools will still work), while also moving us closer to a point where PEP 426 could actually be accepted. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Feb 24 01:11:48 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 24 Feb 2017 00:11:48 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: On Thursday, February 23, 2017, Xavier Fernandez wrote: > +1 also. > This whole double requirement feels over-complicated for what seems like a > rather small usecase: it would be interesting to have a few stats on the > number of packages concerned by this pinning (maybe just scan all the last > uploaded wheels of each package ?). > > > And if one needs to classify packages type, why not add a new high level > trove classifier ? > +1 This could be accomplished with a trove classifier (because Entity Attribute boolean-Value) The component/library, application, metapackage categorical would require far more docs than: pip install --ignore-versions metapkgname Which is effectively, probably, maybe the same? as: pip install metapkg pip install --upgrade __ALL__ ... say, given that metapkgname requires (install_requires) ipython, and the requirements.txt is: metapkgname # ipython==4.2 ipython If pip freeze returns: ipython metapkgname And I then: pip freeze -- | xargs pip install --upgrade Haven't I then upgraded ipython past the metapackage pinned version, anyway? http://stackoverflow.com/questions/2720014/upgrading-all-packages-with-pip The best workaround that I'm aware of: - Create integration test and then build scripts - Run test/build script in a container - Change dependencies, Commit, Create a PR, (e.g. Travis CI runs the test/build/report/post script), Review integration test output What integration tests do the RPM/DNF package maintainers run for, say, django, simplejson, [and psycopg2, for django-ca]? If others have already integration-tested even a partially overlapping set, that's great and it would be great to be able to store, share, and search those build artifacts (logs, pass/fail). Additionally, practically, could we add metadata pointing to zero or more OS packages, per-distribution? How do I know that there's probably a somewhat-delayed repackaging named "python-ipython" which *might* work with the rest of the bleeding edge trunk builds I consider as stable as yesterday, given which tests? > > Le 23 f?vr. 2017 19:19, "Steve Dower" > a ?crit : > > On 23Feb2017 0914, Donald Stufft wrote: > >> >> On Feb 23, 2017, at 11:04 AM, Nick Coghlan >> >>> >> >> wrote: >>> >>> Redistributors may *ask* a publisher to reclassify their project as a >>> library or a devtool (and hence also avoid pinning their dependencies >>> in order to make integration easier), but publishers will always have >>> the option of saying "No, we want to you to treat it as an >>> application, and we won't help your end users if we know you're >>> overriding our pinned dependencies and the issue can't be reproduced >>> outside your custom configuration". >>> >> >> >> This whole discussion feels like trying to overcomplicate something >> that?s already not a simple to solve a problem that I don?t think is >> really that widespread. My estimation is that 99% of people who are >> currently using ``==`` will just immediately switch over to using >> whatever flag we provide that allows them to still do that. Adding a ?do >> the thing I asked for? detritus to the project seems like a bad idea. >> >> It?s not really any different than if a project say, only released >> Wheels. While we want to encourage projects to release sdists (and to >> not ping versions) trying to enforce that isn?t worth the cost. Like >> most packaging issues, I think that it?s best solved by opening up >> issues on the offending project?s issue tracker. >> > > +1. This has been my feeling the entire time I spent catching up on the > thread just now. > > As soon as "user education" becomes a requirement, we may as well do the > simplest and least restrictive metadata possible and use the education to > help people understand the impact of their decisions. > > Cheers, > Steve > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > > https://mail.python.org/mailman/listinfo/distutils-sig > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lele at metapensiero.it Fri Feb 24 06:37:17 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Fri, 24 Feb 2017 12:37:17 +0100 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore Message-ID: <87d1e73jma.fsf@metapensiero.it> Hi all, I have a installation setup of an application of mine that uses zc.buildout (see #1). Today I had to reinstall it on a new machine, using Python 3.6. Executing its bootstrap.py installed latest zc.buildout (2.8.0). All the required packages are pinned to an exact version (see #2), so it surprised me to hit the following error executing the buildout: Version and requirements information containing transaction: [versions] constraint on transaction: 1.4.4 Requirement of SoL==3.37: transaction Requirement of zope.sqlalchemy: transaction Requirement of transaction: zope.interface Requirement of pyramid_tm: transaction>=2.0 While: Installing sol. Error: The requirement ('transaction>=2.0') is not allowed by your [versions] constraint (1.4.4) Investigating the issue, I found that buildout was actually installing ?pyramid-tm? (with a dash), not ?pyramid_tm? (underscore, the only one present on PyPI, see #3): ... Getting distribution for 'pyramid_tm'. Got pyramid-tm 1.1.1. ... And that of course is the source of the problem: hacking a local copy of the versions.cfg adding the following line: pyramid-tm = 0.12.1 and then adjusting the buildout.cfg to load that local copy fixed the problem. So the question is: what is the real nature of the problem? I downloaded current pyramid_tm sources, and there is no trace of a ?pyramid-tm? except in the project's URL: $ git grep pyramid-tm setup.py: url="http://docs.pylonsproject.org/projects/pyramid-tm/en/latest/", but effectively when I dig inside the egg that buildout produced I can find the following: $ grep -r pyramid-tm * EGG-INFO/PKG-INFO:Name: pyramid-tm EGG-INFO/PKG-INFO:Home-page: http://docs.pylonsproject.org/projects/pyramid-tm/en/latest/ The differences since my last install-from-scratch of the application (that worked flawlessly) are basically version 3.6 of Python and version 2.8.0 of zc.buildout. Can anyone shed some light on the problem? By what logic buildout used a different name for that particular package? Thank you in advance, ciao, lele. #1 https://bitbucket.org/lele/solista #2 https://bitbucket.org/lele/sol/raw/master/requirements/versions.cfg #3 https://pypi.python.org/pypi/pyramid_tm -- nickname: Lele Gaifax | Quando vivr? di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From p.f.moore at gmail.com Fri Feb 24 06:43:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 24 Feb 2017 11:43:44 +0000 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore In-Reply-To: <87d1e73jma.fsf@metapensiero.it> References: <87d1e73jma.fsf@metapensiero.it> Message-ID: On 24 February 2017 at 11:37, Lele Gaifax wrote: > Can anyone shed some light on the problem? By what logic buildout used a > different name for that particular package? While I don't know anything about buildout, pyramid-tm is the normalised version of pyramid_tm - see https://www.python.org/dev/peps/pep-0503/#normalized-names Your two setups may be behaving differently regarding normalisation (I'd suspect the older version wasn't normalising correctly - it's a relatively new change). Paul From lele at metapensiero.it Fri Feb 24 06:54:58 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Fri, 24 Feb 2017 12:54:58 +0100 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore References: <87d1e73jma.fsf@metapensiero.it> Message-ID: <878tov3ist.fsf@metapensiero.it> Paul Moore writes: > On 24 February 2017 at 11:37, Lele Gaifax wrote: >> Can anyone shed some light on the problem? By what logic buildout used a >> different name for that particular package? > > While I don't know anything about buildout, pyramid-tm is the > normalised version of pyramid_tm - see > https://www.python.org/dev/peps/pep-0503/#normalized-names Oh, I see, thank you. Does that mean that the right thing I should do is always using such normalized names in my requirements.txt/versions.cfg? ciao, lele. -- nickname: Lele Gaifax | Quando vivr? di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From p.f.moore at gmail.com Fri Feb 24 06:59:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 24 Feb 2017 11:59:44 +0000 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore In-Reply-To: <878tov3ist.fsf@metapensiero.it> References: <87d1e73jma.fsf@metapensiero.it> <878tov3ist.fsf@metapensiero.it> Message-ID: On 24 February 2017 at 11:54, Lele Gaifax wrote: > Paul Moore writes: > >> On 24 February 2017 at 11:37, Lele Gaifax wrote: >>> Can anyone shed some light on the problem? By what logic buildout used a >>> different name for that particular package? >> >> While I don't know anything about buildout, pyramid-tm is the >> normalised version of pyramid_tm - see >> https://www.python.org/dev/peps/pep-0503/#normalized-names > > Oh, I see, thank you. Does that mean that the right thing I should do is > always using such normalized names in my requirements.txt/versions.cfg? I *think* it shouldn't matter. The problem will likely be with older tools not normalising. So using normalised names throughout might help such tools. Paul From jim at jimfulton.info Fri Feb 24 08:13:24 2017 From: jim at jimfulton.info (Jim Fulton) Date: Fri, 24 Feb 2017 08:13:24 -0500 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore In-Reply-To: References: <87d1e73jma.fsf@metapensiero.it> <878tov3ist.fsf@metapensiero.it> Message-ID: On Fri, Feb 24, 2017 at 6:59 AM, Paul Moore wrote: > On 24 February 2017 at 11:54, Lele Gaifax wrote: > > Paul Moore writes: > > > >> On 24 February 2017 at 11:37, Lele Gaifax wrote: > >>> Can anyone shed some light on the problem? By what logic buildout used > a > >>> different name for that particular package? > >> > >> While I don't know anything about buildout, pyramid-tm is the > >> normalised version of pyramid_tm - see > >> https://www.python.org/dev/peps/pep-0503/#normalized-names > > > > Oh, I see, thank you. Does that mean that the right thing I should do is > > always using such normalized names in my requirements.txt/versions.cfg? > > I *think* it shouldn't matter. The problem will likely be with older > tools not normalising. So using normalised names throughout might help > such tools. > Thanks Paul. Yes, this is a buildout bug: https://github.com/buildout/buildout/issues/317 This case shed the light on the bug for me. Thanks. Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From mekulisnicole at gmail.com Mon Feb 27 06:35:02 2017 From: mekulisnicole at gmail.com (blah whoops) Date: Mon, 27 Feb 2017 19:35:02 +0800 Subject: [Distutils] (no subject) Message-ID: hye do u have facebook.py?i hv just downloaded python 2.7..n its seems its unavailable thx nic -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Tue Feb 28 10:14:13 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Tue, 28 Feb 2017 15:14:13 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver Message-ID: Hello Everyone! Google released the list of accepted organizations for GSoC 2017 and PSF is one of them. I guess this would a good time for me to seek feedback on the approach I'm planning to take for my potential GSoC project. I hope this mailing list is the right place to do so. --- Here's my current plan of action along with reasoning for the choices made: A separate PR will be made for each of these stages. Every stage does depend on the previous ones being completed. 1. Refactor all dependency resolution responsibility in pip into a new, separate module. This would allow any future changes/improvements in the dependency resolution to be added without major changes in the rest of the code-base. As of today, the RequirementSet class within pip seems to be doing a lot of work and dependency resolution is a responsibility that doesn't need to given to it, especially when it's avoidable. 2. Implement dependency information caching. This would allow the resolver to not cause the re-computation of the dependencies of a package, if they have already been computed, speeding up the resolution. 3. Implement a backtracking resolver. A backtracking solver would be appropriate given that we don't have a way to pre-compute the dependencies for *all* the packages or statically determine the dependencies - a SAT solver would not be feasible. 4. (if time permits) Move any dependency resolution code out into a separate library. This would make it possible for other projects (like buildout or a future pip replacement) to reuse the dependency resolver. By making each of the stages separate PRs, incremental improvements would be made so that even if I leave this project midway, there will be some work merged already if someone comes back to this problem later. That said, I don't intend to leave this project midway. I do intend to reuse some of the work done by Robert Collins in PR #2716 on pip's GitHub repository. Stages 2 and 3 are separate because I see them as distinctly different tasks which touch very different portions of the code-base. There's is strong coupling between them though. I'm looking forward to the feedback. :) Regards, Pradyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Tue Feb 28 10:48:09 2017 From: jim at jimfulton.info (Jim Fulton) Date: Tue, 28 Feb 2017 10:48:09 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Tue, Feb 28, 2017 at 10:14 AM, Pradyun Gedam wrote: ... > 4. (if time permits) Move any dependency resolution code out into a > separate library. > > This would make it possible for other projects (like buildout or a > future pip replacement) to reuse the dependency resolver. > Thank you! ... I do intend to reuse some of the work done by Robert Collins in PR #2716 on > pip's GitHub repository. > Are you aware of the proof of concept in distlib? https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: