From chris.barker at noaa.gov Wed Feb 1 11:07:15 2017
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 1 Feb 2017 08:07:15 -0800
Subject: [Distutils] install questions and help requested. ---pyautogui
In-Reply-To:
References:
Message-ID: <8960273004026282209@unknownmsgid>

This is really a list for discussing development of distribution tools, rather than help on basic usage. But:

> >>> pip install pyautogui
> SyntaxError: invalid syntax

This looks like you are trying to run pip at the Python prompt. Pip is designed to be run at a system command line ("DOS prompt"). Try the same command there.

-CHB

> Sincerely,
> Michael G. Strain Jr.
>
> _______________________________________________
> Distutils-SIG maillist - Distutils-SIG at python.org
> https://mail.python.org/mailman/listinfo/distutils-sig

From ncoghlan at gmail.com Mon Feb 6 06:17:05 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 6 Feb 2017 12:17:05 +0100
Subject: [Distutils] Adding the "Description-Content-Type" metadata field
Message-ID:

Hi folks,

Marc Abramowitz has prepared a PR for the Core Metadata section of the specifications page [1] that adds a new "Description-Content-Type" field: https://github.com/pypa/python-packaging-user-guide/pull/258

The draft text has now reached the point where I'm prepared to accept it, so this thread offers folks one last chance to provide feedback before we make it official.

Full text of the new subsection
=========================================

Description-Content-Type
~~~~~~~~~~~~~~~~~~~~~~~~

A string containing the format of the distribution's description, so that tools can intelligently render the description.

Historically, PyPI supported descriptions in plain text and `reStructuredText (reST) `_, and could render reST into HTML. However, it is common for distribution authors to write the description in `Markdown `_ (`RFC 7763 `_) as many code hosting sites render Markdown READMEs, and authors would reuse the file for the description. PyPI didn't recognize the format and so could not render the description correctly. This resulted in many packages on PyPI with poorly rendered descriptions, because the Markdown was either left as plain text or, worse, rendered as though it were reST. This field allows the distribution author to specify the format of their description, opening up the possibility for PyPI and other tools to be able to render Markdown and other formats.

The format of this field is the same as the ``Content-Type`` header in HTTP (e.g.: `RFC 1341 `_). Briefly, this means that it has a ``type/subtype`` part and then it can optionally have a number of parameters:

Format::

    Description-Content-Type: <type>/<subtype>; charset=<charset>[; <param_name>=<param_value> ...]

The ``type/subtype`` part has only a few legal values:

- ``text/plain``
- ``text/x-rst``
- ``text/markdown``

The ``charset`` parameter can be used to specify whether the character set in use is UTF-8, ASCII, etc. If ``charset`` is not provided, then it is recommended that the implementation (e.g.: PyPI) treat the content as UTF-8. Other parameters might be specific to the chosen subtype.
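As an aside, and not part of the draft text above: values in this Content-Type style can be split into the ``type/subtype`` part and their parameters with nothing more than the standard library, so consuming tools should not need a custom parser. A minimal sketch::

    >>> from cgi import parse_header  # stdlib helper for Content-Type style values
    >>> parse_header("text/markdown; charset=UTF-8; variant=GFM")
    ('text/markdown', {'charset': 'UTF-8', 'variant': 'GFM'})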
For example, for the ``markdown`` subtype, there is a ``variant`` parameter that allows specifying the variant of Markdown in use, such as: - ``CommonMark`` for `CommonMark` `_ - ``GFM`` for `GitHub Flavored Markdown (GFM) `_ - ``Original`` for `Gruber's original Markdown syntax `_ Example:: Description-Content-Type: text/plain; charset=UTF-8 Example:: Description-Content-Type: text/x-rst; charset=UTF-8 Example:: Description-Content-Type: text/markdown; charset=UTF-8; variant=CommonMark Example:: Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM Example:: Description-Content-Type: text/markdown; charset=UTF-8; variant=Original If a ``Description-Content-Type`` is not specified or it's set to an unrecognized value, then the assumed content type is ``text/x-rst; charset=UTF-8``. If the ``charset`` is not specified or it's set to an unrecognized value, then the assumed ``charset`` is ``UTF-8``. If the subtype is ``markdown`` and ``variant`` is not specified or it's set to an unrecognized value, then the assumed ``variant`` is ``CommonMark``. ========================================= [1] https://packaging.python.org/specifications/#core-metadata Regards, Nick. P.S. I know I still need to update https://www.pypa.io/en/latest/specifications/ to reflect the ability to make small backwards compatible adjustments to the specifications without a PEP, so I'll get that sorted today, since I've been talking about it for approximately forever. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Feb 6 07:14:04 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 Feb 2017 13:14:04 +0100 Subject: [Distutils] pypa.io PR to document the actual current spec update process Message-ID: Hi folks, The "Specifications" section in the pypa.io developer's manual had fallen behind the process we've actually been using in recent times, so I've finally submitted a PR to bring it up to date: https://github.com/pypa/pypa.io/pull/19 General questions about the change are best asked on the list, while detailed comments on the specific wording in the PR are best submitted through GitHub. Cheers, Nick. P.S. See https://github.com/pypa/pypa.io/issues/11 for some additional background -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vmittal05 at gmail.com Mon Feb 6 14:27:49 2017 From: vmittal05 at gmail.com (varun mittal) Date: Tue, 7 Feb 2017 00:57:49 +0530 Subject: [Distutils] bdist_deb always creates 'all' architecture package for me Message-ID: Hi all I am totally new to debian package building. Need to create a deb package from source, for Ubuntu. The package would contain mostly python code and a singular C file. My control file in 'debian' directory reads 'any' for Architecture. But running bdist_deb always creates _all.deb package. How to control that ? I tried forcing it to 'amd64' too, but didn't succeed Thanks n regards Mittal -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Tue Feb 7 06:29:30 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Tue, 07 Feb 2017 11:29:30 +0000 Subject: [Distutils] Indexing modules in Python distributions Message-ID: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> For a variety of reasons, I would like to build an index of what modules/packages are contained in which distributions ('packages') on PyPI. 
For instance: - Identifying requirements by static analysis of code: 'import zmq' -> requires pyzmq - Finding corresponding packages from different packaging systems: pyzmq on PyPI corresponds to pyzmq in conda, and python[3]-zmq in Debian repositories. This is an oversimplification, but importable module names provide a common basis to compare packages. I'd like a tool that could pick between different ways of installing a given module. People often assume that the import name is the same as the name on PyPI. This is true in the vast majority of cases, but there's no requirement that they are the same, and there are cases where they're not - pyzmq is one example. The metadata field 'Provides' is, according to PEP 314, intended for this purpose, but the standard packaging tools don't make it easy to use, and consequently very few packages specify it. I have started putting together a tool to index wheels. It reads a .whl file, finds modules inside it, and tries to identify namespace packages. It's still quite rough, but it worked with the wheels I tried. https://github.com/takluyver/wheeldex Is this something that other people are interested in? One thing I'm trying to work out at the moment is how the data would be accessed: as a web service that tools can query online, or more like Linux packaging, where tools download and cache a list to do lookups locally. Or both? There's also, of course, the question of how the index would be built and updated. Thanks, Thomas From steve.dower at python.org Tue Feb 7 09:38:46 2017 From: steve.dower at python.org (Steve Dower) Date: Tue, 7 Feb 2017 06:38:46 -0800 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> Message-ID: I'm interested, and potentially in a position to provide funded infrastructure for this (though perhaps not as soon as you'd like, since things can move slowly at my end). My personal preference would be to download a full list. This is slow moving data that will gzip nicely, and my uses (in IDE) will require many tentative queries. I can also see value in a single-query API, but keep it simple - the value here is in the data, not the lookup. As far as updates go, most packaging systems should have some sort of release notification or update feed, so the work is likely going to be in hooking up to those and turning it into a scan task. Cheers, Steve Top-posted from my Windows Phone -----Original Message----- From: "Thomas Kluyver" Sent: ?2/?7/?2017 3:30 To: "distutils-sig at python.org" Subject: [Distutils] Indexing modules in Python distributions For a variety of reasons, I would like to build an index of what modules/packages are contained in which distributions ('packages') on PyPI. For instance: - Identifying requirements by static analysis of code: 'import zmq' -> requires pyzmq - Finding corresponding packages from different packaging systems: pyzmq on PyPI corresponds to pyzmq in conda, and python[3]-zmq in Debian repositories. This is an oversimplification, but importable module names provide a common basis to compare packages. I'd like a tool that could pick between different ways of installing a given module. People often assume that the import name is the same as the name on PyPI. This is true in the vast majority of cases, but there's no requirement that they are the same, and there are cases where they're not - pyzmq is one example. 
The metadata field 'Provides' is, according to PEP 314, intended for this purpose, but the standard packaging tools don't make it easy to use, and consequently very few packages specify it. I have started putting together a tool to index wheels. It reads a .whl file, finds modules inside it, and tries to identify namespace packages. It's still quite rough, but it worked with the wheels I tried. https://github.com/takluyver/wheeldex Is this something that other people are interested in? One thing I'm trying to work out at the moment is how the data would be accessed: as a web service that tools can query online, or more like Linux packaging, where tools download and cache a list to do lookups locally. Or both? There's also, of course, the question of how the index would be built and updated. Thanks, Thomas _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From Christopher.Wilcox at microsoft.com Tue Feb 7 11:49:14 2017 From: Christopher.Wilcox at microsoft.com (Chris Wilcox) Date: Tue, 7 Feb 2017 16:49:14 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> Message-ID: Thanks for cc-ing me Steve. I may be able to help jump-start this a bit and provide a platform for this to run on. I deployed a small service that scans PyPI to figure out statistics on Python 2 vs Python 3 support using PyPI Classifiers. The source is on GitHub: https://github.com/crwilcox/PyPI-Gatherer. It watches the PyPI updates feed and refreshes entries for packages as they show up as modified. It should be possible to add your lib, query, and add an additional row or two to the result. I am happy to work together on this. Also, the data is stored in an Azure Table Storage which has rest endpoints (and a Python SDK) that makes getting the published data straight-forward. Here is an example of using the data provided by the service. This is a Jupyter Notebook analysing Python 3 Adoption: https://notebooks.azure.com/chris/libraries/pypidataanalysis Thanks. Chris From: Steve Dower [mailto:steve.dower at python.org] Sent: Tuesday, 7 February, 2017 6:39 To: Thomas Kluyver ; distutils-sig at python.org Cc: Chris Wilcox Subject: RE: [Distutils] Indexing modules in Python distributions I'm interested, and potentially in a position to provide funded infrastructure for this (though perhaps not as soon as you'd like, since things can move slowly at my end). My personal preference would be to download a full list. This is slow moving data that will gzip nicely, and my uses (in IDE) will require many tentative queries. I can also see value in a single-query API, but keep it simple - the value here is in the data, not the lookup. As far as updates go, most packaging systems should have some sort of release notification or update feed, so the work is likely going to be in hooking up to those and turning it into a scan task. Cheers, Steve Top-posted from my Windows Phone ________________________________ From: Thomas Kluyver Sent: ?2/?7/?2017 3:30 To: distutils-sig at python.org Subject: [Distutils] Indexing modules in Python distributions For a variety of reasons, I would like to build an index of what modules/packages are contained in which distributions ('packages') on PyPI. 
For instance: - Identifying requirements by static analysis of code: 'import zmq' -> requires pyzmq - Finding corresponding packages from different packaging systems: pyzmq on PyPI corresponds to pyzmq in conda, and python[3]-zmq in Debian repositories. This is an oversimplification, but importable module names provide a common basis to compare packages. I'd like a tool that could pick between different ways of installing a given module. People often assume that the import name is the same as the name on PyPI. This is true in the vast majority of cases, but there's no requirement that they are the same, and there are cases where they're not - pyzmq is one example. The metadata field 'Provides' is, according to PEP 314, intended for this purpose, but the standard packaging tools don't make it easy to use, and consequently very few packages specify it. I have started putting together a tool to index wheels. It reads a .whl file, finds modules inside it, and tries to identify namespace packages. It's still quite rough, but it worked with the wheels I tried. https://github.com/takluyver/wheeldex Is this something that other people are interested in? One thing I'm trying to work out at the moment is how the data would be accessed: as a web service that tools can query online, or more like Linux packaging, where tools download and cache a list to do lookups locally. Or both? There's also, of course, the question of how the index would be built and updated. Thanks, Thomas _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Wed Feb 8 13:14:38 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Wed, 08 Feb 2017 18:14:38 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> Message-ID: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Thanks Steve, Chris, On Tue, Feb 7, 2017, at 04:49 PM, Chris Wilcox wrote: > I may be able to help jump-start this a bit and provide a platform for > this to run on. I deployed a small service that scans PyPI to figure > out statistics on Python 2 vs Python 3 support using PyPI Classifiers. > The source is on GitHub: https://github.com/crwilcox/PyPI-Gatherer. > It watches the PyPI updates feed and refreshes entries for packages as > they show up as modified. It should be possible to add your lib, > query, and add an additional row or two to the result. I am happy to > work together on this. Also, the data is stored in an Azure Table > Storage which has rest endpoints (and a Python SDK) that makes getting > the published data straight-forward. I had a quick look through this, and it does look like it should provide a useful framework for scanning PyPI and updating the results. :-) What I'm proposing differs in that it would need to download files from PyPI - basically all of them, if we're thorough about it. I imagine that's going to involve a lot of data transfer. Do we know what order of magnitude we're talking about? Is it so large that we should be thinking of running the scanner in the same data centre as the file storage? Thomas -------------- next part -------------- An HTML attachment was scrubbed... 
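As a concrete reference for the scanning discussion above: once a wheel has been downloaded, extracting candidate import names from it is cheap, because a wheel is just a zip archive. A rough sketch only; Thomas's wheeldex does considerably more (for example namespace-package detection), and the helper name here is invented:

    import zipfile
    from pathlib import PurePosixPath

    def top_level_names(wheel_path):
        """Rough guess at the importable top-level names a wheel provides."""
        names = set()
        with zipfile.ZipFile(wheel_path) as wheel:
            for entry in wheel.namelist():
                first = PurePosixPath(entry).parts[0]
                if first.endswith(('.dist-info', '.data')):
                    continue  # wheel metadata/data directories, not importable code
                if '/' not in entry:
                    # top-level module file, e.g. "six.py" -> "six"
                    if first.endswith('.py'):
                        names.add(first[:-3])
                else:
                    # file inside a package directory, e.g. "zmq/error.py" -> "zmq"
                    names.add(first)
        return sorted(names)

    # e.g. top_level_names("pyzmq-<version>-<tags>.whl") would report ["zmq"]

This is the kind of per-file step a scanner would run over each downloaded wheel; sdists would still need the slower unpack-and-inspect path discussed below.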
URL: From wes.turner at gmail.com Wed Feb 8 18:06:28 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 8 Feb 2017 17:06:28 -0600 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Message-ID: On Wednesday, February 8, 2017, Thomas Kluyver wrote: > Thanks Steve, Chris, > > On Tue, Feb 7, 2017, at 04:49 PM, Chris Wilcox wrote: > > I may be able to help jump-start this a bit and provide a platform for > this to run on. I deployed a small service that scans PyPI to figure out > statistics on Python 2 vs Python 3 support using PyPI Classifiers. The > source is on GitHub: https://github.com/crwilcox/PyPI-Gatherer. It > watches the PyPI updates feed and refreshes entries for packages as they > show up as modified. It should be possible to add your lib, query, and add > an additional row or two to the result. I am happy to work together on > this. Also, the data is stored in an Azure Table Storage which has rest > endpoints (and a Python SDK) that makes getting the published data > straight-forward. > > > I had a quick look through this, and it does look like it should provide a > useful framework for scanning PyPI and updating the results. :-) > > What I'm proposing differs in that it would need to download files from > PyPI - basically all of them, if we're thorough about it. I imagine that's > going to involve a lot of data transfer. Do we know what order of magnitude > we're talking about? Is it so large that we should be thinking of running > the scanner in the same data centre as the file storage? > So, IIUC, you're looking to emit ((URL, release, platform), namespaces_odict) for each new and all existing packages; by uncompressing every package and running every setup.py (hopefully in a container)? https://github.com/python/pypi-salt/blob/master/provisioning/salt/roots/pillar/top.sls https://github.com/python/pypi-salt/blob/master/provisioning/salt/roots/pillar/warehouse-deploys/warehouse-dev.sls https://github.com/python/pypi-salt/blob/master/provisioning/salt/roots/salt/warehouse/web.sls - https://github.com/pypa/warehouse/blob/master/warehouse/packaging/search.py - elasticsearch_dsl - https://github.com/pypa/warehouse/blob/master/warehouse/packaging/models.py - SQLAlchemy - https://github.com/pypa/warehouse/blob/master/warehouse/celery.py - celery - https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/json.py - namespaces are useful metadata (worth adding to the spec) - https://github.com/pypa/interoperability-peps/issues/31 - JSONLD - https://github.com/python/psf-salt/blob/master/pillar/prod/top.sls - https://github.com/python/psf-salt/blob/master/pillar/prod/roles.sls - One CI project (container FROM python: (debian)) per python package with additional metadata per project? - conda-forge solves for this case - and then how to post the extra metadata (build artifact) back from the CI build and mark the task as done Could this (namespace extraction) be added to 'setup.py build' for the future? > > Thomas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Wed Feb 8 21:15:29 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Thu, 09 Feb 2017 02:15:29 +0000 Subject: [Distutils] GSoC 2017 - Working on pip Message-ID: Hello Everyone! 
Ralf Gommers suggested that I put this proposal here on this list, for feedback and for seeing if anyone would be willing to mentor me. So, here it is.

-----

My name is Pradyun Gedam. I'm currently a first-year student at VIT University in India.

I would like to apply for GSoC 2017 under PSF.

I currently have a project in mind - the "pip needs a dependency resolver" issue [1]. I would like to take on this specific project but am willing to do some other project as well.

For some background, around mid 2016, I started contributing to pip. The first issue I tackled was #59 [2] - a request for an upgrade command and an upgrade-all command that has been open for over 5.5 years. Over the months following that, I've had the opportunity to work with and understand multiple parts of pip's codebase while working on this issue and a few others. This search on GitHub issues [3] also provides a good summary of what work I've done on pip.

[1]: https://github.com/pypa/pip/issues/988
[2]: https://github.com/pypa/pip/issues/59
[3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg

Eagerly-waiting-for-a-response-ly,
Pradyun Gedam
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From thomas at kluyver.me.uk Thu Feb 9 05:33:27 2017
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Thu, 09 Feb 2017 10:33:27 +0000
Subject: [Distutils] Indexing modules in Python distributions
In-Reply-To:
References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com>
Message-ID: <1486636407.1437380.875436168.70CE2BE5@webmail.messagingengine.com>

On Wed, Feb 8, 2017, at 11:06 PM, Wes Turner wrote:
> So, IIUC,
> you're looking to emit
> ((URL, release, platform), namespaces_odict)
> for each new and all existing packages;
> by uncompressing every package and running every setup.py (hopefully
> in a container)?

Something like that, yes. For packages that publish wheels, we can analyse those directly without needing to run setup.py. Of course there are many packages with only sdists published.

> Could this (namespace extraction) be added to 'setup.py build' for
> the future?

Potentially. As I mentioned, there is a place in the metadata to put this information - the 'Provides' field. However, relying on package uploaders would take a long time to build up decent coverage of the available packages, so I'm inclined to focus on scanning PyPI, similar to the tool Chris already showed.

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From xav.fernandez at gmail.com Thu Feb 9 07:51:58 2017
From: xav.fernandez at gmail.com (Xavier Fernandez)
Date: Thu, 9 Feb 2017 13:51:58 +0100
Subject: [Distutils] GSoC 2017 - Working on pip
In-Reply-To:
References:
Message-ID:

That would be great news :)

On Thu, Feb 9, 2017 at 3:15 AM, Pradyun Gedam wrote:

> Hello Everyone!
>
> Ralf Gommers suggested that I put this proposal here on this list, for feedback and for seeing if anyone would be willing to mentor me. So, here it is.
>
> -----
>
> My name is Pradyun Gedam. I'm currently a first year student VIT University in India.
>
> I would like to apply for GSoC 2017 under PSF.
>
> I currently have a project in mind - the "pip needs a dependency resolver" issue [1]. I would like to take on this specific project but am willing to do some other project as well.
>
> For some background, around mid 2016, I started contributing to pip. The first issue I tackled was #59 [2] - a request for upgrade command and an upgrade-all command that has been open for over 5.5 years. Over the months following that, I've have had the opportunity to work with and understand multiple parts of pip's codebase while working on this issue and a few others. This search on GitHub issues [3] also provides a good summary of what work I've done on pip.
>
> [2]: https://github.com/pypa/pip/issues/988
> [2]: https://github.com/pypa/pip/issues/59
> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg
>
> Eagerly-waiting-for-a-response-ly,
> Pradyun Gedam
>
> _______________________________________________
> Distutils-SIG maillist - Distutils-SIG at python.org
> https://mail.python.org/mailman/listinfo/distutils-sig
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com Thu Feb 9 09:20:19 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 9 Feb 2017 15:20:19 +0100
Subject: [Distutils] Indexing modules in Python distributions
In-Reply-To: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com>
References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com>
Message-ID:

On 8 February 2017 at 19:14, Thomas Kluyver wrote:
> What I'm proposing differs in that it would need to download files from PyPI
> - basically all of them, if we're thorough about it. I imagine that's going
> to involve a lot of data transfer. Do we know what order of magnitude we're
> talking about? Is it so large that we should be thinking of running the
> scanner in the same data centre as the file storage?

Last time I asked Donald about doing things like this, he noted that a full mirror is ~215 GiB. That was a year or two ago so I assume the number has gone up since then, but it should still be in the same order of magnitude.

From an ecosystem resilience point of view, there's also a lot to be said for having copies of the full PyPI bulk artifact store in both AWS S3 (which is where the production PyPI data lives) and in Azure :)

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From donald at stufft.io Thu Feb 9 09:53:00 2017
From: donald at stufft.io (Donald Stufft)
Date: Thu, 9 Feb 2017 09:53:00 -0500
Subject: [Distutils] GSoC 2017 - Working on pip
In-Reply-To:
References:
Message-ID: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io>

I've never done it before, but I'm happy to provide mentoring on this.

> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote:
>
> Hello Everyone!
>
> Ralf Gommers suggested that I put this proposal here on this list, for feedback and for seeing if anyone would be willing to mentor me. So, here it is.
>
> -----
>
> My name is Pradyun Gedam. I'm currently a first year student VIT University in India.
>
> I would like to apply for GSoC 2017 under PSF.
>
> I currently have a project in mind - the "pip needs a dependency resolver" issue [1]. I would like to take on this specific project but am willing to do some other project as well.
>
> For some background, around mid 2016, I started contributing to pip. The first issue I tackled was #59 [2] - a request for upgrade command and an upgrade-all command that has been open for over 5.5 years. Over the months following that, I've have had the opportunity to work with and understand multiple parts of pip's codebase while working on this issue and a few others.
This search on GitHub issues [3] also provides a good summary of what work I've done on pip. > > [2]: https://github.com/pypa/pip/issues/988 > [2]: https://github.com/pypa/pip/issues/59 > [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg > > Eagerly-waiting-for-a-response-ly, > Pradyun Gedam > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu Feb 9 17:18:22 2017 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 9 Feb 2017 22:18:22 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> Message-ID: <20170209221822.GS12827@yuggoth.org> On 2017-02-08 18:14:38 +0000 (+0000), Thomas Kluyver wrote: [...] > What I'm proposing differs in that it would need to download files from > PyPI - basically all of them, if we're thorough about it. I imagine > that's going to involve a lot of data transfer. Do we know what order of > magnitude we're talking about? [...] The crowd I run with uses https://pypi.org/project/bandersnatch/ to maintain a full PyPI mirror for our project's distributed CI system, and du says the current aggregate size is 488GiB. Also if you want to initialize a full mirror this way, plan for it to take several days to populate. -- Jeremy Stanley From pradyunsg at gmail.com Fri Feb 10 13:20:03 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Fri, 10 Feb 2017 18:20:03 +0000 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: Yay! Thank you so much for a prompt and positive response! I'm pretty excited and looking forward to this. On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: I?ve never done it before, but I?m happy to provide mentoring on this. On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: Hello Everyone! Ralf Gommers suggested that I put this proposal here on this list, for feedback and for seeing if anyone would be willing to mentor me. So, here it is. ----- My name is Pradyun Gedam. I'm currently a first year student VIT University in India. I would like to apply for GSoC 2017 under PSF. I currently have a project in mind - the "pip needs a dependency resolver" issue [1]. I would like to take on this specific project but am willing to do some other project as well. For some background, around mid 2016, I started contributing to pip. The first issue I tackled was #59 [2] - a request for upgrade command and an upgrade-all command that has been open for over 5.5 years. Over the months following that, I've have had the opportunity to work with and understand multiple parts of pip's codebase while working on this issue and a few others. This search on GitHub issues [3] also provides a good summary of what work I've done on pip. 
[2]: https://github.com/pypa/pip/issues/988 [2]: https://github.com/pypa/pip/issues/59 [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg Eagerly-waiting-for-a-response-ly, Pradyun Gedam _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Feb 10 13:59:32 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 10 Feb 2017 12:59:32 -0600 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: >From the discussion on https://github.com/pypa/pip/issues/988#issuecomment-279033079: - https://github.com/ContinuumIO/pycosat (picosat) - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c - https://github.com/ContinuumIO/pycosat/tree/master/examples - https://github.com/enthought/sat-solver (MiniSat) - https://github.com/enthought/sat-solver/tree/master/simplesat/tests - https://github.com/enthought/sat-solver/blob/master/requirements.txt (PyYAML, enum34) Is there a better way than SAT? On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam wrote: > Yay! Thank you so much for a prompt and positive response! I'm pretty > excited and looking forward to this. > > On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: > > I?ve never done it before, but I?m happy to provide mentoring on this. > > On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: > > Hello Everyone! > > Ralf Gommers suggested that I put this proposal here on this list, for > feedback and for seeing if anyone would be willing to mentor me. So, here > it is. > > ----- > > My name is Pradyun Gedam. I'm currently a first year student VIT > University in India. > > I would like to apply for GSoC 2017 under PSF. > > I currently have a project in mind - the "pip needs a dependency resolver" > issue [1]. I would like to take on this specific project but am willing to > do some other project as well. > > For some background, around mid 2016, I started contributing to pip. The > first issue I tackled was #59 [2] - a request for upgrade command and an > upgrade-all command that has been open for over 5.5 years. Over the months > following that, I've have had the opportunity to work with and understand > multiple parts of pip's codebase while working on this issue and a few > others. This search on GitHub issues [3] also provides a good summary of > what work I've done on pip. > > [2]: https://github.com/pypa/pip/issues/988 > [2]: https://github.com/pypa/pip/issues/59 > [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg > > Eagerly-waiting-for-a-response-ly, > Pradyun Gedam > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > > ? > > Donald Stufft > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... 
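For readers who have not used the SAT route Wes lists above: under that approach the resolver's job is to translate "pick at most one version of each project, respecting its requirements" into CNF clauses, one boolean variable per concrete release, and hand them to the solver. A toy sketch with pycosat; the package data and variable numbering are invented purely for illustration:

    import pycosat  # bindings to the PicoSAT solver

    # Boolean variables: 1 = A 1.0, 2 = A 2.0, 3 = B 1.0, 4 = B 2.0
    cnf = [
        [1, 2],       # install some version of A (the user asked for A)
        [-1, -2],     # ...but at most one version of A
        [-3, -4],     # at most one version of B
        [-2, 4],      # "A 2.0 requires B 2.0"  encoded as  A 2.0 implies B 2.0
        [-1, 3, 4],   # "A 1.0 requires B (any version)"
    ]

    print(pycosat.solve(cnf))  # one satisfying assignment, e.g. A 2.0 plus B 2.0

A bare SAT call only answers whether some consistent set exists; steering it toward, say, the newest versions is exactly the extra heuristic layer debated later in this thread.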
URL: From jcappos at nyu.edu Fri Feb 10 14:33:47 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 14:33:47 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: Yes, don't use a SAT solver. It requires all metadata from all packages (~30MB uncompressed) and gives hard to predict results in some cases. Also the lack of fixed dependencies is a substantial problem for a SAT solver. Overall, we think it makes more sense to use a simple backtracking dependency resolution algorithm. Sebastien Awwad (CCed) has been looking at a bunch of data around the speed and other tradeoffs of the different algos. Sebastien: Sometime next week, can you write it up in a way that is suitable for sharing? Justin On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: > From the discussion on https://github.com/pypa/pip/ > issues/988#issuecomment-279033079: > > > - https://github.com/ContinuumIO/pycosat (picosat) > - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) > - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c > - https://github.com/ContinuumIO/pycosat/tree/master/examples > - https://github.com/enthought/sat-solver (MiniSat) > - https://github.com/enthought/sat-solver/tree/master/ > simplesat/tests > - https://github.com/enthought/sat-solver/blob/master/ > requirements.txt (PyYAML, enum34) > > > Is there a better way than SAT? > > On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam > wrote: > >> Yay! Thank you so much for a prompt and positive response! I'm pretty >> excited and looking forward to this. >> >> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >> >> I?ve never done it before, but I?m happy to provide mentoring on this. >> >> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >> >> Hello Everyone! >> >> Ralf Gommers suggested that I put this proposal here on this list, for >> feedback and for seeing if anyone would be willing to mentor me. So, here >> it is. >> >> ----- >> >> My name is Pradyun Gedam. I'm currently a first year student VIT >> University in India. >> >> I would like to apply for GSoC 2017 under PSF. >> >> I currently have a project in mind - the "pip needs a dependency >> resolver" issue [1]. I would like to take on this specific project but am >> willing to do some other project as well. >> >> For some background, around mid 2016, I started contributing to pip. The >> first issue I tackled was #59 [2] - a request for upgrade command and an >> upgrade-all command that has been open for over 5.5 years. Over the months >> following that, I've have had the opportunity to work with and understand >> multiple parts of pip's codebase while working on this issue and a few >> others. This search on GitHub issues [3] also provides a good summary of >> what work I've done on pip. >> >> [2]: https://github.com/pypa/pip/issues/988 >> [2]: https://github.com/pypa/pip/issues/59 >> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >> >> Eagerly-waiting-for-a-response-ly, >> Pradyun Gedam >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> >> >> ? 
>> >> Donald Stufft >> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastienawwad at gmail.com Fri Feb 10 14:53:41 2017 From: sebastienawwad at gmail.com (Sebastien Awwad) Date: Fri, 10 Feb 2017 19:53:41 +0000 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: While there may be some clever way of delivering changes to dependency metadata that consumes less bandwidth (The first delivery for a client will be somewhat large, but versioned metadata information plus compressed deltas could move clients from one version of the full metadata set to another, for example?), the larger problem, I think, is the lack of fixed dependencies. Even with a moderately small percentage of distributions having variable immediate dependencies, this expands out substantially when you consider all distributions that depend on those and so on, meaning that the full set of installed distributions when you run `pip install xyz==a.b.c` is surprisingly variable. In a series of install attempts run over about 400,000 of the package versions on PyPI last year, I found that simply changing the version of Python employed in an otherwise identical virtual environment results in pip installing different packages or package versions, for 16% of the distributions. If dependencies were knowable in static metadata, there would be a decent case for SAT solving. I'll try to get back to a write-up after the current rush on my main project subsides. On Fri, Feb 10, 2017 at 2:34 PM Justin Cappos wrote: > Yes, don't use a SAT solver. It requires all metadata from all packages > (~30MB uncompressed) and gives hard to predict results in some cases. > Also the lack of fixed dependencies is a substantial problem for a SAT > solver. Overall, we think it makes more sense to use a simple backtracking > dependency resolution algorithm. > > Sebastien Awwad (CCed) has been looking at a bunch of data around the > speed and other tradeoffs of the different algos. Sebastien: Sometime > next week, can you write it up in a way that is suitable for sharing? > > > Justin > > On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: > > From the discussion on > https://github.com/pypa/pip/issues/988#issuecomment-279033079: > > > - https://github.com/ContinuumIO/pycosat (picosat) > - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) > - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c > - https://github.com/ContinuumIO/pycosat/tree/master/examples > - https://github.com/enthought/sat-solver (MiniSat) > - > https://github.com/enthought/sat-solver/tree/master/simplesat/tests > - > https://github.com/enthought/sat-solver/blob/master/requirements.txt (PyYAML, > enum34) > > > Is there a better way than SAT? > > On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam > wrote: > > Yay! Thank you so much for a prompt and positive response! I'm pretty > excited and looking forward to this. > > On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: > > I?ve never done it before, but I?m happy to provide mentoring on this. > > On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: > > Hello Everyone! 
> > Ralf Gommers suggested that I put this proposal here on this list, for > feedback and for seeing if anyone would be willing to mentor me. So, here > it is. > > ----- > > My name is Pradyun Gedam. I'm currently a first year student VIT > University in India. > > I would like to apply for GSoC 2017 under PSF. > > I currently have a project in mind - the "pip needs a dependency resolver" > issue [1]. I would like to take on this specific project but am willing to > do some other project as well. > > For some background, around mid 2016, I started contributing to pip. The > first issue I tackled was #59 [2] - a request for upgrade command and an > upgrade-all command that has been open for over 5.5 years. Over the months > following that, I've have had the opportunity to work with and understand > multiple parts of pip's codebase while working on this issue and a few > others. This search on GitHub issues [3] also provides a good summary of > what work I've done on pip. > > [2]: https://github.com/pypa/pip/issues/988 > [2]: https://github.com/pypa/pip/issues/59 > [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg > > Eagerly-waiting-for-a-response-ly, > Pradyun Gedam > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > > ? > > Donald Stufft > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Feb 10 15:52:03 2017 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Feb 2017 15:52:03 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: > Yes, don't use a SAT solver. It requires all metadata from all packages > (~30MB uncompressed) and gives hard to predict results in some cases. > I doubt there exists an algorithm where this is not the case. Also the lack of fixed dependencies is a substantial problem for a SAT > solver. Overall, we think it makes more sense to use a simple backtracking > dependency resolution algorithm. > As soon as you want to deal with version ranges and ensure consistency of the installed packages, backtracking stops being simple rather quickly. I agree lack of fixed dependencies is an issue, but I doubt it is specific to a SAT solver. SAT solvers have been used successfully in many cases now: composer (php), dnf (Red Hat/Fedora), conda or our own packages manager at Enthought in python, 0install. I would certainly be interested in seeing a proper comparison with other algorithms. David > Sebastien Awwad (CCed) has been looking at a bunch of data around the > speed and other tradeoffs of the different algos. Sebastien: Sometime > next week, can you write it up in a way that is suitable for sharing? 
> > Justin > > On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: > >> From the discussion on https://github.com/pypa/pip/is >> sues/988#issuecomment-279033079: >> >> >> - https://github.com/ContinuumIO/pycosat (picosat) >> - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) >> - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c >> - https://github.com/ContinuumIO/pycosat/tree/master/examples >> - https://github.com/enthought/sat-solver (MiniSat) >> - https://github.com/enthought/sat-solver/tree/master/simplesa >> t/tests >> - https://github.com/enthought/sat-solver/blob/master/requirem >> ents.txt (PyYAML, enum34) >> >> >> Is there a better way than SAT? >> >> On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam >> wrote: >> >>> Yay! Thank you so much for a prompt and positive response! I'm pretty >>> excited and looking forward to this. >>> >>> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >>> >>> I?ve never done it before, but I?m happy to provide mentoring on this. >>> >>> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >>> >>> Hello Everyone! >>> >>> Ralf Gommers suggested that I put this proposal here on this list, for >>> feedback and for seeing if anyone would be willing to mentor me. So, here >>> it is. >>> >>> ----- >>> >>> My name is Pradyun Gedam. I'm currently a first year student VIT >>> University in India. >>> >>> I would like to apply for GSoC 2017 under PSF. >>> >>> I currently have a project in mind - the "pip needs a dependency >>> resolver" issue [1]. I would like to take on this specific project but am >>> willing to do some other project as well. >>> >>> For some background, around mid 2016, I started contributing to pip. The >>> first issue I tackled was #59 [2] - a request for upgrade command and an >>> upgrade-all command that has been open for over 5.5 years. Over the months >>> following that, I've have had the opportunity to work with and understand >>> multiple parts of pip's codebase while working on this issue and a few >>> others. This search on GitHub issues [3] also provides a good summary of >>> what work I've done on pip. >>> >>> [2]: https://github.com/pypa/pip/issues/988 >>> [2]: https://github.com/pypa/pip/issues/59 >>> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >>> >>> Eagerly-waiting-for-a-response-ly, >>> Pradyun Gedam >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >>> >>> ? >>> >>> Donald Stufft >>> >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Fri Feb 10 16:03:33 2017 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Feb 2017 16:03:33 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 3:52 PM, David Cournapeau wrote: > > > On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: > >> Yes, don't use a SAT solver. It requires all metadata from all packages >> (~30MB uncompressed) and gives hard to predict results in some cases. >> > > I doubt there exists an algorithm where this is not the case. > > Also the lack of fixed dependencies is a substantial problem for a SAT >> solver. Overall, we think it makes more sense to use a simple backtracking >> dependency resolution algorithm. >> > > As soon as you want to deal with version ranges and ensure consistency of > the installed packages, backtracking stops being simple rather quickly. > > I agree lack of fixed dependencies is an issue, but I doubt it is specific > to a SAT solver. SAT solvers have been used successfully in many cases now: > composer (php), dnf (Red Hat/Fedora), conda or our own packages manager at > Enthought in python, 0install. > > I would certainly be interested in seeing a proper comparison with other > algorithms. > I don't have experience implementing non SAT dependency solvers, but I suspect that whatever algorithm you end up using, the "core" is the simple part, and tweaking heuristics will be the hard, developer-time consuming part. David > > David > > >> Sebastien Awwad (CCed) has been looking at a bunch of data around the >> speed and other tradeoffs of the different algos. Sebastien: Sometime >> next week, can you write it up in a way that is suitable for sharing? >> >> Justin >> >> On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner wrote: >> >>> From the discussion on https://github.com/pypa/pip/is >>> sues/988#issuecomment-279033079: >>> >>> >>> - https://github.com/ContinuumIO/pycosat (picosat) >>> - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c (C) >>> - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c >>> - https://github.com/ContinuumIO/pycosat/tree/master/examples >>> - https://github.com/enthought/sat-solver (MiniSat) >>> - https://github.com/enthought/sat-solver/tree/master/simplesa >>> t/tests >>> - https://github.com/enthought/sat-solver/blob/master/requirem >>> ents.txt (PyYAML, enum34) >>> >>> >>> Is there a better way than SAT? >>> >>> On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam >>> wrote: >>> >>>> Yay! Thank you so much for a prompt and positive response! I'm pretty >>>> excited and looking forward to this. >>>> >>>> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >>>> >>>> I?ve never done it before, but I?m happy to provide mentoring on this. >>>> >>>> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >>>> >>>> Hello Everyone! >>>> >>>> Ralf Gommers suggested that I put this proposal here on this list, for >>>> feedback and for seeing if anyone would be willing to mentor me. So, here >>>> it is. >>>> >>>> ----- >>>> >>>> My name is Pradyun Gedam. I'm currently a first year student VIT >>>> University in India. >>>> >>>> I would like to apply for GSoC 2017 under PSF. >>>> >>>> I currently have a project in mind - the "pip needs a dependency >>>> resolver" issue [1]. I would like to take on this specific project but am >>>> willing to do some other project as well. >>>> >>>> For some background, around mid 2016, I started contributing to pip. 
>>>> The first issue I tackled was #59 [2] - a request for upgrade command and >>>> an upgrade-all command that has been open for over 5.5 years. Over the >>>> months following that, I've have had the opportunity to work with and >>>> understand multiple parts of pip's codebase while working on this issue and >>>> a few others. This search on GitHub issues [3] also provides a good summary >>>> of what work I've done on pip. >>>> >>>> [2]: https://github.com/pypa/pip/issues/988 >>>> [2]: https://github.com/pypa/pip/issues/59 >>>> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >>>> >>>> Eagerly-waiting-for-a-response-ly, >>>> Pradyun Gedam >>>> >>>> _______________________________________________ >>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>> >>>> >>>> >>>> ? >>>> >>>> Donald Stufft >>>> >>>> >>>> _______________________________________________ >>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>> >>>> >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Fri Feb 10 16:22:56 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 16:22:56 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 3:52 PM, David Cournapeau wrote: > > > On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: > >> Yes, don't use a SAT solver. It requires all metadata from all packages >> (~30MB uncompressed) and gives hard to predict results in some cases. >> > > I doubt there exists an algorithm where this is not the case. > Okay, so there was a discussion about the pros and cons (including algorithms like backtracking dependency resolution which do not require all metadata) a while back on the mailing list: https://mail.python.org/pipermail/distutils-sig/2015-April/026157.html (I believe you may have seen this before because you replied to a message further down in the thread.) > Also the lack of fixed dependencies is a substantial problem for a SAT >> solver. Overall, we think it makes more sense to use a simple backtracking >> dependency resolution algorithm. >> > > As soon as you want to deal with version ranges and ensure consistency of > the installed packages, backtracking stops being simple rather quickly. > Can you explain why you think this is true? I agree lack of fixed dependencies is an issue, but I doubt it is specific > to a SAT solver. SAT solvers have been used successfully in many cases now: > composer (php), dnf (Red Hat/Fedora), conda or our own packages manager at > Enthought in python, 0install. > > I would certainly be interested in seeing a proper comparison with other > algorithms. > Sure, there are different tradeoffs which make sense in different domains. Certainly, if you have a relatively small set of packages with statically defined dependencies and already are distributing all package metadata to clients, a SAT solver will be faster at resolving complex dependency issues. 
We can provide the data we gathered (maybe others provide get some data too?) and then the discussion will be more grounded with numbers. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Fri Feb 10 16:28:03 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 16:28:03 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: So, there aren't "heuristics" to tweak here. The algorithm just encodes the rules for trying package combinations (usually, latest version first) and then backtracks to a previous point when an unresolvable conflict is found. This is quite different from something like a SAT solver where it does use heuristics to come up with a matching scenario quickly. I don't think developers need to tweak heuristics in either case. You just pick your SAT solver and it has reasonable heuristics built in, right? Thanks, Justin On Fri, Feb 10, 2017 at 4:03 PM, David Cournapeau wrote: > > > On Fri, Feb 10, 2017 at 3:52 PM, David Cournapeau > wrote: > >> >> >> On Fri, Feb 10, 2017 at 2:33 PM, Justin Cappos wrote: >> >>> Yes, don't use a SAT solver. It requires all metadata from all packages >>> (~30MB uncompressed) and gives hard to predict results in some cases. >>> >> >> I doubt there exists an algorithm where this is not the case. >> >> Also the lack of fixed dependencies is a substantial problem for a SAT >>> solver. Overall, we think it makes more sense to use a simple backtracking >>> dependency resolution algorithm. >>> >> >> As soon as you want to deal with version ranges and ensure consistency of >> the installed packages, backtracking stops being simple rather quickly. >> >> I agree lack of fixed dependencies is an issue, but I doubt it is >> specific to a SAT solver. SAT solvers have been used successfully in many >> cases now: composer (php), dnf (Red Hat/Fedora), conda or our own packages >> manager at Enthought in python, 0install. >> >> I would certainly be interested in seeing a proper comparison with other >> algorithms. >> > > I don't have experience implementing non SAT dependency solvers, but I > suspect that whatever algorithm you end up using, the "core" is the simple > part, and tweaking heuristics will be the hard, developer-time consuming > part. > > David > >> >> David >> >> >>> Sebastien Awwad (CCed) has been looking at a bunch of data around the >>> speed and other tradeoffs of the different algos. Sebastien: Sometime >>> next week, can you write it up in a way that is suitable for sharing? >>> >>> Justin >>> >>> On Fri, Feb 10, 2017 at 1:59 PM, Wes Turner >>> wrote: >>> >>>> From the discussion on https://github.com/pypa/pip/is >>>> sues/988#issuecomment-279033079: >>>> >>>> >>>> - https://github.com/ContinuumIO/pycosat (picosat) >>>> - https://github.com/ContinuumIO/pycosat/blob/master/pycosat.c >>>> (C) >>>> - https://github.com/ContinuumIO/pycosat/blob/master/picosat.c >>>> - https://github.com/ContinuumIO/pycosat/tree/master/examples >>>> - https://github.com/enthought/sat-solver (MiniSat) >>>> - https://github.com/enthought/sat-solver/tree/master/simplesa >>>> t/tests >>>> - https://github.com/enthought/sat-solver/blob/master/requirem >>>> ents.txt (PyYAML, enum34) >>>> >>>> >>>> Is there a better way than SAT? >>>> >>>> On Fri, Feb 10, 2017 at 12:20 PM, Pradyun Gedam >>>> wrote: >>>> >>>>> Yay! Thank you so much for a prompt and positive response! 
I'm pretty >>>>> excited and looking forward to this. >>>>> >>>>> On Thu, Feb 9, 2017, 20:23 Donald Stufft wrote: >>>>> >>>>> I?ve never done it before, but I?m happy to provide mentoring on this. >>>>> >>>>> On Feb 8, 2017, at 9:15 PM, Pradyun Gedam wrote: >>>>> >>>>> Hello Everyone! >>>>> >>>>> Ralf Gommers suggested that I put this proposal here on this list, for >>>>> feedback and for seeing if anyone would be willing to mentor me. So, here >>>>> it is. >>>>> >>>>> ----- >>>>> >>>>> My name is Pradyun Gedam. I'm currently a first year student VIT >>>>> University in India. >>>>> >>>>> I would like to apply for GSoC 2017 under PSF. >>>>> >>>>> I currently have a project in mind - the "pip needs a dependency >>>>> resolver" issue [1]. I would like to take on this specific project but am >>>>> willing to do some other project as well. >>>>> >>>>> For some background, around mid 2016, I started contributing to pip. >>>>> The first issue I tackled was #59 [2] - a request for upgrade command and >>>>> an upgrade-all command that has been open for over 5.5 years. Over the >>>>> months following that, I've have had the opportunity to work with and >>>>> understand multiple parts of pip's codebase while working on this issue and >>>>> a few others. This search on GitHub issues [3] also provides a good summary >>>>> of what work I've done on pip. >>>>> >>>>> [2]: https://github.com/pypa/pip/issues/988 >>>>> [2]: https://github.com/pypa/pip/issues/59 >>>>> [3]: https://github.com/pypa/pip/issues?q=author%3Apradyunsg >>>>> >>>>> Eagerly-waiting-for-a-response-ly, >>>>> Pradyun Gedam >>>>> >>>>> _______________________________________________ >>>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>>> >>>>> >>>>> >>>>> ? >>>>> >>>>> Donald Stufft >>>>> >>>>> >>>>> _______________________________________________ >>>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> Distutils-SIG maillist - Distutils-SIG at python.org >>>> https://mail.python.org/mailman/listinfo/distutils-sig >>>> >>>> >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Fri Feb 10 16:36:14 2017 From: donald at stufft.io (Donald Stufft) Date: Fri, 10 Feb 2017 16:36:14 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> > On Feb 10, 2017, at 2:53 PM, Sebastien Awwad wrote: > > If dependencies were knowable in static metadata, there would be a decent case for SAT solving. I'll try to get back to a write-up after the current rush on my main project subsides. The differences between backtracking and SAT solvers and such is perhaps a bit of of my depth, but just FWIW when installing from Wheel it?s basically just waiting on a new API to get this information in a static form. Installing from sdist still has the problem (and likely will forever) but I think it?s not *unreasonable* to say that using wheels is what you need to do to get fast dep solving and if people aren?t providing wheels it will be slow(er?). ? 
Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Feb 10 16:58:15 2017 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Feb 2017 16:58:15 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 4:28 PM, Justin Cappos wrote: > So, there aren't "heuristics" to tweak here. The algorithm just encodes > the rules for trying package combinations (usually, latest version first) > and then backtracks to a previous point when an unresolvable conflict is > found. > > This is quite different from something like a SAT solver where it does use > heuristics to come up with a matching scenario quickly. > > I don't think developers need to tweak heuristics in either case. You > just pick your SAT solver and it has reasonable heuristics built in, right? > Right, so there are 2 set of heuristics: the heuristics to make SAT solvers more efficient, and heuristics to make it more useful as a dependency resolution algorithm. I am only interested in the 2nd set of heuristics here. So for SAT solvers at least, you need heuristics to tweak the search space toward something more likely solutions (from a dependency POV). E.g. composer will favor already installed packages if they match the problem. That's also why it is rather hard to use a SAT solver as a black box and then wrap it to resolve dependencies, and you instead want to have access to the SAT solver "internals". Don't you need the same kind of heuristics to make backtracking actually useful ? I agree comparing on actual problems is the best way to move this discussion forward, to compare speed, solution quality, feasibility in pip's/pypi context. If you have access to "scenarios", I would be happy to run our own SAT solver on it to compare solver's output. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Fri Feb 10 17:04:57 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Fri, 10 Feb 2017 17:04:57 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: I think the difference Sebastien is trying to say is that you need info from *all* pieces of static metadata. Not just that from the packages you will end up installing. Backtracking dependency resolution will be much more like the wheel model. If one does not backtrack (which is true most of the time), it only needs the metadata from the things you end up install. Justin On Fri, Feb 10, 2017 at 4:36 PM, Donald Stufft wrote: > > On Feb 10, 2017, at 2:53 PM, Sebastien Awwad > wrote: > > If dependencies were knowable in static metadata, there would be a decent > case for SAT solving. I'll try to get back to a write-up after the current > rush on my main project subsides. > > > > The differences between backtracking and SAT solvers and such is perhaps a > bit of of my depth, but just FWIW when installing from Wheel it?s basically > just waiting on a new API to get this information in a static form. > Installing from sdist still has the problem (and likely will forever) but I > think it?s not *unreasonable* to say that using wheels is what you need to > do to get fast dep solving and if people aren?t providing wheels it will be > slow(er?). > > ? 
> Donald Stufft > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bussonniermatthias at gmail.com Fri Feb 10 17:06:57 2017 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Fri, 10 Feb 2017 14:06:57 -0800 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: Hi all, Assuming that all the requirements are wheels and coming from PyPI. Installed using a recent pip. How often do you think the resolution will be the same for all clients, and mostly be "pull everything from latest" ? If so, would it make sense to pre-compute thing on PyPI/warehouse at package publication time, and provide a resolution "hint" as an API endpoint ? If this "hint" is correct, it should avoid clientside work most of time. And the resolution can probably be efficiently updated as you only have to re-solve by looking as the dependees of previous version. -- M On Fri, Feb 10, 2017 at 1:36 PM, Donald Stufft wrote: > > On Feb 10, 2017, at 2:53 PM, Sebastien Awwad > wrote: > > If dependencies were knowable in static metadata, there would be a decent > case for SAT solving. I'll try to get back to a write-up after the current > rush on my main project subsides. > > > > The differences between backtracking and SAT solvers and such is perhaps a > bit of of my depth, but just FWIW when installing from Wheel it?s basically > just waiting on a new API to get this information in a static form. > Installing from sdist still has the problem (and likely will forever) but I > think it?s not *unreasonable* to say that using wheels is what you need to > do to get fast dep solving and if people aren?t providing wheels it will be > slow(er?). > > ? > Donald Stufft > > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > From ncoghlan at gmail.com Sat Feb 11 02:35:16 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2017 08:35:16 +0100 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: On 10 Feb 2017 23:05, "Justin Cappos" wrote: I think the difference Sebastien is trying to say is that you need info from *all* pieces of static metadata. Not just that from the packages you will end up installing. Backtracking dependency resolution will be much more like the wheel model. If one does not backtrack (which is true most of the time), it only needs the metadata from the things you end up install. This is key for PyPI I think - for the yum -> dnf transition, one of the biggest still unsolved problems is the increase in the amount of metadata that needs to be transferred (although the file lists used for install-by-filename are a big contributing factor to that). You can fairly readily see this in Docker container builds that rely on dnf - even on a fast connection, you may spend more than a minute downloading dependency metadata. It would take a *lot* of server round trips for per-package metadata retrieval to start comparing to bulk download times for the metadata for 90k+ packages. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... 
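To make the per-package round-trip cost concrete, this is a small sketch of dependency lookup against the JSON API PyPI already exposes (the endpoint also referenced later in this thread); a resolver doing this naively pays one request per package it considers.

```python
import json
from urllib.request import urlopen

def requires_dist(name, version=None):
    """Fetch the Requires-Dist entries for one release from PyPI's JSON API."""
    url = "https://pypi.python.org/pypi/%s/json" % name
    if version is not None:
        url = "https://pypi.python.org/pypi/%s/%s/json" % (name, version)
    with urlopen(url) as resp:
        info = json.loads(resp.read().decode("utf-8"))["info"]
    return info.get("requires_dist") or []

print(requires_dist("requests"))
```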
URL: From gokoproject at gmail.com Sat Feb 11 01:57:41 2017 From: gokoproject at gmail.com (John Wong) Date: Sat, 11 Feb 2017 01:57:41 -0500 Subject: [Distutils] GSoC 2017 - Working on pip In-Reply-To: References: <525E0953-403A-4305-B9C9-31CFA681BEEE@stufft.io> <465A9CA2-CACE-40DE-B20A-47058FDDEDD2@stufft.io> Message-ID: On Fri, Feb 10, 2017 at 5:06 PM, Matthias Bussonnier < bussonniermatthias at gmail.com> wrote: > Hi all, > > Assuming that all the requirements are wheels and coming from PyPI. > Installed using a recent pip. > > How often do you think the resolution will be the same for all > clients, and mostly be "pull everything from latest" ? I don't think there is anyway around not precompute IMO. But perhaps I am complicating things. what about non-pypi packages like git source? -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Feb 12 19:12:24 2017 From: cournape at gmail.com (David Cournapeau) Date: Sun, 12 Feb 2017 19:12:24 -0500 Subject: [Distutils] PyCon Colombia 2017 keynote on packaging Message-ID: Hi, I was invited to give a talk at PyCon Colombia 2017, and I did it on packaging. I thought people here would be interested to know about it. I insisted on the need for packaging to get software into as many hands as possible, gave a history of the packaging ecosystem, advised people to use packaging.python.org suggestions, and mentioned the manylinux effort. I tried to be as objective as possible there and mention the key people involved. I also talked a bit about what can still be improved, and focused on 3 aspects, none of which are new nor particularly insightful for people here: infrastructure for automatic wheel building, better decoupling of packaging and build, and maybe more controversially, the need for tools to remove python from the equation. https://speakerdeck.com/cournape/python-packaging-in-2017 David -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Feb 13 05:01:36 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Feb 2017 11:01:36 +0100 Subject: [Distutils] PyCon Colombia 2017 keynote on packaging In-Reply-To: References: Message-ID: On 13 Feb 2017 1:20 am, "David Cournapeau" wrote: Hi, I was invited to give a talk at PyCon Colombia 2017, and I did it on packaging. I thought people here would be interested to know about it. Thanks for the heads up! I also talked a bit about what can still be improved, and focused on 3 aspects, none of which are new nor particularly insightful for people here: infrastructure for automatic wheel building, better decoupling of packaging and build, and maybe more controversially, the need for tools to remove python from the equation. https://speakerdeck.com/cournape/python-packaging-in-2017 Yeah, I think that's a good summary of where things are right now, and where we'd like to go next. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From radomir at dopieralski.pl Mon Feb 13 07:17:28 2017 From: radomir at dopieralski.pl (Radomir Dopieralski) Date: Mon, 13 Feb 2017 13:17:28 +0100 Subject: [Distutils] Trove classifiers for MicroPython? In-Reply-To: <20161020114316.0b587052@ghostwheel> References: <20161020114316.0b587052@ghostwheel> Message-ID: <20170213131728.340bf9dd@ghostwheel> Is this the right place to ask for this? It has been over four months already, and there is no action on this. 
Perhaps there is some more official way to request this that I am missing? On Thu, 20 Oct 2016 11:43:16 +0200 Radomir Dopieralski wrote: > Hello everyone, > > I'm not sure this is the right place to write to propose new trove > classifiers for PyPi -- if it's not, what would be the right place? > If this is it, then please read below. > > The MicroPython project is quickly growing and becoming more mature, > and as that happens, the number of 3rd-party libraries for it grows. > Many of those libraries get uploaded to PyPi, as you can check by > searching for "micropython". MicroPython has even its own version of > "pip", called "upip", that can be used to install those libraries. > > However, there is as of yet no way to mark that a library is written > for that particular flavor of Python, as there are no trove > classifiers for it. I would like to propose adding a number of > classifiers to amend that situation: > > For the MicroPython itself: > > Programming Language :: Python :: Implementation :: MicroPython > > For the hardware it runs on: > > Operating System :: Baremetal > Environment :: Microcontroller > Environment :: Microcontroller :: PyBoard > Environment :: Microcontroller :: ESP8266 > Environment :: Microcontroller :: Micro:bit > Environment :: Microcontroller :: WiPy > Environment :: Microcontroller :: LoPy > Environment :: Microcontroller :: OpenMV > > I'm not sure if the latter makes sense, but it would certainly be > nice to be able to indicate in a machine-parseable way on which > platforms the code works. > > What do you think? -- Radomir Dopieralski From thomas at kluyver.me.uk Mon Feb 13 12:25:37 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Mon, 13 Feb 2017 17:25:37 +0000 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <20170209221822.GS12827@yuggoth.org> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> <20170209221822.GS12827@yuggoth.org> Message-ID: <1487006737.1298666.879579024.66649E72@webmail.messagingengine.com> Thanks. So the current size is about 0.5 TB, and presumably if people are maintaining full mirrors, PyPI itself can cope with that much outgoing bandwidth being used. Steve & Chris: does downloading & scanning that volume of data sound like something you'd want to do on Azure? Does anyone there have some time to put in to move this forwards? Thomas On Thu, Feb 9, 2017, at 10:18 PM, Jeremy Stanley wrote: > On 2017-02-08 18:14:38 +0000 (+0000), Thomas Kluyver wrote: > [...] > > What I'm proposing differs in that it would need to download files from > > PyPI - basically all of them, if we're thorough about it. I imagine > > that's going to involve a lot of data transfer. Do we know what order of > > magnitude we're talking about? > [...] > > The crowd I run with uses https://pypi.org/project/bandersnatch/ to > maintain a full PyPI mirror for our project's distributed CI system, > and du says the current aggregate size is 488GiB. Also if you want > to initialize a full mirror this way, plan for it to take several > days to populate. 
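For the indexing idea Thomas describes, one plausible (and very rough) approach once a mirror exists is to read the top_level.txt that setuptools/bdist_wheel write into each wheel's .dist-info directory; not every wheel carries it, so a real indexer would need a fallback such as scanning RECORD. The mirror path below is hypothetical.

```python
import os
import zipfile

def top_level_names(wheel_path):
    """Importable top-level names a wheel declares, when it declares them."""
    names = set()
    with zipfile.ZipFile(wheel_path) as wf:
        for entry in wf.namelist():
            if entry.endswith(".dist-info/top_level.txt"):
                names.update(wf.read(entry).decode("utf-8").split())
    return names

index = {}  # module/package name -> set of wheel filenames providing it
for root, _dirs, files in os.walk("/srv/pypi-mirror"):  # hypothetical mirror path
    for fn in files:
        if fn.endswith(".whl"):
            for mod in top_level_names(os.path.join(root, fn)):
                index.setdefault(mod, set()).add(fn)
```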
> -- > Jeremy Stanley > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From donald at stufft.io Mon Feb 13 12:35:37 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 13 Feb 2017 12:35:37 -0500 Subject: [Distutils] Indexing modules in Python distributions In-Reply-To: <1487006737.1298666.879579024.66649E72@webmail.messagingengine.com> References: <1486466970.2092062.872926736.0C8AF205@webmail.messagingengine.com> <1486577678.268328.874662408.717603DB@webmail.messagingengine.com> <20170209221822.GS12827@yuggoth.org> <1487006737.1298666.879579024.66649E72@webmail.messagingengine.com> Message-ID: > On Feb 13, 2017, at 12:25 PM, Thomas Kluyver wrote: > > Thanks. So the current size is about 0.5 TB, and presumably if people > are maintaining full mirrors, PyPI itself can cope with that much > outgoing bandwidth being used. > Yea, PyPI does something like 16TB a day of bandwidth :) ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Mon Feb 13 15:51:09 2017 From: jim at jimfulton.info (Jim Fulton) Date: Mon, 13 Feb 2017 15:51:09 -0500 Subject: [Distutils] Announcing experimental wheel support in Buildout Message-ID: I've just released zc.buildout 2.8.0 and the buildout.wheel extension. If you have zc.buildout 2.8.0 or later, and you include: extensions = buildout.wheel In the buildout section of your buildout configuration, then buildout should be able to install distributions as wheels. This allowed me to install numpy using buildout, which wasn't possible before. This is a someone experimental version, which uses humpty to convert wheels to eggs. humpty in term uses uses distlib which seems to mishandle wheel metadata. (For example, it chokes if there's extra distribution meta and makes it impossible for buildout to install python-dateutil from a wheel.) Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Tue Feb 14 13:10:06 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 18:10:06 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> Message-ID: <2019192621.7718748.1487095806195@mail.yahoo.com> >?humpty in term uses uses distlib which seems to mishandle wheel> metadata. (For example, it chokes if there's extra distribution meta and > makes it impossible for buildout to install python-dateutil from a wheel.) I looked into the "mishandling". It's that the other tools don't adhere to [the current state of] PEP 426 as closely as distlib does. For example, wheel writes JSON metadata to metadata.json in the .dist-info directory, whereas PEP 426 calls for that data to be in pydist.json. The non-JSON metadata in the wheel (the METADATA file) does not strictly adhere to any of the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible fields). I can change distlib to look for metadata.json, and relax the rules to be more liberal regarding which fields to accept, but adhering to the PEP isn't mishandling things, as I see it. Work on distlib has slowed right down since around the time when PEP 426 was deferred indefinitely, and there seems to be little interest in progressing via metadata or other standardisation - we have to go by what the de facto tools (setuptools, wheel) choose to do. 
It's not an ideal situation, and incompatibilities can crop up, as you've seen. Regards, Vinay Sajip -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Feb 14 13:15:59 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Feb 2017 10:15:59 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <2019192621.7718748.1487095806195@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 10:10 AM, Vinay Sajip via Distutils-SIG wrote: >> humpty in term uses uses distlib which seems to mishandle wheel >> metadata. (For example, it chokes if there's extra distribution meta and >> makes it impossible for buildout to install python-dateutil from a wheel.) > > I looked into the "mishandling". It's that the other tools don't adhere to > [the current state of] PEP 426 as closely as distlib does. For example, > wheel writes JSON metadata to metadata.json in the .dist-info directory, > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > metadata in the wheel (the METADATA file) does not strictly adhere to any of > the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > fields). > > I can change distlib to look for metadata.json, and relax the rules to be > more liberal regarding which fields to accept, but adhering to the PEP isn't > mishandling things, as I see it. I thought the current status was that it's called metadata.json exactly *because* it's not standardized, and you *shouldn't* look at it? It's too bad that the JSON thing didn't work out, but I think we're better off working on better specifying the one source of truth everything already uses (METADATA) instead of bringing in *new* partially-incompatible-and-poorly-specified formats. -n -- Nathaniel J. Smith -- https://vorpus.org From jim at jimfulton.info Tue Feb 14 13:36:47 2017 From: jim at jimfulton.info (Jim Fulton) Date: Tue, 14 Feb 2017 13:36:47 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <2019192621.7718748.1487095806195@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 1:10 PM, Vinay Sajip wrote: > > humpty in term uses uses distlib which seems to mishandle wheel > > metadata. (For example, it chokes if there's extra distribution meta and > > makes it impossible for buildout to install python-dateutil from a > wheel.) > > I looked into the "mishandling". It's that the other tools don't adhere to > [the current state of] PEP 426 as closely as distlib does. For example, > wheel writes JSON metadata to metadata.json in the .dist-info directory, > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > metadata in the wheel (the METADATA file) does not strictly adhere to any > of the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > fields). > > I can change distlib to look for metadata.json, and relax the rules to be > more liberal regarding which fields to accept, but adhering to the PEP > isn't mishandling things, as I see it. > Fair enough. Notice that I said "seems to". :-] I suppose whether to be strict or not depends on use case. In my case, I was just trying to install a wheel as an egg, so permissive is definately what *I* want. Other use cases might want to be more strict. 
> > Work on distlib has slowed right down since around the time when PEP 426 > was deferred indefinitely, and there seems to be little interest in > progressing via metadata or other standardisation - we have to go by what > the de facto tools (setuptools, wheel) choose to do. It's not an ideal > situation, and incompatibilities can crop up, as you've seen. > Nope. Honestly, though, I wish there was *one* *library* that defined the standard, which was the case for setuptools for a while (yeah, I know, the warts, really, I know) because I really don't think there's a desire to innovate or a reason for competition at this level. In the case of wheel, perhaps it makes sense for that implementation to be authoritative. Thanks. Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Tue Feb 14 13:38:01 2017 From: dholth at gmail.com (Daniel Holth) Date: Tue, 14 Feb 2017 18:38:01 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: I would accept a pull request to stop generating metadata.json in bdist_wheel. On Tue, Feb 14, 2017 at 1:16 PM Nathaniel Smith wrote: > On Tue, Feb 14, 2017 at 10:10 AM, Vinay Sajip via Distutils-SIG > wrote: > >> humpty in term uses uses distlib which seems to mishandle wheel > >> metadata. (For example, it chokes if there's extra distribution meta and > >> makes it impossible for buildout to install python-dateutil from a > wheel.) > > > > I looked into the "mishandling". It's that the other tools don't adhere > to > > [the current state of] PEP 426 as closely as distlib does. For example, > > wheel writes JSON metadata to metadata.json in the .dist-info directory, > > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > > metadata in the wheel (the METADATA file) does not strictly adhere to > any of > > the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > > fields). > > > > I can change distlib to look for metadata.json, and relax the rules to be > > more liberal regarding which fields to accept, but adhering to the PEP > isn't > > mishandling things, as I see it. > > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? > > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From p.f.moore at gmail.com Tue Feb 14 14:40:17 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 14 Feb 2017 19:40:17 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On 14 February 2017 at 18:36, Jim Fulton wrote: > I wish there was *one* *library* that defined the standard packaging should be that library, but it doesn't cover metadata precisely because that PEP 426 hasn't been accepted (it doesn't try to cover the historical metadata 1.x standards, or "de facto" standards that aren't backed by a PEP AIUI). Paul From vinay_sajip at yahoo.co.uk Tue Feb 14 14:40:56 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 19:40:56 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: <108655704.7820975.1487101256733@mail.yahoo.com> > Nope. Honestly, though, I wish there was *one* *library* that defined the standard, > which was the case for setuptools for a while (yeah, I know, the warts, really, I know) > because I really don't think there's a desire to innovate or a reason for competition > at this level. In the case of wheel, perhaps it makes sense for that implementation to > be authoritative. The problem, to me, is not whether it is authoritative - it's more that it's ad hoc, just like setuptools in some areas. For example, the decision to use "metadata.json" rather than "pydist.json" is arbitrary, and could change in the future, and anyone who relies on how things work now will have to play catch-up when that happens. That's sometimes just too much work for volunteer activity - dig into what the problem is, put through a fix (for now), rinse and repeat - all the while, little or no value is really added. In theory this is an "infrastructure" area where a single blessed implementation might be OK, but these de facto tools don't do everything one wants, so interoperability remains important. There's no reason why we shouldn't look to innovate even in this area - there's some talk of a GSoC project now to look at dependency resolution for pip - something that I had sort-of working in the distil tool long ago (as a proof of concept) [1]. We've gotten so used to how pip and setuptools work, and because they are "good enough", there is a real failure of imagination to see how things might be done better. Regards, Vinay Sajip [1] https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements From jim at jimfulton.info Tue Feb 14 15:10:03 2017 From: jim at jimfulton.info (Jim Fulton) Date: Tue, 14 Feb 2017 15:10:03 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <108655704.7820975.1487101256733@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <108655704.7820975.1487101256733@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 2:40 PM, Vinay Sajip wrote: > > Nope. Honestly, though, I wish there was *one* *library* that defined > the standard, > > which was the case for setuptools for a while (yeah, I know, the warts, > really, I know) > > because I really don't think there's a desire to innovate or a reason > for competition > > at this level. In the case of wheel, perhaps it makes sense for that > implementation to > > be authoritative. 
> > The problem, to me, is not whether it is authoritative - it's more that > it's ad hoc, just like > setuptools in some areas. For example, the decision to use "metadata.json" > rather than > "pydist.json" is arbitrary, and could change in the future, and anyone who > relies on how things > work now will have to play catch-up when that happens. Unless they depend on a public API provided by the wheel package. Of course, you could argue that the name of a file could be part of the API. In many ways, depending and building on a working implementation is better that drafting a standard from scratch. Packaging has moved forward largely by people who built things pragmatically that worked and solved every-day problems: setuptools/easy_install, buildout, pip, wheel... > That's sometimes just too much work for > volunteer activity - dig into what the problem is, put through a fix (for > now), rinse and > repeat - all the while, little or no value is really added. > > In theory this is an "infrastructure" area where a single blessed > implementation might be OK, > I think so. > but these de facto tools don't do everything one wants, so > interoperability remains important. > Or collaboration to improve the tool. That *should* have worked for setuptools, but sadly didn't, for various reasons. > There's no reason why we shouldn't look to innovate even in this area - > there's some talk of a > GSoC project now to look at dependency resolution Yay! (I saw that.) > for pip Gaaaa. Why can't this be in a library? (Hopefully it will be.) - something that I had sort-of working > in the distil tool long ago (as a proof of concept) [1]. Almost is a hard sell. If this was usable as a library, I'd be interested in trying to integrate it with buildout. If it worked, many buildout users would be greatful. Perhaps the GSoC project could use it as a reference or starting point. We've gotten so used to how pip and setuptools work, and because they are "good enough", there is a real > failure of imagination > to see how things might be done better. > I think there is a failure of energy. Packaging should largely be boring and most people don't want to work on it. I certainly don't, even though I have. But you picked a good example. There are major differences (I almost said competition) between pip and buildout. They provide two different models (traditional Python system installs vs Java-like component/path installs) that address different use cases. IMO, these systems should complement each other and build on common foundations. Maybe there are more cases for innovation at lower levels than I'm aware of. Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Tue Feb 14 15:21:12 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 20:21:12 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: <425841221.7853973.1487103672849@mail.yahoo.com> > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? Well, it was work-in-progress-standardised according to PEP 426 (since sometimes implementations have to work in parallel with working out the details of specifications). 
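For what it's worth, a consumer that merely wants to tolerate either filename can do so cheaply; this is only an illustrative sketch of a permissive lookup, not how distlib actually behaves.

```python
import json
import zipfile

def load_json_metadata(wheel_path):
    """Return whichever JSON metadata a wheel carries, if any.

    bdist_wheel has been writing metadata.json, while PEP 426 drafts
    called the file pydist.json; accept both, return None otherwise.
    """
    with zipfile.ZipFile(wheel_path) as wf:
        for entry in wf.namelist():
            if entry.endswith((".dist-info/metadata.json", ".dist-info/pydist.json")):
                return json.loads(wf.read(entry).decode("utf-8"))
    return None
```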
Given that PEP 426 wasn't done and dusted but being progressed, I would have thought it perfectly acceptable to use "pydist.json", as the only things that would be affected would be packaging tools working to the PEP. > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. When you say "everything already uses", do you mean setuptools and wheel? If nobody else is allowed to play, that's one thing. But otherwise, there need to be standards for interoperability. The METADATA file, now - exactly which standard does it follow? The one in the dateutil wheel that Jim referred to doesn't appear to conform to any of the metadata PEPs. It was rejected by old metadata code in distlib (which came of out the Python 3.3 era "packaging" package - not to be confused with Donald's of the same name - which is strict in its interpretation of those earlier PEPs). The METADATA format (key-value) is not really flexible enough for certain things which were in PEP 426 (e.g. dependency descriptions), and for these JSON seems a reasonable fit. There's no technical reason why "the JSON thing didn't work out", as far as I can see - it was just given up on for a more incremental approach (which has got no new PEPs other than 440, AFAICT). I understand that social reasons are often more important than technical reasons when it comes to success or failure of an approach; I'm just not sure that in this case, it wasn't given up on too early. Regards, Vinay Sajip From wes.turner at gmail.com Tue Feb 14 15:28:41 2017 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 14 Feb 2017 14:28:41 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: On Tue, Feb 14, 2017 at 12:15 PM, Nathaniel Smith wrote: > On Tue, Feb 14, 2017 at 10:10 AM, Vinay Sajip via Distutils-SIG > wrote: > >> humpty in term uses uses distlib which seems to mishandle wheel > >> metadata. (For example, it chokes if there's extra distribution meta and > >> makes it impossible for buildout to install python-dateutil from a > wheel.) > > > > I looked into the "mishandling". It's that the other tools don't adhere > to > > [the current state of] PEP 426 as closely as distlib does. For example, > > wheel writes JSON metadata to metadata.json in the .dist-info directory, > > whereas PEP 426 calls for that data to be in pydist.json. The non-JSON > > metadata in the wheel (the METADATA file) does not strictly adhere to > any of > > the metadata PEPs 241, 314, 345 or 426 (it has a mixture of incompatible > > fields). > > > > I can change distlib to look for metadata.json, and relax the rules to be > > more liberal regarding which fields to accept, but adhering to the PEP > isn't > > mishandling things, as I see it. > > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? > > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. 
> JSON-LD https://www.google.com/search?q=python+package+metadata+jsonld https://www.google.com/search?q="pep426jsonld" PEP426 (Deferred) Switching to a JSON compatible format https://www.python.org/dev/peps/pep-0426/#switching-to-a-json-compatible-format PEP 426: Define a JSON-LD context as part of the proposal https://github.com/pypa/interoperability-peps/issues/31 This doesn't work with JSON-LD 1.0: ```json releases = { "v0.0.1": {"url": ... }, "v1.0.0": {"url": ...}, } This does work with JSON-LD 1.0: ```json releases = [ {"version": "v0.0.1", "url": ...}, {"version": "v1.0.0", "url": ...}, ] ... Then adding custom attributes could be as easy as defining a URI namespace and additional attribute names; because {distutils, setuptools, pip, pipenv(?)} only need to validate the properties necessary for the relevant packaging operation. Without any JSON-LD normalization, these aren't equal: {"url": "#here"} {"schema:url": "#here"} {"http://schema.org/url", "#here"} This is the JSON downstream tools currently have/want to consume (en masse, for SAT solving, etc): https://pypi.python.org/pypi/ipython/json - It's a graph. - JSON-LD is for graphs. - There are normalizations and signatures for JSON-LD (ld-signatures != JWS) - Downstream tools need not do anything with the @context. ("JSON-LD unaware") - Downstream tools which generate pydist.jsonld should validate schema in tests Downstream tools: - https://github.com/pypa/pip/issues/988 "Pip needs a dependency resolver" (-> JSON) - https://github.com/pypa/warehouse/issues/1638 "API to get checksums" (-> JSON) Q: How do we get this (platform and architecture-specific) metadata to warehouse, where it can be hosted? A JSONLD entrypoint in warehouse (for each project, for every project, for {my_subset}): https://pypi.python.org/pypi/ipython/jsonld > I would accept a pull request to stop generating metadata.json in bdist_wheel. What about a pull request to start generating metadata.jsonld or pydist.jsonld instead? - [ ] "@context": { }, - [ ] "@graph": { }, # optional #PEP426JSONLD > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Feb 14 16:21:36 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Feb 2017 13:21:36 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Feb 14, 2017 12:21, "Vinay Sajip" wrote: > I thought the current status was that it's called metadata.json > exactly *because* it's not standardized, and you *shouldn't* look at > it? Well, it was work-in-progress-standardised according to PEP 426 (since sometimes implementations have to work in parallel with working out the details of specifications). Given that PEP 426 wasn't done and dusted but being progressed, I would have thought it perfectly acceptable to use "pydist.json", as the only things that would be affected would be packaging tools working to the PEP. > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. 
When you say "everything already uses", do you mean setuptools and wheel? If nobody else is allowed to play, that's one thing. But otherwise, there need to be standards for interoperability. The METADATA file, now - exactly which standard does it follow? The one in the dateutil wheel that Jim referred to doesn't appear to conform to any of the metadata PEPs. It was rejected by old metadata code in distlib (which came of out the Python 3.3 era "packaging" package - not to be confused with Donald's of the same name - which is strict in its interpretation of those earlier PEPs). That's why I said we need to fix the standards to bring them back in sync with reality. I'm not arguing that there's no problem, I'm saying that replacing one serialization format with another won't actually address the problem, but does cause new complications. The METADATA format (key-value) is not really flexible enough for certain things which were in PEP 426 (e.g. dependency descriptions), and for these JSON seems a reasonable fit. There's no technical reason why "the JSON thing didn't work out", as far as I can see - it was just given up on for a more incremental approach (which has got no new PEPs other than 440, AFAICT). I understand that social reasons are often more important than technical reasons when it comes to success or failure of an approach; I'm just not sure that in this case, it wasn't given up on too early. The technical problem with PEP 426 is that unless you want to throw away pypi and start over, all tools need to understand the old METADATA files regardless. So it still needs to be specified, all the same code needs to be kept around, etc. Plus the most pressing issues are like "what does the field actually mean", which is totally independent of the serialization format. If there are particular fields that need more structured data, then there are options: we could have fields in METADATA whose values are JSON, or a sidecar file that supplements the main METADATA file with extra information. But adding a new way to specify fields like Name and Version really doesn't help anybody. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Tue Feb 14 16:36:41 2017 From: donald at stufft.io (Donald Stufft) Date: Tue, 14 Feb 2017 16:36:41 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> Message-ID: <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> > On Feb 14, 2017, at 1:15 PM, Nathaniel Smith wrote: > > It's too bad that the JSON thing didn't work out, but I think we're > better off working on better specifying the one source of truth > everything already uses (METADATA) instead of bringing in *new* > partially-incompatible-and-poorly-specified formats. > TBH I don?t think we?re going to stick with METADATA forever and it?s likely we, at some point, get to a JSON representation for this information but that is not today. We have far more pressing issues to deal with besides whether things are in one format or another. Yes, we still have a fair amount of behavior that is defined as ?whatever setuptools/distutils does?, but we?re slowly trying to break away from that. WRT to ?standard implementations? versus ?standards?, the idea of a ?standard implementation? 
being the source of truth and no longer needing to do all the work to define standards is a nice idea, but I think it is an idea that is never actually going to work out as well as real standardization. There is *always* going to be a need for tools that aren?t the blessed tools to interact with these items. Even if you can authoritatively say that this one Python implementation is the only implementation that any Python program will ever need, there is still the problem that people need to consume this information in languages that aren?t Python. Another problem there is it becomes incredibly difficult to know what is something that is supported as an actual feature and what is something that just sort of works because of that way that something was implemented. My goal with the packaging library is to more or less strictly implement accepted PEPs (and while I will make in progress PRs for PEPs that are currently being worked on, I won?t actually land a PR until the PEP is accepted). The only other real code there is extra utilities that make the realities of working with the specified PEPs easier (for example, we have a Version object which implements PEP 440 versions, but we also have a LegacyVersion object that implements what setuptools used to do). This not only gives us the benefit of a single implementation for people who just want to use that single blessed implementation, but it gives us the benefit of standards. This has already been useful in the packaging library where an implementation defect caused versions to get parsed slightly wrong, and we had the extensively documented PEP 440 to declare what the expected behavior was. I do not think the problem is "We've gotten so used to how pip and setuptools work, and because they are "good enough", there is a real failure of imagination to see how things might be done better.?. The hard work of doing this isn?t in writing an implementation that achieves it for 80% of projects, it?s for doing it in a way that achieves it for 95% of projects. Managing backwards compatibility is probably the single most important thing we can do here. There are almost 800,000 files on PyPI that someone can download and install, telling all of them they need to switch to some new system or things are going to break for them is simply not tenable. That being said, I don?t think there is anything stopping us from getting to a better point besides time and effort. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Feb 14 17:26:17 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Feb 2017 14:26:17 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> Message-ID: On Tue, Feb 14, 2017 at 1:36 PM, Donald Stufft wrote: > WRT to ?standard implementations? versus ?standards?, the idea of a > ?standard implementation? being the source of truth and no longer needing to > do all the work to define standards is a nice idea, but I think it is an > idea that is never actually going to work out as well as real > standardization. There is *always* going to be a need for tools that aren?t > the blessed tools to interact with these items. 
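As a concrete picture of the split Donald mentions, the packaging library (as of the time of this thread) exposes strict PEP 440 objects alongside a legacy fallback; a few lines show the idea.

```python
from packaging.specifiers import SpecifierSet
from packaging.version import LegacyVersion, Version, parse

print(Version("1.4.2") in SpecifierSet(">=1.0,<2.0"))      # True
print(Version("1.4.2") < Version("1.10"))                   # True: numeric, not lexical
print(isinstance(parse("nightly-build-7"), LegacyVersion))  # True: not PEP 440, legacy fallback
```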
Even if you can > authoritatively say that this one Python implementation is the only > implementation that any Python program will ever need, there is still the > problem that people need to consume this information in languages that > aren?t Python. Another even more fundamental reason that standards are important is to document semantics. Like, distlib or packaging or whatever can expose the "provides" field, but what does that actually mean? As a user of distlib/packaging, how should I change what I'm doing when I see that field? As a package author when should I set it? (I'm intentionally picking an example where the answer is "well the PEP says something about this but in reality it was never implemented and maybe has some security issues and no-one really knows" :-).) A "standard implementation" can abstract away some things, but by definition these are mostly the boring bits... -n -- Nathaniel J. Smith -- https://vorpus.org From vinay_sajip at yahoo.co.uk Tue Feb 14 18:01:48 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 23:01:48 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <1372092083.7912708.1487113308989@mail.yahoo.com> > The technical problem with PEP 426 is that unless you want to throw away pypi and start over, > all tools need to understand the old METADATA files regardless. It might not be as bad as that. For example, that IMO was the mistake behind the original concept of distutils2 - it was never going to fly as it required everyone to switch over to distutils2's way of doing things, and wouldn't be able to deal with old releases etc. For a time, I maintained a pretty extensive parallel set of metadata, based on just the data passed to setup() by packages using distutils/setuptools. This included not just the data for installation but even the data for package build, where it was purely declarative at the arguments-to-setup() level. Where a package didn't do completely bespoke things in setup() - like create new files, move files around etc. then the parallel set of metadata would allow installation of even old releases, without executing any setuptools code at all. I've not had the bandwidth to keep working on distlib and the metadata (example [1]), and the volume of new stuff going onto PyPI meant I didn't have time to keep on top of it. But the approach had some promise, in my view, and certainly showed that purely declarative packages (which didn't use e.g. custom build and install distutils/setuptools commands) could be installed using a completely different tool [than distutils/setuptools] without package authors having to change anything (beyond staying purely declarative). The distil documentation [2] shows installing a number of distributions (existing releases) from PyPI with better dependency resolution than pip does now, and without "throwing away PyPI". Anyway, I guess it's water under the bridge. 
Regards, Vinay Sajip [1] https://www.red-dove.com/pypi/projects/J/Jinja2/package-2.7.3.json [2] https://distil.readthedocs.io/en/0.1.0/installing.html#installing-distributions From vinay_sajip at yahoo.co.uk Tue Feb 14 18:12:57 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 Feb 2017 23:12:57 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <49FBDC8D-DAE4-469D-ABB4-AC8FE3415545@stufft.io> Message-ID: <1966892435.7938940.1487113977182@mail.yahoo.com> > Managing backwards compatibility is probably the single most important thing we can do here. > There are almost 800,000 files on PyPI that someone can download and install, telling all > of them they need to switch to some new system or things are going to break for them is > simply not tenable. I agree. But if packaging is going at some point to break out of allowing completely bespoke code to run at installation time (i.e. executable code like a free-for-all setup.py, vs. something declarative and thus more restrictive) then IMO you have to sacrifice 100% backwards compatibility. See my comment in my other post about the ability to install old releases - I made that a goal of my experiments with the parallel metadata, to not require anything other than a declarative setup() in order to be able to install stuff using just the metadata, so that nobody has to switch anything in a big-bang style, but could transition over to a newer system at their leisure. Regards, Vinay Sajip From ncoghlan at gmail.com Wed Feb 15 06:33:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 12:33:41 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <425841221.7853973.1487103672849@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 14 February 2017 at 21:21, Vinay Sajip via Distutils-SIG wrote: > > >> I thought the current status was that it's called metadata.json >> exactly *because* it's not standardized, and you *shouldn't* look at >> it? > > > Well, it was work-in-progress-standardised according to PEP 426 (since > sometimes implementations have to work in parallel with working out the > details of specifications). Given that PEP 426 wasn't done and dusted > but being progressed, I would have thought it perfectly acceptable to > use "pydist.json", as the only things that would be affected would be > packaging tools working to the PEP. I asked Daniel to *stop* using pydist.json, since wheel was emitting a point-in-time snapshot of PEP 426 (which includes a lot of potentially-nice-to-have things that nobody has actually implemented so far, like the semantic dependency declarations and the enhancements to the extras syntax), rather than the final version of the spec. >> It's too bad that the JSON thing didn't work out, but I think we're >> better off working on better specifying the one source of truth >> everything already uses (METADATA) instead of bringing in *new* >> partially-incompatible-and-poorly-specified formats. > > When you say "everything already uses", do you mean setuptools and wheel? > If nobody else is allowed to play, that's one thing. But otherwise, there > need to be standards for interoperability. 
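For readers less familiar with the extras syntax being referred to, this is roughly how it is declared with setuptools today (the project name and requirement lists are made up); the later part of the thread builds on treating names like "test" and "doc" as conventional extras.

```python
from setuptools import setup

setup(
    name="example-project",             # hypothetical project
    version="1.0",
    install_requires=["requests>=2.0"],
    extras_require={
        "test": ["pytest"],
        "doc": ["sphinx"],
    },
)
```

With that declaration, `pip install example-project[test]` pulls in the base requirements plus the test extra, which is the behaviour the extras discussion below takes as a given.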
The METADATA file, now - exactly > which standard does it follow? The one in the dateutil wheel that Jim > referred to doesn't appear to conform to any of the metadata PEPs. It was > rejected by old metadata code in distlib (which came of out the Python 3.3 > era "packaging" package - not to be confused with Donald's of the same name - > which is strict in its interpretation of those earlier PEPs). > > The METADATA format (key-value) is not really flexible enough for certain > things which were in PEP 426 (e.g. dependency descriptions), and for these > JSON seems a reasonable fit. The current de facto standard set by setuptools and bdist_wheel is: - dist-info/METADATA as defined at https://packaging.python.org/specifications/#package-distribution-metadata - dist-info/requires.txt runtime dependencies as defined at http://setuptools.readthedocs.io/en/latest/formats.html#requires-txt - dist-info/setup_requires.txt build time dependencies as defined at http://setuptools.readthedocs.io/en/latest/formats.html#setup-requires-txt The dependency fields in METADATA itself unfortunately aren't really useful for anything. There's definitely still a place for a pydist.json created by going through PEP 426, comparing it to what bdist_wheel already does to populate metadata.json, and either changing the PEP to match the existing practice, or else agreeing that we prefer what the PEP recommends, that we want to move in that direction, and that there's a definite commitment to implement the changes in at least setuptools and bdist_wheel (plus a migration strategy that allows for reasonably sensible consumption of old metadata). Such an update would necessarily be a fairly ruthless process, where we defer everything that can possibly be deferred. I already made one pass at that when I split out the metadata extensions into PEP 459, but at least one more such pass is needed before we can sign off on the spec as metadata 2.0 - even beyond any "open for discussion" questions, there are still things in there which were extracted and standardised separately in PEP 508. > There's no technical reason why "the JSON thing > didn't work out", as far as I can see - it was just given up on for a more > incremental approach (which has got no new PEPs other than 440, AFAICT). Yep, it's a logistical problem rather than a technical problem per se - new metadata formats need software publisher adoption to ensure the design is sensible before we commit to them long term, but software publishers are understandably reluctant to rely on new formats that limit their target audience to folks running the latest versions of the installation tools (outside constrained cases where the software publisher is also the main consumer of that software). For PEP 440 (version specifiers) and PEP 508 (dependency specifiers), this was handled by focusing on documenting practices that people already used (and checking existing PyPI projects for compatibility), rather than trying to actively change those practices. For pyproject.toml (e.g. enscons), the idea is to provide a setup.py shim that can take care of bootstrapping the new approach for the benefit of older tools that assume the use of setup.py (similar to what was done with setup.cfg and d2to1). 
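Since the METADATA file listed above is a key: value (RFC 822 style) document, the standard library's email parser is enough to read it; this is a small sketch with a made-up path, not anything prescribed by the specs.

```python
from email.parser import Parser

def read_metadata(path):
    with open(path, encoding="utf-8") as f:
        msg = Parser().parse(f)
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        # Multiple-use fields come back as lists.
        "requires_dist": msg.get_all("Requires-Dist") or [],
        "provides_extra": msg.get_all("Provides-Extra") or [],
    }

print(read_metadata("example.dist-info/METADATA"))  # hypothetical path
```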
The equivalent for PEP 426 would probably be legacy-to-pydist and pydist-to-legacy converters that setuptools, bdist_wheel and other publishing tools can use to ship legacy metadata alongside the standardised format (and I believe Daniel already has at least the former in order to generate metadata.json in bdist_wheel). With PEP 426 as currently written, a pydist-to-legacy converter isn't really feasible, since pydist proposes new concepts that can't be readily represented in the old format. > I understand that social reasons are often more important than technical reasons > when it comes to success or failure of an approach; I'm just not sure that > in this case, it wasn't given up on too early. I think of PEP 426 as "deferred indefinitely pending specific practical problems to provide clearer design constraints" rather than abandoned :) There are two recent developments that I think may provide those missing design constraints and hence motivation to finalise a metadata 2.0 specification: 1. the wheel-to-egg support in humpty (and hence zc.buiidout). That makes humpty a concrete non-traditional installer that would benefit from both a modernised standard metadata format, as well as common tools both to convert legacy metadata to the agreed modern format and to convert the modern format back to the legacy format for inclusion in the generated egg files (as then humpty could just re-use the shared tools, rather than having to maintain those capabilities itself). 2. the new pipenv project to provide a simpler alternative to the pip+virtualenv+pip-tools combination for environment management in web service development (and similar layered application architectures). As with the "install vs setup" split in setuptools, pipenv settled on an "only two kinds of requirement (deployment and development)" model for usability reasons, but it also distinguishes abstract dependencies stored in Pipfile from pinned concrete dependencies stored in Pipfile.lock. If we put those together with the existing interest in automating generation of policy compliant operating system distribution packages, it makes it easier to go through the proposed semantic dependency model in PEP 426 and ask "How would we populate these fields based on the metadata that projects *already* publish?". - "run requires": straightforward, as these are the standard dependencies used in most projects. Not entirely clear how to gently (or strongly!) discourage dependency pinning when publishing to PyPI (although the Pipfile and Pipfile.lock model used in pipenv may help with this) - "meta requires": not clear at all, as this was added to handle cases like PyObjC, where the main package is just a metapackage that makes a particular set of versioned subpackages easy to install. This may be better modeled as a separate "integrates" field, using a declaration syntax more akin to that used for Pipfile.lock rather than that used for normal requirements declarations. 
- "dev requires": corresponds to "dev-packages" in pipenv - "build requires": corresponds to "setup_requires" in setuptools, "build-system.requires" + any dynamic build dependencies in PEP 518 - "test requires": corresponds to "test" extra in https://packaging.python.org/specifications/#provides-extra-multiple-use The "doc" extra in https://packaging.python.org/specifications/#provides-extra-multiple-use would map to "build requires", but there's potential benefit to redistributors in separating it out, as we often split the docs out from the built software components (since there's little reason to install documentation on headless servers that are only going to be debugged remotely). The main argument against "test requires" and "doc requires" is that the extras system already works fine for those - "pip install MyProject[test]" and "pip install MyProject[doc]" are both already supported, so metadata 2.0 just needs to continue to reserve those as semantically significant extras names. "dev" requires could be handled the same way - anything you actually need to *build* an sdist or wheel archive from a source repository should be in "setup_requires" (setuptools) or "build-system.requires" (pyproject.toml), so "dev" would just be a conventional extra name rather than a top level field. That just leaves "build_requires", which turns out to interact awkwardly with the "extras" system: if you write "pip install MyProject[test]", does it install all the "test" dependencies, regardless of whether they're listed in run_requires or build_requires? If yes: then why are run_requires and build_requires separate? If no: then how do you request installation of the "test" build extra? Or are build extras prohibited entirely? That suggests that perhaps "build" should just be a conventional extra as well, and considered orthogonal to the other conventional extras. (I'm sure this idea has been suggested before, but I don't recall who suggested it or when) And if build, test, doc, and dev are all handled as extras, then the top level name "run_requires" no longer makes sense, and the field name should go back to just being "requires". Under that evaluation, we'd be left with only the following top level fields defined for dependency declarations: - "requires": list where entries are either a string containing a PEP 508 dependency specifier or else a hash map contain a "requires" key plus "extra" or "environment" fields as qualifiers - "integrates": replacement for "meta_requires" that only allows pinned dependencies (i.e. hash maps with "name" & "version" fields, or direct URL references, rather than a general PEP 508 specifier as a string) For converting old metadata, any concrete dependencies that are compatible with the "integrates" field format would be mapped that way, while everything else would be converted to "requires" entries. The semantic differences between normal runtime dependencies and "dev", "test", "doc" and "build" requirements would be handled as extras, regardless of whether you were using the old metadata format or the new one. Going the other direction would be similarly straightforward since (excluding extensions) the set of required conceptual entities has been reduced back to the set that already exists in the current metadata formats. While "requires" and "integrates" would be distinct fields in pydist.json, the decomposed fields in the latter would map back to their string-based counterparts in PEP 508 when converted to the legacy metadata formats. Cheers, Nick. P.S. 
I'm definitely open to a PR that amends the PEP 426 draft along these lines. I'll get to it eventually myself, but there are some other things I see as higher priority for my open source time at the moment (specifically the C locale handling behaviour of Python 3.6 in Fedora 26 and the related upstream proposal for Python 3.7 in PEP 538) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Wed Feb 15 06:58:51 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 15 Feb 2017 03:58:51 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan wrote: > - "requires": list where entries are either a string containing a PEP > 508 dependency specifier or else a hash map contain a "requires" key > plus "extra" or "environment" fields as qualifiers > - "integrates": replacement for "meta_requires" that only allows > pinned dependencies (i.e. hash maps with "name" & "version" fields, or > direct URL references, rather than a general PEP 508 specifier as a > string) What's accomplished by separating these? I really think we should strive to have fewer more orthogonal concepts whenever possible... -n -- Nathaniel J. Smith -- https://vorpus.org From dc_isar at yahoo.co.uk Tue Feb 14 00:27:15 2017 From: dc_isar at yahoo.co.uk (Chitra Dewan) Date: Tue, 14 Feb 2017 05:27:15 +0000 (UTC) Subject: [Distutils] Python installation not working References: <2038221091.6741722.1487050035224.ref@mail.yahoo.com> Message-ID: <2038221091.6741722.1487050035224@mail.yahoo.com> Hello, I am beginner in Python?I am facing problems in installing Python 3.5 ?on my windows vista x32 machine.I downloaded?python-3.5.2.exe from Python.org. It is downloaded as an exe. When I try to install it via ?"Run as administrator" , nothing happens. ?Same behavior with 3.6 version? kindly advise? ?Regards & Thanks, Chitra Dewan -------------- next part -------------- An HTML attachment was scrubbed... URL: From VenkatRamReddy.k at hcl.com Tue Feb 14 01:48:12 2017 From: VenkatRamReddy.k at hcl.com (Venkat Ram Reddy K) Date: Tue, 14 Feb 2017 06:48:12 +0000 Subject: [Distutils] py2exe package for 2.7 Message-ID: Hi Good Afternoon, This is Venkat from HCL Technologies. Actually I have created executable file(test.exe) by using py2exe package on python 2.7 version on Windows. After that I have ran my application from the path C:\Python27\dist\test.exe, It was executed and working properly. But the problem is, when I have copied test.exe to other folder(other than "C:\Python27\dist\")and tried to run the test.exe, it is not executing. Could you please help me in resolving the issue. Thanks, Venkat. ::DISCLAIMER:: ---------------------------------------------------------------------------------------------------------------------------------------------------- The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents (with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates. 
Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of authorized representative of HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any email and/or attachments, please check them for viruses and other defects. ---------------------------------------------------------------------------------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Feb 15 08:00:59 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 15 Feb 2017 07:00:59 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 5:33 AM, Nick Coghlan wrote: > On 14 February 2017 at 21:21, Vinay Sajip via Distutils-SIG > wrote: > > > > > >> I thought the current status was that it's called metadata.json > >> exactly *because* it's not standardized, and you *shouldn't* look at > >> it? > > > > > > Well, it was work-in-progress-standardised according to PEP 426 (since > > sometimes implementations have to work in parallel with working out the > > details of specifications). Given that PEP 426 wasn't done and dusted > > but being progressed, I would have thought it perfectly acceptable to > > use "pydist.json", as the only things that would be affected would be > > packaging tools working to the PEP. > > I asked Daniel to *stop* using pydist.json, since wheel was emitting a > point-in-time snapshot of PEP 426 (which includes a lot of > potentially-nice-to-have things that nobody has actually implemented > so far, like the semantic dependency declarations and the enhancements > to the extras syntax), rather than the final version of the spec. > Would you send a link to the source for this? > > >> It's too bad that the JSON thing didn't work out, but I think we're > >> better off working on better specifying the one source of truth > >> everything already uses (METADATA) instead of bringing in *new* > >> partially-incompatible-and-poorly-specified formats. > > > > When you say "everything already uses", do you mean setuptools and wheel? > > If nobody else is allowed to play, that's one thing. But otherwise, there > > need to be standards for interoperability. The METADATA file, now - > exactly > > which standard does it follow? The one in the dateutil wheel that Jim > > referred to doesn't appear to conform to any of the metadata PEPs. It was > > rejected by old metadata code in distlib (which came of out the Python > 3.3 > > era "packaging" package - not to be confused with Donald's of the same > name - > > which is strict in its interpretation of those earlier PEPs). > > > > The METADATA format (key-value) is not really flexible enough for certain > > things which were in PEP 426 (e.g. dependency descriptions), and for > these > > JSON seems a reasonable fit. 
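(To make that contrast concrete: the same dependency data as flat key-value fields, and then in the hash-map style sketched further down the thread. Both snippets are purely illustrative, not taken from any real package:

    Requires-Dist: requests (>=2.4)
    Requires-Dist: pywin32 (>=1.0); sys_platform == 'win32'
    Requires-Dist: pytest; extra == 'test'

versus something like

    "requires": [
      {"requires": ["requests (>=2.4)"]},
      {"requires": ["pywin32 (>=1.0)"], "environment": "sys_platform == 'win32'"},
      {"requires": ["pytest"], "extra": "test"}
    ]

)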
> > The current de facto standard set by setuptools and bdist_wheel is: > > - dist-info/METADATA as defined at > https://packaging.python.org/specifications/#package-distribution-metadata > - dist-info/requires.txt runtime dependencies as defined at > http://setuptools.readthedocs.io/en/latest/formats.html#requires-txt > - dist-info/setup_requires.txt build time dependencies as defined at > http://setuptools.readthedocs.io/en/latest/formats.html#setup-requires-txt > > The dependency fields in METADATA itself unfortunately aren't really > useful for anything. > Graph: Nodes and edges. > > There's definitely still a place for a pydist.json created by going > through PEP 426, comparing it to what bdist_wheel already does to > populate metadata.json, and either changing the PEP to match the > existing practice, or else agreeing that we prefer what the PEP > recommends, that we want to move in that direction, and that there's a > definite commitment to implement the changes in at least setuptools > and bdist_wheel (plus a migration strategy that allows for reasonably > sensible consumption of old metadata). > Which function reads metadata.json? Which function reads pydist.json? > > Such an update would necessarily be a fairly ruthless process, where > we defer everything that can possibly be deferred. I already made one > pass at that when I split out the metadata extensions into PEP 459, > but at least one more such pass is needed before we can sign off on > the spec as metadata 2.0 - even beyond any "open for discussion" > questions, there are still things in there which were extracted and > standardised separately in PEP 508. > > > There's no technical reason why "the JSON thing > > didn't work out", as far as I can see - it was just given up on for a > more > > incremental approach (which has got no new PEPs other than 440, AFAICT). > > Yep, it's a logistical problem rather than a technical problem per se > - new metadata formats need software publisher adoption to ensure the > design is sensible before we commit to them long term, but software > publishers are understandably reluctant to rely on new formats that > limit their target audience to folks running the latest versions of > the installation tools (outside constrained cases where the software > publisher is also the main consumer of that software). > An RDFS Vocabulary contains Classes and Properties with rdfs:ranges and rdfs:domains. There are many representations for RDF: RDF/XML, Turtle/N3, JSONLD. RDF is implementation-neutral. JSONLD is implementation-neutral. > > For PEP 440 (version specifiers) and PEP 508 (dependency specifiers), > this was handled by focusing on documenting practices that people > already used (and checking existing PyPI projects for compatibility), > rather than trying to actively change those practices. > > For pyproject.toml (e.g. enscons), the idea is to provide a setup.py > shim that can take care of bootstrapping the new approach for the > benefit of older tools that assume the use of setup.py (similar to > what was done with setup.cfg and d2to1). > > The equivalent for PEP 426 would probably be legacy-to-pydist and > pydist-to-legacy converters that setuptools, bdist_wheel and other > publishing tools can use to ship legacy metadata alongside the > standardised format (and I believe Daniel already has at least the > former in order to generate metadata.json in bdist_wheel). 
With PEP > 426 as currently written, a pydist-to-legacy converter isn't really > feasible, since pydist proposes new concepts that can't be readily > represented in the old format. > pydist-to-legacy would be a lossy transformation. > > > I understand that social reasons are often more important than technical > reasons > > when it comes to success or failure of an approach; I'm just not sure > that > > in this case, it wasn't given up on too early. > > I think of PEP 426 as "deferred indefinitely pending specific > practical problems to provide clearer design constraints" rather than > abandoned :) > Is it too late to request lowercased property names without dashes? If we're (I'm?) going to create @context URIs, compare: https://schema.python.org/v1#Provides-Extra { "@context": { "default": "https://schema.python.org/#", "schema": "http://schema.org/", # "name": "http://schema.org/name", # "url": "http://schema.org/url", # "verstr": # "extra": # "requirements" # "requirementstr" }, "@typeof": [ "py:PythonPackage"], "name": "IPython", "url": ["https://pypi.python.org/pypi/IPython", "https://pypi.org/project/ IPython"], "Provides-Extra": [ {"@typeof": "Requirement", "name": "notebook", "extra": ["notebook"], "requirements": [], #TODO "requirementstr": "extra == 'notebook'" }, {"name": "numpy", "extra": ["test"], "requirements": #TODO, "requirementstr": "python_version >= \"3.4\" and extra == 'test'" }, ... ] } > There are two recent developments that I think may provide those > missing design constraints and hence motivation to finalise a metadata > 2.0 specification: > > 1. the wheel-to-egg support in humpty (and hence zc.buiidout). That > makes humpty a concrete non-traditional installer that would benefit > from both a modernised standard metadata format, as well as common > tools both to convert legacy metadata to the agreed modern format and > to convert the modern format back to the legacy format for inclusion > in the generated egg files (as then humpty could just re-use the > shared tools, rather than having to maintain those capabilities > itself). class PackageMetadata def __init__(): self.data = collections.OrderedDict() @staticmethod def read_legacy() def read_metadata_json() def read_pydist_json() def read_pyproject_toml() def read_jsonld() def to_legacy(): def to_metadata_json() def to_pydist_json() def to_pyproject_toml() def to_jsonld() @classmethod def Legacy() def MetadataJson() def PydistJson() def PyprojectToml() def Jsonld(cls, *args, **kwargs) obj = cls(*args, **kwargs) obj.read_jsonld(*args, **kwargs) return obj @classmethod def from(cls, path, format='legacy|metadatajson|pydistjson|pyprojecttoml|jsonld'): # or this ... for maximum reusability, we really shouldn't need an adapter registry here; > 2. the new pipenv project to provide a simpler alternative to the > pip+virtualenv+pip-tools combination for environment management in web > service development (and similar layered application architectures). > As with the "install vs setup" split in setuptools, pipenv settled on > an "only two kinds of requirement (deployment and development)" model > for usability reasons, but it also distinguishes abstract dependencies > stored in Pipfile from pinned concrete dependencies stored in > Pipfile.lock. > Does the Pipfile/Pipfile.lock distinction overlap with 'integrates' as a replacement for meta_requires? 
> > If we put those together with the existing interest in automating > generation of policy compliant operating system distribution packages, > Downstream OS packaging could easily (and without permission) include extra attributes (properties specified with full URIS) in JSONLD metadata. > it makes it easier to go through the proposed semantic dependency > model in PEP 426 and ask "How would we populate these fields based on > the metadata that projects *already* publish?". > See 'class PackageMetadata' > > - "run requires": straightforward, as these are the standard > dependencies used in most projects. Not entirely clear how to gently > (or strongly!) discourage dependency pinning when publishing to PyPI > (although the Pipfile and Pipfile.lock model used in pipenv may help > with this) > - "meta requires": not clear at all, as this was added to handle cases > like PyObjC, where the main package is just a metapackage that makes a > particular set of versioned subpackages easy to install. This may be > better modeled as a separate "integrates" field, using a declaration > syntax more akin to that used for Pipfile.lock rather than that used > for normal requirements declarations. > - "dev requires": corresponds to "dev-packages" in pipenv > - "build requires": corresponds to "setup_requires" in setuptools, > "build-system.requires" + any dynamic build dependencies in PEP 518 > - "test requires": corresponds to "test" extra in > https://packaging.python.org/specifications/#provides-extra-multiple-use > > The "doc" extra in > https://packaging.python.org/specifications/#provides-extra-multiple-use > would map to "build requires", but there's potential benefit to > redistributors in separating it out, as we often split the docs out > from the built software components (since there's little reason to > install documentation on headless servers that are only going to be > debugged remotely). > > The main argument against "test requires" and "doc requires" is that > the extras system already works fine for those - "pip install > MyProject[test]" and "pip install MyProject[doc]" are both already > supported, so metadata 2.0 just needs to continue to reserve those as > semantically significant extras names. > > "dev" requires could be handled the same way - anything you actually > need to *build* an sdist or wheel archive from a source repository > should be in "setup_requires" (setuptools) or "build-system.requires" > (pyproject.toml), so "dev" would just be a conventional extra name > rather than a top level field. > > That just leaves "build_requires", which turns out to interact > awkwardly with the "extras" system: if you write "pip install > MyProject[test]", does it install all the "test" dependencies, > regardless of whether they're listed in run_requires or > build_requires? > > If yes: then why are run_requires and build_requires separate? > If no: then how do you request installation of the "test" build extra? > Or are build extras prohibited entirely? > > That suggests that perhaps "build" should just be a conventional extra > as well, and considered orthogonal to the other conventional extras. > (I'm sure this idea has been suggested before, but I don't recall who > suggested it or when) > > And if build, test, doc, and dev are all handled as extras, then the > top level name "run_requires" no longer makes sense, and the field > name should go back to just being "requires". 
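(For reference, this is how those conventional extras already get declared and requested today - standard setuptools/pip usage, project name made up:

    # setup.py (illustrative only)
    from setuptools import setup

    setup(
        name="MyProject",
        version="1.0",
        extras_require={
            "test": ["pytest"],
            "doc": ["sphinx"],
        },
    )

    # consumers then ask for an extra explicitly:
    #   pip install MyProject[test]

so treating "dev" and "build" the same way would just be more of the same.)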
> Under that evaluation, we'd be left with only the following top level > fields defined for dependency declarations: > > - "requires": list where entries are either a string containing a PEP > 508 dependency specifier or else a hash map contain a "requires" key > plus "extra" or "environment" fields as qualifiers > +1 > - "integrates": replacement for "meta_requires" that only allows > pinned dependencies (i.e. hash maps with "name" & "version" fields, or > direct URL references, rather than a general PEP 508 specifier as a > string) > Pipfile.lock? What happens here when something is listed in both requires and integrates? Where/do these get merged on the "name" attr as a key, given a presumed namespace URI prefix (https://pypi.org/project/)? > > For converting old metadata, any concrete dependencies that are > compatible with the "integrates" field format would be mapped that > way, while everything else would be converted to "requires" entries. > What heuristic would help identify compatibility with the integrates field? > The semantic differences between normal runtime dependencies and > "dev", "test", "doc" and "build" requirements would be handled as > extras, regardless of whether you were using the old metadata format > or the new one. > +1 from me. I can't recall whether I've used {"dev", "test", "doc", and "build"} as extras names in the past; though I can remember thinking "wouldn't it be more intuitive to do it [that way]" Is this backward compatible? Extras still work as extras? > > Going the other direction would be similarly straightforward since > (excluding extensions) the set of required conceptual entities has > been reduced back to the set that already exists in the current > metadata formats. While "requires" and "integrates" would be distinct > fields in pydist.json, the decomposed fields in the latter would map > back to their string-based counterparts in PEP 508 when converted to > the legacy metadata formats. > > Cheers, > Nick. > > P.S. I'm definitely open to a PR that amends the PEP 426 draft along > these lines. I'll get to it eventually myself, but there are some > other things I see as higher priority for my open source time at the > moment (specifically the C locale handling behaviour of Python 3.6 in > Fedora 26 and the related upstream proposal for Python 3.7 in PEP 538) > I need to find a job; my time commitment here is inconsistent. I'm working on a project (nbmeta) for generating, displaying, and embedding RDFa and JSONLD in Jupyter notebooks (w/ _repr_html_() and an OrderedDict) which should refresh the JSONLD @context-writing skills necessary to define the RDFS vocabulary we could/should have at https://schema.python.org/ . - [ ] JSONLD PEP (<- PEP426) - [ ] examples / test cases - I've referenced IPython as an example package; are there other hard test cases for python packaging metadata conversion? (i.e. one that uses every feature of each metadata format)? - [ ] JSONLD @context - [ ] class PackageMetadata - [ ] wheel: (additionally) generate JSONLD metadata - [ ] schema.python.org: master, gh-pages (or e.g. " https://www.pypa.io/ns#") - [ ] warehouse: add a ./jsonld view (to elgacy?) https://github.com/pypa/interoperability-peps/issues/31 > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Wed Feb 15 08:27:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 14:27:10 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 12:58, Nathaniel Smith wrote: > On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan wrote: >> - "requires": list where entries are either a string containing a PEP >> 508 dependency specifier or else a hash map contain a "requires" key >> plus "extra" or "environment" fields as qualifiers >> - "integrates": replacement for "meta_requires" that only allows >> pinned dependencies (i.e. hash maps with "name" & "version" fields, or >> direct URL references, rather than a general PEP 508 specifier as a >> string) > > What's accomplished by separating these? I really think we should > strive to have fewer more orthogonal concepts whenever possible... It's mainly a matter of incorporating https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core data model, as this distinction between abstract development dependencies and concrete deployment dependencies is incredibly important for any scenario that involves publisher-redistributor-consumer chains, but is entirely non-obvious to folks that are only familiar with the publisher-consumer case that comes up during development-for-personal-and-open-source-use. One particular area where this is problematic is in the widespread advice "always pin your dependencies" which is usually presented without the all important "for application or service deployment" qualifier. As a first approximation: pinning-for-app-or-service-deployment == good, pinning-for-local-testing == good, pinning-for-library-or-framework-publication-to-PyPI == bad. pipenv borrows the Ruby solution to modeling this by having Pipfile for abstract dependency declarations and Pipfile.lock for concrete integration testing ones, so the idea here is to propagate that model to pydist.json by separating the "requires" field with abstract development dependencies from the "integrates" field with concrete deployment dependencies. In the vast majority of publication-to-PyPi cases people won't need the "integrates" field, since what they're publishing on PyPI will just be their abstract dependencies, and any warning against using "==" will recommend using "~=" or ">=" instead. But there *are* legitimate uses of pinning-for-publication (like the PyObjC metapackage bundling all its subcomponents, or when building for private deployment infastructure), so there needs to be a way to represent "Yes, I'm pinning this dependency for publication, and I'm aware of the significance of doing so" Cheers, Nick. 
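P.S. For concreteness, the sort of shape I have in mind here (field layout per the draft discussion above; project names and version numbers are made up) is roughly:

    {
      "requires": [
        "requests ~= 2.4",
        {"requires": ["pywin32 >= 1.0"], "environment": "sys_platform == 'win32'"}
      ],
      "integrates": [
        {"name": "myproject-core", "version": "1.2.3"},
        {"name": "myproject-gui", "version": "1.2.3"}
      ]
    }

with PyPI free to warn about "==" in the first field, while accepting it (as pinned name/version pairs or direct URL references) in the second.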
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Wed Feb 15 09:11:47 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 15 Feb 2017 06:11:47 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 5:27 AM, Nick Coghlan wrote: > On 15 February 2017 at 12:58, Nathaniel Smith wrote: >> On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan wrote: >>> - "requires": list where entries are either a string containing a PEP >>> 508 dependency specifier or else a hash map contain a "requires" key >>> plus "extra" or "environment" fields as qualifiers >>> - "integrates": replacement for "meta_requires" that only allows >>> pinned dependencies (i.e. hash maps with "name" & "version" fields, or >>> direct URL references, rather than a general PEP 508 specifier as a >>> string) >> >> What's accomplished by separating these? I really think we should >> strive to have fewer more orthogonal concepts whenever possible... > > It's mainly a matter of incorporating > https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core > data model, as this distinction between abstract development > dependencies and concrete deployment dependencies is incredibly > important for any scenario that involves > publisher-redistributor-consumer chains, but is entirely non-obvious > to folks that are only familiar with the publisher-consumer case that > comes up during development-for-personal-and-open-source-use. Maybe I'm just being dense but, umm. I don't know what any of these words mean :-). I'm not unfamiliar with redistributors; part of my confusion is that this is a concept that AFAIK distro package systems don't have. Maybe it would help if you have a concrete example of a scenario where they would benefit from having this distinction? > One particular area where this is problematic is in the widespread > advice "always pin your dependencies" which is usually presented > without the all important "for application or service deployment" > qualifier. As a first approximation: > pinning-for-app-or-service-deployment == good, > pinning-for-local-testing == good, > pinning-for-library-or-framework-publication-to-PyPI == bad. > > pipenv borrows the Ruby solution to modeling this by having Pipfile > for abstract dependency declarations and Pipfile.lock for concrete > integration testing ones, so the idea here is to propagate that model > to pydist.json by separating the "requires" field with abstract > development dependencies from the "integrates" field with concrete > deployment dependencies. What's the benefit of putting this in pydist.json? I feel like for the usual deployment cases (a) going straight from Pipfile.lock -> venv is pretty much sufficient, with no need to put this into a package, but (b) if you really do want to put it into a package, then the natural approach would be to make an empty wheel like "my-django-app-deploy.whl" whose dependencies were the contents of Pipfile.lock. There's certainly a distinction to be made between the abstract dependencies and the exact locked dependencies, but to me the natural way to model that distinction is by re-using the distinction we already have been source packages and binary packages. 
The build process for this placeholder wheel is to "compile down" the abstract dependencies into concrete dependencies, and the resulting wheel encodes the result of this compilation. Again, no new concepts needed. > In the vast majority of publication-to-PyPi cases people won't need > the "integrates" field, since what they're publishing on PyPI will > just be their abstract dependencies, and any warning against using > "==" will recommend using "~=" or ">=" instead. But there *are* > legitimate uses of pinning-for-publication (like the PyObjC > metapackage bundling all its subcomponents, or when building for > private deployment infastructure), so there needs to be a way to > represent "Yes, I'm pinning this dependency for publication, and I'm > aware of the significance of doing so" Why can't PyObjC just use regular dependencies? That's what distro metapackages have done for decades, right? -n -- Nathaniel J. Smith -- https://vorpus.org From dholth at gmail.com Wed Feb 15 09:24:41 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Feb 2017 14:24:41 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: Wheel puts everything important in METADATA, except entry_points.txt. The requirements expressed there under 'Requires-Dist' are reliable, and the full METADATA format is documented in the pre-JSON revision of PEP 426. At runtime, once pkg_resources parses it, *.egg-info and *.dist-info look identical, because it's just a different way to represent the same data. Wheel's version of METADATA exists as the simplest way to add the critical 'extras' feature to distutils2-era *.dist-info/METADATA, necessary to losslessly represent setuptools packages in a more PEP-standard way. I could have completely redesigned the METADATA format instead of extending it, but then I would have run out of time and wheel would not exist. This function converts egg-info metadata to METADATA https://bitbucket.org/pypa/wheel/src/54ddbcc9cec25e1f4d111a142b8bfaa163130a61/wheel/metadata.py?at=default&fileviewer=file-view-default#metadata.py-240 This one converts to the JSON format. It looks like it might work with PKG-INFO or METADATA. https://bitbucket.org/pypa/wheel/src/54ddbcc9cec25e1f4d111a142b8bfaa163130a61/wheel/metadata.py?at=default&fileviewer=file-view-default#metadata.py-98 On Wed, Feb 15, 2017 at 8:27 AM Nick Coghlan wrote: > On 15 February 2017 at 12:58, Nathaniel Smith wrote: > > On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan > wrote: > >> - "requires": list where entries are either a string containing a PEP > >> 508 dependency specifier or else a hash map contain a "requires" key > >> plus "extra" or "environment" fields as qualifiers > >> - "integrates": replacement for "meta_requires" that only allows > >> pinned dependencies (i.e. hash maps with "name" & "version" fields, or > >> direct URL references, rather than a general PEP 508 specifier as a > >> string) > > > > What's accomplished by separating these? I really think we should > > strive to have fewer more orthogonal concepts whenever possible... 
> > It's mainly a matter of incorporating > https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core > data model, as this distinction between abstract development > dependencies and concrete deployment dependencies is incredibly > important for any scenario that involves > publisher-redistributor-consumer chains, but is entirely non-obvious > to folks that are only familiar with the publisher-consumer case that > comes up during development-for-personal-and-open-source-use. > > One particular area where this is problematic is in the widespread > advice "always pin your dependencies" which is usually presented > without the all important "for application or service deployment" > qualifier. As a first approximation: > pinning-for-app-or-service-deployment == good, > pinning-for-local-testing == good, > pinning-for-library-or-framework-publication-to-PyPI == bad. > > pipenv borrows the Ruby solution to modeling this by having Pipfile > for abstract dependency declarations and Pipfile.lock for concrete > integration testing ones, so the idea here is to propagate that model > to pydist.json by separating the "requires" field with abstract > development dependencies from the "integrates" field with concrete > deployment dependencies. > > In the vast majority of publication-to-PyPi cases people won't need > the "integrates" field, since what they're publishing on PyPI will > just be their abstract dependencies, and any warning against using > "==" will recommend using "~=" or ">=" instead. But there *are* > legitimate uses of pinning-for-publication (like the PyObjC > metapackage bundling all its subcomponents, or when building for > private deployment infastructure), so there needs to be a way to > represent "Yes, I'm pinning this dependency for publication, and I'm > aware of the significance of doing so" > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Feb 15 09:55:53 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 15:55:53 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 14:00, Wes Turner wrote: > On Wed, Feb 15, 2017 at 5:33 AM, Nick Coghlan wrote: >> I asked Daniel to *stop* using pydist.json, since wheel was emitting a >> point-in-time snapshot of PEP 426 (which includes a lot of >> potentially-nice-to-have things that nobody has actually implemented >> so far, like the semantic dependency declarations and the enhancements >> to the extras syntax), rather than the final version of the spec. > > Would you send a link to the source for this? 
It came up when Vinay reported a problem with the way bdist_wheel was handling combined extras and environment marker definitions: https://bitbucket.org/pypa/wheel/issues/103/problem-with-currently-generated >> - dist-info/METADATA as defined at >> https://packaging.python.org/specifications/#package-distribution-metadata >> - dist-info/requires.txt runtime dependencies as defined at >> http://setuptools.readthedocs.io/en/latest/formats.html#requires-txt >> - dist-info/setup_requires.txt build time dependencies as defined at >> http://setuptools.readthedocs.io/en/latest/formats.html#setup-requires-txt >> >> The dependency fields in METADATA itself unfortunately aren't really >> useful for anything. > > Graph: Nodes and edges. Unfortunately, it's not that simple, since: - dependency declarations refer to time dependent node *sets*, not to specific edges - node resolution is not only time dependent, but also DNS and client configuration dependent - this is true even for "pinned" dependencies due to the way "==" handles post-releases and local build IDs - the legacy module based declarations are inconsistently populated and don't refer to nodes by a useful name - the new distribution package based declarations refer to nodes by a useful name, but largely aren't populated By contrast, METADATA *does* usefully define nodes in the graph, while requires.txt and setup_requires.txt can be used to extract edges when combined with suitable additional data sources (primarily a nominated index server or set of index servers to use for dependency specifier resolution). >> There's definitely still a place for a pydist.json created by going >> through PEP 426, comparing it to what bdist_wheel already does to >> populate metadata.json, and either changing the PEP to match the >> existing practice, or else agreeing that we prefer what the PEP >> recommends, that we want to move in that direction, and that there's a >> definite commitment to implement the changes in at least setuptools >> and bdist_wheel (plus a migration strategy that allows for reasonably >> sensible consumption of old metadata). > > Which function reads metadata.json? Likely eventually nothing, since anything important that it contains will be readable from either pydist.json or from the other legacy metadata files. > Which function reads pydist.json? Eventually everything, with tools falling back to dynamically generating it from legacy metadata formats as a transition plan to handle component releases made with older toolchains. > An RDFS Vocabulary contains Classes and Properties with rdfs:ranges and > rdfs:domains. > > There are many representations for RDF: RDF/XML, Turtle/N3, JSONLD. > > RDF is implementation-neutral. JSONLD is implementation-neutral. While true, both of these are still oriented towards working with a *resolved* graph snapshot, rather than a deliberately underspecified graph description that requires subsequent resolution within the node set of a particular index server (or set of index servers). Just incorporating the time dimension is already messy, even before accounting for the fact that the metadata carried with along the artifacts is designed to be independent of the particular server that happens to be hosting it. 
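To illustrate the "edges still need resolving" point: wherever the requirement strings come from (requires.txt, Requires-Dist, or a future pydist.json), something like the `packaging` project's Requirement class can decompose them into edge *descriptions*, but turning those into actual edges to release nodes is a separate, index- and time-dependent step. A sketch only, nothing new being proposed:

    from packaging.requirements import Requirement

    def edge_descriptions(requirement_strings):
        # Each entry describes a *set* of acceptable nodes, not a single edge.
        edges = []
        for raw in requirement_strings:
            req = Requirement(raw)
            edges.append({
                "name": req.name,
                "specifier": str(req.specifier),
                "marker": str(req.marker) if req.marker else None,
            })
        return edges

    print(edge_descriptions(["requests>=2.4", "pywin32>=1.0; sys_platform == 'win32'"]))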
Tangent: if anyone is looking for an open source stack for working with distributed graph storage manipulation from Python, the combination of http://janusgraph.org/ and https://pypi.org/project/gremlinpython/ is well worth a look ;) >> The equivalent for PEP 426 would probably be legacy-to-pydist and >> pydist-to-legacy converters that setuptools, bdist_wheel and other >> publishing tools can use to ship legacy metadata alongside the >> standardised format (and I believe Daniel already has at least the >> former in order to generate metadata.json in bdist_wheel). With PEP >> 426 as currently written, a pydist-to-legacy converter isn't really >> feasible, since pydist proposes new concepts that can't be readily >> represented in the old format. > > pydist-to-legacy would be a lossy transformation. Given appropriate use of the "extras" system and a couple of new METADATA fields, it doesn't have to be, at least for the initial version - that's the new design constraint I'm proposing for everything that isn't defined as a metadata extension. The rationale being that if legacy dependency metadata can be reliably generated from the new format, that creates an incentive for *new* tools to adopt it ("generate the new format, get the legacy formats for free"), while also offering a clear migration path for existing publishing tools (refactor their metadata generation to produce the new format only, then derive the legacy metadata files from that) and consumption tools (consume the new fields immediately, look at consuming the new files later). >> > I understand that social reasons are often more important than technical >> > reasons >> > when it comes to success or failure of an approach; I'm just not sure >> > that >> > in this case, it wasn't given up on too early. >> >> I think of PEP 426 as "deferred indefinitely pending specific >> practical problems to provide clearer design constraints" rather than >> abandoned :) > > Is it too late to request lowercased property names without dashes? That's already the case in PEP 426 as far as I know. > class PackageMetadata > def __init__(): > self.data = collections.OrderedDict() > @staticmethod > def read_legacy() > def read_metadata_json() > def read_pydist_json() > def read_pyproject_toml() > def read_jsonld() > > def to_legacy(): > def to_metadata_json() > def to_pydist_json() > def to_pyproject_toml() > def to_jsonld() > > @classmethod > def Legacy() > def MetadataJson() > def PydistJson() > def PyprojectToml() > def Jsonld(cls, *args, **kwargs) > obj = cls(*args, **kwargs) > obj.read_jsonld(*args, **kwargs) > return obj > > @classmethod > def from(cls, path, > format='legacy|metadatajson|pydistjson|pyprojecttoml|jsonld'): > # or this > > > ... for maximum reusability, we really shouldn't need an adapter registry > here; I'm not really worried about the Python API at this point, I'm interested in the isomorphism of the data formats to help streamline the migration (as that's the current main problem with PEP 426). But yes, just as packaging grew "LegacyVersion" *after* PEP 440 defined the strict forward looking semantics, it will likely grow some additional tools for reading and converting the legacy formats once there's a clear pydist.json specification to document the semantics of the translated fields. >> 2. the new pipenv project to provide a simpler alternative to the >> pip+virtualenv+pip-tools combination for environment management in web >> service development (and similar layered application architectures). 
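(For anyone that hasn't bumped into it, that split is already visible at the API level - purely illustrative:

    from packaging.version import parse, Version

    for candidate in ("1.9", "1.10.0.dev1", "not-a-version"):
        parsed = parse(candidate)
        # PEP 440 compliant strings give Version; anything else falls back
        # to LegacyVersion, which still compares, just by the older looser rules.
        print(candidate, type(parsed).__name__, isinstance(parsed, Version))

The legacy *metadata* tooling would presumably follow the same "strict spec plus permissive fallback" pattern.)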
>> As with the "install vs setup" split in setuptools, pipenv settled on >> an "only two kinds of requirement (deployment and development)" model >> for usability reasons, but it also distinguishes abstract dependencies >> stored in Pipfile from pinned concrete dependencies stored in >> Pipfile.lock. > > Does the Pipfile/Pipfile.lock distinction overlap with 'integrates' as a > replacement for meta_requires? Somewhat - the difference is that where the concrete dependencies in Pipfile.lock are derived from the abstract dependencies in Pipfile, the separation in pydist.json would be a declaration of "Yes, I really did mean to publish this with a concrete dependency, it's not an accident". >> If we put those together with the existing interest in automating >> generation of policy compliant operating system distribution packages, > > > Downstream OS packaging could easily (and without permission) include extra > attributes (properties specified with full URIS) in JSONLD metadata. We can already drop arbitrary files into dist-info directories if we really want to, but in practice that extra metadata tends to end up in the system level package database rather than in the Python metadata. >> - "integrates": replacement for "meta_requires" that only allows >> pinned dependencies (i.e. hash maps with "name" & "version" fields, or >> direct URL references, rather than a general PEP 508 specifier as a >> string) > > > Pipfile.lock? > > What happens here when something is listed in both requires and integrates? Simplest would be to treat it the same way that tools treat mentioning the same component in multiple requirements entries (since that's really what you'd be doing). > Where/do these get merged on the "name" attr as a key, given a presumed > namespace URI prefix (https://pypi.org/project/)? For installation purposes, they'd be combined into a single requirements set. >> For converting old metadata, any concrete dependencies that are >> compatible with the "integrates" field format would be mapped that >> way, while everything else would be converted to "requires" entries. > > What heuristic would help identify compatibility with the integrates field? PEP 440 version matching (==), arbitrary equality (===), and direct references (@...), with the latter being disallowed on PyPI (but fine when using a private index server). >> The semantic differences between normal runtime dependencies and >> "dev", "test", "doc" and "build" requirements would be handled as >> extras, regardless of whether you were using the old metadata format >> or the new one. > > +1 from me. > > I can't recall whether I've used {"dev", "test", "doc", and "build"} as > extras names in the past; though I can remember thinking "wouldn't it be > more intuitive to do it [that way]" > > Is this backward compatible? Extras still work as extras? Yeah, this is essentially the way Provide-Extra ended up being documented in https://packaging.python.org/specifications/#provides-extra-multiple-use That already specifies the expected semantics for "test" and "doc", so it would be a matter of adding "dev" and "build" (as well as surveying PyPI for components that already defined those extras) >> P.S. I'm definitely open to a PR that amends the PEP 426 draft along >> these lines. 
I'll get to it eventually myself, but there are some >> other things I see as higher priority for my open source time at the >> moment (specifically the C locale handling behaviour of Python 3.6 in >> Fedora 26 and the related upstream proposal for Python 3.7 in PEP 538) > > I need to find a job; my time commitment here is inconsistent. Yeah, I assume work takes precedence for everyone, which is why I spend time needling redistributors and major end users about the disparity between "level of use" and "level of investment" when it comes to the upstream Python packaging ecosystem. While progress on that front isn't particularly visible yet, the nature of the conversations are changing in a good > I'm working on a project (nbmeta) for generating, displaying, and embedding > RDFa and JSONLD in Jupyter notebooks (w/ _repr_html_() and an OrderedDict) > which should refresh the JSONLD @context-writing skills necessary to define > the RDFS vocabulary we could/should have at https://schema.python.org/ . I'm definitely open to ensuring the specs are RDF/JSONLD friendly, especially as some of the characteristics of that are beneficial in other kinds of mappings as well (e.g. lists-of-hash-maps-with-fixed-key-names are easier to work with than hash-maps-with-data-dependent-key-names for a whole lot of reasons). > - [ ] JSONLD PEP (<- PEP426) > - [ ] examples / test cases > - I've referenced IPython as an example package; are there other hard > test cases for python packaging metadata conversion? (i.e. one that uses > every feature of each metadata format)? PyObjC is my standard example for legitimate version pinning in a public project (it's a metapackage where each release just depends on particular versions of the individual components) django-mezzanine is one I like as a decent example of a reasonably large dependency tree for something that still falls short of a complete application setuptools is a decent example for basic use of environment markers I haven't found great examples for defining lots of extras or using complex environment marker options (but I also haven't really gone looking) > - [ ] JSONLD @context > - [ ] class PackageMetadata > - [ ] wheel: (additionally) generate JSONLD metadata > - [ ] schema.python.org: master, gh-pages (or e.g. > "https://www.pypa.io/ns#") > > - [ ] warehouse: add a ./jsonld view (to elgacy?) This definitely won't be an option for the legacy service, but it could be an interesting addition to Warehouse. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Feb 15 09:58:01 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 14:58:01 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 14:11, Nathaniel Smith wrote: >> It's mainly a matter of incorporating >> https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core >> data model, as this distinction between abstract development >> dependencies and concrete deployment dependencies is incredibly >> important for any scenario that involves >> publisher-redistributor-consumer chains, but is entirely non-obvious >> to folks that are only familiar with the publisher-consumer case that >> comes up during development-for-personal-and-open-source-use. > > Maybe I'm just being dense but, umm. 
I don't know what any of these > words mean :-). I'm not unfamiliar with redistributors; part of my > confusion is that this is a concept that AFAIK distro package systems > don't have. Maybe it would help if you have a concrete example of a > scenario where they would benefit from having this distinction? I'm also finding this discussion bafflingly complex. I understand that distributions need a way to work with Python packages, but the issues involved seem completely divorced from the basic process of a user using pip to install a package with the dependencies it needs to work in their program. The package metadata standardisation process seems to be falling foul of a quest for perfection. Is there no 80% solution that covers the bulk of use cases (which, in my mind, are all around some user wanting to say "pip install" to grab some stuff off PyPI to build his project)? Or is the 80% solution precisely what we have at the moment, in which case can't we standardise what we have, and *then* look to extend to cover the additional requirements? I'm sure I'm missing something - but honestly, I'm not sure what it is. If I write something to go on PyPI, I assume that makes me a "publisher"? IMO, my audience is people who use my software (the "consumers" in your terms, I guess). While I'd be pleased if a distributor like Ubuntu or Fedora or Anaconda wanted to include my package in their distribution, I wouldn't see them as my end users - so while I'd be OK with tweaking my code/metadata to accommodate their needs, it's not a key goal for me. And learning all the metadata concepts related to packaging my project for distributors wouldn't appeal to me at all. I'd be happy for the distributions to to that and send me PRs, but the burden should be on them to do that. The complexities we're debating here seem to be based on the idea that *I* should understand the distributor's role in order to package my code "correctly". I'm not at all sure I agree with that. Maybe this is all a consequence of Python now being used in "big business", and the old individual developer scratching his or her own itch model is gone. And maybe that means PyPI is no longer a suitable place for such "use at your own risk" code But if that's the case, maybe we need to acknowledge that fact, before we end up with people getting the idea that "Python packaging is too complex for the average developer". Because it's starting to feel that way :-( Paul From vinay_sajip at yahoo.co.uk Wed Feb 15 10:31:04 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Wed, 15 Feb 2017 15:31:04 +0000 (UTC) Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <159421251.8715496.1487172664114@mail.yahoo.com> > the full METADATA format is documented in the pre-JSON revision of PEP 426. Can you confirm which exact revision in the PEPs repo you mean? I could guess at 0451397. That version does not refer to a field "Requires" (rather, the more recent "Requires-Dist"). Your conversion function reads the existing PKG-INFO, updates the Metadata-Version, and adds "Provides-Dist" and "Requires-Dist". It does not check whether the result conforms to that version of the PEP. 
For example, in the presence of "Requires" in PKG-INFO, you add "Requires-Dist", possibly leading to an ambiguity, because they sort of mean the same thing but could contain conflicting information (for example, different version constraints). The python-dateutils wheel which Jim referred to contained both "Requires" and "Requires-Dist" fields in its METADATA file, and, faced with a metadata set with both fields, the old packaging code used by distlib to handle the different metadata versions raised a "Unknown metadata set" error. In the face of ambiguity, it's refusing the temptation to guess :-) If the conversion function adds "Requires-Dist" but doesn't remove "Requires", I'm not sure it conforms to that version of the PEP. Regards, Vinay Sajip From dholth at gmail.com Wed Feb 15 10:40:47 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Feb 2017 15:40:47 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <159421251.8715496.1487172664114@mail.yahoo.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> <159421251.8715496.1487172664114@mail.yahoo.com> Message-ID: IIUC PEP 345, the predecessor of PEP 426, replaced Requires with Requires-Dist because the former was never very well specified, easier to re-name the field rather than redefine it. bdist_wheel's egg-info conversion assumes the only useful requirements are in the setuptools requires.txt. It would make sense to go ahead and delete the obsolete fields, I'm sure they were overlooked because they are not common in the wild. >From PEP 345: - Deprecated fields: - Requires (in favor of Requires-Dist) - Provides (in favor of Provides-Dist) - Obsoletes (in favor of Obsoletes-Dist) On Wed, Feb 15, 2017 at 10:31 AM Vinay Sajip wrote: > > the full METADATA format is documented in the pre-JSON revision of PEP > 426. > > Can you confirm which exact revision in the PEPs repo you mean? I could > guess at > 0451397. That version does not refer to a field "Requires" (rather, the > more recent > "Requires-Dist"). Your conversion function reads the existing PKG-INFO, > updates the > Metadata-Version, and adds "Provides-Dist" and "Requires-Dist". It does > not check > whether the result conforms to that version of the PEP. For example, in > the presence > of "Requires" in PKG-INFO, you add "Requires-Dist", possibly leading to an > ambiguity, > because they sort of mean the same thing but could contain conflicting > information > (for example, different version constraints). The python-dateutils wheel > which Jim > referred to contained both "Requires" and "Requires-Dist" fields in its > METADATA > file, and, faced with a metadata set with both fields, the old packaging > code used > by distlib to handle the different metadata versions raised a "Unknown > metadata set" > error. In the face of ambiguity, it's refusing the temptation to guess :-) > > If the conversion function adds "Requires-Dist" but doesn't remove > "Requires", I'm not > sure it conforms to that version of the PEP. > > Regards, > > Vinay Sajip > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Wed Feb 15 10:41:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 16:41:48 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:11, Nathaniel Smith wrote: > On Wed, Feb 15, 2017 at 5:27 AM, Nick Coghlan wrote: >> It's mainly a matter of incorporating >> https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core >> data model, as this distinction between abstract development >> dependencies and concrete deployment dependencies is incredibly >> important for any scenario that involves >> publisher-redistributor-consumer chains, but is entirely non-obvious >> to folks that are only familiar with the publisher-consumer case that >> comes up during development-for-personal-and-open-source-use. > > Maybe I'm just being dense but, umm. I don't know what any of these > words mean :-). I'm not unfamiliar with redistributors; part of my > confusion is that this is a concept that AFAIK distro package systems > don't have. Maybe it would help if you have a concrete example of a > scenario where they would benefit from having this distinction? It's about error messages and nudges in the UX: if PyPI rejects version pinning in "requires" by default, then that creates an opportunity to nudge people towards using "~=" or ">=" instead (as in the vast majority of cases, that will be a better option than pinning-for-publication). The inclusion of "integrates" then adds back the support for legitimate version pinning use cases in pydist.json in a way that makes it clear that it is a conceptually distinct operation from a normal dependency declaration. >> pipenv borrows the Ruby solution to modeling this by having Pipfile >> for abstract dependency declarations and Pipfile.lock for concrete >> integration testing ones, so the idea here is to propagate that model >> to pydist.json by separating the "requires" field with abstract >> development dependencies from the "integrates" field with concrete >> deployment dependencies. > > What's the benefit of putting this in pydist.json? I feel like for the > usual deployment cases (a) going straight from Pipfile.lock -> venv is > pretty much sufficient, with no need to put this into a package, but > (b) if you really do want to put it into a package, then the natural > approach would be to make an empty wheel like > "my-django-app-deploy.whl" whose dependencies were the contents of > Pipfile.lock. My goal with the split is to get to a state where: - exactly zero projects on PyPI use "==" or "===" in their requires metadata (because PyPI explicitly prohibits it) - the vast majority of projects on PyPI *don't* have an "integrates" section - those projects that do have an `integrates` section have a valid reason for it (like PyObjC) For anyone making the transition from application and web service development to library and framework development, the transition from "always pin exact versions of your dependencies for deployment" to "when publishing a library or framework, only rule out the combinations that you're pretty sure *won't* work" is one of the trickiest to deal with as current tools *don't alert you to the fact that there's a difference to be learned*. 
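[Editorial illustration, added for clarity and not part of the original message: in setup.py terms, the transition being described is roughly the difference between the two spellings below. The project name and versions are invented for the example.]

    # Sketch only: "example-lib" and the requests versions are made up.
    from setuptools import setup

    setup(
        name="example-lib",
        version="1.0",
        install_requires=[
            # deployment-style exact pin, the habit carried over from
            # application work:
            #     "requests==2.13.0",
            # library-style compatible-release constraint (PEP 440 "~="),
            # equivalent to ">=2.13, <3":
            "requests~=2.13",
        ],
    )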
Restricting what can go into requires creates an opportunity to ask users whether they're publishing a pre-integrated project or not: if yes, then they add the "integrates" field and put their pinned dependencies there; if not, then they relax the "==" constraints to "~=" or ">=". Either way, PyPI will believe your answer, it's just refusing the temptation to guess that using "==" or "===" in the requires section is sufficient to indicate that you're deliberately publishing a pre-integrated project. > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. Source vs binary isn't where the distinction applies, though. For example, it's legitimate for PyObjC to have pinned dependencies even when distributed in source form, as it's a metapackage used solely to integrate the various PyObjC subprojects into a single "release". >> In the vast majority of publication-to-PyPi cases people won't need >> the "integrates" field, since what they're publishing on PyPI will >> just be their abstract dependencies, and any warning against using >> "==" will recommend using "~=" or ">=" instead. But there *are* >> legitimate uses of pinning-for-publication (like the PyObjC >> metapackage bundling all its subcomponents, or when building for >> private deployment infastructure), so there needs to be a way to >> represent "Yes, I'm pinning this dependency for publication, and I'm >> aware of the significance of doing so" > > Why can't PyObjC just use regular dependencies? That's what distro > metapackages have done for decades, right? If PyObjC uses regular dependencies then there's no opportunity for PyPI to ask "Did you really mean that?" when people pin dependencies in "requires". That makes it likely we'll end up with a lot of unnecessarily restrictive "==" constraints in PyPI packages ("Works on my machine!"), which creates problems when attempting to auto-generate distro packages from upstream ones. The distro case isn't directly analogous, since there are a few key differences: - open publication platform rather than a pre-approved set of package maintainers - no documented packaging policies with related human review & approval processes - a couple of orders of magnitude difference in the number of packages involved - at least in RPM, you can have a spec file with no source tarball, which makes it obvious it's a metapackage. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Feb 15 10:48:44 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2017 16:48:44 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:58, Paul Moore wrote: > On 15 February 2017 at 14:11, Nathaniel Smith wrote: >>> It's mainly a matter of incorporating >>> https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core >>> data model, as this distinction between abstract development >>> dependencies and concrete deployment dependencies is incredibly >>> important for any scenario that involves >>> publisher-redistributor-consumer chains, but is entirely non-obvious >>> to folks that are only familiar with the publisher-consumer case that >>> comes up during development-for-personal-and-open-source-use. >> >> Maybe I'm just being dense but, umm. I don't know what any of these >> words mean :-). I'm not unfamiliar with redistributors; part of my >> confusion is that this is a concept that AFAIK distro package systems >> don't have. Maybe it would help if you have a concrete example of a >> scenario where they would benefit from having this distinction? > > I'm also finding this discussion bafflingly complex. I understand that > distributions need a way to work with Python packages, but the issues > involved seem completely divorced from the basic process of a user > using pip to install a package with the dependencies it needs to work > in their program. As simple as I can make it: * pinning dependencies when publishing to PyPI is presumptively bad * PyPI itself (not client tools) should warn you that it's a bad idea * however, there are legitimate use cases for pinning in PyPI packages * so there should be a way to do it, but it should involve telling PyPI "I am an integration project, this is OK" Most people should never touch the "integrates" field, they should just change "==" to "~=" or ">=" to allow for future releases of their dependencies. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Feb 15 10:49:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 15:49:44 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:41, Nick Coghlan wrote: > My goal with the split is to get to a state where: > > - exactly zero projects on PyPI use "==" or "===" in their requires > metadata (because PyPI explicitly prohibits it) > - the vast majority of projects on PyPI *don't* have an "integrates" section > - those projects that do have an `integrates` section have a valid > reason for it (like PyObjC) So how many projects on PyPI currently have == or === in their requires? I've never seen one (although my sample size isn't large - but it does cover major packages in a large-ish range of application areas). I'm curious as to how major this problem is in practice. I (now) understand the theoretical argument for the proposal. 
Paul From freddyrietdijk at fridh.nl Wed Feb 15 10:50:18 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Wed, 15 Feb 2017 16:50:18 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > Maybe it would help if you have a concrete example of a scenario where they would benefit from having this distinction? In the Nix package manager (source distribution with binary substitutes) and Nixpkgs package set we typically require the filename and hash of a package. In our expressions we typically pass an url (that includes the name), and the hash. The url is only needed when the file isn't in our store. This is convenient, because if an url is optional this allows you to pre-fetch or work with mirrors. All we care about is that we get the file, not how it is provided. This applies for source archives, but behind the scenes also for binary substitutes. With Nix, functions build a package, and dependencies are passed as function arguments with names that typically, but not necessarily, resemble the dependency name. Now, a function that builds a package, a package builder, only needs to be provided with abstract dependencies; it just needs to know what it should look for, "we need 'a' numpy, 'a' scipy, 'a compiler that has a certain interface and can do this job'", etc.. Version numbers can help in order to fail prematurely, but generally only bounds, not a pinned value. Its up to another tool to provide the builder with the actual packages, the concrete dependencies to the builder. And this tool might fetch it from PyPI, or from GitHub, or... The same goes for building, distributing and installing Python packages. Setuptools shouldn't bother with versions (except the constraints in case of libraries) or wherever a source comes from but just build or fail. Pip should just fetch/resolve and pass concrete dependencies to whatever builder (Setuptools, Flit), or whatever environment (virtualenv) needs it. It's quite frustrating as a downstream having to deal with packages where versions are pinned unnecessarily and therefore I've also requested on the Setuptools tracker a flag that ignores constraints [1] (though I fear I would have to pull up my sleeves for this one :) ) . [1] https://github.com/pypa/setuptools/issues/894 On Wed, Feb 15, 2017 at 3:11 PM, Nathaniel Smith wrote: > On Wed, Feb 15, 2017 at 5:27 AM, Nick Coghlan wrote: > > On 15 February 2017 at 12:58, Nathaniel Smith wrote: > >> On Wed, Feb 15, 2017 at 3:33 AM, Nick Coghlan > wrote: > >>> - "requires": list where entries are either a string containing a PEP > >>> 508 dependency specifier or else a hash map contain a "requires" key > >>> plus "extra" or "environment" fields as qualifiers > >>> - "integrates": replacement for "meta_requires" that only allows > >>> pinned dependencies (i.e. hash maps with "name" & "version" fields, or > >>> direct URL references, rather than a general PEP 508 specifier as a > >>> string) > >> > >> What's accomplished by separating these? I really think we should > >> strive to have fewer more orthogonal concepts whenever possible... 
> > > > It's mainly a matter of incorporating > > https://caremad.io/posts/2013/07/setup-vs-requirement/ into the core > > data model, as this distinction between abstract development > > dependencies and concrete deployment dependencies is incredibly > > important for any scenario that involves > > publisher-redistributor-consumer chains, but is entirely non-obvious > > to folks that are only familiar with the publisher-consumer case that > > comes up during development-for-personal-and-open-source-use. > > Maybe I'm just being dense but, umm. I don't know what any of these > words mean :-). I'm not unfamiliar with redistributors; part of my > confusion is that this is a concept that AFAIK distro package systems > don't have. Maybe it would help if you have a concrete example of a > scenario where they would benefit from having this distinction? > > > One particular area where this is problematic is in the widespread > > advice "always pin your dependencies" which is usually presented > > without the all important "for application or service deployment" > > qualifier. As a first approximation: > > pinning-for-app-or-service-deployment == good, > > pinning-for-local-testing == good, > > pinning-for-library-or-framework-publication-to-PyPI == bad. > > > > pipenv borrows the Ruby solution to modeling this by having Pipfile > > for abstract dependency declarations and Pipfile.lock for concrete > > integration testing ones, so the idea here is to propagate that model > > to pydist.json by separating the "requires" field with abstract > > development dependencies from the "integrates" field with concrete > > deployment dependencies. > > What's the benefit of putting this in pydist.json? I feel like for the > usual deployment cases (a) going straight from Pipfile.lock -> venv is > pretty much sufficient, with no need to put this into a package, but > (b) if you really do want to put it into a package, then the natural > approach would be to make an empty wheel like > "my-django-app-deploy.whl" whose dependencies were the contents of > Pipfile.lock. > > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. > > > In the vast majority of publication-to-PyPi cases people won't need > > the "integrates" field, since what they're publishing on PyPI will > > just be their abstract dependencies, and any warning against using > > "==" will recommend using "~=" or ">=" instead. But there *are* > > legitimate uses of pinning-for-publication (like the PyObjC > > metapackage bundling all its subcomponents, or when building for > > private deployment infastructure), so there needs to be a way to > > represent "Yes, I'm pinning this dependency for publication, and I'm > > aware of the significance of doing so" > > Why can't PyObjC just use regular dependencies? That's what distro > metapackages have done for decades, right? > > -n > > -- > Nathaniel J. 
Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Wed Feb 15 11:07:15 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Wed, 15 Feb 2017 16:07:15 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> <159421251.8715496.1487172664114@mail.yahoo.com> Message-ID: <1487174835.595075.881975160.671E3576@webmail.messagingengine.com> On Wed, Feb 15, 2017, at 03:40 PM, Daniel Holth wrote: > It would make sense to go ahead and delete the obsolete fields, I'm > sure they were overlooked because they are not common in the wild. > > From PEP 345: > * Deprecated fields: > * Requires (in favor of Requires-Dist) > * Provides (in favor of Provides-Dist) For reference, packages made with flit do use 'Provides' to indicate the name of the importable module or package that the distribution installs. This seems to me to be something worth exposing - in another thread, we're discussing downloading and scanning packages to get this information. But I accept that it's not very useful while only a tiny minority of packages do it. Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Feb 15 11:14:23 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 16:14:23 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 February 2017 at 15:50, Freddy Rietdijk wrote: > It's quite frustrating as a downstream having to deal with packages where > versions are pinned unnecessarily and therefore I've also requested on the > Setuptools tracker a flag that ignores constraints [1] (though I fear I > would have to pull up my sleeves for this one :) ) . Sort of repeating my earlier question, but how often does this happen in reality? (As a proportion of the packages you deal with). And how often is it that a simple request/PR to the package author to remove the explicit version requirements is rejected? (I assume your first response is to file an issue with upstream?) If you *do* get in a situation where the package explicitly requires certain versions of its dependencies, and you ignore those requirements, then presumably you're taking responsibility for supporting a combination that the upstream author doesn't support. How do you handle that? I'm not trying to pick on your process here, or claim that distributions are somehow doing things wrongly. But I am trying to understand what redistributors' expectations are of package authors. Nick said he wants to guide authors away from explicit version pinning. That's fine, but is the problem so big that the occasional bug report to offending projects saying "please don't pin exact versions" is insufficient guidance? 
Paul From freddyrietdijk at fridh.nl Wed Feb 15 11:55:22 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Wed, 15 Feb 2017 17:55:22 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > Sort of repeating my earlier question, but how often does this happen in reality? From a quick check in our repo we have patched about 1% of our packages to remove the constraints. We have close to 2000 Python packages. We don't necessarily patch all the constraints, only when they collide with the version we would like the package to use so the actual percentage is likely higher. Larger applications that have many dependencies that are fixed have been kept out of Nixpkgs for now. Their fixed dependencies means we likely need multiple versions of packages. While Nix can handle that, it means more maintenance. We have a tool that can take e.g. a requirements.txt file and generate expressions, but it won't help you much with bug-fix releases when maintainers don't update their pinned requirements. > And how often is it that a simple request/PR to the package author to remove the explicit version requirements is rejected? That's hard to say. If I look at what packages I've contributed to Nixpkgs, then in my experience this is something that is typically dealt with by upstream when asked. > If you *do* get in a situation where the package explicitly requires certain versions of its dependencies, and you ignore those requirements, then presumably you're taking responsibility for supporting a combination that the upstream author doesn't support. How do you handle that? Typical situations are bug-fix releases. So far I haven't encountered any issues with using other versions, but like I said, larger applications that pin their dependencies have been mostly kept out of Nixpkgs. If we do encounter issues, then we have to find a solution. The likeliest situation is that an application requires a different version, and in that case we would then have an expression/package of that version specifically for that application. We don't have a global site-packages so we can do that. > Nick said he wants to guide authors away from explicit version pinning. That's fine, but is the problem so big that the occasional bug report to offending projects saying "please don't pin exact versions" is insufficient guidance? The main problem I see is that it limits in how far you can automatically update to newer versions and thus release bug/security fixes. Just one inappropriate pin is sufficient to break dependency solving. On Wed, Feb 15, 2017 at 5:14 PM, Paul Moore wrote: > On 15 February 2017 at 15:50, Freddy Rietdijk > wrote: > > It's quite frustrating as a downstream having to deal with packages where > > versions are pinned unnecessarily and therefore I've also requested on > the > > Setuptools tracker a flag that ignores constraints [1] (though I fear I > > would have to pull up my sleeves for this one :) ) . > > Sort of repeating my earlier question, but how often does this happen > in reality? (As a proportion of the packages you deal with). And how > often is it that a simple request/PR to the package author to remove > the explicit version requirements is rejected? (I assume your first > response is to file an issue with upstream?)
> > If you *do* get in a situation where the package explicitly requires > certain versions of its dependencies, and you ignore those > requirements, then presumably you're taking responsibility for > supporting a combination that the upstream author doesn't support. How > do you handle that? > > I'm not trying to pick on your process here, or claim that > distributions are somehow doing things wrongly. But I am trying to > understand what redistributors' expectations are of package authors. > Nick said he wants to guide authors away from explicit version > pinning. That's fine, but is the problem so big that the occasional > bug report to offending projects saying "please don't pin exact > versions" is insufficient guidance? > > Paul > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 15 12:01:31 2017 From: brett at python.org (Brett Cannon) Date: Wed, 15 Feb 2017 17:01:31 +0000 Subject: [Distutils] Python installation not working In-Reply-To: <2038221091.6741722.1487050035224@mail.yahoo.com> References: <2038221091.6741722.1487050035224.ref@mail.yahoo.com> <2038221091.6741722.1487050035224@mail.yahoo.com> Message-ID: This actually isn't the right place to ask for installation help, Chitra (this list is about how to package up Python projects). For general support questions you should email python-list. On Wed, 15 Feb 2017 at 05:11 Chitra Dewan via Distutils-SIG < distutils-sig at python.org> wrote: > Hello, > > I am *beginner in Python* > I am facing problems in installing Python 3.5 on my windows vista x32 > machine. > I downloaded python-3.5.2.exe from Python.org. It is downloaded as an > exe. When I try to install it via "Run as administrator" , nothing > happens. Same behavior with 3.6 version > > kindly advise > > > Regards & Thanks, Chitra Dewan > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Wed Feb 15 12:01:55 2017 From: jim at jimfulton.info (Jim Fulton) Date: Wed, 15 Feb 2017 12:01:55 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Wed, Feb 15, 2017 at 11:55 AM, Freddy Rietdijk wrote: > > Sort of repeating my earlier question, but how often does this happen > in reality? > > From a quick check in our repo we have patched about 1% of our packages to > remove the constraints. We have close to 2000 Python packages. We don't > necessarily patch all the constraints, only when they collide with the > version we would like the package to use so the actual percentage is likely > higher. > > Larger applications that have many dependencies that are fixed have been > kept out of Nixpkgs for now. Their fixed dependencies means we likely need > multiple versions of packages. While Nix can handle that, it means more > maintenance. We have a tool that can take e.g. a requirements.txt file and > generate expressions, but it won't help you much with bug-fix releases when > maintainers don't update their pinned requirements. > I suppose this isn't a problem for Java applications, which use jar files and per-application class paths. 
Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Feb 15 12:57:39 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2017 17:57:39 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: Thanks for your reply, it was very helpful. On 15 February 2017 at 16:55, Freddy Rietdijk wrote: > Larger applications that have many dependencies that are fixed have been > kept out of Nixpkgs for now. I notice here (and in a few other places) you talk about "Applications". From what I understand of Nick's position, applications absolutely should pin their dependencies - so if I'm understanding correctly, those applications will (and should) continue to pin exact versions. As regards automatic packaging of new upstream versions (of libraries rather than applications), I guess if you get upstream to remove the pinned versions, this problem goes away. > The main problem I see is that it limits in how far you can automatically update to newer versions and thus release bug/security fixes. Just one inappropriate pin is sufficient to break dependency solving. I'm not sure I follow this. Suppose we have foo 1.0 depending on bar. If foo 1.0 doesn't pin bar (possibly because you reported to them that they shouldn't) then foo 1.1 isn't going to suddenly add the pin back. So you can update foo fine. And you can update bar because there's no pin. So yes, while "one inappropriate pin" can cause a problem, getting upstream to fix that is a one-off cost, not an ongoing issue. So, in summary, * I agree that libraries pinning dependencies too tightly is bad. * Distributions can easily enough report such pins upstream when the library is initially packaged, so there's no ongoing cost here (just possibly a delay before the library can be packaged). * Libraries can legitimately have appropriate pins (typically to ranges of versions). So distributions have to be able to deal with that. * Applications *should* pin precise versions. Distributions have to decide whether to respect those pins or remove them and then take on support of the combination that upstream doesn't support. * But application pins should be in a requirements.txt file, so ignoring version specs is pretty simple (just a script to run against the requirements file). * Because Python doesn't support multiple installed versions of packages, conflicting requirements *will* be a problem that distros have to solve themselves (the language response is "use a venv"). Nick is suggesting that the requirement metadata be prohibited from using exact pins, but there's alternative metadata for "yes, I really mean an exact pin". To me: 1. This doesn't have any bearing on *application* pins, as they aren't in metadata. 2. Distributions still have to be able to deal with libraries having exact pins, as it's an explicitly supported possibility. 3. You can still manage (effectively) exact pins without being explicit - foo >1.6,<1.8 pretty much does it. And that doesn't even have to be a deliberate attempt to break the system, it could be a genuine attempt to avoid known issues, that just got too aggressive. So we're left with additional complexity for library authors to understand, for what seems like no benefit in practice to distribution
The only stated benefit of the 2 types of metadata is to educate library authors of the benefits of not pinning versions - and it seems like a very sweeping measure, where bug reports from distributions seem like they would be a much more focused and just as effective approach. Paul From njs at pobox.com Wed Feb 15 13:10:08 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 15 Feb 2017 10:10:08 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Feb 15, 2017 07:41, "Nick Coghlan" wrote: >> pipenv borrows the Ruby solution to modeling this by having Pipfile >> for abstract dependency declarations and Pipfile.lock for concrete >> integration testing ones, so the idea here is to propagate that model >> to pydist.json by separating the "requires" field with abstract >> development dependencies from the "integrates" field with concrete >> deployment dependencies. > > What's the benefit of putting this in pydist.json? I feel like for the > usual deployment cases (a) going straight from Pipfile.lock -> venv is > pretty much sufficient, with no need to put this into a package, but > (b) if you really do want to put it into a package, then the natural > approach would be to make an empty wheel like > "my-django-app-deploy.whl" whose dependencies were the contents of > Pipfile.lock. My goal with the split is to get to a state where: - exactly zero projects on PyPI use "==" or "===" in their requires metadata (because PyPI explicitly prohibits it) - the vast majority of projects on PyPI *don't* have an "integrates" section - those projects that do have an `integrates` section have a valid reason for it (like PyObjC) For anyone making the transition from application and web service development to library and framework development, the transition from "always pin exact versions of your dependencies for deployment" to "when publishing a library or framework, only rule out the combinations that you're pretty sure *won't* work" is one of the trickiest to deal with as current tools *don't alert you to the fact that there's a difference to be learned*. Restricting what can go into requires creates an opportunity to ask users whether they're publishing a pre-integrated project or not: if yes, then they add the "integrates" field and put their pinned dependencies there; if not, then they relax the "==" constraints to "~=" or ">=". Ah-hah, this does make sense as a problem, thanks! However, your solution seems very odd to me :-). If the goal is to put an "are you sure/yes I'm sure" UX barrier between users and certain version settings, then why make a distinction that every piece of downstream software has to be aware of and ignore? Pypi seems like a funny place in the stack to be implementing this. It would be much simpler to implement this feature at the build system level, like e.g. setuptools could require that dependencies that you think are over strict be specified in an install_requires_yes_i_really_mean_it= field, without requiring any metadata changes. Basically it sounds like you're saying you want to extend the metadata so that it can represent both broken and non-broken packages, so that both can be created, passed around, and checked for. And I'm saying, how about instead we do that checking when creating the package in the first place. 
(Of course I can't see any way to do any of this that won't break existing sdists, but I guess you've already decided you're OK with that. I guess I should say that I'm a bit dubious that this is so important in the first place; I feel like there are lots of legitimate use cases for == dependencies and lots of kinds of linting we might want to apply to try and improve the level of packaging quality.) Either way, PyPI will believe your answer, it's just refusing the temptation to guess that using "==" or "===" in the requires section is sufficient to indicate that you're deliberately publishing a pre-integrated project. > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. Source vs binary isn't where the distinction applies, though. For example, it's legitimate for PyObjC to have pinned dependencies even when distributed in source form, as it's a metapackage used solely to integrate the various PyObjC subprojects into a single "release". ?? So that means that some packages have a loosely specified source that compiles down to a more strictly specified binary, and some have a more strictly specified source that compiles down to an equally strictly specified binary. That's... an argument in favor of my way of thinking about it, isn't it? That it can naturally express both situations? My point is that *for the cases where there's an important distinction between Pipfile and Pipfile.lock*, we already have a way to think about that distinction without introducing new concepts. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Wed Feb 15 13:15:04 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Feb 2017 18:15:04 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: I also get a little frustrated with this kind of proposal "no pins" which I read as "annoy the publisher to try to prevent them from annoying the consumer". As a free software publisher I feel entitled to annoy the consumer, an activity I will indulge in inversely proportional to my desire for users. Who is the star? It should be possible to publish applications to pypi. Much of the packaging we have is completely web application focused, these applications are not usually published at all. On Wed, Feb 15, 2017 at 12:58 PM Paul Moore wrote: > Thanks for your reply, it was very helpful. > > On 15 February 2017 at 16:55, Freddy Rietdijk > wrote: > > Larger applications that have many dependencies that are fixed have been > > kept out of Nixpkgs for now. > > I notice here (and in a few other places) you talk about > "Applications". From what I understand of Nick's position, > applications absolutely should pin their dependencies - so if I'm > understanding correctly, those applications will (and should) continue > to pin exact versions. 
> > As regards automatic packaging of new upstream versions (of libraries > rather than applications), I guess if you get upstream to remove the > pinned versions, this problem goes away. > > > The main problem I see is that it limits in how far you can > automatically update to newer versions and thus release bug/security fixes. > Just one inappropriate pin is sufficient to break dependency solving. > > I'm not sure I follow this. Suppose we have foo 1.0 depending on bar. > If foo 1.0 has doesn't pin bar (possibly because you reported to them > that they shouldn't) then foo 1.1 isn't going to suddenly add the pin > back. So you can update foo fine. And you can update bar because > there's no pin. So yes, while "one inappropriate pin" can cause a > problem, getting upstream to fix that is a one-off cost, not an > ongoing issue. > > So, in summary, > > * I agree that libraries pinning dependencies too tightly is bad. > * Distributions can easily enough report such pins upstream when the > library is initially packaged, so there's no ongoing cost here (just > possibly a delay before the library can be packaged). > * Libraries can legitimately have appropriate pins (typically to > ranges of versions). So distributions have to be able to deal with > that. > * Applications *should* pin precise versions. Distributions have to > decide whether to respect those pins or remove them and then take on > support of the combination that upstream doesn't support. > * But application pins should be in a requirements.txt file, so > ignoring version specs is pretty simple (just a script to run against > the requirements file). > * Because Python doesn't support multiple installed versions of > packages, conflicting requirements *will* be a problem that distros > have to solve themselves (the language response is "use a venv"). > > Nick is suggesting that the requirement metadata be prohibited from > using exact pins, but there's alternative metadata for "yes, I really > mean an exact pin". To me: > > 1. This doesn't have any bearing on *application* pins, as they aren't > in metadata. > 2. Distributions still have to be able to deal with libraries having > exact pins, as it's an explicitly supported possibility. > 3. You can still manage (effectively) exact pins without being > explicit - foo >1.6,<1.8 pretty much does it. And that doesn't even > have to be a deliberate attempt to break the system, it could be a > genuine attempt to avoid known issues, that just got too aggressive. > > So we're left with additional complexity for library authors to > understand, for what seems like no benefit in practice to distribution > builders. The only stated benefit of the 2 types of metadata is to > educate library authors of the benefits of not pinning versions - and > it seems like a very sweeping measure, where bug reports from > distributions seem like they would be a much more focused and just as > effective approach. > > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From donald at stufft.io Wed Feb 15 14:44:42 2017 From: donald at stufft.io (Donald Stufft) Date: Wed, 15 Feb 2017 14:44:42 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > On Feb 15, 2017, at 1:15 PM, Daniel Holth wrote: > > I also get a little frustrated with this kind of proposal "no pins" which I read as "annoy the publisher to try to prevent them from annoying the consumer". As a free software publisher I feel entitled to annoy the consumer, an activity I will indulge in inversely proportional to my desire for users. Who is the star? > > It should be possible to publish applications to pypi. Much of the packaging we have is completely web application focused, these applications are not usually published at all. > I haven't fully followed this thread, and while the recommendation is and will always be to use the least strict version specifier that will work for your application, I am pretty heavily -1 on mandating that people do not use ``==``. I am also fairly heavily -1 on confusing the data model even more by making two sets of dependencies, one that allows == and one that doesn't. I don't think that overly restrictive pins are that common of a problem (if anything, we're more likely to have too loose of pins, due to the always-upgrade nature of pip and the difficulty of exhaustively testing every possible version combination). In cases where this actively harms the end user (effectively when there is a security issue or a conflict) we can tell the user about it (theoretically, not in practice yet) but beyond that, this is best handled by opening individual issues up on each individual repository, just like any other packaging issue with that project. - Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From brunson at brunson.com Wed Feb 15 18:37:55 2017 From: brunson at brunson.com (Eric Brunson) Date: Wed, 15 Feb 2017 23:37:55 +0000 Subject: [Distutils] Python installation not working In-Reply-To: References: <2038221091.6741722.1487050035224.ref@mail.yahoo.com> <2038221091.6741722.1487050035224@mail.yahoo.com> Message-ID: <0100015a4423b70e-e0bdb844-3db1-4e91-a5b7-f0afb2b1501f-000000@email.amazonses.com> help at python.org is also set up to provide this kind of assistance. On Wed, Feb 15, 2017 at 10:05 AM Brett Cannon wrote: > This actually isn't the right place to ask for installation help, Chitra > (this list is about how to package up Python projects). For general support > questions you should email python-list. > > On Wed, 15 Feb 2017 at 05:11 Chitra Dewan via Distutils-SIG < > distutils-sig at python.org> wrote: > > Hello, > > I am *beginner in Python* > I am facing problems in installing Python 3.5 on my windows vista x32 > machine. > I downloaded python-3.5.2.exe from Python.org. It is downloaded as an > exe. When I try to install it via "Run as administrator" , nothing > happens.
Same behavior with 3.6 version > > kindly advise > > > Regards & Thanks, Chitra Dewan > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Thu Feb 16 01:15:49 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Wed, 15 Feb 2017 22:15:49 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <03BB249B-13CA-40C3-8990-E3BD5A3FB715@twistedmatrix.com> > On Feb 15, 2017, at 11:44 AM, Donald Stufft wrote: > > >> On Feb 15, 2017, at 1:15 PM, Daniel Holth > wrote: >> >> I also get a little frustrated with this kind of proposal "no pins" which I read as "annoy the publisher to try to prevent them from annoying the consumer". As a free software publisher I feel entitled to annoy the consumer, an activity I will indulge in inversely proportional to my desire for users. Who is the star? >> >> It should be possible to publish applications to pypi. Much of the packaging we have is completely web application focused, these applications are not usually published at all. >> > > > > I haven't fully followed this thread, and while the recommendation is and will always be to use the least strict version specifier that will work for your application, I am pretty heavily -1 on mandating that people do not use ``==``. I am also fairly heavily -1 on confusing the data model even more by making two sets of dependencies, one that allows == and one that doesn't. I hope I'm not repeating a suggestion that appears up-thread, but, if you want to distribute an application with pinned dependencies, you could always release 'foo-lib' with a lenient set of dependencies, and 'foo-app' which depends on 'foo-lib' but pins the transitive closure of all dependencies with '=='. Your CI system could automatically release a new 'foo-app' every time any dependency has a new release and a build against the last release of 'foo-app' passes. -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From freddyrietdijk at fridh.nl Thu Feb 16 04:15:57 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Thu, 16 Feb 2017 10:15:57 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > I notice here (and in a few other places) you talk about "Applications". From what I understand of Nick's position, applications absolutely should pin their dependencies - so if I'm understanding correctly, those applications will (and should) continue to pin exact versions. Application developers typically don't test against all combinations of dependency versions and it also doesn't really make sense for them. Therefore it is understandable from their point of view to pin their dependencies. However, should they pin to a certain major/minor version, or to a patch version? In my opinion they best pin to minor versions.
That should be sufficient to guarantee the app works. Let the distributions take care of providing the latest patch version so that it remains safe. And that means indeed specifying >1.6,<1.8 (or actually >=1.7,<1.8), and not ==1.7 or ==1.7.3. The same goes for the meta-packages. On Wed, Feb 15, 2017 at 6:57 PM, Paul Moore wrote: > Thanks for your reply, it was very helpful. > > On 15 February 2017 at 16:55, Freddy Rietdijk > wrote: > > Larger applications that have many dependencies that are fixed have been > > kept out of Nixpkgs for now. > > I notice here (and in a few other places) you talk about > "Applications". From what I understand of Nick's position, > applications absolutely should pin their dependencies - so if I'm > understanding correctly, those applications will (and should) continue > to pin exact versions. > > As regards automatic packaging of new upstream versions (of libraries > rather than applications), I guess if you get upstream to remove the > pinned versions, this problem goes away. > > > The main problem I see is that it limits in how far you can > automatically update to newer versions and thus release bug/security fixes. > Just one inappropriate pin is sufficient to break dependency solving. > > I'm not sure I follow this. Suppose we have foo 1.0 depending on bar. > If foo 1.0 has doesn't pin bar (possibly because you reported to them > that they shouldn't) then foo 1.1 isn't going to suddenly add the pin > back. So you can update foo fine. And you can update bar because > there's no pin. So yes, while "one inappropriate pin" can cause a > problem, getting upstream to fix that is a one-off cost, not an > ongoing issue. > > So, in summary, > > * I agree that libraries pinning dependencies too tightly is bad. > * Distributions can easily enough report such pins upstream when the > library is initially packaged, so there's no ongoing cost here (just > possibly a delay before the library can be packaged). > * Libraries can legitimately have appropriate pins (typically to > ranges of versions). So distributions have to be able to deal with > that. > * Applications *should* pin precise versions. Distributions have to > decide whether to respect those pins or remove them and then take on > support of the combination that upstream doesn't support. > * But application pins should be in a requirements.txt file, so > ignoring version specs is pretty simple (just a script to run against > the requirements file). > * Because Python doesn't support multiple installed versions of > packages, conflicting requirements *will* be a problem that distros > have to solve themselves (the language response is "use a venv"). > > Nick is suggesting that the requirement metadata be prohibited from > using exact pins, but there's alternative metadata for "yes, I really > mean an exact pin". To me: > > 1. This doesn't have any bearing on *application* pins, as they aren't > in metadata. > 2. Distributions still have to be able to deal with libraries having > exact pins, as it's an explicitly supported possibility. > 3. You can still manage (effectively) exact pins without being > explicit - foo >1.6,<1.8 pretty much does it. And that doesn't even > have to be a deliberate attempt to break the system, it could be a > genuine attempt to avoid known issues, that just got too aggressive. > > So we're left with additional complexity for library authors to > understand, for what seems like no benefit in practice to distribution > builders. 
The only stated benefit of the 2 types of metadata is to > educate library authors of the benefits of not pinning versions - and > it seems like a very sweeping measure, where bug reports from > distributions seem like they would be a much more focused and just as > effective approach. > > Paul > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Feb 17 03:56:04 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2017 09:56:04 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 Feb 2017 23:28, "Paul Moore" wrote: So, in summary, * I agree that libraries pinning dependencies too tightly is bad. * Distributions can easily enough report such pins upstream when the library is initially packaged, so there's no ongoing cost here (just possibly a delay before the library can be packaged). No, we can't easily do this. libraries.io tracks more than *two million* open source projects. Debian is the largest Linux distribution, and only tracks 50k packages. That means it is typically going to be *app* developers that run into the problem of inappropriately pinned dependencies. So if we rely on a manual "publish with pinned dependencies", "get bug report from redistributor or app developer", "republish with unpinned dependencies", we'll be in a situation where: - the affected app developer or redistributor is going to have a negative experience with the project - the responsible publisher is either going to have a negative interaction with an end user or redistributor, or else they'll just silently move on to find an alternative library - we relinquish any control of the tone used when the publisher is alerted to the problem
People aren't going to do the last one accidentally, but they *will* use "==" when transferring app development practices to library development. So we're left with additional complexity for library authors to understand, for what seems like no benefit in practice to distribution builders. - We'll get more automated conversions with pyp2rpm and similar tools that "just work" without human intervention - We'll get fewer negative interpersonal interactions between upstream publishers and downstream redistributors It won't magically make everything all sunshine and roses, but we're currently at a point where about 70% of pyp2rpm conversions fail for various reasons, so every little bit helps :) The only stated benefit of the 2 types of metadata is to educate library authors of the benefits of not pinning versions - and it seems like a very sweeping measure, where bug reports from distributions seem like they would be a much more focused and just as effective approach. We've been playing that whack-a-mole game for years, and it sucks enormously for both publishers and redistributors from a user experience perspective. More importantly though, it's already failing to scale adequately, hence the rise of technologies like Docker, Flatpak, and Snappy that push more integration and update responsibilities back to application and service developers. The growth rates on PyPI mean we can expect those scalability challenges to get *worse* rather than better in the coming years. By pushing this check down into the tooling infrastructure, the aim would be to make the automated systems take on the task of being the "bad guy", rather than asking humans to do it later. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Feb 17 04:56:28 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 17 Feb 2017 01:56:28 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Fri, Feb 17, 2017 at 12:56 AM, Nick Coghlan wrote: > By contrast, if we design the metadata format such that *PyPI* can provide a > suitable error message, then: But all these benefits you're describing also work if you s/PyPI/setuptools/, no? And that doesn't require any metadata PEPs or global coordination, you could send them a PR this afternoon if you want. -n -- Nathaniel J. Smith -- https://vorpus.org From p.f.moore at gmail.com Fri Feb 17 05:08:14 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Feb 2017 10:08:14 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 17 February 2017 at 08:56, Nick Coghlan wrote: > - we retain full control over the tone of the error notification I tried to formulate a long response to this email, and got completely bogged down. So I'm going to give a brief[1] response for now and duck out until the dust settles. By "we" above, I assume you mean distutils-sig/PyPA. As part of that group, I find the complexities of how distributions package stuff up, and the expected interactions of the multitude of parties involved in the model you describe, completely baffling. 
That's fine normally (as a Windows developer, I don't typically interact with Linux distributions) but when it comes to being part of distutils-sig/PyPA in terms of how we present things like this, I feel a responsibility to understand (and by proxy, represent users who are similarly unaware of distro practices, etc). I understand (somewhat) the motivations behind this distinction between "requires" and "integrates"[2] but I think we need to come up with a much more straightforward explanation - geared towards library authors who don't understand (and probably aren't that interested in) the underlying issues - before we standardise anything. Because otherwise, we'll be rehashing this debate over and over as library authors get errors they don't understand, and come asking. Paul [1] Yes, this was as brief as I could manage :-( [2] As a data point, I couldn't even think of the right terms to use here without scanning back over the email thread to look for them. That indicates to me that the concepts are anything but intuitive :-( From fungi at yuggoth.org Fri Feb 17 08:18:51 2017 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Feb 2017 13:18:51 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <20170217131851.GR12827@yuggoth.org> On 2017-02-17 09:56:04 +0100 (+0100), Nick Coghlan wrote: [...] > So if we rely on a manual "publish with pinned dependencies", "get bug > report from redistributor or app developer", "republish with unpinned > dependencies", we'll be in a situation where: > > - the affected app developer or redistributor is going to have a negative > experience with the project > - the responsible publisher is either going to have a negative > interaction > with an end user or redistributor, or else they'll just silently move on > to > find an alternative library > - we relinquish any control of the tone used when the publisher is > alerted > to the problem > > By contrast, if we design the metadata format such that *PyPI* can > provide > a suitable error message, then: > > - publishers get alerted to the problem *prior* to publication > - end users and redistributors are unlikely to encounter the problem > directly > - we retain full control over the tone of the error notification [...] It seems like the same could be said of many common mistakes which can be identified with some degree of certainty through analysis of the contents being uploaded. Why not also scan for likely security vulnerabilities with a static analyzer and refuse offending uploads unless the uploader toggles the magic "yes I really mean it" switch? Surely security issues are even greater downstream risks than simple dependency problems. (NB: I'm not in favor of that either, just nudging an example in the reductio ad absurdum direction.)
-- Jeremy Stanley From ronaldoussoren at mac.com Mon Feb 20 02:51:17 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 20 Feb 2017 08:51:17 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <3F63BAFC-7457-4D5F-B9B4-3D7CA4638E89@mac.com> > On 15 Feb 2017, at 15:11, Nathaniel Smith wrote: > > >> In the vast majority of publication-to-PyPi cases people won't need >> the "integrates" field, since what they're publishing on PyPI will >> just be their abstract dependencies, and any warning against using >> "==" will recommend using "~=" or ">=" instead. But there *are* >> legitimate uses of pinning-for-publication (like the PyObjC >> metapackage bundling all its subcomponents, or when building for >> private deployment infrastructure), so there needs to be a way to >> represent "Yes, I'm pinning this dependency for publication, and I'm >> aware of the significance of doing so" > > Why can't PyObjC just use regular dependencies? That's what distro > metapackages have done for decades, right? PyObjC is conceptually a single project that is split in multiple PyPI distributions to make it easier to install only the parts you need (and can install, PyObjC wraps macOS frameworks including some that may not be available on the OS version that you're running). The project is managed as a single entity and updates will always release new versions of all PyPI packages for the project. 'pip install pyobjc==3.1' should install that version, and should not result in a mix of versions if you use this to downgrade (which could happen if the metapackage used '>=' to specify the version of the concrete packages). BTW. I'm not sure if my choice to split PyObjC in a large collection of PyPI packages is still the right choice with current state of the packaging landscape. Ronald (the PyObjC maintainer) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ganwilliam at outlook.com Wed Feb 22 13:21:20 2017 From: ganwilliam at outlook.com (William Gan) Date: Wed, 22 Feb 2017 18:21:20 +0000 Subject: [Distutils] pip install error Message-ID: Good day, I got your email from the Installing Python Modules page in the Python 3.6.0 documentation. I encountered an error when trying to install a package in the Python IDLE shell: >>> python -m pip install numpy SyntaxError: invalid syntax I am using Windows 10. I have not been able to find any solution to this. Could you please help? Alternatively, please direct me to a web group for help. Many thanks. Gan william -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at timgolden.me.uk Wed Feb 22 13:36:36 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Wed, 22 Feb 2017 18:36:36 +0000 Subject: [Distutils] pip install error In-Reply-To: References: Message-ID: <8f8cbd22-d141-d538-7a75-9bcacc5dbb93@timgolden.me.uk> On 22/02/2017 18:21, William Gan wrote: > I got your email from the Installing Python Modules page in the Python > 3.6.0 documentation.
You might do better ask this kind of question on the Tutor list: https://mail.python.org/mailman/listinfo/tutor > I encountered an error when trying to install a package in the Python > IDLE shell: > >>>> python -m pip install numpy > > SyntaxError: invalid syntax However, the answer here is straightforward enough: the python/pip command is an operating system command, ie to be run from the Command Prompt, not from the Python prompt. So: * Press the Windows Key or otherwise invoke Windows' search mode * Start typing: cmd * An icon for the "Command Prompt" should appear. Click it. * In the window which appears, at the command prompt, type the command you used above: python -mpip install numpy TJG From dholth at gmail.com Wed Feb 22 13:37:03 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 22 Feb 2017 18:37:03 +0000 Subject: [Distutils] pip install error In-Reply-To: References: Message-ID: import subprocess; subprocess.Popen("python -m pip install sqlalchemy", stdout=subprocess.PIPE).communicate() On Wed, Feb 22, 2017 at 1:28 PM William Gan wrote: > Good day, > > > > I got your email from the Installing Python Modules page in the Python > 3.6.0 documentation. > > > > I encountered an error when trying to install a package in the Python IDLE > shell: > > >>> python -m pip install numpy > > SyntaxError: invalid syntax > > > > I am using Windows 10. > > > > I have not been able to find any solution to this. Could you please help? > Alternatively, please direct me to a web group for help. > > > > Many thanks. > > Gan william > > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ganwilliam at outlook.com Wed Feb 22 13:44:20 2017 From: ganwilliam at outlook.com (William Gan) Date: Wed, 22 Feb 2017 18:44:20 +0000 Subject: [Distutils] pip install error In-Reply-To: References: Message-ID: Hello Tim and Daniel, Many thanks for the very quick response. I forgot to mention earlier that I did try on the Command Prompt. However, then I went into Python by entering the command ?py? and tried the pip install there. Following Tim?s email, I tried at user prompt and it worked. Noted suggestion on Tutor List. I have since subscribed to it. Many thanks again. Have a great day! From: Daniel Holth [mailto:dholth at gmail.com] Sent: Thursday, February 23, 2017 2:37 AM To: William Gan ; distutils-sig at python.org Subject: Re: [Distutils] pip install error import subprocess; subprocess.Popen("python -m pip install sqlalchemy", stdout=subprocess.PIPE).communicate() On Wed, Feb 22, 2017 at 1:28 PM William Gan > wrote: Good day, I got your email from the Installing Python Modules page in the Python 3.6.0 documentation. I encountered an error when trying to install a package in the Python IDLE shell: >>> python -m pip install numpy SyntaxError: invalid syntax I am using Windows 10. I have not been able to find any solution to this. Could you please help? Alternatively, please direct me to a web group for help. Many thanks. Gan william _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Thu Feb 23 03:03:36 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:03:36 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 15 Feb 2017 23:40, "Nathaniel Smith" wrote: On Feb 15, 2017 07:41, "Nick Coghlan" wrote: Ah-hah, this does make sense as a problem, thanks! However, your solution seems very odd to me :-). If the goal is to put an "are you sure/yes I'm sure" UX barrier between users and certain version settings, then why make a distinction that every piece of downstream software has to be aware of and ignore? Pypi seems like a funny place in the stack to be implementing this. It would be much simpler to implement this feature at the build system level, like e.g. setuptools could require that dependencies that you think are over strict be specified in an install_requires_yes_i_really_mean_it= field, without requiring any metadata changes. If you're publishing to a *private* index server then version pinning should be allowed by default and you shouldn't get a warning. It's only when publishing to PyPI as a *public* index server that overly restrictive dependencies become a UX problem. The simplest way of modelling this that I've come up with is a boolean "allow pinned dependencies" flag - without the flag, "==" and "===" would emit warnings or errors when releasing to a public index server, with it they wouldn't trigger any complaints. Basically it sounds like you're saying you want to extend the metadata so that it can represent both broken and non-broken packages, so that both can be created, passed around, and checked for. And I'm saying, how about instead we do that checking when creating the package in the first place. Build time isn't right, due to this being a perfectly acceptable thing to do when building solely for private use. It's only you make the "I'm going to publish this for the entire community to use" that the intent needs to be clarified (as at that point you're switching from "I'm solving to my own problems" to "My problems may be shared by other people, and I'd like to help them out if I can"). (Of course I can't see any way to do any of this that won't break existing sdists, but I guess you've already decided you're OK with that. I guess I should say that I'm a bit dubious that this is so important in the first place; I feel like there are lots of legitimate use cases for == dependencies and lots of kinds of linting we might want to apply to try and improve the level of packaging quality.) Either way, PyPI will believe your answer, it's just refusing the temptation to guess that using "==" or "===" in the requires section is sufficient to indicate that you're deliberately publishing a pre-integrated project. > There's certainly a distinction to be made between the abstract > dependencies and the exact locked dependencies, but to me the natural > way to model that distinction is by re-using the distinction we > already have been source packages and binary packages. The build > process for this placeholder wheel is to "compile down" the abstract > dependencies into concrete dependencies, and the resulting wheel > encodes the result of this compilation. Again, no new concepts needed. Source vs binary isn't where the distinction applies, though. 
For example, it's legitimate for PyObjC to have pinned dependencies even when distributed in source form, as it's a metapackage used solely to integrate the various PyObjC subprojects into a single "release". ?? So that means that some packages have a loosely specified source that compiles down to a more strictly specified binary, and some have a more strictly specified source that compiles down to an equally strictly specified binary. That's... an argument in favor of my way of thinking about it, isn't it? That it can naturally express both situations? My point is that *for the cases where there's an important distinction between Pipfile and Pipfile.lock*, we already have a way to think about that distinction without introducing new concepts. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 03:18:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:18:55 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 18:03, Nick Coghlan wrote: > > > On 15 Feb 2017 23:40, "Nathaniel Smith" wrote: > > On Feb 15, 2017 07:41, "Nick Coghlan" wrote: > > > > Ah-hah, this does make sense as a problem, thanks! > > However, your solution seems very odd to me :-). > > If the goal is to put an "are you sure/yes I'm sure" UX barrier between > users and certain version settings, then why make a distinction that every > piece of downstream software has to be aware of and ignore? Pypi seems like > a funny place in the stack to be implementing this. It would be much > simpler to implement this feature at the build system level, like e.g. > setuptools could require that dependencies that you think are over strict > be specified in an install_requires_yes_i_really_mean_it= field, without > requiring any metadata changes. > > > If you're publishing to a *private* index server then version pinning > should be allowed by default and you shouldn't get a warning. > > It's only when publishing to PyPI as a *public* index server that overly > restrictive dependencies become a UX problem. > > The simplest way of modelling this that I've come up with is a boolean > "allow pinned dependencies" flag - without the flag, "==" and "===" would > emit warnings or errors when releasing to a public index server, with it > they wouldn't trigger any complaints. > > Basically it sounds like you're saying you want to extend the metadata so > that it can represent both broken and non-broken packages, so that both can > be created, passed around, and checked for. And I'm saying, how about > instead we do that checking when creating the package in the first place. > > > Build time isn't right, due to this being a perfectly acceptable thing to > do when building solely for private use. It's only you make the "I'm going > to publish this for the entire community to use" that the intent needs to > be clarified (as at that point you're switching from "I'm solving to my own > problems" to "My problems may be shared by other people, and I'd like to > help them out if I can"). > And TIL that Ctrl-Enter is Gmail's keyboard shortcut for sending an email :) > (Of course I can't see any way to do any of this that won't break existing > sdists, but I guess you've already decided you're OK with that. 
I guess I > should say that I'm a bit dubious that this is so important in the first > place; I feel like there are lots of legitimate use cases for == > dependencies and lots of kinds of linting we might want to apply to try and > improve the level of packaging quality.) > > Existing sdists won't have pydist.json, so none of this will apply. > > Either way, PyPI will believe your answer, it's just refusing the > temptation to guess that using "==" or "===" in the requires section > is sufficient to indicate that you're deliberately publishing a > pre-integrated project. > > > There's certainly a distinction to be made between the abstract > > dependencies and the exact locked dependencies, but to me the natural > > way to model that distinction is by re-using the distinction we > > already have been source packages and binary packages. The build > > process for this placeholder wheel is to "compile down" the abstract > > dependencies into concrete dependencies, and the resulting wheel > > encodes the result of this compilation. Again, no new concepts needed. > > Source vs binary isn't where the distinction applies, though. For > example, it's legitimate for PyObjC to have pinned dependencies even > when distributed in source form, as it's a metapackage used solely to > integrate the various PyObjC subprojects into a single "release". > > > ?? So that means that some packages have a loosely specified source that > compiles down to a more strictly specified binary, and some have a more > strictly specified source that compiles down to an equally strictly > specified binary. That's... an argument in favor of my way of thinking > about it, isn't it? That it can naturally express both situations? > > Why are you bringing source vs binary into this? That has *nothing* to do with the problem space, which is about the large grey area between "definitely doesn't work" (aka "we tested this combination and it failed") and "will almost certainly work" (aka "we tested this specific combination of dependencies and it passed"). When publishing a software *component* (such as a library or application), the most important information to communicate to users is the former (i.e. the combinations you know *don't* work), while for applications & services you typically want to communicate *both* (i.e. the combinations you know definitively *don't* work, *and* the specific combinations you tested). While you do need to do at least one build to actually run the tests, once you have those results, the metadata is just as applicable to the original source artifact as it is to the specific built binary. > My point is that *for the cases where there's an important distinction > between Pipfile and Pipfile.lock*, we already have a way to think about > that distinction without introducing new concepts. > > Most software components won't have a Pipfile or Pipfile.lock, as that's an application & service oriented way of framing the distinction. However, as Daniel said in his reply, we *do* want people to be able to publish applications and services like sentry or supervisord to PyPI, and we also want to allow people to publish metapackages like PyObjC. The problem I'm trying to address is that we *don't* currently give publishers a machine readable way to say definitively "This is a pre-integrated application, service or metapackage" rather than "This is a component intended for integration into a larger application, service or metapackage". 
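(To illustrate why that intent is currently invisible to tooling -- a minimal sketch with invented project names rather than any real package -- both kinds of publisher express themselves through exactly the same field today, so an index server has nothing to key off:)

    from setuptools import setup

    # what a reusable library typically declares:
    library_requires = ["requests >=2.10, <3"]

    # what a PyObjC-style metapackage deliberately declares,
    # pinning its own subprojects to a matching release:
    metapackage_requires = [
        "someproject-core ==3.1",
        "someproject-gui ==3.1",
    ]

    setup(
        name="someproject",
        version="3.1",
        # either list is accepted without complaint at the moment:
        install_requires=metapackage_requires,
    )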
I'm not a huge fan of having simple boolean toggles in metadata definitions (hence the more elaborate idea of two different kinds of dependency declaration), but this may be a case where that's a good way to go, since it would mean that services and tools that care can check it (with a recommendation in the spec saying that public index servers SHOULD check it), while those that don't care would continue to have a single unified set of dependency declarations to work with. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 03:33:57 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:33:57 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <20170217131851.GR12827@yuggoth.org> References: <425841221.7853973.1487103672849@mail.yahoo.com> <20170217131851.GR12827@yuggoth.org> Message-ID: On 17 February 2017 at 23:18, Jeremy Stanley wrote: > On 2017-02-17 09:56:04 +0100 (+0100), Nick Coghlan wrote: > [...] > > So if we rely on a manual "publish with pinned dependencies", "get bug > > report from redistributor or app developer", "republish with unpinned > > dependencies", we'll be in a situation where: > > > > - the affected app developer or redistributor is going to have a negative > > experience with the project > > - the responsible publisher is either going to have a negative > interaction > > with an end user or redistributor, or else they'll just silently move on > to > > find an alternative library > > - we relinquish any control of the tone used when the publisher is > alerted > > to the problem > > > > By contrast, if we design the metadata format such that *PyPI* can > provide > > a suitable error message, then: > > > > - publishers get alerted to the problem *prior* to publication > > - end users and redistributors are unlikely to encounter the problem > > directly > > - we retain full control over the tone of the error notification > [...] > > It seems like the same could be said of many common mistakes which > can be identified with some degree of certainty through analysis of > the contents being uploaded. Why not also scan for likely security > vulnerabilities with a static analyzer and refuse offending uploads > unless the uploader toggles the magic "yes I really mean it" switch? > Surely security issues are even greater downstream risks than simple > dependency problems. (NB: I'm not in favor of that either, just > nudging an example in the reductio ad absurdum direction.) > Most of the other potential checks are about forming an opinion about software quality, rather than attempting to discern publisher intent. Now, we could ask all package developers "Is this an application, service, or metapackage?", but then we'd have to get into a detailed discussion of what those terms mean, and help them decide whether or not any of them apply to what they're doing. It would also be a complete waste of their time if they're not attempting to pin any dependencies in the first place, or if they're not publishing the component to a public index server. Alternatively, we can defer asking any question at all until they do something where the difference matters: attempting to pin a dependency to a specific version when publishing to a public index server. 
At that point, there is an ambiguity in intent as there are multiple reasons somebody could be doing that: - they're actually publishing an application, service, or metapackage, so dependency pinning is entirely reasonable - they've carried over habits learned in application and service development into component and framework publishing - they've carried over habits learned in other ecosystems that encourage runtime version mixing (e.g. npm/js) into their Python publishing So the discussion in this thread has convinced me that a separate "allow_pinned_dependencies" flag is a much better way to model this than attempting to define different dependency types, but I still want to include it in the metadata model :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 03:37:06 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 08:37:06 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 08:18, Nick Coghlan wrote: > I'm not a huge fan of having simple boolean toggles in metadata definitions > (hence the more elaborate idea of two different kinds of dependency > declaration), but this may be a case where that's a good way to go, since it > would mean that services and tools that care can check it (with a > recommendation in the spec saying that public index servers SHOULD check > it), while those that don't care would continue to have a single unified set > of dependency declarations to work with. While boolean metadata may not be ideal in the general case, I think it makes sense here. If you want to make it more acceptable, maybe make it Package-Type, with values "application" or "library". On a related but tangential point, can I make a plea for using simpler language when documenting this (and even when discussing it)? The term "pre-integrated application" means very little to me in any practical sense beyond "application", and it brings a whole load of negative connotations - I deal with Java build processes on occasion, and the whole terminology there ("artifacts", "deployment units", ...) makes for a pretty hostile experience for the newcomer. I'd like to avoid Python packaging going down that route - even if the cost is a little vagueness in terms. Paul From ncoghlan at gmail.com Thu Feb 23 03:44:47 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 18:44:47 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 18:37, Paul Moore wrote: > On 23 February 2017 at 08:18, Nick Coghlan wrote: > > I'm not a huge fan of having simple boolean toggles in metadata > definitions > > (hence the more elaborate idea of two different kinds of dependency > > declaration), but this may be a case where that's a good way to go, > since it > > would mean that services and tools that care can check it (with a > > recommendation in the spec saying that public index servers SHOULD check > > it), while those that don't care would continue to have a single unified > set > > of dependency declarations to work with. 
> > While boolean metadata may not be ideal in the general case, I think > it makes sense here. If you want to make it more acceptable, maybe > make it Package-Type, with values "application" or "library". > That gets us back into the world of defining what the various package types mean, and I really don't want to go there :) Instead, I'm thinking in terms of a purely capability based field: "allow_pinned_dependencies", with the default being "False", but actually checking the field also only being a SHOULD for public index servers and a MAY for everything else. That would be enough for downstream tooling to pick up and say "I should treat this as a multi-component module rather than as an individual standalone component", *without* having to inflict the task of understanding the complexities of multi-tier distribution systems onto all component publishers :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 03:53:52 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 08:53:52 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 08:44, Nick Coghlan wrote: > That gets us back into the world of defining what the various package types > mean, and I really don't want to go there :) And yet I still don't understand what's wrong with "application", "library", and "metapackage" (the latter saying to me "complex thing that I don't need to understand"). Those terms are clear enough - after all, they are precisely the ones we've always used when debating "should you pin or not"? Sure, there's a level of judgement involved - but it's precisely the *same* judgement as we're asking authors to make when asking"should I pin", just using the underlying distinction directly. > Instead, I'm thinking in terms of a purely capability based field: > "allow_pinned_dependencies", with the default being "False", but actually > checking the field also only being a SHOULD for public index servers and a > MAY for everything else. How would the user see this? As a magic flag they have to set to "yes" so that they can pin dependencies? Because if that's the situation, I'd imagine a lot of authors just cargo-culting "add this flag to get my package to upload" without actually thinking about the implications. (They'll search Stack Overflow for the error message, so putting what it's for in the docs won't help...) 
Paul From njs at pobox.com Thu Feb 23 06:49:53 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 23 Feb 2017 03:49:53 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Thu, Feb 23, 2017 at 12:44 AM, Nick Coghlan wrote: > On 23 February 2017 at 18:37, Paul Moore wrote: >> >> On 23 February 2017 at 08:18, Nick Coghlan wrote: >> > I'm not a huge fan of having simple boolean toggles in metadata >> > definitions >> > (hence the more elaborate idea of two different kinds of dependency >> > declaration), but this may be a case where that's a good way to go, >> > since it >> > would mean that services and tools that care can check it (with a >> > recommendation in the spec saying that public index servers SHOULD check >> > it), while those that don't care would continue to have a single unified >> > set >> > of dependency declarations to work with. >> >> While boolean metadata may not be ideal in the general case, I think >> it makes sense here. If you want to make it more acceptable, maybe >> make it Package-Type, with values "application" or "library". > > > That gets us back into the world of defining what the various package types > mean, and I really don't want to go there :) > > Instead, I'm thinking in terms of a purely capability based field: > "allow_pinned_dependencies", with the default being "False", but actually > checking the field also only being a SHOULD for public index servers and a > MAY for everything else. > > That would be enough for downstream tooling to pick up and say "I should > treat this as a multi-component module rather than as an individual > standalone component", *without* having to inflict the task of understanding > the complexities of multi-tier distribution systems onto all component > publishers :) I'm still not sure I understand what you're trying to do, but this feels like you're trying to have it both ways... if you don't want to define what the different package types mean, and it's purely a capability-based field, then surely that means that downstream tooling *can't* make assumptions about what kind of package type it is based on the field? ISTM that from the point of view of downstream tooling, "allow_pinned_dependencies" carries literally no information, because all it means is "this package is on a public server and its Requires-Dist field has an == in it", which are things we already know. I can see how this would help your goal of educating uploaders about good package hygiene, but not how it helps downstream distributors. (Here's an example I've just run into that involves a == dependency on a public package: I have a library that needs to access some C API calls on Windows, but not on other platforms. The natural way to do this is to split out the CFFI code into its own package, _mylib_windows_helper or whatever, that has zero public interface, and have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == 'nt'". That way I can distribute one pure-Python wheel + one binary wheel and everything just works. But there's no sense in which this is an "integrated application" or anything, it's just a single library that usually ships in one .whl but sometimes ships in 2 .whls.) ((In actual fact I'm currently not building the package this way because setuptools makes it extremely painful to actually maintain that setup. 
Really I need the ability to build two wheels out of a single source package. Since we don't have that, I'm instead using CFFI's slow and semi-deprecated ABI mode, which lets me call C functions from a pure Python package. But what I described above is really the "right" solution, it's just tooling limitations that make it painful.)) -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Thu Feb 23 07:32:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 22:32:42 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 18:53, Paul Moore wrote: > On 23 February 2017 at 08:44, Nick Coghlan wrote: > > That gets us back into the world of defining what the various package > types > > mean, and I really don't want to go there :) > > And yet I still don't understand what's wrong with "application", > "library", and "metapackage" (the latter saying to me "complex thing > that I don't need to understand"). Those terms are clear enough - > after all, they are precisely the ones we've always used when debating > "should you pin or not"? > > Sure, there's a level of judgement involved - but it's precisely the > *same* judgement as we're asking authors to make when asking"should I > pin", just using the underlying distinction directly. > Thinking about it further, I may be OK with that, especially since we can point to concrete examples. component: a library or framework used to build Python applications. Users will mainly interact with the component via a Python API. Examples: requests, numpy, pytz application: an installable client application or web service. Users will mainly interact with the service via either the command line, a GUI, or a network interface. Examples: ckan (network), ansible (cli), spyder (GUI) metapackage: a package that collects specific versions of other components into a single installable group Example: PyObjC And then we'd note in the spec that public index servers SHOULD warn when components use pinned dependencies, while other tools MAY warn about that case. Going down that path would also end up addressing this old RFE for the packaging user guide: https://github.com/pypa/python-packaging-user-guide/issues/100 > > Instead, I'm thinking in terms of a purely capability based field: > > "allow_pinned_dependencies", with the default being "False", but actually > > checking the field also only being a SHOULD for public index servers and > a > > MAY for everything else. > > How would the user see this? As a magic flag they have to set to "yes" > so that they can pin dependencies? Because if that's the situation, > I'd imagine a lot of authors just cargo-culting "add this flag to get > my package to upload" without actually thinking about the > implications. (They'll search Stack Overflow for the error message, so > putting what it's for in the docs won't help...) > Pre-answering questions on SO can work incredibly well, though :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From freddyrietdijk at fridh.nl Thu Feb 23 08:04:58 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Thu, 23 Feb 2017 14:04:58 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > Here's an example I've just run into that involves a == dependency on a public package: I have a library that needs to access some C API calls on Windows, but not on other platforms. The natural way to do this is to split out the CFFI code into its own package, _mylib_windows_helper or whatever, that has zero public interface, and have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == 'nt'". You have a public library, that, depending on the platform, depends on a public (helper) library that has no public interface? That doesn't sound good to me. If you don't want to implement a public interface then it should just be included in the main library because it is in the end a requirement of the library. It's a pity you can't have a universal wheel but so be it. Choosing to depend on an exact version of a package that has no public interface is in my opinion the wrong solution. As I stated before, though perhaps not explicitly, I cannot think of *any* good reason that one uses == in `install_requires`. Something like `>= 1.7, < 1.8` should be sufficient. In the CFFI case that should be sufficient unless you change your function signatures in a maintenance release (which is bad). And in case of a metapackage like PyObjC this should also be sufficient because it will downgrade dependencies when downgrading the metapackage while still giving you the latest maintenance releases of the dependencies. Regarding 'application', 'library', and 'metapackage'. In Nixpkgs we distinguish Python libraries and applications. Applications are available for 1 version of the interpreter, whereas libraries are available for all (supported) interpreter versions. It's nice if it were more explicit on say PyPI whether a package is a library or an application. There are difficult cases though, e.g., `ipython`. Is that an application or a library? As a user I would argue that it is an application, however, it should be available for each version of the interpreter and that's why we branded it a library. Metapackages. `jupyter` is a metapackage. We had to put it with the rest of the Python libraries for the same reason as we put `ipython` there. From a distributions' point of view I don't see why you would want to have them mentioned separately. On Thu, Feb 23, 2017 at 12:49 PM, Nathaniel Smith wrote: > On Thu, Feb 23, 2017 at 12:44 AM, Nick Coghlan wrote: > > On 23 February 2017 at 18:37, Paul Moore wrote: > >> > >> On 23 February 2017 at 08:18, Nick Coghlan wrote: > >> > I'm not a huge fan of having simple boolean toggles in metadata > >> > definitions > >> > (hence the more elaborate idea of two different kinds of dependency > >> > declaration), but this may be a case where that's a good way to go, > >> > since it > >> > would mean that services and tools that care can check it (with a > >> > recommendation in the spec saying that public index servers SHOULD > check > >> > it), while those that don't care would continue to have a single > unified > >> > set > >> > of dependency declarations to work with. > >> > >> While boolean metadata may not be ideal in the general case, I think > >> it makes sense here.
If you want to make it more acceptable, maybe > >> make it Package-Type, with values "application" or "library". > > > > > > That gets us back into the world of defining what the various package > types > > mean, and I really don't want to go there :) > > > > Instead, I'm thinking in terms of a purely capability based field: > > "allow_pinned_dependencies", with the default being "False", but actually > > checking the field also only being a SHOULD for public index servers and > a > > MAY for everything else. > > > > That would be enough for downstream tooling to pick up and say "I should > > treat this as a multi-component module rather than as an individual > > standalone component", *without* having to inflict the task of > understanding > > the complexities of multi-tier distribution systems onto all component > > publishers :) > > I'm still not sure I understand what you're trying to do, but this > feels like you're trying to have it both ways... if you don't want to > define what the different package types mean, and it's purely a > capability-based field, then surely that means that downstream tooling > *can't* make assumptions about what kind of package type it is based > on the field? ISTM that from the point of view of downstream tooling, > "allow_pinned_dependencies" carries literally no information, because > all it means is "this package is on a public server and its > Requires-Dist field has an == in it", which are things we already > know. I can see how this would help your goal of educating uploaders > about good package hygiene, but not how it helps downstream > distributors. > > (Here's an example I've just run into that involves a == dependency on > a public package: I have a library that needs to access some C API > calls on Windows, but not on other platforms. The natural way to do > this is to split out the CFFI code into its own package, > _mylib_windows_helper or whatever, that has zero public interface, and > have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == > 'nt'". That way I can distribute one pure-Python wheel + one binary > wheel and everything just works. But there's no sense in which this is > an "integrated application" or anything, it's just a single library > that usually ships in one .whl but sometimes ships in 2 .whls.) > > ((In actual fact I'm currently not building the package this way > because setuptools makes it extremely painful to actually maintain > that setup. Really I need the ability to build two wheels out of a > single source package. Since we don't have that, I'm instead using > CFFI's slow and semi-deprecated ABI mode, which lets me call C > functions from a pure Python package. But what I described above is > really the "right" solution, it's just tooling limitations that make > it painful.)) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Thu Feb 23 08:13:31 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 23:13:31 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 23:04, Freddy Rietdijk wrote: > > Here's an example I've just run into that involves a == dependency on > a public package: I have a library that needs to access some C API > calls on Windows, but not on other platforms. The natural way to do > this is to split out the CFFI code into its own package, > _mylib_windows_helper or whatever, that has zero public interface, and > have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == > 'nt'". > > You have a public library, that, depending on the platform, depends on a > public (helper) library that has no public interface? That doesn't sound > good to me. If you don't want to implement a public interface then it > should just be included in the main library because it is in the end a > requirement of the library. It's a pity you can't have a universal wheel > but so be it. Choosing to depend on an exact version of a package that has > no public interfance is in my opinion the wrong solution. > > As I stated before, though perhaps not explicitly, I cannot think of *any* > good reason that one uses == in `install_requires`. Something like `>= 1.7, > < 1.8` should be sufficient. In the CFFI case that should be sufficient > unless you change your function signatures in a maintenance release (which > is bad). And in case of a metapackage like PyObjC this should also be > sufficient because it will downgrade dependencies when downgrading the > metapackage while still giving you the latest maintenance releases of the > dependencies. > > Regarding 'application', 'library', and 'metapackage'. In Nixpkgs we > distinguish Python libraries and applications. Applications are available > for 1 version of the interpreter, whereas libraries are available for all > (supported) interpreter versions. It's nice if it were more explicit on say > PyPI whether a package is a library or an application. There are difficult > cases though, e.g., `ipython`. Is that an application or a library? As user > I would argue that it is an application, however, it should be available > for each version of the interpreter and that's why we branded it a library. > That sounds pretty similar to the distinction in Fedora as well, which has been highlighted by the Python 3 migration effort: libraries emit both Python 2 & 3 RPMs from their source RPM (and will for as long as Fedora and the library both support Python 2), while applications just switch from depending on Python 2 to depending on Python 3 instead. > Metapackages. `jupyter` is a metapackage. We had to put it with the rest > of the Python libraries for the same reason as we put `ipython` there. From > a distributions' point of view I don't see why you would want to have > them mentioned separately. > >From a distro point of view, explicit upstream metapackages would provide a hint saying "these projects should be upgraded as a unit rather than independently". We're free to ignore that hint if we want to, but doing so means we get to keep the pieces if they break rather than just being able to report the problem back upstream :) Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 08:27:23 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 13:27:23 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 12:32, Nick Coghlan wrote: > component: a library or framework used to build Python applications. Users > will mainly interact with the component via a Python API. Examples: > requests, numpy, pytz Sorry to nitpick, but why is "component" better than "library"? People typically understand that "library" includes "framework" in this context. OTOH someone who's written a new library won't necessarily know that in this context (and *only* this context) we want them to describe it as a "component". (As far as I know, we don't use the term "component" anywhere else in the Python ecosystem currently). This feels to me somewhat like the failed attempts to force a distinction between "package" and "distribution". In the end, people use the terms they are comfortable with, and work with a certain level of context-dependent ambiguity. Of course, if the goal here is to raise the barrier for entry to PyPI by expecting people to have to understand this type of concept and the implications before uploading, then that's fair. It's not something I think we should be aiming for personally, but I can see that organisations who want to be able to rely on the quality of what's available on PyPI would be in favour of a certain level of self-selection being applied. Personally I view PyPI as more of a public resource, like github, where it's up to the consumer to assess quality - so to me this is a change of focus. But YMMV. Paul From p.f.moore at gmail.com Thu Feb 23 08:42:23 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 13:42:23 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 13:04, Freddy Rietdijk wrote: >> Here's an example I've just run into that involves a == dependency on >> a public package: I have a library that needs to access some C API >> calls on Windows, but not on other platforms. The natural way to do >> this is to split out the CFFI code into its own package, >> _mylib_windows_helper or whatever, that has zero public interface, and >> have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == >> 'nt '". > > You have a public library, that, depending on the platform, depends on a > public (helper) library that has no public interface? That doesn't sound > good to me. If you don't want to implement a public interface then it should > just be included in the main library because it is in the end a requirement > of the library. It's a pity you can't have a universal wheel but so be it. > Choosing to depend on an exact version of a package that has no public > interfance is in my opinion the wrong solution. The helper library is only public in the sense that it's published on PyPI. I'd describe it as an optional helper. 
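(For reference, a minimal sketch of how that kind of split can already be declared, using made-up names along the lines of Nathaniel's example rather than any real project -- the PEP 508 environment marker means the pinned helper is only pulled in on Windows:)

    from setuptools import setup

    setup(
        name="mylib",
        version="1.2.3",
        packages=["mylib"],
        install_requires=[
            # only installed on Windows, pinned to the matching release:
            "mylib-windows-helper ==1.2.3 ; os_name == 'nt'",
        ],
    )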
If PyPI had a way of marking such libraries as "only allow downloading to satisfy a dependency" then I'd say mark it that way - but we don't. Requiring non-universal (and consequently version-dependent) wheels for platforms that don't need them seems like a cure that's worse than the disease. Personally, I find Nathaniel's example to be a compelling reason for wanting to specify exact dependencies for something that's not an "application". As an author, it's how I'd prefer to bundle a package like this. And IMO, if distributions prefer that I don't do that, I'd say it's up to them to explain what they want me to do, and how it'll benefit me and my direct users. At the moment all I'm seeing is "you should" and "it's the wrong solution" - you may be right, but surely it's obvious that you need to explain*why* your view is correct? Or at a minimum, if there is no direct benefit to me, why I, as an author, should modify my preferred development model to make things easier for you. Not all packages published on PyPI need or want to be bundled into OS distributions[1]. Paul [1] OTOH, the bulk of this discussion is currently about theoretical cases anyway. Maybe it would be worth everyone (myself included) taking a deep breath, and refocusing on actual cases where there is a problem right now (I don't know if anyone can identify such cases - I know I can't). Asking directly of the authors of such packages "would you be OK with the following proposal" would likely be very enlightening. From ncoghlan at gmail.com Thu Feb 23 08:47:01 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2017 23:47:01 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 22:32, Nick Coghlan wrote: > On 23 February 2017 at 18:53, Paul Moore wrote: > >> On 23 February 2017 at 08:44, Nick Coghlan wrote: >> > That gets us back into the world of defining what the various package >> types >> > mean, and I really don't want to go there :) >> >> And yet I still don't understand what's wrong with "application", >> "library", and "metapackage" (the latter saying to me "complex thing >> that I don't need to understand"). Those terms are clear enough - >> after all, they are precisely the ones we've always used when debating >> "should you pin or not"? >> >> Sure, there's a level of judgement involved - but it's precisely the >> *same* judgement as we're asking authors to make when asking"should I >> pin", just using the underlying distinction directly. >> > > Thinking about it further, I may be OK with that, especially since we can > point to concrete examples. > > component: a library or framework used to build Python applications. > Users will mainly interact with the component via a Python API. Examples: > requests, numpy, pytz > Slight amendment here to use the term "library" rather than the generic component (freeing up the latter for its usual meaning in referring to arbitrary software components). I also realised that we need a separate category to cover packages like "pip" itself, and I chose "tool" based on the name of the field in pyproject.toml: ============ library: a software component used to build Python applications. Users will mainly interact with the component via a Python API. Libraries are essentially dynamic plugins for a Python runtime. 
Examples: requests, numpy, pytz tool: a software utility used to develop and deploy Python libraries, applications, and scripts. Users will mainly interact with the component via the command line, or a GUI. Examples: pip, pycodestyle, gunicorn, jupyter application: an installable client application or web service. Users will mainly interact with the service via either the command line, a GUI, or a network interface. While they may expose Python APIs to end users, the fact they're written in Python themselves is technically an implementation detail, making it possible to use them without even being aware that Python exists. Examples: ckan (network), ansible (cli), spyder (GUI) metapackage: a package that collects specific versions of other components into a single installable group. Example: PyObjC ============ I think a package_type field with those possible values would cover everything I was worried about when I came up with the idea of the separate "integrates" field, and it seems like it would be relatively straightforward to explain to newcomers. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From encukou at gmail.com Thu Feb 23 09:24:02 2017 From: encukou at gmail.com (Petr Viktorin) Date: Thu, 23 Feb 2017 15:24:02 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> Message-ID: <313f03c2-a6e4-eaed-2c57-92bd82be48a8@gmail.com> On 02/23/2017 02:47 PM, Nick Coghlan wrote: > > ============ > library: a software component used to build Python applications. > Users will mainly interact with the component via a Python API. > Libraries are essentially dynamic plugins for a Python runtime. > Examples: requests, numpy, pytz Assuming frameworks are included, it woud be useful to add e.g. "django" to the examples. > tool: a software utility used to develop and deploy Python > libraries, applications, and scripts. Users will mainly interact with > the component via the command line, or a GUI. Examples: pip, > pycodestyle, gunicorn, jupyter > application: an installable client application or web service. Users > will mainly interact with the service via either the command line, a > GUI, or a network interface. While they may expose Python APIs to end > users, the fact they're written in Python themselves is technically an > implementation detail, making it possible to use them without even being > aware that Python exists. Examples: ckan (network), ansible (cli), > spyder (GUI) > metapackage: a package that collects specific versions of other > components into a single installable group. Example: PyObjC > ============ From p.f.moore at gmail.com Thu Feb 23 09:28:03 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 14:28:03 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 13:47, Nick Coghlan wrote: > > Slight amendment here to use the term "library" rather than the generic > component (freeing up the latter for its usual meaning in referring to > arbitrary software components). 
I also realised that we need a separate > category to cover packages like "pip" itself, and I chose "tool" based on > the name of the field in pyproject.toml: > > ============ > library: a software component used to build Python applications. Users > will mainly interact with the component via a Python API. Libraries are > essentially dynamic plugins for a Python runtime. Examples: requests, numpy, > pytz > tool: a software utility used to develop and deploy Python libraries, > applications, and scripts. Users will mainly interact with the component via > the command line, or a GUI. Examples: pip, pycodestyle, gunicorn, jupyter > application: an installable client application or web service. Users > will mainly interact with the service via either the command line, a GUI, or > a network interface. While they may expose Python APIs to end users, the > fact they're written in Python themselves is technically an implementation > detail, making it possible to use them without even being aware that Python > exists. Examples: ckan (network), ansible (cli), spyder (GUI) > metapackage: a package that collects specific versions of other > components into a single installable group. Example: PyObjC > ============ > > I think a package_type field with those possible values would cover > everything I was worried about when I came up with the idea of the separate > "integrates" field, and it seems like it would be relatively straightforward > to explain to newcomers. Yeah, that looks good. I'd assume that: (1) The field is optional. (2) The field is 99% for information only, with the only imposed semantics being that PyPI can reject use of == constraints in install_requires unless the type is explicitly "application" or "metapackage". Specifically, I doubt people will make a firm distinction between "tool" and "library". In many cases it'll be a matter of opinion. Is py.test a tool or a library? It has a command line interface after all. I'd also drop "used to develop and deploy Python libraries, applications, and scripts" - why does what it's used for affect its category? I can think of examples I think of as "tools" that are general purpose (e.g. youtube-dl) but I'd expect you to claim they are "applications". But unless they pin their dependencies (which youtube-dl doesn't AFAIK) the distinction is irrelevant. So I prefer to leave it to the author to decide, rather than force an artificial split. Thanks for taking the time to address my concerns! Paul From p.f.moore at gmail.com Thu Feb 23 09:28:31 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 14:28:31 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <313f03c2-a6e4-eaed-2c57-92bd82be48a8@gmail.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <313f03c2-a6e4-eaed-2c57-92bd82be48a8@gmail.com> Message-ID: On 23 February 2017 at 14:24, Petr Viktorin wrote: > On 02/23/2017 02:47 PM, Nick Coghlan wrote: >> >> >> ============ >> library: a software component used to build Python applications. >> Users will mainly interact with the component via a Python API. >> Libraries are essentially dynamic plugins for a Python runtime. >> Examples: requests, numpy, pytz > > > Assuming frameworks are included, it woud be useful to add e.g. "django" to > the examples. 
+1 From thomas at kluyver.me.uk Thu Feb 23 09:49:07 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Thu, 23 Feb 2017 14:49:07 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: <1487861347.1340695.890514816.4132E607@webmail.messagingengine.com> On Thu, Feb 23, 2017, at 02:28 PM, Paul Moore wrote: > I'd also drop "used to develop and deploy Python libraries, > applications, and scripts" - why does what it's used for affect its > category? Things for working on & with Python code often have installation requirements a bit different from other applications. E.g. pip installs (or used to) with aliases specific to the Python version it runs on, so pip, pip3 and pip-3.5 could all point to the same command. Clearly it wouldn't make sense to do that for youtube-dl. I'm not sure about 'tool' as a name for this category, but they often do require different handling to general applications. Thomas From ncoghlan at gmail.com Thu Feb 23 10:09:29 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2017 01:09:29 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 24 February 2017 at 00:28, Paul Moore wrote: > Specifically, I doubt people will make a firm distinction between > "tool" and "library". In many cases it'll be a matter of opinion. Is > py.test a tool or a library? It has a command line interface after > all. I'd also drop "used to develop and deploy Python libraries, > applications, and scripts" - why does what it's used for affect its > category? I can think of examples I think of as "tools" that are > general purpose (e.g. youtube-dl) but I'd expect you to claim they are > "applications". But unless they pin their dependencies (which > youtube-dl doesn't AFAIK) the distinction is irrelevant. So I prefer > to leave it to the author to decide, rather than force an artificial > split. > The difference is that: * tool = you typically want at least one copy per Python interpreter (like a library) * application = you typically only want one copy per system It may be clearer to make the former category "devtool", since it really is specific to tools that are coupled to the task of Python development. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Feb 23 10:11:10 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 15:11:10 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <1487861347.1340695.890514816.4132E607@webmail.messagingengine.com> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> <1487861347.1340695.890514816.4132E607@webmail.messagingengine.com> Message-ID: On 23 February 2017 at 14:49, Thomas Kluyver wrote: > On Thu, Feb 23, 2017, at 02:28 PM, Paul Moore wrote: >> I'd also drop "used to develop and deploy Python libraries, >> applications, and scripts" - why does what it's used for affect its >> category? 
> > Things for working on & with Python code often have installation > requirements a bit different from other applications. E.g. pip installs > (or used to) with aliases specific to the Python version it runs on, so > pip, pip3 and pip-3.5 could all point to the same command. Clearly it > wouldn't make sense to do that for youtube-dl. > > I'm not sure about 'tool' as a name for this category, but they often do > require different handling to general applications. Point taken, but in the absence of a behavioural difference, why not let the author decide? If I wrote "grep in Python", I'd call it a tool, not an application. The author of pyline (https://pypi.python.org/pypi/pyline) describes it as a "tool". For me, command line utilities are typically called tools. Applications tend to have (G)UIs. I don't think we should repurpose existing terms. And unless we're planning on enforcing different behaviour, I don't think we need to try to dictate at all. If we were to add a facility to create versioned names (rather than just having a special-case hack for pip) then I could imagine restricting it to certain package types - although I can't imagine why we would bother doing so - but let's not worry about that until it happens. Or maybe we'd want to insist that pip only allow build tools to have a certain package type (setuptools, flit, ...) but again, why bother? What's the gain? Paul From p.f.moore at gmail.com Thu Feb 23 10:27:01 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2017 15:27:01 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 23 February 2017 at 15:09, Nick Coghlan wrote: > The difference is that: > > * tool = you typically want at least one copy per Python interpreter (like a > library) > * application = you typically only want one copy per system > > It may be clearer to make the former category "devtool", since it really is > specific to tools that are coupled to the task of Python development. Ah, OK. That's a good distinction, but I'd avoid linking it to "used for developing Python code". I wouldn't call pyline something used for developing Python code, although you'd want to install it to the (possibly multiple) Python versions you want to use in your one-liners. OTOH, I'd agree you want copies of Jupyter per interpreter, although I'd call Jupyter an application, not a development tool. There's a lot of people who would view Jupyter as an application with a built in Python interpreter rather than the other way around. And do you want to say that Jupyter cannot pin dependencies because it's a "tool" rather than an "application"? Maybe we should keep the package type neutral on this question, and add a separate field to denote one per system vs one per interpreter? But again, without proposed behaviour tied to the value, I'm inclined not to care. (And not to add metadata that no-one will bother using). 
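(One concrete data point on the "one copy per Python interpreter" side of this: the pip / pip3 / pip-3.5 aliasing Thomas mentioned is nothing more exotic than version-suffixed console_scripts entry points computed at build time. From memory it is roughly the trick pip's own setup.py uses; the project name and its main() entry point below are invented for the example:

    import sys
    from setuptools import setup

    # Generate "sometool", "sometool3" and e.g. "sometool3.6" aliases that
    # all point at the same entry point in the current interpreter.
    suffixes = ["", str(sys.version_info[0]),
                "{}.{}".format(*sys.version_info[:2])]

    setup(
        name="sometool",
        version="1.0",
        py_modules=["sometool"],
        entry_points={
            "console_scripts": [
                "sometool{} = sometool:main".format(suffix)
                for suffix in suffixes
            ],
        },
    )

It's also part of why it's a hack - the aliases get baked into the metadata by whichever interpreter happens to build the wheel.)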
Paul From donald at stufft.io Thu Feb 23 10:46:06 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 10:46:06 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > On Feb 23, 2017, at 6:49 AM, Nathaniel Smith wrote: > > (Here's an example I've just run into that involves a == dependency on > a public package: I have a library that needs to access some C API > calls on Windows, but not on other platforms. The natural way to do > this is to split out the CFFI code into its own package, > _mylib_windows_helper or whatever, that has zero public interface, and > have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == > 'nt'". That way I can distribute one pure-Python wheel + one binary > wheel and everything just works. But there's no sense in which this is > an "integrated application" or anything, it's just a single library > that usually ships in one .whl but sometimes ships in 2 .whls.) > > ((In actual fact I'm currently not building the package this way > because setuptools makes it extremely painful to actually maintain > that setup. Really I need the ability to build two wheels out of a > single source package. Since we don't have that, I'm instead using > CFFI's slow and semi-deprecated ABI mode, which lets me call C > functions from a pure Python package. But what I described above is > really the "right" solution, it's just tooling limitations that make > it painful.)) Another way of handling this is to just publish a universal wheel and a Windows binary wheel. Pip will select the more specific one (the binary one) over the universal wheel when it is available. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 23 11:04:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2017 02:04:26 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On 24 February 2017 at 01:27, Paul Moore wrote: > On 23 February 2017 at 15:09, Nick Coghlan wrote: > > The difference is that: > > > > * tool = you typically want at least one copy per Python interpreter > (like a > > library) > > * application = you typically only want one copy per system > > > > It may be clearer to make the former category "devtool", since it really > is > > specific to tools that are coupled to the task of Python development. > > Ah, OK. That's a good distinction, but I'd avoid linking it to "used > for developing Python code". I wouldn't call pyline something used for > developing Python code, although you'd want to install it to the > (possibly multiple) Python versions you want to use in your > one-liners. OTOH, I'd agree you want copies of Jupyter per > interpreter, although I'd call Jupyter an application, not a > development tool. There's a lot of people who would view Jupyter as an > application with a built in Python interpreter rather than the other > way around. And do you want to say that Jupyter cannot pin > dependencies because it's a "tool" rather than an "application"? > It provides a frame for a discussion between publishers and redistributors on how publishers would like their software to be treated. 
Marking it as an application is saying "Treat it as a standalone application, and don't try to integrate it with anything else" Marking it as a library is saying "Treat it as a Python component that expects to be integrated into a larger application" Marking it as a metapackage is saying "Treat this particular set of libraries as a coherent whole, and don't try to mix-and-match other versions" Marking it as a devtool is saying "This doesn't export a stable Python API (except maybe to plugins), but you should treat it as a library anyway" Redistributors may *ask* a publisher to reclassify their project as a library or a devtool (and hence also avoid pinning their dependencies in order to make integration easier), but publishers will always have the option of saying "No, we want you to treat it as an application, and we won't help your end users if we know you're overriding our pinned dependencies and the issue can't be reproduced outside your custom configuration". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Feb 23 12:14:42 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 12:14:42 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: > On Feb 23, 2017, at 11:04 AM, Nick Coghlan wrote: > > Redistributors may *ask* a publisher to reclassify their project as a library or a devtool (and hence also avoid pinning their dependencies in order to make integration easier), but publishers will always have the option of saying "No, we want you to treat it as an application, and we won't help your end users if we know you're overriding our pinned dependencies and the issue can't be reproduced outside your custom configuration". This whole discussion feels like trying to overcomplicate something that's already not simple, to solve a problem that I don't think is really that widespread. My estimation is that 99% of people who are currently using ``==`` will just immediately switch over to using whatever flag we provide that allows them to still do that. Adding a "do the thing I asked for" detritus to the project seems like a bad idea. It's not really any different than if a project, say, only released Wheels. While we want to encourage projects to release sdists (and to not pin versions), trying to enforce that isn't worth the cost. Like most packaging issues, I think that it's best solved by opening up issues on the offending project's issue tracker. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Thu Feb 23 12:41:58 2017 From: dholth at gmail.com (Daniel Holth) Date: Thu, 23 Feb 2017 17:41:58 +0000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: Another way to look at the problem is that it is just too hard to override what the package says. For example in buildout you can provide a patch for any package that does not do exactly what you want, and it is applied during installation. This could include patching the dependencies.
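Concretely, the "patch" can be as small as rewriting the requirement strings before they ever reach the resolver. A rough sketch of the idea (this is not a hook buildout actually exposes under this name; the function is invented and it leans on the packaging library's requirement parser):

    from packaging.requirements import Requirement

    def override_requirements(declared, overrides):
        # declared: PEP 508 requirement strings as published by the package.
        # overrides: mapping of project name -> replacement requirement string.
        adjusted = []
        for req_string in declared:
            name = Requirement(req_string).name.lower()
            adjusted.append(overrides.get(name, req_string))
        return adjusted

    # e.g. relax a pin that the integrator knows is too strict
    print(override_requirements(
        ["transaction==1.4.4", "zope.interface"],
        {"transaction": "transaction>=1.4.4"},
    ))

The rewrite itself is trivial; the missing piece is a supported place in the mainstream tools to plug it in.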
On Thu, Feb 23, 2017 at 12:15 PM Donald Stufft wrote: On Feb 23, 2017, at 11:04 AM, Nick Coghlan wrote: Redistributors may *ask* a publisher to reclassify their project as a library or a devtool (and hence also avoid pinning their dependencies in order to make integration easier), but publishers will always have the option of saying "No, we want to you to treat it as an application, and we won't help your end users if we know you're overriding our pinned dependencies and the issue can't be reproduced outside your custom configuration". This whole discussion feels like trying to overcomplicate something that?s already not a simple to solve a problem that I don?t think is really that widespread. My estimation is that 99% of people who are currently using ``==`` will just immediately switch over to using whatever flag we provide that allows them to still do that. Adding a ?do the thing I asked for? detritus to the project seems like a bad idea. It?s not really any different than if a project say, only released Wheels. While we want to encourage projects to release sdists (and to not ping versions) trying to enforce that isn?t worth the cost. Like most packaging issues, I think that it?s best solved by opening up issues on the offending project?s issue tracker. ? Donald Stufft _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Thu Feb 23 13:19:20 2017 From: steve.dower at python.org (Steve Dower) Date: Thu, 23 Feb 2017 10:19:20 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> Message-ID: <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> On 23Feb2017 0914, Donald Stufft wrote: > >> On Feb 23, 2017, at 11:04 AM, Nick Coghlan > > wrote: >> >> Redistributors may *ask* a publisher to reclassify their project as a >> library or a devtool (and hence also avoid pinning their dependencies >> in order to make integration easier), but publishers will always have >> the option of saying "No, we want to you to treat it as an >> application, and we won't help your end users if we know you're >> overriding our pinned dependencies and the issue can't be reproduced >> outside your custom configuration". > > > This whole discussion feels like trying to overcomplicate something > that?s already not a simple to solve a problem that I don?t think is > really that widespread. My estimation is that 99% of people who are > currently using ``==`` will just immediately switch over to using > whatever flag we provide that allows them to still do that. Adding a ?do > the thing I asked for? detritus to the project seems like a bad idea. > > It?s not really any different than if a project say, only released > Wheels. While we want to encourage projects to release sdists (and to > not ping versions) trying to enforce that isn?t worth the cost. Like > most packaging issues, I think that it?s best solved by opening up > issues on the offending project?s issue tracker. +1. This has been my feeling the entire time I spent catching up on the thread just now. As soon as "user education" becomes a requirement, we may as well do the simplest and least restrictive metadata possible and use the education to help people understand the impact of their decisions. 
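In code terms the "education" can be as light as an advisory warning at build or upload time, rather than anything enforced by the metadata itself. A minimal sketch of the shape such a check could take (not an existing twine or PyPI feature, just an illustration using the packaging library):

    from packaging.requirements import Requirement

    def warn_about_exact_pins(requirement_strings):
        # Flag '==' / '===' pins but let them through; the author decides.
        for req_string in requirement_strings:
            req = Requirement(req_string)
            if any(spec.operator in ("==", "===") for spec in req.specifier):
                print("warning: {!r} pins an exact version; consider '~=' "
                      "unless this really is an application".format(req_string))

    warn_about_exact_pins(["requests==2.13.0", "six>=1.9"])

Something along those lines keeps the decision with the author while still making the trade-off visible.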
Cheers, Steve From xav.fernandez at gmail.com Thu Feb 23 14:56:20 2017 From: xav.fernandez at gmail.com (Xavier Fernandez) Date: Thu, 23 Feb 2017 20:56:20 +0100 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: +1 also. This whole double requirement feels over-complicated for what seems like a rather small usecase: it would be interesting to have a few stats on the number of packages concerned by this pinning (maybe just scan all the last uploaded wheels of each package ?). And if one needs to classify packages type, why not add a new high level trove classifier ? Le 23 f?vr. 2017 19:19, "Steve Dower" a ?crit : On 23Feb2017 0914, Donald Stufft wrote: > > On Feb 23, 2017, at 11:04 AM, Nick Coghlan > > wrote: >> >> Redistributors may *ask* a publisher to reclassify their project as a >> library or a devtool (and hence also avoid pinning their dependencies >> in order to make integration easier), but publishers will always have >> the option of saying "No, we want to you to treat it as an >> application, and we won't help your end users if we know you're >> overriding our pinned dependencies and the issue can't be reproduced >> outside your custom configuration". >> > > > This whole discussion feels like trying to overcomplicate something > that?s already not a simple to solve a problem that I don?t think is > really that widespread. My estimation is that 99% of people who are > currently using ``==`` will just immediately switch over to using > whatever flag we provide that allows them to still do that. Adding a ?do > the thing I asked for? detritus to the project seems like a bad idea. > > It?s not really any different than if a project say, only released > Wheels. While we want to encourage projects to release sdists (and to > not ping versions) trying to enforce that isn?t worth the cost. Like > most packaging issues, I think that it?s best solved by opening up > issues on the offending project?s issue tracker. > +1. This has been my feeling the entire time I spent catching up on the thread just now. As soon as "user education" becomes a requirement, we may as well do the simplest and least restrictive metadata possible and use the education to help people understand the impact of their decisions. Cheers, Steve _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Feb 23 15:42:59 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 23 Feb 2017 12:42:59 -0800 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <2019192621.7718748.1487095806195@mail.yahoo.com> <425841221.7853973.1487103672849@mail.yahoo.com> Message-ID: On Feb 23, 2017 7:46 AM, "Donald Stufft" wrote: On Feb 23, 2017, at 6:49 AM, Nathaniel Smith wrote: (Here's an example I've just run into that involves a == dependency on a public package: I have a library that needs to access some C API calls on Windows, but not on other platforms. The natural way to do this is to split out the CFFI code into its own package, _mylib_windows_helper or whatever, that has zero public interface, and have mylib v1.2.3 require "_mylib_windows_helper==1.2.3; os_name == 'nt'". 
That way I can distribute one pure-Python wheel + one binary wheel and everything just works. But there's no sense in which this is an "integrated application" or anything, it's just a single library that usually ships in one .whl but sometimes ships in 2 .whls.) ((In actual fact I'm currently not building the package this way because setuptools makes it extremely painful to actually maintain that setup. Really I need the ability to build two wheels out of a single source package. Since we don't have that, I'm instead using CFFI's slow and semi-deprecated ABI mode, which lets me call C functions from a pure Python package. But what I described above is really the "right" solution, it's just tooling limitations that make it painful.)) Another way of handling this is to just publish a universal wheel and a Windows binary wheel. Pip will select the more specific one (the binary one) over the universal wheel when it is available. Thanks, I was wondering about that :-). Still, I don't really like this solution in this case, because if someone did install the universal wheel on Windows it would be totally broken, yet there'd be no metadata to warn them. (This is a case where the binary isn't just an acceleration module, but is providing crucial functionality.) Even if pip wouldn't do this automatically, it's easy to imagine cases where it would happen. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthony at xtfx.me Thu Feb 23 16:51:29 2017 From: anthony at xtfx.me (C Anthony Risinger) Date: Thu, 23 Feb 2017 15:51:29 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: On Thu, Feb 23, 2017 at 1:56 PM, Xavier Fernandez wrote: > +1 also. > This whole double requirement feels over-complicated for what seems like a > rather small usecase: it would be interesting to have a few stats on the > number of packages concerned by this pinning (maybe just scan all the last > uploaded wheels of each package ?). > FWIW, an application packaging tool I wrote several years ago used to run into dependency solver problems quite often. The tool leaned on distlib (which is a pretty nice library, but strict, as noted in OP) because there is/was no interface to pip. IIRC we upstreamed a few patches related to this and for sure carried some local patches. The distlib solver would bind up from impossible constraints, yet every time, pip found a way to "power through" the exact same configuration despite blatantly incompatible metadata at times. I never looked into it further on pip's side (though probably someone here can confirm/deny this) but I suspect poor metadata is more widespread than pip makes visible. I had a dump from 2014 of the distlib data at red-dove.com and I ran a quick script against it: https://gist.github.com/anthonyrisinger/f9140191009fb1ec1434cb0585a4a75c

total_projects: 41228
total_projects_eq: 182
% affected: 0.44%

total_files: 285248
total_files_eq: 1276
% affected: 0.45%

total_reqs: 642447
total_reqs_bare: 460080
% affected: 71.61%

I know the distlib data (from 2014 at least) is imperfect, but this would suggest not many projects use "==" explicitly. Maybe the bigger problem is that 75% of requirements have no version specifier at all.
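The real script is in the gist above; roughly speaking, the counting just amounts to walking the requirement strings and bucketing them, along the lines of this toy version (the dump's actual layout and the error handling are elided):

    from packaging.requirements import Requirement

    def classify(requirement_strings):
        # Count bare requirements and '==' pins among PEP 508 strings.
        totals = {"total": 0, "bare": 0, "exact": 0}
        for req_string in requirement_strings:
            req = Requirement(req_string)
            totals["total"] += 1
            if not req.specifier:
                totals["bare"] += 1
            elif any(spec.operator == "==" for spec in req.specifier):
                totals["exact"] += 1
        return totals

    print(classify(["six", "requests>=2.0", "transaction==1.4.4"]))
    # -> {'total': 3, 'bare': 1, 'exact': 1}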
I know for us this specifically contributed to our solver problems because distlib was eager about choosing a version for such requirements, even though a later package might fulfill the requirement. Maybe this has since changed, but we needed to patch it at the time [1]. We really have to figure out this distribution stuff friends. Existing files, new metadata files, old PEPs, new PEPs... it's looking a bit "broken windows theory" for the principal method used to share Python with the world, and the outsized lens though which Python is perceived. Maybe this means hard or opinionated decisions, but I really can't stress enough how much of a drag it is to an otherwise reasonably solid Python experience. There is a real perception that it's more trouble than it's worth, especially with many other good options at the table. [1] https://github.com/anthonyrisinger/zippy/commit/1c5d34d89805c47188a18cfbe17cfc39a9cb4480#diff-a533aaf4eec84e7c5d85d2129e10514fR1168 -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Feb 23 17:21:28 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 17:21:28 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> > On Feb 23, 2017, at 4:51 PM, C Anthony Risinger wrote: > > The distlib solver would bind up from impossible constraints, yet every time, pip found a way to "power through" the exact same configuration despite blatantly incompatible metadata at times. I never looked into it further on pip's side (though probably someone here can confirm/deny this) but I suspect poor metadata is more widespread than pip makes visible. <1% of projects or files using == suggests to me that there is very few people using == incorrectly. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthony at xtfx.me Thu Feb 23 17:31:37 2017 From: anthony at xtfx.me (C Anthony Risinger) Date: Thu, 23 Feb 2017 16:31:37 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> Message-ID: On Thu, Feb 23, 2017 at 4:21 PM, Donald Stufft wrote: > > On Feb 23, 2017, at 4:51 PM, C Anthony Risinger wrote: > > The distlib solver would bind up from impossible constraints, yet every > time, pip found a way to "power through" the exact same configuration > despite blatantly incompatible metadata at times. I never looked into it > further on pip's side (though probably someone here can confirm/deny this) > but I suspect poor metadata is more widespread than pip makes visible. > > > <1% of projects or files using == suggests to me that there is very few > people using == incorrectly. > Yeah I'm pretty sure the bigger problem was version-less reqs eagerly selecting a version (eg. latest) incompatible with later requirements provided by a different package, but then treating them as hard reqs by that point. I'll defer to you on how pip deals with things today. I'll try to resurface a concrete example. 
I know for certain pip at that time (circa 2015) was capable of installing a set of packages where the dependency information was not solvent, because I pointed it out to my team (I actually think python-dateutil was involved for that one, mentioned in another post). I would agree though, "==" is way way less widespread than no version at all. -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Feb 23 17:32:51 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 23 Feb 2017 17:32:51 -0500 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> Message-ID: <80EEA344-23DD-4DB6-A19C-7932197867FC@stufft.io> > On Feb 23, 2017, at 5:31 PM, C Anthony Risinger wrote: > > Yeah I'm pretty sure the bigger problem was version-less reqs eagerly selecting a version (eg. latest) incompatible with later requirements provided by a different package, but then treating them as hard reqs by that point. I'll defer to you on how pip deals with things today. > > I'll try to resurface a concrete example. I know for certain pip at that time (circa 2015) was capable of installing a set of packages where the dependency information was not solvent, because I pointed it out to my team (I actually think python-dateutil was involved for that one, mentioned in another post). > Yea, pip doesn't really have a dep solver. Its mechanism for selecting which version to install is... not smart. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Feb 23 21:28:05 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 23 Feb 2017 20:28:05 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <80EEA344-23DD-4DB6-A19C-7932197867FC@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> <80EEA344-23DD-4DB6-A19C-7932197867FC@stufft.io> Message-ID: On Thursday, February 23, 2017, Donald Stufft wrote: > > On Feb 23, 2017, at 5:31 PM, C Anthony Risinger > wrote: > > Yeah I'm pretty sure the bigger problem was version-less reqs eagerly > selecting a version (eg. latest) incompatible with later requirements > provided by a different package, but then treating them as hard reqs by > that point. I'll defer to you on how pip deals with things today. > > I'll try to resurface a concrete example. I know for certain pip at that > time (circa 2015) was capable of installing a set of packages where the > dependency information was not solvent, because I pointed it out to my team > (I actually think python-dateutil was involved for that one, mentioned in > another post). > > > > Yea, pip doesn't really have a dep solver. Its mechanism for selecting > which version to install is... not smart. >
"Pip needs a dependency resolver" https://github.com/pypa/pip/issues/988
- {Enthought, Conda,}: SAT solver (there are many solutions)
- easy_install:
- pip:
> > -- > Donald Stufft > > > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Thu Feb 23 23:30:45 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2017 14:30:45 +1000 Subject: [Distutils] distlib and wheel metadata In-Reply-To: <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> <55F3D30A-48EE-4B5B-B69A-FBD0891430B6@stufft.io> Message-ID: On 24 February 2017 at 08:21, Donald Stufft wrote: > > On Feb 23, 2017, at 4:51 PM, C Anthony Risinger wrote: > > The distlib solver would bind up from impossible constraints, yet every > time, pip found a way to "power through" the exact same configuration > despite blatantly incompatible metadata at times. I never looked into it > further on pip's side (though probably someone here can confirm/deny this) > but I suspect poor metadata is more widespread than pip makes visible. > > > <1% of projects or files using == suggests to me that there is very few > people using == incorrectly. > And if it does become a more notable problem in the future then a metadata independent way of dealing with it would be to add a warning to twine suggesting replacing "==" with "~=" (as well as an off switch to say "Don't bug me about that"). So I think the upshot of all this is that the entire semantic dependency structure in PEP 426 should be simplified to: 1. A single "dependencies" list that allows any PEP 508 dependency specifier 2. A MAY clause permitting tools to warn about the use of `==` and `===` 3. A MAY clause permitting tools to prohibit the use of direct references 4. A conventional set of "extras" with pre-defined semantics ("build", "dev", "doc", "test") That gives us an approach that's entirely compatible with the current 1.x metadata formats (so existing tools will still work), while also moving us closer to a point where PEP 426 could actually be accepted. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Feb 24 01:11:48 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 24 Feb 2017 00:11:48 -0600 Subject: [Distutils] distlib and wheel metadata In-Reply-To: References: <2019192621.7718748.1487095806195.ref@mail.yahoo.com> <50bccf0d-47d2-2ce4-e6f3-b197610d0061@python.org> Message-ID: On Thursday, February 23, 2017, Xavier Fernandez wrote: > +1 also. > This whole double requirement feels over-complicated for what seems like a > rather small usecase: it would be interesting to have a few stats on the > number of packages concerned by this pinning (maybe just scan all the last > uploaded wheels of each package ?). > > > And if one needs to classify packages type, why not add a new high level > trove classifier ? > +1 This could be accomplished with a trove classifier (because Entity Attribute boolean-Value) The component/library, application, metapackage categorical would require far more docs than: pip install --ignore-versions metapkgname Which is effectively, probably, maybe the same? as: pip install metapkg pip install --upgrade __ALL__ ... say, given that metapkgname requires (install_requires) ipython, and the requirements.txt is: metapkgname # ipython==4.2 ipython If pip freeze returns: ipython metapkgname And I then: pip freeze -- | xargs pip install --upgrade Haven't I then upgraded ipython past the metapackage pinned version, anyway? 
http://stackoverflow.com/questions/2720014/upgrading-all-packages-with-pip The best workaround that I'm aware of: - Create integration test and then build scripts - Run test/build script in a container - Change dependencies, Commit, Create a PR, (e.g. Travis CI runs the test/build/report/post script), Review integration test output What integration tests do the RPM/DNF package maintainers run for, say, django, simplejson, [and psycopg2, for django-ca]? If others have already integration-tested even a partially overlapping set, that's great and it would be great to be able to store, share, and search those build artifacts (logs, pass/fail). Additionally, practically, could we add metadata pointing to zero or more OS packages, per-distribution? How do I know that there's probably a somewhat-delayed repackaging named "python-ipython" which *might* work with the rest of the bleeding edge trunk builds I consider as stable as yesterday, given which tests? > > Le 23 f?vr. 2017 19:19, "Steve Dower" > a ?crit : > > On 23Feb2017 0914, Donald Stufft wrote: > >> >> On Feb 23, 2017, at 11:04 AM, Nick Coghlan >> >>> >> >> wrote: >>> >>> Redistributors may *ask* a publisher to reclassify their project as a >>> library or a devtool (and hence also avoid pinning their dependencies >>> in order to make integration easier), but publishers will always have >>> the option of saying "No, we want to you to treat it as an >>> application, and we won't help your end users if we know you're >>> overriding our pinned dependencies and the issue can't be reproduced >>> outside your custom configuration". >>> >> >> >> This whole discussion feels like trying to overcomplicate something >> that?s already not a simple to solve a problem that I don?t think is >> really that widespread. My estimation is that 99% of people who are >> currently using ``==`` will just immediately switch over to using >> whatever flag we provide that allows them to still do that. Adding a ?do >> the thing I asked for? detritus to the project seems like a bad idea. >> >> It?s not really any different than if a project say, only released >> Wheels. While we want to encourage projects to release sdists (and to >> not ping versions) trying to enforce that isn?t worth the cost. Like >> most packaging issues, I think that it?s best solved by opening up >> issues on the offending project?s issue tracker. >> > > +1. This has been my feeling the entire time I spent catching up on the > thread just now. > > As soon as "user education" becomes a requirement, we may as well do the > simplest and least restrictive metadata possible and use the education to > help people understand the impact of their decisions. > > Cheers, > Steve > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > > https://mail.python.org/mailman/listinfo/distutils-sig > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lele at metapensiero.it Fri Feb 24 06:37:17 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Fri, 24 Feb 2017 12:37:17 +0100 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore Message-ID: <87d1e73jma.fsf@metapensiero.it> Hi all, I have a installation setup of an application of mine that uses zc.buildout (see #1). Today I had to reinstall it on a new machine, using Python 3.6. Executing its bootstrap.py installed latest zc.buildout (2.8.0). 
All the required packages are pinned to an exact version (see #2), so it surprised me to hit the following error executing the buildout:

Version and requirements information containing transaction:
  [versions] constraint on transaction: 1.4.4
  Requirement of SoL==3.37: transaction
  Requirement of zope.sqlalchemy: transaction
  Requirement of transaction: zope.interface
  Requirement of pyramid_tm: transaction>=2.0
While:
  Installing sol.
Error: The requirement ('transaction>=2.0') is not allowed by your [versions] constraint (1.4.4)

Investigating the issue, I found that buildout was actually installing "pyramid-tm" (with a dash), not "pyramid_tm" (underscore, the only one present on PyPI, see #3):

... Getting distribution for 'pyramid_tm'. Got pyramid-tm 1.1.1. ...

And that of course is the source of the problem: hacking a local copy of the versions.cfg, adding the following line:

pyramid-tm = 0.12.1

and then adjusting the buildout.cfg to load that local copy fixed the problem. So the question is: what is the real nature of the problem? I downloaded current pyramid_tm sources, and there is no trace of a "pyramid-tm" except in the project's URL:

$ git grep pyramid-tm
setup.py: url="http://docs.pylonsproject.org/projects/pyramid-tm/en/latest/",

but effectively when I dig inside the egg that buildout produced I can find the following:

$ grep -r pyramid-tm *
EGG-INFO/PKG-INFO:Name: pyramid-tm
EGG-INFO/PKG-INFO:Home-page: http://docs.pylonsproject.org/projects/pyramid-tm/en/latest/

The differences since my last install-from-scratch of the application (that worked flawlessly) are basically version 3.6 of Python and version 2.8.0 of zc.buildout. Can anyone shed some light on the problem? By what logic buildout used a different name for that particular package? Thank you in advance, ciao, lele. #1 https://bitbucket.org/lele/solista #2 https://bitbucket.org/lele/sol/raw/master/requirements/versions.cfg #3 https://pypi.python.org/pypi/pyramid_tm -- nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From p.f.moore at gmail.com Fri Feb 24 06:43:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 24 Feb 2017 11:43:44 +0000 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore In-Reply-To: <87d1e73jma.fsf@metapensiero.it> References: <87d1e73jma.fsf@metapensiero.it> Message-ID: On 24 February 2017 at 11:37, Lele Gaifax wrote: > Can anyone shed some light on the problem? By what logic buildout used a > different name for that particular package?
> > While I don't know anything about buildout, pyramid-tm is the > normalised version of pyramid_tm - see > https://www.python.org/dev/peps/pep-0503/#normalized-names Oh, I see, thank you. Does that mean that the right thing I should do is always using such normalized names in my requirements.txt/versions.cfg? ciao, lele. -- nickname: Lele Gaifax | Quando vivr? di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From p.f.moore at gmail.com Fri Feb 24 06:59:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 24 Feb 2017 11:59:44 +0000 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore In-Reply-To: <878tov3ist.fsf@metapensiero.it> References: <87d1e73jma.fsf@metapensiero.it> <878tov3ist.fsf@metapensiero.it> Message-ID: On 24 February 2017 at 11:54, Lele Gaifax wrote: > Paul Moore writes: > >> On 24 February 2017 at 11:37, Lele Gaifax wrote: >>> Can anyone shed some light on the problem? By what logic buildout used a >>> different name for that particular package? >> >> While I don't know anything about buildout, pyramid-tm is the >> normalised version of pyramid_tm - see >> https://www.python.org/dev/peps/pep-0503/#normalized-names > > Oh, I see, thank you. Does that mean that the right thing I should do is > always using such normalized names in my requirements.txt/versions.cfg? I *think* it shouldn't matter. The problem will likely be with older tools not normalising. So using normalised names throughout might help such tools. Paul From jim at jimfulton.info Fri Feb 24 08:13:24 2017 From: jim at jimfulton.info (Jim Fulton) Date: Fri, 24 Feb 2017 08:13:24 -0500 Subject: [Distutils] Issue with (latest?) buildout and package name containing underscore In-Reply-To: References: <87d1e73jma.fsf@metapensiero.it> <878tov3ist.fsf@metapensiero.it> Message-ID: On Fri, Feb 24, 2017 at 6:59 AM, Paul Moore wrote: > On 24 February 2017 at 11:54, Lele Gaifax wrote: > > Paul Moore writes: > > > >> On 24 February 2017 at 11:37, Lele Gaifax wrote: > >>> Can anyone shed some light on the problem? By what logic buildout used > a > >>> different name for that particular package? > >> > >> While I don't know anything about buildout, pyramid-tm is the > >> normalised version of pyramid_tm - see > >> https://www.python.org/dev/peps/pep-0503/#normalized-names > > > > Oh, I see, thank you. Does that mean that the right thing I should do is > > always using such normalized names in my requirements.txt/versions.cfg? > > I *think* it shouldn't matter. The problem will likely be with older > tools not normalising. So using normalised names throughout might help > such tools. > Thanks Paul. Yes, this is a buildout bug: https://github.com/buildout/buildout/issues/317 This case shed the light on the bug for me. Thanks. Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From mekulisnicole at gmail.com Mon Feb 27 06:35:02 2017 From: mekulisnicole at gmail.com (blah whoops) Date: Mon, 27 Feb 2017 19:35:02 +0800 Subject: [Distutils] (no subject) Message-ID: hye do u have facebook.py?i hv just downloaded python 2.7..n its seems its unavailable thx nic -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pradyunsg at gmail.com Tue Feb 28 10:14:13 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Tue, 28 Feb 2017 15:14:13 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver Message-ID: Hello Everyone! Google released the list of accepted organizations for GSoC 2017 and PSF is one of them. I guess this would a good time for me to seek feedback on the approach I'm planning to take for my potential GSoC project. I hope this mailing list is the right place to do so. --- Here's my current plan of action along with reasoning for the choices made: A separate PR will be made for each of these stages. Every stage does depend on the previous ones being completed. 1. Refactor all dependency resolution responsibility in pip into a new, separate module. This would allow any future changes/improvements in the dependency resolution to be added without major changes in the rest of the code-base. As of today, the RequirementSet class within pip seems to be doing a lot of work and dependency resolution is a responsibility that doesn't need to given to it, especially when it's avoidable. 2. Implement dependency information caching. This would allow the resolver to not cause the re-computation of the dependencies of a package, if they have already been computed, speeding up the resolution. 3. Implement a backtracking resolver. A backtracking solver would be appropriate given that we don't have a way to pre-compute the dependencies for *all* the packages or statically determine the dependencies - a SAT solver would not be feasible. 4. (if time permits) Move any dependency resolution code out into a separate library. This would make it possible for other projects (like buildout or a future pip replacement) to reuse the dependency resolver. By making each of the stages separate PRs, incremental improvements would be made so that even if I leave this project midway, there will be some work merged already if someone comes back to this problem later. That said, I don't intend to leave this project midway. I do intend to reuse some of the work done by Robert Collins in PR #2716 on pip's GitHub repository. Stages 2 and 3 are separate because I see them as distinctly different tasks which touch very different portions of the code-base. There's is strong coupling between them though. I'm looking forward to the feedback. :) Regards, Pradyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Tue Feb 28 10:48:09 2017 From: jim at jimfulton.info (Jim Fulton) Date: Tue, 28 Feb 2017 10:48:09 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Tue, Feb 28, 2017 at 10:14 AM, Pradyun Gedam wrote: ... > 4. (if time permits) Move any dependency resolution code out into a > separate library. > > This would make it possible for other projects (like buildout or a > future pip replacement) to reuse the dependency resolver. > Thank you! ... I do intend to reuse some of the work done by Robert Collins in PR #2716 on > pip's GitHub repository. > Are you aware of the proof of concept in distlib? https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: