From pradyunsg at gmail.com Wed Mar 1 05:28:34 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Wed, 01 Mar 2017 10:28:34 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Tue, Feb 28, 2017, 21:18 Jim Fulton wrote: On Tue, Feb 28, 2017 at 10:14 AM, Pradyun Gedam wrote: ... 4. (if time permits) Move any dependency resolution code out into a separate library. This would make it possible for other projects (like buildout or a future pip replacement) to reuse the dependency resolver. Thank you! Welcome! ... I do intend to reuse some of the work done by Robert Collins in PR #2716 on pip's GitHub repository. Are you aware of the proof of concept in distlib? I am. I had looked at it a few weeks back. IIRC it makes a dependency graph using distlib and operates with that. I haven't really understood how it gets the information about dependencies without downloading the packages... I'll give it another pass this weekend. https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Mar 1 05:53:09 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 1 Mar 2017 10:53:09 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On 1 March 2017 at 10:28, Pradyun Gedam wrote: > I haven't really understood how it gets the information about dependencies > without downloading the packages... I'll give it another pass this weekend. If I recall, it reads static dependency data held on the red-dove site and maintained by downloading and running egg-info on the packages as changes occur. I don't think it's a sustainable approach for pip at the moment (my understanding is that it was a proof of concept for what having static metadata on PyPI would gain us). Paul From xav.fernandez at gmail.com Wed Mar 1 07:36:29 2017 From: xav.fernandez at gmail.com (Xavier Fernandez) Date: Wed, 1 Mar 2017 13:36:29 +0100 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: Great news ! Your plan seems reasonable. The first stage (RequirementSet refactor) seems to me to be the trickiest. Anyway I'm looking forward for your PRs :) Xavier -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Wed Mar 1 09:17:28 2017 From: robin at reportlab.com (Robin Becker) Date: Wed, 1 Mar 2017 14:17:28 +0000 Subject: [Distutils] win amd_x64 Python 2.7.8 --> 2.7.13 woes Message-ID: <92171b3a-e9e8-ecf3-fc5b-8063e2561dad@chamonix.reportlab.co.uk> I find my extensions compiled for windows amd_x64 with python 2.7.8 no longer work after I updated python to 2.7.13. Is that expected? I had assumed that the cpy27 wheels that I make would work with any python 2.7, but this makes me doubt that. To get the reportlab tests to complete I rebuilt all my extensions and installed a newer version of pillow. 
In addition to that the uninstallation of the amd64 python 2.7.8 has also uninstalled the x86 version of python 2.7.8 :( -- Robin Becker From robin at reportlab.com Wed Mar 1 09:44:03 2017 From: robin at reportlab.com (Robin Becker) Date: Wed, 1 Mar 2017 14:44:03 +0000 Subject: [Distutils] win amd_x64 Python 2.7.8 --> 2.7.13 woes In-Reply-To: <92171b3a-e9e8-ecf3-fc5b-8063e2561dad@chamonix.reportlab.co.uk> References: <92171b3a-e9e8-ecf3-fc5b-8063e2561dad@chamonix.reportlab.co.uk> Message-ID: Ignore this; it was my duh :( of the day; seems the download button didn't give me the amd_x64 version so it carefully installed an x86 where I used to have my amd_x64. After banging my head with a hammer I downloaded the installers carefully and got things back to normal. On 1 March 2017 at 14:17, Robin Becker wrote: > I find my extensions compiled for windows amd_x64 with python 2.7.8 no > longer work after I updated python to 2.7.13. > > Is that expected? I had assumed that the cpy27 wheels that I make would > work with any python 2.7, but this makes me doubt that. > > To get the reportlab tests to complete I rebuilt all my extensions and > installed a newer version of pillow. > > In addition to that the uninstallation of the amd64 python 2.7.8 has also > uninstalled the x86 version of python 2.7.8 :( > -- > Robin Becker > -- Robin Becker -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 1 15:02:09 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 2 Mar 2017 09:02:09 +1300 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam wrote: > Hello Everyone! > > Google released the list of accepted organizations for GSoC 2017 and PSF > is one of them. > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/. This is pretty urgent to arrange: * "March 3* - Last day for Python sub-orgs to apply to participate with the PSF. (Assuming we get accepted by Google and can support sub-orgs, of course!) This deadline is for orgs who applies on their own and didn't make it, but still wish to participate under the umbrella. " The original deadline was Feb 7. There's a good chance that Pip will still be accepted after March 3, but I wouldn't gamble on it. There are instructions under "Project Ideas" on http://python-gsoc.org/ on how to get accepted as a sub-org. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Wed Mar 1 15:07:39 2017 From: donald at stufft.io (Donald Stufft) Date: Wed, 1 Mar 2017 15:07:39 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> > On Mar 1, 2017, at 3:02 PM, Ralf Gommers wrote: > > > > On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam > wrote: > Hello Everyone! > > Google released the list of accepted organizations for GSoC 2017 and PSF is one of them. > > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/ . This is pretty urgent to arrange: > > "March 3 - Last day for Python sub-orgs to apply to participate with the PSF. > (Assuming we get accepted by Google and can support sub-orgs, of course!) > This deadline is for orgs who applies on their own and didn't make it, but still > wish to participate under the umbrella. " > > The original deadline was Feb 7. 
There's a good chance that Pip will still be accepted after March 3, but I wouldn't gamble on it. > > There are instructions under "Project Ideas" on http://python-gsoc.org/ on how to get accepted as a sub-org. > Oh. I?ve never done this before and Pradyun reached out so I had no idea I had to do this. I?ll go ahead and do this. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Wed Mar 1 15:13:51 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Wed, 01 Mar 2017 20:13:51 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> Message-ID: Thanks for the pointer Ralf! :) I was actually drafting a mail to send to Donald directly for thanking him for being willing to mentor me as well as pointing this out to him. I guess I can discard that draft now... On Thu, Mar 2, 2017, 01:37 Donald Stufft wrote: > > On Mar 1, 2017, at 3:02 PM, Ralf Gommers wrote: > > > > On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam wrote: > > Hello Everyone! > > Google released the list of accepted organizations for GSoC 2017 and PSF > is one of them. > > > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/. > This is pretty urgent to arrange: > > * "March 3* - Last day for Python sub-orgs to apply to participate > with the PSF. > (Assuming we get accepted by Google and can support sub-orgs, of > course!) > This deadline is for orgs who applies on their own and didn't make it, > but still > wish to participate under the umbrella. " > > The original deadline was Feb 7. There's a good chance that Pip will still > be accepted after March 3, but I wouldn't gamble on it. > > There are instructions under "Project Ideas" on http://python-gsoc.org/ > on how to get accepted as a sub-org. > > > > Oh. I?ve never done this before and Pradyun reached out so I had no idea I > had to do this. I?ll go ahead and do this. > > > ? > > Donald Stufft > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 1 16:31:34 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 2 Mar 2017 10:31:34 +1300 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> Message-ID: On Thu, Mar 2, 2017 at 9:07 AM, Donald Stufft wrote: > > On Mar 1, 2017, at 3:02 PM, Ralf Gommers wrote: > > > > On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam wrote: > >> Hello Everyone! >> >> Google released the list of accepted organizations for GSoC 2017 and PSF >> is one of them. >> > > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/. > This is pretty urgent to arrange: > > * "March 3* - Last day for Python sub-orgs to apply to participate > with the PSF. > (Assuming we get accepted by Google and can support sub-orgs, of > course!) > This deadline is for orgs who applies on their own and didn't make it, > but still > wish to participate under the umbrella. " > > The original deadline was Feb 7. There's a good chance that Pip will still > be accepted after March 3, but I wouldn't gamble on it. > > There are instructions under "Project Ideas" on http://python-gsoc.org/ > on how to get accepted as a sub-org. > > > > Oh. I?ve never done this before and Pradyun reached out so I had no idea I > had to do this. 
I?ll go ahead and do this. > I'm the GSoC admin for SciPy, so need to keep track of the various deadlines/todos. I'd be happy to ping you each time one approaches if that helps. There's a PSF GSoC mentors list that's not noisy and useful to join. You'll be added to the Google GSoC-mentors list automatically if you start mentoring in the program, but you may want to mute it or not use your primary email address for it (it's high-traffic, very low signal to noise and you can't unsubscribe). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From xo.olive at gmail.com Thu Mar 2 01:50:48 2017 From: xo.olive at gmail.com (Xavier Olive) Date: Thu, 2 Mar 2017 07:50:48 +0100 Subject: [Distutils] How to change the linking command in a setuptools building process? Message-ID: I maintain a Cython binding to some OCaml code (through their respective C interface). For past versions, I managed to cheat and distribute a wheel file for Windows through cross-compilation. Now, I finally managed a clean and native way to produce the library for Windows 64 bits. For the 32 bits cross-compiled version, I had a specific target in my setup.py with the proper commands to execute. Back on Windows, I would like to stick to a setuptoolsic way of doing, but the thing is I need to replace the regular linking command link.exe with a different tool (resp. flexlink.exe, shipped with OCaml on Windows) Don't panic: flexlink.exe just builds some assembler shit before compiling and linking with the regular link.exe. It is the proper way to link OCaml executables and shared libraries under Windows. For MacOS and Linux, the traditional Extension pattern works like a charm as follows (mlobject is produced by OCaml a bit earlier in the file after some timestamp checks, asmrunlib is the full path to the equivalent of python36.dll for OCaml) : extensions = [ Extension("foo", ["foo.pyx", "interface_c.c"], language="c", include_dirs=INCLUDE, extra_compile_args=compileargs, extra_link_args=[mlobject, asmrunlib, ] ) ] Let's say I limit myself to Python>=3.5, I guess (by comparison with too big projects like NumPy) I would need to start by extending distutils._msvccompiler.MSVCCompiler and replace the self.linker = _find_exe("link.exe", paths) with something based on flexlink.exe. The problem is that I have no idea how they manage the plumbing work that comes next (connecting this extended compiler and making it look like the regular msvc to the setup process). I suppose it is not thoroughly documented anywhere and that if they were able to do more than that in NumPy, I should be able to reach my goal somehow. My setup.py is still reasonably basic and a solution that keeps the whole building/packaging process in one single file would be great! Xavier (Question first asked here https://stackoverflow.com/q/42519377/1595335 before being advised this mailing-list) From donald at stufft.io Thu Mar 2 11:12:31 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 2 Mar 2017 11:12:31 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> Message-ID: <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> Ok, so It appears besides me we need another one or two mentors to act as backup mentors. I guess in the event I?m not available or so. Probably ideally the backup mentor would either be familiar with pip?s codebase or else familiar with the ideas behind a backtracking resolver. 
I do have someone who can do it if needed, but I figured I'd poke distutils-sig first to see if anyone else wanted to do it as well. They suggest that at least one mentor be exclusive to the student but that the other mentors can work with multiple students. For pip we only have the one (yay Pradyun) and I'm not mentoring anyone else so we should be good on the exclusive front (of course, if someone is interested to help with this, they can also be exclusive). > On Mar 1, 2017, at 4:31 PM, Ralf Gommers wrote: > > > I'm the GSoC admin for SciPy, so need to keep track of the various deadlines/todos. I'd be happy to ping you each time one approaches if that helps. That would be awesome. I'm poking at the sites now to figure out everything I need to do to make sure all the administration bits are done properly, but having a double check that I don't miss something would be great. > > There's a PSF GSoC mentors list that's not noisy and useful to join. You'll be added to the Google GSoC-mentors list automatically if you start mentoring in the program, but you may want to mute it or not use your primary email address for it (it's high-traffic, very low signal to noise and you can't unsubscribe). Ok cool. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Thu Mar 2 11:31:55 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Thu, 2 Mar 2017 11:31:55 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> Message-ID: I'd be happy to help to provide mentorship for the backtracking dependency resolver aspect. I don't know pip's code well though. Thanks, Justin On Thu, Mar 2, 2017 at 11:12 AM, Donald Stufft wrote: > Ok, so It appears besides me we need another one or two mentors to act as > backup mentors. I guess in the event I'm not available or so. Probably > ideally the backup mentor would either be familiar with pip's codebase or > else familiar with the ideas behind a backtracking resolver. I do have > someone who can do it if needed, but I figured I'd poke distutils-sig first > to see if anyone else wanted to do it as well. > > They suggest that at least one mentor be exclusive to the student but that > the other mentors can work with multiple students. For pip we only have the > one (yay Pradyun) and I'm not mentoring anyone else so we should be good on > the exclusive front (of course, if someone is interested to help with this, > they can also be exclusive). > > On Mar 1, 2017, at 4:31 PM, Ralf Gommers wrote: > > > I'm the GSoC admin for SciPy, so need to keep track of the various > deadlines/todos. I'd be happy to ping you each time one approaches if that > helps. > > > > That would be awesome. I'm poking at the sites now to figure out > everything I need to do to make sure all the administration bits are done > properly, but having a double check that I don't miss something would be > great. > > > There's a PSF GSoC mentors list that's not noisy and useful to join. > You'll be added to the Google GSoC-mentors list automatically if you start > mentoring in the program, but you may want to mute it or not use your > primary email address for it (it's high-traffic, very low signal to noise > and you can't unsubscribe). > > > Ok cool. > > --
> Donald Stufft > > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Mar 2 11:42:32 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 2 Mar 2017 11:42:32 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> Message-ID: <13A8A08D-3E89-4C61-8E70-ACB07B2F5EB6@stufft.io> > On Mar 2, 2017, at 11:31 AM, Justin Cappos wrote: > > I'd be happy to help to provide mentorship for the backtracking dependency resolver aspect. I don't know pip's code well though. > Awesome, that would work out well actually I think, because while I know pip's code base, the actual resolver bits are not my strong suit (one of the main reasons I hadn't done this work already is the research to actually figure out the right resolver tech and how it functions). -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sat Mar 4 12:25:54 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sat, 04 Mar 2017 17:25:54 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Wed, 1 Mar 2017 at 15:58 Pradyun Gedam wrote: > > > On Tue, Feb 28, 2017, 21:18 Jim Fulton wrote: > > On Tue, Feb 28, 2017 at 10:14 AM, Pradyun Gedam > wrote: > ... > > 4. (if time permits) Move any dependency resolution code out into a > separate library. > > This would make it possible for other projects (like buildout or a > future pip replacement) to reuse the dependency resolver. > > > Thank you! > > > Welcome! > > > ... > > I do intend to reuse some of the work done by Robert Collins in PR #2716 > on pip's GitHub repository. > > > Are you aware of the proof of concept in distlib? > > > I am. I had looked at it a few weeks back. IIRC it makes a dependency > graph using distlib and operates with that. > > I haven't really understood how it gets the information about dependencies > without downloading the packages... I'll give it another pass this weekend. > I went through it. As Paul Moore said, it is hitting http://www.red-dove.com/pypi/ which has metadata on what the requirements are of a package. (saying this on the basis of [1]) Since PyPI does not have such information in a static declarative format, that approach is not feasible. pip will have to download packages and execute setup.py to know what the dependencies are. [1]: https://www.red-dove.com/pypi/projects/S/Sphinx/package-1.3.json > > > https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements > > Jim > > -- > Jim Fulton > http://jimfulton.info > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sat Mar 4 12:28:32 2017 From: donald at stufft.io (Donald Stufft) Date: Sat, 4 Mar 2017 12:28:32 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: <8D134761-2C9B-49B8-83F1-E2434F8BBD61@stufft.io> > On Mar 4, 2017, at 12:25 PM, Pradyun Gedam wrote: > > Since PyPI does not have such information in a static declarative format, that approach is not feasible. pip will have to download packages and execute setup.py to know what the dependencies are.
I will note, that we can expose that information in PyPI for *wheels*, but not for sdists currently. It would be a lot more work though because it'd essentially require a whole new repository API and I doubt Pradyun wants to tackle that right now :) Keeping a future in mind where we can get at least some of that information without downloading would be good though, at least to keep in mind when structuring code. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 5 01:17:46 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 05 Mar 2017 06:17:46 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <8D134761-2C9B-49B8-83F1-E2434F8BBD61@stufft.io> References: <8D134761-2C9B-49B8-83F1-E2434F8BBD61@stufft.io> Message-ID: On Sat, 4 Mar 2017 at 22:58 Donald Stufft wrote: > > On Mar 4, 2017, at 12:25 PM, Pradyun Gedam wrote: > > Since PyPI does not have such information in a static declarative format, > that approach is not feasible. pip will have to download packages and > execute setup.py to know what the dependencies are. > > > > I will note, that we can expose that information in PyPI for *wheels*, but > not for sdists currently. It would be a lot more work though because it'd > essentially require a whole new repository API and I doubt Pradyun wants to > tackle that right now :) > Yeah... For now, it's just dependency resolution in pip. > Keeping a future in mind where we can get at least some of that information > without downloading would be good though, at least to keep in mind when > structuring code. > Duly noted. > -- > > Donald Stufft > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sun Mar 5 11:51:44 2017 From: donald at stufft.io (Donald Stufft) Date: Sun, 5 Mar 2017 11:51:44 -0500 Subject: [Distutils] Deprecating download counts in API? Message-ID: So, as most folks are aware PyPI has long had a cumulative download count available in its API. This has been on and off again broken for a *long* time and arguably the numbers in there have been "wrong" even when it was working because we had no way to reproduce them from scratch (and thus whenever a bug occurred we'd flat out lose data or add incorrect data with no way to correct it). In the meantime, we've gotten a much better source of querying for download information available inside of Google's BigQuery database [1][2]. Not only is this able to be recreated "from scratch" so we can, if needed, fix massive data bugs but it provides MUCH more information than the previous downloads and a very powerful query language to go along with it. Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, instead preferring people to start using the BigQuery data instead. This more or less reflects the current state of things, since it has been on and off broken (typically broken) for something like a year now. [1] https://mail.python.org/pipermail/distutils-sig/2016-May/028986.html [2] https://langui.sh/2016/12/09/data-driven-decisions/ -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Mon Mar 6 01:41:12 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Sun, 5 Mar 2017 22:41:12 -0800 Subject: [Distutils] Deprecating download counts in API? 
In-Reply-To: References: Message-ID: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> > On Mar 5, 2017, at 8:51 AM, Donald Stufft wrote: > > So, as most folks are aware PyPI has long had a cumulative download count available in its API. This has been on and off again broken for a *long* time and arguably the numbers in there have been "wrong" even when it was working because we had no way to reproduce them from scratch (and thus whenever a bug occurred we'd flat out lose data or add incorrect data with no way to correct it). > > In the meantime, we've gotten a much better source of querying for download information available inside of Google's BigQuery database [1][2]. Not only is this able to be recreated "from scratch" so we can, if needed, fix massive data bugs but it provides MUCH more information than the previous downloads and a very powerful query language to go along with it. > > Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, instead preferring people to start using the BigQuery data instead. This more or less reflects the current state of things, since it has been on and off broken (typically broken) for something like a year now. I fully realize that if I really wanted this, I could do it myself, and the last thing you need is someone signing you up for more work :). But, as someone who's been vaguely annoyed that `vanity` doesn't work for a while, I wonder: shouldn't it be easy for someone familiar with both systems to simply implement the existing "download count" API as a legacy / compatibility wrapper around BigQuery? If that isn't trivial, doesn't that point to something flawed in the way the data is presented in BigQuery? That said, I'm fully OK with the answer that even a tiny bit of work is too much, and the limited volunteer effort of PyPI should be spent elsewhere. -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominic.lund at lj-oz.com Sun Mar 5 16:43:43 2017 From: dominic.lund at lj-oz.com (Dominic Lund) Date: Mon, 6 Mar 2017 08:43:43 +1100 Subject: [Distutils] help required Message-ID: I have a third party python module I have just downloaded It is in a zip file in my download directory How do I use pip to install it? Or can I 'install' it manually? -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Mon Mar 6 04:04:02 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 6 Mar 2017 09:04:02 +0000 Subject: [Distutils] help required In-Reply-To: References: Message-ID: On 5 March 2017 at 21:43, Dominic Lund wrote: > I have a third party python module I have just downloaded > > It is in a zip file in my download directory > > How do I use pip to install it? > > Or can I 'install' it manually? Does the documentation for the module not tell you how to do so? If not, it's difficult to advise you with so little information. "pip install <path to the zip file>" might work, but be aware that that command will likely run some of the code in the zipfile, so if you're not sure it's the right thing to do, you should at least be sure that the code has come from somewhere you trust. For anyone here to help you, you'd need to at a minimum let us know what the module is, where you got it from, and what you have already tried (and what happened). 
Paul From donald at stufft.io Mon Mar 6 06:34:10 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 6 Mar 2017 06:34:10 -0500 Subject: [Distutils] Deprecating download counts in API? In-Reply-To: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> References: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> Message-ID: > On Mar 6, 2017, at 1:41 AM, Glyph Lefkowitz wrote: > > >> On Mar 5, 2017, at 8:51 AM, Donald Stufft > wrote: >> >> So, as most folks are aware PyPI has long had a cumulative download count available in its API. This has been on and off again broken for a *long* time and arguably the numbers in there have been "wrong" even when it was working because we had no way to reproduce them from scratch (and thus whenever a bug occurred we'd flat out lose data or add incorrect data with no way to correct it). >> >> In the meantime, we've gotten a much better source of querying for download information available inside of Google's BigQuery database [1][2]. Not only is this able to be recreated "from scratch" so we can, if needed, fix massive data bugs but it provides MUCH more information than the previous downloads and a very powerful query language to go along with it. >> >> Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, instead preferring people to start using the BigQuery data instead. This more or less reflects the current state of things, since it has been on and off broken (typically broken) for something like a year now. > > I fully realize that if I really wanted this, I could do it myself, and the last thing you need is someone signing you up for more work :). But, as someone who's been vaguely annoyed that `vanity` doesn't work for a while, I wonder: shouldn't it be easy for someone familiar with both systems to simply implement the existing "download count" API as a legacy / compatibility wrapper around BigQuery? If that isn't trivial, doesn't that point to something flawed in the way the data is presented in BigQuery? > > That said, I'm fully OK with the answer that even a tiny bit of work is too much, and the limited volunteer effort of PyPI should be spent elsewhere. > > -glyph > It's not hard at all, it'd just be (standard SQL mode): SELECT file.filename, COUNT(*) AS downloads FROM `the-psf.pypi.downloads*` WHERE file.project = "twisted" GROUP BY file.filename You can probably guess how to handle modifications to this query since it's roughly just regular old SQL. There are a few reasons I don't want to just do this in PyPI though. This query will take somewhere between 30 and 60 seconds to complete, so I can't do it inline with the HTTP request, and I'd need to have a periodic job go through and issue about 100k queries (or a single query with almost a million results) and then load that into the database. More importantly though, we don't have an unlimited amount of BigQuery on PyPI. We get blocks of credits granted periodically and so the faster we use up "spend" the more regularly I have to track down my contacts inside of Google and get them to re-up the credit. This adds an incentive to try and reduce our spending where we can to limit the frequency and the amount of time I need to go between asking for more credits. Due to BigQuery's billing model you get billed based upon how much data your query has to process which means that a query that fetches data for all time, will be the most expensive kind of query and gets more expensive every day. 
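A rough sketch of issuing the query above from Python with the google-cloud-bigquery client library, for anyone who wants to experiment; the project id is a placeholder for whatever billing-enabled GCP project the caller supplies, since that project is what gets billed for the query:

    from google.cloud import bigquery

    # Placeholder project id; use your own billing-enabled GCP project.
    client = bigquery.Client(project="my-gcp-project")

    query = """
    SELECT file.filename AS filename, COUNT(*) AS downloads
    FROM `the-psf.pypi.downloads*`
    WHERE file.project = "twisted"
    GROUP BY filename
    """

    # query() submits the job; result() blocks until the rows are available.
    for row in client.query(query).result():
        print(row["filename"], row["downloads"])

The same pattern works for the date-limited variant shown below, which is the cheaper way to run it.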
On the flip side, the BigQuery data is publicly queryable and the account being used to query "pays" for that query and every account gets 1TB of querying for free (and additional TBs are $5 per TB). Currently it takes ~215GB of data to do a "full" query for twisted (the exact query I listed above) and I haven't fully backfilled all of the data yet (I'm working on it). You can kind of extrapolate that out to what it would "cost" to do that same query for all 100k projects even before I do the backfill (which would drastically raise the "cost" of PyPI here). The smart thing to do with BigQuery is to do date limited querying so that your query doesn't have to load as much data. For instance, adapting the above query so that it only queries the last 30 days (still using standard SQL) you would do: SELECT file.filename, COUNT(*) AS downloads FROM `the-psf.pypi.downloads*` WHERE file.project = "twisted" AND _TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_ADD(CURRENT_DATE(), INTERVAL -31 day)) AND FORMAT_DATE("%Y%m%d", DATE_ADD(CURRENT_DATE(), INTERVAL -1 day)) GROUP BY file.filename This touches a much more reasonable 27GB of data. For reference, we currently "spend" about $50/month on BigQuery so doing like, daily updates of this data for everyone would be a drastic increase in the amount of BigQuery spending we do. So the tl;dr is I think it's a better solution for vanity to talk to the BigQuery API itself, ideally limiting itself to a recent timeframe by default, and possibly adding a flag to get at the all time data for people who are OK with either using vanity less often or are willing to spend a couple bucks if they're querying the full amount of data every day. Where Warehouse is starting to query BigQuery, I'm purposely limiting it to only the last N days (typically 30) so as not to regularly query the entire data set. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Mon Mar 6 06:36:23 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 6 Mar 2017 06:36:23 -0500 Subject: [Distutils] Deprecating download counts in API? In-Reply-To: References: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> Message-ID: > On Mar 6, 2017, at 6:34 AM, Donald Stufft wrote: > > On the flip side, the BigQuery data is publicly queryable and the account being used to query "pays" for that query and every account gets 1TB of querying for free (and additional TBs are $5 per TB). To be clear, each account gets 1TB of querying for free per month, not 1TB for the life of the account. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Mon Mar 6 13:09:56 2017 From: jim at jimfulton.info (Jim Fulton) Date: Mon, 6 Mar 2017 13:09:56 -0500 Subject: [Distutils] Announcing new Buildout documentation Message-ID: The old horrible doctest-based buildout documentation has finally been replaced: http://docs.buildout.org Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Mon Mar 6 16:43:05 2017 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 6 Mar 2017 15:43:05 -0600 Subject: [Distutils] Announcing new Buildout documentation In-Reply-To: References: Message-ID: Thanks! 
https://github.com/buildout/buildout/commits/master/doc On Monday, March 6, 2017, Jim Fulton wrote: > The old horrible doctest-based buildout documentation has finally been > replaced: > > http://docs.buildout.org > > Jim > > -- > Jim Fulton > http://jimfulton.info > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Mon Mar 6 22:24:19 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Mon, 6 Mar 2017 19:24:19 -0800 Subject: [Distutils] Deprecating download counts in API? In-Reply-To: References: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> Message-ID: <6424B8AA-2F62-41E6-ACA7-047CCF90B3F6@twistedmatrix.com> > On Mar 6, 2017, at 3:34 AM, Donald Stufft wrote: > > >> On Mar 6, 2017, at 1:41 AM, Glyph Lefkowitz > wrote: >> >> >>> On Mar 5, 2017, at 8:51 AM, Donald Stufft > wrote: >>> >>> Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, [...] >> >> [...] But, as someone who's been vaguely annoyed that `vanity` doesn't work for a while, I wonder: shouldn't it be easy for someone familiar with both systems to simply implement the existing "download count" API as a legacy / compatibility wrapper around BigQuery? [...] > > It?s not hard at all, it?d just be [...] Thanks for that super detailed and exhaustive explanation, I have a much better handle on the issues involved now. Sorry if you'd written it before and I'd missed it - I can now very clearly see why you want to get rid of it! -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at timgolden.me.uk Tue Mar 7 09:24:14 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Tue, 7 Mar 2017 14:24:14 +0000 Subject: [Distutils] install_requires setup.py install vs pip install Message-ID: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> I have a setup.py which looks like this: from setuptools import setup setup( name='install_requires', py_modules = ["install_requires"], install_requires=['PyQt5'], ) For the purposes of the discussion, there is an install_requires.py in the same directory. I have created and activated a standard Python 3.5 venv on Windows: py -3.5 -mvenv .venv .venv\scripts\activate.bat python -mpip install --upgrade pip (I don't believe the Python version or the venv matter here, but including them for reproducibility). If I pip install the module, the PyQt5 install dependency is found and installed: (.venv) C:\work-in-progress\install_requires>pip install . Processing c:\work-in-progress\install_requires Collecting PyQt5 (from install-requires==0.0.0) Using cached PyQt5-5.8.1-5.8.0-cp35.cp36.cp37-none-win_amd64.whl Collecting sip==4.19 (from PyQt5->install-requires==0.0.0) Using cached sip-4.19-cp35-none-win_amd64.whl Installing collected packages: sip, PyQt5, install-requires Running setup.py install for install-requires ... done Successfully installed PyQt5-5.8.1 install-requires-0.0.0 sip-4.19 If, instead, I setup.py install the module, I get the following messages: Processing dependencies for install-requires==0.0.0 Searching for PyQt5 Reading https://pypi.python.org/simple/PyQt5/ No local packages or download links found for PyQt5 error: Could not find suitable distribution for Requirement.parse('PyQt5') However, if I substitute instead "requests" or "simplejson" (both well-known packages) then setup.py install succeeds. 
My cursory inspection of https://pypi.python.org/simple/pyqt5/ doesn't reveal anything obviously different except for the complexity of the filenames. I've searched around, including in the archives of this group, but can't find that this is a known issue. If I had to guess from the evidence, it would be that pip ships a more sophisticated parser of complex wheel filenames than setuptools. Can anyone advise, please? TJG From leorochael at gmail.com Tue Mar 7 09:38:34 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Tue, 7 Mar 2017 11:38:34 -0300 Subject: [Distutils] install_requires setup.py install vs pip install In-Reply-To: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> References: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> Message-ID: Hi Tim, The reason setuptools can't process your package is because setuptools itself doesn't yet know how to install wheels[1] which pip knows how to install, and PyQT5 is only available as wheels on PyPI (the files with `.whl` extension in the `simple` URL you linked). [1] https://github.com/pypa/setuptools/issues/78 The reason why setuptools can install "requests" or "simplejson" is that their pages contain `.tar.gz` files with the source distributions beside the `.whl` files. Incidentally, there are PyQT5 source distributions, and they're available in their own website[2]. IMO they should be present in PyPI as well. (Though those archive names with `_gpl` in the middle might confuse setuptools, and they might prefer to deal with "Could not find suitable distribution" error message than some obscure compilation error arising from missing system packages). [2] https://www.riverbankcomputing.com/software/pyqt/download5/ Cheers, Leo On 7 March 2017 at 11:24, Tim Golden wrote: > I have a setup.py which looks like this: > > from setuptools import setup > setup( > name='install_requires', > py_modules = ["install_requires"], > install_requires=['PyQt5'], > ) > > For the purposes of the discussion, there is an install_requires.py in the > same directory. > > I have created and activated a standard Python 3.5 venv on Windows: > > py -3.5 -mvenv .venv > .venv\scripts\activate.bat > python -mpip install --upgrade pip > > (I don't believe the Python version or the venv matter here, but including > them for reproducibility). > > If I pip install the module, the PyQt5 install dependency is found and > installed: > > (.venv) C:\work-in-progress\install_requires>pip install . > Processing c:\work-in-progress\install_requires > Collecting PyQt5 (from install-requires==0.0.0) > Using cached PyQt5-5.8.1-5.8.0-cp35.cp36.cp37-none-win_amd64.whl > Collecting sip==4.19 (from PyQt5->install-requires==0.0.0) > Using cached sip-4.19-cp35-none-win_amd64.whl > Installing collected packages: sip, PyQt5, install-requires > Running setup.py install for install-requires ... done > Successfully installed PyQt5-5.8.1 install-requires-0.0.0 sip-4.19 > > If, instead, I setup.py install the module, I get the following messages: > > Processing dependencies for install-requires==0.0.0 > Searching for PyQt5 > Reading https://pypi.python.org/simple/PyQt5/ > No local packages or download links found for PyQt5 > error: Could not find suitable distribution for Requirement.parse('PyQt5') > > However, if I substitute instead "requests" or "simplejson" (both > well-known packages) then setup.py install succeeds. 
My cursory inspection > of https://pypi.python.org/simple/pyqt5/ doesn't reveal anything > obviously different except for the complexity of the filenames. > > I've searched around, including in the archives of this group, but can't > find that this is a known issue. If I had to guess from the evidence, it > would be that pip ships a more sophisticated parser of complex wheel > filenames than setuptools. > > Can anyone advise, please? > > TJG > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at timgolden.me.uk Tue Mar 7 09:53:24 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Tue, 7 Mar 2017 14:53:24 +0000 Subject: [Distutils] install_requires setup.py install vs pip install In-Reply-To: References: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> Message-ID: <7e4a8faa-a832-38f1-5717-31191214b4e8@timgolden.me.uk> On 07/03/2017 14:38, Leonardo Rochael Almeida wrote: > Hi Tim, > > The reason setuptools can't process your package is because setuptools > itself doesn't yet know how to install wheels[1] which pip knows how to > install, and PyQT5 is only available as wheels on PyPI (the files with > `.whl` extension in the `simple` URL you linked). > > [1] https://github.com/pypa/setuptools/issues/78 > > The reason why setuptools can install "requests" or "simplejson" is that > their pages contain `.tar.gz` files with the source distributions beside > the `.whl` files. > > Incidentally, there are PyQT5 source distributions, and they're > available in their own website[2]. > > IMO they should be present in PyPI as well. > > (Though those archive names with `_gpl` in the middle might confuse > setuptools, and they might prefer to deal with "Could not find suitable > distribution" error message than some obscure compilation error arising > from missing system packages). > > [2] https://www.riverbankcomputing.com/software/pyqt/download5/ Thanks, Leo. That was a much simple explanation than I'd been considering! I didn't think to look at the output for requests etc. Now that I do, it's clearly building eggs from sdists. Knowing this, I have ways forward. (This is actually about the mu editor which is aimed at teachers and other less techie people: https://github.com/mu-editor/mu) Thanks again for your help TJG From Gabriel.Ganne at enea.com Wed Mar 8 03:43:36 2017 From: Gabriel.Ganne at enea.com (Gabriel Ganne) Date: Wed, 8 Mar 2017 08:43:36 +0000 Subject: [Distutils] custom setup.py link arguments order Message-ID: Hi, I'm currently writing a python C module which has a chained dependency: - mymodule requires libb - libb requires liba To that effect, within setup.py, I link against both liba and libb libraries=['a', 'b'], Also, as I'm working on Ubuntu, I want to add -Wl,--no-as-needed to make sure that the symbols not immediately needed will still be stripped. extra_link_args=['-Wl,--no-as-needed'], However, it seems that the extra_link_args are systematically appended at the end of the link line, but for this to work, the '-Wl,--no-as-needed' argument need to be *before* the link against my two libraries. How can I choose the order of my link arguments that I pass to gcc using setup.py ? Best regards, -- Gabriel Ganne -- Gabriel Ganne -------------- next part -------------- An HTML attachment was scrubbed... 
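One possible direction, sketched here only as an illustration and not tested against the flexlink/OCaml setup described above: distutils appends extra_link_args at the very end of the link line, but flags added to the compiler's linker_so command end up before the object files and the -l options, so a custom build_ext subclass can inject the flag there. The class name and the simplified Extension below are made up for the example:

    from setuptools import setup, Extension
    from setuptools.command.build_ext import build_ext

    class NoAsNeededBuildExt(build_ext):
        """Illustrative subclass that injects -Wl,--no-as-needed early."""

        def build_extensions(self):
            # On Unix-like compilers, linker_so is the command used to link
            # shared objects; anything appended here appears before the
            # objects and the -la/-lb options, unlike extra_link_args,
            # which distutils tacks on at the very end.
            if hasattr(self.compiler, "linker_so"):
                self.compiler.linker_so += ["-Wl,--no-as-needed"]
            build_ext.build_extensions(self)

    setup(
        name="foo",
        ext_modules=[
            Extension("foo", ["foo.c", "interface_c.c"], libraries=["a", "b"]),
        ],
        cmdclass={"build_ext": NoAsNeededBuildExt},
    )

Another route sometimes used on Linux is exporting LDFLAGS before running setup.py, since distutils appends that variable to the link driver command rather than to the end of the link line.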
URL: From ja.geb at me.com Tue Mar 7 06:06:35 2017 From: ja.geb at me.com (Jannis Gebauer) Date: Tue, 07 Mar 2017 12:06:35 +0100 Subject: [Distutils] Data on requirement files on GitHub Message-ID: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Hi, I ran a couple of queries against GitHub's public BigQuery dataset [0] last week. I'm interested in requirement files in particular, so I ran a query extracting all available requirement files. Since queries against this dataset are rather expensive ($7 on all repos), I thought I'd share the raw data here [1]. The data contains the repo name, the requirements file path and the contents of the file. Every line represents a JSON blob, read it with: with open('data.json') as f: for line in f.readlines(): data = json.loads(line) Maybe that's of interest to some of you. If you have any ideas on what to do with the data, please let me know. -- Jannis Gebauer [0]: https://cloud.google.com/bigquery/public-data/github [1]: https://github.com/jayfk/requirements-dataset -------------- next part -------------- An HTML attachment was scrubbed... URL: From prometheus235 at gmail.com Wed Mar 8 11:36:16 2017 From: prometheus235 at gmail.com (Nick Timkovich) Date: Wed, 8 Mar 2017 10:36:16 -0600 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: Looks like a fun chunk of data, what's the query you used? Can you add a README to the repo with some description if others want to iterate on it (maybe look into setup.py's?) Nick On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: > Hi, > > I ran a couple of queries against GitHub's public BigQuery dataset [0] > last week. I'm interested in requirement files in particular, so I ran a > query extracting all available requirement files. > > Since queries against this dataset are rather expensive ($7 on all repos), > I thought I'd share the raw data here [1]. The data contains the repo name, > the requirements file path and the contents of the file. Every line > represents a JSON blob, read it with: > > with open('data.json') as f: > for line in f.readlines(): > data = json.loads(line) > > Maybe that's of interest to some of you. > > If you have any ideas on what to do with the data, please let me know. > > -- > > Jannis Gebauer > > > > [0]: https://cloud.google.com/bigquery/public-data/github > [1]: https://github.com/jayfk/requirements-dataset > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lkraider at gmail.com Thu Mar 9 17:39:26 2017 From: lkraider at gmail.com (Paul Eipper) Date: Thu, 9 Mar 2017 19:39:26 -0300 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: I had some fun parsing and plotting the data (very simple, just the top packages for now). See here: https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb Let me know if you would accept a pull request so others can use that as a starting point. att, -- Paul Eipper On Wed, Mar 8, 2017 at 1:36 PM, Nick Timkovich wrote: > Looks like a fun chunk of data, what's the query you used? Can you add a > README to the repo with some description if others want to iterate on it > (maybe look into setup.py's?) 
> > Nick > > On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: > >> Hi, >> >> I ran a couple of queries against GitHubs public big query dataset [0] >> last week. I?m interested in requirement files in particular, so I ran a >> query extracting all available requirement files. >> >> Since queries against this dataset are rather expensive ($7 on all >> repos), I thought I?d share the raw data here [1]. The data contains the >> repo name, the requirements file path and the contents of the file. Every >> line represents a JSON blob, read it with: >> >> with open('data.json') as f: >> for line in f.readlines(): >> data = json.loads(line) >> >> Maybe that?s of interest to some of you. >> >> If you have any ideas on what to do with the data, please let me know. >> >> ? >> >> Jannis Gebauer >> >> >> >> [0]: https://cloud.google.com/bigquery/public-data/github >> [1]: https://github.com/jayfk/requirements-dataset >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lkraider at gmail.com Thu Mar 9 17:41:11 2017 From: lkraider at gmail.com (Paul Eipper) Date: Thu, 9 Mar 2017 19:41:11 -0300 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: PS: took 2 hours to parse the dataset into the linearized version (stored as "parsed.json") on my notebook. -- Paul Eipper On Thu, Mar 9, 2017 at 7:39 PM, Paul Eipper wrote: > I had some fun parsing and plotting the data (very simple, just the top > packages for now). See here: > https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb > > Let me know if you would accept a pull request so others can use that as a > starting point. > > att, > > > -- > Paul Eipper > > On Wed, Mar 8, 2017 at 1:36 PM, Nick Timkovich > wrote: > >> Looks like a fun chunk of data, what's the query you used? Can you add a >> README to the repo with some description if others want to iterate on it >> (maybe look into setup.py's?) >> >> Nick >> >> On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: >> >>> Hi, >>> >>> I ran a couple of queries against GitHubs public big query dataset [0] >>> last week. I?m interested in requirement files in particular, so I ran a >>> query extracting all available requirement files. >>> >>> Since queries against this dataset are rather expensive ($7 on all >>> repos), I thought I?d share the raw data here [1]. The data contains the >>> repo name, the requirements file path and the contents of the file. Every >>> line represents a JSON blob, read it with: >>> >>> with open('data.json') as f: >>> for line in f.readlines(): >>> data = json.loads(line) >>> >>> Maybe that?s of interest to some of you. >>> >>> If you have any ideas on what to do with the data, please let me know. >>> >>> ? 
>>> >>> Jannis Gebauer >>> >>> >>> >>> [0]: https://cloud.google.com/bigquery/public-data/github >>> [1]: https://github.com/jayfk/requirements-dataset >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Mar 9 22:57:14 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 9 Mar 2017 21:57:14 -0600 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: https://en.wikipedia.org/wiki/BigQuery BigQuery Dashboards - http://bigqueri.es/c/github-archive - https://redash.io/data-sources/google-bigquery - https://github.com/getredash/redash - https://github.com/getredash/redash/blob/master/requirements.txt - https://github.com/getredash/redash/blob/master/Dockerfile - https://github.com/docker/docker/blob/master/builder/dockerfile/parser/parser.go - https://github.com/DBuildService/dockerfile-parse/issues - https://github.com/getredash/redash/blob/master/docker-compose.yml Software Configuration Management / Dependency Management applications for BigQuery: - https://opensource.googleblog.com/2017/03/operation-rosehub.html - "Googlers used BigQuery and GitHub to patch thousands of vulnerable projects" https://www.reddit.com/r/bigquery/comments/5x0x5z/googlers_used_bigquery_and_github_to_patch/ BigQuery Python Libraries google-cloud-bigquery - | Src: https://github.com/GoogleCloudPlatform/google-cloud-python - | Pypi: https://pypi.python.org/pypi/google-cloud-bigquery - | Docs: https://cloud.google.com/bigquery/docs/reference/libraries#client-libraries-resources-python google-api-python-client - | Src: https://github.com/google/google-api-python-client - | Pypi: https://pypi.python.org/pypi/google-api-python-client - pandas.io.gbq uses google-api-python-client: - Docs: http://pandas.pydata.org/pandas-docs/stable/io.html#google-bigquery-experimental - read_gbq() http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.read_gbq.html#pandas.io.gbq.read_gbq - to_gbq() http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.to_gbq.html#pandas-io-gbq-to-gbq Open Source Big Data Components for things like BigQuery: Apache Drill - | Wikipedia: https://en.wikipedia.org/wiki/Apache_Drill - Apache Drill is similar to Google Dremel (which powers Google BigQuery) - https://pypi.python.org/pypi/drillpy Apache Beam - | Wikipedia: https://en.wikipedia.org/wiki/Apache_Beam - | Src: https://github.com/apache/beam - | Docs: https://beam.apache.org/documentation/sdks/python/ - | Docs: https://beam.apache.org/get-started/quickstart-py/ - | Docs: https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples - Google Cloud Dataflow is now of Apache Beam - https://cloud.google.com/dataflow/model/bigquery-io Parsing (and MAINTAINING) Pip Requirements.txt Files: - | Src: https://github.com/pypa/pip/tree/master/pip/req - https://github.com/pypa/pip/issues/3884#issuecomment-236454008 - https://github.com/pypa/pip/issues/1479 - -> Pipfile, Pipfile.lock (``pipenv install pkgname --dev``) - https://github.com/pyupio/safety-db#tools - https://pyup.io/ - 
https://libraries.io/github/librariesio/pydeps - https://github.com/librariesio/pydeps - https://libraries.io/ - Pipfile, Pipfile.lock - | PyPI: https://pypi.python.org/pypi/pipenv - | PyPI: https://pypi.python.org/pypi/requirements-parser - | PyPI: https://pypi.python.org/pypi/pipfile - | Src: https://github.com/kennethreitz/pipenv - These save to the Pipfile: - ``pipenv install pkgname`` - ``pipenv install pkgname --dev`` - https://github.com/kennethreitz/pipenv/blob/master/pipenv/utils.py - pip reqs.txt <--> Pipfile ... Thought I'd get these together; hopefully they're useful. Cool Jupyter notebook! ( https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb ) On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: > Hi, > > I ran a couple of queries against GitHubs public big query dataset [0] > last week. I?m interested in requirement files in particular, so I ran a > query extracting all available requirement files. > > Since queries against this dataset are rather expensive ($7 on all repos), > I thought I?d share the raw data here [1]. The data contains the repo name, > the requirements file path and the contents of the file. Every line > represents a JSON blob, read it with: > > with open('data.json') as f: > for line in f.readlines(): > data = json.loads(line) > > Maybe that?s of interest to some of you. > > If you have any ideas on what to do with the data, please let me know. > > ? > > Jannis Gebauer > > > > [0]: https://cloud.google.com/bigquery/public-data/github > [1]: https://github.com/jayfk/requirements-dataset > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 10 04:26:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Mar 2017 19:26:41 +1000 Subject: [Distutils] PEP 426 moved back to Draft status Message-ID: Hi folks, After a few years of dormancy, I've finally moved the metadata 2.0 specification back to Draft status: https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 Based on our last round of discussion, I've culled a lot of the complexity around dependency declarations, cutting it back to just 4 pre-declared extras (dev, doc, build, test), and some reserved extras that can be used to say "don't install this, even though you normally would" (self, runtime). I've also deleted a lot of the text related to thing that we now don't need to worry about until the first few standard metadata extensions are being defined. I think the biggest thing it needs right now is a major editing pass from someone that isn't me to help figure out which explanatory sections can be culled completely, while still having the specification itself make sense. >From a technical point of view, the main "different from today" piece that we have left is the Provide & Obsoleted-By fields, and I'm seriously wondering if it might make sense to just delete those entirely for now, and reconsider them later as a potential metadata extension. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Mar 10 09:52:34 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 10 Mar 2017 06:52:34 -0800 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > Hi folks, > > After a few years of dormancy, I've finally moved the metadata 2.0 > specification back to Draft status: > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 We have lots of metadata files in the wild that already claim to be version 2.0. If you're reviving this I think you might need to change the version number? > Based on our last round of discussion, I've culled a lot of the complexity > around dependency declarations, cutting it back to just 4 pre-declared > extras (dev, doc, build, test), I think we can drop 'build' in favor of pyproject.toml? Actually all of the pre-declared extras are really relevant for sdists rather than wheels. Maybe they should all move into pyproject.toml? > and some reserved extras that can be used to > say "don't install this, even though you normally would" (self, runtime). Hmm. While it's not the most urgent problem we face, I really think in the long run we need to move the extras system to something like: https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html The current extras system is inherently broken with respect to upgrades, and reified extras would solve this, along with several other intractable problems (e.g. numpy ABI tracking). So from that perspective, I'm wary of adding new special case "magic" to the extras system. Adding conventional names for things like test-dependencies is fine, that doesn't pose any new obstacles to a future migration. But adding complexity to the "extras language" like "*", "self", "runtime", etc. does make it harder to change how extras work in the future. I feel like most of the value we get out of these could be had by just standardizing the existing convention that packages should have an explicit "all" extra that includes all the feature-based extras, but not the special development extras? This also provides flexibility for cases like, a package where there are two extras that conflict with each other -- the package authors can pick which one they recommend to put into "all". > I've also deleted a lot of the text related to thing that we now don't need > to worry about until the first few standard metadata extensions are being > defined. > > I think the biggest thing it needs right now is a major editing pass from > someone that isn't me to help figure out which explanatory sections can be > culled completely, while still having the specification itself make sense. > > From a technical point of view, the main "different from today" piece that > we have left is the Provide & Obsoleted-By fields, and I'm seriously > wondering if it might make sense to just delete those entirely for now, and > reconsider them later as a potential metadata extension. Overall the vibe I get from the Provides and Obsoleted-By sections is that these are surprisingly complicated and could really do with their own PEP, yeah, where the spec will have room to breathe and properly cover all the details. In particular, the language in the "provides" spec about how the interpretation of the metadata depends on whether you get it from a public index server versus somewhere else makes me really nervous. Experience suggests that splitting up packaging PEPs is basically never a bad idea, right? 
:-) As a general note I guess I should say that I'm still not convinced that migrating to json is worth the effort, but you've heard those arguments before and I don't have anything new to add now, so :-). -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Fri Mar 10 10:55:49 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Mar 2017 01:55:49 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 11 March 2017 at 00:52, Nathaniel Smith wrote: > On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > > Hi folks, > > > > After a few years of dormancy, I've finally moved the metadata 2.0 > > specification back to Draft status: > > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8a > ec48bbb7a3 > > We have lots of metadata files in the wild that already claim to be > version 2.0. If you're reviving this I think you might need to change > the version number? > They're mostly in metadata.json files, though. That said, version numbers are cheap, so I'm happy to skip straight to 3.0 if folks think it makes more sense. > > Based on our last round of discussion, I've culled a lot of the > complexity > > around dependency declarations, cutting it back to just 4 pre-declared > > extras (dev, doc, build, test), > > I think we can drop 'build' in favor of pyproject.toml? > No, as that's a human edited input file, not an output file from the sdist generation process. > Actually all of the pre-declared extras are really relevant for sdists > rather than wheels. Maybe they should all move into pyproject.toml? > Think "static release metadata in an API response from PyPI" for this particular specification, rather than something you'd necessarily check into source control. That's actually one of the big benefits of doing this post pyproject.toml - with that taking care of the build system bootstrapping problem, it frees up pydist.json to be entirely an artifact of the sdist generation process (and then copying it along to the wheel archives and the installed package as well). That said, that's actually an important open question: is pydist.json always preserved unmodified through the sdist->wheel->install and sdist->install process? There's a lot to be said for treating the file as immutable, and instead adding *other* metadata files as a component moves through the distribution process. If so, then it may actually be more appropriate to call the rendered file "pysdist.json", since it contains the sdist metadata specifically, rather than arbitrary distribution metadata. > > > and some reserved extras that can be used to > > say "don't install this, even though you normally would" (self, runtime). > > Hmm. While it's not the most urgent problem we face, I really think in > the long run we need to move the extras system to something like: > > https://mail.python.org/pipermail/distutils-sig/2015- > October/027364.html > > The current extras system is inherently broken with respect to > upgrades, and reified extras would solve this, along with several > other intractable problems (e.g. numpy ABI tracking). > > So from that perspective, I'm wary of adding new special case "magic" > to the extras system. Adding conventional names for things like > test-dependencies is fine, that doesn't pose any new obstacles to a > future migration. But adding complexity to the "extras language" like > "*", "self", "runtime", etc. does make it harder to change how extras > work in the future. 
> Technically the only part of that which the PEP really locks in is barring the use of "self" and "runtime" as extras names (which needs to be validated by a check against currently published metadata to see if anyone is already using them). '*' is already illegal due to the naming rules, and the '-extra' syntax is also an illegal name, so neither of those actually impacts the metadata format, only what installation tools allow. The main purpose of having them in the PEP is to disallow using those spellings for anything else and instead reserve them for the purposes described in the PEP. I'd also be fairly strongly opposed to converting extras from an optional dependency management system to a "let multiple PyPI packages target the same site-packages subdirectory" because we already know that's a nightmare from the Linux distro experience (having a clear "main" package that owns the parent directory with optional subpackages solves *some* of the problems, but my main reaction is still "Run awaaay"). It especially isn't needed just to solve the "pip forgets what extras it installed" problem - that technically doesn't even need a PEP to resolve, it just needs pip to drop a pip specific file into the PEP 376 dist-info directory that says what extras to request when doing future upgrades. Similarly, the import system offers so much flexibility in checking for optional packages at startup and lying about where imports are coming from that it would be hard to convince me that installation customisation to use particular optional dependencies *had* to be done at install time. > I feel like most of the value we get out of these could be had by just > standardizing the existing convention that packages should have an > explicit "all" extra that includes all the feature-based extras, That's the first I've heard of that convention, so it may not be as widespread as you thought it was :) > but > not the special development extras? This also provides flexibility for > cases like, a package where there are two extras that conflict with > each other -- the package authors can pick which one they recommend to > put into "all". > That's actually the main problem I had with '*' - it didn't work anywhere near as nicely once the semantic dependencies were migrated over to being part of the extras system. Repeating the same dependencies under multiple extra names in order to model pseudo-sets seems error prone and messy to me, though. So perhaps we should add the notion of "extra_sets" as a first class entity, where they're named sets of declared extras? And if you don't declare an "all" set explicitly, you get an implied one that consists of all your declared extras. For migration of existing metadata that uses "all" as a normal extra, the translation would be: - declared extras are added to "all" in order until all of the dependencies in all are covered or all declared extras are included - any dependency in "all" that isn't in another extra gets added to a new "_all" extra - "extras" and "extra_sets" are populated accordingly Tools consuming the metadata would then just need to read "extra_sets" and expand any named sets before passing the list of extras over to their existing dependency processing machinery. > I've also deleted a lot of the text related to thing that we now don't > need > > to worry about until the first few standard metadata extensions are being > > defined. 
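Coming back to the "extra_sets" idea above, a minimal sketch of how the rendered metadata could look (the "extra_sets" field and the concrete extra names here are hypothetical illustrations, not part of the current draft):

{
    "extras": ["doc", "test", "ssl", "_all"],
    "extra_sets": {
        "all": ["doc", "test", "ssl", "_all"]
    },
    "run_requires": [
        {"requires": ["sphinx"], "extra": "doc"},
        {"requires": ["pytest"], "extra": "test"}
    ]
}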
> > > > I think the biggest thing it needs right now is a major editing pass from > > someone that isn't me to help figure out which explanatory sections can > be > > culled completely, while still having the specification itself make > sense. > > > > From a technical point of view, the main "different from today" piece > that > > we have left is the Provide & Obsoleted-By fields, and I'm seriously > > wondering if it might make sense to just delete those entirely for now, > and > > reconsider them later as a potential metadata extension. > > Overall the vibe I get from the Provides and Obsoleted-By sections is > that these are surprisingly complicated and could really do with their > own PEP, yeah, where the spec will have room to breathe and properly > cover all the details. > > In particular, the language in the "provides" spec about how the > interpretation of the metadata depends on whether you get it from a > public index server versus somewhere else makes me really nervous. > Yeah, virtual provides are a security nightmare on a public index server - distros are only able to get away with it because they maintain relatively strict control over the package review process. > Experience suggests that splitting up packaging PEPs is basically > never a bad idea, right? :-) > Indeed :) OK, I'll put them on the chopping block too, under the assumption they may come back as an extension some day if it ever makes it to the top of someone's list of "thing that bothers them enough about Python packaging to do something about it". > As a general note I guess I should say that I'm still not convinced > that migrating to json is worth the effort, but you've heard those > arguments before and I don't have anything new to add now, so :-). > The main benefit I see will be to empower utility APIs like distlib (and potentially Warehouse itself) to better hide both the historical and migratory cruft by translating everything to the PEP 426 format, even if the source artifact only includes the legacy metadata. Unless the plumbing actually breaks, nobody other than the plumber cares when it's a mess, as long as the porcelain is shiny and clean :) Cheers, Nick. P.S. Something I'm getting out of this experience: if you can afford to sit on your hands for 3-4 years, that's a *really good way* to avoid falling prey to "second system syndrome" [1] :) P.P.S Having no budget to pay anyone else and only limited time and attention of your own also turns out to make it easier to avoid ;) [1] http://coliveira.net/software/what-is-second-system-syndrome/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Mar 10 13:14:00 2017 From: brett at python.org (Brett Cannon) Date: Fri, 10 Mar 2017 18:14:00 +0000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Fri, 10 Mar 2017 at 07:56 Nick Coghlan wrote: > On 11 March 2017 at 00:52, Nathaniel Smith wrote: > > On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > > Hi folks, > > > > After a few years of dormancy, I've finally moved the metadata 2.0 > > specification back to Draft status: > > > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 > > We have lots of metadata files in the wild that already claim to be > version 2.0. If you're reviving this I think you might need to change > the version number? > > > They're mostly in metadata.json files, though. 
That said, version numbers > are cheap, so I'm happy to skip straight to 3.0 if folks think it makes > more sense. > +1 on jumping. > > > > Based on our last round of discussion, I've culled a lot of the > complexity > > around dependency declarations, cutting it back to just 4 pre-declared > > extras (dev, doc, build, test), > > I think we can drop 'build' in favor of pyproject.toml? > > > No, as that's a human edited input file, not an output file from the sdist > generation process. > > > Actually all of the pre-declared extras are really relevant for sdists > rather than wheels. Maybe they should all move into pyproject.toml? > > > Think "static release metadata in an API response from PyPI" for this > particular specification, rather than something you'd necessarily check > into source control. > Or "stuff PyPI has to parse, not you". ;) > That's actually one of the big benefits of doing this post pyproject.toml > - with that taking care of the build system bootstrapping problem, it > frees up pydist.json to be entirely an artifact of the sdist generation > process (and then copying it along to the wheel archives and the installed > package as well). > > That said, that's actually an important open question: is pydist.json > always preserved unmodified through the sdist->wheel->install and > sdist->install process? > Is there a reason not to? > > There's a lot to be said for treating the file as immutable, and instead > adding *other* metadata files as a component moves through the distribution > process. If so, then it may actually be more appropriate to call the > rendered file "pysdist.json", since it contains the sdist metadata > specifically, rather than arbitrary distribution metadata. > Since this is meant for tool consumption and not human consumption, breaking the steps into individual files so that they are considered immutable by tools farther down the toolchain makes sense to me. > > > > and some reserved extras that can be used to > > say "don't install this, even though you normally would" (self, runtime). > > Hmm. While it's not the most urgent problem we face, I really think in > the long run we need to move the extras system to something like: > > > https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html > > The current extras system is inherently broken with respect to > upgrades, and reified extras would solve this, along with several > other intractable problems (e.g. numpy ABI tracking). > > So from that perspective, I'm wary of adding new special case "magic" > to the extras system. Adding conventional names for things like > test-dependencies is fine, that doesn't pose any new obstacles to a > future migration. But adding complexity to the "extras language" like > "*", "self", "runtime", etc. does make it harder to change how extras > work in the future. > > > Technically the only part of that which the PEP really locks in is barring > the use of "self" and "runtime" as extras names (which needs to be > validated by a check against currently published metadata to see if anyone > is already using them). > Do you have something planned for these names? > > '*' is already illegal due to the naming rules, and the '-extra' syntax is > also an illegal name, so neither of those actually impacts the metadata > format, only what installation tools allow. The main purpose of having them > in the PEP is to disallow using those spellings for anything else and > instead reserve them for the purposes described in the PEP. 
> > I'd also be fairly strongly opposed to converting extras from an optional > dependency management system to a "let multiple PyPI packages target the > same site-packages subdirectory" because we already know that's a nightmare > from the Linux distro experience (having a clear "main" package that owns > the parent directory with optional subpackages solves *some* of the > problems, but my main reaction is still "Run awaaay"). > > It especially isn't needed just to solve the "pip forgets what extras it > installed" problem - that technically doesn't even need a PEP to resolve, > it just needs pip to drop a pip specific file into the PEP 376 dist-info > directory that says what extras to request when doing future upgrades. > Similarly, the import system offers so much flexibility in checking for > optional packages at startup and lying about where imports are coming from > that it would be hard to convince me that installation customisation to use > particular optional dependencies *had* to be done at install time. > > > I feel like most of the value we get out of these could be had by just > standardizing the existing convention that packages should have an > explicit "all" extra that includes all the feature-based extras, > > > That's the first I've heard of that convention, so it may not be as > widespread as you thought it was :) > > > but > not the special development extras? This also provides flexibility for > cases like, a package where there are two extras that conflict with > each other -- the package authors can pick which one they recommend to > put into "all". > > > That's actually the main problem I had with '*' - it didn't work anywhere > near as nicely once the semantic dependencies were migrated over to being > part of the extras system. > > Repeating the same dependencies under multiple extra names in order to > model pseudo-sets seems error prone and messy to me, though. > > So perhaps we should add the notion of "extra_sets" as a first class > entity, where they're named sets of declared extras? And if you don't > declare an "all" set explicitly, you get an implied one that consists of > all your declared extras. > I think that's a tool decision that doesn't tie into the PEP (unless you're going to ban the use of the name "all"). > > For migration of existing metadata that uses "all" as a normal extra, the > translation would be: > > - declared extras are added to "all" in order until all of the > dependencies in all are covered or all declared extras are included > - any dependency in "all" that isn't in another extra gets added to a new > "_all" extra > - "extras" and "extra_sets" are populated accordingly > > Tools consuming the metadata would then just need to read "extra_sets" and > expand any named sets before passing the list of extras over to their > existing dependency processing machinery. > If this is meant to be generated by pyproject.toml consumers then I think it should be up to the build tools to support that concept. Then the build tools can statically declare the union of some extras to get extra sets since the information isn't changing once the pydist.json file is generated (dynamic calculation is only necessary if the value could change between data generation and consumption). > > > I've also deleted a lot of the text related to thing that we now don't > need > > to worry about until the first few standard metadata extensions are being > > defined. 
> > > > I think the biggest thing it needs right now is a major editing pass from > > someone that isn't me to help figure out which explanatory sections can > be > > culled completely, while still having the specification itself make > sense. > > > > From a technical point of view, the main "different from today" piece > that > > we have left is the Provide & Obsoleted-By fields, and I'm seriously > > wondering if it might make sense to just delete those entirely for now, > and > > reconsider them later as a potential metadata extension. > > Overall the vibe I get from the Provides and Obsoleted-By sections is > that these are surprisingly complicated and could really do with their > own PEP, yeah, where the spec will have room to breathe and properly > cover all the details. > > In particular, the language in the "provides" spec about how the > interpretation of the metadata depends on whether you get it from a > public index server versus somewhere else makes me really nervous. > > > Yeah, virtual provides are a security nightmare on a public index server - > distros are only able to get away with it because they maintain relatively > strict control over the package review process. > > > Experience suggests that splitting up packaging PEPs is basically > never a bad idea, right? :-) > > > Indeed :) > > OK, I'll put them on the chopping block too, under the assumption they may > come back as an extension some day if it ever makes it to the top of > someone's list of "thing that bothers them enough about Python packaging to > do something about it". > > > As a general note I guess I should say that I'm still not convinced > that migrating to json is worth the effort, but you've heard those > arguments before and I don't have anything new to add now, so :-). > > > The main benefit I see will be to empower utility APIs like distlib (and > potentially Warehouse itself) to better hide both the historical and > migratory cruft by translating everything to the PEP 426 format, even if > the source artifact only includes the legacy metadata. Unless the plumbing > actually breaks, nobody other than the plumber cares when it's a mess, as > long as the porcelain is shiny and clean :) > > Cheers, > Nick. > > P.S. Something I'm getting out of this experience: if you can afford to > sit on your hands for 3-4 years, that's a *really good way* to avoid > falling prey to "second system syndrome" [1] :) > > P.P.S Having no budget to pay anyone else and only limited time and > attention of your own also turns out to make it easier to avoid ;) > Yes, getting to stew on an idea for any length of time lets those random ideas one gets to properly die when they are bad. ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Fri Mar 10 16:03:20 2017 From: dholth at gmail.com (Daniel Holth) Date: Fri, 10 Mar 2017 21:03:20 +0000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: You lost me a bit at 'extra sets'. FYI it is already possible to depend on your own extras in another extra. Extra pseudo code: spampackage extra['spam'] = 'spampackage[eggs]' extra['eggs'] = ... +1 on extras. The extras feature has the wonderful property that people understand it. Lots of projects have a 'test' extra instead of tests_require for example, and you don't have to look up how to install them. 
On Fri, Mar 10, 2017 at 1:14 PM Brett Cannon wrote: On Fri, 10 Mar 2017 at 07:56 Nick Coghlan wrote: On 11 March 2017 at 00:52, Nathaniel Smith wrote: On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > Hi folks, > > After a few years of dormancy, I've finally moved the metadata 2.0 > specification back to Draft status: > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 We have lots of metadata files in the wild that already claim to be version 2.0. If you're reviving this I think you might need to change the version number? They're mostly in metadata.json files, though. That said, version numbers are cheap, so I'm happy to skip straight to 3.0 if folks think it makes more sense. +1 on jumping. > Based on our last round of discussion, I've culled a lot of the complexity > around dependency declarations, cutting it back to just 4 pre-declared > extras (dev, doc, build, test), I think we can drop 'build' in favor of pyproject.toml? No, as that's a human edited input file, not an output file from the sdist generation process. Actually all of the pre-declared extras are really relevant for sdists rather than wheels. Maybe they should all move into pyproject.toml? Think "static release metadata in an API response from PyPI" for this particular specification, rather than something you'd necessarily check into source control. Or "stuff PyPI has to parse, not you". ;) That's actually one of the big benefits of doing this post pyproject.toml - with that taking care of the build system bootstrapping problem, it frees up pydist.json to be entirely an artifact of the sdist generation process (and then copying it along to the wheel archives and the installed package as well). That said, that's actually an important open question: is pydist.json always preserved unmodified through the sdist->wheel->install and sdist->install process? Is there a reason not to? There's a lot to be said for treating the file as immutable, and instead adding *other* metadata files as a component moves through the distribution process. If so, then it may actually be more appropriate to call the rendered file "pysdist.json", since it contains the sdist metadata specifically, rather than arbitrary distribution metadata. Since this is meant for tool consumption and not human consumption, breaking the steps into individual files so that they are considered immutable by tools farther down the toolchain makes sense to me. > and some reserved extras that can be used to > say "don't install this, even though you normally would" (self, runtime). Hmm. While it's not the most urgent problem we face, I really think in the long run we need to move the extras system to something like: https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html The current extras system is inherently broken with respect to upgrades, and reified extras would solve this, along with several other intractable problems (e.g. numpy ABI tracking). So from that perspective, I'm wary of adding new special case "magic" to the extras system. Adding conventional names for things like test-dependencies is fine, that doesn't pose any new obstacles to a future migration. But adding complexity to the "extras language" like "*", "self", "runtime", etc. does make it harder to change how extras work in the future. 
Technically the only part of that which the PEP really locks in is barring the use of "self" and "runtime" as extras names (which needs to be validated by a check against currently published metadata to see if anyone is already using them). Do you have something planned for these names? '*' is already illegal due to the naming rules, and the '-extra' syntax is also an illegal name, so neither of those actually impacts the metadata format, only what installation tools allow. The main purpose of having them in the PEP is to disallow using those spellings for anything else and instead reserve them for the purposes described in the PEP. I'd also be fairly strongly opposed to converting extras from an optional dependency management system to a "let multiple PyPI packages target the same site-packages subdirectory" because we already know that's a nightmare from the Linux distro experience (having a clear "main" package that owns the parent directory with optional subpackages solves *some* of the problems, but my main reaction is still "Run awaaay"). It especially isn't needed just to solve the "pip forgets what extras it installed" problem - that technically doesn't even need a PEP to resolve, it just needs pip to drop a pip specific file into the PEP 376 dist-info directory that says what extras to request when doing future upgrades. Similarly, the import system offers so much flexibility in checking for optional packages at startup and lying about where imports are coming from that it would be hard to convince me that installation customisation to use particular optional dependencies *had* to be done at install time. I feel like most of the value we get out of these could be had by just standardizing the existing convention that packages should have an explicit "all" extra that includes all the feature-based extras, That's the first I've heard of that convention, so it may not be as widespread as you thought it was :) but not the special development extras? This also provides flexibility for cases like, a package where there are two extras that conflict with each other -- the package authors can pick which one they recommend to put into "all". That's actually the main problem I had with '*' - it didn't work anywhere near as nicely once the semantic dependencies were migrated over to being part of the extras system. Repeating the same dependencies under multiple extra names in order to model pseudo-sets seems error prone and messy to me, though. So perhaps we should add the notion of "extra_sets" as a first class entity, where they're named sets of declared extras? And if you don't declare an "all" set explicitly, you get an implied one that consists of all your declared extras. I think that's a tool decision that doesn't tie into the PEP (unless you're going to ban the use of the name "all"). For migration of existing metadata that uses "all" as a normal extra, the translation would be: - declared extras are added to "all" in order until all of the dependencies in all are covered or all declared extras are included - any dependency in "all" that isn't in another extra gets added to a new "_all" extra - "extras" and "extra_sets" are populated accordingly Tools consuming the metadata would then just need to read "extra_sets" and expand any named sets before passing the list of extras over to their existing dependency processing machinery. If this is meant to be generated by pyproject.toml consumers then I think it should be up to the build tools to support that concept. 
Then the build tools can statically declare the union of some extras to get extra sets since the information isn't changing once the pydist.json file is generated (dynamic calculation is only necessary if the value could change between data generation and consumption). > I've also deleted a lot of the text related to thing that we now don't need > to worry about until the first few standard metadata extensions are being > defined. > > I think the biggest thing it needs right now is a major editing pass from > someone that isn't me to help figure out which explanatory sections can be > culled completely, while still having the specification itself make sense. > > From a technical point of view, the main "different from today" piece that > we have left is the Provide & Obsoleted-By fields, and I'm seriously > wondering if it might make sense to just delete those entirely for now, and > reconsider them later as a potential metadata extension. Overall the vibe I get from the Provides and Obsoleted-By sections is that these are surprisingly complicated and could really do with their own PEP, yeah, where the spec will have room to breathe and properly cover all the details. In particular, the language in the "provides" spec about how the interpretation of the metadata depends on whether you get it from a public index server versus somewhere else makes me really nervous. Yeah, virtual provides are a security nightmare on a public index server - distros are only able to get away with it because they maintain relatively strict control over the package review process. Experience suggests that splitting up packaging PEPs is basically never a bad idea, right? :-) Indeed :) OK, I'll put them on the chopping block too, under the assumption they may come back as an extension some day if it ever makes it to the top of someone's list of "thing that bothers them enough about Python packaging to do something about it". As a general note I guess I should say that I'm still not convinced that migrating to json is worth the effort, but you've heard those arguments before and I don't have anything new to add now, so :-). The main benefit I see will be to empower utility APIs like distlib (and potentially Warehouse itself) to better hide both the historical and migratory cruft by translating everything to the PEP 426 format, even if the source artifact only includes the legacy metadata. Unless the plumbing actually breaks, nobody other than the plumber cares when it's a mess, as long as the porcelain is shiny and clean :) Cheers, Nick. P.S. Something I'm getting out of this experience: if you can afford to sit on your hands for 3-4 years, that's a *really good way* to avoid falling prey to "second system syndrome" [1] :) P.P.S Having no budget to pay anyone else and only limited time and attention of your own also turns out to make it easier to avoid ;) Yes, getting to stew on an idea for any length of time lets those random ideas one gets to properly die when they are bad. ;) _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 10 23:17:58 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Mar 2017 14:17:58 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 11 March 2017 at 07:03, Daniel Holth wrote: > You lost me a bit at 'extra sets'. 
FYI it is already possible to depend on > your own extras in another extra. > > Extra pseudo code: > spampackage > extra['spam'] = 'spampackage[eggs]' > extra['eggs'] = ... > Oh, nice. In that case, we can drop the '*' idea and just make "all" another pre-declared extra with a SHOULD that says sdist build tools should implicitly populate it as: { "requires": "thisproject[extra1,extra2,extra3,extra4]", "extra": "all" } given an extras clause containing '["extra1", "extra2", "extra3", "extra4"]'. Endorsing that approach to handling "extra sets" does impose a design constraint though, which is that installation tools will need to special-case self-referential requirements so they don't get stuck in a recursive loop. (That will become a new MUST in the spec) That just leaves the question of how to install build & test requirements without installing the project itself, and I guess we don't actually need to handle that at the Python metadata level - it can be done by external tools. For example, in the pyp2rpm case, it's handled by the translation to BuildRequires and Requires terms at the RPM level, with RPM then handling the task of setting up the build environment correctly. > +1 on extras. The extras feature has the wonderful property that people > understand it. Lots of projects have a 'test' extra instead of > tests_require for example, and you don't have to look up how to install > them. > Yeah, it was really helpful to me to work through the "How would I replace this proposal with the existing extras system?", since the end result achieved everything I was aiming for without requiring any fundamentally new concepts or tech. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From amine.djillali at gmail.com Sat Mar 11 10:23:28 2017 From: amine.djillali at gmail.com (Adh) Date: Sat, 11 Mar 2017 16:23:28 +0100 Subject: [Distutils] Python 3.5 Message-ID: Hello, I have not managed to install Python 3.5 from the terminal; I use Ubuntu. Which command do I have to type to install it? Thank you for your answer -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From andrey at futoin.org Fri Mar 10 17:25:06 2017 From: andrey at futoin.org (Andrey Galkin) Date: Sat, 11 Mar 2017 00:25:06 +0200 Subject: [Distutils] Critical: PR for packaging.specifiers not found issue Message-ID: Can someone please take a look at https://github.com/pypa/setuptools/pull/990 ? Previously, the issue was reported by another user and then rejected: https://github.com/pypa/setuptools/issues/967 The problem is reproducible on both Python 2.7.13 and 3.5.3 as shipped in Debian Stretch. It is not yet visible on other OSes, including Ubuntu, which ship previous patch versions of 2.7 & 3.5. I believe it's related to this change: bpo-27419: Standard __import__() no longer looks up '__import__' in globals or builtins for importing submodules or 'from import'. Fixed handling an error of non-string package name. https://bugs.python.org/issue27419 The packaging module does not export specifiers in __init__.py. It can be easily triggered with "pip install -e source_dir". I can confirm the issue vanishes once the one-liner PR is applied to the latest setuptools in the virtualenv.
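For anyone puzzled by the tracebacks below, the underlying import behaviour is easy to reproduce in isolation (a rough illustration only, not the actual setuptools code path):

import packaging              # runs packaging/__init__.py only
try:
    packaging.specifiers      # fails unless the submodule was already
                              # imported elsewhere in the process
except AttributeError as exc:
    print(exc)                # module 'packaging' has no attribute 'specifiers'

import packaging.specifiers   # explicit submodule import binds the attribute
print(packaging.specifiers.SpecifierSet(">=1.0"))

Whether the attribute access succeeds depends only on whether packaging.specifiers has already been imported somewhere in the process, which is presumably why an explicit one-liner import fix makes the error go away.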
Error output: Complete output from command python setup.py egg_info: Traceback (most recent call last): File "", line 1, in File "/vagrant/setup.py", line 67, in setup(**config) File "/usr/lib/python2.7/distutils/core.py", line 111, in setup _setup_distribution = dist = klass(attrs) File "/home/vagrant/.virtualenv-2.7/local/lib/python2.7/site-packages/setuptools/dist.py", line 320, in __init__ _Distribution.__init__(self, attrs) File "/usr/lib/python2.7/distutils/dist.py", line 287, in __init__ self.finalize_options() File "/home/vagrant/.virtualenv-2.7/local/lib/python2.7/site-packages/setuptools/dist.py", line 387, in finalize_options ep.load()(self, ep.name, value) File "/home/vagrant/.virtualenv-2.7/local/lib/python2.7/site-packages/setuptools/dist.py", line 166, in check_specifier except packaging.specifiers.InvalidSpecifier as error: AttributeError: 'module' object has no attribute 'specifiers' Complete output from command python setup.py egg_info: Traceback (most recent call last): File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 165, in check_specifier packaging.specifiers.SpecifierSet(value) AttributeError: module 'packaging' has no attribute 'specifiers' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "", line 1, in File "/vagrant/setup.py", line 67, in setup(**config) File "/usr/lib/python3.5/distutils/core.py", line 108, in setup _setup_distribution = dist = klass(attrs) File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 320, in __init__ _Distribution.__init__(self, attrs) File "/usr/lib/python3.5/distutils/dist.py", line 281, in __init__ self.finalize_options() File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 387, in finalize_options ep.load()(self, ep.name, value) File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 166, in check_specifier except packaging.specifiers.InvalidSpecifier as error: AttributeError: module 'packaging' has no attribute 'specifiers' From ben+python at benfinney.id.au Sat Mar 11 18:57:32 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 12 Mar 2017 10:57:32 +1100 Subject: [Distutils] Python 3.5 References: Message-ID: <85varfgydv.fsf@benfinney.id.au> Adh writes: > Hello, I do not arrive to install python 3.5 in the terminal, I use > Ubuntu. You should ask general usage questions in the main user forum for Python, . Please subscribe there, tell them which Ubuntu version you are using, what command you type and what is the result. They will help you from there. -- \ ?Simplicity is prerequisite for reliability.? ?Edsger W. | `\ Dijkstra | _o__) | Ben Finney From graffatcolmingov at gmail.com Sat Mar 11 21:26:20 2017 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Sat, 11 Mar 2017 20:26:20 -0600 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85r323gw48.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> Message-ID: On Mar 11, 2017 6:47 PM, "Ben Finney" wrote: Howdy all, What prospects are there for PyPI to have GnuPG-signed packages by default? Debian's UScan has the ability to find, download, and verify the GnuPG signature for a package source release. Lintian will remind the maintainer if a Debian source package is not taking advantage of this. However, this only works if upstream releases are actually accompanied by a valid GnuPG signature each time. 
The PyPI infrastructure supports this; why isn't it more widely encouraged? This thread from 2016 has a possible answer: while you can use GPG as is to verify that yes, "Donald Stufft" signed a particular package, you cannot use it to determine if "Donald Stufft" is *allowed* to sign for that package, a valid signature from me on the requests project should be just as invalid as an invalid signature from anyone on the requests project. The only namespacing provided by GPG itself is "trusted key" vs "not trusted key". [?] I am aware of a single tool anywhere that actively supports verifying the signatures that people upload to PyPI, and that is Debian's uscan program. [?] All in all, I think that there is not a whole lot of point to having this feature in PyPI, it is predicated a bunch of invalid assumptions (as detailed above) and I do not believe end users are actually even using the keys that are being uploaded. [?] Thus, I would like to remove this feature from PyPI [?]. The thread has some discussion, and Barry Warsaw makes the case for Debian's use for signed releases. The last (?) post in the thread has a kind of interim conclusion: My main concern when implementing this is how to communicate it to users [?]. [A move that gives the impression] "we're getting rid of this thing that only kinda works now in favor of something amazing that doesn't exist yet" is just not a popular move. In response to polite requests for signed releases, some upstream I've only ever seen condescending requests in the past but perhaps we have different definitions of "polite" or perhaps things have genuinely changed. maintainers are now pointing to that thread and closing bug reports as ?won't fix?. You may have noticed in that thread that there are plans for better mechanisms. Mechanisms that don't add significantly more burden to maintainers of the software we know and love who do this for free and with their spare time. What prospect is there in the Python community to get signed upstream releases become the obvious norm? Not every package on PyPI is redistributed via Linux packagers. Why then should someone publishing their tiny little first package have to go through the hassle of creating a GPG key? As a maintainer of Twine, I will never force someone to have learned how to install GPG on their platform, create a key that package maintainers won't belittle them for, and maintain the key's security in order to upload something to PyPI. Further GPG depends on trust. Do you mean to imply that Debian trusts PyPI packages with a signature more than those without? Even if the key used to sign it has never been signed by another person? What about keys signed by people you've never met? Someone can manufacture their own web of trust if they want to. Why is GPG seen as done kind of magic authenticity bullet? If you can find a tool that is easy to install on Linux, Windows, and Mac, which solves the problems above by virtue of having very good defaults, and is accessible to anyone with less than a few hours to waste on it... Then maybe I would collaborate to make it a requirement. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Sun Mar 12 03:15:03 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 12 Mar 2017 18:15:03 +1100 Subject: [Distutils] GnuPG signatures on PyPI: why so few? 
References: <85r323gw48.fsf@benfinney.id.au> Message-ID: <85h92zge4o.fsf@benfinney.id.au> (Ian, your messages are failing to properly quote material you're responding to. The message you posted has no quote leaders on my material, which looks like it was written by you; see the message at . If this is some mangling done by GMail, you may need to change its configuration or post using something else until it's fixed.) Ian Cordasco writes: > If you can find a tool that is easy to install on Linux, Windows, and Mac, > which solves the problems above by virtue of having very good defaults, and > is accessible to anyone with less than a few hours to waste on it... Then > maybe I would collaborate to make it a requirement. No-one here has argued that it be a requirement as things stand now. I'm talking about encouraging it as a norm, by improving tool support to make it easier. -- \ ?The fact of your own existence is the most astonishing fact | `\ you'll ever have to confront. Don't dare ever see your life as | _o__) boring, monotonous, or joyless.? ?Richard Dawkins, 2010-03-10 | Ben Finney From p.f.moore at gmail.com Sun Mar 12 07:49:16 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 12 Mar 2017 11:49:16 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85h92zge4o.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> Message-ID: On 12 March 2017 at 07:15, Ben Finney wrote: >> If you can find a tool that is easy to install on Linux, Windows, and Mac, >> which solves the problems above by virtue of having very good defaults, and >> is accessible to anyone with less than a few hours to waste on it... Then >> maybe I would collaborate to make it a requirement. > > No-one here has argued that it be a requirement as things stand now. I'm > talking about encouraging it as a norm, by improving tool support to > make it easier. One tool that needs improvement to be easier to use for this to happen is GPG itself. As a Windows user, I've "played" with it in the past, and found it frustratingly difficult. It's fiddly to set up, it's not officially supported on Windows, it's intrusive (needs an installer rather than having a portable version), and doesn't give me any assistance in managing the generated key that I might only need once every year or two, and not always on the same machine (and at least one of the machines involved has all access to "internet shared storage" blocked). If I were publishing code that was used extensively by others, and I was being paid to set up a production quality distribution, then I'd be fine with all this. But for putting up my hobby program for others to take a look at if they are interested, it's way too much to expect. (And I'd strongly resist suggestions that such hobby programs be refused permission to publish on PyPI - everything that's available on PyPI started off in just that way). Paul From ben+python at benfinney.id.au Sun Mar 12 08:13:37 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 12 Mar 2017 23:13:37 +1100 Subject: [Distutils] GnuPG signatures on PyPI: why so few? References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> Message-ID: <85d1dmhevi.fsf@benfinney.id.au> Paul Moore writes: > One tool that needs improvement to be easier to use for this to happen > is GPG itself. No disagreement from me on that. 
And indeed, the GnuPG project's chronic under-funding eventually drew attention from the new Core Infrastructure Initiative to improve it faster than was historically the case. This is thanks in large part to the amazing work of Nadia Eghbal in drawing attention to how critical free software, such as GnuPG, benefits society enormously and must receive reliable funding from the organisations who benefit. If anyone reading this works for any organisation that wants to ensure such critical free-software infrastructure continues to be consistently funded and maintained, encourage regular financial contribution to the Core Infrastructure Initiative or similar projects. > As a Windows user, I've "played" with it in the past, and found it > frustratingly difficult. I hope many people here will find the guide published by the FSF, Email Self-Defense , a useful walk through how to set it up properly. -- \ ?I must say that I find television very educational. The minute | `\ somebody turns it on, I go to the library and read a book.? | _o__) ?Groucho Marx | Ben Finney From p.f.moore at gmail.com Sun Mar 12 10:35:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 12 Mar 2017 14:35:44 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85d1dmhevi.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> Message-ID: On 12 March 2017 at 12:13, Ben Finney wrote: > >> As a Windows user, I've "played" with it in the past, and found it >> frustratingly difficult. > > I hope many people here will find the guide published by the FSF, Email > Self-Defense , a useful walk > through how to set it up properly. That's about email, though, and as such irrelevant here. I have no interest in setting up GPG for my email. Part of what I meant by "intrusive" was "installs plugins for things like email and file encryption that I don't want". Part of my issue here is that people promoting signing tend to think of it as a way of life, rather than as an annoying little extra step that is needed for one specific activity (publishing to PyPI in the context of this thread). There's essentially nothing written from the POV of "you have no interest in signing, and are only doing it because someone's insisting that you do - so here's how to do the least possible to make them shut up". You may not agree with that attitude, but it is very common in my experience, and documents that start by trying to change the reader's opinion get discarded *remarkably* fast. But this is way off-topic, so I'll refrain from saying anything more. Paul From steve.dower at python.org Sun Mar 12 14:57:49 2017 From: steve.dower at python.org (Steve Dower) Date: Sun, 12 Mar 2017 11:57:49 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> Message-ID: FWIW, I dropped a portable version into the windows-installer externals that are pulled down by the release scripts (from svn.p.o). It does require me to import my key on new machines, but since I don't use it for anything but re-signing the releases it's worth it to avoid all the intrusions. So it's definitely possible, just a matter of finding and including the right dependencies to copy around. 
Cheers, Steve Top-posted from my Windows Phone -----Original Message----- From: "Paul Moore" Sent: ?3/?12/?2017 7:36 To: "Ben Finney" Cc: "Distutils" Subject: Re: [Distutils] GnuPG signatures on PyPI: why so few? On 12 March 2017 at 12:13, Ben Finney wrote: > >> As a Windows user, I've "played" with it in the past, and found it >> frustratingly difficult. > > I hope many people here will find the guide published by the FSF, Email > Self-Defense , a useful walk > through how to set it up properly. That's about email, though, and as such irrelevant here. I have no interest in setting up GPG for my email. Part of what I meant by "intrusive" was "installs plugins for things like email and file encryption that I don't want". Part of my issue here is that people promoting signing tend to think of it as a way of life, rather than as an annoying little extra step that is needed for one specific activity (publishing to PyPI in the context of this thread). There's essentially nothing written from the POV of "you have no interest in signing, and are only doing it because someone's insisting that you do - so here's how to do the least possible to make them shut up". You may not agree with that attitude, but it is very common in my experience, and documents that start by trying to change the reader's opinion get discarded *remarkably* fast. But this is way off-topic, so I'll refrain from saying anything more. Paul _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Sun Mar 12 15:51:13 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Sun, 12 Mar 2017 12:51:13 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85d1dmhevi.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> Message-ID: <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> > On Mar 12, 2017, at 5:13 AM, Ben Finney wrote: > > Paul Moore writes: > >> One tool that needs improvement to be easier to use for this to happen >> is GPG itself. > > No disagreement from me on that. And indeed, the GnuPG project's chronic > under-funding eventually drew attention from the new Core Infrastructure > Initiative to improve it > faster than was historically the case. > > This is thanks in large part to the amazing work of Nadia Eghbal > in drawing attention to how critical > free software, such as GnuPG, benefits society enormously and must > receive reliable funding from the organisations who benefit. > > If anyone reading this works for any organisation that wants to ensure > such critical free-software infrastructure continues to be consistently > funded and maintained, encourage regular financial contribution to the > Core Infrastructure Initiative > or similar projects. No disrespect to GPG's maintainers, who are indeed beleaguered and underfunded, but the poor usability of the tool isn't entirely down to a lack of resources. One reason we may not want to require or even encourage the use of GPG is that GPG is bad. 
Publishing your own heartfelt screed about why you used to like GPG but really, we need to abandon it now, has become the national sport of the information security community: https://blog.cryptographyengineering.com/2014/08/13/whats-matter-with-pgp/ https://blog.filippo.io/giving-up-on-long-term-pgp/ https://moxie.org/blog/gpg-and-me/ These posts are talking a lot about email, but many of the problems are just fundamental; in particular the "museum of 90s crypto" aspect is fundamentally un-solvable within the confines of the OpenPGP specification. "Unusable email clients" in this case could be replaced with "unusable packaging tooling". If you're retrieving packages from PyPI over TLS, they're already cryptographically signed at the time of retrieval, by an entity with a very good reputation in the community (the PSF) that you already have to trust anyway because that's where Python comes from. So if we could get away from GPG as a specific piece of tooling here and focus on the problem a detached GPG signature could solve, it's "direct trust of packagers rather than the index". The only way that Debian maintainers can supply this trust metadata right now is to manually populate debian/upstream/signing-key.asc. This is a terrible mechanism that is full of flaws, but requiring a human being to at least look at the keys is at least a potential benefit because maybe they'll notice that it's odd that the key got rotated. If PyPI required signatures from everybody then it would be very tempting to skip this manual step and just retrieve the signing key from the PyPI account uploading the packages, which is the exact same guarantee you had before via the crypto TLS gave you (i.e. the PSF via PyPI makes some highly ambiguous attestation as to the authenticity of the package, basically just "its name matches") but now you're involving a pile of highly-complex software with fundamentally worse crypto than OpenSSL would have given you. To summarize: Even if we only cared about supplying package upstreams to Debian (and that is a tiny part of PyPI's mission), right now, using the existing tooling of uscan and lintian, the only security value that could _possibly_ be conveyed here would be an out-of-band conversation between the maintainer and upstream about what their signing keys are and how the signing process works. Any kind of automation would make it less likely that would happen, which means that providing tool support to automate this process would actually make things worse. >> As a Windows user, I've "played" with it in the past, and found it >> frustratingly difficult. > > I hope many people here will find the guide published by the FSF, Email > Self-Defense , a useful walk > through how to set it up properly. > > -- > \ ?I must say that I find television very educational. The minute | > `\ somebody turns it on, I go to the library and read a book.? | > _o__) ?Groucho Marx | > Ben Finney > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Mar 13 03:45:28 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Mar 2017 17:45:28 +1000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? 
In-Reply-To: <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: On 13 March 2017 at 05:51, Glyph Lefkowitz wrote: > To summarize: Even if we only cared about supplying package upstreams to > Debian (and that is a tiny part of PyPI's mission), right now, using the > existing tooling of uscan and lintian, the only security value that could > _possibly_ be conveyed here would be an out-of-band conversation between > the maintainer and upstream about what their signing keys are and how the > signing process works. Any kind of automation would make it less likely > that would happen, which means that providing tool support to automate this > process would actually make things *worse*. > And much of the same benefits can be obtained by Debian and other third parties maintaining "known hashes" for historical PyPI releases and complaining if they ever change. The only aspect that end-to-end package signing can potentially help with is bypassing PyPI as a potential point of compromise for *new* never-before-seen releases, and much of *that* benefit can be gained by way of publishers providing a list of "expected artifact hashes" through a trusted channel that they control and the PyPI service can't influence. GPG signatures of the artifacts themselves is just one way of establishing that trusted information channel, and it's a particularly publisher-hostile one that's also pretty end-user-hostile as well. The TUF based approach in PEP 458 and PEP 480 has at least in principle support from both Donald and I, but in addition to still relying on HTTPS to bootstrap initial trust, it is also gated behind the Warehouse migration and shutting down the legacy PyPI implementation (which is a sufficiently tedious activity that we think the chances of achieving it with purely volunteer and part-time labour are basically zero). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From lele at metapensiero.it Mon Mar 13 05:47:54 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Mon, 13 Mar 2017 10:47:54 +0100 Subject: [Distutils] Best practice to build binary wheels on Github+Travis and upload to PyPI Message-ID: <871su1pkxh.fsf@metapensiero.it> Hi all, I'd like to learn how to configure a project I keep on Github so that at release time it will trigger a build of binary wheels for different versions of Python 3 and eventually uploading them to PyPI. At first I tried to follow the Travis deploy instruction[1], but while that works for source distribution it cannot be used to deploy binary wheels because AFAICT Travis does not build ?manylinux1?-marked wheels. I then found the manylinux-demo project[2] that uses Docker and contains a a script able to build the wheels for every available version of Python. OTOH, it does not tackle to PyPI upload step. I will try to distill a custom recipe for my own needs looking at how other packages implemented this goal, but I wonder if there is already some documentation that could help me understanding better how to intersect the above steps. Thanks in advance for any hint, ciao, lele. [1] https://docs.travis-ci.com/user/deployment/pypi/ [2] https://github.com/pypa/python-manylinux-demo -- nickname: Lele Gaifax | Quando vivr? 
di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From ralf.gommers at gmail.com Mon Mar 13 05:58:03 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 13 Mar 2017 22:58:03 +1300 Subject: [Distutils] Best practice to build binary wheels on Github+Travis and upload to PyPI In-Reply-To: <871su1pkxh.fsf@metapensiero.it> References: <871su1pkxh.fsf@metapensiero.it> Message-ID: On Mon, Mar 13, 2017 at 10:47 PM, Lele Gaifax wrote: > Hi all, > > I'd like to learn how to configure a project I keep on Github so that at > release time it will trigger a build of binary wheels for different > versions > of Python 3 and eventually uploading them to PyPI. > > At first I tried to follow the Travis deploy instruction[1], but while that > works for source distribution it cannot be used to deploy binary wheels > because AFAICT Travis does not build ?manylinux1?-marked wheels. > > I then found the manylinux-demo project[2] that uses Docker and contains a > a script able to build the wheels for every available version of Python. > OTOH, > it does not tackle to PyPI upload step. > > I will try to distill a custom recipe for my own needs looking at how other > packages implemented this goal, but I wonder if there is already some > documentation that could help me understanding better how to intersect the > above steps. > Multibuild is probably the best place to start: https://github.com/matthew-brett/multibuild Here's a relatively simple and up-to-date example of how to produce wheels for Windows, Linux and OS X automatically using multibuild: https://github.com/MacPython/pywavelets-wheels Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Mar 13 06:32:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Mar 2017 20:32:48 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 11 March 2017 at 14:17, Nick Coghlan wrote: > On 11 March 2017 at 07:03, Daniel Holth wrote: > >> You lost me a bit at 'extra sets'. FYI it is already possible to depend >> on your own extras in another extra. >> >> Extra pseudo code: >> spampackage >> extra['spam'] = 'spampackage[eggs]' >> extra['eggs'] = ... >> > > Oh, nice. In that case, we can drop the '*' idea and just make "all" > another pre-declared extra with a SHOULD that says sdist build tools should > implicitly populate it as: > > { > "requires": "thisproject[extra1,extra2,extra3,extra4]" > "extra": "all" > } > > given an extras clause containing '["extra1",'extra2","extra3","extra4"]'. > > Endorsing that approach to handling "extra sets" does impose a design > constraint though, which is that installation tools will need to > special-case self-referential requirements so they don't get stuck in a > recursive loop. 
(That will become a new MUST in the spec) > Next update: https://github.com/python/peps/commit/24cd02b34cea1bf35443048fd665485dffd0de93
- metadata version bumped to 3.0
- expected filename changed to pysdist.json and stated to be immutable once generated for a given release
- project obsolescence changes deferred to a possible future metadata extension
- no proposed changes to extras syntax and the "self" and "runtime" pseudo-extras dropped
- "all" added as an implied extra for all declared extras
- "alldev" added as an implied superset of "test", "build", "doc" and "dev"
Even though it's not strictly necessary, I'd still kind of like to have a standard way to say "install all the dev dependencies, but not the package itself or its runtime dependencies". I guess if we take distro build tools as an example though, they handle that as a separate command (e.g. "dnf builddep" vs "dnf install") rather than as a variation on the normal install command. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Mon Mar 13 11:23:25 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 13 Mar 2017 11:23:25 -0400 Subject: [Distutils] FYI - "Trending" on Warehouse Message-ID: Just an FYI, I've replaced the long-stagnant "top downloads" list on the Warehouse / pypi.org homepage with "Trending" projects. Since "trending" can mean a lot of different things as far as how it's computed, here's how I'm currently doing it [1]: Using a look back over the last 30 days of downloads I compute a "zscore" for each project for yesterday (effectively, how many standard deviations away from the mean yesterday was for that project in total downloads). The trending projects are then the top 5 projects in terms of zscore for yesterday (recomputed every day at ~3am UTC). Because it's a lot easier for a project with an average of 5 downloads to jump to 100 than it is for a project with 50000 downloads to jump to 1000000, I have tried to exclude any projects with very few downloads from this, so in order to qualify to be trending a project must receive at least 5,000 downloads in a month. If you happen to be some sort of sciencey person and you know of a better way to query what is effectively a table with a row for every download for every project to determine which ones are trending, feel free to open an issue or create a PR or something. I don't really know what I'm doing here :) Anyways, that's all! [1] https://github.com/pypa/warehouse/blob/a36435b9865000cdaae97b948af48c33f7d8fe8e/warehouse/packaging/tasks.py#L19-L102 - Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Mon Mar 13 13:46:02 2017 From: steve.dower at python.org (Steve Dower) Date: Mon, 13 Mar 2017 10:46:02 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: Another drive-by contribution: what if twine printed the hashes for anything it uploads with a message basically saying "here are the things you should publish somewhere for this release so people can check the validity of your packages after they download them"? I suspect many publishers have never considered this is something they could or should do.
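For illustration, a rough sketch of the z-score ranking Donald describes above might look like the following (this is not Warehouse's actual tasks.py code, and the in-memory input shape is an assumption; the real calculation runs against the per-download table in the database):

    import statistics

    def zscore(window, yesterday):
        # how many standard deviations yesterday's total sits from the mean
        # of the 30-day look-back window
        mean = statistics.mean(window)
        stdev = statistics.stdev(window)
        return (yesterday - mean) / stdev if stdev else 0.0

    def trending(downloads, minimum_monthly=5000, top_n=5):
        # downloads: {project: daily totals, oldest first, with yesterday as
        # the final entry} -- an assumed in-memory shape for the sketch
        eligible = {name: counts for name, counts in downloads.items()
                    if sum(counts) >= minimum_monthly}
        return sorted(eligible,
                      key=lambda name: zscore(eligible[name][:-1],
                                              eligible[name][-1]),
                      reverse=True)[:top_n]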
Some very basic prompting could easily lead to it becoming part of the normal workflow. Top-posted from my Windows Phone -----Original Message----- From: "Nick Coghlan" Sent: ?3/?13/?2017 0:53 To: "Glyph Lefkowitz" Cc: "DistUtils mailing list" ; "Ben Finney" Subject: Re: [Distutils] GnuPG signatures on PyPI: why so few? On 13 March 2017 at 05:51, Glyph Lefkowitz wrote: To summarize: Even if we only cared about supplying package upstreams to Debian (and that is a tiny part of PyPI's mission), right now, using the existing tooling of uscan and lintian, the only security value that could _possibly_ be conveyed here would be an out-of-band conversation between the maintainer and upstream about what their signing keys are and how the signing process works. Any kind of automation would make it less likely that would happen, which means that providing tool support to automate this process would actually make things worse. And much of the same benefits can be obtained by Debian and other third parties maintaining "known hashes" for historical PyPI releases and complaining if they ever change. The only aspect that end-to-end package signing can potentially help with is bypassing PyPI as a potential point of compromise for *new* never-before-seen releases, and much of *that* benefit can be gained by way of publishers providing a list of "expected artifact hashes" through a trusted channel that they control and the PyPI service can't influence. GPG signatures of the artifacts themselves is just one way of establishing that trusted information channel, and it's a particularly publisher-hostile one that's also pretty end-user-hostile as well. The TUF based approach in PEP 458 and PEP 480 has at least in principle support from both Donald and I, but in addition to still relying on HTTPS to bootstrap initial trust, it is also gated behind the Warehouse migration and shutting down the legacy PyPI implementation (which is a sufficiently tedious activity that we think the chances of achieving it with purely volunteer and part-time labour are basically zero). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 13 19:41:01 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 13 Mar 2017 16:41:01 -0700 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Fri, Mar 10, 2017 at 7:55 AM, Nick Coghlan wrote: > On 11 March 2017 at 00:52, Nathaniel Smith wrote: >> >> On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: >> > Hi folks, >> > >> > After a few years of dormancy, I've finally moved the metadata 2.0 >> > specification back to Draft status: >> > >> > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 >> >> We have lots of metadata files in the wild that already claim to be >> version 2.0. If you're reviving this I think you might need to change >> the version number? > > > They're mostly in metadata.json files, though. That said, version numbers > are cheap, so I'm happy to skip straight to 3.0 if folks think it makes more > sense. AFAICT bdist_wheel produces METADATA files with Metadata-Version: 2.0 by default, and has for some time. Certainly this one I just spot-checked does that. 
>> > Based on our last round of discussion, I've culled a lot of the >> > complexity >> > around dependency declarations, cutting it back to just 4 pre-declared >> > extras (dev, doc, build, test), >> >> I think we can drop 'build' in favor of pyproject.toml? > > > No, as that's a human edited input file, not an output file from the sdist > generation process. > >> >> Actually all of the pre-declared extras are really relevant for sdists >> rather than wheels. Maybe they should all move into pyproject.toml? > > > Think "static release metadata in an API response from PyPI" for this > particular specification, rather than something you'd necessarily check into > source control. That's actually one of the big benefits of doing this post > pyproject.toml - with that taking care of the build system bootstrapping > problem, it frees up pydist.json to be entirely an artifact of the sdist > generation process (and then copying it along to the wheel archives and the > installed package as well). > > That said, that's actually an important open question: is pydist.json always > preserved unmodified through the sdist->wheel->install and sdist->install > process? > > There's a lot to be said for treating the file as immutable, and instead > adding *other* metadata files as a component moves through the distribution > process. If so, then it may actually be more appropriate to call the > rendered file "pysdist.json", since it contains the sdist metadata > specifically, rather than arbitrary distribution metadata. I guess there are three possible kinds of build dependencies: - those that are known statically - those that are determined by running some code at sdist creation time - those that are determined by running some code at build time But all the examples I can think of fall into either bucket A (which pyproject.toml handles), or bucket C (which pydist.json can't handle). So it seems like the metadata here is either going to be redundant or wrong? I'm not sure I understand the motivation for wanting wheels to have a file which says "here's the metadata describing the sdist that you would have, if you had an sdist (which you don't)"? I guess it doesn't hurt anything, but it seems odd. > I'd also be fairly strongly opposed to converting extras from an optional > dependency management system to a "let multiple PyPI packages target the > same site-packages subdirectory" because we already know that's a nightmare > from the Linux distro experience (having a clear "main" package that owns > the parent directory with optional subpackages solves *some* of the > problems, but my main reaction is still "Run awaaay"). The "let multiple PyPI packages target the same site-packages directory" problem is orthogonal to the reified extras proposal. I actually think we can't avoid handling the same site-packages directory problem, but the solution is namespace packages and/or better Conflicts: metadata. Example illustrating why the site-packages conflict problem arises independently of reified extras: people want to distribute numpy built against different BLAS backends, especially MKL (which is good but zero-cost proprietary) versus OpenBLAS (which is not as good but is free). Right now that's possible by distributing 'numpy' and 'numpy-mkl' packages, but of course ugly stuff happens if you try to install both; some sort of Conflicts: metadata would help. If we instead have the packages be named 'numpy' and 'numpy[mkl]', then they're in exactly the same position with respect to conflicts. 
The very significant advantage is that we know that 'numpy[mkl]' "belongs to" the numpy project, so 'numpy[mkl]' can say 'Provides-Dist: numpy' without all the security issues that Provides-Dist otherwise runs into. Example illustrating why reifed extras are useful totally independently of site-packages conflicts: it would be REALLY NICE if numpy could say 'Provides-Dist: numpy[abi=7]' and then packages could depend on 'numpy[abi=7]' and have that do something sensible. This would be a pure virtual package. > It especially isn't needed just to solve the "pip forgets what extras it > installed" problem - that technically doesn't even need a PEP to resolve, it > just needs pip to drop a pip specific file into the PEP 376 dist-info > directory that says what extras to request when doing future upgrades. But that breaks if people use a package manager other than pip, which is something we want to support, right? And in any case it requires a bunch more redundant special-case logic inside pip, to basically make extras act like virtual packages. -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Tue Mar 14 00:23:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Mar 2017 14:23:55 +1000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: On 14 March 2017 at 03:46, Steve Dower wrote: > Another drive-by contribution: what if twine printed the hashes for > anything it uploads with a message basically saying "here are the things > you should publish somewhere for this release so people can check the > validity of your packages after they download them"? > > I suspect many publishers have never considered this is something they > could or should do. Some very basic prompting could easily lead to it > becoming part of the normal workflow. > Huh, and with most PyPI publishers using public version control systems, their source control repo itself could even serve as "a trusted channel that they control and the PyPI service can't influence". For example, the artifact hashes could be written out by default to: .released_artifacts//.sha256 And if twine sees the hash file exists before it starts the upload, it could complain that the given artifact had already been published even before PyPI complains about it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Tue Mar 14 01:48:19 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Mon, 13 Mar 2017 22:48:19 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> > On Mar 13, 2017, at 9:23 PM, Nick Coghlan wrote: > > On 14 March 2017 at 03:46, Steve Dower > wrote: > Another drive-by contribution: what if twine printed the hashes for anything it uploads with a message basically saying "here are the things you should publish somewhere for this release so people can check the validity of your packages after they download them"? > > I suspect many publishers have never considered this is something they could or should do. 
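To make that concrete, here is a minimal sketch of the pre-upload hash manifest idea (the directory layout is assumed, since the placeholders in the path above were lost in the archive, and nothing like this exists in twine today):

    import hashlib
    import pathlib

    def record_artifact_hash(artifact, project, root=".released_artifacts"):
        # refuse to (re-)upload anything whose hash is already on record in
        # the project's own source repository
        artifact = pathlib.Path(artifact)
        digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
        hash_file = pathlib.Path(root, project, artifact.name + ".sha256")
        if hash_file.exists():
            raise RuntimeError("%s already has a published hash on record"
                               % artifact.name)
        hash_file.parent.mkdir(parents=True, exist_ok=True)
        hash_file.write_text(digest + "\n")
        return digest

Committing the generated files alongside the source would give both the publisher and third parties the independent cross-check being discussed here.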
Some very basic prompting could easily lead to it becoming part of the normal workflow. > > Huh, and with most PyPI publishers using public version control systems, their source control repo itself could even serve as "a trusted channel that they control and the PyPI service can't influence". For example, the artifact hashes could be written out by default to: > > .released_artifacts//.sha256 > > And if twine sees the hash file exists before it starts the upload, it could complain that the given artifact had already been published even before PyPI complains about it. 1. This sounds like it could be very cool. 2. Except, as stated - i.e. hashes without signatures - this just means we all trust Github rather than PyPI :). 3. A simple signing scheme, like https://minilock.io but for plaintext signatures rather than encryption , could potentially address this problem. 4. Cool as that would be, someone would need to design that thing first, and that person would need to be a cryptographer. 5. Now all you need to do is design a globally addressable PKI system. Good luck everybody ;-). -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Tue Mar 14 01:55:08 2017 From: donald at stufft.io (Donald Stufft) Date: Tue, 14 Mar 2017 01:55:08 -0400 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: <84CD63B4-0EF0-475E-BA41-B6DA2C468A69@stufft.io> > On Mar 14, 2017, at 1:48 AM, Glyph Lefkowitz wrote: > > 3. A simple signing scheme, like https://minilock.io but for plaintext signatures rather than encryption , could potentially address this problem. This is basically the plan, using it in conjunction with TUF for the fiddly bits (Because simply signing files isn?t good enough). ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Mar 14 02:52:14 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Mar 2017 16:52:14 +1000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: On 14 March 2017 at 15:48, Glyph Lefkowitz wrote: > > 2. Except, as stated - i.e. hashes without signatures - this just means we > all trust Github rather than PyPI :). > Yeah, HTTPS would still be a common point of compromise - that kind of simple scheme would just let the repo hosting and PyPI serve as cross-checks on each other, such that you had to compromise both (or the original publisher's system) in order to corrupt both the published artifact *and* the publisher's record of the expected artifact hash. It would also be enough to let publishers check that the artifacts that PyPI is serving match what they originally uploaded - treating it as a QA problem as much as a security one. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Tue Mar 14 03:34:21 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Mar 2017 17:34:21 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 14 March 2017 at 09:41, Nathaniel Smith wrote: > On Fri, Mar 10, 2017 at 7:55 AM, Nick Coghlan wrote: > > On 11 March 2017 at 00:52, Nathaniel Smith wrote: > >> We have lots of metadata files in the wild that already claim to be > >> version 2.0. If you're reviving this I think you might need to change > >> the version number? > > > > They're mostly in metadata.json files, though. That said, version numbers > > are cheap, so I'm happy to skip straight to 3.0 if folks think it makes > more > > sense. > > AFAICT bdist_wheel produces METADATA files with Metadata-Version: 2.0 > by default, and has for some time. Certainly this one I just > spot-checked does that. > We could always retroactively declare "2.0" to just mean 1.3 + Provides-Extra + (optionally) Description-Content-Type (once that has been defined in a way that makes sense for PyPI). Either way, I'm convinced that the JSON based format should start out at 3.0. > > There's a lot to be said for treating the file as immutable, and instead > > adding *other* metadata files as a component moves through the > distribution > > process. If so, then it may actually be more appropriate to call the > > rendered file "pysdist.json", since it contains the sdist metadata > > specifically, rather than arbitrary distribution metadata. > > I guess there are three possible kinds of build dependencies: > - those that are known statically > - those that are determined by running some code at sdist creation time > - those that are determined by running some code at build time > > But all the examples I can think of fall into either bucket A (which > pyproject.toml handles), or bucket C (which pydist.json can't handle). > So it seems like the metadata here is either going to be redundant or > wrong? > pyproject.toml only handles the bootstrapping dependencies for the build system itself, it *doesn't* necessarily include all the build dependencies, which may be in tool specific files (like setup_requires in setup.py) or otherwise added by the build system without and record of it in pyproject.toml. The build system knows the latter when it generates the sdist, and it means PyPI can extract and republish them without having to actually invoke the build system. For dynamic dependencies where the environment marker system isn't flexible enough to express the installation conditions (so they can't be generated at sdist creation time), that will be something for the publishers of a particular project to resolve with the folks that want the ability to do builds in environments that are isolated from the internet, and hence can't download arbitrary additional dependencies at build time. > I'm not sure I understand the motivation for wanting wheels to have a > file which says "here's the metadata describing the sdist that you > would have, if you had an sdist (which you don't)"? I guess it doesn't > hurt anything, but it seems odd. > Wheels still have a corresponding source artifact, even if it hasn't been published anywhere using the Python-specific sdist format. 
Accordingly, I don't think it makes sense to be able to tell just from looking at a wheel file whether the generation process was: * tree -> sdist -> wheel; or * tree -> wheel > I'd also be fairly strongly opposed to converting extras from an optional > > dependency management system to a "let multiple PyPI packages target the > > same site-packages subdirectory" because we already know that's a > nightmare > > from the Linux distro experience (having a clear "main" package that owns > > the parent directory with optional subpackages solves *some* of the > > problems, but my main reaction is still "Run awaaay"). > > The "let multiple PyPI packages target the same site-packages > directory" problem is orthogonal to the reified extras proposal. I > actually think we can't avoid handling the same site-packages > directory problem, but the solution is namespace packages and/or > better Conflicts: metadata. > > Example illustrating why the site-packages conflict problem arises > independently of reified extras: people want to distribute numpy built > against different BLAS backends, especially MKL (which is good but > zero-cost proprietary) versus OpenBLAS (which is not as good but is > free). Right now that's possible by distributing 'numpy' and > 'numpy-mkl' packages, but of course ugly stuff happens if you try to > install both; some sort of Conflicts: metadata would help. If we > instead have the packages be named 'numpy' and 'numpy[mkl]', then > they're in exactly the same position with respect to conflicts. The > very significant advantage is that we know that 'numpy[mkl]' "belongs > to" the numpy project, so 'numpy[mkl]' can say 'Provides-Dist: numpy' > without all the security issues that Provides-Dist otherwise runs > into. > Do other components need to be rebuilt or relinked if the NumPy BLAS backend changes? If the answer is yes, then this is something I'd strongly prefer to leave to conda and other package management systems like Nix that better support parallel installation of multiple versions of C/C++ dependencies. If the answer is no, then it seems like a better solution might be to allow for rich dependencies, where numpy could depend on "_numpy_backends.openblas or _numpy_backends.mkl" and figure out the details of exactly what's available and which one it's going to use at import time. Either way, contorting the Extras system to try to cover such a significantly different set of needs doesn't seem like a good idea. > > Example illustrating why reifed extras are useful totally > independently of site-packages conflicts: it would be REALLY NICE if > numpy could say 'Provides-Dist: numpy[abi=7]' and then packages could > depend on 'numpy[abi=7]' and have that do something sensible. This > would be a pure virtual package. > PEP 459 has a whole separate "python.constraints" extension rather than trying to cover environmental constraints within the existing Extras system: https://www.python.org/dev/peps/pep-0459/#the-python-constraints-extension > > It especially isn't needed just to solve the "pip forgets what extras it > > installed" problem - that technically doesn't even need a PEP to > resolve, it > > just needs pip to drop a pip specific file into the PEP 376 dist-info > > directory that says what extras to request when doing future upgrades. > > But that breaks if people use a package manager other than pip, which > is something we want to support, right? 
And in any case it requires a > bunch more redundant special-case logic inside pip, to basically make > extras act like virtual packages. > OK, it would still need a PEP to make the file name and format standardised across tools. Either way, it's an "installed packages database" problem, not a software publication problem. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Tue Mar 14 10:35:13 2017 From: dholth at gmail.com (Daniel Holth) Date: Tue, 14 Mar 2017 14:35:13 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: The wheel command implements but never fully realized the commands 'wheel keygen', 'wheel sign' for a bundled signature scheme (where the signature is inside the signed file) inspired by JAR signing and based on Ed25519 primitives + JSON web signature / JSON web key. The idea was to have wheel automatically generate a signing key and always generate signed wheels, since it's impossible to verify signatures if there are none. Successive releases from the same author would tend to use the same keys; a TOFU (trust on first use) model, a-la ssh, would warn you if the key changed. The public keys would be distributed over a separate https:// server (perhaps the publisher's personal web page, or an application could publish a list of public keys for its dependencies as-tested). Instead of checking the hash of an exact release artifact, you could use a similar syntax to check against a particular public key and cover yourself for future releases. Instead of key revocation, you could let the only valid signing keys be the ones currently available at the key URL, like oauth2 https://www.googleapis.com/oauth2/v3/certs The goal you'd want to shoot for is not 'is this package good' but 'am I being targeted'. A log of timestamp signatures for everything uploaded to PyPI could be very powerful here and might even be useful without publisher signatures, so that you could at least know that you are downloading the same reasonably old version of package X that everyone else is using. If there was a publisher signature, the timestamp server would sign the publisher's signature asserting 'this signature was valid at time X'. On Tue, Mar 14, 2017 at 2:52 AM Nick Coghlan wrote: > On 14 March 2017 at 15:48, Glyph Lefkowitz > wrote: > > > 2. Except, as stated - i.e. hashes without signatures - this just means we > all trust Github rather than PyPI :). > > > Yeah, HTTPS would still be a common point of compromise - that kind of > simple scheme would just let the repo hosting and PyPI serve as > cross-checks on each other, such that you had to compromise both (or the > original publisher's system) in order to corrupt both the published > artifact *and* the publisher's record of the expected artifact hash. > > It would also be enough to let publishers check that the artifacts that > PyPI is serving match what they originally uploaded - treating it as a QA > problem as much as a security one. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Mar 14 12:05:23 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Mar 2017 09:05:23 -0700 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Tue, Mar 14, 2017 at 12:34 AM, Nick Coghlan wrote: > On 14 March 2017 at 09:41, Nathaniel Smith wrote: >> >> On Fri, Mar 10, 2017 at 7:55 AM, Nick Coghlan wrote: >> > On 11 March 2017 at 00:52, Nathaniel Smith wrote: >> > There's a lot to be said for treating the file as immutable, and instead >> > adding *other* metadata files as a component moves through the >> > distribution >> > process. If so, then it may actually be more appropriate to call the >> > rendered file "pysdist.json", since it contains the sdist metadata >> > specifically, rather than arbitrary distribution metadata. >> >> I guess there are three possible kinds of build dependencies: >> - those that are known statically >> - those that are determined by running some code at sdist creation time >> - those that are determined by running some code at build time >> >> But all the examples I can think of fall into either bucket A (which >> pyproject.toml handles), or bucket C (which pydist.json can't handle). >> So it seems like the metadata here is either going to be redundant or >> wrong? > > > pyproject.toml only handles the bootstrapping dependencies for the build > system itself, it *doesn't* necessarily include all the build dependencies, > which may be in tool specific files (like setup_requires in setup.py) or > otherwise added by the build system without and record of it in > pyproject.toml. The build system knows the latter when it generates the > sdist, and it means PyPI can extract and republish them without having to > actually invoke the build system. Currently there are cases where people use setup_requires for what's actually static metadata, sure, but that's just because there hasn't been any alternative. The main actual *needs* are: - static build dependencies - dynamic build dependencies determined at build time So it seems to me that we should encourage people to move static dependencies into the static metadata (pyproject.toml), and when they don't then we can treat them like build-time dependencies, which is a problem we need to solve anyway. Having special metadata for "sdist creation-time dependencies" strikes me as papering over the needless complexity of the current system by adding more complexity on top. I can see how it'd have some short-term benefits but it seems net-harmful in the long run IMHO. (If we need a hack to cover the transition period from secretly-static-setup_requires to actually-static-pyproject.toml, maybe we could teach the setuptools sdist command to push setup_requires into pyproject.toml? That'd be a pretty simple hack that wouldn't increase the surface area of our interoperability problems.) >> >> I'm not sure I understand the motivation for wanting wheels to have a >> file which says "here's the metadata describing the sdist that you >> would have, if you had an sdist (which you don't)"? I guess it doesn't >> hurt anything, but it seems odd. 
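As a purely hypothetical sketch of the transition hack floated above -- copying a project's effectively-static setup_requires into the PEP 518 [build-system] table when the sdist is created -- something along these lines would do (the helper name, the example requirements and the third-party 'toml' dependency are all assumptions):

    import toml  # third-party 'toml' package, assumed available

    def push_setup_requires_to_pyproject(setup_requires, path="pyproject.toml"):
        # expose the statically-known build requirements where front ends can
        # read them without running setup.py
        data = {"build-system": {"requires": list(setup_requires)}}
        with open(path, "w") as f:
            toml.dump(data, f)

    push_setup_requires_to_pyproject(["setuptools", "wheel", "cffi>=1.0"])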
> > > Wheels still have a corresponding source artifact, even if it hasn't been > published anywhere using the Python-specific sdist format. Accordingly, I > don't think it makes sense to be able to tell just from looking at a wheel > file whether the generation process was: > > * tree -> sdist -> wheel; or > * tree -> wheel My point is just that usually if I'm looking at artifact A, I don't care about metadata about artifact B :-). Suppose someone has one of these wheels with an sdist.json in it. My question is, under what circumstances are you imagining that they'd look at that sdist.json? What would they do with it? The only case I can think of is for provenance tracking of various kinds, but I don't think just throwing in the sdist metadata is a very good solution to that. If we want source->binary provenance tracking then I'd rather see something focused on that problem, like wheel metadata fields Sdist-SHA256, Build-Host, Build-Time, etc. This isn't what sdist metadata is designed for, so to the extent that it would help solve the problem it's by accident, incomplete. >> > I'd also be fairly strongly opposed to converting extras from an >> > optional >> > dependency management system to a "let multiple PyPI packages target the >> > same site-packages subdirectory" because we already know that's a >> > nightmare >> > from the Linux distro experience (having a clear "main" package that >> > owns >> > the parent directory with optional subpackages solves *some* of the >> > problems, but my main reaction is still "Run awaaay"). >> >> The "let multiple PyPI packages target the same site-packages >> directory" problem is orthogonal to the reified extras proposal. I >> actually think we can't avoid handling the same site-packages >> directory problem, but the solution is namespace packages and/or >> better Conflicts: metadata. >> >> Example illustrating why the site-packages conflict problem arises >> independently of reified extras: people want to distribute numpy built >> against different BLAS backends, especially MKL (which is good but >> zero-cost proprietary) versus OpenBLAS (which is not as good but is >> free). Right now that's possible by distributing 'numpy' and >> 'numpy-mkl' packages, but of course ugly stuff happens if you try to >> install both; some sort of Conflicts: metadata would help. If we >> instead have the packages be named 'numpy' and 'numpy[mkl]', then >> they're in exactly the same position with respect to conflicts. The >> very significant advantage is that we know that 'numpy[mkl]' "belongs >> to" the numpy project, so 'numpy[mkl]' can say 'Provides-Dist: numpy' >> without all the security issues that Provides-Dist otherwise runs >> into. > > > Do other components need to be rebuilt or relinked if the NumPy BLAS backend > changes? > > If the answer is yes, then this is something I'd strongly prefer to leave to > conda and other package management systems like Nix that better support > parallel installation of multiple versions of C/C++ dependencies. > > If the answer is no, then it seems like a better solution might be to allow > for rich dependencies, where numpy could depend on "_numpy_backends.openblas > or _numpy_backends.mkl" and figure out the details of exactly what's > available and which one it's going to use at import time. The answer is no, and it's unlikely that numpy will massively rewrite its internals because pip is missing a feature that every other packaging system has. 
> Either way, contorting the Extras system to try to cover such a > significantly different set of needs doesn't seem like a good idea. The advantage of the "reified extras" idea is that it actually *removes* features and complexity while *also* solving a bunch of problems that are intractable today. So from my point of view, it's the status quo that's contorted :-). >> >> Example illustrating why reifed extras are useful totally >> independently of site-packages conflicts: it would be REALLY NICE if >> numpy could say 'Provides-Dist: numpy[abi=7]' and then packages could >> depend on 'numpy[abi=7]' and have that do something sensible. This >> would be a pure virtual package. > > > PEP 459 has a whole separate "python.constraints" extension rather than > trying to cover environmental constraints within the existing Extras system: > https://www.python.org/dev/peps/pep-0459/#the-python-constraints-extension I feel like this is the old argument between whether the best way to handle a complex problem space is with a complex solution, or with several simple solutions that can be composed. We can't even get a dependency resolver that handles simple dist-to-dist dependencies, and you want to add a whole second kind of constraints with its own semantics? (Or really third kind, b/c extras are already a second kind once we start tracking them properly.) -n -- Nathaniel J. Smith -- https://vorpus.org From glyph at twistedmatrix.com Wed Mar 15 01:48:58 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Tue, 14 Mar 2017 22:48:58 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: <8D50A7F2-72AF-40E1-A68F-3BEDB6A9B7B9@twistedmatrix.com> The big problem here, of course, is "key management"; what happens when someone throws their laptop in a river. https://github.com/ahf/teneo indicates to me that it may be possible to use a KDF to get an Ed25519 key from a passphrase that the user remembers, minilock-style, largely mitigating that problem, assuming we can get users to remember stuff :-). -g > On Mar 14, 2017, at 7:35 AM, Daniel Holth wrote: > > The wheel command implements but never fully realized the commands 'wheel keygen', 'wheel sign' for a bundled signature scheme (where the signature is inside the signed file) inspired by JAR signing and based on Ed25519 primitives + JSON web signature / JSON web key. The idea was to have wheel automatically generate a signing key and always generate signed wheels, since it's impossible to verify signatures if there are none. Successive releases from the same author would tend to use the same keys; a TOFU (trust on first use) model, a-la ssh, would warn you if the key changed. The public keys would be distributed over a separate https:// server (perhaps the publisher's personal web page, or an application could publish a list of public keys for its dependencies as-tested). Instead of checking the hash of an exact release artifact, you could use a similar syntax to check against a particular public key and cover yourself for future releases. 
Instead of key revocation, you could let the only valid signing keys be the ones currently available at the key URL, like oauth2 https://www.googleapis.com/oauth2/v3/certs > > The goal you'd want to shoot for is not 'is this package good' but 'am I being targeted'. A log of timestamp signatures for everything uploaded to PyPI could be very powerful here and might even be useful without publisher signatures, so that you could at least know that you are downloading the same reasonably old version of package X that everyone else is using. If there was a publisher signature, the timestamp server would sign the publisher's signature asserting 'this signature was valid at time X'. > > On Tue, Mar 14, 2017 at 2:52 AM Nick Coghlan > wrote: > On 14 March 2017 at 15:48, Glyph Lefkowitz > wrote: > > 2. Except, as stated - i.e. hashes without signatures - this just means we all trust Github rather than PyPI :). > > Yeah, HTTPS would still be a common point of compromise - that kind of simple scheme would just let the repo hosting and PyPI serve as cross-checks on each other, such that you had to compromise both (or the original publisher's system) in order to corrupt both the published artifact *and* the publisher's record of the expected artifact hash. > > It would also be enough to let publishers check that the artifacts that PyPI is serving match what they originally uploaded - treating it as a QA problem as much as a security one. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Mar 15 06:37:45 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Mar 2017 20:37:45 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 15 March 2017 at 02:05, Nathaniel Smith wrote: > Having special metadata for "sdist creation-time dependencies" strikes > me as papering over the needless complexity of the current system by > adding more complexity on top. I can see how it'd have some short-term > benefits but it seems net-harmful in the long run IMHO. > How do you propose Warehouse should publish the static metadata? How should distlib abstract over the different metadata formats? Or perhaps I should just drop the whole section about "pysdist.json" files? It's orthogonal to the essential purpose of the PEP, and it seems to be confusing people more than it's helping (we *can* put these files in the sdists they describe, but we don't *have* to). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Wed Mar 15 13:06:42 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Mar 2017 17:06:42 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <8D50A7F2-72AF-40E1-A68F-3BEDB6A9B7B9@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> <8D50A7F2-72AF-40E1-A68F-3BEDB6A9B7B9@twistedmatrix.com> Message-ID: Or they could be printed as QR codes. 
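Pulling together two ideas from this thread -- Ed25519 signatures with TOFU pinning of the verify key, and a signing key derived from a passphrase via a KDF so that a lost laptop is recoverable -- a rough sketch using the PyNaCl library (not wheel's actual 'wheel sign' implementation) could look like:

    import nacl.encoding
    import nacl.pwhash
    import nacl.signing

    def signing_key_from_passphrase(passphrase, salt):
        # derive a deterministic 32-byte Ed25519 seed from the passphrase;
        # the salt must be 16 bytes for argon2id
        seed = nacl.pwhash.argon2id.kdf(
            32, passphrase, salt,
            opslimit=nacl.pwhash.argon2id.OPSLIMIT_INTERACTIVE,
            memlimit=nacl.pwhash.argon2id.MEMLIMIT_INTERACTIVE)
        return nacl.signing.SigningKey(seed)

    key = signing_key_from_passphrase(b"correct horse battery staple",
                                      b"per-user-salt-16")
    signed = key.sign(b"bytes of the artifact being uploaded")

    # the hex-encoded verify key is what a TOFU client would pin on first
    # use, or what would be published at the key URL mentioned above
    vk_hex = key.verify_key.encode(nacl.encoding.HexEncoder)
    nacl.signing.VerifyKey(vk_hex, encoder=nacl.encoding.HexEncoder).verify(signed)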
An interesting secure time service: https://roughtime.googlesource.com/roughtime On Wed, Mar 15, 2017 at 1:49 AM Glyph Lefkowitz wrote: > The big problem here, of course, is "key management"; what happens when > someone throws their laptop in a river. > > https://github.com/ahf/teneo indicates to me that it may be possible to > use a KDF to get an Ed25519 key from a passphrase that the user remembers, > minilock-style, largely mitigating that problem, assuming we can get users > to remember stuff :-). > > -g > > On Mar 14, 2017, at 7:35 AM, Daniel Holth wrote: > > The wheel command implements but never fully realized the commands 'wheel > keygen', 'wheel sign' for a bundled signature scheme (where the signature > is inside the signed file) inspired by JAR signing and based on Ed25519 > primitives + JSON web signature / JSON web key. The idea was to have wheel > automatically generate a signing key and always generate signed wheels, > since it's impossible to verify signatures if there are none. Successive > releases from the same author would tend to use the same keys; a TOFU > (trust on first use) model, a-la ssh, would warn you if the key changed. > The public keys would be distributed over a separate https:// server > (perhaps the publisher's personal web page, or an application could publish > a list of public keys for its dependencies as-tested). Instead of checking > the hash of an exact release artifact, you could use a similar syntax to > check against a particular public key and cover yourself for future > releases. Instead of key revocation, you could let the only valid signing > keys be the ones currently available at the key URL, like oauth2 > https://www.googleapis.com/oauth2/v3/certs > > The goal you'd want to shoot for is not 'is this package good' but 'am I > being targeted'. A log of timestamp signatures for everything uploaded to > PyPI could be very powerful here and might even be useful without publisher > signatures, so that you could at least know that you are downloading the > same reasonably old version of package X that everyone else is using. If > there was a publisher signature, the timestamp server would sign the > publisher's signature asserting 'this signature was valid at time X'. > > On Tue, Mar 14, 2017 at 2:52 AM Nick Coghlan wrote: > > On 14 March 2017 at 15:48, Glyph Lefkowitz > wrote: > > > 2. Except, as stated - i.e. hashes without signatures - this just means we > all trust Github rather than PyPI :). > > > Yeah, HTTPS would still be a common point of compromise - that kind of > simple scheme would just let the repo hosting and PyPI serve as > cross-checks on each other, such that you had to compromise both (or the > original publisher's system) in order to corrupt both the published > artifact *and* the publisher's record of the expected artifact hash. > > It would also be enough to let publishers check that the artifacts that > PyPI is serving match what they originally uploaded - treating it as a QA > problem as much as a security one. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jwilk at jwilk.net Wed Mar 15 13:36:14 2017 From: jwilk at jwilk.net (Jakub Wilk) Date: Wed, 15 Mar 2017 18:36:14 +0100 Subject: [Distutils] GnuPG signatures on PyPI: why so few? 
In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> Message-ID: <20170315173614.d7umatuvdep4mzcl@jwilk.net> * Ian Cordasco , 2017-03-11, 20:26: >What prospects are there for PyPI to have GnuPG-signed packages by default? Could you clarify what do you mean by "by default"? Do you mean that people who want to upload unsigned packages would have to jump through extra hoops, or something else? >Debian's UScan has the ability to find, download, and verify the GnuPG >signature for a package source release. FWIW, it's not only Debian. OpenSUSE and Arch (and hopefully all other major distros) have tools to automatically verify upstream OpenPGP signatures, too. -- Jakub Wilk From lele at metapensiero.it Thu Mar 16 08:13:05 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Thu, 16 Mar 2017 13:13:05 +0100 Subject: [Distutils] Best practice to build binary wheels on Github+Travis and upload to PyPI References: <871su1pkxh.fsf@metapensiero.it> Message-ID: <87a88l1ktq.fsf@metapensiero.it> Ralf Gommers writes: > Multibuild is probably the best place to start: > https://github.com/matthew-brett/multibuild Thank you Ralf, I will surely do some experiments building (pun intended) on top of that! ciao, lele. -- nickname: Lele Gaifax | Quando vivr? di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From opensource at ronnypfannschmidt.de Fri Mar 17 05:58:50 2017 From: opensource at ronnypfannschmidt.de (Ronny Pfannschmidt) Date: Fri, 17 Mar 2017 10:58:50 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location Message-ID: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Hi everyone, while looking over the recent peps i noticed that we keep a few inherent inefficiencies in where to find dist-info folders because they include version numbers, to get a distribution we have to search for it which is no longer really sensible as we no longer have multi-version installation in any upcoming standard. in order to address that i'd like to propose to switch from "{distribution}-{version}.dist-info/" to "{distribution}.dist-info/" given that it has been used since quite a while i would prefer a quick feedback loop from the ML before thinking about writing a PEP. -- Ronny -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 06:32:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 10:32:44 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 09:58, Ronny Pfannschmidt wrote: > while looking over the recent peps i noticed that we keep a few inherent > inefficiencies in where to find dist-info folders > > because they include version numbers, to get a distribution we have to > search for it > which is no longer really sensible as we no longer have multi-version > installation in any upcoming standard. > > in order to address that i'd like to propose to switch > > from "{distribution}-{version}.dist-info/" to "{distribution}.dist-info/" > > given that it has been used since quite a while i would prefer a quick > feedback loop from the ML before thinking about writing a PEP. +1 from me. And maybe explicitly state that installing multiple versions of a distribution is not supported. 
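To make the lookup cost concrete, a small sketch of what tools have to do today versus under the proposal ('requests' and the site-packages path are just example values):

    import glob
    import os.path

    site_packages = "/usr/lib/python3.6/site-packages"  # example path

    # today: the version is baked into the directory name, so tools must scan
    matches = glob.glob(os.path.join(site_packages, "requests-*.dist-info"))

    # proposed: one predictable name allows a direct existence check
    proposed = os.path.join(site_packages, "requests.dist-info")
    print(matches, os.path.isdir(proposed))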
Although this opens a somewhat larger can of worms, in that you can install different versions in separate directories - say in system and user site-packages - and that has subtle issues but is technically not rejected at the moment. So maybe restrict it to stating that installing multiple versions of a distribution *in the same directory* is not supported and duck the bigger issue for now. Paul From leorochael at gmail.com Fri Mar 17 08:50:02 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Fri, 17 Mar 2017 09:50:02 -0300 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 07:32, Paul Moore wrote: > On 17 March 2017 at 09:58, Ronny Pfannschmidt < > opensource at ronnypfannschmidt.de> wrote: > > [...] > > in order to address that i'd like to propose to switch > > > > from "{distribution}-{version}.dist-info/" to > "{distribution}.dist-info/" > > > > given that it has been used since quite a while i would prefer a quick > > feedback loop from the ML before thinking about writing a PEP. > > +1 from me. And maybe explicitly state that installing multiple > versions of a distribution is not supported. Although this opens a > somewhat larger can of worms, in that you can install different > versions in separate directories - say in system and user > site-packages - and that has subtle issues but is technically not > rejected at the moment. People today rely on being able to install different versions of packages already installed in other directories. System, vs User site-packages, as you mentioned is one example. The `--system-site-packages` switch to `virtualenv` is another. In my experience, many projects rely on pre-packaged hard-to-build system packages, while using virtualenv to install more up-to-date versions project dependencies. So maybe restrict it to stating that > installing multiple versions of a distribution *in the same directory* > is not supported and duck the bigger issue for now. > This is already the case everywhere. Even setuptools' `easy_install`, while capable of installing multiple versions of the same project in the same site-packages directory, is in reality installing each one to it's own `.egg` directory inside `site-packages` and can keep only one of them "active" at a time. ("active" meaning: importable without an explicit incantation to request a different installed version). I'm +0 on this proposal (the lack of enthusiasm coming from the fact that multiple projects will be affected), but I'm -lots on any proposal forbidding installation of different versions in different directories. Cheers, Leo -------------- next part -------------- An HTML attachment was scrubbed... URL: From freddyrietdijk at fridh.nl Fri Mar 17 09:04:23 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Fri, 17 Mar 2017 14:04:23 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: Users may want to split installations over multiple folders and creating a working environment by merging them (through symlinks / PYTHONPATH / sys.path / site / .pth). 
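For reference, the kind of merging described just above needs nothing more than the standard library (a sketch; the extra directory is an example path):

    import site
    import sys

    # appends the directory to sys.path and processes any .pth files in it
    site.addsitedir("/opt/shared/site-packages")
    print(sys.path[-1])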
Python doesn't support multiple versions of the same module during runtime, and therefore I don't see any benefit in including a "unique" identifier like the version number (which in fact isn't unique, because one could also have multiple instances of the same version but e.g. build differently). In any case, forbidding installation of different versions in different directories is not possible because, as I mentioned, one can create an environment by simply merging site-packages folders. On Fri, Mar 17, 2017 at 1:50 PM, Leonardo Rochael Almeida < leorochael at gmail.com> wrote: > > > On 17 March 2017 at 07:32, Paul Moore wrote: > >> On 17 March 2017 at 09:58, Ronny Pfannschmidt > ronnypfannschmidt.de> wrote: >> > [...] >> > in order to address that i'd like to propose to switch >> > >> > from "{distribution}-{version}.dist-info/" to >> "{distribution}.dist-info/" >> > >> > given that it has been used since quite a while i would prefer a quick >> > feedback loop from the ML before thinking about writing a PEP. >> >> +1 from me. And maybe explicitly state that installing multiple >> versions of a distribution is not supported. Although this opens a >> somewhat larger can of worms, in that you can install different >> versions in separate directories - say in system and user >> site-packages - and that has subtle issues but is technically not >> rejected at the moment. > > > People today rely on being able to install different versions of packages > already installed in other directories. System, vs User site-packages, as > you mentioned is one example. > > The `--system-site-packages` switch to `virtualenv` is another. In my > experience, many projects rely on pre-packaged hard-to-build system > packages, while using virtualenv to install more up-to-date versions > project dependencies. > > So maybe restrict it to stating that >> installing multiple versions of a distribution *in the same directory* >> is not supported and duck the bigger issue for now. >> > > This is already the case everywhere. Even setuptools' `easy_install`, > while capable of installing multiple versions of the same project in the > same site-packages directory, is in reality installing each one to it's own > `.egg` directory inside `site-packages` and can keep only one of > them "active" at a time. ("active" meaning: importable without an explicit > incantation to request a different installed version). > > I'm +0 on this proposal (the lack of enthusiasm coming from the fact that > multiple projects will be affected), but I'm -lots on any proposal > forbidding installation of different versions in different directories. > > Cheers, > > Leo > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 09:25:19 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 13:25:19 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 12:50, Leonardo Rochael Almeida wrote: > I'm +0 on this proposal (the lack of enthusiasm coming from the fact that > multiple projects will be affected), but I'm -lots on any proposal > forbidding installation of different versions in different directories. 
The comment I made was simply that having different versions in different directories *both of which are on sys.path at the same time* is invalid - it is now, and nothing I suggested changed anything, I was just suggesting documenting that fact. (That setup might work, as long as *all* of the files in the "inactive" version are completely shadowed by the active version, and yes that's normally the case, but nothing in the import or packaging infrastructure makes sure that's the case, so there's a risk of unexpected bugs). But whatever. If people don't want to document the restriction, that's OK. Doesn't mean it's going to work any better, of course :-) Paul From donald at stufft.io Fri Mar 17 09:34:13 2017 From: donald at stufft.io (Donald Stufft) Date: Fri, 17 Mar 2017 09:34:13 -0400 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> > On Mar 17, 2017, at 9:25 AM, Paul Moore wrote: > > On 17 March 2017 at 12:50, Leonardo Rochael Almeida > wrote: >> I'm +0 on this proposal (the lack of enthusiasm coming from the fact that >> multiple projects will be affected), but I'm -lots on any proposal >> forbidding installation of different versions in different directories. > > The comment I made was simply that having different versions in > different directories *both of which are on sys.path at the same time* > is invalid - it is now, and nothing I suggested changed anything, I > was just suggesting documenting that fact. (That setup might work, as > long as *all* of the files in the "inactive" version are completely > shadowed by the active version, and yes that's normally the case, but > nothing in the import or packaging infrastructure makes sure that's > the case, so there's a risk of unexpected bugs). > > But whatever. If people don't want to document the restriction, that's > OK. Doesn't mean it's going to work any better, of course :-) > > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig Hmm, I believe it generally works fine does it not? The only situations I can think of where it does something funny are: (1) PEP 420 namespace packages where a file was added or removed in one of the versions (since that is impossible to differentiate from two different projects using the same namespace) (2) Uninstalling/Installing the same package during the lifetime of a process (which is already going to break in weird ways). What scenarios are you seeing two installs of the same package into different sys.path directories fail? ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 10:00:53 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 14:00:53 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 17 March 2017 at 13:34, Donald Stufft wrote: > Hmm, I believe it generally works fine does it not? 
The only situations I > can think of where it does something funny are: > > (1) PEP 420 namespace packages where a file was added or removed in one of > the versions (since that is impossible to differentiate from two different > projects using the same namespace) > (2) Uninstalling/Installing the same package during the lifetime of a > process (which is already going to break in weird ways). > > What scenarios are you seeing two installs of the same package into > different sys.path directories fail? I don't have an actual failure, although I do think I've seen reports in the past - it's definitely something that's come up in previous discussions. As a theoretical example: foo 1.0 looks like this: foo __init__.py bar.py foo 2.0 moves the functionality of foo/bar.py into baz.py foo __init__.py baz.py Put both of these on sys.path, then you can successfully import foo.bar and foo.baz. Which is of course wrong. Furthermore, which version of foo/__init__.py gets imported depends on which version of foo is first on sys.path, so one of bar and baz will be using the wrong foo. IMO, of course, the answer is simply "don't do that". But I'm OK with simply leaving things as they stand if no-one else thinks it's worth making an issue of it. Paul From opensource at ronnypfannschmidt.de Fri Mar 17 10:04:03 2017 From: opensource at ronnypfannschmidt.de (Ronny Pfannschmidt) Date: Fri, 17 Mar 2017 15:04:03 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: <3617706d-5873-8a8f-8f9d-7e3187674628@ronnypfannschmidt.de> btw, thats irrelevant for dist-info either it gets found double and is a problem or its priority-overridden On 17.03.2017 15:00, Paul Moore wrote: > On 17 March 2017 at 13:34, Donald Stufft wrote: >> Hmm, I believe it generally works fine does it not? The only situations I >> can think of where it does something funny are: >> >> (1) PEP 420 namespace packages where a file was added or removed in one of >> the versions (since that is impossible to differentiate from two different >> projects using the same namespace) >> (2) Uninstalling/Installing the same package during the lifetime of a >> process (which is already going to break in weird ways). >> >> What scenarios are you seeing two installs of the same package into >> different sys.path directories fail? > I don't have an actual failure, although I do think I've seen reports > in the past - it's definitely something that's come up in previous > discussions. > > As a theoretical example: > > foo 1.0 looks like this: > > foo > __init__.py > bar.py > > foo 2.0 moves the functionality of foo/bar.py into baz.py > > foo > __init__.py > baz.py > > Put both of these on sys.path, then you can successfully import > foo.bar and foo.baz. Which is of course wrong. Furthermore, which > version of foo/__init__.py gets imported depends on which version of > foo is first on sys.path, so one of bar and baz will be using the > wrong foo. > > IMO, of course, the answer is simply "don't do that". But I'm OK with > simply leaving things as they stand if no-one else thinks it's worth > making an issue of it. 
> Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From p.f.moore at gmail.com Fri Mar 17 10:11:26 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 14:11:26 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <3617706d-5873-8a8f-8f9d-7e3187674628@ronnypfannschmidt.de> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> <3617706d-5873-8a8f-8f9d-7e3187674628@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 14:04, Ronny Pfannschmidt wrote: > btw, thats irrelevant for dist-info > either it gets found double and is a problem > or its priority-overridden Agreed this is not relevant for dist-info (and sorry for diverting the discussion into a side-issue). For dist-info I see no need to version it, and I'm +1 on the renaming you propose. Paul From ncoghlan at gmail.com Fri Mar 17 10:35:05 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 00:35:05 +1000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 19:58, Ronny Pfannschmidt < opensource at ronnypfannschmidt.de> wrote: > Hi everyone, > > while looking over the recent peps i noticed that we keep a few inherent > inefficiencies in where to find dist-info folders > > because they include version numbers, to get a distribution we have to > search for it > which is no longer really sensible as we no longer have multi-version > installation in any upcoming standard. > Linux distros still use multi-version installation fairly regularly - it's how services like EPEL are able to offer parallel installs of frameworks and libraries that are also in base RHEL/CentOS without breaking anything. The associated code to populate __main__.__requires__ and hence get pkg_resources.require() to do the right thing isn't pretty, but it *does* work. While I expect tech like virtual environments, Software Collections, FlatPak, Snappy, etc, to eventually get us to the point where even Linux distros don't need parallel installs into the system site-packages any more, we're still a *looong* way from it being reasonable to assume that we can just drop parallel install support from the Python packaging tools in general. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 17 10:40:33 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 00:40:33 +1000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 18 March 2017 at 00:00, Paul Moore wrote: > As a theoretical example: > > foo 1.0 looks like this: > > foo > __init__.py > bar.py > > foo 2.0 moves the functionality of foo/bar.py into baz.py > > foo > __init__.py > baz.py > > Put both of these on sys.path, then you can successfully import > foo.bar and foo.baz. Which is of course wrong. 
Furthermore, which > version of foo/__init__.py gets imported depends on which version of > foo is first on sys.path, so one of bar and baz will be using the > wrong foo. > Unless the __init__.py has its own __path__ extension code, whichever version of "foo" is first on sys.path will "win", and you won't be able to import from the other one (so you'll be able to import "foo.bar" or "foo.baz", but not both). That's not an accident, it's behaviour that was deliberately kept for backwards compatibility reasons when PEP 420's native namespace package support was being designed. You only get the "you can import both of them" behaviour if "foo" is a namespace package, at which point "foo" itself doesn't really have a version any more. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 10:47:48 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 14:47:48 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 17 March 2017 at 14:40, Nick Coghlan wrote: > Unless the __init__.py has its own __path__ extension code, whichever > version of "foo" is first on sys.path will "win", and you won't be able to > import from the other one (so you'll be able to import "foo.bar" or > "foo.baz", but not both). That's not an accident, it's behaviour that was > deliberately kept for backwards compatibility reasons when PEP 420's native > namespace package support was being designed. Really? OK, I feel stupid now, I've been making a fuss over something that's actually not possible. I should have tested this. My apologies (in my defense, I could have sworn I remembered someone else making precisely this point sometime in the past, but I guess I'll have to put that down to advancing age and brain decay...) My apologies, I stand corrected. Paul From dholth at gmail.com Fri Mar 17 11:14:04 2017 From: dholth at gmail.com (Daniel Holth) Date: Fri, 17 Mar 2017 15:14:04 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: At least *.egg-info works without-a-version and will be written that way when installed in development mode. As a shortcut when possible pkg_resources would read the package name and version number from the filename and not bother looking inside the metadata file until necessary. I don't recall whether the same can be said of *.dist-info. You will probably find that pkg_resources always does a listdir for each entry on sys.path when it runs. You might not be able to avoid that. On Fri, Mar 17, 2017 at 10:48 AM Paul Moore wrote: On 17 March 2017 at 14:40, Nick Coghlan wrote: > Unless the __init__.py has its own __path__ extension code, whichever > version of "foo" is first on sys.path will "win", and you won't be able to > import from the other one (so you'll be able to import "foo.bar" or > "foo.baz", but not both). That's not an accident, it's behaviour that was > deliberately kept for backwards compatibility reasons when PEP 420's native > namespace package support was being designed. Really? OK, I feel stupid now, I've been making a fuss over something that's actually not possible. 
I should have tested this. My apologies (in my defense, I could have sworn I remembered someone else making precisely this point sometime in the past, but I guess I'll have to put that down to advancing age and brain decay...) My apologies, I stand corrected. Paul _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Fri Mar 17 13:19:02 2017 From: robin at reportlab.com (Robin Becker) Date: Fri, 17 Mar 2017 17:19:02 +0000 Subject: [Distutils] reproducible builds Message-ID: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> An issue has been raised for reportlab to support a specific environment variable namely SOURCE_DATE_EPOCH. The intent is that we should get our time from this variable rather than time.localtime(time.time()) so that produced documents are more invariant. First off is this a reasonable request? The variable is defined by debian here https://reproducible-builds.org/specs/source-date-epoch/ What happens if other distros decide not to use this environment variable? Do I really want distro specific code in the package? In addition we already have our own mechanism for making the produced documents invariant although it might require an extension to support externally specified date & time as in the debian variable. In short where does the distro responsibility and package maintainers boundary need to be? -- Robin Becker From bussonniermatthias at gmail.com Fri Mar 17 13:33:37 2017 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Fri, 17 Mar 2017 10:33:37 -0700 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: On Fri, Mar 17, 2017 at 10:19 AM, Robin Becker wrote: > An issue has been raised for reportlab to support a specific environment > variable namely SOURCE_DATE_EPOCH. The intent is that we should get our time > from this variable rather than time.localtime(time.time()) so that produced > documents are more invariant. > > First off is this a reasonable request? The variable is defined by debian > here https://reproducible-builds.org/specs/source-date-epoch/ > > What happens if other distros decide not to use this environment variable? > Do I really want distro specific code in the package? For what it is worth, it seem like it will make its way into CPython as well: https://github.com/python/cpython/pull/296 And IFAICT, this env variable naming is already more than just debian. -- M > > In addition we already have our own mechanism for making the produced > documents > invariant although it might require an extension to support externally > specified date & time as in the debian variable. > > In short where does the distro responsibility and package maintainers > boundary need to be? 
> -- > Robin Becker > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From opensource at ronnypfannschmidt.de Fri Mar 17 13:40:41 2017 From: opensource at ronnypfannschmidt.de (Ronny Pfannschmidt) Date: Fri, 17 Mar 2017 18:40:41 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: <899f9a2f-fd31-6514-78d7-570a8614bc05@ronnypfannschmidt.de> On 17.03.2017 15:35, Nick Coghlan wrote: > On 17 March 2017 at 19:58, Ronny Pfannschmidt > > wrote: > > Hi everyone, > > while looking over the recent peps i noticed that we keep a few > inherent inefficiencies in where to find dist-info folders > > because they include version numbers, to get a distribution we > have to search for it > which is no longer really sensible as we no longer have > multi-version installation in any upcoming standard. > > Linux distros still use multi-version installation fairly regularly - > it's how services like EPEL are able to offer parallel installs of > frameworks and libraries that are also in base RHEL/CentOS without > breaking anything. > > The associated code to populate __main__.__requires__ and hence get > pkg_resources.require() to do the right thing isn't pretty, but it > *does* work. > as far as i understood, such dreaded code just fixes up sys.path, and thus the precedence will solve the issue dropping version strings from dist info does not prevent walking sys.path in order so i don't think it will break anything. note that im not talking about dropping general multi version install or setuptools multi version install, im talking about removing the version number from the dist-info folder as for all installation schemes that use it, there is exactly one version in precedence order on sys.path Cheers, Ronny > While I expect tech like virtual environments, Software Collections, > FlatPak, Snappy, etc, to eventually get us to the point where even > Linux distros don't need parallel installs into the system > site-packages any more, we're still a *looong* way from it being > reasonable to assume that we can just drop parallel install support > from the Python packaging tools in general. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com > | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Fri Mar 17 13:49:09 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Fri, 17 Mar 2017 17:49:09 +0000 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <1489772949.570750.914848472.77422451@webmail.messagingengine.com> Flit already supports $SOURCE_DATE_EPOCH for building wheels. I think the environment variable is a good idea: if it gets wide support, you will be able to set a single thing to affect lots of different build tools, rather than working out where you need to add command line arguments to half a dozen different build steps. Thomas On Fri, Mar 17, 2017, at 05:33 PM, Matthias Bussonnier wrote: > On Fri, Mar 17, 2017 at 10:19 AM, Robin Becker > wrote: > > An issue has been raised for reportlab to support a specific environment > > variable namely SOURCE_DATE_EPOCH. 
The intent is that we should get our time > > from this variable rather than time.localtime(time.time()) so that produced > > documents are more invariant. > > > > First off is this a reasonable request? The variable is defined by debian > > here https://reproducible-builds.org/specs/source-date-epoch/ > > > > What happens if other distros decide not to use this environment variable? > > Do I really want distro specific code in the package? > > For what it is worth, it seem like it will make its way into CPython as > well: > https://github.com/python/cpython/pull/296 > > And IFAICT, this env variable naming is already more than just debian. > > -- > M > > > > > > In addition we already have our own mechanism for making the produced > > documents > > invariant although it might require an extension to support externally > > specified date & time as in the debian variable. > > > > In short where does the distro responsibility and package maintainers > > boundary need to be? > > -- > > Robin Becker > > _______________________________________________ > > Distutils-SIG maillist - Distutils-SIG at python.org > > https://mail.python.org/mailman/listinfo/distutils-sig > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From doko at ubuntu.com Fri Mar 17 13:46:19 2017 From: doko at ubuntu.com (Matthias Klose) Date: Fri, 17 Mar 2017 18:46:19 +0100 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <314e6cba-78c5-bfc9-1cd5-b8fe92ca86d6@ubuntu.com> On 17.03.2017 18:19, Robin Becker wrote: > An issue has been raised for reportlab to support a specific environment > variable namely SOURCE_DATE_EPOCH. The intent is that we should get our time > from this variable rather than time.localtime(time.time()) so that produced > documents are more invariant. > > First off is this a reasonable request? The variable is defined by debian here > https://reproducible-builds.org/specs/source-date-epoch/ > > What happens if other distros decide not to use this environment variable? Do I > really want distro specific code in the package? > > In addition we already have our own mechanism for making the produced documents > invariant although it might require an extension to support externally specified > date & time as in the debian variable. > > In short where does the distro responsibility and package maintainers boundary > need to be? the reproducible-builds thing is not just a Debian thing, it's supported by other distros and upstream projects. Matthias From freddyrietdijk at fridh.nl Fri Mar 17 13:55:08 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Fri, 17 Mar 2017 18:55:08 +0100 Subject: [Distutils] reproducible builds In-Reply-To: <314e6cba-78c5-bfc9-1cd5-b8fe92ca86d6@ubuntu.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <314e6cba-78c5-bfc9-1cd5-b8fe92ca86d6@ubuntu.com> Message-ID: Nixpkgs [1] uses SOURCE_DATE_EPOCH as well. We can reproducibly build the Python interpreter (and packages with [2]). [1] https://github.com/NixOS/nixpkgs [2] https://bitbucket.org/pypa/wheel/pull-requests/77 On Fri, Mar 17, 2017 at 6:46 PM, Matthias Klose wrote: > On 17.03.2017 18:19, Robin Becker wrote: > > An issue has been raised for reportlab to support a specific environment > > variable namely SOURCE_DATE_EPOCH. 
The intent is that we should get our > time > > from this variable rather than time.localtime(time.time()) so that > produced > > documents are more invariant. > > > > First off is this a reasonable request? The variable is defined by > debian here > > https://reproducible-builds.org/specs/source-date-epoch/ > > > > What happens if other distros decide not to use this environment > variable? Do I > > really want distro specific code in the package? > > > > In addition we already have our own mechanism for making the produced > documents > > invariant although it might require an extension to support externally > specified > > date & time as in the debian variable. > > > > In short where does the distro responsibility and package maintainers > boundary > > need to be? > > the reproducible-builds thing is not just a Debian thing, it's supported by > other distros and upstream projects. > > Matthias > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dw+distutils-sig at hmmz.org Fri Mar 17 13:49:28 2017 From: dw+distutils-sig at hmmz.org (David Wilson) Date: Fri, 17 Mar 2017 17:49:28 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <20170317174928.GA3392@k3> Hey Robin, > What happens if other distros decide not to use this environment variable? > Do I really want distro specific code in the package? AFAIK this is seeing a great deal of use outside of Debian and even Linux, for instance GCC also supports this variable. > In short where does the distro responsibility and package maintainers > boundary need to be? I guess it mostly comes down to whether you'd like them to carry the debt of a vendor patch to implement the behaviour for you in a way you don't like, or you'd prefer to retain full control. :) So it's more a preference than a responsibility. David From leorochael at gmail.com Fri Mar 17 17:04:11 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Fri, 17 Mar 2017 18:04:11 -0300 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 17 March 2017 at 11:47, Paul Moore wrote: > On 17 March 2017 at 14:40, Nick Coghlan wrote: > > [...] whichever > > version of "foo" is first on sys.path will "win", and you won't be able > to > > import from the other one (so you'll be able to import "foo.bar" or > > "foo.baz", but not both). [...] > > Really? OK, I feel stupid now, I've been making a fuss over something > that's actually not possible. I should have tested this. My apologies > (in my defense, I could have sworn I remembered someone else making > precisely this point sometime in the past, but I guess I'll have to > put that down to advancing age and brain decay...) > Well, as Nick mentioned, if the `foo` Python package is a namespace package in both foo 1.0 and foo 2.0 distributions, then, yes, both `bar` and `baz` would be importable, and this is a case that should be documented somewhere. So, your point is not without merit. 
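(The difference is easy to demonstrate. The sketch below is illustrative only - it builds throwaway copies of the hypothetical layouts from earlier in the thread in a temporary directory and needs Python 3.5+; with regular packages the first `foo` on sys.path shadows the other completely, while with PEP 420 namespace packages the two directories merge and both submodules import.)

```python
import os
import subprocess
import sys
import tempfile

def make_tree(root, with_init, submodule):
    # Create <root>/foo/[__init__.py,]<submodule>.py and return <root>.
    pkg = os.path.join(root, "foo")
    os.makedirs(pkg)
    if with_init:
        open(os.path.join(pkg, "__init__.py"), "w").close()
    open(os.path.join(pkg, submodule + ".py"), "w").close()
    return root

def try_imports(first, second):
    # Fresh interpreter with both trees on sys.path; report what imports.
    code = ("import importlib\n"
            "for mod in ('foo.bar', 'foo.baz'):\n"
            "    try:\n"
            "        importlib.import_module(mod)\n"
            "        print(mod, 'OK')\n"
            "    except ImportError:\n"
            "        print(mod, 'missing')\n")
    env = dict(os.environ, PYTHONPATH=os.pathsep.join([first, second]))
    subprocess.run([sys.executable, "-c", code], env=env)

base = tempfile.mkdtemp()
# Regular packages (both have __init__.py): only foo.bar is importable.
try_imports(make_tree(os.path.join(base, "v1"), True, "bar"),
            make_tree(os.path.join(base, "v2"), True, "baz"))
# PEP 420 namespace packages (no __init__.py): both submodules import.
try_imports(make_tree(os.path.join(base, "n1"), False, "bar"),
            make_tree(os.path.join(base, "n2"), False, "baz"))
```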
However, most projects with namespace packages tend to be careful with the mapping between packages and dists as the namespace package itself then becomes a shared space that should be populated carefully, so such an issue should be exceedingly rare. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Mar 18 03:20:54 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 17:20:54 +1000 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: On 18 March 2017 at 03:19, Robin Becker wrote: > An issue has been raised for reportlab to support a specific environment > variable namely SOURCE_DATE_EPOCH. The intent is that we should get our > time from this variable rather than time.localtime(time.time()) so that > produced documents are more invariant. > > First off is this a reasonable request? The variable is defined by debian > here https://reproducible-builds.org/specs/source-date-epoch/ > > What happens if other distros decide not to use this environment variable? > Do I really want distro specific code in the package? > While the reproducible builds effort started in Debian and is furthest advanced there, it's not distro specific - interested developers working on other distros were already looking into it, and the Core Infrastructure Initiative has backed it as one of their security assurance initiatives. Software Freedom Conservancy have a decent write-up on the current state of things after December's Reproducible Builds Summit: https://sfconservancy.org/blog/2016/dec/26/reproducible-builds-summit-report/ However, you'll probably want to make yourself a helper function that uses SOURCE_DATE_EPOCH if defined, and falls back to the current time otherwise. That way you'll get reproducible behaviour when a build system configures the setting, while retaining your current behaviour for environments that don't. Cheers, Nick. P.S. A question well worth asking for *us* is whether or not setting SOURCE_DATE_EPOCH appropriately (if it isn't already set in the current environment) should be part of the build system abstraction PEPs. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Mar 18 03:49:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 17:49:48 +1000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 18 March 2017 at 07:04, Leonardo Rochael Almeida wrote: > On 17 March 2017 at 11:47, Paul Moore wrote: > >> On 17 March 2017 at 14:40, Nick Coghlan wrote: >> > [...] whichever >> > version of "foo" is first on sys.path will "win", and you won't be able >> to >> > import from the other one (so you'll be able to import "foo.bar" or >> > "foo.baz", but not both). [...] >> >> Really? OK, I feel stupid now, I've been making a fuss over something >> that's actually not possible. I should have tested this. My apologies >> (in my defense, I could have sworn I remembered someone else making >> precisely this point sometime in the past, but I guess I'll have to >> put that down to advancing age and brain decay...) 
>> > > Well, as Nick mentioned, if the `foo` Python package is a namespace > package in both foo 1.0 and foo 2.0 distributions, then, yes, both `bar` > and `baz` would be importable, and this is a case that should be documented > somewhere. > > So, your point is not without merit. > If I recall correctly, it was also a problem in some of the suggestions made during the discussions leading up to the acceptance of PEP 420 namespace packages, and one of the deciding factors in ruling out the "execute all __init__.py files found in the order they're encountered on sys.path" option. I've been caught by that before myself, where I was reasoning about a later problem based on a design variant we ended up rejecting, rather than the approach that was actually implemented. Cheers, Nick. > -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sat Mar 18 11:40:08 2017 From: donald at stufft.io (Donald Stufft) Date: Sat, 18 Mar 2017 11:40:08 -0400 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: > On Mar 18, 2017, at 3:20 AM, Nick Coghlan wrote: > > P.S. A question well worth asking for *us* is whether or not setting SOURCE_DATE_EPOCH appropriately (if it isn't already set in the current environment) should be part of the build system abstraction PEPs. > If it?s getting standard use (and it sounds like it is), then I think it should yes. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Mon Mar 20 05:00:59 2017 From: robin at reportlab.com (Robin Becker) Date: Mon, 20 Mar 2017 09:00:59 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <20170317174928.GA3392@k3> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> Message-ID: <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> On 17/03/2017 17:49, David Wilson wrote: > Hey Robin, > >> What happens if other distros decide not to use this environment variable? >> Do I really want distro specific code in the package? > > AFAIK this is seeing a great deal of use outside of Debian and even > Linux, for instance GCC also supports this variable. > > >> In short where does the distro responsibility and package maintainers >> boundary need to be? > > I guess it mostly comes down to whether you'd like them to carry the > debt of a vendor patch to implement the behaviour for you in a way you > don't like, or you'd prefer to retain full control. :) So it's more a > preference than a responsibility. > > > David > . > I think I accept the need to support this variable. Our original use case was for testing purposes where we altered dates injected into the produced pdf meta data and also in some cases the content. 
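(A minimal sketch of the kind of helper Nick suggested earlier in the thread - the function name is invented, not reportlab's actual API, and per the spec SOURCE_DATE_EPOCH holds an integer number of seconds since the Unix epoch, interpreted as UTC:)

```python
import os
from datetime import datetime, timezone

def build_timestamp():
    """Return the datetime to stamp into generated output.

    Uses SOURCE_DATE_EPOCH when a build system sets it, so the output is
    reproducible, and falls back to the current time otherwise.
    """
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        return datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return datetime.now(tz=timezone.utc)
```

A build (or a test run) can then pin the embedded dates by exporting, say, SOURCE_DATE_EPOCH=0 before invoking the generator.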
However, if that is the implied intent of the debian variable then I will also need to modify the behaviour of some other tests eg in one case the produced pdf output looks like this > The value of i is not larger than 3 > The value of i is equal to 3 > The value of i is not less than 3 > The value of i is 3 > The value of i is 2 > The value of i is 1 > {'doc': 0x00000000093D0240>, 'currentFrame': 'normal', 'currentPageTemplate': 'First', 'aW': > 439.27559055118104, 'aH': 685.8897637795275, 'aWH': (439.27559055118104, > 685.8897637795275), 'i': 0, 'availableWidth': 439.27559055118104, 'availableHeight': > 619.8897637795275} > The current page number is 1 ie we are introspecting internals and injecting that into the document content. I imagine I need to clean up the reporting to avoid getting addresses etc etc into the documents. Obviously if I have the ability to embed repr(some_object) into the document output then it will vary (unless the underlying python is reproducible). I'm not sure if debian runs the whole reportlab test suite, but it makes sense to get this kind of variablity out. When we make significant changes to existing behaviours our current workflow consists of generating a large number of outputs and then rendering them into jpeg pages with ghost script. Differences in the jpegs can be used to spot problems. -- Robin Becker From robin at reportlab.com Mon Mar 20 07:30:59 2017 From: robin at reportlab.com (Robin Becker) Date: Mon, 20 Mar 2017 11:30:59 +0000 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> On 18/03/2017 07:20, Nick Coghlan wrote: ........... >> > > While the reproducible builds effort started in Debian and is furthest > advanced there, it's not distro specific - interested developers working on > other distros were already looking into it, and the Core Infrastructure > Initiative has backed it as one of their security assurance initiatives. > Software Freedom Conservancy have a decent write-up on the current state of > things after December's Reproducible Builds Summit: > https://sfconservancy.org/blog/2016/dec/26/reproducible-builds-summit-report/ thanks for this; it seems the emphasis is on security. If the intent is that reportlab should be able to reliably reproduce the same binary output then I think I need to do more than just fix a couple of dates. We use many dictionary like objects to produce PDF and I am not sure all are sorted by key during output. Is there a way to excite dictionary ordering changes? I believe there was some way to modify the hashing introduced when the dos dictionary attacks were an issue. Would it be sufficient to generate documents with say Python 2.7 and check against 3.6? > > However, you'll probably want to make yourself a helper function that uses > SOURCE_DATE_EPOCH if defined, and falls back to the current time otherwise. > That way you'll get reproducible behaviour when a build system configures > the setting, while retaining your current behaviour for environments that > don't. > good advice and that's what I am doing. > Cheers, > Nick. > > P.S. A question well worth asking for *us* is whether or not setting > SOURCE_DATE_EPOCH appropriately (if it isn't already set in the current > environment) should be part of the build system abstraction PEPs. 
> -- Robin Becker From thomas at kluyver.me.uk Mon Mar 20 07:35:06 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Mon, 20 Mar 2017 11:35:06 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> Message-ID: <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> On Mon, Mar 20, 2017, at 09:00 AM, Robin Becker wrote: > Obviously if I have the ability to embed repr(some_object) > into the document output then it will vary (unless the underlying python > is reproducible). I'm not sure if debian runs the whole reportlab test > suite, but it makes sense to get this kind of variablity out. AIUI, it's fine to have the *ability* to produce non-deterministic output, and it doesn't matter if your tests do that. The aim of reproducible builds is to be able to go from the same source code to an identical binary package. Documents generated by running the tests are presumably not included in binary packages, so it doesn't matter if they change. > I believe there was some way to modify the hashing introduced when the dos dictionary attacks were an issue. The PYTHONHASHSEED environment variable: https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED If you have non-determinism introduced by Python hashing, setting a constant value of PYTHONHASHSEED should be an easy way to work around it. From freddyrietdijk at fridh.nl Mon Mar 20 07:46:10 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Mon, 20 Mar 2017 12:46:10 +0100 Subject: [Distutils] reproducible builds In-Reply-To: <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> Message-ID: As Thomas mentioned PYTHONHASHSEED is sufficient to solve non-determinism by the hashing. In my experience this hashing, along with datetimes (e.g. in the bytecode) are typically the only causes of non-determinism in Python packages. Someone from I think Debian did mention [1] that they cannot always set PYTHONHASHSEED and so in certain cases they apply patches to fix non-determinism. This is what they might be after in the case of `reportlab` but you best ask them. I'm not yet sure what to think of that patching approach. E.g., if one couldn't set PYTHONHASHSEED when building the bytecode in the interpreter itself, then one would have to convert all sets to lists with potential negative performance effects. On Mon, Mar 20, 2017 at 12:35 PM, Thomas Kluyver wrote: > On Mon, Mar 20, 2017, at 09:00 AM, Robin Becker wrote: > > Obviously if I have the ability to embed repr(some_object) > > into the document output then it will vary (unless the underlying python > > is reproducible). I'm not sure if debian runs the whole reportlab test > > suite, but it makes sense to get this kind of variablity out. > > AIUI, it's fine to have the *ability* to produce non-deterministic > output, and it doesn't matter if your tests do that. The aim of > reproducible builds is to be able to go from the same source code to an > identical binary package. 
Documents generated by running the tests are > presumably not included in binary packages, so it doesn't matter if they > change. > > > I believe there was some way to modify the hashing introduced when the > dos dictionary attacks were an issue. > > The PYTHONHASHSEED environment variable: > https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED > > If you have non-determinism introduced by Python hashing, setting a > constant value of PYTHONHASHSEED should be an easy way to work around > it. > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Mon Mar 20 09:02:34 2017 From: robin at reportlab.com (Robin Becker) Date: Mon, 20 Mar 2017 13:02:34 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> Message-ID: <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> On 20/03/2017 11:35, Thomas Kluyver wrote: > On Mon, Mar 20, 2017, at 09:00 AM, Robin Becker wrote: >> Obviously if I have the ability to embed repr(some_object) >> into the document output then it will vary (unless the underlying python >> is reproducible). I'm not sure if debian runs the whole reportlab test >> suite, but it makes sense to get this kind of variablity out. > > AIUI, it's fine to have the *ability* to produce non-deterministic > output, and it doesn't matter if your tests do that. The aim of > reproducible builds is to be able to go from the same source code to an > identical binary package. Documents generated by running the tests are > presumably not included in binary packages, so it doesn't matter if they > change. > Well now I am confused. The date / times mentioned in the debian patch are those we force into the documents produced by the reportlab package when it is used. They would not normally be part of the package itself. Although the reportlab documentation is available in the source I'm fairly sure we don't include it in the wheels. Of course if the debian packaging includes output created by reportlab then that document would receive the current (ie variable) time. In addition any random behaviour created by the reportlab generation code would also be embedded in the document. If the debian variable is intended create reproducible PDF as part of their packaging of reportlab or some other package then I'm fairly sure that other variation will need to be checked in addition to the control that the SOURCE_DATE_EPOCH variable would give. Perhaps Matthias could comment; I know little about how the debian packaging works. >> I believe there was some way to modify the hashing introduced when the dos dictionary attacks were an issue. > > The PYTHONHASHSEED environment variable: > https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED > > If you have non-determinism introduced by Python hashing, setting a > constant value of PYTHONHASHSEED should be an easy way to work around > it. > Well years ago we tried to get some random behaviour in text selection by setting a seed value eg 23......22 (but that doesn't work across pythons). 
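(On the hashing side, a cheap way to check whether hash randomisation is what leaks into the output is to render the same document twice under two different fixed seeds and compare the bytes. The sketch below is illustrative only: render_pdf.py stands in for whatever script produces the document and is not a real reportlab entry point. If the digests differ, emitting the offending dicts and sets in sorted order is the fix that holds on every interpreter, independent of PYTHONHASHSEED.)

```python
import hashlib
import os
import subprocess
import sys

def digest_with_seed(seed):
    # Run the generator in a fresh interpreter with a fixed hash seed and a
    # fixed SOURCE_DATE_EPOCH, then fingerprint the bytes it produced.
    env = dict(os.environ, PYTHONHASHSEED=str(seed), SOURCE_DATE_EPOCH="0")
    subprocess.run([sys.executable, "render_pdf.py", "out.pdf"],
                   env=env, check=True)
    with open("out.pdf", "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

if digest_with_seed(1) == digest_with_seed(2):
    print("output does not depend on hash randomisation")
else:
    print("output varies with the hash seed; sort dict/set contents on output")
```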
I guess the algorithm variation across pythons would make dictionary order quite variable. > C:\Users\rptlab>\python27\python > Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)] on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> import random >>>> random.seed(23......22) >>>> from random import randint, choice >>>> randint(10,25) > 15 >>>> > C:\Users\rptlab>\python36\python > Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> import random >>>> random.seed(23......22) >>>> from random import randint, choice >>>> randint(10,25) > 21 >>>> -- Robin Becker From thomas at kluyver.me.uk Mon Mar 20 09:34:49 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Mon, 20 Mar 2017 13:34:49 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> Message-ID: <1490016889.1650001.917106688.2E575580@webmail.messagingengine.com> On Mon, Mar 20, 2017, at 01:02 PM, Robin Becker wrote: > Well now I am confused. The date / times mentioned in the debian patch > are those > we force into the documents produced by the reportlab package when it is > used. > > They would not normally be part of the package itself. Although the > reportlab > documentation is available in the source I'm fairly sure we don't include > it in > the wheels. I'm guessing, but I imagine that Debian may be using reportlab in the builds of other packages, to build documentation. It's normal for Debian packages to include built docs, unlike wheels. So they would want it to create PDFs reproducibly, but the PDFs generated in your test suite probably don't matter. > I guess the algorithm variation across pythons would make dictionary order quite variable. For a Python based tool, I think it's reasonable that reproducing a build requires running with the same version of Python. The requirement would be that, with enough information about the build environment, you *can* produce an identical PDF. It needn't (AFAIK) be identical every time anyone builds it. Thomas From ncoghlan at gmail.com Tue Mar 21 00:21:24 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Mar 2017 14:21:24 +1000 Subject: [Distutils] reproducible builds In-Reply-To: <1490016889.1650001.917106688.2E575580@webmail.messagingengine.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> <1490016889.1650001.917106688.2E575580@webmail.messagingengine.com> Message-ID: On 20 March 2017 at 23:34, Thomas Kluyver wrote: > On Mon, Mar 20, 2017, at 01:02 PM, Robin Becker wrote: > > I guess the algorithm variation across pythons would make dictionary > order quite variable. > > For a Python based tool, I think it's reasonable that reproducing a > build requires running with the same version of Python. 
> > The requirement would be that, with enough information about the build > environment, you *can* produce an identical PDF. It needn't (AFAIK) be > identical every time anyone builds it. > Right, one of the other aspects of reproducible-builds is looking into ways to define and distribute build environments in addition to the application source code: https://reproducible-builds.org/docs/definition-strategies/ Within a given binary context (e.g. Debian packages), that may be a text description, like Debian's buildinfo files: https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles For Fedora/RHEL/CentOS, the equivalent would probably be to extract a suitable config from the build system: https://fedoraproject.org/wiki/Using_the_Koji_build_system#Using_koji_to_generate_a_mock_config_to_replicate_a_buildroot In other cases, the build environment may itself be a binary artifact (e.g. the manylinux1 container images, or the "Holy Build Box" machine images). Fully eliminating non-determinism usually does require switching to explicit sorting and ordered containers in build tools and scripts, as otherwise even things like directory listings or JSON serialisation can introduce variations in output when a build is run on a different machine. The reproducible-builds project offers some interesting tools to identify and analyse cases of non-reproducible outputs: https://reproducible-builds.org/tools/ However, nobody can reasonably expect arbitrary upstream projects (especially volunteer-run ones) to be going out and pre-emptively solving that kind of problem - the most it's realistic to aim for is to encourage projects to be accommodating when upstream changes are proposed to introduce more determinism into the build processes for particular projects, as well as into the artifact generation process for tools that may be used as part of the build process for other projects. (And I agree with Thomas that it's likely the latter case that applies for reportlab-generated PDFs.) Cheers, Nick. P.S. Prompted by Gary Bernhardt, one of the ways I've started thinking about the whole question of "built artifacts" in general is as a complex distributed caching problem, with reproducible builds being a way of ensuring that it's possible to check the validity of particular cache entries. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Tue Mar 21 04:13:29 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Tue, 21 Mar 2017 09:13:29 +0100 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work Message-ID: AFAIK it is impossible to do this: pip install -e foo You need to use the repo URL up to now: pip install -e git+https://example.com/repos/foo#egg=foo AFAIK the fast/short implementation of "pip install -e foo" does not work, since pip can't access metadata of package foo without downloading the whole package. Or am I wrong - is this possible? But how cares for useless downloaded bytes? I don't care. It should be possible to download the whole package "foo", then look at the metadata which is provided by it. Take the canonical repo url, and then get the source from the repo. AFAIK there is no official way to define a "Canonical Repo URL" up to now. If I want to provide it for my custom packages. How could I do this?
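(For what it's worth, the nearest thing available today is to point the package's existing URL metadata at the repository, so the location is at least discoverable from the uploaded metadata. A minimal, illustrative setup.py - the project name and URL are placeholders:)

```python
from setuptools import setup, find_packages

setup(
    name="foo",
    version="0.1",
    packages=find_packages(),
    # There is no dedicated "canonical repo URL" field yet, so a common
    # convention is to point the project's home page at the repository.
    url="https://example.com/repos/foo",
)
```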
Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ From marius at gedmin.as Tue Mar 21 07:46:47 2017 From: marius at gedmin.as (Marius Gedminas) Date: Tue, 21 Mar 2017 13:46:47 +0200 Subject: [Distutils] reproducible builds In-Reply-To: <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> Message-ID: <20170321114647.ri2cndwtl6jrupuh@platonas> On Mon, Mar 20, 2017 at 11:30:59AM +0000, Robin Becker wrote: > thanks for this; it seems the emphasis is on security. If the intent is that > reportlab should be able to reliably reproduce the same binary output then I > think I need to do more than just fix a couple of dates. We use many > dictionary like objects to produce PDF and I am not sure all are sorted by > key during output. I'm sure the reproducible builds folks will send you patches if they find any spots that you missed. ;-) > Is there a way to excite dictionary ordering changes? I believe there was > some way to modify the hashing introduced when the dos dictionary attacks > were an issue. Would it be sufficient to generate documents with say Python > 2.7 and check against 3.6? Python 3.6 changed the dict implementation so the ordering is always stable (and matches insertion order). You'll want to test with Python 3.5, which perturbs the dict ordering randomly, as a side effect of the randomized string/bytes hashes (unless you fix it by setting the PYTHONHASHSEED environment variable[*]) [*] https://docs.python.org/3.3/using/cmdline.html#envvar-PYTHONHASHSEED Regards, Marius Gedminas -- Yes, always begin work on inherited code by removing comments. Even if they were maintained (they are not) they are natural language written by engineers who cannot be understood ordering coffee in a diner. Getting back to comments not being maintained, my saying on that one is, "Comments do not run." -- Kenny Tilton -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 163 bytes Desc: not available URL: From robin at reportlab.com Tue Mar 21 08:02:59 2017 From: robin at reportlab.com (Robin Becker) Date: Tue, 21 Mar 2017 12:02:59 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <20170321114647.ri2cndwtl6jrupuh@platonas> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> <20170321114647.ri2cndwtl6jrupuh@platonas> Message-ID: On 21/03/2017 11:46, Marius Gedminas wrote: > On Mon, Mar 20, 2017 at 11:30:59AM +0000, Robin Becker wrote: ......... > > I'm sure the reproducible builds folks will send you patches if they > find any spots that you missed. ;-) > >> Is there a way to excite dictionary ordering changes? I believe there was >> some way to modify the hashing introduced when the dos dictionary attacks >> were an issue. Would it be sufficient to generate documents with say Python >> 2.7 and check against 3.6? > > Python 3.6 changed the dict implementation so the ordering is always stable > (and matches insertion order). > > You'll want to test with Python 3.5, which perturbs the dict ordering > randomly, as a side effect of the randomized string/bytes hashes (unless > you fix it by setting the PYTHONHASHSEED environment variable[*]) > > [*] https://docs.python.org/3.3/using/cmdline.html#envvar-PYTHONHASHSEED ....... 
thanks for this Marius; having started on the reproducibility trail I find the python 3.x output has more mismatches than I like ('cos of missed bugs). -- Robin Becker From leorochael at gmail.com Tue Mar 21 09:35:08 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Tue, 21 Mar 2017 10:35:08 -0300 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: Hi Thomas, Besides figuring out where the repo url is, you have a second problem to solve: The command `pip install -e some/path` already has something unpacked/checked-out in `some/path` to install in development mode. In which folder would the command `pip install -e some.project` unpack/checkout `some.project`? Cheers, Leo On 21 March 2017 at 05:13, Thomas G?ttler wrote: > AFAIK it is impossible to do this: > > pip install -e foo > > You need to use the repo URL up to now: > > pip install -e git+https://example.com/repos/foo#egg=foo > > AFAIK the fast/short implementation of "pip install -e foo" does > not work, since pip can't access metadata of package foo without > downloading > the whole package. Or am I wrong - is this possible? > > But how cares for useless downloaded bytes? I don't care. > > It should be possible to download the whole package "foo", > then look at the metadata which is provided by it. Take > the canonical repo url, and then get the source from > the repo. > > AFAIK there is no official way to define a "Canonical Repo URL" up to now. > > If I want to provide it for my custom packages. How could I do this? > > Regards, > Thomas G?ttler > > -- > Thomas Guettler http://www.thomas-guettler.de/ > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Tue Mar 21 10:26:53 2017 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 21 Mar 2017 09:26:53 -0500 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: On Tue, Mar 21, 2017 at 3:13 AM, Thomas G?ttler < guettliml at thomas-guettler.de> wrote: > AFAIK it is impossible to do this: > > pip install -e foo > > You need to use the repo URL up to now: > > pip install -e git+https://example.com/repos/foo#egg=foo > > AFAIK the fast/short implementation of "pip install -e foo" does > not work, since pip can't access metadata of package foo without > downloading > the whole package. Or am I wrong - is this possible? > These read metadata from an already-downloaded package with a setup.py?: ```bash pip install -e . pip install -e "${VIRTUAL_ENV}/src/foo" ``` You can download the JSON metadata from {PyPI, Warehouse} but IDK about {devpi, }? - https://github.com/pypa/warehouse/issues/1638 - https://pypi.org/pypi/pip/json - - https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/json.py - [ ] add (existing, Metadata 2.0/3.0) JSON api to devpi - [ ] Create a PEP for a pypa JSON HTTP endpoint spec - [ ] Static HTML support: Redirect to pkg-name-ver.tar.gz.json? 
- http://doc.devpi.net/latest/curl.html - http://doc.devpi.net/latest/userman/devpi_commands.html#getjson - "Pip needs a dependency resolver" https://github.com/pypa/pip/issues/988 - https://github.com/awwad/depresolve/blob/master/depresolve/scrape_pypi_metadata.py - pip clone --recursive - pip install --clone --recursive --no-recursive > > But how cares for useless downloaded bytes? I don't care. > How is this usecase distinct from those solved for by?: - requirements.txt - pipenv install --dev pkgname - https://github.com/kennethreitz/pipenv > > It should be possible to download the whole package "foo", > then look at the metadata which is provided by it. Take > the canonical repo url, and then get the source from > the repo. > > AFAIK there is no official way to define a "Canonical Repo URL" up to now. > > If I want to provide it for my custom packages. How could I do this? > With JSONLD [1], you could just add a "source" attribute (with your own namespaced URI: "myns:source") to the package metadata: sourceURL: "git+ssh://git at github.com/pypa/pip at master" sourceURL: "git+https://github.com/pypa/pip at master" Or, we could add "sourceURL" (pending bikeshedding on the property name) to the metadata 3.0 PEP. ````bash pip clone pip pip install --clone --rev-override=develop pip ``` And then, if you give a mouse a cookie, what about multiple sourceURLs: which is the canonical URL? git+git://git.apache.org/libcloud.git git+https://github.com/apache/libcloud git+ssh://git at github.com/apache/libcloud [1] https://github.com/pypa/interoperability-peps/issues/31 > > Regards, > Thomas G?ttler > > -- > Thomas Guettler http://www.thomas-guettler.de/ > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Tue Mar 21 11:43:05 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Tue, 21 Mar 2017 16:43:05 +0100 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: Am 21.03.2017 um 14:35 schrieb Leonardo Rochael Almeida: > Hi Thomas, > > Besides figuring out where the repo url is, you have a second problem to solve: > > The command `pip install -e some/path` already has something unpacked/checked-out in `some/path` to install in development mode. > > In which folder would the command `pip install -e some.project` unpack/checkout `some.project`? Up to now I never specified a directory. I used "pip install -e git+https://...#egg=foo" already very often. Up to now pip cloned the git repo into a directory called "src". 
Regards, Thomas G?ttler -- http://www.thomas-guettler.de/ From guettliml at thomas-guettler.de Tue Mar 21 11:58:30 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Tue, 21 Mar 2017 16:58:30 +0100 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: <558ac937-e5d5-2041-9e77-2c80356a7254@thomas-guettler.de> Am 21.03.2017 um 15:26 schrieb Wes Turner: > > > On Tue, Mar 21, 2017 at 3:13 AM, Thomas G?ttler > wrote: > > AFAIK it is impossible to do this: > > pip install -e foo > > You need to use the repo URL up to now: > > pip install -e git+https://example.com/repos/foo#egg=foo > > AFAIK the fast/short implementation of "pip install -e foo" does > not work, since pip can't access metadata of package foo without downloading > the whole package. Or am I wrong - is this possible? > > > These read metadata from an already-downloaded package with a setup.py?: > > ```bash > pip install -e . > pip install -e "${VIRTUAL_ENV}/src/foo" > ``` > > You can download the JSON metadata from {PyPI, Warehouse} but IDK about {devpi, }? I don't understand above sentence. Is downloading metadata possible or not? > How is this usecase distinct from those solved for by?: > > - requirements.txt > - pipenv install --dev pkgname > - https://github.com/kennethreitz/pipenv I have never heard of pipenv before. But have read the name of the author before and for me "Kenneth Reitz" means "for human beings". With other words: nice, simple and elegant API. > > > > It should be possible to download the whole package "foo", > then look at the metadata which is provided by it. Take > the canonical repo url, and then get the source from > the repo. > > AFAIK there is no official way to define a "Canonical Repo URL" up to now. > > If I want to provide it for my custom packages. How could I do this? > > > With JSONLD [1], > you could just add a "source" attribute (with your own namespaced URI: "myns:source") to the package metadata: > > sourceURL: "git+ssh://git at github.com/pypa/pip at master " > sourceURL: "git+https://github.com/pypa/pip at master" > > Or, we could add "sourceURL" (pending bikeshedding on the property name) to the metadata 3.0 PEP. "sourceURL" sound good. > > ````bash > pip clone pip > pip install --clone --rev-override=develop pip > ``` > > And then, if you give a mouse a cookie, > what about multiple sourceURLs: which is the canonical URL? > I think it only makes sense to publish one sourceURL. If someone publishes two then ... I don't know. Maybe the first wins? Regards, Thomas G?ttler -- http://www.thomas-guettler.de/ From brett at python.org Tue Mar 21 12:52:02 2017 From: brett at python.org (Brett Cannon) Date: Tue, 21 Mar 2017 16:52:02 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <20170321114647.ri2cndwtl6jrupuh@platonas> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> <20170321114647.ri2cndwtl6jrupuh@platonas> Message-ID: On Tue, 21 Mar 2017 at 04:54 Marius Gedminas wrote: > On Mon, Mar 20, 2017 at 11:30:59AM +0000, Robin Becker wrote: > > thanks for this; it seems the emphasis is on security. If the intent is > that > > reportlab should be able to reliably reproduce the same binary output > then I > > think I need to do more than just fix a couple of dates. We use many > > dictionary like objects to produce PDF and I am not sure all are sorted > by > > key during output. 
> > I'm sure the reproducible builds folks will send you patches if they > find any spots that you missed. ;-) > > > Is there a way to excite dictionary ordering changes? I believe there was > > some way to modify the hashing introduced when the dos dictionary attacks > > were an issue. Would it be sufficient to generate documents with say > Python > > 2.7 and check against 3.6? > > Python 3.6 changed the dict implementation so the ordering is always stable > (and matches insertion order). > Do realize that is an implementation detail and not guaranteed by the language specification, so it won't necessarily hold in the future or for other interpreters. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Wed Mar 22 04:57:28 2017 From: robin at reportlab.com (Robin Becker) Date: Wed, 22 Mar 2017 08:57:28 +0000 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> <20170321114647.ri2cndwtl6jrupuh@platonas> Message-ID: <0f2db4c1-7b0d-a155-dceb-d298b42785c7@chamonix.reportlab.co.uk> On 21/03/2017 16:52, Brett Cannon wrote: > On Tue, 21 Mar 2017 at 04:54 Marius Gedminas wrote: ..... >> >> Python 3.6 changed the dict implementation so the ordering is always stable >> (and matches insertion order). >> > > Do realize that is an implementation detail and not guaranteed by the > language specification, so it won't necessarily hold in the future or for > other interpreters. > > -Brett one of the main issues in the reportlab pdf variability are the dict objects which come out as << /Key1 value ..... /Key n >> I think we have these coming out in sorted order without reliance on the underlying dicts. Up to now we used pixel equality ie the appearance, but as I understand it, reproducibility means byte equality which is harder. A bit of work has been done making the variation between Python 2.7 & 3.6 renderings go away. This reproducibility effort has revealed several bugs which is in itself useful. -- Robin Becker From guettliml at thomas-guettler.de Wed Mar 22 12:29:39 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Wed, 22 Mar 2017 17:29:39 +0100 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: > Wes Turner: > sourceURL: "git+ssh://git at github.com/pypa/pip at master" > sourceURL: "git+https://github.com/pypa/pip at master" > Or, we could add "sourceURL" (pending bikeshedding on the property name) to the metadata 3.0 PEP. Why not? What is the next step to add sourceURL to the pep? Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ From wes.turner at gmail.com Thu Mar 23 00:59:13 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 22 Mar 2017 23:59:13 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: @ncoghlan would know which are the current metadata peps? - http://python-notes.curiousefficiency.org/en/latest/pep_ideas/core_packaging_api.html - https://github.com/pypa/python-packaging-user-guide - https://packaging.python.org/ - https://packaging.python.org/specifications/ - https://github.com/pypa/interoperability-peps - https://github.com/pypa/interoperability-peps/issues - https://github.com/pypa/interoperability-peps/issues/31 - @ncoghlan this could probably jow just be titled "PyPA JSON-LD Context"? 
On Wed, Mar 22, 2017 at 11:29 AM, Thomas G?ttler < guettliml at thomas-guettler.de> wrote: > > Wes Turner: > > sourceURL: "git+ssh://git at github.com/pypa/pip at master" > > sourceURL: "git+https://github.com/pypa/pip at master" > > Or, we could add "sourceURL" (pending bikeshedding on the property name) > to the metadata 3.0 PEP. > > Why not? > > What is the next step to add sourceURL to the pep? > > Regards, > Thomas G?ttler > > > -- > Thomas Guettler http://www.thomas-guettler.de/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Mar 23 01:01:54 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 23 Mar 2017 00:01:54 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: ( The thread subject here was: "[Distutils] Canonical Repo URL: Make "pip install -e foo" work" https://mail.python.org/pipermail/distutils-sig/2017-March/thread.html#30302 ) On Wed, Mar 22, 2017 at 11:59 PM, Wes Turner wrote: > @ncoghlan would know which are the current metadata peps? > - http://python-notes.curiousefficiency.org/en/latest/pep_ideas/core_ > packaging_api.html > > > - https://github.com/pypa/python-packaging-user-guide > - https://packaging.python.org/ > - https://packaging.python.org/specifications/ > > > - https://github.com/pypa/interoperability-peps > - https://github.com/pypa/interoperability-peps/issues > - https://github.com/pypa/interoperability-peps/issues/31 > - @ncoghlan this could probably jow just be titled "PyPA JSON-LD > Context"? > > On Wed, Mar 22, 2017 at 11:29 AM, Thomas G?ttler < > guettliml at thomas-guettler.de> wrote: > >> > Wes Turner: >> > sourceURL: "git+ssh://git at github.com/pypa/pip at master" >> > sourceURL: "git+https://github.com/pypa/pip at master" >> > Or, we could add "sourceURL" (pending bikeshedding on the property >> name) to the metadata 3.0 PEP. >> >> Why not? >> >> What is the next step to add sourceURL to the pep? >> >> Regards, >> Thomas G?ttler >> >> >> -- >> Thomas Guettler http://www.thomas-guettler.de/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Mar 23 03:23:06 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Mar 2017 17:23:06 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On 23 March 2017 at 02:29, Thomas G?ttler wrote: > > Wes Turner: > > sourceURL: "git+ssh://git at github.com/pypa/pip at master" > > sourceURL: "git+https://github.com/pypa/pip at master" > > Or, we could add "sourceURL" (pending bikeshedding on the property name) > to the metadata 3.0 PEP. > > Why not? > > What is the next step to add sourceURL to the pep? > I'm not adding any new metadata fields to the core metadata 3.0 proposal (I'm only removing them). This means we're not going to be automating the process of getting an editable checkout in the core tools any time soon - there are already 100k+ published packages on PyPI, so anyone that seriously wants to do this is going to have to write their own client utility that attempts to infer it from the metadata that already exists (probably by building atop distlib, since that has all the necessary pieces to read the various metadata formats, both remote and local). 
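To make that concrete, here is roughly what such a client utility could do today against the JSON metadata PyPI already serves - the "looks like a repository" heuristic is purely illustrative, not an agreed convention:

```python
# Sketch: guess a VCS URL for a project from metadata PyPI already
# publishes, without requiring any new fields from publishers.
import json
import re
from urllib.request import urlopen

def guess_repo_url(project):
    with urlopen("https://pypi.org/pypi/%s/json" % project) as response:
        info = json.loads(response.read().decode("utf-8"))["info"]
    for text in (info.get("home_page"), info.get("download_url"),
                 info.get("description")):
        if not text:
            continue
        match = re.search(
            r"https?://(?:github\.com|bitbucket\.org)/[\w.-]+/[\w.-]+", text)
        if match:
            return match.group(0)
    return None

print(guess_repo_url("pip"))
```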
Future metadata extensions might help to make such a tool more reliable, but *requiring* metadata changes to be made first will just make it useless (since it wouldn't work at all until after publishers start publishing the new metadata, which would mean waiting years before it covered a reasonable percentage of PyPI). Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.bianconi at eldorado.org.br Thu Mar 23 15:00:53 2017 From: leonardo.bianconi at eldorado.org.br (Leonardo Bianconi) Date: Thu, 23 Mar 2017 19:00:53 +0000 Subject: [Distutils] Wheel files for PPC64le Message-ID: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> Hi all! I have been discussed the creation of a PEP, that describes how to create wheel files for the PPC64le architecture on wheel-builders (https://mail.python.org/pipermail/wheel-builders/) since January (https://mail.python.org/pipermail/wheel-builders/2017-January/000245.html). As all discussion from that list are done, the next step is it be reviewed here, and then create a draft on github, see it bellow: Abstract ======== This PEP proposes the extension of PEP 513 [1]_, which means extending the work done for platform tag ``manylinux1_``, initially created for x86_64 and i686 systems, to work on PowerPC 64 little endian (ppc64le), making wheel files available for this architecture. The platform tag, of this architecture for Python package built distributions, is called ``manylinux3_{ppc64le}``. Rationale ========= Currently on PowerPC 64 little endian, the ``pip install`` process downloads the module source code and builds it on the fly, to after that, install it. This process may cause a divergence on the presence of optional libraries it uses. One example of that is numpy, which optionally can use the OpenBlas [2]_; or BLAS [3]_; or neither of them. For each situation the performance of the module is affected and, badly enough, an end user is not able to know what is causing that. Building wheel files for the new architecture considers all work done on PEP 513 [1]_ with some changes proposed to handle the parameters for another architectures. The ``manylinux3`` policy ========================= Based on PEP 513 [1]_, the policy follows the same rules and library dependencies, but with the following versions for backward compatibility and base Operational System: * Backward compatibility versions: GLIBC <= 2.17 CXXABI <= 1.3.7 GLIBCXX <= 3.4.9 GCC <= 4.8.5 * Base Operational System: The stock O.S. release need to be the CentOS 7 [4]_, as it is the first CentOS release available for PowerPC64 little endian. The tag version for ppc64le architecture starts with 3 (``manylinux3_``), as it is supposed to be the version to match the CentOS 7 [4]_ in the future of the tag for x86_64/i686 architecture. There is the possibility of both tags diverge until it reaches the version 3, then a new PEP may be create to converge both to the manylinux baseline. Compilation of Compliant Wheels =============================== As compiling wheel files that meet the ``manylinux3`` standard for PowerPC64 little endian requires a specific Linux distro and version, the following tool is provided: Docker Image (Will be implemented when CentOS be available on Docker) ------------ The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 little endian CentOS release. 
The Image contains all necessary tools in the requested version to build wheel files (gcc, g++ and gfortran 4.8.5). Machine Image ------------- A full machine image containing all necessary software is provided for developers until CentOS be available on Docker for ppc64le. Cloud Service ------------- There are Cloud Services that provide ppc64le virtual machines for development. These machines can be used for the development of the wheel files, as CentOS 7 [4]_ an option for O.S.. All steps to obtain a machine on it is available for developers. Auditwheel ---------- This tool is an already provided item from PEP 513 [1]_, but needs to support the new architecture, so we propose the following changes: 1. Change the JSON file to handle more than one architecture, adding the compatible libraries and versions list for it. 2. Add a new filed in the JSON object to handle a list of architecture that the object is compatible. 3. When reading the JSON file, only consider the objects with the correspondent machine architecture. Platform detection for Installers ================================= The platform detection is almost the same as described in PEP 513 [1]_, but with the following proposed change: 1. Add the platform ppc64le in the platform list as a compatible one: [``linux-x86_64``, ``linux-i686``, ``linux-ppc64le``] 2. Add an if to switch architecture and consider the correct version of the GLIBC on ``return have_compatible_glibc(2, 5)``. References ========== .. [1] PEP 513 -- A Platform Tag for Portable Linux Built Distributions (https://www.python.org/dev/peps/pep-0513/) .. [2] OpenBLAS -- An optimized BLAS library (http://www.openblas.net/) .. [3] BLAS -- Basic Linear Algebra Subprograms (http://www.netlib.org/blas/) .. [4] CentOS 7 Release Notes (https://wiki.centos.org/Manuals/ReleaseNotes/CentOS7) .. [5] CentOS 5.11 Release Notes (https://wiki.centos.org/Manuals/ReleaseNotes/CentOS5.11) Thanks, Leonardo Bianconi. From ncoghlan at gmail.com Thu Mar 23 22:58:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2017 12:58:55 +1000 Subject: [Distutils] Wheel files for PPC64le In-Reply-To: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> Message-ID: On 24 March 2017 at 05:00, Leonardo Bianconi < leonardo.bianconi at eldorado.org.br> wrote: > Hi all! > > I have been discussed the creation of a PEP, that describes how to create > wheel > files for the PPC64le architecture on wheel-builders > (https://mail.python.org/pipermail/wheel-builders/) since January > (https://mail.python.org/pipermail/wheel-builders/2017-January/000245.html > ). > Thanks Leonardo! > > As all discussion from that list are done, the next step is it be reviewed > here, and then create a draft on github, You can submit the PR to the PEPs repo whenever you're ready - it's actually handy to have the PEP number assigned fairly early as a convenient reference for the proposal. > The ``manylinux3`` policy > ========================= > > Based on PEP 513 [1]_, the policy follows the same rules and library > dependencies, but with the following versions for backward compatibility > and base Operational System: > > * Backward compatibility versions: > GLIBC <= 2.17 > CXXABI <= 1.3.7 > GLIBCXX <= 3.4.9 > GCC <= 4.8.5 > > * Base Operational System: > The stock O.S. release need to be the CentOS 7 [4]_, as it is the first > CentOS release available for PowerPC64 little endian. 
> > The tag version for ppc64le architecture starts with 3 (``manylinux3_``), > as it > is supposed to be the version to match the CentOS 7 [4]_ in the future of > the > tag for x86_64/i686 architecture. There is the possibility of both tags > diverge > until it reaches the version 3, then a new PEP may be create to converge > both to the manylinux baseline. > Having manylinuxN consistently align with CentOS(N+4) seems reasonable to me for simplicity's sake, but there should be a discussion in the PEP around how that aligns with ppc64le support on other LTS distros (mainly Debian and Ubuntu). Given the relative dates involved, I'd expect manylinux-style binaries compiled on CentOS 7 to also work on Ubuntu 14.04, 16.04 and Debian 8, but the PEP should explicitly confirm that the nominated symbol versions above are available on all of those distros. > Compilation of Compliant Wheels > =============================== > > As compiling wheel files that meet the ``manylinux3`` standard for > PowerPC64 > little endian requires a specific Linux distro and version, the following > tool > is provided: > > > Docker Image (Will be implemented when CentOS be available on Docker) > ------------ > > The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 > little endian CentOS release. The Image contains all necessary tools in the > requested version to build wheel files (gcc, g++ and gfortran 4.8.5). > These seem to be present now: https://hub.docker.com/r/ppc64le/centos/tags/ I'm not clear on the provenance of the 'ppc64le' user account though, so I've asked for clarification: ttps:// twitter.com/ncoghlan_dev/status/845099237117329408 > Platform detection for Installers > ================================= > > The platform detection is almost the same as described in PEP 513 [1]_, but > with the following proposed change: > > 1. Add the platform ppc64le in the platform list as a compatible one: > [``linux-x86_64``, ``linux-i686``, ``linux-ppc64le``] > 2. Add an if to switch architecture and consider the correct version of the > GLIBC on ``return have_compatible_glibc(2, 5)``. > I don't think is quite that simple, as installers need to be able to figure out: - on manylinux3 compatible platforms, prefer manylinux3 to manylinux1 - on manylinux3 *in*compatible platforms, only consider manylinux1 And that means asking the question: when combined with the option of the distro-provided `_manylinux` module, is "have_compatible_glibc(2, 5) and not have_compatible_glibc(2, 17)" an adequate check for the latter case? (My inclination is to say "yes", but it would be helpful to have some more concrete data on glibc versions in different distros of interest) Beyond that, I think the main open question would be: do we go ahead and define the full `manylinux3` specification now? CentOS 7+, Ubuntu 14.04+, Debian 8+ compatibility still covers a *lot* of distros and deployments, and doing so means folks can bring the latest versions of gcc to bear on their code, rather than being limited to the last version that was made available for RHEL/CentOS 5 (gcc 4.8). Going down that path would also means things would be simpler on the PyPI front - it could just allow manylinux3 for any architecture and let installers decide whether or not to use them. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
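For reference, the glibc probe behind that question is essentially the PEP 513 one; a trimmed-down sketch of it (the reference implementation in the PEP and in pip adds error handling for non-glibc platforms and odd version strings) looks like:

```python
# Trimmed-down sketch of the PEP 513 style glibc check referred to above.
import ctypes

def glibc_version_string():
    process_namespace = ctypes.CDLL(None)
    gnu_get_libc_version = process_namespace.gnu_get_libc_version
    gnu_get_libc_version.restype = ctypes.c_char_p
    return gnu_get_libc_version().decode("ascii")  # e.g. "2.17"

def have_compatible_glibc(major, minimum_minor):
    found_major, found_minor = (int(x) for x in
                                glibc_version_string().split(".")[:2])
    return found_major == major and found_minor >= minimum_minor

# The "manylinux1 but not manylinux3" condition asked about above:
print(have_compatible_glibc(2, 5) and not have_compatible_glibc(2, 17))
```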
URL: From wes.turner at gmail.com Thu Mar 23 23:24:03 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 23 Mar 2017 22:24:03 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan wrote: > On 23 March 2017 at 02:29, Thomas G?ttler > wrote: > >> > Wes Turner: >> > sourceURL: "git+ssh://git at github.com/pypa/pip at master" >> > sourceURL: "git+https://github.com/pypa/pip at master" >> > Or, we could add "sourceURL" (pending bikeshedding on the property >> name) to the metadata 3.0 PEP. >> >> Why not? >> >> What is the next step to add sourceURL to the pep? >> > > I'm not adding any new metadata fields to the core metadata 3.0 proposal > (I'm only removing them). > Got it. > > This means we're not going to be automating the process of getting an > editable checkout in the core tools any time soon - there are already 100k+ > published packages on PyPI, so anyone that seriously wants to do this is > going to have to write their own client utility that attempts to infer it > from the metadata that already exists (probably by building atop distlib, > since that has all the necessary pieces to read the various metadata > formats, both remote and local). > > Future metadata extensions might help to make such a tool more reliable, > but *requiring* metadata changes to be made first will just make it useless > (since it wouldn't work at all until after publishers start publishing the > new metadata, which would mean waiting years before it covered a reasonable > percentage of PyPI). > Here's a way to define Requirements and a RequirementsMap with additional data: https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 It creates a directory full of requirements[.dev].txt files: https://github.com/westurner/pyleset/tree/57140bce/requirements Additional metadata in Pipfile would be nice; but it would be fairly easy to send a PR to: BLD: setup.py: add the canonical sourceURL > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 24 00:59:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2017 14:59:10 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On 24 March 2017 at 13:24, Wes Turner wrote: > > On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan wrote: > >> This means we're not going to be automating the process of getting an >> editable checkout in the core tools any time soon - there are already 100k+ >> published packages on PyPI, so anyone that seriously wants to do this is >> going to have to write their own client utility that attempts to infer it >> from the metadata that already exists (probably by building atop distlib, >> since that has all the necessary pieces to read the various metadata >> formats, both remote and local). >> >> Future metadata extensions might help to make such a tool more reliable, >> but *requiring* metadata changes to be made first will just make it useless >> (since it wouldn't work at all until after publishers start publishing the >> new metadata, which would mean waiting years before it covered a reasonable >> percentage of PyPI). 
>> > > Here's a way to define Requirements and a RequirementsMap with additional > data: > https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 > > It creates a directory full of requirements[.dev].txt files: > https://github.com/westurner/pyleset/tree/57140bce/requirements > > Additional metadata in Pipfile would be nice; > but it would be fairly easy to send a PR to: > > BLD: setup.py: add the canonical sourceURL > PEP 426 already has a source URL field: https://www.python.org/dev/peps/pep-0426/#source-url It's just not required to be a *version* control reference - it's free to be a reference to a tarball or zip archive instead (just not a reference to the sdist itself, since that will contain a copy of the metadata file). However, independently of that concern, "send a PR" is only the first step in updating published metadata to accommodate tasks that package *consumers* want to perform: 1. Someone has to write and submit the upstream project patch 2. The publisher has to review and accept the change 3. The publisher has to publish the new release 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending on the scope of what you care about So the lesson we've learned is that for consumer tasks it's *always* better to start by asking "How can I best achieve my objective without asking publishers to change *anything*?". In the case of finding version control references, that's a matter of: - looking at Download-URL and Project-URL entries for links that "look like" version control references - if that doesn't turn up anything useful, scan the long description - once you have a repository reference, look for promising tag names (if the link didn't nominate a specific commit) On the *publisher* side, the equivalent question is "Can publishers already choose to publish this metadata without having to wait for a metadata update?". In this case, the answer is yes, due to the "Project-URL" field: anyone is free to push for the adoption of a particular convention for tagging the exact version control reference needed for "pip -e" to retrieve the corresponding source code. Putting those two together means that anyone that chooses to do so is already free to write a tool that: - downloads a PyPI package - looks for a "Editable Install" Project-URL, and uses that if defined - otherwise looks for a promising VCS reference in Download-URL, the Project-URL definitions, and the long description - runs `pip -e` based on whatever it finds And as long as that tool is itself pip installable, there's no particular reason the feature needs to be built into pip itself. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Mar 24 05:26:11 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 24 Mar 2017 04:26:11 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. 
In-Reply-To: References: Message-ID: On Thu, Mar 23, 2017 at 11:59 PM, Nick Coghlan wrote: > On 24 March 2017 at 13:24, Wes Turner wrote: > >> >> On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan wrote: >> >>> This means we're not going to be automating the process of getting an >>> editable checkout in the core tools any time soon - there are already 100k+ >>> published packages on PyPI, so anyone that seriously wants to do this is >>> going to have to write their own client utility that attempts to infer it >>> from the metadata that already exists (probably by building atop distlib, >>> since that has all the necessary pieces to read the various metadata >>> formats, both remote and local). >>> >>> Future metadata extensions might help to make such a tool more reliable, >>> but *requiring* metadata changes to be made first will just make it useless >>> (since it wouldn't work at all until after publishers start publishing the >>> new metadata, which would mean waiting years before it covered a reasonable >>> percentage of PyPI). >>> >> >> Here's a way to define Requirements and a RequirementsMap with additional >> data: >> https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 >> >> It creates a directory full of requirements[.dev].txt files: >> https://github.com/westurner/pyleset/tree/57140bce/requirements >> >> Additional metadata in Pipfile would be nice; >> but it would be fairly easy to send a PR to: >> >> BLD: setup.py: add the canonical sourceURL >> > > PEP 426 already has a source URL field: https://www.python.org/dev/ > peps/pep-0426/#source-url > > It's just not required to be a *version* control reference - it's free to > be a reference to a tarball or zip archive instead (just not a reference to > the sdist itself, since that will contain a copy of the metadata file). > > However, independently of that concern, "send a PR" is only the first step > in updating published metadata to accommodate tasks that package > *consumers* want to perform: > > 1. Someone has to write and submit the upstream project patch > 2. The publisher has to review and accept the change > 3. The publisher has to publish the new release > 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending > on the scope of what you care about > > So the lesson we've learned is that for consumer tasks it's *always* > better to start by asking "How can I best achieve my objective without > asking publishers to change *anything*?". > > In the case of finding version control references, that's a matter of: > > - looking at Download-URL and Project-URL entries for links that "look > like" version control references > - if that doesn't turn up anything useful, scan the long description > - once you have a repository reference, look for promising tag names (if > the link didn't nominate a specific commit) > > On the *publisher* side, the equivalent question is "Can publishers > already choose to publish this metadata without having to wait for a > metadata update?". > > In this case, the answer is yes, due to the "Project-URL" field: anyone is > free to push for the adoption of a particular convention for tagging the > exact version control reference needed for "pip -e" to retrieve the > corresponding source code. 
> > Putting those two together means that anyone that chooses to do so is > already free to write a tool that: > > - downloads a PyPI package > - looks for a "Editable Install" Project-URL, and uses that if defined > - otherwise looks for a promising VCS reference in Download-URL, the > Project-URL definitions, and the long description > - runs `pip -e` based on whatever it finds > > > And as long as that tool is itself pip installable, there's no particular > reason the feature needs to be built into pip itself. > STORY: Users can pull the source code for each installed package (git, [{RPM,} (archive-within-RPM.tar.gz)]) ... the npm package.json docs are a pretty good read here: - (with {name, description, url} things are already schema.org/Thing s) - https://docs.npmjs.com/files/package.json#bugs - https://docs.npmjs.com/files/package.json#repository - https://docs.npmjs.com/files/package.json#man ```json "bugs": { "url" : "https://github.com/owner/project/issues", "email" : "project at hostname.com" } "repository" : { "type" : "git" , "url" : "https://github.com/npm/npm.git" } "repository" : { "type" : "svn" , "url" : "https://v8.googlecode.com/svn/trunk/" } ``` > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Mar 24 05:37:43 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 24 Mar 2017 04:37:43 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On Fri, Mar 24, 2017 at 4:26 AM, Wes Turner wrote: > > > On Thu, Mar 23, 2017 at 11:59 PM, Nick Coghlan wrote: > >> On 24 March 2017 at 13:24, Wes Turner wrote: >> >>> >>> On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan >>> wrote: >>> >>>> This means we're not going to be automating the process of getting an >>>> editable checkout in the core tools any time soon - there are already 100k+ >>>> published packages on PyPI, so anyone that seriously wants to do this is >>>> going to have to write their own client utility that attempts to infer it >>>> from the metadata that already exists (probably by building atop distlib, >>>> since that has all the necessary pieces to read the various metadata >>>> formats, both remote and local). >>>> >>>> Future metadata extensions might help to make such a tool more >>>> reliable, but *requiring* metadata changes to be made first will just make >>>> it useless (since it wouldn't work at all until after publishers start >>>> publishing the new metadata, which would mean waiting years before it >>>> covered a reasonable percentage of PyPI). >>>> >>> >>> Here's a way to define Requirements and a RequirementsMap with >>> additional data: >>> https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 >>> >>> It creates a directory full of requirements[.dev].txt files: >>> https://github.com/westurner/pyleset/tree/57140bce/requirements >>> >>> Additional metadata in Pipfile would be nice; >>> but it would be fairly easy to send a PR to: >>> >>> BLD: setup.py: add the canonical sourceURL >>> >> >> PEP 426 already has a source URL field: https://www.python.org/dev/pep >> s/pep-0426/#source-url >> >> It's just not required to be a *version* control reference - it's free to >> be a reference to a tarball or zip archive instead (just not a reference to >> the sdist itself, since that will contain a copy of the metadata file). 
>> >> However, independently of that concern, "send a PR" is only the first >> step in updating published metadata to accommodate tasks that package >> *consumers* want to perform: >> >> 1. Someone has to write and submit the upstream project patch >> 2. The publisher has to review and accept the change >> 3. The publisher has to publish the new release >> 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending >> on the scope of what you care about >> >> So the lesson we've learned is that for consumer tasks it's *always* >> better to start by asking "How can I best achieve my objective without >> asking publishers to change *anything*?". >> >> In the case of finding version control references, that's a matter of: >> >> - looking at Download-URL and Project-URL entries for links that "look >> like" version control references >> - if that doesn't turn up anything useful, scan the long description >> - once you have a repository reference, look for promising tag names (if >> the link didn't nominate a specific commit) >> >> On the *publisher* side, the equivalent question is "Can publishers >> already choose to publish this metadata without having to wait for a >> metadata update?". >> > > >> In this case, the answer is yes, due to the "Project-URL" field: anyone >> is free to push for the adoption of a particular convention for tagging the >> exact version control reference needed for "pip -e" to retrieve the >> corresponding source code. >> > https://www.google.com/search?q=python+pep+"project-url" (!) https://www.python.org/dev/peps/pep-0345/#project-url-multiple-use Project-URL (multiple-use) > A string containing a browsable URL for the project and a label for it, > separated by a comma. > Example: > Bug Tracker, http://bitbucket.org/tarek/distribute/issues/ > The label is a free text limited to 32 signs. - Predicate URIs are often longer than 32 signs. (pypi:pkgname, label, URL) # RDF triples (subject, predicate, object) (URI, URI, *) # RDF quads (graph, s, p, o) (URI, URI, URI, *) >From http://legacy.python.org/dev/peps/pep-0426/#source-url: > For version control references, the VCS+protocol scheme SHOULD be used to > identify both the version control system and the secure transport, and a > version control system with hash based commit identifiers SHOULD be used. > Automated tools MAY omit warnings about missing hashes for version control > systems that do not provide hash based commit identifiers. > To handle version control systems that do not support including commit or > tag references directly in the URL, that information may be appended to the > end of the URL using the @ or the @# > notation. >> Putting those two together means that anyone that chooses to do so is >> already free to write a tool that: >> >> - downloads a PyPI package >> - looks for a "Editable Install" Project-URL, and uses that if defined >> - otherwise looks for a promising VCS reference in Download-URL, the >> Project-URL definitions, and the long description >> > > Explicit is better than implicit. > Simple is better than complex. > - runs `pip -e` based on whatever it finds >> > > >> And as long as that tool is itself pip installable, there's no particular >> reason the feature needs to be built into pip itself. >> > > STORY: Users can pull the source code for each installed package (git, > [{RPM,} (archive-within-RPM.tar.gz)]) > > ... 
> > the npm package.json docs are a pretty good read here: > > - (with {name, description, url} things are already schema.org/Thing s) > - https://docs.npmjs.com/files/package.json#bugs > - https://docs.npmjs.com/files/package.json#repository > - https://docs.npmjs.com/files/package.json#man > > ```json > > "bugs": > { "url" : "https://github.com/owner/project/issues", > "email" : "project at hostname.com" > } > > "repository" : > { "type" : "git" > , "url" : "https://github.com/npm/npm.git" > } > > "repository" : > { "type" : "svn" > , "url" : "https://v8.googlecode.com/svn/trunk/" > } > ``` > > > >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 24 08:41:37 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2017 22:41:37 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On 24 March 2017 at 19:37, Wes Turner wrote: > > On Fri, Mar 24, 2017 at 4:26 AM, Wes Turner wrote: > > https://www.python.org/dev/peps/pep-0345/#project-url-multiple-use > > Project-URL (multiple-use) >> A string containing a browsable URL for the project and a label for it, >> separated by a comma. >> Example: >> Bug Tracker, http://bitbucket.org/tarek/distribute/issues/ >> The label is a free text limited to 32 signs. > > > - Predicate URIs are often longer than 32 signs. > The nominal 32 character limit is on the label, not on the URL. (And I'm not sure it's a real limit in practice) Putting those two together means that anyone that chooses to do so is >> already free to write a tool that: >> >> - downloads a PyPI package >> - looks for a "Editable Install" Project-URL, and uses that if defined >> - otherwise looks for a promising VCS reference in Download-URL, the >> Project-URL definitions, and the long description >> > > Explicit is better than implicit. > Simple is better than complex. And complex is better than complicated. The logistics of packaging metadata updates are complex because the deployment cycles are so long, and you somehow have to backfill missing data for projects that don't yet provide it in the new-and-improved form. For this particular problem, finding the right URL to clone is such a small part of making edits to a dependency that it's an incredibly long way down the list of "limitations that regularly cause problems for Python users". Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmazuel at microsoft.com Fri Mar 24 12:49:05 2017 From: lmazuel at microsoft.com (Laurent Mazuel) Date: Fri, 24 Mar 2017 16:49:05 +0000 Subject: [Distutils] Pip different installation behavior when using --no-cache-dir Message-ID: Hello, I was talking with Brett Cannon at work today about why ?python setup.py install? is called when `--no-cache-dir` is specified instead of python setup.py bdist_wheel in the normal case. He didn't know the answer and suggested I ask here to see if it's on purpose or just an oversight of when caching was introduced. Basically we have a situation where we tweak a little the bdist_wheel step in our setup.py to take advantage of the ?flat? install of pip, and allow easyinstall to work perfectly from the sdist at the same time. 
Pip install of the sdist file works like a charm, since as expected pip calls bdist_wheel first (and applies our changes):

> pip install .\azure-common-1.1.4.zip
Processing d:\vsprojects\azure-sdk-for-python-official\wheelhouse\azure-common-1.1.4.zip
Building wheels for collected packages: azure-common
  Running setup.py bdist_wheel for azure-common ... done
  Stored in directory: C:\Users\lmazuel\AppData\Local\pip\Cache\wheels\4c\3a\10\5e2ef6db79d3785728205a4b5b8348eb41a474ec99505cd865
Successfully built azure-common
Installing collected packages: azure-common
Successfully installed azure-common-1.1.4

However, the same call with --no-cache-dir bypasses the bdist_wheel step:

> pip install --no-cache-dir .\azure-common-1.1.4.zip
Processing d:\vsprojects\azure-sdk-for-python-official\wheelhouse\azure-common-1.1.4.zip
Installing collected packages: azure-common
  Running setup.py install for azure-common ... done
Successfully installed azure-common-1.1.4

It seems to me that pip should always call bdist_wheel, since in theory the wheel building step can be changed to fit the platform (in my understanding of sdist vs. egg vs. wheel). And so even --no-cache-dir should call bdist_wheel, even if the wheel is not cached at the end.

Pip install of the sdist is not our most common situation, since our wheels are universal, but I'm still interested in improving my pip knowledge. What do you think?

Thanks!

Laurent

From leonardo.bianconi at eldorado.org.br Fri Mar 24 15:40:56 2017
From: leonardo.bianconi at eldorado.org.br (Leonardo Bianconi)
Date: Fri, 24 Mar 2017 19:40:56 +0000
Subject: [Distutils] Wheel files for PPC64le
In-Reply-To: References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br>
Message-ID: <46301d400edb469099abac67f6fed74a@serv030.corp.eldorado.org.br>

From: Nick Coghlan [mailto:ncoghlan at gmail.com]
Sent: Thursday, 23 March 2017 23:59
To: Leonardo Bianconi
Cc: distutils-sig at python.org
Subject: Re: [Distutils] Wheel files for PPC64le

On 24 March 2017 at 05:00, Leonardo Bianconi wrote:

Hi all!

I have been discussing the creation of a PEP that describes how to create wheel files for the PPC64le architecture on wheel-builders (https://mail.python.org/pipermail/wheel-builders/) since January (https://mail.python.org/pipermail/wheel-builders/2017-January/000245.html).

Thanks Leonardo!

As all discussions on that list are done, the next step is for it to be reviewed here, and then to create a draft on GitHub.

You can submit the PR to the PEPs repo whenever you're ready - it's actually handy to have the PEP number assigned fairly early as a convenient reference for the proposal.

The ``manylinux3`` policy
=========================

Based on PEP 513 [1]_, the policy follows the same rules and library dependencies, but with the following versions for backward compatibility and base Operational System:

* Backward compatibility versions:
  GLIBC <= 2.17
  CXXABI <= 1.3.7
  GLIBCXX <= 3.4.9
  GCC <= 4.8.5

* Base Operational System:
  The stock O.S. release needs to be CentOS 7 [4]_, as it is the first CentOS release available for PowerPC64 little endian.
Having manylinuxN consistently align with CentOS(N+4) seems reasonable to me for simplicity's sake, but there should be a discussion in the PEP around how that aligns with ppc64le support on other LTS distros (mainly Debian and Ubuntu). Given the relative dates involved, I'd expect manylinux-style binaries compiled on CentOS 7 to also work on Ubuntu 14.04, 16.04 and Debian 8, but the PEP should explicitly confirm that the nominated symbol versions above are available on all of those distros.

Ok, I can add it to the PEP, but regarding the supported distros, those older than CentOS 7 may not be compatible, based on the backward compatibility rules, which do not guarantee compatibility with older versions, only with newer ones. I sent a message about it here: https://mail.python.org/pipermail/wheel-builders/2017-March/000265.html

Compilation of Compliant Wheels
===============================

As compiling wheel files that meet the ``manylinux3`` standard for PowerPC64 little endian requires a specific Linux distro and version, the following tool is provided:

Docker Image (Will be implemented when CentOS becomes available on Docker)
------------

The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 little endian CentOS release. The Image contains all necessary tools in the requested version to build wheel files (gcc, g++ and gfortran 4.8.5).

These seem to be present now: https://hub.docker.com/r/ppc64le/centos/tags/

I'm not clear on the provenance of the 'ppc64le' user account though, so I've asked for clarification: https://twitter.com/ncoghlan_dev/status/845099237117329408

Platform detection for Installers
=================================

The platform detection is almost the same as described in PEP 513 [1]_, but with the following proposed changes:

1. Add the platform ppc64le to the platform list as a compatible one: [``linux-x86_64``, ``linux-i686``, ``linux-ppc64le``]
2. Add an if to switch on architecture and consider the correct version of the GLIBC in ``return have_compatible_glibc(2, 5)``.

I don't think it is quite that simple, as installers need to be able to figure out:

- on manylinux3 compatible platforms, prefer manylinux3 to manylinux1
- on manylinux3 *in*compatible platforms, only consider manylinux1

And that means asking the question: when combined with the option of the distro-provided `_manylinux` module, is "have_compatible_glibc(2, 5) and not have_compatible_glibc(2, 17)" an adequate check for the latter case? (My inclination is to say "yes", but it would be helpful to have some more concrete data on glibc versions in different distros of interest)

Well, I didn't realize that proposing a new tag would require an additional check about the tags, which will be a requirement for manylinux2 as well, when CentOS 5 is replaced by CentOS 6 for x86_64/i686. I need to check where and how the method "is_manylinux1_compatible" is used to think through how it would be done. I will check that and propose how to do it.

Beyond that, I think the main open question would be: do we go ahead and define the full `manylinux3` specification now? CentOS 7+, Ubuntu 14.04+, Debian 8+ compatibility still covers a *lot* of distros and deployments, and doing so means folks can bring the latest versions of gcc to bear on their code, rather than being limited to the last version that was made available for RHEL/CentOS 5 (gcc 4.8).

Actually, the idea was to make it available for PPC64le, just as it is available for x86_64/i686 nowadays, like porting it.
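To sketch what that installer-side check could look like for the new tag - mirroring the shape of PEP 513's is_manylinux1_compatible(), and assuming the have_compatible_glibc() helper from that PEP - something like:

```python
# Sketch only: the shape of an is_manylinux3_compatible() check.
# have_compatible_glibc() is assumed to be the PEP 513 helper.
import distutils.util

def is_manylinux3_compatible():
    # Only consider platforms the tag would be defined for.
    if distutils.util.get_platform() not in (
            "linux-x86_64", "linux-i686", "linux-ppc64le"):
        return False
    # Let the distro override the automatic detection either way.
    try:
        import _manylinux
        return bool(_manylinux.manylinux3_compatible)
    except (ImportError, AttributeError):
        pass
    # Fall back to the glibc heuristic (CentOS 7 baseline).
    return have_compatible_glibc(2, 17)
```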
I didn?t think about the definition of all requirements for the manylinux3 for all architectures, as it can change until x86_64/i686 get to the manylinux3. Being limited to an old version, as CentOS 5 (gcc 4.8) is a requirement from PEP 513, which guarantees the backward compatibility, right? I do not want to change it, this proposal is just to create a tag for PPC64le, until both architectures get to the same base distro version. As I said above, I have already sent a message about basing it on CentOS 7, which does not guarantee the compatibility with older distros (example: Ubuntu 14.04). Is there any thinking about base on a newer distro and make the wheel files compatible with distros older than it? Sorry if I?m missing something here. Going down that path would also means things would be simpler on the PyPI front - it could just allow manylinux3 for any architecture and let installers decide whether or not to use them. Cheers, Nick. I?m coping the Bruno Rosa, which will be involved with this PEP as well. Cheers, Leonardo Bianconi. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Fri Mar 24 17:12:14 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Fri, 24 Mar 2017 22:12:14 +0100 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> Am 24.03.2017 um 05:59 schrieb Nick Coghlan: > On 24 March 2017 at 13:24, Wes Turner > wrote: > > > On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan > wrote: > > This means we're not going to be automating the process of getting an editable checkout in the core tools any time soon - there are already 100k+ published packages on PyPI, so anyone that seriously wants to do this is going to have to write their own client utility that attempts to infer it from the metadata that already exists (probably by building atop distlib, since that has all the necessary pieces to read the various metadata formats, both remote and local). > > Future metadata extensions might help to make such a tool more reliable, but *requiring* metadata changes to be made first will just make it useless (since it wouldn't work at all until after publishers start publishing the new metadata, which would mean waiting years before it covered a reasonable percentage of PyPI). > > > Here's a way to define Requirements and a RequirementsMap with additional data: > https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 > > It creates a directory full of requirements[.dev].txt files: > https://github.com/westurner/pyleset/tree/57140bce/requirements > > Additional metadata in Pipfile would be nice; > but it would be fairly easy to send a PR to: > > BLD: setup.py: add the canonical sourceURL > > > PEP 426 already has a source URL field: https://www.python.org/dev/peps/pep-0426/#source-url > > It's just not required to be a *version* control reference - it's free to be a reference to a tarball or zip archive instead (just not a reference to the sdist itself, since that will contain a copy of the metadata file). > > However, independently of that concern, "send a PR" is only the first step in updating published metadata to accommodate tasks that package *consumers* want to perform: > > 1. Someone has to write and submit the upstream project patch > 2. The publisher has to review and accept the change > 3. 
The publisher has to publish the new release > 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending on the scope of what you care about > > So the lesson we've learned is that for consumer tasks it's *always* better to start by asking "How can I best achieve my objective without asking publishers to change *anything*?". > > In the case of finding version control references, that's a matter of: > > - looking at Download-URL and Project-URL entries for links that "look like" version control references > - if that doesn't turn up anything useful, scan the long description > - once you have a repository reference, look for promising tag names (if the link didn't nominate a specific commit) This is the spirit of python packaging: We love guessing and trying; we hate well-defined data structures. Yes, nothing is more boring than Entity-relationship models. But this provides solid ground. Yes, I can write code which does the steps you suggest: looking at Download-URL, if this ... do that, look for something promising .... on full moon start dancing, but not in april ... Anyone is free to do what he wants. JSON here, JSON there ... Let's meet again in ten years and have a look at how the IT world has changed. My guess: well-defined data structures like protocol buffers will increase and fuzzy data structures like JSON will decrease. Unfortunately I have not found a dependency resolver which works for several languages (not just Python) yet. The feeling "I want to leave pip and python-packaging" has been here for several years, but up to now I have found no concrete path to follow. I know that all here are doing their best. If you feel insulted, then I am not sorry at all, since I did not do it. I just wrote what I feel. This is my personal problem. Not yours. Regards, Thomas Güttler -- http://www.thomas-guettler.de/ From ncoghlan at gmail.com Sat Mar 25 01:38:56 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Mar 2017 15:38:56 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> References: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> Message-ID: On 25 March 2017 at 07:12, Thomas Güttler wrote: > On 24.03.2017 at 05:59, Nick Coghlan wrote: > > So the lesson we've learned is that for consumer tasks it's *always* > better to start by asking "How can I best achieve my objective without > asking publishers to change *anything*?". > > > > In the case of finding version control references, that's a matter of: > > > > - looking at Download-URL and Project-URL entries for links that "look > like" version control references > > - if that doesn't turn up anything useful, scan the long description > > - once you have a repository reference, look for promising tag names (if > the link didn't nominate a specific commit) > > This is the spirit of python packaging: > > We love guessing and trying; we hate well-defined data structures. > > Yes, nothing is more boring than Entity-relationship models. But this > provides solid ground. > We've learned over the years that the data migration challenges involved in packaging metadata changes override essentially *every* other consideration.
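As an aside, the heuristic quoted above is straightforward to express in code. The following is only an illustrative sketch - the shape of the ``metadata`` dict and the regular expression are assumptions made for the example, not any real pip or distlib API:

```python
# Illustrative sketch of the "guess the repository from already-published
# metadata" heuristic quoted above.  The metadata dict layout here is an
# assumption for the example, not a real packaging API.
import re

VCS_URL = re.compile(
    r"https?://(github\.com|gitlab\.com|bitbucket\.org)/[\w.-]+/[\w.-]+",
    re.IGNORECASE,
)


def guess_repo_url(metadata):
    # 1. Prefer explicit URL fields (Download-URL, Project-URL entries).
    candidates = [metadata.get("download_url") or ""]
    candidates += list(metadata.get("project_urls", {}).values())
    for url in candidates:
        if url and VCS_URL.match(url):
            return url
    # 2. Otherwise, scan the long description for anything that looks
    #    like a repository link.
    match = VCS_URL.search(metadata.get("description") or "")
    return match.group(0) if match else None
```

Whether a guess like this is good enough in practice is exactly the trade-off being debated in this thread.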
We've also learned that volunteers generally won't work on features they don't personally need, that commercial vendors will typically only fund work on their own downstream package management systems, and that if an opportunity arises to structurally bypass PyPA and python-dev as centralised approval authorities (aka procedural bottlenecks) we should take it. Those lessons mean that all proposals for changes to the metadata now have to address two key not-so-simple questions: 1. Is the idea still potentially useful even if *nobody except the person proposing it* ever adopts it? 2. Is a formal change to the interoperability specifications, with the associated delays in availability and adoption, the *only* way to solve the issue at hand? This is a large part of why PEP 426 was deferred for so long, and why my recent changes to that PEP have all been directed at removing features rather than adding them. It's also why the accepted pyproject.toml proposal assumes that sdists will continue to include a setup.py shim for backwards compatibility with older tools that assume distutils/setuptools based build processes. A "py-install-editable" utility that looks for an "Editable Source" label in Project-URL as a convention would meet those criteria, as it makes use of an existing v1.2 metadata field and shouldn't require any changes to pip, PyPI, setuptools, etc. for people to enable it for their *own* projects - they'll just have to set Project-URL appropriately, and run "pip install py-install-editable && py-install-editable ". Whoever actually wrote the `py-install-editable` tool would have complete freedom to define the expected label name, as well as what fallback heuristics (if any) were tried in the event that the label wasn't set, without having to seek permission or approval from anyone else (not even distutils-sig). If such a technique got popular enough, *then* we could look at elevating it beyond "Project URL with a conventional label backed up by an end user tool that reads that label" (e.g. by having "pip install -e" itself read the label, or enshrining the conventional label name in a PEP). Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Sat Mar 25 08:10:33 2017 From: guettliml at thomas-guettler.de (Thomas Güttler) Date: Sat, 25 Mar 2017 13:10:33 +0100 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> Message-ID: <622b27bd-b8e4-9fa5-1fa8-0ff27fb464c6@thomas-guettler.de> On 25.03.2017 at 06:38, Nick Coghlan wrote: > On 25 March 2017 at 07:12, Thomas Güttler > wrote: > > On 24.03.2017 at 05:59, Nick Coghlan wrote: > > So the lesson we've learned is that for consumer tasks it's *always* better to start by asking "How can I best achieve my objective without asking publishers to change *anything*?". > > > > In the case of finding version control references, that's a matter of: > > > > - looking at Download-URL and Project-URL entries for links that "look like" version control references > > - if that doesn't turn up anything useful, scan the long description > > - once you have a repository reference, look for promising tag names (if the link didn't nominate a specific commit) > > This is the spirit of python packaging: > > We love guessing and trying; we hate well-defined data structures.
> > Yes, nothing is more boring than Entity-relationship models. But this provides solid ground. > > > We've learned over the years that the data migration challenges involved in packaging metadata changes override essentially *every* other consideration. We've also learned that volunteers generally won't work on features they don't personally need, that commercial vendors will typically only fund work on their own downstream package management systems, and that if an opportunity arises to structurally bypass PyPA and python-dev as centralised approval authorities (aka procedural bottlenecks) we should take it. > > Those lessons mean that all proposals for changes to the metadata now have to address two key not-so-simple questions: > > 1. Is the idea still potentially useful even if *nobody except the person proposing it* ever adopts it? > 2. Is a formal change to the interoperability specifications, with the associated delays in availability and adoption, the *only* way to solve the issue at hand? > > This is a large part of why PEP 426 was deferred for so long, and why my recent changes to that PEP have all been directed at removing features rather than adding them. This is great to hear! I tried to read it three times in the past. Every time it took too long to read it. I was interrupted by more urgent stuff or I realized that I am tired and need some hours of sleep. This is my personal opinion: Why a PEP at all? I like this quote: rough consensus and running code. > It's also why the accepted pyproject.toml proposal assumes that sdists will continue to include a setup.py shim for backwards compatibility with older tools that assume distutils/setuptools based build processes. At university I was told to avoid redundancy. But you are the expert here. If you think redundancy is good here, then I think it is the right decision. > > A "py-install-editable" utility that looks for an "Editable Source" label in Project-URL as a convention would meet those criteria, as it makes use of an existing v1.2 metadata field and shouldn't require any changes to pip, PyPI, setuptools, etc. for people to enable it for their *own* projects - they'll just have to set Project-URL appropriately, and run "pip install py-install-editable && py-install-editable ". Whoever actually wrote the `py-install-editable` tool would have complete freedom to define the expected label name, as well as what fallback heuristics (if any) were tried in the event that the label wasn't set, without having to seek permission or approval from anyone else (not even distutils-sig). OK, that sounds good. At first I had the impression that this is not possible. My idea could be implemented without modifying pip ... great. > > If such a technique got popular enough, *then* we could look at elevating it beyond "Project URL with a conventional label backed up by an end user tool that reads that label" (e.g. by having "pip install -e" itself read the label, or enshrining the conventional label name in a PEP). Yes, this sounds good. This way the core does not get polluted by random crazy ideas. Thank you very very much for listening. Regards, Thomas Güttler -- I am looking for feedback for my personal programming guidelines: https://github.com/guettli/programming-guidelines From pradyunsg at gmail.com Sat Mar 25 11:44:19 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sat, 25 Mar 2017 15:44:19 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal Message-ID: Hello Everyone!
I had previously sent a mail on this list, stating that I would like to work on pip's dependency resolution for my GSoC 2017. I now have drafted a proposal for the same, with help from my mentors - Donald Stufft and Justin Cappos. I'd also like to take this opportunity to thank them for agreeing to be my mentors for this GSoC. I would like to request comments on my proposal for GSoC - it is hosted at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. Please find trailing a MarkDown version of the proposal. Thanks, Pradyun Gedam ----- # Adding Proper Dependency Resolution to pip - **Name:** Pradyun S. Gedam - **Email:** [pradyunsg at gmail.com][mailto-email] - **Github:** [pradyunsg][github-profile] - **University:** [VIT University, Vellore, India][vit-homepage] - **Course:** Bachelor of Technology in Computer Science and Engineering - **Course Term:** 2016/17 - 2019/20 (4 Year) - **Timezone:** IST (GMT +5:30) - **GSoC Blog RSS Feed URL:** < https://pradyunsg.github.io/gsoc-2017/feed.xml> [github-profile]: http://github.com/pradyunsg/ [vit-homepage]: http://vit.ac.in/ [mailto-email]: mailto:pradyunsg at gmail.com ## About Me I was introduced to Python about five years ago, through "Core Python Programming" by Wesley J. Chun. After the initial struggle with Python 2/3, the ball was set rolling and hasn't stopped since. I have fiddled around with Game Programming (PyGame), Computer Vision (OpenCV), Data Analytics (Pandas, SciPy, NumPy), transcompilers (Py2C) and more. As a high school student, I did internships at Enthought in 2013 and 2014. The two summers that I spent at Enthought were a great learning experience and I thoroughly enjoyed the environment there. Other than Python, I have also used C, C++, Web Technologies (JavaScript, HTML, CSS) and preprocessors (Pug, TypeScript, LESS/SASS/SCSS), Java and Bash/Zsh for some other projects. Curious to understand how pip works, I began looking into pip's source code. I started contributing to pip in May 2016. I am now fairly familiar with the design of pip and have a fair understanding of how it works, due to the various contributions I have made to pip in the past year. ## Mentors - Donald Stufft (@dstufft on GitHub) - Justin Cappos (@JustinCappos on GitHub) Communication with the mentors will be done mostly on issues and pull requests on pip's GitHub repository. If at any point in time a real-time discussion is needed with the mentors, it will be done over IRC or Skype. Email can also be used if needed. ## Proposal This project is about improving the dependency resolution performed within pip by implementing a proper dependency resolver within it. ### Abstract Currently, pip does not resolve dependencies correctly when there are conflicting requirements. The lack of dependency resolution has caused hard-to-debug bugs/failures due to the installation of incompatible packages. The lack of a dependency resolver is also a blocker for various other features - adding an upgrade-all functionality to pip and properly determining build-time dependencies for packages are two such features. ### Deliverables At the end of this project, pip will have the ability to: - detect requirement conflicts - resolve conflicting dependency requirements where possible ### Implementation The implementation language will be Python. The code will maintain the compatibility requirements of pip - the same source code will support multiple Python implementations and versions, including but not limited to, CPython 2.7, CPython 3.3+, PyPy 2, PyPy3.
New tests for the functionality introduced will be added to pip's current test suite. User documentation would not be a major part of this project. The only changes would be to mention pip can now resolve dependencies properly. There are certain sections that might need updating. #### Overview The project will be composed of the following stages: 1. Refactor the dependency resolution logic of pip into a separate module. 1. Implement dependency information caching in pip. 1. Implement a backtracking dependency resolver to resolve the dependencies. Every stage depends on the previous ones being completed. This step-wise approach would make incremental improvements so that there is constant feedback on the work being done, as well as making it easy to change course without losing existing work, if needed for some unforeseen reason. #### Discussion There is a class in pip - `RequirementSet`, which is currently a God class. It is responsible for the following: 1. Holding a list of things to be installed 1. Downloading Files 1. Dependency Resolution 1. Unpacking downloaded files (preparation for installation) 1. Uninstalling packages 1. Installing packages This is clearly a bad situation since this is most of the heavy lifting involved in pip. These responsibilities need to be separated and moved out into their independent modules/classes, to allow for simplification of `RequirementSet` while providing a clean platform for the remaining work. This is the most tricky portion of this project, given the complexity of `RequirementSet` as it stands today. There are two kinds of distributions that may be used to install a package - wheels (binary) and sdists (source). When installing a package, pip builds a wheel from an sdist and then proceeds to install the wheel. The difference between the two formats of distribution relevant to this project is - wheels store the information about dependencies within them statically; sdists do not. Determining the dependencies of a wheel distribution is merely a matter of fetching the information from a METADATA file within the `.whl` file. The dependency information for an sdist, on the other hand, can only be determined after running its `setup.py` file on the target system. This means that dependencies of an sdist depend on how its `setup.py` behaves, which can change due to variations on target systems or could even vary through random choices. Computation of an sdist's dependencies on the target system is a time-consuming task since it potentially involves fetching a package from PyPI and executing its setup.py to get the dependency information. In order to improve performance, once an sdist's dependencies are computed, they would be stored in a cache so that during dependency resolution, the dependencies of an sdist are not computed every time they are needed. Further, pip caches wheels it downloads or builds, meaning that any installed package or downloaded wheel's dependency information would be available statically, without the need to go through the sdist dependency cache. Like the wheel cache, the sdist-dependency-cache will be a file system based cache. The sdist-dependency-cache would only be used if the corresponding sdist is being used. Since sdist dependencies are non-deterministic, the cached dependency information is potentially incorrect - in certain corner cases such as using random choices in setup.py files. Such uses are not seen as important enough to cater to, compared to the benefits of having a cache.
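To illustrate the "static metadata" case described above, the dependency list of a wheel can be pulled straight out of its ``*.dist-info/METADATA`` file without executing any project code. A minimal sketch (the wheel filename in the final comment is just an example):

```python
# Minimal illustration of the "static metadata" case described above:
# a wheel's dependencies can be read from its *.dist-info/METADATA file
# without running any code from the package being inspected.
import zipfile
from email.parser import Parser


def wheel_requires_dist(wheel_path):
    with zipfile.ZipFile(wheel_path) as whl:
        metadata_name = next(
            name for name in whl.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        raw = whl.read(metadata_name).decode("utf-8")
    # METADATA uses an email-style header format, so the stdlib email
    # parser is enough to pull out the Requires-Dist entries.
    return Parser().parsestr(raw).get_all("Requires-Dist") or []


# e.g. wheel_requires_dist("some_project-1.0-py2.py3-none-any.whl")
# returns a list of requirement strings such as "six (>=1.9)".
```

Nothing comparable exists for an sdist, which is exactly why the cache described above has to store the result of actually running setup.py.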
Further, this is already the case with the information in the wheel cache. SAT solver based resolution is not feasible for pip since a SAT solver needs the entire set of packages and their dependencies to compute a solution, which cannot be generated due to the aforementioned non-deterministic behaviour of setup.py files. Computing dependency information for all of PyPI on a target system for installing a package is simply not feasible. The most reasonable approach is using a backtracking solver. Such a solver can be incrementally provided information about the dependencies of a package and would only need dependency information about packages in the dependency graph of the current system. There is a need to keep a cache of visited packages during dependency resolution. A certain package-version combination may be reached via multiple paths and it is an inefficient use of computation time to re-compute whether it is indeed going to satisfy the requirements or not. By storing information about which package-version combinations have been visited and do (or do not) satisfy the constraints, it is possible to speed up the resolution. Consider the following example: ``` A-1.0 (depends: B) A-2.0 (depends: B and E-1.0) B-1.0 (depends: C and D) C-1.0 (depends: D) D-1.0 D-1.1 (depends: E-2.0) E-1.0 E-2.0 ``` If an installation of A is required, A-2.0 and D-1.1 cannot both be installed because they have conflicting requirements on E. While there are multiple possible solutions to this situation, the "most obvious" one is to use D-1.0, instead of not installing A-2.0. Further, as multiple packages depend on D, the algorithm would "reach it" multiple times. By maintaining a cache for the visited packages, it is possible to achieve a speedup in such a scenario. #### Details Pull requests would be made on a regular basis during the project to ensure that the feedback loop is quick. This also reduces the possibility of conflicts due to unrelated changes in pip. All the code will be tested within pip's existing testing infrastructure. It has everything needed to write tests related to all the changes to be made. Every PR made to pip as a part of this project will contain new tests or modifications to existing ones as needed. ##### Stage 1 Initially, some abstractions will be introduced to the pip codebase to improve the reuse of certain common patterns within the codebase. This includes cleaner temporary directory management through a `TempDirectory` class. `RequirementSet.prepare_files` and `RequirementSet._prepare_file` are downloading and unpacking packages, as well as doing what pip currently does as dependency resolution. Taking these functions apart neatly is going to be a tricky task. The following is a listing of the final modules that will be responsible for the various tasks that are currently being performed by `RequirementSet`: - `pip.resolve` - Dependency Resolution - `pip.download` - Downloading Files & Unpacking downloaded files - `pip.req.req_set` - Holding a list of things to be installed / uninstalled - `pip.operations.install` - Installing Packages (preparation for installation) - `pip.operations.uninstall` - Uninstalling Packages To be able to proceed to the next step, only the dependency resolution related code needs to be refactored into a separate module. Other portions of `RequirementSet` are not required to be refactored but the same will be tackled as optional deliverables.
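The kind of search being proposed can be illustrated with a toy sketch over the example graph above. This is purely illustrative: versions are plain strings, a "requirement" is either a bare name (any version) or an exact "name-version" pin, candidates are tried newest-first, and none of the names correspond to actual pip internals or to the visited-package cache bookkeeping.

```python
# Toy backtracking resolver over the example graph above -- purely
# illustrative, not pip code.
CANDIDATES = {
    "A": ["2.0", "1.0"],
    "B": ["1.0"],
    "C": ["1.0"],
    "D": ["1.1", "1.0"],
    "E": ["2.0", "1.0"],
}
DEPENDS = {
    ("A", "1.0"): ["B"],
    ("A", "2.0"): ["B", "E-1.0"],
    ("B", "1.0"): ["C", "D"],
    ("C", "1.0"): ["D"],
    ("D", "1.0"): [],
    ("D", "1.1"): ["E-2.0"],
    ("E", "1.0"): [],
    ("E", "2.0"): [],
}


def resolve(requirements, chosen=None):
    chosen = dict(chosen or {})          # work on a copy at each level
    if not requirements:
        return chosen                    # everything satisfied
    req, rest = requirements[0], requirements[1:]
    name, _, pinned = req.partition("-")
    pinned = pinned or None
    if name in chosen:
        # Already decided earlier on this path: backtrack on conflict.
        if pinned is not None and chosen[name] != pinned:
            return None
        return resolve(rest, chosen)
    for version in CANDIDATES[name]:
        if pinned is not None and version != pinned:
            continue
        chosen[name] = version
        result = resolve(rest + DEPENDS[(name, version)], chosen)
        if result is not None:
            return result
        del chosen[name]                 # undo and try the next candidate
    return None                          # no candidate worked: backtrack


print(resolve(["A"]))
# Picks A 2.0 with B 1.0, C 1.0, D 1.0 and E 1.0 -- the D 1.1 / E 2.0
# branch is tried first and abandoned by backtracking.
```

Run against the example, this ends up with A 2.0 together with D 1.0 and E 1.0, i.e. it backtracks away from the D 1.1 / E 2.0 conflict exactly as the "most obvious" solution above suggests.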
In other words, only `pip.resolve` needs to be completed to be able to proceed to the next stage in this project. This is needed since in Stage 3, the resolver would be written in `pip.resolve`, independent of the rest of the codebase. ##### Stage 2 A new module `pip.cache` will be created. Within this module, all the caching will be handled. Thus, the code for the current wheel cache would be moved. The new code for a dependency cache for sdists would also be written here. The new cache would hold all of an sdist's egg-info. The information will be stored on the file-system in a sub directory structure much like that of the wheel cache, in a directory structure based on hash of the sdist file holding the egg-info at the end. This will be done in a class named `EggInfoCache`. `EggInfoCache` cache will be used only if a corresponding wheel for an sdist is not available. Installing an sdist results in the creation of a wheel which contains the dependency information, which would be used over the information available in the `EggInfoCache`. To be able to proceed to the next step, it is required that `EggInfoCache` is implemented. ##### Stage 3 The module `pip.resolve` will be modified and a class named `BacktrackingResolver` will be added to it. The class does what you expect it to do - it would resolve dependencies with recursive backtracking. As described above, there will be some global state stored about the packages that have been explored. Other than the maintenance of global state, in the form of the cache, the rest of the algorithm will essentially follow the same structure as any backtracking algorithm. The project would be complete when the aforementioned resolver is implemented. #### Existing Work There is existing work directly related to dependency resolution in pip, done by multiple individuals. - Robert Collins (un-merged closed pull request on pypa/pip) This has an incomplete backtracking dependency resolver and dependency caching. - Sebastien Awwad (branch on a separate fork) This was used for the depresolve project, to investigate the state of Dependency Resolution in PyPI/pip ecosystem. - `pip-compile` (separate project) This has a backtracking dependency resolver implemented to overcome pip's inablity to resolve dependencies. Their work would be used for reference, where appropriate, during the course of the project. Further, there are many package managers which implement dependency resolution, which would also be looked into. ### Tentative Timeline - Community Bonding Period: **5 May - 29 May** - Clean up and finalize my existing pull requests. - Read existing literature regarding dependency resolution. - Inspect existing implementations of dependency resolvers. GOAL: Be ready for the work coming up. - Week 1: **30 May - 5 June** - Introduce abstractions across pip's codebase to make refactoring `RequirementSet` easier. - Begin breaking down `RequirementSet.prepare_file`. - Week 2: **6 June - 12 June** - Continue working on the refactor of `RequirementSet`. - Week 3: **13 June - 19 June** - Continue working on the refactor of `RequirementSet`. - Finish and polish `pip.resolve`. GOAL: `pip.resolve` module will be ready, using the current resolution strategy. - Week 4: **20 June - 26 June** - Finish and polish all work on the refactor of `RequirementSet`. - Week 5: **27 June - 3 July** - Move code for `WheelCache` into a new `pip.cache` module. - Write tests for `pip.cache.EggInfoCache`, based on `WheelCache`. - Begin implementation of `pip.cache.EggInfoCache`. 
- Week 6: **4 July - 10 July** - Finish and polish `pip.cache.EggInfoCache`. GOAL: A cache for storing dependency information of sdists would be ready to add to pip. - Week 7: **10 July - 16 July** - Create a comprehensive set of tests for the dependency resolver. (in `tests/unit/test_resolve.py`) - Begin implementation on the backtracking algorithm. GOAL: A comprehensive test bed is ready for testing the dependency resolver. - Week 8: **17 July - 24 July** - Complete a rough implementation of the backtracking algorithm GOAL: An implementation of a dependency resolver to begin running tests on and work on improving. - Week 9: **25 July - 31 July** - Fixing bugs in dependency resolver - Week 10: **1 August - 6 August** - Finish and polish work on dependency resolver GOAL: A ready-to-merge PR adding a backtracking dependency resolver for pip. - Week 11: **6 August - 13 August** Buffer Week. - Week 12: **13 August - 21 August** Buffer Week. Finalization of project for evaluation. If the deliverable is achieved ahead of schedule, the remaining time will be utilized to resolve open issues on pip's repository in the order of priority as determined under the guidance of the mentors. #### Other Commitments I expect to not be able to put in 40 hour/week for at most 3 weeks throughout the working period, due to the schedule of my university. I will have semester-end examinations - from 10th May 2017 to 24th May 2017 - during the Community Bonding Period. My university will re-open for my third semester on 12 July 2017. I expect mid-semester examinations to be held in my University around 20th August 2017. During these times, I would not be able to put in full 40 hour weeks due to the academic workload. I might take a 3-4 day break during this period, regarding which I would be informing my mentor around a week in advance. I will be completely free from 30th May 2017 to 11 July 2017. ### Future Work There seems to be some interest in being able to reuse the above dependency resolution algorithm in other packaging related tools, specifically from the buildout project. I intend to eventually move the dependency resolution code that would come out of this project into a separate library to allow for reuse by installer projects - pip, buildout and other tools. ## Previous Contributions to pip (As on 12th March 2017) ### Issues Authored: - #3785 - Prefering wheel-based installation over source-based installation (Open) - #3786 - Make install command upgrade packages by default (Closed) - #3787 - Check if pip broke the dependency graph and warn the user (Open) - #3807 - Tests fail since change on PyPI (Closed) - #3809 - Switch to TOML for configuration files (Open) - #3871 - Provide a way to perform non-eager upgrades (Closed) - #4198 - Travis CI - pypy broken dues to dependency change in pycrypto (Closed) - #4282 - What's the release schedule? 
(Closed) Participated: - #59 - Add "upgrade" and "upgrade-all" commands (Open) - #988 - Pip needs a dependency resolver (Open) - #1056 - pip install -U does not remember whether a package was installed with --user (Open) - #1511 - ssl certificate hostname mismatch errors presented badly (Open) - #1668 - Default to --user (Open) - #1736 - Create a command to make it easy to access the configuration file (Open) - #1737 - Don't tell the user what they meant, just do what they meant (Open) - #2313 - Automated the Creation and Upload of Release Artifacts (Open) - #2732 - pip install hangs with interactive setup.py setups (Open) - #3549 - pip -U pip fails (Open) - #3580 - Update requests/urllib3 (Closed) - #3610 - pip install from package from github, with github dependencies (Open) - #3788 - `pip` version suggested is older than the version which is installed (Closed) - #3789 - Error installing Mayavi in Mac OS X (Closed) - #3798 - On using python -m pip install -upgrade pip.. its throwing an error like the one below (Closed) - #3811 - no matching distribution found for install (Closed) - #3814 - pip could not find a version that satisfies the requirement oslo.context (Closed) - #3876 - support git refs in @ syntax (Open) - #4021 - Will you accept PRs with pep484 type hints? (Open) - #4087 - pip list produces error (Closed) - #4149 - Exception thrown when binary is already linked to /usr/local/bin (Open) - #4160 - Pip does not seem to be handling deep requirements correctly (Open) - #4162 - Let --find-links be context aware to support github, gitlab, etc. links (Open) - #4170 - pip list |head raises BrokenPipeError (Open) - #4182 - pip install should install packages in order to avoid ABI incompatibilities in compiled (Open) - #4186 - IOError: [Errno 13] Permission denied: '/usr/local/bin/pip' (Open) - #4206 - Where on Windows 10 is pip.conf or pip.ini located? (Closed) - #4221 - Feature request: Check if user has permissions before downloading files (Closed) - #4229 - "pip uninstall" is too noisy (Open) #### Pull Requests Authored: - #3806 - Change install command's default behaviour to upgrade packages by default (Closed, superseded by #3972) - #3808 - Fix Tests (Merged) - #3818 - Improve UX and tests of check command (Merged) - #3972 - Add an upgrade-strategy option (Merged) - #3974 - [minor] An aesthetic change to wheel command source (Merged) - #4192 - Move out all the config code to a separate module (Merged) - #4193 - Add the ability to autocorrect a user's command (Open) - #4199 - Fix Tests for Travis CI (Merged) - #4200 - Reduce the API exposed by the configuration class (Merged) - #4232 - Update documentation to mention upgrade-strategy (Merged) - #4233 - Nicer permissions error message (Open) - #4240 - [WIP] Add a configuration command (Open) Participated: - #2716 - Issue #988: new resolver for pip (Closed) - #2975 - Different mock dependencies based on Python version (Merged) - #3744 - Add a "Upgrade all local installed packages" (Open) - #3750 - Add a `pip check` command. 
(Merged) - #3751 - tox.ini: Add "cover" target (Open) - #3794 - Use the canonicalize_name function for finding .dist-info (Merged) - #4142 - Optionally load C dependencies based on platform (Open) - #4144 - Install build dependencies as specified in PEP 518 (Open) - #4150 - Clarify default for --upgrade-strategy is eager (Merged) - #4194 - Allow passing a --no-progress-bar to the install script to surpress progress bar (Merged) - #4201 - Register req_to_install for cleanup sooner (Merged) - #4202 - Switch to 3.6.0 final as our latest 3.x release (Merged) - #4211 - improve message when installing requirements file (#4127) (Merged) - #4241 - Python 3.6 is supported (Merged) ## References 1. [pypa/pip#988](https://github.com/pypa/pip/issues/988) Tracking issue for adding a proper dependency resolver in pip. Contains links to various useful resources. 1. [pypa/pip#2716](https://github.com/pypa/pip/issues/2716) Prior work by Robert Collins for adding a proper dependency resolver in pip. 1. [Python Dependency Resolution]( https://docs.google.com/document/d/1x_VrNtXCup75qA3glDd2fQOB2TakldwjKZ6pXaAjAfg/edit?usp=sharing ) A writeup by Sebastian Awwad on the current state of dependency resolution in pip and PyPI in general. 1. [PSF Application Template]( https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2017) For guidance on how to write the application and what information is needed. 1. [Stork: Secure Package Management for VM Environments]( http://isis.poly.edu/~jcappos/papers/cappos_stork_dissertation_08.pdf) A Paper by Justin Cappos about Stork, used for reference regarding Backtracking Resolution -------------- next part -------------- An HTML attachment was scrubbed... URL:
From pradyunsg at gmail.com Sat Mar 25 11:50:17 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sat, 25 Mar 2017 15:50:17 +0000 Subject: [Distutils] GSoC 2017 - RFC on Proposal In-Reply-To: References: Message-ID: I apologize for the duplicate mail. Please respond on this thread. On Sat, 25 Mar 2017 at 21:16 Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to > work on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC.
> > I would like to request for comments on my proposal for GSoC - it is > hosted at > https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > Please find trailing a MarkDown version of the proposal. > > Thanks, > Pradyun Gedam > > ----- > > # Adding Proper Dependency Resolution to pip > > - **Name:** Pradyun S. Gedam > - **Email:** [pradyunsg at gmail.com][mailto-email] > - **Github:** [pradyunsg][github-profile] > - **University:** [VIT University, Vellore, India][vit-homepage] > - **Course:** Bachelor of Technology in Computer Science and Engineering > - **Course Term:** 2016/17 - 2019/20 (4 Year) > - **Timezone:** IST (GMT +5:30) > - **GSoC Blog RSS Feed URL:** < > https://pradyunsg.github.io/gsoc-2017/feed.xml> > > [github-profile]: http://github.com/pradyunsg/ > [vit-homepage]: http://vit.ac.in/ > [mailto-email]: mailto:pradyunsg at gmail.com > > ## About Me > > I was introduced to Python about five years ago, through "Core Python > Programming" by Weasley J Chun. After the initial struggle with Python > 2/3, the > ball was set rolling and hasn't stopped since. I have fiddled around with > Game Programming (PyGame), Computer Vision (OpenCV), Data Analytics > (Pandas, > SciPy, NumPy), transcompilers (Py2C) and more. > > As a high school student, I did internship at Enthought in 2013 and 2014. > The two summers that I spent at Enthought were a great learning experience > and > I thoroughly enjoyed the environment there. > > Other than Python, I have also used C, C++, Web Technologies (JavaScript, > HTML, > CSS) and preprocessors (Pug, TypeScript, LESS/SASS/SCSS), Java and > Bash/Zsh for > some other projects. > > Curious to understand how pip works, I began looking into pip's source > code. > I started contributing to pip in May 2016. I am now fairly familiar with > the > design of pip and have a fair understanding of how it works, due to the > various > contributions I have made to pip in the past year. > > ## Mentors > > - Donald Stufft (@dstufft on GitHub) > - Justin Cappos (@JustinCappos on GitHub) > > Communication with the mentors will be done mostly on issues and pull > requests > on pip's GitHub repository. If at any point in time, a real time > discussion is > to be done with the mentors, it would be done over IRC or Skype. Email can > also > be used if needed. > > ## Proposal > > This project is regarding improving dependency resolution performed within > pip > by implementing a dependency resolver within it. > > ### Abstract > > Currently, pip does not resolve dependencies correctly when there are > conflicting requirements. The lack of dependency resolution has caused > hard-to-debug bugs/failures due to the installation of incompatible > packages. > The lack of a dependency resolver is also a blocker for various other > features - > adding an upgrade-all functionality to pip and properly determining > build-time > dependencies for packages are two such features. > > ### Deliverables > > At the end of this project, pip will have the ability to: > > - detect requirement conflicts > - resolve conflicting dependency requirements where possible > > ### Implementation > > The implementation language will be Python. The code will maintain the > compatibility requirements of pip - the same source code will support the > multiple Python implementations and version, including but not limited to, > CPython 2.7, CPython 3.3+, PyPy 2, PyPy3. > > New Tests for the functionality introduced will be added to pip's current > test > suite. 
> > User documentation would not be a major part of this project. The only > changes > would be to mention pip can now resolve dependencies properly. There are > certain > sections that might need updating. > > #### Overview > > The project will be composed of the following stages: > > 1. Refactor the dependency resolution logic of pip into a separate module. > 1. Implement dependency information caching in pip. > 1. Implement a backtracking dependency resolver to resolve the > dependencies. > > Every stage depends on the previous ones being completed. This step-wise > approach would make incremental improvements so that there is a constant > feedback on the work being done as well as making it easy to change course > without losing existing work; if needed for some unforeseen reason. > > #### Discussion > > There is a class in pip - `RequirementSet`, which is currently a God > class. It > is responsible for the following: > > 1. Holding a list of things to be installed > 1. Downloading Files > 1. Dependency Resolution > 1. Unpacking downloaded files (preparation for installation) > 1. Uninstalling packages > 1. Installing packages > > This is clearly a bad situation since this is most of the heavy lifting > involved in pip. These responsibilities need to be separated and moved out > into > their independent modules/classes, to allow for simplification of > `RequirementSet` while providing a clean platform for the remaining work. > This is the most tricky portion of this project, given the complexity of > `RequirementSet` as it stands today. > > There are two kinds of distributions that may be used to install a package > - > wheels (binary) and sdists (source). When installing a package, pip builds > a > wheel from an sdist and then proceeds to install the wheel. The difference > between the two formats of distribution relevant to this project is - > wheels > store the information about dependencies within them statically; sdists do > not. > > Determining the dependencies of a wheel distribution is merely the matter > of > fetching the information from a METADATA file within the `.whl` file. The > dependency information for an sdist, on the other hand, can only be > determined > after running its `setup.py` file on the target system. This means that > dependencies of an sdist depend on how its `setup.py` behaves which can > change > due to variations on target systems or could even contain through random > choices. > > Computation of an sdist's dependencies on the target system is a > time-consuming > task since it potentially involves fetching a package from PyPI and > executing > it's setup.py to get the dependency information. In order to improve > performance, once an sdist's dependencies are computed, they would be > stored in > a cache so that during dependency resolution, the dependencies of an sdist > are > not computed every time they are needed. Further, pip caches wheels it > downloads or builds meaning that any installed package or downloaded > wheel's > dependency information would available statically, without the need to go > through the sdist dependency cache. > > Like the wheel cache, sdist-dependency-cache will be a file system based > cache. > The sdist-dependency-cache would only be used if the corresponding sdist is > being used. > > Since sdist dependencies are non-deterministic, the cached dependency > information is potentially incorrect - in certain corner cases such as > using > random choices in setup.py files. 
Such uses are not seen as important enough
> to cater to, compared to the benefits of having a cache. Further, this is
> already the case with the information in the wheel cache.
>
> SAT solver based resolution is not feasible for pip since a SAT solver
> needs the entire set of packages and their dependencies to compute a
> solution, which cannot be generated due to the aforementioned
> non-deterministic behaviour of setup.py files. Computing dependency
> information for all of PyPI on a target system for installing a package
> is simply not feasible.
>
> The most reasonable approach is using a backtracking solver. Such a
> solver can be incrementally provided information about the dependencies
> of a package and would only need dependency information about packages
> in the dependency graph of the current system.
>
> There is a need to keep a cache of visited packages during dependency
> resolution. A certain package-version combination may be reached via
> multiple paths and it is an inefficient use of computation time to
> re-compute whether it is indeed going to satisfy the requirements or
> not. By storing information about which package-version combinations
> have been visited and do (or do not) satisfy the constraints, it is
> possible to speed up the resolution.
>
> Consider the following example:
>
> ```
> A-1.0 (depends: B)
> A-2.0 (depends: B and E-1.0)
> B-1.0 (depends: C and D)
> C-1.0 (depends: D)
> D-1.0
> D-1.1 (depends: E-2.0)
> E-1.0
> E-2.0
> ```
>
> If an installation of A is required, A-2.0 and D-1.1 cannot both be
> installed because they require conflicting versions of E. While there
> are multiple possible solutions to this situation, the "most obvious"
> one is to use D-1.0, rather than giving up on installing A-2.0. Further,
> as multiple packages depend on D, the algorithm would "reach it"
> multiple times. By maintaining a cache for the visited packages, it is
> possible to achieve a speedup in such a scenario.
>
> #### Details
>
> Pull requests would be made on a regular basis during the project to
> ensure that the feedback loop is quick. This also reduces the
> possibility of conflicts due to unrelated changes in pip.
>
> All the code will be tested within pip's existing testing
> infrastructure. It has everything needed to write tests related to all
> the changes to be made. Every PR made to pip as a part of this project
> will contain new tests or modifications to existing ones as needed.
>
> ##### Stage 1
>
> Initially, some abstractions will be introduced to the pip codebase to
> improve the reuse of certain common patterns. This includes cleaner
> temporary directory management through a `TempDirectory` class.
>
> `RequirementSet.prepare_files` and `RequirementSet._prepare_file`
> currently handle downloading and unpacking packages, as well as what
> pip currently does for dependency resolution. Taking these functions
> apart neatly is going to be a tricky task.
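To make the backtracking-with-caching idea from the Discussion section above concrete, here is a minimal, self-contained sketch over the A/B/C/D/E example; it is not pip's code, and the toy index format, the function names and the newest-version-first preference are all assumptions made purely for illustration:

```python
# A toy candidate index mirroring the example above: each candidate maps to
# the requirements it introduces.  A real resolver would fetch this lazily.
INDEX = {
    "A": {"2.0": ["B", "E==1.0"], "1.0": ["B"]},
    "B": {"1.0": ["C", "D"]},
    "C": {"1.0": ["D"]},
    "D": {"1.1": ["E==2.0"], "1.0": []},
    "E": {"2.0": [], "1.0": []},
}


def parse(req):
    """Split a requirement like 'E==1.0' into (name, pinned-version-or-None)."""
    name, _, version = req.partition("==")
    return name, version or None


def resolve(requirements, chosen=None, seen_conflicts=None):
    """Return a {name: version} mapping satisfying all requirements, or None."""
    chosen = dict(chosen or {})
    # Combinations already known to lead to a dead end, shared across branches
    # so the same failure is not re-explored when reached via another path.
    seen_conflicts = seen_conflicts if seen_conflicts is not None else set()

    if not requirements:
        return chosen
    name, pin = parse(requirements[0])
    rest = requirements[1:]

    if name in chosen:
        # Already decided: either the existing choice is consistent, or we fail.
        if pin is None or chosen[name] == pin:
            return resolve(rest, chosen, seen_conflicts)
        return None

    # Try candidate versions newest first (plain string sort is enough here).
    for version in sorted(INDEX[name], reverse=True):
        if pin and version != pin:
            continue
        key = (name, version, frozenset(chosen.items()))
        if key in seen_conflicts:
            continue
        chosen[name] = version
        result = resolve(INDEX[name][version] + rest, chosen, seen_conflicts)
        if result is not None:
            return result
        seen_conflicts.add(key)  # Remember the dead end, then backtrack.
        del chosen[name]
    return None


print(resolve(["A"]))
# {'A': '2.0', 'B': '1.0', 'C': '1.0', 'D': '1.0', 'E': '1.0'}
```

Running it resolves "A" to A-2.0 together with D-1.0 and E-1.0, matching the "most obvious" solution described above; the `seen_conflicts` set plays the role of the visited-package cache, remembering combinations already known to fail so they are not recomputed when the same state is reached through another path.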
> > The following is a listing of the final modules that will be responsible > for > the various tasks that are currently being performed by `RequirementSet`: > > - `pip.resolve` - Dependency Resolution > - `pip.download` - Downloading Files & Unpacking downloaded files > - `pip.req.req_set` - Holding a list of things to be installed / > uninstalled > - `pip.operations.install` - Installing Packages (preparation for > installation) > - `pip.operations.uninstall` - Uninstalling Packages > > To be able to proceed to the next step, only the dependency resolution > related > code needs to be refactored into a separate module. Other portions of > `RequirementSet` are not required to be refactored but the same will be > tackled > as optional deliverables. In other words, only `pip.resolve` needs to be > completed to be able to proceed to the next stage in this project. This is > needed since in Stage 3, the resolver would be written in `pip.resolve`, > independent of the rest of the codebase. > > ##### Stage 2 > > A new module `pip.cache` will be created. Within this module, all the > caching > will be handled. Thus, the code for the current wheel cache would be moved. > The new code for a dependency cache for sdists would also be written here. > > The new cache would hold all of an sdist's egg-info. The information will > be > stored on the file-system in a sub directory structure much like that of > the > wheel cache, in a directory structure based on hash of the sdist file > holding > the egg-info at the end. This will be done in a class named `EggInfoCache`. > > `EggInfoCache` cache will be used only if a corresponding wheel for an > sdist is > not available. Installing an sdist results in the creation of a wheel which > contains the dependency information, which would be used over the > information > available in the `EggInfoCache`. > > To be able to proceed to the next step, it is required that `EggInfoCache` > is > implemented. > > ##### Stage 3 > > The module `pip.resolve` will be modified and a class named > `BacktrackingResolver` will be added to it. The class does what you expect > it > to do - it would resolve dependencies with recursive backtracking. As > described > above, there will be some global state stored about the packages that have > been > explored. Other than the maintenance of global state, in the form of the > cache, > the rest of the algorithm will essentially follow the same structure as any > backtracking algorithm. > > The project would be complete when the aforementioned resolver is > implemented. > > #### Existing Work > > There is existing work directly related to dependency resolution in pip, > done by > multiple individuals. > > - Robert Collins (un-merged closed pull request on pypa/pip) > > This has an incomplete backtracking dependency resolver and dependency > caching. > > - Sebastien Awwad (branch on a separate fork) > > This was used for the depresolve project, to investigate the state of > Dependency Resolution in PyPI/pip ecosystem. > > - `pip-compile` (separate project) > > This has a backtracking dependency resolver implemented to overcome pip's > inablity to resolve dependencies. > > Their work would be used for reference, where appropriate, during the > course of > the project. Further, there are many package managers which implement > dependency > resolution, which would also be looked into. > > ### Tentative Timeline > > - Community Bonding Period: **5 May - 29 May** > > - Clean up and finalize my existing pull requests. 
> - Read existing literature regarding dependency resolution. > - Inspect existing implementations of dependency resolvers. > > GOAL: Be ready for the work coming up. > > - Week 1: **30 May - 5 June** > > - Introduce abstractions across pip's codebase to make refactoring > `RequirementSet` easier. > - Begin breaking down `RequirementSet.prepare_file`. > > - Week 2: **6 June - 12 June** > > - Continue working on the refactor of `RequirementSet`. > > - Week 3: **13 June - 19 June** > > - Continue working on the refactor of `RequirementSet`. > - Finish and polish `pip.resolve`. > > GOAL: `pip.resolve` module will be ready, using the current resolution > strategy. > > - Week 4: **20 June - 26 June** > > - Finish and polish all work on the refactor of `RequirementSet`. > > - Week 5: **27 June - 3 July** > > - Move code for `WheelCache` into a new `pip.cache` module. > - Write tests for `pip.cache.EggInfoCache`, based on `WheelCache`. > - Begin implementation of `pip.cache.EggInfoCache`. > > - Week 6: **4 July - 10 July** > > - Finish and polish `pip.cache.EggInfoCache`. > > GOAL: A cache for storing dependency information of sdists would be > ready to > add to pip. > > - Week 7: **10 July - 16 July** > > - Create a comprehensive set of tests for the dependency resolver. > (in `tests/unit/test_resolve.py`) > - Begin implementation on the backtracking algorithm. > > GOAL: A comprehensive test bed is ready for testing the dependency > resolver. > > - Week 8: **17 July - 24 July** > > - Complete a rough implementation of the backtracking algorithm > > GOAL: An implementation of a dependency resolver to begin running tests > on > and work on improving. > > - Week 9: **25 July - 31 July** > > - Fixing bugs in dependency resolver > > - Week 10: **1 August - 6 August** > > - Finish and polish work on dependency resolver > > GOAL: A ready-to-merge PR adding a backtracking dependency resolver for > pip. > > - Week 11: **6 August - 13 August** > > Buffer Week. > > - Week 12: **13 August - 21 August** > > Buffer Week. Finalization of project for evaluation. > > If the deliverable is achieved ahead of schedule, the remaining time will > be > utilized to resolve open issues on pip's repository in the order of > priority as > determined under the guidance of the mentors. > > #### Other Commitments > > I expect to not be able to put in 40 hour/week for at most 3 weeks > throughout > the working period, due to the schedule of my university. > > I will have semester-end examinations - from 10th May 2017 to 24th May > 2017 - > during the Community Bonding Period. My university will re-open for my > third > semester on 12 July 2017. I expect mid-semester examinations to be held in > my > University around 20th August 2017. During these times, I would not be > able to > put in full 40 hour weeks due to the academic workload. > > I might take a 3-4 day break during this period, regarding which I would be > informing my mentor around a week in advance. > > I will be completely free from 30th May 2017 to 11 July 2017. > > ### Future Work > > There seems to be some interest in being able to reuse the above dependency > resolution algorithm in other packaging related tools, specifically from > the > buildout project. I intend to eventually move the dependency resolution > code > that would come out of this project into a separate library to allow for > reuse > by installer projects - pip, buildout and other tools. 
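Relating back to the `EggInfoCache` described in Stage 2, the sketch below shows one way a file-system cache keyed by the hash of an sdist could be laid out; the class name, directory layout, JSON format and the `compute_sdist_deps()` helper are illustrative assumptions rather than pip's actual design:

```python
import hashlib
import json
from pathlib import Path


class SdistDependencyCache:
    """Toy file-system cache mapping an sdist file to its computed dependencies."""

    def __init__(self, root):
        self.root = Path(root).expanduser()

    def _entry(self, sdist_path):
        # Key the entry on a hash of the sdist's contents, bucketed into
        # subdirectories, similar in spirit to how the wheel cache is keyed.
        digest = hashlib.sha256(Path(sdist_path).read_bytes()).hexdigest()
        return self.root / digest[:2] / digest[2:] / "deps.json"

    def get(self, sdist_path):
        entry = self._entry(sdist_path)
        return json.loads(entry.read_text()) if entry.exists() else None

    def set(self, sdist_path, requirements):
        entry = self._entry(sdist_path)
        entry.parent.mkdir(parents=True, exist_ok=True)
        entry.write_text(json.dumps(sorted(requirements)))


# Usage sketch: compute dependencies (for example, by running egg_info in a
# temporary directory -- compute_sdist_deps() is a hypothetical helper here)
# only when the cache misses:
#
#     cache = SdistDependencyCache("~/.cache/pip/sdist-deps")
#     deps = cache.get("requests-2.13.0.tar.gz")
#     if deps is None:
#         deps = compute_sdist_deps("requests-2.13.0.tar.gz")
#         cache.set("requests-2.13.0.tar.gz", deps)
```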
> > ## Previous Contributions to pip > > (As on 12th March 2017) > > ### Issues > > Authored: > > - #3785 - Prefering wheel-based installation over source-based > installation (Open) > - #3786 - Make install command upgrade packages by default (Closed) > - #3787 - Check if pip broke the dependency graph and warn the user (Open) > - #3807 - Tests fail since change on PyPI (Closed) > - #3809 - Switch to TOML for configuration files (Open) > - #3871 - Provide a way to perform non-eager upgrades (Closed) > - #4198 - Travis CI - pypy broken dues to dependency change in pycrypto > (Closed) > - #4282 - What's the release schedule? (Closed) > > Participated: > > - #59 - Add "upgrade" and "upgrade-all" commands (Open) > - #988 - Pip needs a dependency resolver (Open) > - #1056 - pip install -U does not remember whether a package was installed > with --user (Open) > - #1511 - ssl certificate hostname mismatch errors presented badly (Open) > - #1668 - Default to --user (Open) > - #1736 - Create a command to make it easy to access the configuration > file (Open) > - #1737 - Don't tell the user what they meant, just do what they meant > (Open) > - #2313 - Automated the Creation and Upload of Release Artifacts (Open) > - #2732 - pip install hangs with interactive setup.py setups (Open) > - #3549 - pip -U pip fails (Open) > - #3580 - Update requests/urllib3 (Closed) > - #3610 - pip install from package from github, with github dependencies > (Open) > - #3788 - `pip` version suggested is older than the version which is > installed (Closed) > - #3789 - Error installing Mayavi in Mac OS X (Closed) > - #3798 - On using python -m pip install -upgrade pip.. its throwing an > error like the one below (Closed) > - #3811 - no matching distribution found for install (Closed) > - #3814 - pip could not find a version that satisfies the requirement > oslo.context (Closed) > - #3876 - support git refs in @ syntax (Open) > - #4021 - Will you accept PRs with pep484 type hints? (Open) > - #4087 - pip list produces error (Closed) > - #4149 - Exception thrown when binary is already linked to /usr/local/bin > (Open) > - #4160 - Pip does not seem to be handling deep requirements correctly > (Open) > - #4162 - Let --find-links be context aware to support github, gitlab, > etc. links (Open) > - #4170 - pip list |head raises BrokenPipeError (Open) > - #4182 - pip install should install packages in order to avoid ABI > incompatibilities in compiled (Open) > - #4186 - IOError: [Errno 13] Permission denied: '/usr/local/bin/pip' > (Open) > - #4206 - Where on Windows 10 is pip.conf or pip.ini located? 
(Closed) > - #4221 - Feature request: Check if user has permissions before > downloading files (Closed) > - #4229 - "pip uninstall" is too noisy (Open) > > #### Pull Requests > > Authored: > > - #3806 - Change install command's default behaviour to upgrade packages > by default (Closed, superseded by #3972) > - #3808 - Fix Tests (Merged) > - #3818 - Improve UX and tests of check command (Merged) > - #3972 - Add an upgrade-strategy option (Merged) > - #3974 - [minor] An aesthetic change to wheel command source (Merged) > - #4192 - Move out all the config code to a separate module (Merged) > - #4193 - Add the ability to autocorrect a user's command (Open) > - #4199 - Fix Tests for Travis CI (Merged) > - #4200 - Reduce the API exposed by the configuration class (Merged) > - #4232 - Update documentation to mention upgrade-strategy (Merged) > - #4233 - Nicer permissions error message (Open) > - #4240 - [WIP] Add a configuration command (Open) > > Participated: > > - #2716 - Issue #988: new resolver for pip (Closed) > - #2975 - Different mock dependencies based on Python version (Merged) > - #3744 - Add a "Upgrade all local installed packages" (Open) > - #3750 - Add a `pip check` command. (Merged) > - #3751 - tox.ini: Add "cover" target (Open) > - #3794 - Use the canonicalize_name function for finding .dist-info > (Merged) > - #4142 - Optionally load C dependencies based on platform (Open) > - #4144 - Install build dependencies as specified in PEP 518 (Open) > - #4150 - Clarify default for --upgrade-strategy is eager (Merged) > - #4194 - Allow passing a --no-progress-bar to the install script to > surpress progress bar (Merged) > - #4201 - Register req_to_install for cleanup sooner (Merged) > - #4202 - Switch to 3.6.0 final as our latest 3.x release (Merged) > - #4211 - improve message when installing requirements file (#4127) > (Merged) > - #4241 - Python 3.6 is supported (Merged) > > ## References > > 1. [pypa/pip#988](https://github.com/pypa/pip/issues/988) > > Tracking issue for adding a proper dependency resolver in pip. Contains > links to various useful resources. > > 1. [pypa/pip#2716](https://github.com/pypa/pip/issues/2716) > > Prior work by Robert Collins for adding a proper dependency resolver in > pip. > > 1. [Python Dependency Resolution]( > https://docs.google.com/document/d/1x_VrNtXCup75qA3glDd2fQOB2TakldwjKZ6pXaAjAfg/edit?usp=sharing > ) > > A writeup by Sebastian Awwad on the current state of dependency > resolution > in pip and PyPI in general. > > 1. [PSF Application Template]( > https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2017) > > For guidance on how to write the application and what information is > needed. > > 1. [Stork: Secure Package Management for VM Environments]( > http://isis.poly.edu/~jcappos/papers/cappos_stork_dissertation_08.pdf) > > A Paper by Justin Cappos about Stork, used for reference regarding > Backtracking Resolution > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Mar 25 12:47:25 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Mar 2017 02:47:25 +1000 Subject: [Distutils] GSoC 2017 - RFC on Proposal In-Reply-To: References: Message-ID: On 26 March 2017 at 01:46, Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to work > on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. 
I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC. The problem description and proposed resolution plan both look excellent to me - thank you to you all, and I hope the project goes well! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Sat Mar 25 13:32:53 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 25 Mar 2017 17:32:53 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: On 25 March 2017 at 15:44, Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to work > on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC. > > I would like to request for comments on my proposal for GSoC - it is hosted > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > Please find trailing a MarkDown version of the proposal. Hi Pradyun, Your proposal looks pretty impressive - well structured and thought out. I've looked through it and the plans seem reasonable - I've looked more at what you're proposing than at the timescales, but your staged approach seems sensible - if you do hit issues with time, it looks like you'll be able to deliver real improvements even if you don't complete everything, which is fantastic. Best of luck - assuming you complete all the work you've planned, it will be a significant benefit to pip. If I can be of any help, with PR reviews or anything similar, feel free to ping me. Paul From waynejwerner at gmail.com Sat Mar 25 22:31:15 2017 From: waynejwerner at gmail.com (Wayne Werner) Date: Sun, 26 Mar 2017 02:31:15 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: Tiny change: I believe its Wesley, not Weasley :) On Sat, Mar 25, 2017, 12:33 PM Paul Moore wrote: > On 25 March 2017 at 15:44, Pradyun Gedam wrote: > > Hello Everyone! > > > > I had previously sent a mail on this list, stating that I would like to > work > > on pip's dependency resolution for my GSoC 2017. I now have drafted a > > proposal for the same; with help from my mentors - Donald Stufft and > Justin > > Cappos. I'd also take this opportunity to thank them for agreeing to be > my > > mentors for this GSoC. > > > > I would like to request for comments on my proposal for GSoC - it is > hosted > > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > > > Please find trailing a MarkDown version of the proposal. > > Hi Pradyun, > Your proposal looks pretty impressive - well structured and thought > out. I've looked through it and the plans seem reasonable - I've > looked more at what you're proposing than at the timescales, but your > staged approach seems sensible - if you do hit issues with time, it > looks like you'll be able to deliver real improvements even if you > don't complete everything, which is fantastic. > > Best of luck - assuming you complete all the work you've planned, it > will be a significant benefit to pip. If I can be of any help, with PR > reviews or anything similar, feel free to ping me. 
> > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 26 00:29:56 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 26 Mar 2017 04:29:56 +0000 Subject: [Distutils] GSoC 2017 - RFC on Proposal In-Reply-To: References: Message-ID: Thank You Nick! ^.^ On Sat, Mar 25, 2017, 22:17 Nick Coghlan wrote: On 26 March 2017 at 01:46, Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to work > on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC. The problem description and proposed resolution plan both look excellent to me - thank you to you all, and I hope the project goes well! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 26 00:37:56 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 26 Mar 2017 04:37:56 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: On Sat, Mar 25, 2017, 23:02 Paul Moore wrote: > On 25 March 2017 at 15:44, Pradyun Gedam wrote: > > Hello Everyone! > > > > I had previously sent a mail on this list, stating that I would like to > work > > on pip's dependency resolution for my GSoC 2017. I now have drafted a > > proposal for the same; with help from my mentors - Donald Stufft and > Justin > > Cappos. I'd also take this opportunity to thank them for agreeing to be > my > > mentors for this GSoC. > > > > I would like to request for comments on my proposal for GSoC - it is > hosted > > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > > > Please find trailing a MarkDown version of the proposal. > > Hi Pradyun, > Your proposal looks pretty impressive - well structured and thought > out. I've looked through it and the plans seem reasonable - I've > looked more at what you're proposing than at the timescales, but your > staged approach seems sensible - if you do hit issues with time, it > looks like you'll be able to deliver real improvements even if you > don't complete everything, which is fantastic. > > Best of luck - assuming you complete all the work you've planned, it > will be a significant benefit to pip. If I can be of any help, with PR > reviews or anything similar, feel free to ping me. > > Paul > Thank You Paul! Hopefully, I'll be able to overcome any issue that come up. I definitely will ping you. :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 26 00:40:06 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 26 Mar 2017 04:40:06 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: On Sun, Mar 26, 2017, 08:01 Wayne Werner wrote: > Tiny change: I believe its Wesley, not Weasley :) > Indeed. My bad. I'll fix it. Thank you! > On Sat, Mar 25, 2017, 12:33 PM Paul Moore wrote: > > On 25 March 2017 at 15:44, Pradyun Gedam wrote: > > Hello Everyone! 
> > > > I had previously sent a mail on this list, stating that I would like to > work > > on pip's dependency resolution for my GSoC 2017. I now have drafted a > > proposal for the same; with help from my mentors - Donald Stufft and > Justin > > Cappos. I'd also take this opportunity to thank them for agreeing to be > my > > mentors for this GSoC. > > > > I would like to request for comments on my proposal for GSoC - it is > hosted > > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > > > Please find trailing a MarkDown version of the proposal. > > Hi Pradyun, > Your proposal looks pretty impressive - well structured and thought > out. I've looked through it and the plans seem reasonable - I've > looked more at what you're proposing than at the timescales, but your > staged approach seems sensible - if you do hit issues with time, it > looks like you'll be able to deliver real improvements even if you > don't complete everything, which is fantastic. > > Best of luck - assuming you complete all the work you've planned, it > will be a significant benefit to pip. If I can be of any help, with PR > reviews or anything similar, feel free to ping me. > > Paul > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Mar 26 22:28:53 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Mar 2017 12:28:53 +1000 Subject: [Distutils] Wheel files for PPC64le In-Reply-To: References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> Message-ID: On 24 March 2017 at 12:58, Nick Coghlan wrote: > On 24 March 2017 at 05:00, Leonardo Bianconi > wrote: >> Docker Image (Will be implemented when CentOS be available on Docker) >> ------------ >> >> The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 >> little endian CentOS release. The Image contains all necessary tools in >> the >> requested version to build wheel files (gcc, g++ and gfortran 4.8.5). > > > These seem to be present now: https://hub.docker.com/r/ppc64le/centos/tags/ > > I'm not clear on the provenance of the 'ppc64le' user account though, so > I've asked for clarification: > ttps://twitter.com/ncoghlan_dev/status/845099237117329408 Looks like these are genuinely official images maintained by IBM+Docker engineers: https://twitter.com/estesp/status/845296651363246080 And they're referenced from the Docker "official images" README: https://github.com/docker-library/official-images/blob/master/README.md#architectures-other-than-amd64 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From kdreyer at redhat.com Fri Mar 24 18:24:18 2017 From: kdreyer at redhat.com (Ken Dreyer) Date: Fri, 24 Mar 2017 16:24:18 -0600 Subject: [Distutils] package ownership transfer: kobo Message-ID: Hi folks, Would someone please give me access to upload new versions of kobo to PyPI? I've filed the transfer request [1] a while back, and several users are asking for new versions. [2] - Ken [1] https://sourceforge.net/p/pypi/support-requests/632/ [2] https://github.com/release-engineering/kobo/issues/24 From brett at python.org Mon Mar 27 14:39:45 2017 From: brett at python.org (Brett Cannon) Date: Mon, 27 Mar 2017 18:39:45 +0000 Subject: [Distutils] Is the SourceForge repo still used? 
(was: package ownership transfer: kobo In-Reply-To: References: Message-ID: On Mon, 27 Mar 2017 at 05:02 Ken Dreyer wrote: > [SNIP] > [1] https://sourceforge.net/p/pypi/support-requests/632/ I didn't even know this repo existed until I noticed that the sidebar on pypi.python.org. It isn't mentioned anywhere on https://pypi.org/help/ . Should people still be using that project or the Google Form? Or do we need a GitHub repo to track things like package ownership transfers? -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Wed Mar 29 01:29:44 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Wed, 29 Mar 2017 07:29:44 +0200 Subject: [Distutils] Source of confusion Message-ID: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Quoting https://www.pypa.io/en/latest/ {{{ They host projects on github and bitbucket, and discuss issues on the pypa-dev and distutils-sig mailing lists. }}} I don't know where to go to. ... Choices .... too many choices ..... ... giving me the feeling of uncertainty. I am feeling fear. I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. Please help me. Regards, Thomas G?ttler -- I am looking for feedback for my personal programming guidelines: https://github.com/guettli/programming-guidelines From p.f.moore at gmail.com Wed Mar 29 03:51:58 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 29 Mar 2017 08:51:58 +0100 Subject: [Distutils] Source of confusion In-Reply-To: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 29 March 2017 at 06:29, Thomas G?ttler wrote: > I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. To do what? Paul From ncoghlan at gmail.com Wed Mar 29 04:27:01 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 Mar 2017 18:27:01 +1000 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 29 March 2017 at 17:51, Paul Moore wrote: > On 29 March 2017 at 06:29, Thomas G?ttler wrote: >> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. > > To do what? As far as I can tell, to get a customer experience instead of a prospective co-contributor one. I'm sorry Thomas, as long as you continue looking for a coherent customer experience from a collaborative collection of volunteer-run community projects, you're going to continually be confused and disappointed. The Python ecosystem *does* include commercial vendors that offer to make opinionated technical decisions on behalf of their customers, as well as providing a single point of contact for support questions and feature requests, but beyond that, offering an overwhelming array of confusing choices is pretty much the way open source *works*. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guettliml at thomas-guettler.de Wed Mar 29 05:31:35 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Wed, 29 Mar 2017 11:31:35 +0200 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: Am 29.03.2017 um 09:51 schrieb Paul Moore: > On 29 March 2017 at 06:29, Thomas G?ttler wrote: >> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. > > To do what? To find canonical docs. With "canonical" I mean current docs from the upstream. Regards, Thomas -- Thomas Guettler http://www.thomas-guettler.de/ From p.f.moore at gmail.com Wed Mar 29 05:47:10 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 29 Mar 2017 10:47:10 +0100 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 29 March 2017 at 10:31, Thomas G?ttler wrote: > Am 29.03.2017 um 09:51 schrieb Paul Moore: >> >> On 29 March 2017 at 06:29, Thomas G?ttler >> wrote: >>> >>> I am stupid and missing a guiding hand which gives me simple straight >>> forward step by step instruction. >> >> >> To do what? > > To find canonical docs. With "canonical" I mean current docs from the > upstream. I think Nick's point probably covers this discussion, but you haven't said what you want docs *for*. pip? setuptools? wheel? something else? They are in various places, which you can hunt out via pypi or google. It's not hard to do, but certainly it's true that it's harder to find things than you'd want if you were paying for a well-documented service. But given that you're not paying anything, and no-one working on Python packaging has any obligation to meet your expectations, you'll need to either lower the level of your expectations, pay someone to provide what you're looking for, or offer your own time and energy to address the issues you find. Simply making vague complaints on this list isn't particularly productive. Sorry if that's not the response you were hoping for, and in particular if you have a pressing need for support that we're not providing, I do understand how that can be a problem for you, but as Nick says, this is the reality of relying on software that's provided to you free of charge. Paul From jelle.zijlstra at gmail.com Wed Mar 29 11:27:25 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Wed, 29 Mar 2017 08:27:25 -0700 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: 2017-03-29 2:31 GMT-07:00 Thomas G?ttler : > > > Am 29.03.2017 um 09:51 schrieb Paul Moore: > >> On 29 March 2017 at 06:29, Thomas G?ttler >> wrote: >> >>> I am stupid and missing a guiding hand which gives me simple straight >>> forward step by step instruction. >>> >> >> To do what? >> > > To find canonical docs. With "canonical" I mean current docs from the > upstream. > > Are you aware of https://packaging.python.org/ ? > Regards, > Thomas > > > > > > -- > Thomas Guettler http://www.thomas-guettler.de/ > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.jerdonek at gmail.com Wed Mar 29 14:41:27 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Wed, 29 Mar 2017 11:41:27 -0700 Subject: [Distutils] obtaining project name and packages Message-ID: Hi, this seems like a simple question, but I haven't been able to find the answer online: What is the current recommended way to get (1) the name of a project, and (2) the names of the top-level packages installed by a project (not counting the project's dependencies). You have access to / can run the project's setup.py, and you're also allowed to assume that the project is installed. For example, for (1) I know you can do-- $ python setup.py --name But I'm not sure if accessing setup.py is no longer recommended (as opposed to going through a tool like pip). Thanks a lot, --Chris From thomas at kluyver.me.uk Wed Mar 29 14:55:28 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Wed, 29 Mar 2017 19:55:28 +0100 Subject: [Distutils] obtaining project name and packages In-Reply-To: References: Message-ID: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> I have a tool that does this from a wheel: https://github.com/takluyver/wheeldex >From an sdist, I think you need to either build a wheel or install it before you can get this information reliably. Some of my installed packages have a 'top_level.txt' file in the .dist-info folder, containing a list of the top-level package names installed by that distribution. I don't believe this is formally specified anywhere, though, and packages created by flit do not have it. Thomas On Wed, Mar 29, 2017, at 07:41 PM, Chris Jerdonek wrote: > Hi, this seems like a simple question, but I haven't been able to find > the answer online: > > What is the current recommended way to get (1) the name of a project, > and (2) the names of the top-level packages installed by a project > (not counting the project's dependencies). You have access to / can > run the project's setup.py, and you're also allowed to assume that the > project is installed. > > For example, for (1) I know you can do-- > > $ python setup.py --name > > But I'm not sure if accessing setup.py is no longer recommended (as > opposed to going through a tool like pip). > > Thanks a lot, > --Chris > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From wes.turner at gmail.com Wed Mar 29 16:19:48 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 29 Mar 2017 15:19:48 -0500 Subject: [Distutils] obtaining project name and packages In-Reply-To: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> References: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> Message-ID: On Wed, Mar 29, 2017 at 1:55 PM, Thomas Kluyver wrote: > I have a tool that does this from a wheel: > https://github.com/takluyver/wheeldex > > From an sdist, I think you need to either build a wheel or install it > before you can get this information reliably. > Src: https://code.launchpad.net/~tseaver/pkginfo/trunk PyPI: https://pypi.python.org/pypi/pkginfo This package provides an API for querying the distutils metadata written in > the PKG-INFO file inside a source distriubtion (an sdist) or a binary > distribution (e.g., created by running bdist_egg). It can also query the > EGG-INFO directory of an installed distribution, and the *.egg-info stored > in a ?development checkout? (e.g, created by running setup.py develop). 
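For the case Chris describes, where the project is already installed, the `top_level.txt` convention Thomas mentions can be read through `pkg_resources` without running setup.py. A small sketch (the example project name is arbitrary, and `top_level.txt` may simply be absent for some build tools):

```python
import pkg_resources


def project_info(project_name):
    """Return (canonical name, version, top-level packages) for an installed project."""
    dist = pkg_resources.get_distribution(project_name)
    top_level = []
    # top_level.txt is a setuptools convention rather than a formal standard,
    # so it can legitimately be missing (e.g. for flit-built packages).
    if dist.has_metadata("top_level.txt"):
        top_level = dist.get_metadata("top_level.txt").split()
    return dist.project_name, dist.version, top_level


print(project_info("pip"))  # e.g. ('pip', '9.0.1', ['pip'])
```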
> Docs: https://pythonhosted.org/pkginfo/ https://bazaar.launchpad.net/~tseaver/pkginfo/trunk/files/head:/pkginfo/tests/ > Some of my installed packages have a 'top_level.txt' file in the > .dist-info folder, containing a list of the top-level package names > installed by that distribution. I don't believe this is formally > specified anywhere, though, and packages created by flit do not have it. > > Thomas > > On Wed, Mar 29, 2017, at 07:41 PM, Chris Jerdonek wrote: > > Hi, this seems like a simple question, but I haven't been able to find > > the answer online: > > > > What is the current recommended way to get (1) the name of a project, > > and (2) the names of the top-level packages installed by a project > > (not counting the project's dependencies). You have access to / can > > run the project's setup.py, and you're also allowed to assume that the > > project is installed. > > > > For example, for (1) I know you can do-- > > > > $ python setup.py --name > > > > But I'm not sure if accessing setup.py is no longer recommended (as > > opposed to going through a tool like pip). > > > > Thanks a lot, > > --Chris > > _______________________________________________ > > Distutils-SIG maillist - Distutils-SIG at python.org > > https://mail.python.org/mailman/listinfo/distutils-sig > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Mar 29 17:26:27 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 29 Mar 2017 16:26:27 -0500 Subject: [Distutils] obtaining project name and packages In-Reply-To: References: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> Message-ID: cd ./lib/python2.7/site-packages/notebook-4.4.1.dist-info cat metadata.json | python -m json.tool { "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: System Administrators", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3" ], "extensions": { "python.commands": { "wrap_console": { "jupyter-nbextension": "notebook.nbextensions:main", "jupyter-notebook": "notebook.notebookapp:main", "jupyter-serverextension": "notebook.serverextensions:main" } }, "python.details": { "contacts": [ { "email": "jupyter at googlegroups.com", "name": "Jupyter Development Team", "role": "author" } ], "document_names": { "description": "DESCRIPTION.rst" }, "project_urls": { "Home": "http://jupyter.org" } }, "python.exports": { "console_scripts": { "jupyter-nbextension": "notebook.nbextensions:main", "jupyter-notebook": "notebook.notebookapp:main", "jupyter-serverextension": "notebook.serverextensions:main" } } }, "extras": [ "doc", "test" ], "generator": "bdist_wheel (0.29.0)", "keywords": [ "Interactive", "Interpreter", "Shell", "Web" ], "license": "BSD", "metadata_version": "2.0", "name": "notebook", "platform": "Linux", "run_requires": [ { "extra": "doc", "requires": [ "Sphinx (>=1.1)" ] }, { "requires": [ "ipykernel", "ipython-genutils", "jinja2", "jupyter-client", "jupyter-core", "nbconvert", "nbformat", "tornado (>=4)", "traitlets" ] }, { "extra": "test", "requires": [ "nose", "requests" ] }, { "environment": "python_version == \"2.7\"", "extra": "test", "requires": [ "mock" ] }, { "environment": "sys_platform != \"win32\"", "requires": [ 
"terminado (>=0.3.3)" ] } ], "summary": "A web-based notebook environment for interactive computing", "version": "4.4.1" } On Wed, Mar 29, 2017 at 3:19 PM, Wes Turner wrote: > > > On Wed, Mar 29, 2017 at 1:55 PM, Thomas Kluyver > wrote: > >> I have a tool that does this from a wheel: >> https://github.com/takluyver/wheeldex >> >> From an sdist, I think you need to either build a wheel or install it >> before you can get this information reliably. >> > > Src: https://code.launchpad.net/~tseaver/pkginfo/trunk > > PyPI: https://pypi.python.org/pypi/pkginfo > > This package provides an API for querying the distutils metadata written >> in the PKG-INFO file inside a source distriubtion (an sdist) or a binary >> distribution (e.g., created by running bdist_egg). It can also query the >> EGG-INFO directory of an installed distribution, and the *.egg-info stored >> in a ?development checkout? (e.g, created by running setup.py develop). >> > > Docs: https://pythonhosted.org/pkginfo/ > > https://bazaar.launchpad.net/~tseaver/pkginfo/trunk/files/ > head:/pkginfo/tests/ > > >> Some of my installed packages have a 'top_level.txt' file in the >> .dist-info folder, containing a list of the top-level package names >> installed by that distribution. I don't believe this is formally >> specified anywhere, though, and packages created by flit do not have it. >> >> Thomas >> >> On Wed, Mar 29, 2017, at 07:41 PM, Chris Jerdonek wrote: >> > Hi, this seems like a simple question, but I haven't been able to find >> > the answer online: >> > >> > What is the current recommended way to get (1) the name of a project, >> > and (2) the names of the top-level packages installed by a project >> > (not counting the project's dependencies). You have access to / can >> > run the project's setup.py, and you're also allowed to assume that the >> > project is installed. >> > >> > For example, for (1) I know you can do-- >> > >> > $ python setup.py --name >> > >> > But I'm not sure if accessing setup.py is no longer recommended (as >> > opposed to going through a tool like pip). >> > >> > Thanks a lot, >> > --Chris >> > _______________________________________________ >> > Distutils-SIG maillist - Distutils-SIG at python.org >> > https://mail.python.org/mailman/listinfo/distutils-sig >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Mar 29 20:52:31 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Mar 2017 10:52:31 +1000 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 30 March 2017 at 01:27, Jelle Zijlstra wrote: > > > 2017-03-29 2:31 GMT-07:00 Thomas G?ttler : >> >> >> >> Am 29.03.2017 um 09:51 schrieb Paul Moore: >>> >>> On 29 March 2017 at 06:29, Thomas G?ttler >>> wrote: >>>> >>>> I am stupid and missing a guiding hand which gives me simple straight >>>> forward step by step instruction. >>> >>> >>> To do what? >> >> >> To find canonical docs. With "canonical" I mean current docs from the >> upstream. >> > Are you aware of https://packaging.python.org/ ? 
As an opinionated-but-still-free combination of tools, there's also Kenneth Reitz's pipenv: https://github.com/kennethreitz/pipenv Understandably, that's mainly geared towards network service hosting environments like Heroku, but it also works pretty well for command line apps, testing environment setups, etc. However, none of the available options will get away from the fact that only end users know their own operational requirements - we can't provide a single universal right answer, because there isn't a single universal use case. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guettliml at thomas-guettler.de Thu Mar 30 03:31:52 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Thu, 30 Mar 2017 09:31:52 +0200 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: <00ecbf9b-5f11-46fe-482d-2db07e52524e@thomas-guettler.de> Am 29.03.2017 um 10:27 schrieb Nick Coghlan: > On 29 March 2017 at 17:51, Paul Moore wrote: >> On 29 March 2017 at 06:29, Thomas G?ttler wrote: >>> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. >> >> To do what? > > As far as I can tell, to get a customer experience instead of a > prospective co-contributor one. You are right > I'm sorry Thomas, as long as you continue looking for a coherent > customer experience from a collaborative collection of volunteer-run > community projects, you're going to continually be confused and > disappointed. You are right > The Python ecosystem *does* include commercial vendors that offer to > make opinionated technical decisions on behalf of their customers, as > well as providing a single point of contact for support questions and > feature requests, but beyond that, offering an overwhelming array of > confusing choices is pretty much the way open source *works*. You are right. Regards, Thomas -- Thomas Guettler http://www.thomas-guettler.de/ From guettliml at thomas-guettler.de Thu Mar 30 03:36:59 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Thu, 30 Mar 2017 09:36:59 +0200 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> Am 29.03.2017 um 11:47 schrieb Paul Moore: > On 29 March 2017 at 10:31, Thomas G?ttler wrote: >> Am 29.03.2017 um 09:51 schrieb Paul Moore: >>> >>> On 29 March 2017 at 06:29, Thomas G?ttler >>> wrote: >>>> >>>> I am stupid and missing a guiding hand which gives me simple straight >>>> forward step by step instruction. >>> >>> >>> To do what? >> >> To find canonical docs. With "canonical" I mean current docs from the >> upstream. > > I think Nick's point probably covers this discussion, but you haven't > said what you want docs *for*. pip? setuptools? wheel?something else? If you are wearing new comer glasses, you don't know exactly what you are looking for. If you would know that, then you would be an expert. And then you don't need a guiding hand. > They are in various places, which you can hunt out via pypi or google. > It's not hard to do, but certainly it's true that it's harder to find > things than you'd want if you were paying for a well-documented > service. 
But given that you're not paying anything, and no-one working > on Python packaging has any obligation to meet your expectations, > you'll need to either lower the level of your expectations, pay > someone to provide what you're looking for, or offer your own time and > energy to address the issues you find. Simply making vague complaints > on this list isn't particularly productive. The complaint is vague? Here is it more precise: Quoting https://www.pypa.io/en/latest/ {{{ They host projects on github and bitbucket, and discuss issues on the pypa-dev and distutils-sig mailing lists. }}} Why two repo providers, why two mailing lists. This confuses new comers. I think this is precise feedback. > Sorry if that's not the response you were hoping for, and in > particular if you have a pressing need for support that we're not > providing, I do understand how that can be a problem for you, but as > Nick says, this is the reality of relying on software that's provided > to you free of charge. Yes, Nich is right. Regards, Thomas -- Thomas Guettler http://www.thomas-guettler.de/ From p.f.moore at gmail.com Thu Mar 30 04:07:06 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 30 Mar 2017 09:07:06 +0100 Subject: [Distutils] Source of confusion In-Reply-To: <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> Message-ID: On 30 March 2017 at 08:36, Thomas G?ttler wrote: > Why two repo providers, why two mailing lists. This confuses new comers. > > I think this is precise feedback. Because there is more than one project, and because the topics of discussion are different on the two. Paul From guettliml at thomas-guettler.de Thu Mar 30 04:53:00 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Thu, 30 Mar 2017 10:53:00 +0200 Subject: [Distutils] Which commercial vendor? In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: <2ea413e9-9fce-9125-897d-caef87a51dfd@thomas-guettler.de> Am 29.03.2017 um 10:27 schrieb Nick Coghlan: > On 29 March 2017 at 17:51, Paul Moore wrote: >> On 29 March 2017 at 06:29, Thomas G?ttler wrote: >>> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. >> >> To do what? > > As far as I can tell, to get a customer experience instead of a > prospective co-contributor one. > > I'm sorry Thomas, as long as you continue looking for a coherent > customer experience from a collaborative collection of volunteer-run > community projects, you're going to continually be confused and > disappointed. > > The Python ecosystem *does* include commercial vendors that offer to > make opinionated technical decisions on behalf of their customers, as > well as providing a single point of contact for support questions and > feature requests, but beyond that, offering an overwhelming array of > confusing choices is pretty much the way open source *works*. My frustration has reached a limit. Yes, I am willing to pay money. Which vendor do you suggest to give me a reliable package management? Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ From ncoghlan at gmail.com Thu Mar 30 05:38:09 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Mar 2017 19:38:09 +1000 Subject: [Distutils] Which commercial vendor? 
In-Reply-To: <2ea413e9-9fce-9125-897d-caef87a51dfd@thomas-guettler.de> References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> <2ea413e9-9fce-9125-897d-caef87a51dfd@thomas-guettler.de> Message-ID: On 30 March 2017 at 18:53, Thomas G?ttler wrote: > Am 29.03.2017 um 10:27 schrieb Nick Coghlan: >> The Python ecosystem *does* include commercial vendors that offer to >> make opinionated technical decisions on behalf of their customers, as >> well as providing a single point of contact for support questions and >> feature requests, but beyond that, offering an overwhelming array of >> confusing choices is pretty much the way open source *works*. > > My frustration has reached a limit. Yes, I am willing to pay money. > > Which vendor do you suggest to give me a reliable package management? For cross-platform use cases, the best known options are ActiveState, Enthought, and Continuum Analytics (with the latter two focusing primarily on data analysis tasks). Another option if you're looking to bundle your own applications is PyRun, from eGenix: http://www.egenix.com/products/python/PyRun/ Finally, if you're solely interested in Linux, then Python runtimes are generally covered by commercial Linux support agreements. However, exactly which of the available commercial support options will be the best fit for your needs will depend on what you're aiming to do, and how it aligns with their product offerings. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From donald at stufft.io Thu Mar 30 07:17:05 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 30 Mar 2017 07:17:05 -0400 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> Message-ID: <63423554-7721-4B45-B7CE-8CF4F4C3FF1B@stufft.io> > On Mar 30, 2017, at 4:07 AM, Paul Moore wrote: > > On 30 March 2017 at 08:36, Thomas G?ttler wrote: >> Why two repo providers, why two mailing lists. This confuses new comers. >> >> I think this is precise feedback. > > Because there is more than one project, and because the topics of > discussion are different on the two. > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig To expand upon this a little more, PyPA is not really a formal organization where we dictate top down about how projects must operate. About the only requirements we have are that your project relates in some way to Python?s packaging toolchain and that you accept being bound by our CoC. Beyond that projects under the PyPA banner are operated independently of each other with their own policies and procedures and such. It provides some loose organization and a ?brand? but that?s really about all, so while most projects have opted to use GitHub, not all of them have (and that?s ok!). The two mailing lists are *largely* historical really, there was a time when distutils-sig (and before that even, catalog-sig) was not a particularly pleasant place to discuss things at and as such it made sense to try and sequester yourself away from it for some kinds of discussions. This has gotten a lot better in recent years and *most* mailing list like discussion tends to happen here on distutils-sig. 
Couple that with the fact that the individual projects tend to use their
issue trackers to hold discussions that are specific to their particular
project, and we could probably consolidate, but I also don't think it's
a big deal either.

--
Donald Stufft

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bruno.rosa at eldorado.org.br  Thu Mar 30 16:09:20 2017
From: bruno.rosa at eldorado.org.br (Bruno Alexandre Rosa)
Date: Thu, 30 Mar 2017 20:09:20 +0000
Subject: [Distutils] Wheel files for PPC64le
In-Reply-To: <46301d400edb469099abac67f6fed74a@serv030.corp.eldorado.org.br>
References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br>
 <46301d400edb469099abac67f6fed74a@serv030.corp.eldorado.org.br>
Message-ID: <53d512c13781484db994797dbacf34d4@serv030.corp.eldorado.org.br>

Hi there,

First of all, thanks for checking out the information about Docker
images, Nick!

Since Leonardo's last email had some formatting issues, I'm fixing it
(mostly manually) and sending it here again.

Kind regards,
Bruno Rosa

--------------------------------------------------

> Having manylinuxN consistently align with CentOS(N+4) seems reasonable
> to me for simplicity's sake, but there should be a discussion in the
> PEP around how that aligns with ppc64le support on other LTS distros
> (mainly Debian and Ubuntu).
> Given the relative dates involved, I'd expect manylinux-style binaries
> compiled on CentOS 7 to also work on Ubuntu 14.04, 16.04 and Debian 8,
> but the PEP should explicitly confirm that the nominated symbol
> versions above are available on all of those distros.

Ok, I can add it to the PEP, but regarding the supported distros: those
older than CentOS 7 may not be compatible, because the backward
compatibility rules only guarantee compatibility with newer versions,
not with older ones. I sent a message about it here:
https://mail.python.org/pipermail/wheel-builders/2017-March/000265.html

> I don't think it is quite that simple, as installers need to be able to
> figure out:
> - on manylinux3 compatible platforms, prefer manylinux3 to manylinux1
> - on manylinux3 *in*compatible platforms, only consider manylinux1
> And that means asking the question: when combined with the option of
> the distro-provided `_manylinux` module, is "have_compatible_glibc(2, 5)
> and not have_compatible_glibc(2, 17)" an adequate check for the latter
> case? (My inclination is to say "yes", but it would be helpful to have
> some more concrete data on glibc versions in different distros of
> interest)

Well, I didn't realize that proposing a new tag would require an
additional check for the tags, which will be a requirement for
manylinux2 as well, once CentOS 5 is replaced by CentOS 6 for
x86_64/i686. I need to check where and how the `is_manylinux1_compatible`
method is used in order to work out how it would be done. I will check
that and propose how to do it.

> Beyond that, I think the main open question would be: do we go ahead
> and define the full `manylinux3` specification now? CentOS 7+, Ubuntu
> 14.04+, Debian 8+ compatibility still covers a *lot* of distros and
> deployments, and doing so means folks can bring the latest versions of
> gcc to bear on their code, rather than being limited to the last
> version that was made available for RHEL/CentOS 5 (gcc 4.8).

Actually, the idea was to make it available for PPC64le, just as it is
available for x86_64/i686 nowadays, essentially porting it.
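
For readers following the "have_compatible_glibc(2, 5) and not
have_compatible_glibc(2, 17)" question quoted above, here is a minimal
sketch of that check, modelled on the ctypes-based reference
implementation in PEP 513; the 2.17 threshold for a CentOS-7-based tag
is this thread's working assumption, not anything standardized:

    import ctypes

    def have_compatible_glibc(major, minimum_minor):
        # Ask the glibc this interpreter is actually linked against for
        # its version string (e.g. "2.17") via gnu_get_libc_version().
        process_namespace = ctypes.CDLL(None)
        try:
            gnu_get_libc_version = process_namespace.gnu_get_libc_version
        except AttributeError:
            # Symbol is missing, so we are not linked against glibc at
            # all (e.g. musl-based distros).
            return False
        gnu_get_libc_version.restype = ctypes.c_char_p
        version_str = gnu_get_libc_version()
        if not isinstance(version_str, str):
            version_str = version_str.decode("ascii")  # bytes on Python 3
        found_major, found_minor = (int(p) for p in version_str.split(".")[:2])
        return found_major == major and found_minor >= minimum_minor

    # "manylinux1-era glibc, but not CentOS-7-era glibc", as quoted above:
    only_manylinux1 = (have_compatible_glibc(2, 5)
                       and not have_compatible_glibc(2, 17))

The same pattern works for any glibc-based cutoff, which is why the
exact threshold chosen for the newer tag matters for the "incompatible
platforms only consider manylinux1" case.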
I didn't think about defining all the requirements for manylinux3 for
all architectures, as they can change before x86_64/i686 get to
manylinux3. Being limited to an old version such as CentOS 5 (gcc 4.8)
is a requirement from PEP 513, which guarantees the backward
compatibility, right? I do not want to change that; this proposal is
just to create a tag for PPC64le until both architectures get to the
same base distro version.

As I said above, I have already sent a message about basing it on
CentOS 7, which does not guarantee compatibility with older distros
(for example, Ubuntu 14.04). Is there any thinking about basing it on a
newer distro while keeping the wheel files compatible with distros
older than it? Sorry if I'm missing something here.

I'm copying Bruno Rosa, who will be involved with this PEP as well.

Cheers,
Leonardo Bianconi.

--------------------------------------------------
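
To make the shape of the proposed PPC64le tag concrete, a hypothetical
compatibility check for a CentOS-7-based tag could follow the same
pattern as PEP 513's `is_manylinux1_compatible`: honour a
distro-provided `_manylinux` override module first, then fall back to a
glibc heuristic. The tag name and the `_manylinux` attribute below are
placeholders invented for illustration, not part of any accepted spec,
and the sketch reuses the `have_compatible_glibc` helper shown earlier
in the thread:

    import platform

    def is_manylinux_ppc64le_compatible():
        # Hypothetical check for a CentOS-7-based ppc64le tag; the tag
        # name and the _manylinux attribute are illustrative placeholders.
        if platform.machine() != "ppc64le":
            return False
        # Let the distro declare (in)compatibility explicitly, as PEP 513
        # allows for manylinux1 via a _manylinux module on sys.path.
        try:
            import _manylinux
            return bool(_manylinux.manylinux_ppc64le_compatible)
        except (ImportError, AttributeError):
            pass
        # Otherwise fall back to the glibc heuristic: CentOS 7 ships
        # glibc 2.17, so require at least that (have_compatible_glibc as
        # sketched earlier in this thread).
        return have_compatible_glibc(2, 17)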