From pradyunsg at gmail.com Wed Mar 1 05:28:34 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Wed, 01 Mar 2017 10:28:34 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Tue, Feb 28, 2017, 21:18 Jim Fulton wrote: On Tue, Feb 28, 2017 at 10:14 AM, Pradyun Gedam wrote: ... 4. (if time permits) Move any dependency resolution code out into a separate library. This would make it possible for other projects (like buildout or a future pip replacement) to reuse the dependency resolver. Thank you! Welcome! ... I do intend to reuse some of the work done by Robert Collins in PR #2716 on pip's GitHub repository. Are you aware of the proof of concept in distlib? I am. I had looked at it a few weeks back. IIRC it makes a dependency graph using distlib and operates with that. I haven't really understood how it gets the information about dependencies without downloading the packages... I'll give it another pass this weekend. https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Mar 1 05:53:09 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 1 Mar 2017 10:53:09 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On 1 March 2017 at 10:28, Pradyun Gedam wrote: > I haven't really understood how it gets the information about dependencies > without downloading the packages... I'll give it another pass this weekend. If I recall, it reads static dependency data held on the red-dove site and maintained by downloading and running egg-info on the packages as changes occur. I don't think it's a sustainable approach for pip at the moment (my understanding is that it was a proof of concept for what having static metadata on PyPI would gain us). Paul From xav.fernandez at gmail.com Wed Mar 1 07:36:29 2017 From: xav.fernandez at gmail.com (Xavier Fernandez) Date: Wed, 1 Mar 2017 13:36:29 +0100 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: Great news ! Your plan seems reasonable. The first stage (RequirementSet refactor) seems to me to be the trickiest. Anyway I'm looking forward for your PRs :) Xavier -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Wed Mar 1 09:17:28 2017 From: robin at reportlab.com (Robin Becker) Date: Wed, 1 Mar 2017 14:17:28 +0000 Subject: [Distutils] win amd_x64 Python 2.7.8 --> 2.7.13 woes Message-ID: <92171b3a-e9e8-ecf3-fc5b-8063e2561dad@chamonix.reportlab.co.uk> I find my extensions compiled for windows amd_x64 with python 2.7.8 no longer work after I updated python to 2.7.13. Is that expected? I had assumed that the cpy27 wheels that I make would work with any python 2.7, but this makes me doubt that. To get the reportlab tests to complete I rebuilt all my extensions and installed a newer version of pillow. 
In addition to that the uninstallation of the amd64 python 2.7.8 has also uninstalled the x86 version of python 2.7.8 :( -- Robin Becker From robin at reportlab.com Wed Mar 1 09:44:03 2017 From: robin at reportlab.com (Robin Becker) Date: Wed, 1 Mar 2017 14:44:03 +0000 Subject: [Distutils] win amd_x64 Python 2.7.8 --> 2.7.13 woes In-Reply-To: <92171b3a-e9e8-ecf3-fc5b-8063e2561dad@chamonix.reportlab.co.uk> References: <92171b3a-e9e8-ecf3-fc5b-8063e2561dad@chamonix.reportlab.co.uk> Message-ID: Ignore this; it was my duh :( of the day; seems the download button didn't give me the amd_x64 version so it carefully installed an x86 where I used to have my amd_x64. After banging my head with a hammer I downloaded the installers carefully and got things back to normal. On 1 March 2017 at 14:17, Robin Becker wrote: > I find my extensions compiled for windows amd_x64 with python 2.7.8 no > longer work after I updated python to 2.7.13. > > Is that expected? I had assumed that the cpy27 wheels that I make would > work with any python 2.7, but this makes me doubt that. > > To get the reportlab tests to complete I rebuilt all my extensions and > installed a newer version of pillow. > > In addition to that the uninstallation of the amd64 python 2.7.8 has also > uninstalled the x86 version of python 2.7.8 :( > -- > Robin Becker > -- Robin Becker -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 1 15:02:09 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 2 Mar 2017 09:02:09 +1300 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam wrote: > Hello Everyone! > > Google released the list of accepted organizations for GSoC 2017 and PSF > is one of them. > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/. This is pretty urgent to arrange: * "March 3* - Last day for Python sub-orgs to apply to participate with the PSF. (Assuming we get accepted by Google and can support sub-orgs, of course!) This deadline is for orgs who applies on their own and didn't make it, but still wish to participate under the umbrella. " The original deadline was Feb 7. There's a good chance that Pip will still be accepted after March 3, but I wouldn't gamble on it. There are instructions under "Project Ideas" on http://python-gsoc.org/ on how to get accepted as a sub-org. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Wed Mar 1 15:07:39 2017 From: donald at stufft.io (Donald Stufft) Date: Wed, 1 Mar 2017 15:07:39 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> > On Mar 1, 2017, at 3:02 PM, Ralf Gommers wrote: > > > > On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam > wrote: > Hello Everyone! > > Google released the list of accepted organizations for GSoC 2017 and PSF is one of them. > > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/ . This is pretty urgent to arrange: > > "March 3 - Last day for Python sub-orgs to apply to participate with the PSF. > (Assuming we get accepted by Google and can support sub-orgs, of course!) > This deadline is for orgs who applies on their own and didn't make it, but still > wish to participate under the umbrella. " > > The original deadline was Feb 7. 
There's a good chance that Pip will still be accepted after March 3, but I wouldn't gamble on it. > > There are instructions under "Project Ideas" on http://python-gsoc.org/ on how to get accepted as a sub-org. > Oh. I?ve never done this before and Pradyun reached out so I had no idea I had to do this. I?ll go ahead and do this. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Wed Mar 1 15:13:51 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Wed, 01 Mar 2017 20:13:51 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> Message-ID: Thanks for the pointer Ralf! :) I was actually drafting a mail to send to Donald directly for thanking him for being willing to mentor me as well as pointing this out to him. I guess I can discard that draft now... On Thu, Mar 2, 2017, 01:37 Donald Stufft wrote: > > On Mar 1, 2017, at 3:02 PM, Ralf Gommers wrote: > > > > On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam wrote: > > Hello Everyone! > > Google released the list of accepted organizations for GSoC 2017 and PSF > is one of them. > > > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/. > This is pretty urgent to arrange: > > * "March 3* - Last day for Python sub-orgs to apply to participate > with the PSF. > (Assuming we get accepted by Google and can support sub-orgs, of > course!) > This deadline is for orgs who applies on their own and didn't make it, > but still > wish to participate under the umbrella. " > > The original deadline was Feb 7. There's a good chance that Pip will still > be accepted after March 3, but I wouldn't gamble on it. > > There are instructions under "Project Ideas" on http://python-gsoc.org/ > on how to get accepted as a sub-org. > > > > Oh. I?ve never done this before and Pradyun reached out so I had no idea I > had to do this. I?ll go ahead and do this. > > > ? > > Donald Stufft > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 1 16:31:34 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 2 Mar 2017 10:31:34 +1300 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> Message-ID: On Thu, Mar 2, 2017 at 9:07 AM, Donald Stufft wrote: > > On Mar 1, 2017, at 3:02 PM, Ralf Gommers wrote: > > > > On Wed, Mar 1, 2017 at 4:14 AM, Pradyun Gedam wrote: > >> Hello Everyone! >> >> Google released the list of accepted organizations for GSoC 2017 and PSF >> is one of them. >> > > I see pip is not yet listed as a PSF sub-org on http://python-gsoc.org/. > This is pretty urgent to arrange: > > * "March 3* - Last day for Python sub-orgs to apply to participate > with the PSF. > (Assuming we get accepted by Google and can support sub-orgs, of > course!) > This deadline is for orgs who applies on their own and didn't make it, > but still > wish to participate under the umbrella. " > > The original deadline was Feb 7. There's a good chance that Pip will still > be accepted after March 3, but I wouldn't gamble on it. > > There are instructions under "Project Ideas" on http://python-gsoc.org/ > on how to get accepted as a sub-org. > > > > Oh. I?ve never done this before and Pradyun reached out so I had no idea I > had to do this. 
I?ll go ahead and do this. > I'm the GSoC admin for SciPy, so need to keep track of the various deadlines/todos. I'd be happy to ping you each time one approaches if that helps. There's a PSF GSoC mentors list that's not noisy and useful to join. You'll be added to the Google GSoC-mentors list automatically if you start mentoring in the program, but you may want to mute it or not use your primary email address for it (it's high-traffic, very low signal to noise and you can't unsubscribe). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From xo.olive at gmail.com Thu Mar 2 01:50:48 2017 From: xo.olive at gmail.com (Xavier Olive) Date: Thu, 2 Mar 2017 07:50:48 +0100 Subject: [Distutils] How to change the linking command in a setuptools building process? Message-ID: I maintain a Cython binding to some OCaml code (through their respective C interface). For past versions, I managed to cheat and distribute a wheel file for Windows through cross-compilation. Now, I finally managed a clean and native way to produce the library for Windows 64 bits. For the 32 bits cross-compiled version, I had a specific target in my setup.py with the proper commands to execute. Back on Windows, I would like to stick to a setuptoolsic way of doing, but the thing is I need to replace the regular linking command link.exe with a different tool (resp. flexlink.exe, shipped with OCaml on Windows) Don't panic: flexlink.exe just builds some assembler shit before compiling and linking with the regular link.exe. It is the proper way to link OCaml executables and shared libraries under Windows. For MacOS and Linux, the traditional Extension pattern works like a charm as follows (mlobject is produced by OCaml a bit earlier in the file after some timestamp checks, asmrunlib is the full path to the equivalent of python36.dll for OCaml) : extensions = [ Extension("foo", ["foo.pyx", "interface_c.c"], language="c", include_dirs=INCLUDE, extra_compile_args=compileargs, extra_link_args=[mlobject, asmrunlib, ] ) ] Let's say I limit myself to Python>=3.5, I guess (by comparison with too big projects like NumPy) I would need to start by extending distutils._msvccompiler.MSVCCompiler and replace the self.linker = _find_exe("link.exe", paths) with something based on flexlink.exe. The problem is that I have no idea how they manage the plumbing work that comes next (connecting this extended compiler and making it look like the regular msvc to the setup process). I suppose it is not thoroughly documented anywhere and that if they were able to do more than that in NumPy, I should be able to reach my goal somehow. My setup.py is still reasonably basic and a solution that keeps the whole building/packaging process in one single file would be great! Xavier (Question first asked here https://stackoverflow.com/q/42519377/1595335 before being advised this mailing-list) From donald at stufft.io Thu Mar 2 11:12:31 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 2 Mar 2017 11:12:31 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> Message-ID: <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> Ok, so It appears besides me we need another one or two mentors to act as backup mentors. I guess in the event I?m not available or so. Probably ideally the backup mentor would either be familiar with pip?s codebase or else familiar with the ideas behind a backtracking resolver. 
I do have someone who can do it if needed, but I figured I'd poke distutils-sig first to see if anyone else wanted to do it as well. They suggest that at least one mentor be exclusive to the student but that the other mentors can work with multiple students. For pip we only have the one (yay Pradyun) and I'm not mentoring anyone else so we should be good on the exclusive front (of course, if someone is interested to help with this, they can also be exclusive). > On Mar 1, 2017, at 4:31 PM, Ralf Gommers wrote: > > > I'm the GSoC admin for SciPy, so need to keep track of the various deadlines/todos. I'd be happy to ping you each time one approaches if that helps. That would be awesome. I'm poking at the sites now to figure out everything I need to do to make sure all the administration bits are done properly, but having a double check that I don't miss something would be great. > > There's a PSF GSoC mentors list that's not noisy and useful to join. You'll be added to the Google GSoC-mentors list automatically if you start mentoring in the program, but you may want to mute it or not use your primary email address for it (it's high-traffic, very low signal to noise and you can't unsubscribe). Ok cool. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcappos at nyu.edu Thu Mar 2 11:31:55 2017 From: jcappos at nyu.edu (Justin Cappos) Date: Thu, 2 Mar 2017 11:31:55 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> Message-ID: I'd be happy to help to provide mentorship for the backtracking dependency resolver aspect. I don't know pip's code well though. Thanks, Justin On Thu, Mar 2, 2017 at 11:12 AM, Donald Stufft wrote: > Ok, so It appears besides me we need another one or two mentors to act as > backup mentors. I guess in the event I'm not available or so. Probably > ideally the backup mentor would either be familiar with pip's codebase or > else familiar with the ideas behind a backtracking resolver. I do have > someone who can do it if needed, but I figured I'd poke distutils-sig first > to see if anyone else wanted to do it as well. > > They suggest that at least one mentor be exclusive to the student but that > the other mentors can work with multiple students. For pip we only have the > one (yay Pradyun) and I'm not mentoring anyone else so we should be good on > the exclusive front (of course, if someone is interested to help with this, > they can also be exclusive). > > On Mar 1, 2017, at 4:31 PM, Ralf Gommers wrote: > > > I'm the GSoC admin for SciPy, so need to keep track of the various > deadlines/todos. I'd be happy to ping you each time one approaches if that > helps. > > > > That would be awesome. I'm poking at the sites now to figure out > everything I need to do to make sure all the administration bits are done > properly, but having a double check that I don't miss something would be > great. > > > There's a PSF GSoC mentors list that's not noisy and useful to join. > You'll be added to the Google GSoC-mentors list automatically if you start > mentoring in the program, but you may want to mute it or not use your > primary email address for it (it's high-traffic, very low signal to noise > and you can't unsubscribe). > > > Ok cool. > > --
> Donald Stufft > > > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Thu Mar 2 11:42:32 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 2 Mar 2017 11:42:32 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: <0C091B31-E7C8-4025-A07C-6B103F93CB31@stufft.io> <77E7DCE9-BDE5-4BC7-AC16-973703041C10@stufft.io> Message-ID: <13A8A08D-3E89-4C61-8E70-ACB07B2F5EB6@stufft.io> > On Mar 2, 2017, at 11:31 AM, Justin Cappos wrote: > > I'd be happy to help to provide mentorship for the backtracking dependency resolver aspect. I don't know pip's code well though. > Awesome, that would work out well actually I think, because while I know pip's code base, the actual resolver bits are not my strong suit (one of the main reasons I hadn't done this work already is the research to actually figure out the right resolver tech and how it functions). -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sat Mar 4 12:25:54 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sat, 04 Mar 2017 17:25:54 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: On Wed, 1 Mar 2017 at 15:58 Pradyun Gedam wrote: > > > On Tue, Feb 28, 2017, 21:18 Jim Fulton wrote: > > On Tue, Feb 28, 2017 at 10:14 AM, Pradyun Gedam > wrote: > ... > > 4. (if time permits) Move any dependency resolution code out into a > separate library. > > This would make it possible for other projects (like buildout or a > future pip replacement) to reuse the dependency resolver. > > > Thank you! > > > Welcome! > > > ... > > I do intend to reuse some of the work done by Robert Collins in PR #2716 > on pip's GitHub repository. > > > Are you aware of the proof of concept in distlib? > > > I am. I had looked at it a few weeks back. IIRC it makes a dependency > graph using distlib and operates with that. > > I haven't really understood how it gets the information about dependencies > without downloading the packages... I'll give it another pass this weekend. > I went through it. As Paul Moore said, it is hitting http://www.red-dove.com/pypi/ which has metadata on what the requirements are of a package. (saying this on the basis of [1]) Since PyPI does not have such information in a static declarative format, that approach is not feasible. pip will have to download packages and execute setup.py to know what the dependencies are. [1]: https://www.red-dove.com/pypi/projects/S/Sphinx/package-1.3.json > > > https://distil.readthedocs.io/en/0.1.0/overview.html#actual-improvements > > Jim > > -- > Jim Fulton > http://jimfulton.info > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sat Mar 4 12:28:32 2017 From: donald at stufft.io (Donald Stufft) Date: Sat, 4 Mar 2017 12:28:32 -0500 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: References: Message-ID: <8D134761-2C9B-49B8-83F1-E2434F8BBD61@stufft.io> > On Mar 4, 2017, at 12:25 PM, Pradyun Gedam wrote: > > Since PyPI does not have such information in a static declarative format, that approach is not feasible. pip will have to download packages and execute setup.py to know what the dependencies are.
I will note, that we can expose that information in PyPI for *wheels*, but not for sdists currently. It would be a lot more work though because it'd essentially require a whole new repository API and I doubt Pradyun wants to tackle that right now :) Keeping a future in mind where we can get at least some of that information without downloading would be good though, at least to keep in mind when structuring code. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 5 01:17:46 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 05 Mar 2017 06:17:46 +0000 Subject: [Distutils] GSoC 2017 - Plan of Action for dependency resolver In-Reply-To: <8D134761-2C9B-49B8-83F1-E2434F8BBD61@stufft.io> References: <8D134761-2C9B-49B8-83F1-E2434F8BBD61@stufft.io> Message-ID: On Sat, 4 Mar 2017 at 22:58 Donald Stufft wrote: > > On Mar 4, 2017, at 12:25 PM, Pradyun Gedam wrote: > > Since PyPI does not have such information in a static declarative format, > that approach is not feasible. pip will have to download packages and > execute setup.py to know what the dependencies are. > > > > I will note, that we can expose that information in PyPI for *wheels*, but > not for sdists currently. It would be a lot more work though because it'd > essentially require a whole new repository API and I doubt Pradyun wants to > tackle that right now :) > Yeah... For now, it's just dependency resolution in pip. > Keeping a future in mind where we can get at least some of that information > without downloading would be good though, at least to keep in mind when > structuring code. > Duly noted. > -- > > Donald Stufft > -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sun Mar 5 11:51:44 2017 From: donald at stufft.io (Donald Stufft) Date: Sun, 5 Mar 2017 11:51:44 -0500 Subject: [Distutils] Deprecating download counts in API? Message-ID: So, as most folks are aware PyPI has long had a cumulative download count available in its API. This has been on and off again broken for a *long* time and arguably the numbers in there have been "wrong" even when it was working because we had no way to reproduce them from scratch (and thus whenever a bug occurred we'd flat out lose data or add incorrect data with no way to correct it). In the meantime, we've gotten a much better source of querying for download information available inside of Google's BigQuery database [1][2]. Not only is this able to be recreated "from scratch" so we can, if needed, fix massive data bugs but it provides MUCH more information than the previous downloads and a very powerful query language to go along with it. Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, instead preferring people to start using the BigQuery data instead. This more or less reflects the current state of things, since it has been on and off broken (typically broken) for something like a year now. [1] https://mail.python.org/pipermail/distutils-sig/2016-May/028986.html [2] https://langui.sh/2016/12/09/data-driven-decisions/ -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Mon Mar 6 01:41:12 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Sun, 5 Mar 2017 22:41:12 -0800 Subject: [Distutils] Deprecating download counts in API? 
In-Reply-To: References: Message-ID: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> > On Mar 5, 2017, at 8:51 AM, Donald Stufft wrote: > > So, as most folks are aware PyPI has long had a cumulative download count available in its API. This has been on and off again broken for a *long* time and arguably the numbers in there have been "wrong" even when it was working because we had no way to reproduce them from scratch (and thus whenever a bug occurred we'd flat out lose data or add incorrect data with no way to correct it). > > In the meantime, we've gotten a much better source of querying for download information available inside of Google's BigQuery database [1][2]. Not only is this able to be recreated "from scratch" so we can, if needed, fix massive data bugs but it provides MUCH more information than the previous downloads and a very powerful query language to go along with it. > > Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, instead preferring people to start using the BigQuery data instead. This more or less reflects the current state of things, since it has been on and off broken (typically broken) for something like a year now. I fully realize that if I really wanted this, I could do it myself, and the last thing you need is someone signing you up for more work :). But, as someone who's been vaguely annoyed that `vanity` doesn't work for a while, I wonder: shouldn't it be easy for someone familiar with both systems to simply implement the existing "download count" API as a legacy / compatibility wrapper around BigQuery? If that isn't trivial, doesn't that point to something flawed in the way the data is presented in BigQuery? That said, I'm fully OK with the answer that even a tiny bit of work is too much, and the limited volunteer effort of PyPI should be spent elsewhere. -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominic.lund at lj-oz.com Sun Mar 5 16:43:43 2017 From: dominic.lund at lj-oz.com (Dominic Lund) Date: Mon, 6 Mar 2017 08:43:43 +1100 Subject: [Distutils] help required Message-ID: I have a third party python module I have just downloaded It is in a zip file in my download directory How do I use pip to install it? Or can I 'install' it manually? -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Mon Mar 6 04:04:02 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 6 Mar 2017 09:04:02 +0000 Subject: [Distutils] help required In-Reply-To: References: Message-ID: On 5 March 2017 at 21:43, Dominic Lund wrote: > I have a third party python module I have just downloaded > > It is in a zip file in my download directory > > How do I use pip to install it? > > Or can I 'install' it manually? Does the documentation for the module not tell you how to do so? If not, it's difficult to advise you with so little information. "pip install <path to the zip file>" might work, but be aware that that command will likely run some of the code in the zipfile, so if you're not sure it's the right thing to do, you should at least be sure that the code has come from somewhere you trust. For anyone here to help you, you'd need to at a minimum let us know what the module is, where you got it from, and what you have already tried (and what happened). 
Paul From donald at stufft.io Mon Mar 6 06:34:10 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 6 Mar 2017 06:34:10 -0500 Subject: [Distutils] Deprecating download counts in API? In-Reply-To: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> References: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> Message-ID: > On Mar 6, 2017, at 1:41 AM, Glyph Lefkowitz wrote: > > >> On Mar 5, 2017, at 8:51 AM, Donald Stufft > wrote: >> >> So, as most folks are aware PyPI has long had a cumulative download count available in its API. This has been on and off again broken for a *long* time and arguably the numbers in there have been "wrong" even when it was working because we had no way to reproduce them from scratch (and thus whenever a bug occurred we'd flat out lose data or add incorrect data with no way to correct it). >> >> In the meantime, we've gotten a much better source of querying for download information available inside of Google's BigQuery database [1][2]. Not only is this able to be recreated "from scratch" so we can, if needed, fix massive data bugs but it provides MUCH more information than the previous downloads and a very powerful query language to go along with it. >> >> Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, instead preferring people to start using the BigQuery data instead. This more or less reflects the current state of things, since it has been on and off broken (typically broken) for something like a year now. > > I fully realize that if I really wanted this, I could do it myself, and the last thing you need is someone signing you up for more work :). But, as someone who's been vaguely annoyed that `vanity` doesn't work for a while, I wonder: shouldn't it be easy for someone familiar with both systems to simply implement the existing "download count" API as a legacy / compatibility wrapper around BigQuery? If that isn't trivial, doesn't that point to something flawed in the way the data is presented in BigQuery? > > That said, I'm fully OK with the answer that even a tiny bit of work is too much, and the limited volunteer effort of PyPI should be spent elsewhere. > > -glyph > It's not hard at all, it'd just be (standard SQL mode): SELECT file.filename, COUNT(*) AS downloads FROM `the-psf.pypi.downloads*` WHERE file.project = "twisted" GROUP BY file.filename You can probably guess how to handle modifications to this query since it's roughly just regular old SQL. There are a few reasons I don't want to just do this in PyPI though. This query will take somewhere between 30 and 60 seconds to complete, so I can't do it inline with the HTTP request, and I'd need to have a periodic job go through and issue about 100k queries (or a single query with almost a million results) and then load that into the database. More importantly though, we don't have an unlimited amount of BigQuery on PyPI. We get blocks of credits granted periodically and so the faster we use up "spend" the more regularly I have to track down my contacts inside of Google and get them to re-up the credit. This adds an incentive to try and reduce our spending where we can to limit the frequency and the amount of time I need to go between asking for more credits. Due to BigQuery's billing model you get billed based upon how much data your query has to process which means that a query that fetches data for all time, will be the most expensive kind of query and gets more expensive every day. 
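A rough sketch of issuing the query above from Python with the google-cloud-bigquery client library, for anyone who wants to experiment; the project id is a placeholder for whatever billing-enabled GCP project the caller supplies, since that project is what gets billed for the query:

    from google.cloud import bigquery

    # Placeholder project id; use your own billing-enabled GCP project.
    client = bigquery.Client(project="my-gcp-project")

    query = """
    SELECT file.filename AS filename, COUNT(*) AS downloads
    FROM `the-psf.pypi.downloads*`
    WHERE file.project = "twisted"
    GROUP BY filename
    """

    # query() submits the job; result() blocks until the rows are available.
    for row in client.query(query).result():
        print(row["filename"], row["downloads"])

The same pattern works for the date-limited variant shown below, which is the cheaper way to run it.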
On the flip side, the BigQuery data is publicly queryable and the account being used to query "pays" for that query and every account gets 1TB of querying for free (and additional TBs are $5 per TB). Currently it takes ~215GB of data to do a "full" query for twisted (the exact query I listed above) and I haven't fully backfilled all of the data yet (I'm working on it). You can kind of extrapolate that out to what it would "cost" to do that same query for all 100k projects even before I do the backfill (which would drastically raise the "cost" of PyPI here). The smart thing to do with BigQuery is to do date limited querying so that your query doesn't have to load as much data. For instance, adapting the above query so that it only queries the last 30 days (still using standard SQL) you would do: SELECT file.filename, COUNT(*) AS downloads FROM `the-psf.pypi.downloads*` WHERE file.project = "twisted" AND _TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_ADD(CURRENT_DATE(), INTERVAL -31 day)) AND FORMAT_DATE("%Y%m%d", DATE_ADD(CURRENT_DATE(), INTERVAL -1 day)) GROUP BY file.filename This touches a much more reasonable 27GB of data. For reference, we currently "spend" about $50/month on BigQuery so doing like, daily updates of this data for everyone would be a drastic increase in the amount of BigQuery spending we do. So the tl;dr is I think it's a better solution for vanity to talk to the BigQuery API itself, ideally limiting itself to a recent timeframe by default, and possibly adding a flag to get at the all time data for people who are OK with either using vanity less often or are willing to spend a couple bucks if they're querying the full amount of data every day. Where Warehouse is starting to query BigQuery, I'm purposely limiting it to only the last N days (typically 30) so as not to regularly query the entire data set. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Mon Mar 6 06:36:23 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 6 Mar 2017 06:36:23 -0500 Subject: [Distutils] Deprecating download counts in API? In-Reply-To: References: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> Message-ID: > On Mar 6, 2017, at 6:34 AM, Donald Stufft wrote: > > On the flip side, the BigQuery data is publicly queryable and the account being used to query "pays" for that query and every account gets 1TB of querying for free (and additional TBs are $5 per TB). To be clear, each account gets 1TB of querying for free per month, not 1TB for the life of the account. -- Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimfulton.info Mon Mar 6 13:09:56 2017 From: jim at jimfulton.info (Jim Fulton) Date: Mon, 6 Mar 2017 13:09:56 -0500 Subject: [Distutils] Announcing new Buildout documentation Message-ID: The old horrible doctest-based buildout documentation has finally been replaced: http://docs.buildout.org Jim -- Jim Fulton http://jimfulton.info -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Mon Mar 6 16:43:05 2017 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 6 Mar 2017 15:43:05 -0600 Subject: [Distutils] Announcing new Buildout documentation In-Reply-To: References: Message-ID: Thanks! 
https://github.com/buildout/buildout/commits/master/doc On Monday, March 6, 2017, Jim Fulton wrote: > The old horrible doctest-based buildout documentation has finally been > replaced: > > http://docs.buildout.org > > Jim > > -- > Jim Fulton > http://jimfulton.info > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Mon Mar 6 22:24:19 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Mon, 6 Mar 2017 19:24:19 -0800 Subject: [Distutils] Deprecating download counts in API? In-Reply-To: References: <37810ED1-9984-4CDC-9E16-F6B46ADB624A@twistedmatrix.com> Message-ID: <6424B8AA-2F62-41E6-ACA7-047CCF90B3F6@twistedmatrix.com> > On Mar 6, 2017, at 3:34 AM, Donald Stufft wrote: > > >> On Mar 6, 2017, at 1:41 AM, Glyph Lefkowitz > wrote: >> >> >>> On Mar 5, 2017, at 8:51 AM, Donald Stufft > wrote: >>> >>> Unless there is some sort of massive outcry, I plan to deprecate and ultimately remove the download counts available in the PyPI API, [...] >> >> [...] But, as someone who's been vaguely annoyed that `vanity` doesn't work for a while, I wonder: shouldn't it be easy for someone familiar with both systems to simply implement the existing "download count" API as a legacy / compatibility wrapper around BigQuery? [...] > > It?s not hard at all, it?d just be [...] Thanks for that super detailed and exhaustive explanation, I have a much better handle on the issues involved now. Sorry if you'd written it before and I'd missed it - I can now very clearly see why you want to get rid of it! -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at timgolden.me.uk Tue Mar 7 09:24:14 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Tue, 7 Mar 2017 14:24:14 +0000 Subject: [Distutils] install_requires setup.py install vs pip install Message-ID: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> I have a setup.py which looks like this: from setuptools import setup setup( name='install_requires', py_modules = ["install_requires"], install_requires=['PyQt5'], ) For the purposes of the discussion, there is an install_requires.py in the same directory. I have created and activated a standard Python 3.5 venv on Windows: py -3.5 -mvenv .venv .venv\scripts\activate.bat python -mpip install --upgrade pip (I don't believe the Python version or the venv matter here, but including them for reproducibility). If I pip install the module, the PyQt5 install dependency is found and installed: (.venv) C:\work-in-progress\install_requires>pip install . Processing c:\work-in-progress\install_requires Collecting PyQt5 (from install-requires==0.0.0) Using cached PyQt5-5.8.1-5.8.0-cp35.cp36.cp37-none-win_amd64.whl Collecting sip==4.19 (from PyQt5->install-requires==0.0.0) Using cached sip-4.19-cp35-none-win_amd64.whl Installing collected packages: sip, PyQt5, install-requires Running setup.py install for install-requires ... done Successfully installed PyQt5-5.8.1 install-requires-0.0.0 sip-4.19 If, instead, I setup.py install the module, I get the following messages: Processing dependencies for install-requires==0.0.0 Searching for PyQt5 Reading https://pypi.python.org/simple/PyQt5/ No local packages or download links found for PyQt5 error: Could not find suitable distribution for Requirement.parse('PyQt5') However, if I substitute instead "requests" or "simplejson" (both well-known packages) then setup.py install succeeds. 
My cursory inspection of https://pypi.python.org/simple/pyqt5/ doesn't reveal anything obviously different except for the complexity of the filenames. I've searched around, including in the archives of this group, but can't find that this is a known issue. If I had to guess from the evidence, it would be that pip ships a more sophisticated parser of complex wheel filenames than setuptools. Can anyone advise, please? TJG From leorochael at gmail.com Tue Mar 7 09:38:34 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Tue, 7 Mar 2017 11:38:34 -0300 Subject: [Distutils] install_requires setup.py install vs pip install In-Reply-To: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> References: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> Message-ID: Hi Tim, The reason setuptools can't process your package is because setuptools itself doesn't yet know how to install wheels[1] which pip knows how to install, and PyQT5 is only available as wheels on PyPI (the files with `.whl` extension in the `simple` URL you linked). [1] https://github.com/pypa/setuptools/issues/78 The reason why setuptools can install "requests" or "simplejson" is that their pages contain `.tar.gz` files with the source distributions beside the `.whl` files. Incidentally, there are PyQT5 source distributions, and they're available in their own website[2]. IMO they should be present in PyPI as well. (Though those archive names with `_gpl` in the middle might confuse setuptools, and they might prefer to deal with "Could not find suitable distribution" error message than some obscure compilation error arising from missing system packages). [2] https://www.riverbankcomputing.com/software/pyqt/download5/ Cheers, Leo On 7 March 2017 at 11:24, Tim Golden wrote: > I have a setup.py which looks like this: > > from setuptools import setup > setup( > name='install_requires', > py_modules = ["install_requires"], > install_requires=['PyQt5'], > ) > > For the purposes of the discussion, there is an install_requires.py in the > same directory. > > I have created and activated a standard Python 3.5 venv on Windows: > > py -3.5 -mvenv .venv > .venv\scripts\activate.bat > python -mpip install --upgrade pip > > (I don't believe the Python version or the venv matter here, but including > them for reproducibility). > > If I pip install the module, the PyQt5 install dependency is found and > installed: > > (.venv) C:\work-in-progress\install_requires>pip install . > Processing c:\work-in-progress\install_requires > Collecting PyQt5 (from install-requires==0.0.0) > Using cached PyQt5-5.8.1-5.8.0-cp35.cp36.cp37-none-win_amd64.whl > Collecting sip==4.19 (from PyQt5->install-requires==0.0.0) > Using cached sip-4.19-cp35-none-win_amd64.whl > Installing collected packages: sip, PyQt5, install-requires > Running setup.py install for install-requires ... done > Successfully installed PyQt5-5.8.1 install-requires-0.0.0 sip-4.19 > > If, instead, I setup.py install the module, I get the following messages: > > Processing dependencies for install-requires==0.0.0 > Searching for PyQt5 > Reading https://pypi.python.org/simple/PyQt5/ > No local packages or download links found for PyQt5 > error: Could not find suitable distribution for Requirement.parse('PyQt5') > > However, if I substitute instead "requests" or "simplejson" (both > well-known packages) then setup.py install succeeds. 
My cursory inspection > of https://pypi.python.org/simple/pyqt5/ doesn't reveal anything > obviously different except for the complexity of the filenames. > > I've searched around, including in the archives of this group, but can't > find that this is a known issue. If I had to guess from the evidence, it > would be that pip ships a more sophisticated parser of complex wheel > filenames than setuptools. > > Can anyone advise, please? > > TJG > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at timgolden.me.uk Tue Mar 7 09:53:24 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Tue, 7 Mar 2017 14:53:24 +0000 Subject: [Distutils] install_requires setup.py install vs pip install In-Reply-To: References: <86780446-e173-f102-c407-e6e026b31c94@timgolden.me.uk> Message-ID: <7e4a8faa-a832-38f1-5717-31191214b4e8@timgolden.me.uk> On 07/03/2017 14:38, Leonardo Rochael Almeida wrote: > Hi Tim, > > The reason setuptools can't process your package is because setuptools > itself doesn't yet know how to install wheels[1] which pip knows how to > install, and PyQT5 is only available as wheels on PyPI (the files with > `.whl` extension in the `simple` URL you linked). > > [1] https://github.com/pypa/setuptools/issues/78 > > The reason why setuptools can install "requests" or "simplejson" is that > their pages contain `.tar.gz` files with the source distributions beside > the `.whl` files. > > Incidentally, there are PyQT5 source distributions, and they're > available in their own website[2]. > > IMO they should be present in PyPI as well. > > (Though those archive names with `_gpl` in the middle might confuse > setuptools, and they might prefer to deal with "Could not find suitable > distribution" error message than some obscure compilation error arising > from missing system packages). > > [2] https://www.riverbankcomputing.com/software/pyqt/download5/ Thanks, Leo. That was a much simple explanation than I'd been considering! I didn't think to look at the output for requests etc. Now that I do, it's clearly building eggs from sdists. Knowing this, I have ways forward. (This is actually about the mu editor which is aimed at teachers and other less techie people: https://github.com/mu-editor/mu) Thanks again for your help TJG From Gabriel.Ganne at enea.com Wed Mar 8 03:43:36 2017 From: Gabriel.Ganne at enea.com (Gabriel Ganne) Date: Wed, 8 Mar 2017 08:43:36 +0000 Subject: [Distutils] custom setup.py link arguments order Message-ID: Hi, I'm currently writing a python C module which has a chained dependency: - mymodule requires libb - libb requires liba To that effect, within setup.py, I link against both liba and libb libraries=['a', 'b'], Also, as I'm working on Ubuntu, I want to add -Wl,--no-as-needed to make sure that the symbols not immediately needed will still be stripped. extra_link_args=['-Wl,--no-as-needed'], However, it seems that the extra_link_args are systematically appended at the end of the link line, but for this to work, the '-Wl,--no-as-needed' argument need to be *before* the link against my two libraries. How can I choose the order of my link arguments that I pass to gcc using setup.py ? Best regards, -- Gabriel Ganne -- Gabriel Ganne -------------- next part -------------- An HTML attachment was scrubbed... 
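One possible direction, sketched here only as an illustration and not tested against the flexlink/OCaml setup described above: distutils appends extra_link_args at the very end of the link line, but flags added to the compiler's linker_so command end up before the object files and the -l options, so a custom build_ext subclass can inject the flag there. The class name and the simplified Extension below are made up for the example:

    from setuptools import setup, Extension
    from setuptools.command.build_ext import build_ext

    class NoAsNeededBuildExt(build_ext):
        """Illustrative subclass that injects -Wl,--no-as-needed early."""

        def build_extensions(self):
            # On Unix-like compilers, linker_so is the command used to link
            # shared objects; anything appended here appears before the
            # objects and the -la/-lb options, unlike extra_link_args,
            # which distutils tacks on at the very end.
            if hasattr(self.compiler, "linker_so"):
                self.compiler.linker_so += ["-Wl,--no-as-needed"]
            build_ext.build_extensions(self)

    setup(
        name="foo",
        ext_modules=[
            Extension("foo", ["foo.c", "interface_c.c"], libraries=["a", "b"]),
        ],
        cmdclass={"build_ext": NoAsNeededBuildExt},
    )

Another route sometimes used on Linux is exporting LDFLAGS before running setup.py, since distutils appends that variable to the link driver command rather than to the end of the link line.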
URL: From ja.geb at me.com Tue Mar 7 06:06:35 2017 From: ja.geb at me.com (Jannis Gebauer) Date: Tue, 07 Mar 2017 12:06:35 +0100 Subject: [Distutils] Data on requirement files on GitHub Message-ID: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Hi, I ran a couple of queries against GitHub's public BigQuery dataset [0] last week. I'm interested in requirement files in particular, so I ran a query extracting all available requirement files. Since queries against this dataset are rather expensive ($7 on all repos), I thought I'd share the raw data here [1]. The data contains the repo name, the requirements file path and the contents of the file. Every line represents a JSON blob, read it with: with open('data.json') as f: for line in f.readlines(): data = json.loads(line) Maybe that's of interest to some of you. If you have any ideas on what to do with the data, please let me know. -- Jannis Gebauer [0]: https://cloud.google.com/bigquery/public-data/github [1]: https://github.com/jayfk/requirements-dataset -------------- next part -------------- An HTML attachment was scrubbed... URL: From prometheus235 at gmail.com Wed Mar 8 11:36:16 2017 From: prometheus235 at gmail.com (Nick Timkovich) Date: Wed, 8 Mar 2017 10:36:16 -0600 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: Looks like a fun chunk of data, what's the query you used? Can you add a README to the repo with some description if others want to iterate on it (maybe look into setup.py's?) Nick On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: > Hi, > > I ran a couple of queries against GitHub's public BigQuery dataset [0] > last week. I'm interested in requirement files in particular, so I ran a > query extracting all available requirement files. > > Since queries against this dataset are rather expensive ($7 on all repos), > I thought I'd share the raw data here [1]. The data contains the repo name, > the requirements file path and the contents of the file. Every line > represents a JSON blob, read it with: > > with open('data.json') as f: > for line in f.readlines(): > data = json.loads(line) > > Maybe that's of interest to some of you. > > If you have any ideas on what to do with the data, please let me know. > > -- > > Jannis Gebauer > > > > [0]: https://cloud.google.com/bigquery/public-data/github > [1]: https://github.com/jayfk/requirements-dataset > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lkraider at gmail.com Thu Mar 9 17:39:26 2017 From: lkraider at gmail.com (Paul Eipper) Date: Thu, 9 Mar 2017 19:39:26 -0300 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: I had some fun parsing and plotting the data (very simple, just the top packages for now). See here: https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb Let me know if you would accept a pull request so others can use that as a starting point. att, -- Paul Eipper On Wed, Mar 8, 2017 at 1:36 PM, Nick Timkovich wrote: > Looks like a fun chunk of data, what's the query you used? Can you add a > README to the repo with some description if others want to iterate on it > (maybe look into setup.py's?) 
> > Nick > > On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: > >> Hi, >> >> I ran a couple of queries against GitHubs public big query dataset [0] >> last week. I?m interested in requirement files in particular, so I ran a >> query extracting all available requirement files. >> >> Since queries against this dataset are rather expensive ($7 on all >> repos), I thought I?d share the raw data here [1]. The data contains the >> repo name, the requirements file path and the contents of the file. Every >> line represents a JSON blob, read it with: >> >> with open('data.json') as f: >> for line in f.readlines(): >> data = json.loads(line) >> >> Maybe that?s of interest to some of you. >> >> If you have any ideas on what to do with the data, please let me know. >> >> ? >> >> Jannis Gebauer >> >> >> >> [0]: https://cloud.google.com/bigquery/public-data/github >> [1]: https://github.com/jayfk/requirements-dataset >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lkraider at gmail.com Thu Mar 9 17:41:11 2017 From: lkraider at gmail.com (Paul Eipper) Date: Thu, 9 Mar 2017 19:41:11 -0300 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: PS: took 2 hours to parse the dataset into the linearized version (stored as "parsed.json") on my notebook. -- Paul Eipper On Thu, Mar 9, 2017 at 7:39 PM, Paul Eipper wrote: > I had some fun parsing and plotting the data (very simple, just the top > packages for now). See here: > https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb > > Let me know if you would accept a pull request so others can use that as a > starting point. > > att, > > > -- > Paul Eipper > > On Wed, Mar 8, 2017 at 1:36 PM, Nick Timkovich > wrote: > >> Looks like a fun chunk of data, what's the query you used? Can you add a >> README to the repo with some description if others want to iterate on it >> (maybe look into setup.py's?) >> >> Nick >> >> On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: >> >>> Hi, >>> >>> I ran a couple of queries against GitHubs public big query dataset [0] >>> last week. I?m interested in requirement files in particular, so I ran a >>> query extracting all available requirement files. >>> >>> Since queries against this dataset are rather expensive ($7 on all >>> repos), I thought I?d share the raw data here [1]. The data contains the >>> repo name, the requirements file path and the contents of the file. Every >>> line represents a JSON blob, read it with: >>> >>> with open('data.json') as f: >>> for line in f.readlines(): >>> data = json.loads(line) >>> >>> Maybe that?s of interest to some of you. >>> >>> If you have any ideas on what to do with the data, please let me know. >>> >>> ? 
>>> >>> Jannis Gebauer >>> >>> >>> >>> [0]: https://cloud.google.com/bigquery/public-data/github >>> [1]: https://github.com/jayfk/requirements-dataset >>> >>> _______________________________________________ >>> Distutils-SIG maillist - Distutils-SIG at python.org >>> https://mail.python.org/mailman/listinfo/distutils-sig >>> >>> >> >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Mar 9 22:57:14 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 9 Mar 2017 21:57:14 -0600 Subject: [Distutils] Data on requirement files on GitHub In-Reply-To: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> References: <8D76AAE7-A57C-4CB4-97A1-96979CCA12B1@me.com> Message-ID: https://en.wikipedia.org/wiki/BigQuery BigQuery Dashboards - http://bigqueri.es/c/github-archive - https://redash.io/data-sources/google-bigquery - https://github.com/getredash/redash - https://github.com/getredash/redash/blob/master/requirements.txt - https://github.com/getredash/redash/blob/master/Dockerfile - https://github.com/docker/docker/blob/master/builder/dockerfile/parser/parser.go - https://github.com/DBuildService/dockerfile-parse/issues - https://github.com/getredash/redash/blob/master/docker-compose.yml Software Configuration Management / Dependency Management applications for BigQuery: - https://opensource.googleblog.com/2017/03/operation-rosehub.html - "Googlers used BigQuery and GitHub to patch thousands of vulnerable projects" https://www.reddit.com/r/bigquery/comments/5x0x5z/googlers_used_bigquery_and_github_to_patch/ BigQuery Python Libraries google-cloud-bigquery - | Src: https://github.com/GoogleCloudPlatform/google-cloud-python - | Pypi: https://pypi.python.org/pypi/google-cloud-bigquery - | Docs: https://cloud.google.com/bigquery/docs/reference/libraries#client-libraries-resources-python google-api-python-client - | Src: https://github.com/google/google-api-python-client - | Pypi: https://pypi.python.org/pypi/google-api-python-client - pandas.io.gbq uses google-api-python-client: - Docs: http://pandas.pydata.org/pandas-docs/stable/io.html#google-bigquery-experimental - read_gbq() http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.read_gbq.html#pandas.io.gbq.read_gbq - to_gbq() http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.to_gbq.html#pandas-io-gbq-to-gbq Open Source Big Data Components for things like BigQuery: Apache Drill - | Wikipedia: https://en.wikipedia.org/wiki/Apache_Drill - Apache Drill is similar to Google Dremel (which powers Google BigQuery) - https://pypi.python.org/pypi/drillpy Apache Beam - | Wikipedia: https://en.wikipedia.org/wiki/Apache_Beam - | Src: https://github.com/apache/beam - | Docs: https://beam.apache.org/documentation/sdks/python/ - | Docs: https://beam.apache.org/get-started/quickstart-py/ - | Docs: https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples - Google Cloud Dataflow is now of Apache Beam - https://cloud.google.com/dataflow/model/bigquery-io Parsing (and MAINTAINING) Pip Requirements.txt Files: - | Src: https://github.com/pypa/pip/tree/master/pip/req - https://github.com/pypa/pip/issues/3884#issuecomment-236454008 - https://github.com/pypa/pip/issues/1479 - -> Pipfile, Pipfile.lock (``pipenv install pkgname --dev``) - https://github.com/pyupio/safety-db#tools - https://pyup.io/ - 
https://libraries.io/github/librariesio/pydeps - https://github.com/librariesio/pydeps - https://libraries.io/ - Pipfile, Pipfile.lock - | PyPI: https://pypi.python.org/pypi/pipenv - | PyPI: https://pypi.python.org/pypi/requirements-parser - | PyPI: https://pypi.python.org/pypi/pipfile - | Src: https://github.com/kennethreitz/pipenv - These save to the Pipfile: - ``pipenv install pkgname`` - ``pipenv install pkgname --dev`` - https://github.com/kennethreitz/pipenv/blob/master/pipenv/utils.py - pip reqs.txt <--> Pipfile ... Thought I'd get these together; hopefully they're useful. Cool Jupyter notebook! ( https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb ) On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer wrote: > Hi, > > I ran a couple of queries against GitHubs public big query dataset [0] > last week. I?m interested in requirement files in particular, so I ran a > query extracting all available requirement files. > > Since queries against this dataset are rather expensive ($7 on all repos), > I thought I?d share the raw data here [1]. The data contains the repo name, > the requirements file path and the contents of the file. Every line > represents a JSON blob, read it with: > > with open('data.json') as f: > for line in f.readlines(): > data = json.loads(line) > > Maybe that?s of interest to some of you. > > If you have any ideas on what to do with the data, please let me know. > > ? > > Jannis Gebauer > > > > [0]: https://cloud.google.com/bigquery/public-data/github > [1]: https://github.com/jayfk/requirements-dataset > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 10 04:26:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Mar 2017 19:26:41 +1000 Subject: [Distutils] PEP 426 moved back to Draft status Message-ID: Hi folks, After a few years of dormancy, I've finally moved the metadata 2.0 specification back to Draft status: https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 Based on our last round of discussion, I've culled a lot of the complexity around dependency declarations, cutting it back to just 4 pre-declared extras (dev, doc, build, test), and some reserved extras that can be used to say "don't install this, even though you normally would" (self, runtime). I've also deleted a lot of the text related to thing that we now don't need to worry about until the first few standard metadata extensions are being defined. I think the biggest thing it needs right now is a major editing pass from someone that isn't me to help figure out which explanatory sections can be culled completely, while still having the specification itself make sense. >From a technical point of view, the main "different from today" piece that we have left is the Provide & Obsoleted-By fields, and I'm seriously wondering if it might make sense to just delete those entirely for now, and reconsider them later as a potential metadata extension. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Mar 10 09:52:34 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 10 Mar 2017 06:52:34 -0800 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > Hi folks, > > After a few years of dormancy, I've finally moved the metadata 2.0 > specification back to Draft status: > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 We have lots of metadata files in the wild that already claim to be version 2.0. If you're reviving this I think you might need to change the version number? > Based on our last round of discussion, I've culled a lot of the complexity > around dependency declarations, cutting it back to just 4 pre-declared > extras (dev, doc, build, test), I think we can drop 'build' in favor of pyproject.toml? Actually all of the pre-declared extras are really relevant for sdists rather than wheels. Maybe they should all move into pyproject.toml? > and some reserved extras that can be used to > say "don't install this, even though you normally would" (self, runtime). Hmm. While it's not the most urgent problem we face, I really think in the long run we need to move the extras system to something like: https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html The current extras system is inherently broken with respect to upgrades, and reified extras would solve this, along with several other intractable problems (e.g. numpy ABI tracking). So from that perspective, I'm wary of adding new special case "magic" to the extras system. Adding conventional names for things like test-dependencies is fine, that doesn't pose any new obstacles to a future migration. But adding complexity to the "extras language" like "*", "self", "runtime", etc. does make it harder to change how extras work in the future. I feel like most of the value we get out of these could be had by just standardizing the existing convention that packages should have an explicit "all" extra that includes all the feature-based extras, but not the special development extras? This also provides flexibility for cases like, a package where there are two extras that conflict with each other -- the package authors can pick which one they recommend to put into "all". > I've also deleted a lot of the text related to thing that we now don't need > to worry about until the first few standard metadata extensions are being > defined. > > I think the biggest thing it needs right now is a major editing pass from > someone that isn't me to help figure out which explanatory sections can be > culled completely, while still having the specification itself make sense. > > From a technical point of view, the main "different from today" piece that > we have left is the Provide & Obsoleted-By fields, and I'm seriously > wondering if it might make sense to just delete those entirely for now, and > reconsider them later as a potential metadata extension. Overall the vibe I get from the Provides and Obsoleted-By sections is that these are surprisingly complicated and could really do with their own PEP, yeah, where the spec will have room to breathe and properly cover all the details. In particular, the language in the "provides" spec about how the interpretation of the metadata depends on whether you get it from a public index server versus somewhere else makes me really nervous. Experience suggests that splitting up packaging PEPs is basically never a bad idea, right? 
:-) As a general note I guess I should say that I'm still not convinced that migrating to json is worth the effort, but you've heard those arguments before and I don't have anything new to add now, so :-). -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Fri Mar 10 10:55:49 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Mar 2017 01:55:49 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 11 March 2017 at 00:52, Nathaniel Smith wrote: > On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > > Hi folks, > > > > After a few years of dormancy, I've finally moved the metadata 2.0 > > specification back to Draft status: > > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8a > ec48bbb7a3 > > We have lots of metadata files in the wild that already claim to be > version 2.0. If you're reviving this I think you might need to change > the version number? > They're mostly in metadata.json files, though. That said, version numbers are cheap, so I'm happy to skip straight to 3.0 if folks think it makes more sense. > > Based on our last round of discussion, I've culled a lot of the > complexity > > around dependency declarations, cutting it back to just 4 pre-declared > > extras (dev, doc, build, test), > > I think we can drop 'build' in favor of pyproject.toml? > No, as that's a human edited input file, not an output file from the sdist generation process. > Actually all of the pre-declared extras are really relevant for sdists > rather than wheels. Maybe they should all move into pyproject.toml? > Think "static release metadata in an API response from PyPI" for this particular specification, rather than something you'd necessarily check into source control. That's actually one of the big benefits of doing this post pyproject.toml - with that taking care of the build system bootstrapping problem, it frees up pydist.json to be entirely an artifact of the sdist generation process (and then copying it along to the wheel archives and the installed package as well). That said, that's actually an important open question: is pydist.json always preserved unmodified through the sdist->wheel->install and sdist->install process? There's a lot to be said for treating the file as immutable, and instead adding *other* metadata files as a component moves through the distribution process. If so, then it may actually be more appropriate to call the rendered file "pysdist.json", since it contains the sdist metadata specifically, rather than arbitrary distribution metadata. > > > and some reserved extras that can be used to > > say "don't install this, even though you normally would" (self, runtime). > > Hmm. While it's not the most urgent problem we face, I really think in > the long run we need to move the extras system to something like: > > https://mail.python.org/pipermail/distutils-sig/2015- > October/027364.html > > The current extras system is inherently broken with respect to > upgrades, and reified extras would solve this, along with several > other intractable problems (e.g. numpy ABI tracking). > > So from that perspective, I'm wary of adding new special case "magic" > to the extras system. Adding conventional names for things like > test-dependencies is fine, that doesn't pose any new obstacles to a > future migration. But adding complexity to the "extras language" like > "*", "self", "runtime", etc. does make it harder to change how extras > work in the future. 
> Technically the only part of that which the PEP really locks in is barring the use of "self" and "runtime" as extras names (which needs to be validated by a check against currently published metadata to see if anyone is already using them). '*' is already illegal due to the naming rules, and the '-extra' syntax is also an illegal name, so neither of those actually impacts the metadata format, only what installation tools allow. The main purpose of having them in the PEP is to disallow using those spellings for anything else and instead reserve them for the purposes described in the PEP. I'd also be fairly strongly opposed to converting extras from an optional dependency management system to a "let multiple PyPI packages target the same site-packages subdirectory" because we already know that's a nightmare from the Linux distro experience (having a clear "main" package that owns the parent directory with optional subpackages solves *some* of the problems, but my main reaction is still "Run awaaay"). It especially isn't needed just to solve the "pip forgets what extras it installed" problem - that technically doesn't even need a PEP to resolve, it just needs pip to drop a pip specific file into the PEP 376 dist-info directory that says what extras to request when doing future upgrades. Similarly, the import system offers so much flexibility in checking for optional packages at startup and lying about where imports are coming from that it would be hard to convince me that installation customisation to use particular optional dependencies *had* to be done at install time. > I feel like most of the value we get out of these could be had by just > standardizing the existing convention that packages should have an > explicit "all" extra that includes all the feature-based extras, That's the first I've heard of that convention, so it may not be as widespread as you thought it was :) > but > not the special development extras? This also provides flexibility for > cases like, a package where there are two extras that conflict with > each other -- the package authors can pick which one they recommend to > put into "all". > That's actually the main problem I had with '*' - it didn't work anywhere near as nicely once the semantic dependencies were migrated over to being part of the extras system. Repeating the same dependencies under multiple extra names in order to model pseudo-sets seems error prone and messy to me, though. So perhaps we should add the notion of "extra_sets" as a first class entity, where they're named sets of declared extras? And if you don't declare an "all" set explicitly, you get an implied one that consists of all your declared extras. For migration of existing metadata that uses "all" as a normal extra, the translation would be: - declared extras are added to "all" in order until all of the dependencies in all are covered or all declared extras are included - any dependency in "all" that isn't in another extra gets added to a new "_all" extra - "extras" and "extra_sets" are populated accordingly Tools consuming the metadata would then just need to read "extra_sets" and expand any named sets before passing the list of extras over to their existing dependency processing machinery. > I've also deleted a lot of the text related to thing that we now don't > need > > to worry about until the first few standard metadata extensions are being > > defined. 
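Coming back to the "extra_sets" idea above, a minimal sketch of how the rendered metadata could look (the "extra_sets" field and the concrete extra names here are hypothetical illustrations, not part of the current draft):

{
    "extras": ["doc", "test", "ssl", "_all"],
    "extra_sets": {
        "all": ["doc", "test", "ssl", "_all"]
    },
    "run_requires": [
        {"requires": ["sphinx"], "extra": "doc"},
        {"requires": ["pytest"], "extra": "test"}
    ]
}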
> > > > I think the biggest thing it needs right now is a major editing pass from > > someone that isn't me to help figure out which explanatory sections can > be > > culled completely, while still having the specification itself make > sense. > > > > From a technical point of view, the main "different from today" piece > that > > we have left is the Provide & Obsoleted-By fields, and I'm seriously > > wondering if it might make sense to just delete those entirely for now, > and > > reconsider them later as a potential metadata extension. > > Overall the vibe I get from the Provides and Obsoleted-By sections is > that these are surprisingly complicated and could really do with their > own PEP, yeah, where the spec will have room to breathe and properly > cover all the details. > > In particular, the language in the "provides" spec about how the > interpretation of the metadata depends on whether you get it from a > public index server versus somewhere else makes me really nervous. > Yeah, virtual provides are a security nightmare on a public index server - distros are only able to get away with it because they maintain relatively strict control over the package review process. > Experience suggests that splitting up packaging PEPs is basically > never a bad idea, right? :-) > Indeed :) OK, I'll put them on the chopping block too, under the assumption they may come back as an extension some day if it ever makes it to the top of someone's list of "thing that bothers them enough about Python packaging to do something about it". > As a general note I guess I should say that I'm still not convinced > that migrating to json is worth the effort, but you've heard those > arguments before and I don't have anything new to add now, so :-). > The main benefit I see will be to empower utility APIs like distlib (and potentially Warehouse itself) to better hide both the historical and migratory cruft by translating everything to the PEP 426 format, even if the source artifact only includes the legacy metadata. Unless the plumbing actually breaks, nobody other than the plumber cares when it's a mess, as long as the porcelain is shiny and clean :) Cheers, Nick. P.S. Something I'm getting out of this experience: if you can afford to sit on your hands for 3-4 years, that's a *really good way* to avoid falling prey to "second system syndrome" [1] :) P.P.S Having no budget to pay anyone else and only limited time and attention of your own also turns out to make it easier to avoid ;) [1] http://coliveira.net/software/what-is-second-system-syndrome/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Mar 10 13:14:00 2017 From: brett at python.org (Brett Cannon) Date: Fri, 10 Mar 2017 18:14:00 +0000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Fri, 10 Mar 2017 at 07:56 Nick Coghlan wrote: > On 11 March 2017 at 00:52, Nathaniel Smith wrote: > > On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > > Hi folks, > > > > After a few years of dormancy, I've finally moved the metadata 2.0 > > specification back to Draft status: > > > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 > > We have lots of metadata files in the wild that already claim to be > version 2.0. If you're reviving this I think you might need to change > the version number? > > > They're mostly in metadata.json files, though. 
That said, version numbers > are cheap, so I'm happy to skip straight to 3.0 if folks think it makes > more sense. > +1 on jumping. > > > > Based on our last round of discussion, I've culled a lot of the > complexity > > around dependency declarations, cutting it back to just 4 pre-declared > > extras (dev, doc, build, test), > > I think we can drop 'build' in favor of pyproject.toml? > > > No, as that's a human edited input file, not an output file from the sdist > generation process. > > > Actually all of the pre-declared extras are really relevant for sdists > rather than wheels. Maybe they should all move into pyproject.toml? > > > Think "static release metadata in an API response from PyPI" for this > particular specification, rather than something you'd necessarily check > into source control. > Or "stuff PyPI has to parse, not you". ;) > That's actually one of the big benefits of doing this post pyproject.toml > - with that taking care of the build system bootstrapping problem, it > frees up pydist.json to be entirely an artifact of the sdist generation > process (and then copying it along to the wheel archives and the installed > package as well). > > That said, that's actually an important open question: is pydist.json > always preserved unmodified through the sdist->wheel->install and > sdist->install process? > Is there a reason not to? > > There's a lot to be said for treating the file as immutable, and instead > adding *other* metadata files as a component moves through the distribution > process. If so, then it may actually be more appropriate to call the > rendered file "pysdist.json", since it contains the sdist metadata > specifically, rather than arbitrary distribution metadata. > Since this is meant for tool consumption and not human consumption, breaking the steps into individual files so that they are considered immutable by tools farther down the toolchain makes sense to me. > > > > and some reserved extras that can be used to > > say "don't install this, even though you normally would" (self, runtime). > > Hmm. While it's not the most urgent problem we face, I really think in > the long run we need to move the extras system to something like: > > > https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html > > The current extras system is inherently broken with respect to > upgrades, and reified extras would solve this, along with several > other intractable problems (e.g. numpy ABI tracking). > > So from that perspective, I'm wary of adding new special case "magic" > to the extras system. Adding conventional names for things like > test-dependencies is fine, that doesn't pose any new obstacles to a > future migration. But adding complexity to the "extras language" like > "*", "self", "runtime", etc. does make it harder to change how extras > work in the future. > > > Technically the only part of that which the PEP really locks in is barring > the use of "self" and "runtime" as extras names (which needs to be > validated by a check against currently published metadata to see if anyone > is already using them). > Do you have something planned for these names? > > '*' is already illegal due to the naming rules, and the '-extra' syntax is > also an illegal name, so neither of those actually impacts the metadata > format, only what installation tools allow. The main purpose of having them > in the PEP is to disallow using those spellings for anything else and > instead reserve them for the purposes described in the PEP. 
> > I'd also be fairly strongly opposed to converting extras from an optional > dependency management system to a "let multiple PyPI packages target the > same site-packages subdirectory" because we already know that's a nightmare > from the Linux distro experience (having a clear "main" package that owns > the parent directory with optional subpackages solves *some* of the > problems, but my main reaction is still "Run awaaay"). > > It especially isn't needed just to solve the "pip forgets what extras it > installed" problem - that technically doesn't even need a PEP to resolve, > it just needs pip to drop a pip specific file into the PEP 376 dist-info > directory that says what extras to request when doing future upgrades. > Similarly, the import system offers so much flexibility in checking for > optional packages at startup and lying about where imports are coming from > that it would be hard to convince me that installation customisation to use > particular optional dependencies *had* to be done at install time. > > > I feel like most of the value we get out of these could be had by just > standardizing the existing convention that packages should have an > explicit "all" extra that includes all the feature-based extras, > > > That's the first I've heard of that convention, so it may not be as > widespread as you thought it was :) > > > but > not the special development extras? This also provides flexibility for > cases like, a package where there are two extras that conflict with > each other -- the package authors can pick which one they recommend to > put into "all". > > > That's actually the main problem I had with '*' - it didn't work anywhere > near as nicely once the semantic dependencies were migrated over to being > part of the extras system. > > Repeating the same dependencies under multiple extra names in order to > model pseudo-sets seems error prone and messy to me, though. > > So perhaps we should add the notion of "extra_sets" as a first class > entity, where they're named sets of declared extras? And if you don't > declare an "all" set explicitly, you get an implied one that consists of > all your declared extras. > I think that's a tool decision that doesn't tie into the PEP (unless you're going to ban the use of the name "all"). > > For migration of existing metadata that uses "all" as a normal extra, the > translation would be: > > - declared extras are added to "all" in order until all of the > dependencies in all are covered or all declared extras are included > - any dependency in "all" that isn't in another extra gets added to a new > "_all" extra > - "extras" and "extra_sets" are populated accordingly > > Tools consuming the metadata would then just need to read "extra_sets" and > expand any named sets before passing the list of extras over to their > existing dependency processing machinery. > If this is meant to be generated by pyproject.toml consumers then I think it should be up to the build tools to support that concept. Then the build tools can statically declare the union of some extras to get extra sets since the information isn't changing once the pydist.json file is generated (dynamic calculation is only necessary if the value could change between data generation and consumption). > > > I've also deleted a lot of the text related to thing that we now don't > need > > to worry about until the first few standard metadata extensions are being > > defined. 
> > > > I think the biggest thing it needs right now is a major editing pass from > > someone that isn't me to help figure out which explanatory sections can > be > > culled completely, while still having the specification itself make > sense. > > > > From a technical point of view, the main "different from today" piece > that > > we have left is the Provide & Obsoleted-By fields, and I'm seriously > > wondering if it might make sense to just delete those entirely for now, > and > > reconsider them later as a potential metadata extension. > > Overall the vibe I get from the Provides and Obsoleted-By sections is > that these are surprisingly complicated and could really do with their > own PEP, yeah, where the spec will have room to breathe and properly > cover all the details. > > In particular, the language in the "provides" spec about how the > interpretation of the metadata depends on whether you get it from a > public index server versus somewhere else makes me really nervous. > > > Yeah, virtual provides are a security nightmare on a public index server - > distros are only able to get away with it because they maintain relatively > strict control over the package review process. > > > Experience suggests that splitting up packaging PEPs is basically > never a bad idea, right? :-) > > > Indeed :) > > OK, I'll put them on the chopping block too, under the assumption they may > come back as an extension some day if it ever makes it to the top of > someone's list of "thing that bothers them enough about Python packaging to > do something about it". > > > As a general note I guess I should say that I'm still not convinced > that migrating to json is worth the effort, but you've heard those > arguments before and I don't have anything new to add now, so :-). > > > The main benefit I see will be to empower utility APIs like distlib (and > potentially Warehouse itself) to better hide both the historical and > migratory cruft by translating everything to the PEP 426 format, even if > the source artifact only includes the legacy metadata. Unless the plumbing > actually breaks, nobody other than the plumber cares when it's a mess, as > long as the porcelain is shiny and clean :) > > Cheers, > Nick. > > P.S. Something I'm getting out of this experience: if you can afford to > sit on your hands for 3-4 years, that's a *really good way* to avoid > falling prey to "second system syndrome" [1] :) > > P.P.S Having no budget to pay anyone else and only limited time and > attention of your own also turns out to make it easier to avoid ;) > Yes, getting to stew on an idea for any length of time lets those random ideas one gets to properly die when they are bad. ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Fri Mar 10 16:03:20 2017 From: dholth at gmail.com (Daniel Holth) Date: Fri, 10 Mar 2017 21:03:20 +0000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: You lost me a bit at 'extra sets'. FYI it is already possible to depend on your own extras in another extra. Extra pseudo code: spampackage extra['spam'] = 'spampackage[eggs]' extra['eggs'] = ... +1 on extras. The extras feature has the wonderful property that people understand it. Lots of projects have a 'test' extra instead of tests_require for example, and you don't have to look up how to install them. 
On Fri, Mar 10, 2017 at 1:14 PM Brett Cannon wrote: On Fri, 10 Mar 2017 at 07:56 Nick Coghlan wrote: On 11 March 2017 at 00:52, Nathaniel Smith wrote: On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: > Hi folks, > > After a few years of dormancy, I've finally moved the metadata 2.0 > specification back to Draft status: > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 We have lots of metadata files in the wild that already claim to be version 2.0. If you're reviving this I think you might need to change the version number? They're mostly in metadata.json files, though. That said, version numbers are cheap, so I'm happy to skip straight to 3.0 if folks think it makes more sense. +1 on jumping. > Based on our last round of discussion, I've culled a lot of the complexity > around dependency declarations, cutting it back to just 4 pre-declared > extras (dev, doc, build, test), I think we can drop 'build' in favor of pyproject.toml? No, as that's a human edited input file, not an output file from the sdist generation process. Actually all of the pre-declared extras are really relevant for sdists rather than wheels. Maybe they should all move into pyproject.toml? Think "static release metadata in an API response from PyPI" for this particular specification, rather than something you'd necessarily check into source control. Or "stuff PyPI has to parse, not you". ;) That's actually one of the big benefits of doing this post pyproject.toml - with that taking care of the build system bootstrapping problem, it frees up pydist.json to be entirely an artifact of the sdist generation process (and then copying it along to the wheel archives and the installed package as well). That said, that's actually an important open question: is pydist.json always preserved unmodified through the sdist->wheel->install and sdist->install process? Is there a reason not to? There's a lot to be said for treating the file as immutable, and instead adding *other* metadata files as a component moves through the distribution process. If so, then it may actually be more appropriate to call the rendered file "pysdist.json", since it contains the sdist metadata specifically, rather than arbitrary distribution metadata. Since this is meant for tool consumption and not human consumption, breaking the steps into individual files so that they are considered immutable by tools farther down the toolchain makes sense to me. > and some reserved extras that can be used to > say "don't install this, even though you normally would" (self, runtime). Hmm. While it's not the most urgent problem we face, I really think in the long run we need to move the extras system to something like: https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html The current extras system is inherently broken with respect to upgrades, and reified extras would solve this, along with several other intractable problems (e.g. numpy ABI tracking). So from that perspective, I'm wary of adding new special case "magic" to the extras system. Adding conventional names for things like test-dependencies is fine, that doesn't pose any new obstacles to a future migration. But adding complexity to the "extras language" like "*", "self", "runtime", etc. does make it harder to change how extras work in the future. 
Technically the only part of that which the PEP really locks in is barring the use of "self" and "runtime" as extras names (which needs to be validated by a check against currently published metadata to see if anyone is already using them). Do you have something planned for these names? '*' is already illegal due to the naming rules, and the '-extra' syntax is also an illegal name, so neither of those actually impacts the metadata format, only what installation tools allow. The main purpose of having them in the PEP is to disallow using those spellings for anything else and instead reserve them for the purposes described in the PEP. I'd also be fairly strongly opposed to converting extras from an optional dependency management system to a "let multiple PyPI packages target the same site-packages subdirectory" because we already know that's a nightmare from the Linux distro experience (having a clear "main" package that owns the parent directory with optional subpackages solves *some* of the problems, but my main reaction is still "Run awaaay"). It especially isn't needed just to solve the "pip forgets what extras it installed" problem - that technically doesn't even need a PEP to resolve, it just needs pip to drop a pip specific file into the PEP 376 dist-info directory that says what extras to request when doing future upgrades. Similarly, the import system offers so much flexibility in checking for optional packages at startup and lying about where imports are coming from that it would be hard to convince me that installation customisation to use particular optional dependencies *had* to be done at install time. I feel like most of the value we get out of these could be had by just standardizing the existing convention that packages should have an explicit "all" extra that includes all the feature-based extras, That's the first I've heard of that convention, so it may not be as widespread as you thought it was :) but not the special development extras? This also provides flexibility for cases like, a package where there are two extras that conflict with each other -- the package authors can pick which one they recommend to put into "all". That's actually the main problem I had with '*' - it didn't work anywhere near as nicely once the semantic dependencies were migrated over to being part of the extras system. Repeating the same dependencies under multiple extra names in order to model pseudo-sets seems error prone and messy to me, though. So perhaps we should add the notion of "extra_sets" as a first class entity, where they're named sets of declared extras? And if you don't declare an "all" set explicitly, you get an implied one that consists of all your declared extras. I think that's a tool decision that doesn't tie into the PEP (unless you're going to ban the use of the name "all"). For migration of existing metadata that uses "all" as a normal extra, the translation would be: - declared extras are added to "all" in order until all of the dependencies in all are covered or all declared extras are included - any dependency in "all" that isn't in another extra gets added to a new "_all" extra - "extras" and "extra_sets" are populated accordingly Tools consuming the metadata would then just need to read "extra_sets" and expand any named sets before passing the list of extras over to their existing dependency processing machinery. If this is meant to be generated by pyproject.toml consumers then I think it should be up to the build tools to support that concept. 
Then the build tools can statically declare the union of some extras to get extra sets since the information isn't changing once the pydist.json file is generated (dynamic calculation is only necessary if the value could change between data generation and consumption). > I've also deleted a lot of the text related to thing that we now don't need > to worry about until the first few standard metadata extensions are being > defined. > > I think the biggest thing it needs right now is a major editing pass from > someone that isn't me to help figure out which explanatory sections can be > culled completely, while still having the specification itself make sense. > > From a technical point of view, the main "different from today" piece that > we have left is the Provide & Obsoleted-By fields, and I'm seriously > wondering if it might make sense to just delete those entirely for now, and > reconsider them later as a potential metadata extension. Overall the vibe I get from the Provides and Obsoleted-By sections is that these are surprisingly complicated and could really do with their own PEP, yeah, where the spec will have room to breathe and properly cover all the details. In particular, the language in the "provides" spec about how the interpretation of the metadata depends on whether you get it from a public index server versus somewhere else makes me really nervous. Yeah, virtual provides are a security nightmare on a public index server - distros are only able to get away with it because they maintain relatively strict control over the package review process. Experience suggests that splitting up packaging PEPs is basically never a bad idea, right? :-) Indeed :) OK, I'll put them on the chopping block too, under the assumption they may come back as an extension some day if it ever makes it to the top of someone's list of "thing that bothers them enough about Python packaging to do something about it". As a general note I guess I should say that I'm still not convinced that migrating to json is worth the effort, but you've heard those arguments before and I don't have anything new to add now, so :-). The main benefit I see will be to empower utility APIs like distlib (and potentially Warehouse itself) to better hide both the historical and migratory cruft by translating everything to the PEP 426 format, even if the source artifact only includes the legacy metadata. Unless the plumbing actually breaks, nobody other than the plumber cares when it's a mess, as long as the porcelain is shiny and clean :) Cheers, Nick. P.S. Something I'm getting out of this experience: if you can afford to sit on your hands for 3-4 years, that's a *really good way* to avoid falling prey to "second system syndrome" [1] :) P.P.S Having no budget to pay anyone else and only limited time and attention of your own also turns out to make it easier to avoid ;) Yes, getting to stew on an idea for any length of time lets those random ideas one gets to properly die when they are bad. ;) _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 10 23:17:58 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Mar 2017 14:17:58 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 11 March 2017 at 07:03, Daniel Holth wrote: > You lost me a bit at 'extra sets'. 
FYI it is already possible to depend on > your own extras in another extra. > > Extra pseudo code: > spampackage > extra['spam'] = 'spampackage[eggs]' > extra['eggs'] = ... > Oh, nice. In that case, we can drop the '*' idea and just make "all" another pre-declared extra with a SHOULD that says sdist build tools should implicitly populate it as: { "requires": "thisproject[extra1,extra2,extra3,extra4]", "extra": "all" } given an extras clause containing '["extra1", "extra2", "extra3", "extra4"]'. Endorsing that approach to handling "extra sets" does impose a design constraint though, which is that installation tools will need to special-case self-referential requirements so they don't get stuck in a recursive loop. (That will become a new MUST in the spec) That just leaves the question of how to install build & test requirements without installing the project itself, and I guess we don't actually need to handle that at the Python metadata level - it can be done by external tools. For example, in the pyp2rpm case, it's handled by the translation to BuildRequires and Requires terms at the RPM level, with RPM then handling the task of setting up the build environment correctly. > +1 on extras. The extras feature has the wonderful property that people > understand it. Lots of projects have a 'test' extra instead of > tests_require for example, and you don't have to look up how to install > them. > Yeah, it was really helpful to me to work through the "How would I replace this proposal with the existing extras system?", since the end result achieved everything I was aiming for without requiring any fundamentally new concepts or tech. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From amine.djillali at gmail.com Sat Mar 11 10:23:28 2017 From: amine.djillali at gmail.com (Adh) Date: Sat, 11 Mar 2017 16:23:28 +0100 Subject: [Distutils] Python 3.5 Message-ID: Hello, I have not managed to install Python 3.5 from the terminal; I use Ubuntu. Which command do I have to type to install it? Thank you for your answer -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From andrey at futoin.org Fri Mar 10 17:25:06 2017 From: andrey at futoin.org (Andrey Galkin) Date: Sat, 11 Mar 2017 00:25:06 +0200 Subject: [Distutils] Critical: PR for packaging.specifiers not found issue Message-ID: Can someone please take a look at https://github.com/pypa/setuptools/pull/990 ? Previously, the issue was reported by another user and then rejected: https://github.com/pypa/setuptools/issues/967 The problem is reproducible on both Python 2.7.13 and 3.5.3 as shipped in Debian Stretch. It is not yet visible on other OSes, including Ubuntu, which ship previous patch versions of 2.7 & 3.5. I believe it's related to this change: bpo-27419: Standard __import__() no longer looks up '__import__' in globals or builtins for importing submodules or 'from import'. Fixed handling an error of non-string package name. https://bugs.python.org/issue27419 The packaging module does not export specifiers in __init__.py. It can be easily triggered with "pip install -e source_dir". I can confirm the issue vanishes once the one-liner PR is applied to the latest setuptools in the virtualenv.
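For anyone puzzled by the tracebacks below, the underlying import behaviour is easy to reproduce in isolation (a rough illustration only, not the actual setuptools code path):

import packaging              # runs packaging/__init__.py only
try:
    packaging.specifiers      # fails unless the submodule was already
                              # imported elsewhere in the process
except AttributeError as exc:
    print(exc)                # module 'packaging' has no attribute 'specifiers'

import packaging.specifiers   # explicit submodule import binds the attribute
print(packaging.specifiers.SpecifierSet(">=1.0"))

Whether the attribute access succeeds depends only on whether packaging.specifiers has already been imported somewhere in the process, which is presumably why an explicit one-liner import fix makes the error go away.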
Error output: Complete output from command python setup.py egg_info: Traceback (most recent call last): File "", line 1, in File "/vagrant/setup.py", line 67, in setup(**config) File "/usr/lib/python2.7/distutils/core.py", line 111, in setup _setup_distribution = dist = klass(attrs) File "/home/vagrant/.virtualenv-2.7/local/lib/python2.7/site-packages/setuptools/dist.py", line 320, in __init__ _Distribution.__init__(self, attrs) File "/usr/lib/python2.7/distutils/dist.py", line 287, in __init__ self.finalize_options() File "/home/vagrant/.virtualenv-2.7/local/lib/python2.7/site-packages/setuptools/dist.py", line 387, in finalize_options ep.load()(self, ep.name, value) File "/home/vagrant/.virtualenv-2.7/local/lib/python2.7/site-packages/setuptools/dist.py", line 166, in check_specifier except packaging.specifiers.InvalidSpecifier as error: AttributeError: 'module' object has no attribute 'specifiers' Complete output from command python setup.py egg_info: Traceback (most recent call last): File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 165, in check_specifier packaging.specifiers.SpecifierSet(value) AttributeError: module 'packaging' has no attribute 'specifiers' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "", line 1, in File "/vagrant/setup.py", line 67, in setup(**config) File "/usr/lib/python3.5/distutils/core.py", line 108, in setup _setup_distribution = dist = klass(attrs) File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 320, in __init__ _Distribution.__init__(self, attrs) File "/usr/lib/python3.5/distutils/dist.py", line 281, in __init__ self.finalize_options() File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 387, in finalize_options ep.load()(self, ep.name, value) File "/home/vagrant/.virtualenv-3.5/lib/python3.5/site-packages/setuptools/dist.py", line 166, in check_specifier except packaging.specifiers.InvalidSpecifier as error: AttributeError: module 'packaging' has no attribute 'specifiers' From ben+python at benfinney.id.au Sat Mar 11 18:57:32 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 12 Mar 2017 10:57:32 +1100 Subject: [Distutils] Python 3.5 References: Message-ID: <85varfgydv.fsf@benfinney.id.au> Adh writes: > Hello, I do not arrive to install python 3.5 in the terminal, I use > Ubuntu. You should ask general usage questions in the main user forum for Python, . Please subscribe there, tell them which Ubuntu version you are using, what command you type and what is the result. They will help you from there. -- \ ?Simplicity is prerequisite for reliability.? ?Edsger W. | `\ Dijkstra | _o__) | Ben Finney From graffatcolmingov at gmail.com Sat Mar 11 21:26:20 2017 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Sat, 11 Mar 2017 20:26:20 -0600 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85r323gw48.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> Message-ID: On Mar 11, 2017 6:47 PM, "Ben Finney" wrote: Howdy all, What prospects are there for PyPI to have GnuPG-signed packages by default? Debian's UScan has the ability to find, download, and verify the GnuPG signature for a package source release. Lintian will remind the maintainer if a Debian source package is not taking advantage of this. However, this only works if upstream releases are actually accompanied by a valid GnuPG signature each time. 
The PyPI infrastructure supports this; why isn't it more widely encouraged? This thread from 2016 has a possible answer: while you can use GPG as is to verify that yes, "Donald Stufft" signed a particular package, you cannot use it to determine if "Donald Stufft" is *allowed* to sign for that package, a valid signature from me on the requests project should be just as invalid as an invalid signature from anyone on the requests project. The only namespacing provided by GPG itself is "trusted key" vs "not trusted key". [?] I am aware of a single tool anywhere that actively supports verifying the signatures that people upload to PyPI, and that is Debian's uscan program. [?] All in all, I think that there is not a whole lot of point to having this feature in PyPI, it is predicated a bunch of invalid assumptions (as detailed above) and I do not believe end users are actually even using the keys that are being uploaded. [?] Thus, I would like to remove this feature from PyPI [?]. The thread has some discussion, and Barry Warsaw makes the case for Debian's use for signed releases. The last (?) post in the thread has a kind of interim conclusion: My main concern when implementing this is how to communicate it to users [?]. [A move that gives the impression] "we're getting rid of this thing that only kinda works now in favor of something amazing that doesn't exist yet" is just not a popular move. In response to polite requests for signed releases, some upstream I've only ever seen condescending requests in the past but perhaps we have different definitions of "polite" or perhaps things have genuinely changed. maintainers are now pointing to that thread and closing bug reports as ?won't fix?. You may have noticed in that thread that there are plans for better mechanisms. Mechanisms that don't add significantly more burden to maintainers of the software we know and love who do this for free and with their spare time. What prospect is there in the Python community to get signed upstream releases become the obvious norm? Not every package on PyPI is redistributed via Linux packagers. Why then should someone publishing their tiny little first package have to go through the hassle of creating a GPG key? As a maintainer of Twine, I will never force someone to have learned how to install GPG on their platform, create a key that package maintainers won't belittle them for, and maintain the key's security in order to upload something to PyPI. Further GPG depends on trust. Do you mean to imply that Debian trusts PyPI packages with a signature more than those without? Even if the key used to sign it has never been signed by another person? What about keys signed by people you've never met? Someone can manufacture their own web of trust if they want to. Why is GPG seen as done kind of magic authenticity bullet? If you can find a tool that is easy to install on Linux, Windows, and Mac, which solves the problems above by virtue of having very good defaults, and is accessible to anyone with less than a few hours to waste on it... Then maybe I would collaborate to make it a requirement. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Sun Mar 12 03:15:03 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 12 Mar 2017 18:15:03 +1100 Subject: [Distutils] GnuPG signatures on PyPI: why so few? 
References: <85r323gw48.fsf@benfinney.id.au> Message-ID: <85h92zge4o.fsf@benfinney.id.au> (Ian, your messages are failing to properly quote material you're responding to. The message you posted has no quote leaders on my material, which looks like it was written by you; see the message at . If this is some mangling done by GMail, you may need to change its configuration or post using something else until it's fixed.) Ian Cordasco writes: > If you can find a tool that is easy to install on Linux, Windows, and Mac, > which solves the problems above by virtue of having very good defaults, and > is accessible to anyone with less than a few hours to waste on it... Then > maybe I would collaborate to make it a requirement. No-one here has argued that it be a requirement as things stand now. I'm talking about encouraging it as a norm, by improving tool support to make it easier. -- \ ?The fact of your own existence is the most astonishing fact | `\ you'll ever have to confront. Don't dare ever see your life as | _o__) boring, monotonous, or joyless.? ?Richard Dawkins, 2010-03-10 | Ben Finney From p.f.moore at gmail.com Sun Mar 12 07:49:16 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 12 Mar 2017 11:49:16 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85h92zge4o.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> Message-ID: On 12 March 2017 at 07:15, Ben Finney wrote: >> If you can find a tool that is easy to install on Linux, Windows, and Mac, >> which solves the problems above by virtue of having very good defaults, and >> is accessible to anyone with less than a few hours to waste on it... Then >> maybe I would collaborate to make it a requirement. > > No-one here has argued that it be a requirement as things stand now. I'm > talking about encouraging it as a norm, by improving tool support to > make it easier. One tool that needs improvement to be easier to use for this to happen is GPG itself. As a Windows user, I've "played" with it in the past, and found it frustratingly difficult. It's fiddly to set up, it's not officially supported on Windows, it's intrusive (needs an installer rather than having a portable version), and doesn't give me any assistance in managing the generated key that I might only need once every year or two, and not always on the same machine (and at least one of the machines involved has all access to "internet shared storage" blocked). If I were publishing code that was used extensively by others, and I was being paid to set up a production quality distribution, then I'd be fine with all this. But for putting up my hobby program for others to take a look at if they are interested, it's way too much to expect. (And I'd strongly resist suggestions that such hobby programs be refused permission to publish on PyPI - everything that's available on PyPI started off in just that way). Paul From ben+python at benfinney.id.au Sun Mar 12 08:13:37 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 12 Mar 2017 23:13:37 +1100 Subject: [Distutils] GnuPG signatures on PyPI: why so few? References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> Message-ID: <85d1dmhevi.fsf@benfinney.id.au> Paul Moore writes: > One tool that needs improvement to be easier to use for this to happen > is GPG itself. No disagreement from me on that. 
And indeed, the GnuPG project's chronic under-funding eventually drew attention from the new Core Infrastructure Initiative to improve it faster than was historically the case. This is thanks in large part to the amazing work of Nadia Eghbal in drawing attention to how critical free software, such as GnuPG, benefits society enormously and must receive reliable funding from the organisations who benefit. If anyone reading this works for any organisation that wants to ensure such critical free-software infrastructure continues to be consistently funded and maintained, encourage regular financial contribution to the Core Infrastructure Initiative or similar projects. > As a Windows user, I've "played" with it in the past, and found it > frustratingly difficult. I hope many people here will find the guide published by the FSF, Email Self-Defense , a useful walk through how to set it up properly. -- \ ?I must say that I find television very educational. The minute | `\ somebody turns it on, I go to the library and read a book.? | _o__) ?Groucho Marx | Ben Finney From p.f.moore at gmail.com Sun Mar 12 10:35:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 12 Mar 2017 14:35:44 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85d1dmhevi.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> Message-ID: On 12 March 2017 at 12:13, Ben Finney wrote: > >> As a Windows user, I've "played" with it in the past, and found it >> frustratingly difficult. > > I hope many people here will find the guide published by the FSF, Email > Self-Defense , a useful walk > through how to set it up properly. That's about email, though, and as such irrelevant here. I have no interest in setting up GPG for my email. Part of what I meant by "intrusive" was "installs plugins for things like email and file encryption that I don't want". Part of my issue here is that people promoting signing tend to think of it as a way of life, rather than as an annoying little extra step that is needed for one specific activity (publishing to PyPI in the context of this thread). There's essentially nothing written from the POV of "you have no interest in signing, and are only doing it because someone's insisting that you do - so here's how to do the least possible to make them shut up". You may not agree with that attitude, but it is very common in my experience, and documents that start by trying to change the reader's opinion get discarded *remarkably* fast. But this is way off-topic, so I'll refrain from saying anything more. Paul From steve.dower at python.org Sun Mar 12 14:57:49 2017 From: steve.dower at python.org (Steve Dower) Date: Sun, 12 Mar 2017 11:57:49 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> Message-ID: FWIW, I dropped a portable version into the windows-installer externals that are pulled down by the release scripts (from svn.p.o). It does require me to import my key on new machines, but since I don't use it for anything but re-signing the releases it's worth it to avoid all the intrusions. So it's definitely possible, just a matter of finding and including the right dependencies to copy around. 
Cheers, Steve Top-posted from my Windows Phone -----Original Message----- From: "Paul Moore" Sent: ?3/?12/?2017 7:36 To: "Ben Finney" Cc: "Distutils" Subject: Re: [Distutils] GnuPG signatures on PyPI: why so few? On 12 March 2017 at 12:13, Ben Finney wrote: > >> As a Windows user, I've "played" with it in the past, and found it >> frustratingly difficult. > > I hope many people here will find the guide published by the FSF, Email > Self-Defense , a useful walk > through how to set it up properly. That's about email, though, and as such irrelevant here. I have no interest in setting up GPG for my email. Part of what I meant by "intrusive" was "installs plugins for things like email and file encryption that I don't want". Part of my issue here is that people promoting signing tend to think of it as a way of life, rather than as an annoying little extra step that is needed for one specific activity (publishing to PyPI in the context of this thread). There's essentially nothing written from the POV of "you have no interest in signing, and are only doing it because someone's insisting that you do - so here's how to do the least possible to make them shut up". You may not agree with that attitude, but it is very common in my experience, and documents that start by trying to change the reader's opinion get discarded *remarkably* fast. But this is way off-topic, so I'll refrain from saying anything more. Paul _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Sun Mar 12 15:51:13 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Sun, 12 Mar 2017 12:51:13 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <85d1dmhevi.fsf@benfinney.id.au> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> Message-ID: <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> > On Mar 12, 2017, at 5:13 AM, Ben Finney wrote: > > Paul Moore writes: > >> One tool that needs improvement to be easier to use for this to happen >> is GPG itself. > > No disagreement from me on that. And indeed, the GnuPG project's chronic > under-funding eventually drew attention from the new Core Infrastructure > Initiative to improve it > faster than was historically the case. > > This is thanks in large part to the amazing work of Nadia Eghbal > in drawing attention to how critical > free software, such as GnuPG, benefits society enormously and must > receive reliable funding from the organisations who benefit. > > If anyone reading this works for any organisation that wants to ensure > such critical free-software infrastructure continues to be consistently > funded and maintained, encourage regular financial contribution to the > Core Infrastructure Initiative > or similar projects. No disrespect to GPG's maintainers, who are indeed beleaguered and underfunded, but the poor usability of the tool isn't entirely down to a lack of resources. One reason we may not want to require or even encourage the use of GPG is that GPG is bad. 
Publishing your own heartfelt screed about why you used to like GPG but really, we need to abandon it now, has become the national sport of the information security community: https://blog.cryptographyengineering.com/2014/08/13/whats-matter-with-pgp/ https://blog.filippo.io/giving-up-on-long-term-pgp/ https://moxie.org/blog/gpg-and-me/ These posts are talking a lot about email, but many of the problems are just fundamental; in particular the "museum of 90s crypto" aspect is fundamentally un-solvable within the confines of the OpenPGP specification. "Unusable email clients" in this case could be replaced with "unusable packaging tooling". If you're retrieving packages from PyPI over TLS, they're already cryptographically signed at the time of retrieval, by an entity with a very good reputation in the community (the PSF) that you already have to trust anyway because that's where Python comes from. So if we could get away from GPG as a specific piece of tooling here and focus on the problem a detached GPG signature could solve, it's "direct trust of packagers rather than the index". The only way that Debian maintainers can supply this trust metadata right now is to manually populate debian/upstream/signing-key.asc. This is a terrible mechanism that is full of flaws, but requiring a human being to at least look at the keys is at least a potential benefit because maybe they'll notice that it's odd that the key got rotated. If PyPI required signatures from everybody then it would be very tempting to skip this manual step and just retrieve the signing key from the PyPI account uploading the packages, which is the exact same guarantee you had before via the crypto TLS gave you (i.e. the PSF via PyPI makes some highly ambiguous attestation as to the authenticity of the package, basically just "its name matches") but now you're involving a pile of highly-complex software with fundamentally worse crypto than OpenSSL would have given you. To summarize: Even if we only cared about supplying package upstreams to Debian (and that is a tiny part of PyPI's mission), right now, using the existing tooling of uscan and lintian, the only security value that could _possibly_ be conveyed here would be an out-of-band conversation between the maintainer and upstream about what their signing keys are and how the signing process works. Any kind of automation would make it less likely that would happen, which means that providing tool support to automate this process would actually make things worse. >> As a Windows user, I've "played" with it in the past, and found it >> frustratingly difficult. > > I hope many people here will find the guide published by the FSF, Email > Self-Defense , a useful walk > through how to set it up properly. > > -- > \ ?I must say that I find television very educational. The minute | > `\ somebody turns it on, I go to the library and read a book.? | > _o__) ?Groucho Marx | > Ben Finney > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Mar 13 03:45:28 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Mar 2017 17:45:28 +1000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? 
In-Reply-To: <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: On 13 March 2017 at 05:51, Glyph Lefkowitz wrote: > To summarize: Even if we only cared about supplying package upstreams to > Debian (and that is a tiny part of PyPI's mission), right now, using the > existing tooling of uscan and lintian, the only security value that could > _possibly_ be conveyed here would be an out-of-band conversation between > the maintainer and upstream about what their signing keys are and how the > signing process works. Any kind of automation would make it less likely > that would happen, which means that providing tool support to automate this > process would actually make things *worse*. > And much of the same benefits can be obtained by Debian and other third parties maintaining "known hashes" for historical PyPI releases and complaining if they ever change. The only aspect that end-to-end package signing can potentially help with is bypassing PyPI as a potential point of compromise for *new* never-before-seen releases, and much of *that* benefit can be gained by way of publishers providing a list of "expected artifact hashes" through a trusted channel that they control and the PyPI service can't influence. GPG signatures of the artifacts themselves is just one way of establishing that trusted information channel, and it's a particularly publisher-hostile one that's also pretty end-user-hostile as well. The TUF based approach in PEP 458 and PEP 480 has at least in principle support from both Donald and I, but in addition to still relying on HTTPS to bootstrap initial trust, it is also gated behind the Warehouse migration and shutting down the legacy PyPI implementation (which is a sufficiently tedious activity that we think the chances of achieving it with purely volunteer and part-time labour are basically zero). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From lele at metapensiero.it Mon Mar 13 05:47:54 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Mon, 13 Mar 2017 10:47:54 +0100 Subject: [Distutils] Best practice to build binary wheels on Github+Travis and upload to PyPI Message-ID: <871su1pkxh.fsf@metapensiero.it> Hi all, I'd like to learn how to configure a project I keep on Github so that at release time it will trigger a build of binary wheels for different versions of Python 3 and eventually uploading them to PyPI. At first I tried to follow the Travis deploy instruction[1], but while that works for source distribution it cannot be used to deploy binary wheels because AFAICT Travis does not build ?manylinux1?-marked wheels. I then found the manylinux-demo project[2] that uses Docker and contains a a script able to build the wheels for every available version of Python. OTOH, it does not tackle to PyPI upload step. I will try to distill a custom recipe for my own needs looking at how other packages implemented this goal, but I wonder if there is already some documentation that could help me understanding better how to intersect the above steps. Thanks in advance for any hint, ciao, lele. [1] https://docs.travis-ci.com/user/deployment/pypi/ [2] https://github.com/pypa/python-manylinux-demo -- nickname: Lele Gaifax | Quando vivr? 
di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From ralf.gommers at gmail.com Mon Mar 13 05:58:03 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 13 Mar 2017 22:58:03 +1300 Subject: [Distutils] Best practice to build binary wheels on Github+Travis and upload to PyPI In-Reply-To: <871su1pkxh.fsf@metapensiero.it> References: <871su1pkxh.fsf@metapensiero.it> Message-ID: On Mon, Mar 13, 2017 at 10:47 PM, Lele Gaifax wrote: > Hi all, > > I'd like to learn how to configure a project I keep on Github so that at > release time it will trigger a build of binary wheels for different > versions > of Python 3 and eventually uploading them to PyPI. > > At first I tried to follow the Travis deploy instruction[1], but while that > works for source distribution it cannot be used to deploy binary wheels > because AFAICT Travis does not build ?manylinux1?-marked wheels. > > I then found the manylinux-demo project[2] that uses Docker and contains a > a script able to build the wheels for every available version of Python. > OTOH, > it does not tackle to PyPI upload step. > > I will try to distill a custom recipe for my own needs looking at how other > packages implemented this goal, but I wonder if there is already some > documentation that could help me understanding better how to intersect the > above steps. > Multibuild is probably the best place to start: https://github.com/matthew-brett/multibuild Here's a relatively simple and up-to-date example of how to produce wheels for Windows, Linux and OS X automatically using multibuild: https://github.com/MacPython/pywavelets-wheels Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Mar 13 06:32:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Mar 2017 20:32:48 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 11 March 2017 at 14:17, Nick Coghlan wrote: > On 11 March 2017 at 07:03, Daniel Holth wrote: > >> You lost me a bit at 'extra sets'. FYI it is already possible to depend >> on your own extras in another extra. >> >> Extra pseudo code: >> spampackage >> extra['spam'] = 'spampackage[eggs]' >> extra['eggs'] = ... >> > > Oh, nice. In that case, we can drop the '*' idea and just make "all" > another pre-declared extra with a SHOULD that says sdist build tools should > implicitly populate it as: > > { > "requires": "thisproject[extra1,extra2,extra3,extra4]" > "extra": "all" > } > > given an extras clause containing '["extra1",'extra2","extra3","extra4"]'. > > Endorsing that approach to handling "extra sets" does impose a design > constraint though, which is that installation tools will need to > special-case self-referential requirements so they don't get stuck in a > recursive loop. 
(That will become a new MUST in the spec) > Next update: https://github.com/python/peps/commit/24cd02b34cea1bf35443048fd665485dffd0de93
- metadata version bumped to 3.0
- expected filename changed to pysdist.json and stated to be immutable once generated for a given release
- project obsolescence changes deferred to a possible future metadata extension
- no proposed changes to extras syntax and the "self" and "runtime" pseudo-extras dropped
- "all" added as an implied extra for all declared extras
- "alldev" added as an implied superset of "test", "build", "doc" and "dev"
Even though it's not strictly necessary, I'd still kind of like to have a standard way to say "install all the dev dependencies, but not the package itself or its runtime dependencies". I guess if we take distro build tools as an example though, they handle that as a separate command (e.g. "dnf builddep" vs "dnf install") rather than as a variation on the normal install command. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Mon Mar 13 11:23:25 2017 From: donald at stufft.io (Donald Stufft) Date: Mon, 13 Mar 2017 11:23:25 -0400 Subject: [Distutils] FYI - "Trending" on Warehouse Message-ID: Just an FYI, I've replaced the long-stagnant "top downloads" list on the Warehouse / pypi.org homepage with "Trending" projects. Since "trending" can mean a lot of different things as far as how it's computed, here's how I'm currently doing it [1]: Using a look back over the last 30 days of downloads I compute a "zscore" for each project for yesterday (effectively, how many standard deviations away from the mean yesterday was for that project in total downloads). The trending projects are then the top 5 projects in terms of zscore for yesterday (recomputed every day at ~3am UTC). Because it's a lot easier for a project with an average of 5 downloads to jump to 100 than it is for a project with 50000 downloads to jump to 1000000, I have tried to exclude any projects with very few downloads from this, so in order to qualify to be trending a project must receive at least 5,000 downloads in a month. If you happen to be some sort of sciencey person and you know of a better way to query what is effectively a table with a row for every download for every project to determine which ones are trending, feel free to open an issue or create a PR or something. I don't really know what I'm doing here :) Anyways, that's all! [1] https://github.com/pypa/warehouse/blob/a36435b9865000cdaae97b948af48c33f7d8fe8e/warehouse/packaging/tasks.py#L19-L102 - Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Mon Mar 13 13:46:02 2017 From: steve.dower at python.org (Steve Dower) Date: Mon, 13 Mar 2017 10:46:02 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: Another drive-by contribution: what if twine printed the hashes for anything it uploads with a message basically saying "here are the things you should publish somewhere for this release so people can check the validity of your packages after they download them"? I suspect many publishers have never considered this is something they could or should do.
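For illustration, a rough sketch of the z-score ranking Donald describes above might look like the following (this is not Warehouse's actual tasks.py code, and the in-memory input shape is an assumption; the real calculation runs against the per-download table in the database):

    import statistics

    def zscore(window, yesterday):
        # how many standard deviations yesterday's total sits from the mean
        # of the 30-day look-back window
        mean = statistics.mean(window)
        stdev = statistics.stdev(window)
        return (yesterday - mean) / stdev if stdev else 0.0

    def trending(downloads, minimum_monthly=5000, top_n=5):
        # downloads: {project: daily totals, oldest first, with yesterday as
        # the final entry} -- an assumed in-memory shape for the sketch
        eligible = {name: counts for name, counts in downloads.items()
                    if sum(counts) >= minimum_monthly}
        return sorted(eligible,
                      key=lambda name: zscore(eligible[name][:-1],
                                              eligible[name][-1]),
                      reverse=True)[:top_n]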
Some very basic prompting could easily lead to it becoming part of the normal workflow. Top-posted from my Windows Phone -----Original Message----- From: "Nick Coghlan" Sent: ?3/?13/?2017 0:53 To: "Glyph Lefkowitz" Cc: "DistUtils mailing list" ; "Ben Finney" Subject: Re: [Distutils] GnuPG signatures on PyPI: why so few? On 13 March 2017 at 05:51, Glyph Lefkowitz wrote: To summarize: Even if we only cared about supplying package upstreams to Debian (and that is a tiny part of PyPI's mission), right now, using the existing tooling of uscan and lintian, the only security value that could _possibly_ be conveyed here would be an out-of-band conversation between the maintainer and upstream about what their signing keys are and how the signing process works. Any kind of automation would make it less likely that would happen, which means that providing tool support to automate this process would actually make things worse. And much of the same benefits can be obtained by Debian and other third parties maintaining "known hashes" for historical PyPI releases and complaining if they ever change. The only aspect that end-to-end package signing can potentially help with is bypassing PyPI as a potential point of compromise for *new* never-before-seen releases, and much of *that* benefit can be gained by way of publishers providing a list of "expected artifact hashes" through a trusted channel that they control and the PyPI service can't influence. GPG signatures of the artifacts themselves is just one way of establishing that trusted information channel, and it's a particularly publisher-hostile one that's also pretty end-user-hostile as well. The TUF based approach in PEP 458 and PEP 480 has at least in principle support from both Donald and I, but in addition to still relying on HTTPS to bootstrap initial trust, it is also gated behind the Warehouse migration and shutting down the legacy PyPI implementation (which is a sufficiently tedious activity that we think the chances of achieving it with purely volunteer and part-time labour are basically zero). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 13 19:41:01 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 13 Mar 2017 16:41:01 -0700 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Fri, Mar 10, 2017 at 7:55 AM, Nick Coghlan wrote: > On 11 March 2017 at 00:52, Nathaniel Smith wrote: >> >> On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan wrote: >> > Hi folks, >> > >> > After a few years of dormancy, I've finally moved the metadata 2.0 >> > specification back to Draft status: >> > >> > https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3 >> >> We have lots of metadata files in the wild that already claim to be >> version 2.0. If you're reviving this I think you might need to change >> the version number? > > > They're mostly in metadata.json files, though. That said, version numbers > are cheap, so I'm happy to skip straight to 3.0 if folks think it makes more > sense. AFAICT bdist_wheel produces METADATA files with Metadata-Version: 2.0 by default, and has for some time. Certainly this one I just spot-checked does that. 
>> > Based on our last round of discussion, I've culled a lot of the >> > complexity >> > around dependency declarations, cutting it back to just 4 pre-declared >> > extras (dev, doc, build, test), >> >> I think we can drop 'build' in favor of pyproject.toml? > > > No, as that's a human edited input file, not an output file from the sdist > generation process. > >> >> Actually all of the pre-declared extras are really relevant for sdists >> rather than wheels. Maybe they should all move into pyproject.toml? > > > Think "static release metadata in an API response from PyPI" for this > particular specification, rather than something you'd necessarily check into > source control. That's actually one of the big benefits of doing this post > pyproject.toml - with that taking care of the build system bootstrapping > problem, it frees up pydist.json to be entirely an artifact of the sdist > generation process (and then copying it along to the wheel archives and the > installed package as well). > > That said, that's actually an important open question: is pydist.json always > preserved unmodified through the sdist->wheel->install and sdist->install > process? > > There's a lot to be said for treating the file as immutable, and instead > adding *other* metadata files as a component moves through the distribution > process. If so, then it may actually be more appropriate to call the > rendered file "pysdist.json", since it contains the sdist metadata > specifically, rather than arbitrary distribution metadata. I guess there are three possible kinds of build dependencies: - those that are known statically - those that are determined by running some code at sdist creation time - those that are determined by running some code at build time But all the examples I can think of fall into either bucket A (which pyproject.toml handles), or bucket C (which pydist.json can't handle). So it seems like the metadata here is either going to be redundant or wrong? I'm not sure I understand the motivation for wanting wheels to have a file which says "here's the metadata describing the sdist that you would have, if you had an sdist (which you don't)"? I guess it doesn't hurt anything, but it seems odd. > I'd also be fairly strongly opposed to converting extras from an optional > dependency management system to a "let multiple PyPI packages target the > same site-packages subdirectory" because we already know that's a nightmare > from the Linux distro experience (having a clear "main" package that owns > the parent directory with optional subpackages solves *some* of the > problems, but my main reaction is still "Run awaaay"). The "let multiple PyPI packages target the same site-packages directory" problem is orthogonal to the reified extras proposal. I actually think we can't avoid handling the same site-packages directory problem, but the solution is namespace packages and/or better Conflicts: metadata. Example illustrating why the site-packages conflict problem arises independently of reified extras: people want to distribute numpy built against different BLAS backends, especially MKL (which is good but zero-cost proprietary) versus OpenBLAS (which is not as good but is free). Right now that's possible by distributing 'numpy' and 'numpy-mkl' packages, but of course ugly stuff happens if you try to install both; some sort of Conflicts: metadata would help. If we instead have the packages be named 'numpy' and 'numpy[mkl]', then they're in exactly the same position with respect to conflicts. 
The very significant advantage is that we know that 'numpy[mkl]' "belongs to" the numpy project, so 'numpy[mkl]' can say 'Provides-Dist: numpy' without all the security issues that Provides-Dist otherwise runs into. Example illustrating why reifed extras are useful totally independently of site-packages conflicts: it would be REALLY NICE if numpy could say 'Provides-Dist: numpy[abi=7]' and then packages could depend on 'numpy[abi=7]' and have that do something sensible. This would be a pure virtual package. > It especially isn't needed just to solve the "pip forgets what extras it > installed" problem - that technically doesn't even need a PEP to resolve, it > just needs pip to drop a pip specific file into the PEP 376 dist-info > directory that says what extras to request when doing future upgrades. But that breaks if people use a package manager other than pip, which is something we want to support, right? And in any case it requires a bunch more redundant special-case logic inside pip, to basically make extras act like virtual packages. -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Tue Mar 14 00:23:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Mar 2017 14:23:55 +1000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: On 14 March 2017 at 03:46, Steve Dower wrote: > Another drive-by contribution: what if twine printed the hashes for > anything it uploads with a message basically saying "here are the things > you should publish somewhere for this release so people can check the > validity of your packages after they download them"? > > I suspect many publishers have never considered this is something they > could or should do. Some very basic prompting could easily lead to it > becoming part of the normal workflow. > Huh, and with most PyPI publishers using public version control systems, their source control repo itself could even serve as "a trusted channel that they control and the PyPI service can't influence". For example, the artifact hashes could be written out by default to: .released_artifacts//.sha256 And if twine sees the hash file exists before it starts the upload, it could complain that the given artifact had already been published even before PyPI complains about it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Tue Mar 14 01:48:19 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Mon, 13 Mar 2017 22:48:19 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> Message-ID: <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> > On Mar 13, 2017, at 9:23 PM, Nick Coghlan wrote: > > On 14 March 2017 at 03:46, Steve Dower > wrote: > Another drive-by contribution: what if twine printed the hashes for anything it uploads with a message basically saying "here are the things you should publish somewhere for this release so people can check the validity of your packages after they download them"? > > I suspect many publishers have never considered this is something they could or should do. 
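To make that concrete, here is a minimal sketch of the pre-upload hash manifest idea (the directory layout is assumed, since the placeholders in the path above were lost in the archive, and nothing like this exists in twine today):

    import hashlib
    import pathlib

    def record_artifact_hash(artifact, project, root=".released_artifacts"):
        # refuse to (re-)upload anything whose hash is already on record in
        # the project's own source repository
        artifact = pathlib.Path(artifact)
        digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
        hash_file = pathlib.Path(root, project, artifact.name + ".sha256")
        if hash_file.exists():
            raise RuntimeError("%s already has a published hash on record"
                               % artifact.name)
        hash_file.parent.mkdir(parents=True, exist_ok=True)
        hash_file.write_text(digest + "\n")
        return digest

Committing the generated files alongside the source would give both the publisher and third parties the independent cross-check being discussed here.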
Some very basic prompting could easily lead to it becoming part of the normal workflow. > > Huh, and with most PyPI publishers using public version control systems, their source control repo itself could even serve as "a trusted channel that they control and the PyPI service can't influence". For example, the artifact hashes could be written out by default to: > > .released_artifacts//.sha256 > > And if twine sees the hash file exists before it starts the upload, it could complain that the given artifact had already been published even before PyPI complains about it. 1. This sounds like it could be very cool. 2. Except, as stated - i.e. hashes without signatures - this just means we all trust Github rather than PyPI :). 3. A simple signing scheme, like https://minilock.io but for plaintext signatures rather than encryption , could potentially address this problem. 4. Cool as that would be, someone would need to design that thing first, and that person would need to be a cryptographer. 5. Now all you need to do is design a globally addressable PKI system. Good luck everybody ;-). -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Tue Mar 14 01:55:08 2017 From: donald at stufft.io (Donald Stufft) Date: Tue, 14 Mar 2017 01:55:08 -0400 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: <84CD63B4-0EF0-475E-BA41-B6DA2C468A69@stufft.io> > On Mar 14, 2017, at 1:48 AM, Glyph Lefkowitz wrote: > > 3. A simple signing scheme, like https://minilock.io but for plaintext signatures rather than encryption , could potentially address this problem. This is basically the plan, using it in conjunction with TUF for the fiddly bits (Because simply signing files isn?t good enough). ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Mar 14 02:52:14 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Mar 2017 16:52:14 +1000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: On 14 March 2017 at 15:48, Glyph Lefkowitz wrote: > > 2. Except, as stated - i.e. hashes without signatures - this just means we > all trust Github rather than PyPI :). > Yeah, HTTPS would still be a common point of compromise - that kind of simple scheme would just let the repo hosting and PyPI serve as cross-checks on each other, such that you had to compromise both (or the original publisher's system) in order to corrupt both the published artifact *and* the publisher's record of the expected artifact hash. It would also be enough to let publishers check that the artifacts that PyPI is serving match what they originally uploaded - treating it as a QA problem as much as a security one. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Tue Mar 14 03:34:21 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Mar 2017 17:34:21 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 14 March 2017 at 09:41, Nathaniel Smith wrote: > On Fri, Mar 10, 2017 at 7:55 AM, Nick Coghlan wrote: > > On 11 March 2017 at 00:52, Nathaniel Smith wrote: > >> We have lots of metadata files in the wild that already claim to be > >> version 2.0. If you're reviving this I think you might need to change > >> the version number? > > > > They're mostly in metadata.json files, though. That said, version numbers > > are cheap, so I'm happy to skip straight to 3.0 if folks think it makes > more > > sense. > > AFAICT bdist_wheel produces METADATA files with Metadata-Version: 2.0 > by default, and has for some time. Certainly this one I just > spot-checked does that. > We could always retroactively declare "2.0" to just mean 1.3 + Provides-Extra + (optionally) Description-Content-Type (once that has been defined in a way that makes sense for PyPI). Either way, I'm convinced that the JSON based format should start out at 3.0. > > There's a lot to be said for treating the file as immutable, and instead > > adding *other* metadata files as a component moves through the > distribution > > process. If so, then it may actually be more appropriate to call the > > rendered file "pysdist.json", since it contains the sdist metadata > > specifically, rather than arbitrary distribution metadata. > > I guess there are three possible kinds of build dependencies: > - those that are known statically > - those that are determined by running some code at sdist creation time > - those that are determined by running some code at build time > > But all the examples I can think of fall into either bucket A (which > pyproject.toml handles), or bucket C (which pydist.json can't handle). > So it seems like the metadata here is either going to be redundant or > wrong? > pyproject.toml only handles the bootstrapping dependencies for the build system itself, it *doesn't* necessarily include all the build dependencies, which may be in tool specific files (like setup_requires in setup.py) or otherwise added by the build system without and record of it in pyproject.toml. The build system knows the latter when it generates the sdist, and it means PyPI can extract and republish them without having to actually invoke the build system. For dynamic dependencies where the environment marker system isn't flexible enough to express the installation conditions (so they can't be generated at sdist creation time), that will be something for the publishers of a particular project to resolve with the folks that want the ability to do builds in environments that are isolated from the internet, and hence can't download arbitrary additional dependencies at build time. > I'm not sure I understand the motivation for wanting wheels to have a > file which says "here's the metadata describing the sdist that you > would have, if you had an sdist (which you don't)"? I guess it doesn't > hurt anything, but it seems odd. > Wheels still have a corresponding source artifact, even if it hasn't been published anywhere using the Python-specific sdist format. 
Accordingly, I don't think it makes sense to be able to tell just from looking at a wheel file whether the generation process was: * tree -> sdist -> wheel; or * tree -> wheel > I'd also be fairly strongly opposed to converting extras from an optional > > dependency management system to a "let multiple PyPI packages target the > > same site-packages subdirectory" because we already know that's a > nightmare > > from the Linux distro experience (having a clear "main" package that owns > > the parent directory with optional subpackages solves *some* of the > > problems, but my main reaction is still "Run awaaay"). > > The "let multiple PyPI packages target the same site-packages > directory" problem is orthogonal to the reified extras proposal. I > actually think we can't avoid handling the same site-packages > directory problem, but the solution is namespace packages and/or > better Conflicts: metadata. > > Example illustrating why the site-packages conflict problem arises > independently of reified extras: people want to distribute numpy built > against different BLAS backends, especially MKL (which is good but > zero-cost proprietary) versus OpenBLAS (which is not as good but is > free). Right now that's possible by distributing 'numpy' and > 'numpy-mkl' packages, but of course ugly stuff happens if you try to > install both; some sort of Conflicts: metadata would help. If we > instead have the packages be named 'numpy' and 'numpy[mkl]', then > they're in exactly the same position with respect to conflicts. The > very significant advantage is that we know that 'numpy[mkl]' "belongs > to" the numpy project, so 'numpy[mkl]' can say 'Provides-Dist: numpy' > without all the security issues that Provides-Dist otherwise runs > into. > Do other components need to be rebuilt or relinked if the NumPy BLAS backend changes? If the answer is yes, then this is something I'd strongly prefer to leave to conda and other package management systems like Nix that better support parallel installation of multiple versions of C/C++ dependencies. If the answer is no, then it seems like a better solution might be to allow for rich dependencies, where numpy could depend on "_numpy_backends.openblas or _numpy_backends.mkl" and figure out the details of exactly what's available and which one it's going to use at import time. Either way, contorting the Extras system to try to cover such a significantly different set of needs doesn't seem like a good idea. > > Example illustrating why reifed extras are useful totally > independently of site-packages conflicts: it would be REALLY NICE if > numpy could say 'Provides-Dist: numpy[abi=7]' and then packages could > depend on 'numpy[abi=7]' and have that do something sensible. This > would be a pure virtual package. > PEP 459 has a whole separate "python.constraints" extension rather than trying to cover environmental constraints within the existing Extras system: https://www.python.org/dev/peps/pep-0459/#the-python-constraints-extension > > It especially isn't needed just to solve the "pip forgets what extras it > > installed" problem - that technically doesn't even need a PEP to > resolve, it > > just needs pip to drop a pip specific file into the PEP 376 dist-info > > directory that says what extras to request when doing future upgrades. > > But that breaks if people use a package manager other than pip, which > is something we want to support, right? 
And in any case it requires a > bunch more redundant special-case logic inside pip, to basically make > extras act like virtual packages. > OK, it would still need a PEP to make the file name and format standardised across tools. Either way, it's an "installed packages database" problem, not a software publication problem. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Tue Mar 14 10:35:13 2017 From: dholth at gmail.com (Daniel Holth) Date: Tue, 14 Mar 2017 14:35:13 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: The wheel command implements but never fully realized the commands 'wheel keygen', 'wheel sign' for a bundled signature scheme (where the signature is inside the signed file) inspired by JAR signing and based on Ed25519 primitives + JSON web signature / JSON web key. The idea was to have wheel automatically generate a signing key and always generate signed wheels, since it's impossible to verify signatures if there are none. Successive releases from the same author would tend to use the same keys; a TOFU (trust on first use) model, a-la ssh, would warn you if the key changed. The public keys would be distributed over a separate https:// server (perhaps the publisher's personal web page, or an application could publish a list of public keys for its dependencies as-tested). Instead of checking the hash of an exact release artifact, you could use a similar syntax to check against a particular public key and cover yourself for future releases. Instead of key revocation, you could let the only valid signing keys be the ones currently available at the key URL, like oauth2 https://www.googleapis.com/oauth2/v3/certs The goal you'd want to shoot for is not 'is this package good' but 'am I being targeted'. A log of timestamp signatures for everything uploaded to PyPI could be very powerful here and might even be useful without publisher signatures, so that you could at least know that you are downloading the same reasonably old version of package X that everyone else is using. If there was a publisher signature, the timestamp server would sign the publisher's signature asserting 'this signature was valid at time X'. On Tue, Mar 14, 2017 at 2:52 AM Nick Coghlan wrote: > On 14 March 2017 at 15:48, Glyph Lefkowitz > wrote: > > > 2. Except, as stated - i.e. hashes without signatures - this just means we > all trust Github rather than PyPI :). > > > Yeah, HTTPS would still be a common point of compromise - that kind of > simple scheme would just let the repo hosting and PyPI serve as > cross-checks on each other, such that you had to compromise both (or the > original publisher's system) in order to corrupt both the published > artifact *and* the publisher's record of the expected artifact hash. > > It would also be enough to let publishers check that the artifacts that > PyPI is serving match what they originally uploaded - treating it as a QA > problem as much as a security one. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Mar 14 12:05:23 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 14 Mar 2017 09:05:23 -0700 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On Tue, Mar 14, 2017 at 12:34 AM, Nick Coghlan wrote: > On 14 March 2017 at 09:41, Nathaniel Smith wrote: >> >> On Fri, Mar 10, 2017 at 7:55 AM, Nick Coghlan wrote: >> > On 11 March 2017 at 00:52, Nathaniel Smith wrote: >> > There's a lot to be said for treating the file as immutable, and instead >> > adding *other* metadata files as a component moves through the >> > distribution >> > process. If so, then it may actually be more appropriate to call the >> > rendered file "pysdist.json", since it contains the sdist metadata >> > specifically, rather than arbitrary distribution metadata. >> >> I guess there are three possible kinds of build dependencies: >> - those that are known statically >> - those that are determined by running some code at sdist creation time >> - those that are determined by running some code at build time >> >> But all the examples I can think of fall into either bucket A (which >> pyproject.toml handles), or bucket C (which pydist.json can't handle). >> So it seems like the metadata here is either going to be redundant or >> wrong? > > > pyproject.toml only handles the bootstrapping dependencies for the build > system itself, it *doesn't* necessarily include all the build dependencies, > which may be in tool specific files (like setup_requires in setup.py) or > otherwise added by the build system without and record of it in > pyproject.toml. The build system knows the latter when it generates the > sdist, and it means PyPI can extract and republish them without having to > actually invoke the build system. Currently there are cases where people use setup_requires for what's actually static metadata, sure, but that's just because there hasn't been any alternative. The main actual *needs* are: - static build dependencies - dynamic build dependencies determined at build time So it seems to me that we should encourage people to move static dependencies into the static metadata (pyproject.toml), and when they don't then we can treat them like build-time dependencies, which is a problem we need to solve anyway. Having special metadata for "sdist creation-time dependencies" strikes me as papering over the needless complexity of the current system by adding more complexity on top. I can see how it'd have some short-term benefits but it seems net-harmful in the long run IMHO. (If we need a hack to cover the transition period from secretly-static-setup_requires to actually-static-pyproject.toml, maybe we could teach the setuptools sdist command to push setup_requires into pyproject.toml? That'd be a pretty simple hack that wouldn't increase the surface area of our interoperability problems.) >> >> I'm not sure I understand the motivation for wanting wheels to have a >> file which says "here's the metadata describing the sdist that you >> would have, if you had an sdist (which you don't)"? I guess it doesn't >> hurt anything, but it seems odd. 
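As a purely hypothetical sketch of the transition hack floated above -- copying a project's effectively-static setup_requires into the PEP 518 [build-system] table when the sdist is created -- something along these lines would do (the helper name, the example requirements and the third-party 'toml' dependency are all assumptions):

    import toml  # third-party 'toml' package, assumed available

    def push_setup_requires_to_pyproject(setup_requires, path="pyproject.toml"):
        # expose the statically-known build requirements where front ends can
        # read them without running setup.py
        data = {"build-system": {"requires": list(setup_requires)}}
        with open(path, "w") as f:
            toml.dump(data, f)

    push_setup_requires_to_pyproject(["setuptools", "wheel", "cffi>=1.0"])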
> > > Wheels still have a corresponding source artifact, even if it hasn't been > published anywhere using the Python-specific sdist format. Accordingly, I > don't think it makes sense to be able to tell just from looking at a wheel > file whether the generation process was: > > * tree -> sdist -> wheel; or > * tree -> wheel My point is just that usually if I'm looking at artifact A, I don't care about metadata about artifact B :-). Suppose someone has one of these wheels with an sdist.json in it. My question is, under what circumstances are you imagining that they'd look at that sdist.json? What would they do with it? The only case I can think of is for provenance tracking of various kinds, but I don't think just throwing in the sdist metadata is a very good solution to that. If we want source->binary provenance tracking then I'd rather see something focused on that problem, like wheel metadata fields Sdist-SHA256, Build-Host, Build-Time, etc. This isn't what sdist metadata is designed for, so to the extent that it would help solve the problem it's by accident, incomplete. >> > I'd also be fairly strongly opposed to converting extras from an >> > optional >> > dependency management system to a "let multiple PyPI packages target the >> > same site-packages subdirectory" because we already know that's a >> > nightmare >> > from the Linux distro experience (having a clear "main" package that >> > owns >> > the parent directory with optional subpackages solves *some* of the >> > problems, but my main reaction is still "Run awaaay"). >> >> The "let multiple PyPI packages target the same site-packages >> directory" problem is orthogonal to the reified extras proposal. I >> actually think we can't avoid handling the same site-packages >> directory problem, but the solution is namespace packages and/or >> better Conflicts: metadata. >> >> Example illustrating why the site-packages conflict problem arises >> independently of reified extras: people want to distribute numpy built >> against different BLAS backends, especially MKL (which is good but >> zero-cost proprietary) versus OpenBLAS (which is not as good but is >> free). Right now that's possible by distributing 'numpy' and >> 'numpy-mkl' packages, but of course ugly stuff happens if you try to >> install both; some sort of Conflicts: metadata would help. If we >> instead have the packages be named 'numpy' and 'numpy[mkl]', then >> they're in exactly the same position with respect to conflicts. The >> very significant advantage is that we know that 'numpy[mkl]' "belongs >> to" the numpy project, so 'numpy[mkl]' can say 'Provides-Dist: numpy' >> without all the security issues that Provides-Dist otherwise runs >> into. > > > Do other components need to be rebuilt or relinked if the NumPy BLAS backend > changes? > > If the answer is yes, then this is something I'd strongly prefer to leave to > conda and other package management systems like Nix that better support > parallel installation of multiple versions of C/C++ dependencies. > > If the answer is no, then it seems like a better solution might be to allow > for rich dependencies, where numpy could depend on "_numpy_backends.openblas > or _numpy_backends.mkl" and figure out the details of exactly what's > available and which one it's going to use at import time. The answer is no, and it's unlikely that numpy will massively rewrite its internals because pip is missing a feature that every other packaging system has. 
> Either way, contorting the Extras system to try to cover such a > significantly different set of needs doesn't seem like a good idea. The advantage of the "reified extras" idea is that it actually *removes* features and complexity while *also* solving a bunch of problems that are intractable today. So from my point of view, it's the status quo that's contorted :-). >> >> Example illustrating why reifed extras are useful totally >> independently of site-packages conflicts: it would be REALLY NICE if >> numpy could say 'Provides-Dist: numpy[abi=7]' and then packages could >> depend on 'numpy[abi=7]' and have that do something sensible. This >> would be a pure virtual package. > > > PEP 459 has a whole separate "python.constraints" extension rather than > trying to cover environmental constraints within the existing Extras system: > https://www.python.org/dev/peps/pep-0459/#the-python-constraints-extension I feel like this is the old argument between whether the best way to handle a complex problem space is with a complex solution, or with several simple solutions that can be composed. We can't even get a dependency resolver that handles simple dist-to-dist dependencies, and you want to add a whole second kind of constraints with its own semantics? (Or really third kind, b/c extras are already a second kind once we start tracking them properly.) -n -- Nathaniel J. Smith -- https://vorpus.org From glyph at twistedmatrix.com Wed Mar 15 01:48:58 2017 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Tue, 14 Mar 2017 22:48:58 -0700 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> Message-ID: <8D50A7F2-72AF-40E1-A68F-3BEDB6A9B7B9@twistedmatrix.com> The big problem here, of course, is "key management"; what happens when someone throws their laptop in a river. https://github.com/ahf/teneo indicates to me that it may be possible to use a KDF to get an Ed25519 key from a passphrase that the user remembers, minilock-style, largely mitigating that problem, assuming we can get users to remember stuff :-). -g > On Mar 14, 2017, at 7:35 AM, Daniel Holth wrote: > > The wheel command implements but never fully realized the commands 'wheel keygen', 'wheel sign' for a bundled signature scheme (where the signature is inside the signed file) inspired by JAR signing and based on Ed25519 primitives + JSON web signature / JSON web key. The idea was to have wheel automatically generate a signing key and always generate signed wheels, since it's impossible to verify signatures if there are none. Successive releases from the same author would tend to use the same keys; a TOFU (trust on first use) model, a-la ssh, would warn you if the key changed. The public keys would be distributed over a separate https:// server (perhaps the publisher's personal web page, or an application could publish a list of public keys for its dependencies as-tested). Instead of checking the hash of an exact release artifact, you could use a similar syntax to check against a particular public key and cover yourself for future releases. 
Instead of key revocation, you could let the only valid signing keys be the ones currently available at the key URL, like oauth2 https://www.googleapis.com/oauth2/v3/certs > > The goal you'd want to shoot for is not 'is this package good' but 'am I being targeted'. A log of timestamp signatures for everything uploaded to PyPI could be very powerful here and might even be useful without publisher signatures, so that you could at least know that you are downloading the same reasonably old version of package X that everyone else is using. If there was a publisher signature, the timestamp server would sign the publisher's signature asserting 'this signature was valid at time X'. > > On Tue, Mar 14, 2017 at 2:52 AM Nick Coghlan > wrote: > On 14 March 2017 at 15:48, Glyph Lefkowitz > wrote: > > 2. Except, as stated - i.e. hashes without signatures - this just means we all trust Github rather than PyPI :). > > Yeah, HTTPS would still be a common point of compromise - that kind of simple scheme would just let the repo hosting and PyPI serve as cross-checks on each other, such that you had to compromise both (or the original publisher's system) in order to corrupt both the published artifact *and* the publisher's record of the expected artifact hash. > > It would also be enough to let publishers check that the artifacts that PyPI is serving match what they originally uploaded - treating it as a QA problem as much as a security one. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Mar 15 06:37:45 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Mar 2017 20:37:45 +1000 Subject: [Distutils] PEP 426 moved back to Draft status In-Reply-To: References: Message-ID: On 15 March 2017 at 02:05, Nathaniel Smith wrote: > Having special metadata for "sdist creation-time dependencies" strikes > me as papering over the needless complexity of the current system by > adding more complexity on top. I can see how it'd have some short-term > benefits but it seems net-harmful in the long run IMHO. > How do you propose Warehouse should publish the static metadata? How should distlib abstract over the different metadata formats? Or perhaps I should just drop the whole section about "pysdist.json" files? It's orthogonal to the essential purpose of the PEP, and it seems to be confusing people more than it's helping (we *can* put these files in the sdists they describe, but we don't *have* to). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Wed Mar 15 13:06:42 2017 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 Mar 2017 17:06:42 +0000 Subject: [Distutils] GnuPG signatures on PyPI: why so few? In-Reply-To: <8D50A7F2-72AF-40E1-A68F-3BEDB6A9B7B9@twistedmatrix.com> References: <85r323gw48.fsf@benfinney.id.au> <85h92zge4o.fsf@benfinney.id.au> <85d1dmhevi.fsf@benfinney.id.au> <2CE39A31-C1AC-4909-833B-4B09457FD785@twistedmatrix.com> <99F7634D-17C5-4344-A6C3-0FF318FA5BFB@twistedmatrix.com> <8D50A7F2-72AF-40E1-A68F-3BEDB6A9B7B9@twistedmatrix.com> Message-ID: Or they could be printed as QR codes. 
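Pulling together two ideas from this thread -- Ed25519 signatures with TOFU pinning of the verify key, and a signing key derived from a passphrase via a KDF so that a lost laptop is recoverable -- a rough sketch using the PyNaCl library (not wheel's actual 'wheel sign' implementation) could look like:

    import nacl.encoding
    import nacl.pwhash
    import nacl.signing

    def signing_key_from_passphrase(passphrase, salt):
        # derive a deterministic 32-byte Ed25519 seed from the passphrase;
        # the salt must be 16 bytes for argon2id
        seed = nacl.pwhash.argon2id.kdf(
            32, passphrase, salt,
            opslimit=nacl.pwhash.argon2id.OPSLIMIT_INTERACTIVE,
            memlimit=nacl.pwhash.argon2id.MEMLIMIT_INTERACTIVE)
        return nacl.signing.SigningKey(seed)

    key = signing_key_from_passphrase(b"correct horse battery staple",
                                      b"per-user-salt-16")
    signed = key.sign(b"bytes of the artifact being uploaded")

    # the hex-encoded verify key is what a TOFU client would pin on first
    # use, or what would be published at the key URL mentioned above
    vk_hex = key.verify_key.encode(nacl.encoding.HexEncoder)
    nacl.signing.VerifyKey(vk_hex, encoder=nacl.encoding.HexEncoder).verify(signed)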
An interesting secure time service: https://roughtime.googlesource.com/roughtime On Wed, Mar 15, 2017 at 1:49 AM Glyph Lefkowitz wrote: > The big problem here, of course, is "key management"; what happens when > someone throws their laptop in a river. > > https://github.com/ahf/teneo indicates to me that it may be possible to > use a KDF to get an Ed25519 key from a passphrase that the user remembers, > minilock-style, largely mitigating that problem, assuming we can get users > to remember stuff :-). > > -g > > On Mar 14, 2017, at 7:35 AM, Daniel Holth wrote: > > The wheel command implements but never fully realized the commands 'wheel > keygen', 'wheel sign' for a bundled signature scheme (where the signature > is inside the signed file) inspired by JAR signing and based on Ed25519 > primitives + JSON web signature / JSON web key. The idea was to have wheel > automatically generate a signing key and always generate signed wheels, > since it's impossible to verify signatures if there are none. Successive > releases from the same author would tend to use the same keys; a TOFU > (trust on first use) model, a-la ssh, would warn you if the key changed. > The public keys would be distributed over a separate https:// server > (perhaps the publisher's personal web page, or an application could publish > a list of public keys for its dependencies as-tested). Instead of checking > the hash of an exact release artifact, you could use a similar syntax to > check against a particular public key and cover yourself for future > releases. Instead of key revocation, you could let the only valid signing > keys be the ones currently available at the key URL, like oauth2 > https://www.googleapis.com/oauth2/v3/certs > > The goal you'd want to shoot for is not 'is this package good' but 'am I > being targeted'. A log of timestamp signatures for everything uploaded to > PyPI could be very powerful here and might even be useful without publisher > signatures, so that you could at least know that you are downloading the > same reasonably old version of package X that everyone else is using. If > there was a publisher signature, the timestamp server would sign the > publisher's signature asserting 'this signature was valid at time X'. > > On Tue, Mar 14, 2017 at 2:52 AM Nick Coghlan wrote: > > On 14 March 2017 at 15:48, Glyph Lefkowitz > wrote: > > > 2. Except, as stated - i.e. hashes without signatures - this just means we > all trust Github rather than PyPI :). > > > Yeah, HTTPS would still be a common point of compromise - that kind of > simple scheme would just let the repo hosting and PyPI serve as > cross-checks on each other, such that you had to compromise both (or the > original publisher's system) in order to corrupt both the published > artifact *and* the publisher's record of the expected artifact hash. > > It would also be enough to let publishers check that the artifacts that > PyPI is serving match what they originally uploaded - treating it as a QA > problem as much as a security one. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jwilk at jwilk.net Wed Mar 15 13:36:14 2017 From: jwilk at jwilk.net (Jakub Wilk) Date: Wed, 15 Mar 2017 18:36:14 +0100 Subject: [Distutils] GnuPG signatures on PyPI: why so few? 
In-Reply-To: References: <85r323gw48.fsf@benfinney.id.au> Message-ID: <20170315173614.d7umatuvdep4mzcl@jwilk.net> * Ian Cordasco , 2017-03-11, 20:26: >What prospects are there for PyPI to have GnuPG-signed packages by default? Could you clarify what do you mean by "by default"? Do you mean that people who want to upload unsigned packages would have to jump through extra hoops, or something else? >Debian's UScan has the ability to find, download, and verify the GnuPG >signature for a package source release. FWIW, it's not only Debian. OpenSUSE and Arch (and hopefully all other major distros) have tools to automatically verify upstream OpenPGP signatures, too. -- Jakub Wilk From lele at metapensiero.it Thu Mar 16 08:13:05 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Thu, 16 Mar 2017 13:13:05 +0100 Subject: [Distutils] Best practice to build binary wheels on Github+Travis and upload to PyPI References: <871su1pkxh.fsf@metapensiero.it> Message-ID: <87a88l1ktq.fsf@metapensiero.it> Ralf Gommers writes: > Multibuild is probably the best place to start: > https://github.com/matthew-brett/multibuild Thank you Ralf, I will surely do some experiments building (pun intended) on top of that! ciao, lele. -- nickname: Lele Gaifax | Quando vivr? di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From opensource at ronnypfannschmidt.de Fri Mar 17 05:58:50 2017 From: opensource at ronnypfannschmidt.de (Ronny Pfannschmidt) Date: Fri, 17 Mar 2017 10:58:50 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location Message-ID: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Hi everyone, while looking over the recent peps i noticed that we keep a few inherent inefficiencies in where to find dist-info folders because they include version numbers, to get a distribution we have to search for it which is no longer really sensible as we no longer have multi-version installation in any upcoming standard. in order to address that i'd like to propose to switch from "{distribution}-{version}.dist-info/" to "{distribution}.dist-info/" given that it has been used since quite a while i would prefer a quick feedback loop from the ML before thinking about writing a PEP. -- Ronny -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 06:32:44 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 10:32:44 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 09:58, Ronny Pfannschmidt wrote: > while looking over the recent peps i noticed that we keep a few inherent > inefficiencies in where to find dist-info folders > > because they include version numbers, to get a distribution we have to > search for it > which is no longer really sensible as we no longer have multi-version > installation in any upcoming standard. > > in order to address that i'd like to propose to switch > > from "{distribution}-{version}.dist-info/" to "{distribution}.dist-info/" > > given that it has been used since quite a while i would prefer a quick > feedback loop from the ML before thinking about writing a PEP. +1 from me. And maybe explicitly state that installing multiple versions of a distribution is not supported. 
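To make the lookup cost concrete, a small sketch of what tools have to do today versus under the proposal ('requests' and the site-packages path are just example values):

    import glob
    import os.path

    site_packages = "/usr/lib/python3.6/site-packages"  # example path

    # today: the version is baked into the directory name, so tools must scan
    matches = glob.glob(os.path.join(site_packages, "requests-*.dist-info"))

    # proposed: one predictable name allows a direct existence check
    proposed = os.path.join(site_packages, "requests.dist-info")
    print(matches, os.path.isdir(proposed))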
Although this opens a somewhat larger can of worms, in that you can install different versions in separate directories - say in system and user site-packages - and that has subtle issues but is technically not rejected at the moment. So maybe restrict it to stating that installing multiple versions of a distribution *in the same directory* is not supported and duck the bigger issue for now. Paul From leorochael at gmail.com Fri Mar 17 08:50:02 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Fri, 17 Mar 2017 09:50:02 -0300 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 07:32, Paul Moore wrote: > On 17 March 2017 at 09:58, Ronny Pfannschmidt < > opensource at ronnypfannschmidt.de> wrote: > > [...] > > in order to address that i'd like to propose to switch > > > > from "{distribution}-{version}.dist-info/" to > "{distribution}.dist-info/" > > > > given that it has been used since quite a while i would prefer a quick > > feedback loop from the ML before thinking about writing a PEP. > > +1 from me. And maybe explicitly state that installing multiple > versions of a distribution is not supported. Although this opens a > somewhat larger can of worms, in that you can install different > versions in separate directories - say in system and user > site-packages - and that has subtle issues but is technically not > rejected at the moment. People today rely on being able to install different versions of packages already installed in other directories. System, vs User site-packages, as you mentioned is one example. The `--system-site-packages` switch to `virtualenv` is another. In my experience, many projects rely on pre-packaged hard-to-build system packages, while using virtualenv to install more up-to-date versions project dependencies. So maybe restrict it to stating that > installing multiple versions of a distribution *in the same directory* > is not supported and duck the bigger issue for now. > This is already the case everywhere. Even setuptools' `easy_install`, while capable of installing multiple versions of the same project in the same site-packages directory, is in reality installing each one to it's own `.egg` directory inside `site-packages` and can keep only one of them "active" at a time. ("active" meaning: importable without an explicit incantation to request a different installed version). I'm +0 on this proposal (the lack of enthusiasm coming from the fact that multiple projects will be affected), but I'm -lots on any proposal forbidding installation of different versions in different directories. Cheers, Leo -------------- next part -------------- An HTML attachment was scrubbed... URL: From freddyrietdijk at fridh.nl Fri Mar 17 09:04:23 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Fri, 17 Mar 2017 14:04:23 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: Users may want to split installations over multiple folders and creating a working environment by merging them (through symlinks / PYTHONPATH / sys.path / site / .pth). 
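For reference, the kind of merging described just above needs nothing more than the standard library (a sketch; the extra directory is an example path):

    import site
    import sys

    # appends the directory to sys.path and processes any .pth files in it
    site.addsitedir("/opt/shared/site-packages")
    print(sys.path[-1])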
Python doesn't support multiple versions of the same module during runtime, and therefore I don't see any benefit in including a "unique" identifier like the version number (which in fact isn't unique, because one could also have multiple instances of the same version but e.g. build differently). In any case, forbidding installation of different versions in different directories is not possible because, as I mentioned, one can create an environment by simply merging site-packages folders. On Fri, Mar 17, 2017 at 1:50 PM, Leonardo Rochael Almeida < leorochael at gmail.com> wrote: > > > On 17 March 2017 at 07:32, Paul Moore wrote: > >> On 17 March 2017 at 09:58, Ronny Pfannschmidt > ronnypfannschmidt.de> wrote: >> > [...] >> > in order to address that i'd like to propose to switch >> > >> > from "{distribution}-{version}.dist-info/" to >> "{distribution}.dist-info/" >> > >> > given that it has been used since quite a while i would prefer a quick >> > feedback loop from the ML before thinking about writing a PEP. >> >> +1 from me. And maybe explicitly state that installing multiple >> versions of a distribution is not supported. Although this opens a >> somewhat larger can of worms, in that you can install different >> versions in separate directories - say in system and user >> site-packages - and that has subtle issues but is technically not >> rejected at the moment. > > > People today rely on being able to install different versions of packages > already installed in other directories. System, vs User site-packages, as > you mentioned is one example. > > The `--system-site-packages` switch to `virtualenv` is another. In my > experience, many projects rely on pre-packaged hard-to-build system > packages, while using virtualenv to install more up-to-date versions > project dependencies. > > So maybe restrict it to stating that >> installing multiple versions of a distribution *in the same directory* >> is not supported and duck the bigger issue for now. >> > > This is already the case everywhere. Even setuptools' `easy_install`, > while capable of installing multiple versions of the same project in the > same site-packages directory, is in reality installing each one to it's own > `.egg` directory inside `site-packages` and can keep only one of > them "active" at a time. ("active" meaning: importable without an explicit > incantation to request a different installed version). > > I'm +0 on this proposal (the lack of enthusiasm coming from the fact that > multiple projects will be affected), but I'm -lots on any proposal > forbidding installation of different versions in different directories. > > Cheers, > > Leo > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 09:25:19 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 13:25:19 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 12:50, Leonardo Rochael Almeida wrote: > I'm +0 on this proposal (the lack of enthusiasm coming from the fact that > multiple projects will be affected), but I'm -lots on any proposal > forbidding installation of different versions in different directories. 
The comment I made was simply that having different versions in different directories *both of which are on sys.path at the same time* is invalid - it is now, and nothing I suggested changed anything, I was just suggesting documenting that fact. (That setup might work, as long as *all* of the files in the "inactive" version are completely shadowed by the active version, and yes that's normally the case, but nothing in the import or packaging infrastructure makes sure that's the case, so there's a risk of unexpected bugs). But whatever. If people don't want to document the restriction, that's OK. Doesn't mean it's going to work any better, of course :-) Paul From donald at stufft.io Fri Mar 17 09:34:13 2017 From: donald at stufft.io (Donald Stufft) Date: Fri, 17 Mar 2017 09:34:13 -0400 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> > On Mar 17, 2017, at 9:25 AM, Paul Moore wrote: > > On 17 March 2017 at 12:50, Leonardo Rochael Almeida > wrote: >> I'm +0 on this proposal (the lack of enthusiasm coming from the fact that >> multiple projects will be affected), but I'm -lots on any proposal >> forbidding installation of different versions in different directories. > > The comment I made was simply that having different versions in > different directories *both of which are on sys.path at the same time* > is invalid - it is now, and nothing I suggested changed anything, I > was just suggesting documenting that fact. (That setup might work, as > long as *all* of the files in the "inactive" version are completely > shadowed by the active version, and yes that's normally the case, but > nothing in the import or packaging infrastructure makes sure that's > the case, so there's a risk of unexpected bugs). > > But whatever. If people don't want to document the restriction, that's > OK. Doesn't mean it's going to work any better, of course :-) > > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig Hmm, I believe it generally works fine does it not? The only situations I can think of where it does something funny are: (1) PEP 420 namespace packages where a file was added or removed in one of the versions (since that is impossible to differentiate from two different projects using the same namespace) (2) Uninstalling/Installing the same package during the lifetime of a process (which is already going to break in weird ways). What scenarios are you seeing two installs of the same package into different sys.path directories fail? ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 10:00:53 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 14:00:53 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 17 March 2017 at 13:34, Donald Stufft wrote: > Hmm, I believe it generally works fine does it not? 
The only situations I > can think of where it does something funny are: > > (1) PEP 420 namespace packages where a file was added or removed in one of > the versions (since that is impossible to differentiate from two different > projects using the same namespace) > (2) Uninstalling/Installing the same package during the lifetime of a > process (which is already going to break in weird ways). > > What scenarios are you seeing two installs of the same package into > different sys.path directories fail? I don't have an actual failure, although I do think I've seen reports in the past - it's definitely something that's come up in previous discussions. As a theoretical example: foo 1.0 looks like this: foo __init__.py bar.py foo 2.0 moves the functionality of foo/bar.py into baz.py foo __init__.py baz.py Put both of these on sys.path, then you can successfully import foo.bar and foo.baz. Which is of course wrong. Furthermore, which version of foo/__init__.py gets imported depends on which version of foo is first on sys.path, so one of bar and baz will be using the wrong foo. IMO, of course, the answer is simply "don't do that". But I'm OK with simply leaving things as they stand if no-one else thinks it's worth making an issue of it. Paul From opensource at ronnypfannschmidt.de Fri Mar 17 10:04:03 2017 From: opensource at ronnypfannschmidt.de (Ronny Pfannschmidt) Date: Fri, 17 Mar 2017 15:04:03 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: <3617706d-5873-8a8f-8f9d-7e3187674628@ronnypfannschmidt.de> btw, thats irrelevant for dist-info either it gets found double and is a problem or its priority-overridden On 17.03.2017 15:00, Paul Moore wrote: > On 17 March 2017 at 13:34, Donald Stufft wrote: >> Hmm, I believe it generally works fine does it not? The only situations I >> can think of where it does something funny are: >> >> (1) PEP 420 namespace packages where a file was added or removed in one of >> the versions (since that is impossible to differentiate from two different >> projects using the same namespace) >> (2) Uninstalling/Installing the same package during the lifetime of a >> process (which is already going to break in weird ways). >> >> What scenarios are you seeing two installs of the same package into >> different sys.path directories fail? > I don't have an actual failure, although I do think I've seen reports > in the past - it's definitely something that's come up in previous > discussions. > > As a theoretical example: > > foo 1.0 looks like this: > > foo > __init__.py > bar.py > > foo 2.0 moves the functionality of foo/bar.py into baz.py > > foo > __init__.py > baz.py > > Put both of these on sys.path, then you can successfully import > foo.bar and foo.baz. Which is of course wrong. Furthermore, which > version of foo/__init__.py gets imported depends on which version of > foo is first on sys.path, so one of bar and baz will be using the > wrong foo. > > IMO, of course, the answer is simply "don't do that". But I'm OK with > simply leaving things as they stand if no-one else thinks it's worth > making an issue of it. 
> Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From p.f.moore at gmail.com Fri Mar 17 10:11:26 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 14:11:26 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <3617706d-5873-8a8f-8f9d-7e3187674628@ronnypfannschmidt.de> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> <3617706d-5873-8a8f-8f9d-7e3187674628@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 14:04, Ronny Pfannschmidt wrote: > btw, thats irrelevant for dist-info > either it gets found double and is a problem > or its priority-overridden Agreed this is not relevant for dist-info (and sorry for diverting the discussion into a side-issue). For dist-info I see no need to version it, and I'm +1 on the renaming you propose. Paul From ncoghlan at gmail.com Fri Mar 17 10:35:05 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 00:35:05 +1000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: On 17 March 2017 at 19:58, Ronny Pfannschmidt < opensource at ronnypfannschmidt.de> wrote: > Hi everyone, > > while looking over the recent peps i noticed that we keep a few inherent > inefficiencies in where to find dist-info folders > > because they include version numbers, to get a distribution we have to > search for it > which is no longer really sensible as we no longer have multi-version > installation in any upcoming standard. > Linux distros still use multi-version installation fairly regularly - it's how services like EPEL are able to offer parallel installs of frameworks and libraries that are also in base RHEL/CentOS without breaking anything. The associated code to populate __main__.__requires__ and hence get pkg_resources.require() to do the right thing isn't pretty, but it *does* work. While I expect tech like virtual environments, Software Collections, FlatPak, Snappy, etc, to eventually get us to the point where even Linux distros don't need parallel installs into the system site-packages any more, we're still a *looong* way from it being reasonable to assume that we can just drop parallel install support from the Python packaging tools in general. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 17 10:40:33 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 00:40:33 +1000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 18 March 2017 at 00:00, Paul Moore wrote: > As a theoretical example: > > foo 1.0 looks like this: > > foo > __init__.py > bar.py > > foo 2.0 moves the functionality of foo/bar.py into baz.py > > foo > __init__.py > baz.py > > Put both of these on sys.path, then you can successfully import > foo.bar and foo.baz. Which is of course wrong. 
Furthermore, which > version of foo/__init__.py gets imported depends on which version of > foo is first on sys.path, so one of bar and baz will be using the > wrong foo. > Unless the __init__.py has its own __path__ extension code, whichever version of "foo" is first on sys.path will "win", and you won't be able to import from the other one (so you'll be able to import "foo.bar" or "foo.baz", but not both). That's not an accident, it's behaviour that was deliberately kept for backwards compatibility reasons when PEP 420's native namespace package support was being designed. You only get the "you can import both of them" behaviour if "foo" is a namespace package, at which point "foo" itself doesn't really have a version any more. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Mar 17 10:47:48 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 17 Mar 2017 14:47:48 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 17 March 2017 at 14:40, Nick Coghlan wrote: > Unless the __init__.py has its own __path__ extension code, whichever > version of "foo" is first on sys.path will "win", and you won't be able to > import from the other one (so you'll be able to import "foo.bar" or > "foo.baz", but not both). That's not an accident, it's behaviour that was > deliberately kept for backwards compatibility reasons when PEP 420's native > namespace package support was being designed. Really? OK, I feel stupid now, I've been making a fuss over something that's actually not possible. I should have tested this. My apologies (in my defense, I could have sworn I remembered someone else making precisely this point sometime in the past, but I guess I'll have to put that down to advancing age and brain decay...) My apologies, I stand corrected. Paul From dholth at gmail.com Fri Mar 17 11:14:04 2017 From: dholth at gmail.com (Daniel Holth) Date: Fri, 17 Mar 2017 15:14:04 +0000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: At least *.egg-info works without-a-version and will be written that way when installed in development mode. As a shortcut when possible pkg_resources would read the package name and version number from the filename and not bother looking inside the metadata file until necessary. I don't recall whether the same can be said of *.dist-info. You will probably find that pkg_resources always does a listdir for each entry on sys.path when it runs. You might not be able to avoid that. On Fri, Mar 17, 2017 at 10:48 AM Paul Moore wrote: On 17 March 2017 at 14:40, Nick Coghlan wrote: > Unless the __init__.py has its own __path__ extension code, whichever > version of "foo" is first on sys.path will "win", and you won't be able to > import from the other one (so you'll be able to import "foo.bar" or > "foo.baz", but not both). That's not an accident, it's behaviour that was > deliberately kept for backwards compatibility reasons when PEP 420's native > namespace package support was being designed. Really? OK, I feel stupid now, I've been making a fuss over something that's actually not possible. 
I should have tested this. My apologies (in my defense, I could have sworn I remembered someone else making precisely this point sometime in the past, but I guess I'll have to put that down to advancing age and brain decay...) My apologies, I stand corrected. Paul _______________________________________________ Distutils-SIG maillist - Distutils-SIG at python.org https://mail.python.org/mailman/listinfo/distutils-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Fri Mar 17 13:19:02 2017 From: robin at reportlab.com (Robin Becker) Date: Fri, 17 Mar 2017 17:19:02 +0000 Subject: [Distutils] reproducible builds Message-ID: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> An issue has been raised for reportlab to support a specific environment variable namely SOURCE_DATE_EPOCH. The intent is that we should get our time from this variable rather than time.localtime(time.time()) so that produced documents are more invariant. First off is this a reasonable request? The variable is defined by debian here https://reproducible-builds.org/specs/source-date-epoch/ What happens if other distros decide not to use this environment variable? Do I really want distro specific code in the package? In addition we already have our own mechanism for making the produced documents invariant although it might require an extension to support externally specified date & time as in the debian variable. In short where does the distro responsibility and package maintainers boundary need to be? -- Robin Becker From bussonniermatthias at gmail.com Fri Mar 17 13:33:37 2017 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Fri, 17 Mar 2017 10:33:37 -0700 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: On Fri, Mar 17, 2017 at 10:19 AM, Robin Becker wrote: > An issue has been raised for reportlab to support a specific environment > variable namely SOURCE_DATE_EPOCH. The intent is that we should get our time > from this variable rather than time.localtime(time.time()) so that produced > documents are more invariant. > > First off is this a reasonable request? The variable is defined by debian > here https://reproducible-builds.org/specs/source-date-epoch/ > > What happens if other distros decide not to use this environment variable? > Do I really want distro specific code in the package? For what it is worth, it seem like it will make its way into CPython as well: https://github.com/python/cpython/pull/296 And IFAICT, this env variable naming is already more than just debian. -- M > > In addition we already have our own mechanism for making the produced > documents > invariant although it might require an extension to support externally > specified date & time as in the debian variable. > > In short where does the distro responsibility and package maintainers > boundary need to be? 
> -- > Robin Becker > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From opensource at ronnypfannschmidt.de Fri Mar 17 13:40:41 2017 From: opensource at ronnypfannschmidt.de (Ronny Pfannschmidt) Date: Fri, 17 Mar 2017 18:40:41 +0100 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> Message-ID: <899f9a2f-fd31-6514-78d7-570a8614bc05@ronnypfannschmidt.de> On 17.03.2017 15:35, Nick Coghlan wrote: > On 17 March 2017 at 19:58, Ronny Pfannschmidt > > wrote: > > Hi everyone, > > while looking over the recent peps i noticed that we keep a few > inherent inefficiencies in where to find dist-info folders > > because they include version numbers, to get a distribution we > have to search for it > which is no longer really sensible as we no longer have > multi-version installation in any upcoming standard. > > Linux distros still use multi-version installation fairly regularly - > it's how services like EPEL are able to offer parallel installs of > frameworks and libraries that are also in base RHEL/CentOS without > breaking anything. > > The associated code to populate __main__.__requires__ and hence get > pkg_resources.require() to do the right thing isn't pretty, but it > *does* work. > as far as i understood, such dreaded code just fixes up sys.path, and thus the precedence will solve the issue dropping version strings from dist info does not prevent walking sys.path in order so i don't think it will break anything. note that im not talking about dropping general multi version install or setuptools multi version install, im talking about removing the version number from the dist-info folder as for all installation schemes that use it, there is exactly one version in precedence order on sys.path Cheers, Ronny > While I expect tech like virtual environments, Software Collections, > FlatPak, Snappy, etc, to eventually get us to the point where even > Linux distros don't need parallel installs into the system > site-packages any more, we're still a *looong* way from it being > reasonable to assume that we can just drop parallel install support > from the Python packaging tools in general. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com > | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at kluyver.me.uk Fri Mar 17 13:49:09 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Fri, 17 Mar 2017 17:49:09 +0000 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <1489772949.570750.914848472.77422451@webmail.messagingengine.com> Flit already supports $SOURCE_DATE_EPOCH for building wheels. I think the environment variable is a good idea: if it gets wide support, you will be able to set a single thing to affect lots of different build tools, rather than working out where you need to add command line arguments to half a dozen different build steps. Thomas On Fri, Mar 17, 2017, at 05:33 PM, Matthias Bussonnier wrote: > On Fri, Mar 17, 2017 at 10:19 AM, Robin Becker > wrote: > > An issue has been raised for reportlab to support a specific environment > > variable namely SOURCE_DATE_EPOCH. 
The intent is that we should get our time > > from this variable rather than time.localtime(time.time()) so that produced > > documents are more invariant. > > > > First off is this a reasonable request? The variable is defined by debian > > here https://reproducible-builds.org/specs/source-date-epoch/ > > > > What happens if other distros decide not to use this environment variable? > > Do I really want distro specific code in the package? > > For what it is worth, it seem like it will make its way into CPython as > well: > https://github.com/python/cpython/pull/296 > > And IFAICT, this env variable naming is already more than just debian. > > -- > M > > > > > > In addition we already have our own mechanism for making the produced > > documents > > invariant although it might require an extension to support externally > > specified date & time as in the debian variable. > > > > In short where does the distro responsibility and package maintainers > > boundary need to be? > > -- > > Robin Becker > > _______________________________________________ > > Distutils-SIG maillist - Distutils-SIG at python.org > > https://mail.python.org/mailman/listinfo/distutils-sig > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From doko at ubuntu.com Fri Mar 17 13:46:19 2017 From: doko at ubuntu.com (Matthias Klose) Date: Fri, 17 Mar 2017 18:46:19 +0100 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <314e6cba-78c5-bfc9-1cd5-b8fe92ca86d6@ubuntu.com> On 17.03.2017 18:19, Robin Becker wrote: > An issue has been raised for reportlab to support a specific environment > variable namely SOURCE_DATE_EPOCH. The intent is that we should get our time > from this variable rather than time.localtime(time.time()) so that produced > documents are more invariant. > > First off is this a reasonable request? The variable is defined by debian here > https://reproducible-builds.org/specs/source-date-epoch/ > > What happens if other distros decide not to use this environment variable? Do I > really want distro specific code in the package? > > In addition we already have our own mechanism for making the produced documents > invariant although it might require an extension to support externally specified > date & time as in the debian variable. > > In short where does the distro responsibility and package maintainers boundary > need to be? the reproducible-builds thing is not just a Debian thing, it's supported by other distros and upstream projects. Matthias From freddyrietdijk at fridh.nl Fri Mar 17 13:55:08 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Fri, 17 Mar 2017 18:55:08 +0100 Subject: [Distutils] reproducible builds In-Reply-To: <314e6cba-78c5-bfc9-1cd5-b8fe92ca86d6@ubuntu.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <314e6cba-78c5-bfc9-1cd5-b8fe92ca86d6@ubuntu.com> Message-ID: Nixpkgs [1] uses SOURCE_DATE_EPOCH as well. We can reproducibly build the Python interpreter (and packages with [2]). [1] https://github.com/NixOS/nixpkgs [2] https://bitbucket.org/pypa/wheel/pull-requests/77 On Fri, Mar 17, 2017 at 6:46 PM, Matthias Klose wrote: > On 17.03.2017 18:19, Robin Becker wrote: > > An issue has been raised for reportlab to support a specific environment > > variable namely SOURCE_DATE_EPOCH. 
The intent is that we should get our > time > > from this variable rather than time.localtime(time.time()) so that > produced > > documents are more invariant. > > > > First off is this a reasonable request? The variable is defined by > debian here > > https://reproducible-builds.org/specs/source-date-epoch/ > > > > What happens if other distros decide not to use this environment > variable? Do I > > really want distro specific code in the package? > > > > In addition we already have our own mechanism for making the produced > documents > > invariant although it might require an extension to support externally > specified > > date & time as in the debian variable. > > > > In short where does the distro responsibility and package maintainers > boundary > > need to be? > > the reproducible-builds thing is not just a Debian thing, it's supported by > other distros and upstream projects. > > Matthias > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dw+distutils-sig at hmmz.org Fri Mar 17 13:49:28 2017 From: dw+distutils-sig at hmmz.org (David Wilson) Date: Fri, 17 Mar 2017 17:49:28 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <20170317174928.GA3392@k3> Hey Robin, > What happens if other distros decide not to use this environment variable? > Do I really want distro specific code in the package? AFAIK this is seeing a great deal of use outside of Debian and even Linux, for instance GCC also supports this variable. > In short where does the distro responsibility and package maintainers > boundary need to be? I guess it mostly comes down to whether you'd like them to carry the debt of a vendor patch to implement the behaviour for you in a way you don't like, or you'd prefer to retain full control. :) So it's more a preference than a responsibility. David From leorochael at gmail.com Fri Mar 17 17:04:11 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Fri, 17 Mar 2017 18:04:11 -0300 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 17 March 2017 at 11:47, Paul Moore wrote: > On 17 March 2017 at 14:40, Nick Coghlan wrote: > > [...] whichever > > version of "foo" is first on sys.path will "win", and you won't be able > to > > import from the other one (so you'll be able to import "foo.bar" or > > "foo.baz", but not both). [...] > > Really? OK, I feel stupid now, I've been making a fuss over something > that's actually not possible. I should have tested this. My apologies > (in my defense, I could have sworn I remembered someone else making > precisely this point sometime in the past, but I guess I'll have to > put that down to advancing age and brain decay...) > Well, as Nick mentioned, if the `foo` Python package is a namespace package in both foo 1.0 and foo 2.0 distributions, then, yes, both `bar` and `baz` would be importable, and this is a case that should be documented somewhere. So, your point is not without merit. 
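(The difference is easy to demonstrate. The sketch below is illustrative only - it builds throwaway copies of the hypothetical layouts from earlier in the thread in a temporary directory and needs Python 3.5+; with regular packages the first `foo` on sys.path shadows the other completely, while with PEP 420 namespace packages the two directories merge and both submodules import.)

```python
import os
import subprocess
import sys
import tempfile

def make_tree(root, with_init, submodule):
    # Create <root>/foo/[__init__.py,]<submodule>.py and return <root>.
    pkg = os.path.join(root, "foo")
    os.makedirs(pkg)
    if with_init:
        open(os.path.join(pkg, "__init__.py"), "w").close()
    open(os.path.join(pkg, submodule + ".py"), "w").close()
    return root

def try_imports(first, second):
    # Fresh interpreter with both trees on sys.path; report what imports.
    code = ("import importlib\n"
            "for mod in ('foo.bar', 'foo.baz'):\n"
            "    try:\n"
            "        importlib.import_module(mod)\n"
            "        print(mod, 'OK')\n"
            "    except ImportError:\n"
            "        print(mod, 'missing')\n")
    env = dict(os.environ, PYTHONPATH=os.pathsep.join([first, second]))
    subprocess.run([sys.executable, "-c", code], env=env)

base = tempfile.mkdtemp()
# Regular packages (both have __init__.py): only foo.bar is importable.
try_imports(make_tree(os.path.join(base, "v1"), True, "bar"),
            make_tree(os.path.join(base, "v2"), True, "baz"))
# PEP 420 namespace packages (no __init__.py): both submodules import.
try_imports(make_tree(os.path.join(base, "n1"), False, "bar"),
            make_tree(os.path.join(base, "n2"), False, "baz"))
```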
However, most projects with namespace packages tend to be careful with the mapping between packages and dists as the namespace package itself then becomes a shared space that should be populated carefully, so such an issue should be exceedingly rare. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Mar 18 03:20:54 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 17:20:54 +1000 Subject: [Distutils] reproducible builds In-Reply-To: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: On 18 March 2017 at 03:19, Robin Becker wrote: > An issue has been raised for reportlab to support a specific environment > variable namely SOURCE_DATE_EPOCH. The intent is that we should get our > time from this variable rather than time.localtime(time.time()) so that > produced documents are more invariant. > > First off is this a reasonable request? The variable is defined by debian > here https://reproducible-builds.org/specs/source-date-epoch/ > > What happens if other distros decide not to use this environment variable? > Do I really want distro specific code in the package? > While the reproducible builds effort started in Debian and is furthest advanced there, it's not distro specific - interested developers working on other distros were already looking into it, and the Core Infrastructure Initiative has backed it as one of their security assurance initiatives. Software Freedom Conservancy have a decent write-up on the current state of things after December's Reproducible Builds Summit: https://sfconservancy.org/blog/2016/dec/26/reproducible-builds-summit-report/ However, you'll probably want to make yourself a helper function that uses SOURCE_DATE_EPOCH if defined, and falls back to the current time otherwise. That way you'll get reproducible behaviour when a build system configures the setting, while retaining your current behaviour for environments that don't. Cheers, Nick. P.S. A question well worth asking for *us* is whether or not setting SOURCE_DATE_EPOCH appropriately (if it isn't already set in the current environment) should be part of the build system abstraction PEPs. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Mar 18 03:49:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Mar 2017 17:49:48 +1000 Subject: [Distutils] [proposal] version-free + lookup-friendly dist-info location In-Reply-To: References: <8b536727-9029-dbce-6830-6e76551dfffd@ronnypfannschmidt.de> <883AD099-4647-400D-8AD7-3895F5CA138C@stufft.io> Message-ID: On 18 March 2017 at 07:04, Leonardo Rochael Almeida wrote: > On 17 March 2017 at 11:47, Paul Moore wrote: > >> On 17 March 2017 at 14:40, Nick Coghlan wrote: >> > [...] whichever >> > version of "foo" is first on sys.path will "win", and you won't be able >> to >> > import from the other one (so you'll be able to import "foo.bar" or >> > "foo.baz", but not both). [...] >> >> Really? OK, I feel stupid now, I've been making a fuss over something >> that's actually not possible. I should have tested this. My apologies >> (in my defense, I could have sworn I remembered someone else making >> precisely this point sometime in the past, but I guess I'll have to >> put that down to advancing age and brain decay...) 
>> > > Well, as Nick mentioned, if the `foo` Python package is a namespace > package in both foo 1.0 and foo 2.0 distributions, then, yes, both `bar` > and `baz` would be importable, and this is a case that should be documented > somewhere. > > So, your point is not without merit. > If I recall correctly, it was also a problem in some of the suggestions made during the discussions leading up to the acceptance of PEP 420 namespace packages, and one of the deciding factors in ruling out the "execute all __init__.py files found in the order they're encountered on sys.path" option. I've been caught by that before myself, where I was reasoning about a later problem based on a design variant we ended up rejecting, rather than the approach that was actually implemented. Cheers, Nick. > -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sat Mar 18 11:40:08 2017 From: donald at stufft.io (Donald Stufft) Date: Sat, 18 Mar 2017 11:40:08 -0400 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: > On Mar 18, 2017, at 3:20 AM, Nick Coghlan wrote: > > P.S. A question well worth asking for *us* is whether or not setting SOURCE_DATE_EPOCH appropriately (if it isn't already set in the current environment) should be part of the build system abstraction PEPs. > If it?s getting standard use (and it sounds like it is), then I think it should yes. ? Donald Stufft -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Mon Mar 20 05:00:59 2017 From: robin at reportlab.com (Robin Becker) Date: Mon, 20 Mar 2017 09:00:59 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <20170317174928.GA3392@k3> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> Message-ID: <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> On 17/03/2017 17:49, David Wilson wrote: > Hey Robin, > >> What happens if other distros decide not to use this environment variable? >> Do I really want distro specific code in the package? > > AFAIK this is seeing a great deal of use outside of Debian and even > Linux, for instance GCC also supports this variable. > > >> In short where does the distro responsibility and package maintainers >> boundary need to be? > > I guess it mostly comes down to whether you'd like them to carry the > debt of a vendor patch to implement the behaviour for you in a way you > don't like, or you'd prefer to retain full control. :) So it's more a > preference than a responsibility. > > > David > . > I think I accept the need to support this variable. Our original use case was for testing purposes where we altered dates injected into the produced pdf meta data and also in some cases the content. 
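(A minimal sketch of the kind of helper Nick suggested earlier in the thread - the function name is invented, not reportlab's actual API, and per the spec SOURCE_DATE_EPOCH holds an integer number of seconds since the Unix epoch, interpreted as UTC:)

```python
import os
from datetime import datetime, timezone

def build_timestamp():
    """Return the datetime to stamp into generated output.

    Uses SOURCE_DATE_EPOCH when a build system sets it, so the output is
    reproducible, and falls back to the current time otherwise.
    """
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        return datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return datetime.now(tz=timezone.utc)
```

A build (or a test run) can then pin the embedded dates by exporting, say, SOURCE_DATE_EPOCH=0 before invoking the generator.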
However, if that is the implied intent of the debian variable then I will also need to modify the behaviour of some other tests eg in one case the produced pdf output looks like this > The value of i is not larger than 3 > The value of i is equal to 3 > The value of i is not less than 3 > The value of i is 3 > The value of i is 2 > The value of i is 1 > {'doc': 0x00000000093D0240>, 'currentFrame': 'normal', 'currentPageTemplate': 'First', 'aW': > 439.27559055118104, 'aH': 685.8897637795275, 'aWH': (439.27559055118104, > 685.8897637795275), 'i': 0, 'availableWidth': 439.27559055118104, 'availableHeight': > 619.8897637795275} > The current page number is 1 ie we are introspecting internals and injecting that into the document content. I imagine I need to clean up the reporting to avoid getting addresses etc etc into the documents. Obviously if I have the ability to embed repr(some_object) into the document output then it will vary (unless the underlying python is reproducible). I'm not sure if debian runs the whole reportlab test suite, but it makes sense to get this kind of variablity out. When we make significant changes to existing behaviours our current workflow consists of generating a large number of outputs and then rendering them into jpeg pages with ghost script. Differences in the jpegs can be used to spot problems. -- Robin Becker From robin at reportlab.com Mon Mar 20 07:30:59 2017 From: robin at reportlab.com (Robin Becker) Date: Mon, 20 Mar 2017 11:30:59 +0000 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> Message-ID: <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> On 18/03/2017 07:20, Nick Coghlan wrote: ........... >> > > While the reproducible builds effort started in Debian and is furthest > advanced there, it's not distro specific - interested developers working on > other distros were already looking into it, and the Core Infrastructure > Initiative has backed it as one of their security assurance initiatives. > Software Freedom Conservancy have a decent write-up on the current state of > things after December's Reproducible Builds Summit: > https://sfconservancy.org/blog/2016/dec/26/reproducible-builds-summit-report/ thanks for this; it seems the emphasis is on security. If the intent is that reportlab should be able to reliably reproduce the same binary output then I think I need to do more than just fix a couple of dates. We use many dictionary like objects to produce PDF and I am not sure all are sorted by key during output. Is there a way to excite dictionary ordering changes? I believe there was some way to modify the hashing introduced when the dos dictionary attacks were an issue. Would it be sufficient to generate documents with say Python 2.7 and check against 3.6? > > However, you'll probably want to make yourself a helper function that uses > SOURCE_DATE_EPOCH if defined, and falls back to the current time otherwise. > That way you'll get reproducible behaviour when a build system configures > the setting, while retaining your current behaviour for environments that > don't. > good advice and that's what I am doing. > Cheers, > Nick. > > P.S. A question well worth asking for *us* is whether or not setting > SOURCE_DATE_EPOCH appropriately (if it isn't already set in the current > environment) should be part of the build system abstraction PEPs. 
> -- Robin Becker From thomas at kluyver.me.uk Mon Mar 20 07:35:06 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Mon, 20 Mar 2017 11:35:06 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> Message-ID: <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> On Mon, Mar 20, 2017, at 09:00 AM, Robin Becker wrote: > Obviously if I have the ability to embed repr(some_object) > into the document output then it will vary (unless the underlying python > is reproducible). I'm not sure if debian runs the whole reportlab test > suite, but it makes sense to get this kind of variablity out. AIUI, it's fine to have the *ability* to produce non-deterministic output, and it doesn't matter if your tests do that. The aim of reproducible builds is to be able to go from the same source code to an identical binary package. Documents generated by running the tests are presumably not included in binary packages, so it doesn't matter if they change. > I believe there was some way to modify the hashing introduced when the dos dictionary attacks were an issue. The PYTHONHASHSEED environment variable: https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED If you have non-determinism introduced by Python hashing, setting a constant value of PYTHONHASHSEED should be an easy way to work around it. From freddyrietdijk at fridh.nl Mon Mar 20 07:46:10 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Mon, 20 Mar 2017 12:46:10 +0100 Subject: [Distutils] reproducible builds In-Reply-To: <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> Message-ID: As Thomas mentioned PYTHONHASHSEED is sufficient to solve non-determinism by the hashing. In my experience this hashing, along with datetimes (e.g. in the bytecode) are typically the only causes of non-determinism in Python packages. Someone from I think Debian did mention [1] that they cannot always set PYTHONHASHSEED and so in certain cases they apply patches to fix non-determinism. This is what they might be after in the case of `reportlab` but you best ask them. I'm not yet sure what to think of that patching approach. E.g., if one couldn't set PYTHONHASHSEED when building the bytecode in the interpreter itself, then one would have to convert all sets to lists with potential negative performance effects. On Mon, Mar 20, 2017 at 12:35 PM, Thomas Kluyver wrote: > On Mon, Mar 20, 2017, at 09:00 AM, Robin Becker wrote: > > Obviously if I have the ability to embed repr(some_object) > > into the document output then it will vary (unless the underlying python > > is reproducible). I'm not sure if debian runs the whole reportlab test > > suite, but it makes sense to get this kind of variablity out. > > AIUI, it's fine to have the *ability* to produce non-deterministic > output, and it doesn't matter if your tests do that. The aim of > reproducible builds is to be able to go from the same source code to an > identical binary package. 
Documents generated by running the tests are > presumably not included in binary packages, so it doesn't matter if they > change. > > > I believe there was some way to modify the hashing introduced when the > dos dictionary attacks were an issue. > > The PYTHONHASHSEED environment variable: > https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED > > If you have non-determinism introduced by Python hashing, setting a > constant value of PYTHONHASHSEED should be an easy way to work around > it. > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Mon Mar 20 09:02:34 2017 From: robin at reportlab.com (Robin Becker) Date: Mon, 20 Mar 2017 13:02:34 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> Message-ID: <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> On 20/03/2017 11:35, Thomas Kluyver wrote: > On Mon, Mar 20, 2017, at 09:00 AM, Robin Becker wrote: >> Obviously if I have the ability to embed repr(some_object) >> into the document output then it will vary (unless the underlying python >> is reproducible). I'm not sure if debian runs the whole reportlab test >> suite, but it makes sense to get this kind of variablity out. > > AIUI, it's fine to have the *ability* to produce non-deterministic > output, and it doesn't matter if your tests do that. The aim of > reproducible builds is to be able to go from the same source code to an > identical binary package. Documents generated by running the tests are > presumably not included in binary packages, so it doesn't matter if they > change. > Well now I am confused. The date / times mentioned in the debian patch are those we force into the documents produced by the reportlab package when it is used. They would not normally be part of the package itself. Although the reportlab documentation is available in the source I'm fairly sure we don't include it in the wheels. Of course if the debian packaging includes output created by reportlab then that document would receive the current (ie variable) time. In addition any random behaviour created by the reportlab generation code would also be embedded in the document. If the debian variable is intended create reproducible PDF as part of their packaging of reportlab or some other package then I'm fairly sure that other variation will need to be checked in addition to the control that the SOURCE_DATE_EPOCH variable would give. Perhaps Matthias could comment; I know little about how the debian packaging works. >> I believe there was some way to modify the hashing introduced when the dos dictionary attacks were an issue. > > The PYTHONHASHSEED environment variable: > https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED > > If you have non-determinism introduced by Python hashing, setting a > constant value of PYTHONHASHSEED should be an easy way to work around > it. > Well years ago we tried to get some random behaviour in text selection by setting a seed value eg 23......22 (but that doesn't work across pythons). 
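(On the hashing side, a cheap way to check whether hash randomisation is what leaks into the output is to render the same document twice under two different fixed seeds and compare the bytes. The sketch below is illustrative only: render_pdf.py stands in for whatever script produces the document and is not a real reportlab entry point. If the digests differ, emitting the offending dicts and sets in sorted order is the fix that holds on every interpreter, independent of PYTHONHASHSEED.)

```python
import hashlib
import os
import subprocess
import sys

def digest_with_seed(seed):
    # Run the generator in a fresh interpreter with a fixed hash seed and a
    # fixed SOURCE_DATE_EPOCH, then fingerprint the bytes it produced.
    env = dict(os.environ, PYTHONHASHSEED=str(seed), SOURCE_DATE_EPOCH="0")
    subprocess.run([sys.executable, "render_pdf.py", "out.pdf"],
                   env=env, check=True)
    with open("out.pdf", "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

if digest_with_seed(1) == digest_with_seed(2):
    print("output does not depend on hash randomisation")
else:
    print("output varies with the hash seed; sort dict/set contents on output")
```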
I guess the algorithm variation across pythons would make dictionary order quite variable. > C:\Users\rptlab>\python27\python > Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)] on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> import random >>>> random.seed(23......22) >>>> from random import randint, choice >>>> randint(10,25) > 15 >>>> > C:\Users\rptlab>\python36\python > Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> import random >>>> random.seed(23......22) >>>> from random import randint, choice >>>> randint(10,25) > 21 >>>> -- Robin Becker From thomas at kluyver.me.uk Mon Mar 20 09:34:49 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Mon, 20 Mar 2017 13:34:49 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> Message-ID: <1490016889.1650001.917106688.2E575580@webmail.messagingengine.com> On Mon, Mar 20, 2017, at 01:02 PM, Robin Becker wrote: > Well now I am confused. The date / times mentioned in the debian patch > are those > we force into the documents produced by the reportlab package when it is > used. > > They would not normally be part of the package itself. Although the > reportlab > documentation is available in the source I'm fairly sure we don't include > it in > the wheels. I'm guessing, but I imagine that Debian may be using reportlab in the builds of other packages, to build documentation. It's normal for Debian packages to include built docs, unlike wheels. So they would want it to create PDFs reproducibly, but the PDFs generated in your test suite probably don't matter. > I guess the algorithm variation across pythons would make dictionary order quite variable. For a Python based tool, I think it's reasonable that reproducing a build requires running with the same version of Python. The requirement would be that, with enough information about the build environment, you *can* produce an identical PDF. It needn't (AFAIK) be identical every time anyone builds it. Thomas From ncoghlan at gmail.com Tue Mar 21 00:21:24 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Mar 2017 14:21:24 +1000 Subject: [Distutils] reproducible builds In-Reply-To: <1490016889.1650001.917106688.2E575580@webmail.messagingengine.com> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <20170317174928.GA3392@k3> <03ef6118-8b39-a263-fb52-79069b52ae03@chamonix.reportlab.co.uk> <1490009706.776024.916977680.753748EA@webmail.messagingengine.com> <2431ce3b-0237-f26d-7a5f-4e481eb2920b@chamonix.reportlab.co.uk> <1490016889.1650001.917106688.2E575580@webmail.messagingengine.com> Message-ID: On 20 March 2017 at 23:34, Thomas Kluyver wrote: > On Mon, Mar 20, 2017, at 01:02 PM, Robin Becker wrote: > > I guess the algorithm variation across pythons would make dictionary > order quite variable. > > For a Python based tool, I think it's reasonable that reproducing a > build requires running with the same version of Python. 
> > The requirement would be that, with enough information about the build > environment, you *can* produce an identical PDF. It needn't (AFAIK) be > identical every time anyone builds it. > Right, one of the other aspects of reproducible-builds is looking into ways to define and distribute build environments in addition to the application source code: https://reproducible-builds.org/docs/definition-strategies/ Within a given binary context (e.g. Debian packages), that may be a text description, like Debian's buildinfo files: https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles For Fedora/RHEL/CentOS, the equivalent would probably be to extract a suitable config from the build system: https://fedoraproject.org/wiki/Using_the_Koji_build_system#Using_koji_to_generate_a_mock_config_to_replicate_a_buildroot In other cases, the build environment may itself be a binary artifact (e.g. the manylinux1 container images, or the "Holy Build Box" machine images). Fully eliminating non-determinism usually does require switching to explicit sorting and ordered containers in build tools and scripts, as otherwise even things like directory listings or JSON serialisation can introduce variations in output when a build is run on a different machine. The reproducible-builds project offers some interesting tools to identify and analyse cases of non-reproducible outputs: https://reproducible-builds.org/tools/ However, nobody can reasonably expect arbitrary upstream projects (especially volunteer-run ones) to be going out and pre-emptively solving that kind of problem - the most it's realistic to aim for is to encourage projects to be accommodating when upstream changes are proposed to introduce more determinism into the build processes for particular projects, as well as into the artifact generation process for tools that may be used as part of the build process for other projects. (And I agree with Thomas that it's likely the latter case that applies for reportlab-generated PDFs.) Cheers, Nick. P.S. Prompted by Gary Bernhardt, one of the ways I've started thinking about the whole question of "built artifacts" in general is as a complex distributed caching problem, with reproducible builds being a way of ensuring that it's possible to check the validity of particular cache entries. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Tue Mar 21 04:13:29 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Tue, 21 Mar 2017 09:13:29 +0100 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work Message-ID: AFAIK it is impossible to do this: pip install -e foo You need to use the repo URL up to now: pip install -e git+https://example.com/repos/foo#egg=foo AFAIK the fast/short implementation of "pip install -e foo" does not work, since pip can't access metadata of package foo without downloading the whole package. Or am I wrong - is this possible? But how cares for useless downloaded bytes? I don't care. It should be possible to download the whole package "foo", then look at the metadata which is provided by it. Take the canonical repo url, and then get the source from the repo. AFAIK there is no official way to define a "Canonical Repo URL" up to now. If I want to provide it for my custom packages. How could I do this?
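(For what it's worth, the nearest thing available today is to point the package's existing URL metadata at the repository, so the location is at least discoverable from the uploaded metadata. A minimal, illustrative setup.py - the project name and URL are placeholders:)

```python
from setuptools import setup, find_packages

setup(
    name="foo",
    version="0.1",
    packages=find_packages(),
    # There is no dedicated "canonical repo URL" field yet, so a common
    # convention is to point the project's home page at the repository.
    url="https://example.com/repos/foo",
)
```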
Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ From marius at gedmin.as Tue Mar 21 07:46:47 2017 From: marius at gedmin.as (Marius Gedminas) Date: Tue, 21 Mar 2017 13:46:47 +0200 Subject: [Distutils] reproducible builds In-Reply-To: <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> Message-ID: <20170321114647.ri2cndwtl6jrupuh@platonas> On Mon, Mar 20, 2017 at 11:30:59AM +0000, Robin Becker wrote: > thanks for this; it seems the emphasis is on security. If the intent is that > reportlab should be able to reliably reproduce the same binary output then I > think I need to do more than just fix a couple of dates. We use many > dictionary like objects to produce PDF and I am not sure all are sorted by > key during output. I'm sure the reproducible builds folks will send you patches if they find any spots that you missed. ;-) > Is there a way to excite dictionary ordering changes? I believe there was > some way to modify the hashing introduced when the dos dictionary attacks > were an issue. Would it be sufficient to generate documents with say Python > 2.7 and check against 3.6? Python 3.6 changed the dict implementation so the ordering is always stable (and matches insertion order). You'll want to test with Python 3.5, which perturbs the dict ordering randomly, as a side effect of the randomized string/bytes hashes (unless you fix it by setting the PYTHONHASHSEED environment variable[*]) [*] https://docs.python.org/3.3/using/cmdline.html#envvar-PYTHONHASHSEED Regards, Marius Gedminas -- Yes, always begin work on inherited code by removing comments. Even if they were maintained (they are not) they are natural language written by engineers who cannot be understood ordering coffee in a diner. Getting back to comments not being maintained, my saying on that one is, "Comments do not run." -- Kenny Tilton -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 163 bytes Desc: not available URL: From robin at reportlab.com Tue Mar 21 08:02:59 2017 From: robin at reportlab.com (Robin Becker) Date: Tue, 21 Mar 2017 12:02:59 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <20170321114647.ri2cndwtl6jrupuh@platonas> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> <20170321114647.ri2cndwtl6jrupuh@platonas> Message-ID: On 21/03/2017 11:46, Marius Gedminas wrote: > On Mon, Mar 20, 2017 at 11:30:59AM +0000, Robin Becker wrote: ......... > > I'm sure the reproducible builds folks will send you patches if they > find any spots that you missed. ;-) > >> Is there a way to excite dictionary ordering changes? I believe there was >> some way to modify the hashing introduced when the dos dictionary attacks >> were an issue. Would it be sufficient to generate documents with say Python >> 2.7 and check against 3.6? > > Python 3.6 changed the dict implementation so the ordering is always stable > (and matches insertion order). > > You'll want to test with Python 3.5, which perturbs the dict ordering > randomly, as a side effect of the randomized string/bytes hashes (unless > you fix it by setting the PYTHONHASHSEED environment variable[*]) > > [*] https://docs.python.org/3.3/using/cmdline.html#envvar-PYTHONHASHSEED ....... 
thanks for this Marius; having started on the reproducibility trail I find the python 3.x output has more mismatches than I like ('cos of missed bugs). -- Robin Becker From leorochael at gmail.com Tue Mar 21 09:35:08 2017 From: leorochael at gmail.com (Leonardo Rochael Almeida) Date: Tue, 21 Mar 2017 10:35:08 -0300 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: Hi Thomas, Besides figuring out where the repo url is, you have a second problem to solve: The command `pip install -e some/path` already has something unpacked/checked-out in `some/path` to install in development mode. In which folder would the command `pip install -e some.project` unpack/checkout `some.project`? Cheers, Leo On 21 March 2017 at 05:13, Thomas G?ttler wrote: > AFAIK it is impossible to do this: > > pip install -e foo > > You need to use the repo URL up to now: > > pip install -e git+https://example.com/repos/foo#egg=foo > > AFAIK the fast/short implementation of "pip install -e foo" does > not work, since pip can't access metadata of package foo without > downloading > the whole package. Or am I wrong - is this possible? > > But how cares for useless downloaded bytes? I don't care. > > It should be possible to download the whole package "foo", > then look at the metadata which is provided by it. Take > the canonical repo url, and then get the source from > the repo. > > AFAIK there is no official way to define a "Canonical Repo URL" up to now. > > If I want to provide it for my custom packages. How could I do this? > > Regards, > Thomas G?ttler > > -- > Thomas Guettler http://www.thomas-guettler.de/ > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Tue Mar 21 10:26:53 2017 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 21 Mar 2017 09:26:53 -0500 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: On Tue, Mar 21, 2017 at 3:13 AM, Thomas G?ttler < guettliml at thomas-guettler.de> wrote: > AFAIK it is impossible to do this: > > pip install -e foo > > You need to use the repo URL up to now: > > pip install -e git+https://example.com/repos/foo#egg=foo > > AFAIK the fast/short implementation of "pip install -e foo" does > not work, since pip can't access metadata of package foo without > downloading > the whole package. Or am I wrong - is this possible? > These read metadata from an already-downloaded package with a setup.py?: ```bash pip install -e . pip install -e "${VIRTUAL_ENV}/src/foo" ``` You can download the JSON metadata from {PyPI, Warehouse} but IDK about {devpi, }? - https://github.com/pypa/warehouse/issues/1638 - https://pypi.org/pypi/pip/json - - https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/json.py - [ ] add (existing, Metadata 2.0/3.0) JSON api to devpi - [ ] Create a PEP for a pypa JSON HTTP endpoint spec - [ ] Static HTML support: Redirect to pkg-name-ver.tar.gz.json? 
- http://doc.devpi.net/latest/curl.html - http://doc.devpi.net/latest/userman/devpi_commands.html#getjson - "Pip needs a dependency resolver" https://github.com/pypa/pip/issues/988 - https://github.com/awwad/depresolve/blob/master/depresolve/scrape_pypi_metadata.py - pip clone --recursive - pip install --clone --recursive --no-recursive > > But how cares for useless downloaded bytes? I don't care. > How is this usecase distinct from those solved for by?: - requirements.txt - pipenv install --dev pkgname - https://github.com/kennethreitz/pipenv > > It should be possible to download the whole package "foo", > then look at the metadata which is provided by it. Take > the canonical repo url, and then get the source from > the repo. > > AFAIK there is no official way to define a "Canonical Repo URL" up to now. > > If I want to provide it for my custom packages. How could I do this? > With JSONLD [1], you could just add a "source" attribute (with your own namespaced URI: "myns:source") to the package metadata: sourceURL: "git+ssh://git at github.com/pypa/pip at master" sourceURL: "git+https://github.com/pypa/pip at master" Or, we could add "sourceURL" (pending bikeshedding on the property name) to the metadata 3.0 PEP. ````bash pip clone pip pip install --clone --rev-override=develop pip ``` And then, if you give a mouse a cookie, what about multiple sourceURLs: which is the canonical URL? git+git://git.apache.org/libcloud.git git+https://github.com/apache/libcloud git+ssh://git at github.com/apache/libcloud [1] https://github.com/pypa/interoperability-peps/issues/31 > > Regards, > Thomas G?ttler > > -- > Thomas Guettler http://www.thomas-guettler.de/ > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Tue Mar 21 11:43:05 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Tue, 21 Mar 2017 16:43:05 +0100 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: Am 21.03.2017 um 14:35 schrieb Leonardo Rochael Almeida: > Hi Thomas, > > Besides figuring out where the repo url is, you have a second problem to solve: > > The command `pip install -e some/path` already has something unpacked/checked-out in `some/path` to install in development mode. > > In which folder would the command `pip install -e some.project` unpack/checkout `some.project`? Up to now I never specified a directory. I used "pip install -e git+https://...#egg=foo" already very often. Up to now pip cloned the git repo into a directory called "src". 
Regards, Thomas G?ttler -- http://www.thomas-guettler.de/ From guettliml at thomas-guettler.de Tue Mar 21 11:58:30 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Tue, 21 Mar 2017 16:58:30 +0100 Subject: [Distutils] Canonical Repo URL: Make "pip install -e foo" work In-Reply-To: References: Message-ID: <558ac937-e5d5-2041-9e77-2c80356a7254@thomas-guettler.de> Am 21.03.2017 um 15:26 schrieb Wes Turner: > > > On Tue, Mar 21, 2017 at 3:13 AM, Thomas G?ttler > wrote: > > AFAIK it is impossible to do this: > > pip install -e foo > > You need to use the repo URL up to now: > > pip install -e git+https://example.com/repos/foo#egg=foo > > AFAIK the fast/short implementation of "pip install -e foo" does > not work, since pip can't access metadata of package foo without downloading > the whole package. Or am I wrong - is this possible? > > > These read metadata from an already-downloaded package with a setup.py?: > > ```bash > pip install -e . > pip install -e "${VIRTUAL_ENV}/src/foo" > ``` > > You can download the JSON metadata from {PyPI, Warehouse} but IDK about {devpi, }? I don't understand above sentence. Is downloading metadata possible or not? > How is this usecase distinct from those solved for by?: > > - requirements.txt > - pipenv install --dev pkgname > - https://github.com/kennethreitz/pipenv I have never heard of pipenv before. But have read the name of the author before and for me "Kenneth Reitz" means "for human beings". With other words: nice, simple and elegant API. > > > > It should be possible to download the whole package "foo", > then look at the metadata which is provided by it. Take > the canonical repo url, and then get the source from > the repo. > > AFAIK there is no official way to define a "Canonical Repo URL" up to now. > > If I want to provide it for my custom packages. How could I do this? > > > With JSONLD [1], > you could just add a "source" attribute (with your own namespaced URI: "myns:source") to the package metadata: > > sourceURL: "git+ssh://git at github.com/pypa/pip at master " > sourceURL: "git+https://github.com/pypa/pip at master" > > Or, we could add "sourceURL" (pending bikeshedding on the property name) to the metadata 3.0 PEP. "sourceURL" sound good. > > ````bash > pip clone pip > pip install --clone --rev-override=develop pip > ``` > > And then, if you give a mouse a cookie, > what about multiple sourceURLs: which is the canonical URL? > I think it only makes sense to publish one sourceURL. If someone publishes two then ... I don't know. Maybe the first wins? Regards, Thomas G?ttler -- http://www.thomas-guettler.de/ From brett at python.org Tue Mar 21 12:52:02 2017 From: brett at python.org (Brett Cannon) Date: Tue, 21 Mar 2017 16:52:02 +0000 Subject: [Distutils] reproducible builds In-Reply-To: <20170321114647.ri2cndwtl6jrupuh@platonas> References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> <20170321114647.ri2cndwtl6jrupuh@platonas> Message-ID: On Tue, 21 Mar 2017 at 04:54 Marius Gedminas wrote: > On Mon, Mar 20, 2017 at 11:30:59AM +0000, Robin Becker wrote: > > thanks for this; it seems the emphasis is on security. If the intent is > that > > reportlab should be able to reliably reproduce the same binary output > then I > > think I need to do more than just fix a couple of dates. We use many > > dictionary like objects to produce PDF and I am not sure all are sorted > by > > key during output. 
> > I'm sure the reproducible builds folks will send you patches if they > find any spots that you missed. ;-) > > > Is there a way to excite dictionary ordering changes? I believe there was > > some way to modify the hashing introduced when the dos dictionary attacks > > were an issue. Would it be sufficient to generate documents with say > Python > > 2.7 and check against 3.6? > > Python 3.6 changed the dict implementation so the ordering is always stable > (and matches insertion order). > Do realize that is an implementation detail and not guaranteed by the language specification, so it won't necessarily hold in the future or for other interpreters. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Wed Mar 22 04:57:28 2017 From: robin at reportlab.com (Robin Becker) Date: Wed, 22 Mar 2017 08:57:28 +0000 Subject: [Distutils] reproducible builds In-Reply-To: References: <97eeb4ad-09d7-9093-e827-04d3065f7b03@chamonix.reportlab.co.uk> <236a7cdf-9622-da30-e2b0-c33278e5523e@chamonix.reportlab.co.uk> <20170321114647.ri2cndwtl6jrupuh@platonas> Message-ID: <0f2db4c1-7b0d-a155-dceb-d298b42785c7@chamonix.reportlab.co.uk> On 21/03/2017 16:52, Brett Cannon wrote: > On Tue, 21 Mar 2017 at 04:54 Marius Gedminas wrote: ..... >> >> Python 3.6 changed the dict implementation so the ordering is always stable >> (and matches insertion order). >> > > Do realize that is an implementation detail and not guaranteed by the > language specification, so it won't necessarily hold in the future or for > other interpreters. > > -Brett one of the main issues in the reportlab pdf variability are the dict objects which come out as << /Key1 value ..... /Key n >> I think we have these coming out in sorted order without reliance on the underlying dicts. Up to now we used pixel equality ie the appearance, but as I understand it, reproducibility means byte equality which is harder. A bit of work has been done making the variation between Python 2.7 & 3.6 renderings go away. This reproducibility effort has revealed several bugs which is in itself useful. -- Robin Becker From guettliml at thomas-guettler.de Wed Mar 22 12:29:39 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Wed, 22 Mar 2017 17:29:39 +0100 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: > Wes Turner: > sourceURL: "git+ssh://git at github.com/pypa/pip at master" > sourceURL: "git+https://github.com/pypa/pip at master" > Or, we could add "sourceURL" (pending bikeshedding on the property name) to the metadata 3.0 PEP. Why not? What is the next step to add sourceURL to the pep? Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ From wes.turner at gmail.com Thu Mar 23 00:59:13 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 22 Mar 2017 23:59:13 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: @ncoghlan would know which are the current metadata peps? - http://python-notes.curiousefficiency.org/en/latest/pep_ideas/core_packaging_api.html - https://github.com/pypa/python-packaging-user-guide - https://packaging.python.org/ - https://packaging.python.org/specifications/ - https://github.com/pypa/interoperability-peps - https://github.com/pypa/interoperability-peps/issues - https://github.com/pypa/interoperability-peps/issues/31 - @ncoghlan this could probably jow just be titled "PyPA JSON-LD Context"? 
On Wed, Mar 22, 2017 at 11:29 AM, Thomas G?ttler < guettliml at thomas-guettler.de> wrote: > > Wes Turner: > > sourceURL: "git+ssh://git at github.com/pypa/pip at master" > > sourceURL: "git+https://github.com/pypa/pip at master" > > Or, we could add "sourceURL" (pending bikeshedding on the property name) > to the metadata 3.0 PEP. > > Why not? > > What is the next step to add sourceURL to the pep? > > Regards, > Thomas G?ttler > > > -- > Thomas Guettler http://www.thomas-guettler.de/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Mar 23 01:01:54 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 23 Mar 2017 00:01:54 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: ( The thread subject here was: "[Distutils] Canonical Repo URL: Make "pip install -e foo" work" https://mail.python.org/pipermail/distutils-sig/2017-March/thread.html#30302 ) On Wed, Mar 22, 2017 at 11:59 PM, Wes Turner wrote: > @ncoghlan would know which are the current metadata peps? > - http://python-notes.curiousefficiency.org/en/latest/pep_ideas/core_ > packaging_api.html > > > - https://github.com/pypa/python-packaging-user-guide > - https://packaging.python.org/ > - https://packaging.python.org/specifications/ > > > - https://github.com/pypa/interoperability-peps > - https://github.com/pypa/interoperability-peps/issues > - https://github.com/pypa/interoperability-peps/issues/31 > - @ncoghlan this could probably jow just be titled "PyPA JSON-LD > Context"? > > On Wed, Mar 22, 2017 at 11:29 AM, Thomas G?ttler < > guettliml at thomas-guettler.de> wrote: > >> > Wes Turner: >> > sourceURL: "git+ssh://git at github.com/pypa/pip at master" >> > sourceURL: "git+https://github.com/pypa/pip at master" >> > Or, we could add "sourceURL" (pending bikeshedding on the property >> name) to the metadata 3.0 PEP. >> >> Why not? >> >> What is the next step to add sourceURL to the pep? >> >> Regards, >> Thomas G?ttler >> >> >> -- >> Thomas Guettler http://www.thomas-guettler.de/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Mar 23 03:23:06 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Mar 2017 17:23:06 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On 23 March 2017 at 02:29, Thomas G?ttler wrote: > > Wes Turner: > > sourceURL: "git+ssh://git at github.com/pypa/pip at master" > > sourceURL: "git+https://github.com/pypa/pip at master" > > Or, we could add "sourceURL" (pending bikeshedding on the property name) > to the metadata 3.0 PEP. > > Why not? > > What is the next step to add sourceURL to the pep? > I'm not adding any new metadata fields to the core metadata 3.0 proposal (I'm only removing them). This means we're not going to be automating the process of getting an editable checkout in the core tools any time soon - there are already 100k+ published packages on PyPI, so anyone that seriously wants to do this is going to have to write their own client utility that attempts to infer it from the metadata that already exists (probably by building atop distlib, since that has all the necessary pieces to read the various metadata formats, both remote and local). 
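To make that concrete, here is roughly what such a client utility could do today against the JSON metadata PyPI already serves - the "looks like a repository" heuristic is purely illustrative, not an agreed convention:

```python
# Sketch: guess a VCS URL for a project from metadata PyPI already
# publishes, without requiring any new fields from publishers.
import json
import re
from urllib.request import urlopen

def guess_repo_url(project):
    with urlopen("https://pypi.org/pypi/%s/json" % project) as response:
        info = json.loads(response.read().decode("utf-8"))["info"]
    for text in (info.get("home_page"), info.get("download_url"),
                 info.get("description")):
        if not text:
            continue
        match = re.search(
            r"https?://(?:github\.com|bitbucket\.org)/[\w.-]+/[\w.-]+", text)
        if match:
            return match.group(0)
    return None

print(guess_repo_url("pip"))
```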
Future metadata extensions might help to make such a tool more reliable, but *requiring* metadata changes to be made first will just make it useless (since it wouldn't work at all until after publishers start publishing the new metadata, which would mean waiting years before it covered a reasonable percentage of PyPI). Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.bianconi at eldorado.org.br Thu Mar 23 15:00:53 2017 From: leonardo.bianconi at eldorado.org.br (Leonardo Bianconi) Date: Thu, 23 Mar 2017 19:00:53 +0000 Subject: [Distutils] Wheel files for PPC64le Message-ID: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> Hi all! I have been discussed the creation of a PEP, that describes how to create wheel files for the PPC64le architecture on wheel-builders (https://mail.python.org/pipermail/wheel-builders/) since January (https://mail.python.org/pipermail/wheel-builders/2017-January/000245.html). As all discussion from that list are done, the next step is it be reviewed here, and then create a draft on github, see it bellow: Abstract ======== This PEP proposes the extension of PEP 513 [1]_, which means extending the work done for platform tag ``manylinux1_``, initially created for x86_64 and i686 systems, to work on PowerPC 64 little endian (ppc64le), making wheel files available for this architecture. The platform tag, of this architecture for Python package built distributions, is called ``manylinux3_{ppc64le}``. Rationale ========= Currently on PowerPC 64 little endian, the ``pip install`` process downloads the module source code and builds it on the fly, to after that, install it. This process may cause a divergence on the presence of optional libraries it uses. One example of that is numpy, which optionally can use the OpenBlas [2]_; or BLAS [3]_; or neither of them. For each situation the performance of the module is affected and, badly enough, an end user is not able to know what is causing that. Building wheel files for the new architecture considers all work done on PEP 513 [1]_ with some changes proposed to handle the parameters for another architectures. The ``manylinux3`` policy ========================= Based on PEP 513 [1]_, the policy follows the same rules and library dependencies, but with the following versions for backward compatibility and base Operational System: * Backward compatibility versions: GLIBC <= 2.17 CXXABI <= 1.3.7 GLIBCXX <= 3.4.9 GCC <= 4.8.5 * Base Operational System: The stock O.S. release need to be the CentOS 7 [4]_, as it is the first CentOS release available for PowerPC64 little endian. The tag version for ppc64le architecture starts with 3 (``manylinux3_``), as it is supposed to be the version to match the CentOS 7 [4]_ in the future of the tag for x86_64/i686 architecture. There is the possibility of both tags diverge until it reaches the version 3, then a new PEP may be create to converge both to the manylinux baseline. Compilation of Compliant Wheels =============================== As compiling wheel files that meet the ``manylinux3`` standard for PowerPC64 little endian requires a specific Linux distro and version, the following tool is provided: Docker Image (Will be implemented when CentOS be available on Docker) ------------ The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 little endian CentOS release. 
The Image contains all necessary tools in the requested version to build wheel files (gcc, g++ and gfortran 4.8.5). Machine Image ------------- A full machine image containing all necessary software is provided for developers until CentOS be available on Docker for ppc64le. Cloud Service ------------- There are Cloud Services that provide ppc64le virtual machines for development. These machines can be used for the development of the wheel files, as CentOS 7 [4]_ an option for O.S.. All steps to obtain a machine on it is available for developers. Auditwheel ---------- This tool is an already provided item from PEP 513 [1]_, but needs to support the new architecture, so we propose the following changes: 1. Change the JSON file to handle more than one architecture, adding the compatible libraries and versions list for it. 2. Add a new filed in the JSON object to handle a list of architecture that the object is compatible. 3. When reading the JSON file, only consider the objects with the correspondent machine architecture. Platform detection for Installers ================================= The platform detection is almost the same as described in PEP 513 [1]_, but with the following proposed change: 1. Add the platform ppc64le in the platform list as a compatible one: [``linux-x86_64``, ``linux-i686``, ``linux-ppc64le``] 2. Add an if to switch architecture and consider the correct version of the GLIBC on ``return have_compatible_glibc(2, 5)``. References ========== .. [1] PEP 513 -- A Platform Tag for Portable Linux Built Distributions (https://www.python.org/dev/peps/pep-0513/) .. [2] OpenBLAS -- An optimized BLAS library (http://www.openblas.net/) .. [3] BLAS -- Basic Linear Algebra Subprograms (http://www.netlib.org/blas/) .. [4] CentOS 7 Release Notes (https://wiki.centos.org/Manuals/ReleaseNotes/CentOS7) .. [5] CentOS 5.11 Release Notes (https://wiki.centos.org/Manuals/ReleaseNotes/CentOS5.11) Thanks, Leonardo Bianconi. From ncoghlan at gmail.com Thu Mar 23 22:58:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2017 12:58:55 +1000 Subject: [Distutils] Wheel files for PPC64le In-Reply-To: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> Message-ID: On 24 March 2017 at 05:00, Leonardo Bianconi < leonardo.bianconi at eldorado.org.br> wrote: > Hi all! > > I have been discussed the creation of a PEP, that describes how to create > wheel > files for the PPC64le architecture on wheel-builders > (https://mail.python.org/pipermail/wheel-builders/) since January > (https://mail.python.org/pipermail/wheel-builders/2017-January/000245.html > ). > Thanks Leonardo! > > As all discussion from that list are done, the next step is it be reviewed > here, and then create a draft on github, You can submit the PR to the PEPs repo whenever you're ready - it's actually handy to have the PEP number assigned fairly early as a convenient reference for the proposal. > The ``manylinux3`` policy > ========================= > > Based on PEP 513 [1]_, the policy follows the same rules and library > dependencies, but with the following versions for backward compatibility > and base Operational System: > > * Backward compatibility versions: > GLIBC <= 2.17 > CXXABI <= 1.3.7 > GLIBCXX <= 3.4.9 > GCC <= 4.8.5 > > * Base Operational System: > The stock O.S. release need to be the CentOS 7 [4]_, as it is the first > CentOS release available for PowerPC64 little endian. 
> > The tag version for ppc64le architecture starts with 3 (``manylinux3_``), > as it > is supposed to be the version to match the CentOS 7 [4]_ in the future of > the > tag for x86_64/i686 architecture. There is the possibility of both tags > diverge > until it reaches the version 3, then a new PEP may be create to converge > both to the manylinux baseline. > Having manylinuxN consistently align with CentOS(N+4) seems reasonable to me for simplicity's sake, but there should be a discussion in the PEP around how that aligns with ppc64le support on other LTS distros (mainly Debian and Ubuntu). Given the relative dates involved, I'd expect manylinux-style binaries compiled on CentOS 7 to also work on Ubuntu 14.04, 16.04 and Debian 8, but the PEP should explicitly confirm that the nominated symbol versions above are available on all of those distros. > Compilation of Compliant Wheels > =============================== > > As compiling wheel files that meet the ``manylinux3`` standard for > PowerPC64 > little endian requires a specific Linux distro and version, the following > tool > is provided: > > > Docker Image (Will be implemented when CentOS be available on Docker) > ------------ > > The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 > little endian CentOS release. The Image contains all necessary tools in the > requested version to build wheel files (gcc, g++ and gfortran 4.8.5). > These seem to be present now: https://hub.docker.com/r/ppc64le/centos/tags/ I'm not clear on the provenance of the 'ppc64le' user account though, so I've asked for clarification: ttps:// twitter.com/ncoghlan_dev/status/845099237117329408 > Platform detection for Installers > ================================= > > The platform detection is almost the same as described in PEP 513 [1]_, but > with the following proposed change: > > 1. Add the platform ppc64le in the platform list as a compatible one: > [``linux-x86_64``, ``linux-i686``, ``linux-ppc64le``] > 2. Add an if to switch architecture and consider the correct version of the > GLIBC on ``return have_compatible_glibc(2, 5)``. > I don't think is quite that simple, as installers need to be able to figure out: - on manylinux3 compatible platforms, prefer manylinux3 to manylinux1 - on manylinux3 *in*compatible platforms, only consider manylinux1 And that means asking the question: when combined with the option of the distro-provided `_manylinux` module, is "have_compatible_glibc(2, 5) and not have_compatible_glibc(2, 17)" an adequate check for the latter case? (My inclination is to say "yes", but it would be helpful to have some more concrete data on glibc versions in different distros of interest) Beyond that, I think the main open question would be: do we go ahead and define the full `manylinux3` specification now? CentOS 7+, Ubuntu 14.04+, Debian 8+ compatibility still covers a *lot* of distros and deployments, and doing so means folks can bring the latest versions of gcc to bear on their code, rather than being limited to the last version that was made available for RHEL/CentOS 5 (gcc 4.8). Going down that path would also means things would be simpler on the PyPI front - it could just allow manylinux3 for any architecture and let installers decide whether or not to use them. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
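For reference, the glibc probe behind that question is essentially the PEP 513 one; a trimmed-down sketch of it (the reference implementation in the PEP and in pip adds error handling for non-glibc platforms and odd version strings) looks like:

```python
# Trimmed-down sketch of the PEP 513 style glibc check referred to above.
import ctypes

def glibc_version_string():
    process_namespace = ctypes.CDLL(None)
    gnu_get_libc_version = process_namespace.gnu_get_libc_version
    gnu_get_libc_version.restype = ctypes.c_char_p
    return gnu_get_libc_version().decode("ascii")  # e.g. "2.17"

def have_compatible_glibc(major, minimum_minor):
    found_major, found_minor = (int(x) for x in
                                glibc_version_string().split(".")[:2])
    return found_major == major and found_minor >= minimum_minor

# The "manylinux1 but not manylinux3" condition asked about above:
print(have_compatible_glibc(2, 5) and not have_compatible_glibc(2, 17))
```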
URL: From wes.turner at gmail.com Thu Mar 23 23:24:03 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 23 Mar 2017 22:24:03 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan wrote: > On 23 March 2017 at 02:29, Thomas G?ttler > wrote: > >> > Wes Turner: >> > sourceURL: "git+ssh://git at github.com/pypa/pip at master" >> > sourceURL: "git+https://github.com/pypa/pip at master" >> > Or, we could add "sourceURL" (pending bikeshedding on the property >> name) to the metadata 3.0 PEP. >> >> Why not? >> >> What is the next step to add sourceURL to the pep? >> > > I'm not adding any new metadata fields to the core metadata 3.0 proposal > (I'm only removing them). > Got it. > > This means we're not going to be automating the process of getting an > editable checkout in the core tools any time soon - there are already 100k+ > published packages on PyPI, so anyone that seriously wants to do this is > going to have to write their own client utility that attempts to infer it > from the metadata that already exists (probably by building atop distlib, > since that has all the necessary pieces to read the various metadata > formats, both remote and local). > > Future metadata extensions might help to make such a tool more reliable, > but *requiring* metadata changes to be made first will just make it useless > (since it wouldn't work at all until after publishers start publishing the > new metadata, which would mean waiting years before it covered a reasonable > percentage of PyPI). > Here's a way to define Requirements and a RequirementsMap with additional data: https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 It creates a directory full of requirements[.dev].txt files: https://github.com/westurner/pyleset/tree/57140bce/requirements Additional metadata in Pipfile would be nice; but it would be fairly easy to send a PR to: BLD: setup.py: add the canonical sourceURL > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 24 00:59:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2017 14:59:10 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On 24 March 2017 at 13:24, Wes Turner wrote: > > On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan wrote: > >> This means we're not going to be automating the process of getting an >> editable checkout in the core tools any time soon - there are already 100k+ >> published packages on PyPI, so anyone that seriously wants to do this is >> going to have to write their own client utility that attempts to infer it >> from the metadata that already exists (probably by building atop distlib, >> since that has all the necessary pieces to read the various metadata >> formats, both remote and local). >> >> Future metadata extensions might help to make such a tool more reliable, >> but *requiring* metadata changes to be made first will just make it useless >> (since it wouldn't work at all until after publishers start publishing the >> new metadata, which would mean waiting years before it covered a reasonable >> percentage of PyPI). 
>> > > Here's a way to define Requirements and a RequirementsMap with additional > data: > https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 > > It creates a directory full of requirements[.dev].txt files: > https://github.com/westurner/pyleset/tree/57140bce/requirements > > Additional metadata in Pipfile would be nice; > but it would be fairly easy to send a PR to: > > BLD: setup.py: add the canonical sourceURL > PEP 426 already has a source URL field: https://www.python.org/dev/peps/pep-0426/#source-url It's just not required to be a *version* control reference - it's free to be a reference to a tarball or zip archive instead (just not a reference to the sdist itself, since that will contain a copy of the metadata file). However, independently of that concern, "send a PR" is only the first step in updating published metadata to accommodate tasks that package *consumers* want to perform: 1. Someone has to write and submit the upstream project patch 2. The publisher has to review and accept the change 3. The publisher has to publish the new release 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending on the scope of what you care about So the lesson we've learned is that for consumer tasks it's *always* better to start by asking "How can I best achieve my objective without asking publishers to change *anything*?". In the case of finding version control references, that's a matter of: - looking at Download-URL and Project-URL entries for links that "look like" version control references - if that doesn't turn up anything useful, scan the long description - once you have a repository reference, look for promising tag names (if the link didn't nominate a specific commit) On the *publisher* side, the equivalent question is "Can publishers already choose to publish this metadata without having to wait for a metadata update?". In this case, the answer is yes, due to the "Project-URL" field: anyone is free to push for the adoption of a particular convention for tagging the exact version control reference needed for "pip -e" to retrieve the corresponding source code. Putting those two together means that anyone that chooses to do so is already free to write a tool that: - downloads a PyPI package - looks for a "Editable Install" Project-URL, and uses that if defined - otherwise looks for a promising VCS reference in Download-URL, the Project-URL definitions, and the long description - runs `pip -e` based on whatever it finds And as long as that tool is itself pip installable, there's no particular reason the feature needs to be built into pip itself. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Mar 24 05:26:11 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 24 Mar 2017 04:26:11 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. 
In-Reply-To: References: Message-ID: On Thu, Mar 23, 2017 at 11:59 PM, Nick Coghlan wrote: > On 24 March 2017 at 13:24, Wes Turner wrote: > >> >> On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan wrote: >> >>> This means we're not going to be automating the process of getting an >>> editable checkout in the core tools any time soon - there are already 100k+ >>> published packages on PyPI, so anyone that seriously wants to do this is >>> going to have to write their own client utility that attempts to infer it >>> from the metadata that already exists (probably by building atop distlib, >>> since that has all the necessary pieces to read the various metadata >>> formats, both remote and local). >>> >>> Future metadata extensions might help to make such a tool more reliable, >>> but *requiring* metadata changes to be made first will just make it useless >>> (since it wouldn't work at all until after publishers start publishing the >>> new metadata, which would mean waiting years before it covered a reasonable >>> percentage of PyPI). >>> >> >> Here's a way to define Requirements and a RequirementsMap with additional >> data: >> https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 >> >> It creates a directory full of requirements[.dev].txt files: >> https://github.com/westurner/pyleset/tree/57140bce/requirements >> >> Additional metadata in Pipfile would be nice; >> but it would be fairly easy to send a PR to: >> >> BLD: setup.py: add the canonical sourceURL >> > > PEP 426 already has a source URL field: https://www.python.org/dev/ > peps/pep-0426/#source-url > > It's just not required to be a *version* control reference - it's free to > be a reference to a tarball or zip archive instead (just not a reference to > the sdist itself, since that will contain a copy of the metadata file). > > However, independently of that concern, "send a PR" is only the first step > in updating published metadata to accommodate tasks that package > *consumers* want to perform: > > 1. Someone has to write and submit the upstream project patch > 2. The publisher has to review and accept the change > 3. The publisher has to publish the new release > 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending > on the scope of what you care about > > So the lesson we've learned is that for consumer tasks it's *always* > better to start by asking "How can I best achieve my objective without > asking publishers to change *anything*?". > > In the case of finding version control references, that's a matter of: > > - looking at Download-URL and Project-URL entries for links that "look > like" version control references > - if that doesn't turn up anything useful, scan the long description > - once you have a repository reference, look for promising tag names (if > the link didn't nominate a specific commit) > > On the *publisher* side, the equivalent question is "Can publishers > already choose to publish this metadata without having to wait for a > metadata update?". > > In this case, the answer is yes, due to the "Project-URL" field: anyone is > free to push for the adoption of a particular convention for tagging the > exact version control reference needed for "pip -e" to retrieve the > corresponding source code. 
> > Putting those two together means that anyone that chooses to do so is > already free to write a tool that: > > - downloads a PyPI package > - looks for a "Editable Install" Project-URL, and uses that if defined > - otherwise looks for a promising VCS reference in Download-URL, the > Project-URL definitions, and the long description > - runs `pip -e` based on whatever it finds > > > And as long as that tool is itself pip installable, there's no particular > reason the feature needs to be built into pip itself. > STORY: Users can pull the source code for each installed package (git, [{RPM,} (archive-within-RPM.tar.gz)]) ... the npm package.json docs are a pretty good read here: - (with {name, description, url} things are already schema.org/Thing s) - https://docs.npmjs.com/files/package.json#bugs - https://docs.npmjs.com/files/package.json#repository - https://docs.npmjs.com/files/package.json#man ```json "bugs": { "url" : "https://github.com/owner/project/issues", "email" : "project at hostname.com" } "repository" : { "type" : "git" , "url" : "https://github.com/npm/npm.git" } "repository" : { "type" : "svn" , "url" : "https://v8.googlecode.com/svn/trunk/" } ``` > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Mar 24 05:37:43 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 24 Mar 2017 04:37:43 -0500 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On Fri, Mar 24, 2017 at 4:26 AM, Wes Turner wrote: > > > On Thu, Mar 23, 2017 at 11:59 PM, Nick Coghlan wrote: > >> On 24 March 2017 at 13:24, Wes Turner wrote: >> >>> >>> On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan >>> wrote: >>> >>>> This means we're not going to be automating the process of getting an >>>> editable checkout in the core tools any time soon - there are already 100k+ >>>> published packages on PyPI, so anyone that seriously wants to do this is >>>> going to have to write their own client utility that attempts to infer it >>>> from the metadata that already exists (probably by building atop distlib, >>>> since that has all the necessary pieces to read the various metadata >>>> formats, both remote and local). >>>> >>>> Future metadata extensions might help to make such a tool more >>>> reliable, but *requiring* metadata changes to be made first will just make >>>> it useless (since it wouldn't work at all until after publishers start >>>> publishing the new metadata, which would mean waiting years before it >>>> covered a reasonable percentage of PyPI). >>>> >>> >>> Here's a way to define Requirements and a RequirementsMap with >>> additional data: >>> https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 >>> >>> It creates a directory full of requirements[.dev].txt files: >>> https://github.com/westurner/pyleset/tree/57140bce/requirements >>> >>> Additional metadata in Pipfile would be nice; >>> but it would be fairly easy to send a PR to: >>> >>> BLD: setup.py: add the canonical sourceURL >>> >> >> PEP 426 already has a source URL field: https://www.python.org/dev/pep >> s/pep-0426/#source-url >> >> It's just not required to be a *version* control reference - it's free to >> be a reference to a tarball or zip archive instead (just not a reference to >> the sdist itself, since that will contain a copy of the metadata file). 
>> >> However, independently of that concern, "send a PR" is only the first >> step in updating published metadata to accommodate tasks that package >> *consumers* want to perform: >> >> 1. Someone has to write and submit the upstream project patch >> 2. The publisher has to review and accept the change >> 3. The publisher has to publish the new release >> 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending >> on the scope of what you care about >> >> So the lesson we've learned is that for consumer tasks it's *always* >> better to start by asking "How can I best achieve my objective without >> asking publishers to change *anything*?". >> >> In the case of finding version control references, that's a matter of: >> >> - looking at Download-URL and Project-URL entries for links that "look >> like" version control references >> - if that doesn't turn up anything useful, scan the long description >> - once you have a repository reference, look for promising tag names (if >> the link didn't nominate a specific commit) >> >> On the *publisher* side, the equivalent question is "Can publishers >> already choose to publish this metadata without having to wait for a >> metadata update?". >> > > >> In this case, the answer is yes, due to the "Project-URL" field: anyone >> is free to push for the adoption of a particular convention for tagging the >> exact version control reference needed for "pip -e" to retrieve the >> corresponding source code. >> > https://www.google.com/search?q=python+pep+"project-url" (!) https://www.python.org/dev/peps/pep-0345/#project-url-multiple-use Project-URL (multiple-use) > A string containing a browsable URL for the project and a label for it, > separated by a comma. > Example: > Bug Tracker, http://bitbucket.org/tarek/distribute/issues/ > The label is a free text limited to 32 signs. - Predicate URIs are often longer than 32 signs. (pypi:pkgname, label, URL) # RDF triples (subject, predicate, object) (URI, URI, *) # RDF quads (graph, s, p, o) (URI, URI, URI, *) >From http://legacy.python.org/dev/peps/pep-0426/#source-url: > For version control references, the VCS+protocol scheme SHOULD be used to > identify both the version control system and the secure transport, and a > version control system with hash based commit identifiers SHOULD be used. > Automated tools MAY omit warnings about missing hashes for version control > systems that do not provide hash based commit identifiers. > To handle version control systems that do not support including commit or > tag references directly in the URL, that information may be appended to the > end of the URL using the @ or the @# > notation. >> Putting those two together means that anyone that chooses to do so is >> already free to write a tool that: >> >> - downloads a PyPI package >> - looks for a "Editable Install" Project-URL, and uses that if defined >> - otherwise looks for a promising VCS reference in Download-URL, the >> Project-URL definitions, and the long description >> > > Explicit is better than implicit. > Simple is better than complex. > - runs `pip -e` based on whatever it finds >> > > >> And as long as that tool is itself pip installable, there's no particular >> reason the feature needs to be built into pip itself. >> > > STORY: Users can pull the source code for each installed package (git, > [{RPM,} (archive-within-RPM.tar.gz)]) > > ... 
> > the npm package.json docs are a pretty good read here: > > - (with {name, description, url} things are already schema.org/Thing s) > - https://docs.npmjs.com/files/package.json#bugs > - https://docs.npmjs.com/files/package.json#repository > - https://docs.npmjs.com/files/package.json#man > > ```json > > "bugs": > { "url" : "https://github.com/owner/project/issues", > "email" : "project at hostname.com" > } > > "repository" : > { "type" : "git" > , "url" : "https://github.com/npm/npm.git" > } > > "repository" : > { "type" : "svn" > , "url" : "https://v8.googlecode.com/svn/trunk/" > } > ``` > > > >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Mar 24 08:41:37 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2017 22:41:37 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: On 24 March 2017 at 19:37, Wes Turner wrote: > > On Fri, Mar 24, 2017 at 4:26 AM, Wes Turner wrote: > > https://www.python.org/dev/peps/pep-0345/#project-url-multiple-use > > Project-URL (multiple-use) >> A string containing a browsable URL for the project and a label for it, >> separated by a comma. >> Example: >> Bug Tracker, http://bitbucket.org/tarek/distribute/issues/ >> The label is a free text limited to 32 signs. > > > - Predicate URIs are often longer than 32 signs. > The nominal 32 character limit is on the label, not on the URL. (And I'm not sure it's a real limit in practice) Putting those two together means that anyone that chooses to do so is >> already free to write a tool that: >> >> - downloads a PyPI package >> - looks for a "Editable Install" Project-URL, and uses that if defined >> - otherwise looks for a promising VCS reference in Download-URL, the >> Project-URL definitions, and the long description >> > > Explicit is better than implicit. > Simple is better than complex. And complex is better than complicated. The logistics of packaging metadata updates are complex because the deployment cycles are so long, and you somehow have to backfill missing data for projects that don't yet provide it in the new-and-improved form. For this particular problem, finding the right URL to clone is such a small part of making edits to a dependency that it's an incredibly long way down the list of "limitations that regularly cause problems for Python users". Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmazuel at microsoft.com Fri Mar 24 12:49:05 2017 From: lmazuel at microsoft.com (Laurent Mazuel) Date: Fri, 24 Mar 2017 16:49:05 +0000 Subject: [Distutils] Pip different installation behavior when using --no-cache-dir Message-ID: Hello, I was talking with Brett Cannon at work today about why ?python setup.py install? is called when `--no-cache-dir` is specified instead of python setup.py bdist_wheel in the normal case. He didn't know the answer and suggested I ask here to see if it's on purpose or just an oversight of when caching was introduced. Basically we have a situation where we tweak a little the bdist_wheel step in our setup.py to take advantage of the ?flat? install of pip, and allow easyinstall to work perfectly from the sdist at the same time. 
Pip install of the sdist file works like a charm, since as expected pip calls bdist_wheel first (and applies our changes):

> pip install .\azure-common-1.1.4.zip
Processing d:\vsprojects\azure-sdk-for-python-official\wheelhouse\azure-common-1.1.4.zip
Building wheels for collected packages: azure-common
  Running setup.py bdist_wheel for azure-common ... done
  Stored in directory: C:\Users\lmazuel\AppData\Local\pip\Cache\wheels\4c\3a\10\5e2ef6db79d3785728205a4b5b8348eb41a474ec99505cd865
Successfully built azure-common
Installing collected packages: azure-common
Successfully installed azure-common-1.1.4

However, the same call with --no-cache-dir bypasses the bdist_wheel step:

> pip install --no-cache-dir .\azure-common-1.1.4.zip
Processing d:\vsprojects\azure-sdk-for-python-official\wheelhouse\azure-common-1.1.4.zip
Installing collected packages: azure-common
  Running setup.py install for azure-common ... done
Successfully installed azure-common-1.1.4

It seems to me that pip should always call bdist_wheel, since in theory the wheel building step can be changed to fit the platform (in my understanding of sdist vs. egg vs. wheel). And so even --no-cache-dir should call bdist_wheel, even if the wheel is not cached at the end.

Pip install of the sdist is not our most common situation, since our wheels are universal, but I'm still interested in improving my pip knowledge. What do you think?

Thanks!

Laurent

From leonardo.bianconi at eldorado.org.br Fri Mar 24 15:40:56 2017
From: leonardo.bianconi at eldorado.org.br (Leonardo Bianconi)
Date: Fri, 24 Mar 2017 19:40:56 +0000
Subject: [Distutils] Wheel files for PPC64le
In-Reply-To: References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br>
Message-ID: <46301d400edb469099abac67f6fed74a@serv030.corp.eldorado.org.br>

From: Nick Coghlan [mailto:ncoghlan at gmail.com]
Sent: Thursday, 23 March 2017 23:59
To: Leonardo Bianconi
Cc: distutils-sig at python.org
Subject: Re: [Distutils] Wheel files for PPC64le

On 24 March 2017 at 05:00, Leonardo Bianconi wrote:

Hi all!

I have been discussing the creation of a PEP that describes how to create wheel files for the PPC64le architecture on wheel-builders (https://mail.python.org/pipermail/wheel-builders/) since January (https://mail.python.org/pipermail/wheel-builders/2017-January/000245.html).

Thanks Leonardo!

As all discussions on that list are done, the next step is for it to be reviewed here, and then to create a draft on GitHub.

You can submit the PR to the PEPs repo whenever you're ready - it's actually handy to have the PEP number assigned fairly early as a convenient reference for the proposal.

The ``manylinux3`` policy
=========================

Based on PEP 513 [1]_, the policy follows the same rules and library dependencies, but with the following versions for backward compatibility and base Operational System:

* Backward compatibility versions:
  GLIBC <= 2.17
  CXXABI <= 1.3.7
  GLIBCXX <= 3.4.9
  GCC <= 4.8.5

* Base Operational System:
  The stock O.S. release needs to be CentOS 7 [4]_, as it is the first CentOS release available for PowerPC64 little endian.
Having manylinuxN consistently align with CentOS(N+4) seems reasonable to me for simplicity's sake, but there should be a discussion in the PEP around how that aligns with ppc64le support on other LTS distros (mainly Debian and Ubuntu). Given the relative dates involved, I'd expect manylinux-style binaries compiled on CentOS 7 to also work on Ubuntu 14.04, 16.04 and Debian 8, but the PEP should explicitly confirm that the nominated symbol versions above are available on all of those distros.

Ok, I can add it to the PEP, but regarding the supported distros, those older than CentOS 7 may not be compatible, based on the backward compatibility rules, which do not guarantee compatibility with older versions, only with newer ones. I sent a message about it here: https://mail.python.org/pipermail/wheel-builders/2017-March/000265.html

Compilation of Compliant Wheels
===============================

As compiling wheel files that meet the ``manylinux3`` standard for PowerPC64 little endian requires a specific Linux distro and version, the following tool is provided:

Docker Image (Will be implemented when CentOS becomes available on Docker)
------------

The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 little endian CentOS release. The Image contains all necessary tools in the requested version to build wheel files (gcc, g++ and gfortran 4.8.5).

These seem to be present now: https://hub.docker.com/r/ppc64le/centos/tags/

I'm not clear on the provenance of the 'ppc64le' user account though, so I've asked for clarification: https://twitter.com/ncoghlan_dev/status/845099237117329408

Platform detection for Installers
=================================

The platform detection is almost the same as described in PEP 513 [1]_, but with the following proposed changes:

1. Add the platform ppc64le to the platform list as a compatible one: [``linux-x86_64``, ``linux-i686``, ``linux-ppc64le``]
2. Add an if to switch on architecture and consider the correct version of the GLIBC in ``return have_compatible_glibc(2, 5)``.

I don't think it is quite that simple, as installers need to be able to figure out:

- on manylinux3 compatible platforms, prefer manylinux3 to manylinux1
- on manylinux3 *in*compatible platforms, only consider manylinux1

And that means asking the question: when combined with the option of the distro-provided `_manylinux` module, is "have_compatible_glibc(2, 5) and not have_compatible_glibc(2, 17)" an adequate check for the latter case? (My inclination is to say "yes", but it would be helpful to have some more concrete data on glibc versions in different distros of interest)

Well, I didn't realize that proposing a new tag would require an additional check about the tags, which will be a requirement for manylinux2 as well, when CentOS 5 is replaced by CentOS 6 for x86_64/i686. I need to check where and how the method "is_manylinux1_compatible" is used to think through how it would be done. I will check that and propose how to do it.

Beyond that, I think the main open question would be: do we go ahead and define the full `manylinux3` specification now? CentOS 7+, Ubuntu 14.04+, Debian 8+ compatibility still covers a *lot* of distros and deployments, and doing so means folks can bring the latest versions of gcc to bear on their code, rather than being limited to the last version that was made available for RHEL/CentOS 5 (gcc 4.8).

Actually, the idea was to make it available for PPC64le, just as it is available for x86_64/i686 nowadays, like porting it.
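To sketch what that installer-side check could look like for the new tag - mirroring the shape of PEP 513's is_manylinux1_compatible(), and assuming the have_compatible_glibc() helper from that PEP - something like:

```python
# Sketch only: the shape of an is_manylinux3_compatible() check.
# have_compatible_glibc() is assumed to be the PEP 513 helper.
import distutils.util

def is_manylinux3_compatible():
    # Only consider platforms the tag would be defined for.
    if distutils.util.get_platform() not in (
            "linux-x86_64", "linux-i686", "linux-ppc64le"):
        return False
    # Let the distro override the automatic detection either way.
    try:
        import _manylinux
        return bool(_manylinux.manylinux3_compatible)
    except (ImportError, AttributeError):
        pass
    # Fall back to the glibc heuristic (CentOS 7 baseline).
    return have_compatible_glibc(2, 17)
```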
I didn?t think about the definition of all requirements for the manylinux3 for all architectures, as it can change until x86_64/i686 get to the manylinux3. Being limited to an old version, as CentOS 5 (gcc 4.8) is a requirement from PEP 513, which guarantees the backward compatibility, right? I do not want to change it, this proposal is just to create a tag for PPC64le, until both architectures get to the same base distro version. As I said above, I have already sent a message about basing it on CentOS 7, which does not guarantee the compatibility with older distros (example: Ubuntu 14.04). Is there any thinking about base on a newer distro and make the wheel files compatible with distros older than it? Sorry if I?m missing something here. Going down that path would also means things would be simpler on the PyPI front - it could just allow manylinux3 for any architecture and let installers decide whether or not to use them. Cheers, Nick. I?m coping the Bruno Rosa, which will be involved with this PEP as well. Cheers, Leonardo Bianconi. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Fri Mar 24 17:12:14 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Fri, 24 Mar 2017 22:12:14 +0100 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: Message-ID: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> Am 24.03.2017 um 05:59 schrieb Nick Coghlan: > On 24 March 2017 at 13:24, Wes Turner > wrote: > > > On Thu, Mar 23, 2017 at 2:23 AM, Nick Coghlan > wrote: > > This means we're not going to be automating the process of getting an editable checkout in the core tools any time soon - there are already 100k+ published packages on PyPI, so anyone that seriously wants to do this is going to have to write their own client utility that attempts to infer it from the metadata that already exists (probably by building atop distlib, since that has all the necessary pieces to read the various metadata formats, both remote and local). > > Future metadata extensions might help to make such a tool more reliable, but *requiring* metadata changes to be made first will just make it useless (since it wouldn't work at all until after publishers start publishing the new metadata, which would mean waiting years before it covered a reasonable percentage of PyPI). > > > Here's a way to define Requirements and a RequirementsMap with additional data: > https://github.com/westurner/pyleset/blob/57140bcef5/setup.py#L118 > > It creates a directory full of requirements[.dev].txt files: > https://github.com/westurner/pyleset/tree/57140bce/requirements > > Additional metadata in Pipfile would be nice; > but it would be fairly easy to send a PR to: > > BLD: setup.py: add the canonical sourceURL > > > PEP 426 already has a source URL field: https://www.python.org/dev/peps/pep-0426/#source-url > > It's just not required to be a *version* control reference - it's free to be a reference to a tarball or zip archive instead (just not a reference to the sdist itself, since that will contain a copy of the metadata file). > > However, independently of that concern, "send a PR" is only the first step in updating published metadata to accommodate tasks that package *consumers* want to perform: > > 1. Someone has to write and submit the upstream project patch > 2. The publisher has to review and accept the change > 3. 
The publisher has to publish the new release > 4. Rinse-and-repeat for dozens/hundreds/thousands of projects, depending on the scope of what you care about > > So the lesson we've learned is that for consumer tasks it's *always* better to start by asking "How can I best achieve my objective without asking publishers to change *anything*?". > > In the case of finding version control references, that's a matter of: > > - looking at Download-URL and Project-URL entries for links that "look like" version control references > - if that doesn't turn up anything useful, scan the long description > - once you have a repository reference, look for promising tag names (if the link didn't nominate a specific commit) This is the spirit of python packaging: We love guessing and trying; we hate well-defined data structures. Yes, nothing is more boring than Entity-relationship models. But this provides solid ground. Yes, I can write code which does the steps you suggest: looking at Download-URL, if this ... do that, look for something promising .... on full moon start dancing, but not in april ... Anyone is free to do what he wants. JSON here, JSON there ... Let's meet again in ten years and have a look at how the IT world has changed. My guess: well-defined data structures like protocol buffers will increase and fuzzy data structures like JSON will decrease. Unfortunately I have not found a dependency resolver which works for several languages (not just Python) yet. The feeling "I want to leave pip and python-packaging" has been here for several years, but up to now I have found no concrete path to follow. I know that all here are doing their best. If you feel insulted, then I am not sorry at all, since I did not do it. I just wrote what I feel. This is my personal problem. Not yours. Regards, Thomas Güttler -- http://www.thomas-guettler.de/ From ncoghlan at gmail.com Sat Mar 25 01:38:56 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Mar 2017 15:38:56 +1000 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> References: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> Message-ID: On 25 March 2017 at 07:12, Thomas Güttler wrote: > On 24.03.2017 at 05:59, Nick Coghlan wrote: > > So the lesson we've learned is that for consumer tasks it's *always* > better to start by asking "How can I best achieve my objective without > asking publishers to change *anything*?". > > > > In the case of finding version control references, that's a matter of: > > > > - looking at Download-URL and Project-URL entries for links that "look > like" version control references > > - if that doesn't turn up anything useful, scan the long description > > - once you have a repository reference, look for promising tag names (if > the link didn't nominate a specific commit) > > This is the spirit of python packaging: > > We love guessing and trying; we hate well-defined data structures. > > Yes, nothing is more boring than Entity-relationship models. But this > provides solid ground. > We've learned over the years that the data migration challenges involved in packaging metadata changes override essentially *every* other consideration.
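As an aside, the heuristic quoted above is straightforward to express in code. The following is only an illustrative sketch - the shape of the ``metadata`` dict and the regular expression are assumptions made for the example, not any real pip or distlib API:

```python
# Illustrative sketch of the "guess the repository from already-published
# metadata" heuristic quoted above.  The metadata dict layout here is an
# assumption for the example, not a real packaging API.
import re

VCS_URL = re.compile(
    r"https?://(github\.com|gitlab\.com|bitbucket\.org)/[\w.-]+/[\w.-]+",
    re.IGNORECASE,
)


def guess_repo_url(metadata):
    # 1. Prefer explicit URL fields (Download-URL, Project-URL entries).
    candidates = [metadata.get("download_url") or ""]
    candidates += list(metadata.get("project_urls", {}).values())
    for url in candidates:
        if url and VCS_URL.match(url):
            return url
    # 2. Otherwise, scan the long description for anything that looks
    #    like a repository link.
    match = VCS_URL.search(metadata.get("description") or "")
    return match.group(0) if match else None
```

Whether a guess like this is good enough in practice is exactly the trade-off being debated in this thread.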
We've also learned that volunteers generally won't work on features they don't personally need, that commercial vendors will typically only fund work on their own downstream package management systems, and that if an opportunity arises to structurally bypass PyPA and python-dev as centralised approval authorities (aka procedural bottlenecks) we should take it. Those lessons mean that all proposals for changes to the metadata now have to address two key not-so-simple questions: 1. Is the idea still potentially useful even if *nobody except the person proposing it* ever adopts it? 2. Is a formal change to the interoperability specifications, with the associated delays in availability and adoption, the *only* way to solve the issue at hand? This is a large part of why PEP 426 was deferred for so long, and why my recent changes to that PEP have all been directed at removing features rather than adding them. It's also why the accepted pyproject.toml proposal assumes that sdists will continue to include a setup.py shim for backwards compatibility with older tools that assume distutils/setuptools based build processes. A "py-install-editable" utility that looks for an "Editable Source" label in Project-URL as a convention would meet those criteria, as it makes use of an existing v1.2 metadata field and shouldn't require any changes to pip, PyPI, setuptools, etc. for people to enable it for their *own* projects - they'll just have to set Project-URL appropriately, and run "pip install py-install-editable && py-install-editable ". Whoever actually wrote the `py-install-editable` tool would have complete freedom to define the expected label name, as well as what fallback heuristics (if any) were tried in the event that the label wasn't set, without having to seek permission or approval from anyone else (not even distutils-sig). If such a technique got popular enough, *then* we could look at elevating it beyond "Project URL with a conventional label backed up by an end user tool that reads that label" (e.g. by having "pip install -e" itself read the label, or enshrining the conventional label name in a PEP). Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Sat Mar 25 08:10:33 2017 From: guettliml at thomas-guettler.de (Thomas Güttler) Date: Sat, 25 Mar 2017 13:10:33 +0100 Subject: [Distutils] add "sourceURL" to the metadata 3.0 PEP. In-Reply-To: References: <6defa0f1-ebe9-1f99-3f7a-edc3acf0ffc2@thomas-guettler.de> Message-ID: <622b27bd-b8e4-9fa5-1fa8-0ff27fb464c6@thomas-guettler.de> On 25.03.2017 at 06:38, Nick Coghlan wrote: > On 25 March 2017 at 07:12, Thomas Güttler > wrote: > > On 24.03.2017 at 05:59, Nick Coghlan wrote: > > So the lesson we've learned is that for consumer tasks it's *always* better to start by asking "How can I best achieve my objective without asking publishers to change *anything*?". > > > > In the case of finding version control references, that's a matter of: > > > > - looking at Download-URL and Project-URL entries for links that "look like" version control references > > - if that doesn't turn up anything useful, scan the long description > > - once you have a repository reference, look for promising tag names (if the link didn't nominate a specific commit) > > This is the spirit of python packaging: > > We love guessing and trying; we hate well-defined data structures.
> > Yes, nothing is more boring than Entity-relationship models. But this provides solid ground. > > > We've learned over the years that the data migration challenges involved in packaging metadata changes override essentially *every* other consideration. We've also learned that volunteers generally won't work on features they don't personally need, that commercial vendors will typically only fund work on their own downstream package management systems, and that if an opportunity arises to structurally bypass PyPA and python-dev as centralised approval authorities (aka procedural bottlenecks) we should take it. > > Those lessons mean that all proposals for changes to the metadata now have to address two key not-so-simple questions: > > 1. Is the idea still potentially useful even if *nobody except the person proposing it* ever adopts it? > 2. Is a formal change to the interoperability specifications, with the associated delays in availability and adoption, the *only* way to solve the issue at hand? > > This is a large part of why PEP 426 was deferred for so long, and why my recent changes to that PEP have all been directed at removing features rather than adding them. This is great to hear! I tried to read it three times in the past. Every time it took too long to read it. I was interrupted by more urgent stuff or I realized that I am tired and need some hours of sleep. This is my personal opinion: Why a PEP at all? I like this quote: rough consensus and running code. > It's also why the accepted pyproject.toml proposal assumes that sdists will continue to include a setup.py shim for backwards compatibility with older tools that assume distutils/setuptools based build processes. At university I was told to avoid redundancy. But you are the expert here. If you think redundancy is good here, then I think it is the right decision. > > A "py-install-editable" utility that looks for an "Editable Source" label in Project-URL as a convention would meet those criteria, as it makes use of an existing v1.2 metadata field and shouldn't require any changes to pip, PyPI, setuptools, etc. for people to enable it for their *own* projects - they'll just have to set Project-URL appropriately, and run "pip install py-install-editable && py-install-editable ". Whoever actually wrote the `py-install-editable` tool would have complete freedom to define the expected label name, as well as what fallback heuristics (if any) were tried in the event that the label wasn't set, without having to seek permission or approval from anyone else (not even distutils-sig). OK, that sounds good. At first I had the impression that this is not possible. My idea could be implemented without modifying pip ... great. > > If such a technique got popular enough, *then* we could look at elevating it beyond "Project URL with a conventional label backed up by an end user tool that reads that label" (e.g. by having "pip install -e" itself read the label, or enshrining the conventional label name in a PEP). Yes, this sounds good. This way the core does not get polluted by random crazy ideas. Thank you very very much for listening. Regards, Thomas Güttler -- I am looking for feedback for my personal programming guidelines: https://github.com/guettli/programming-guidelines From pradyunsg at gmail.com Sat Mar 25 11:44:19 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sat, 25 Mar 2017 15:44:19 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal Message-ID: Hello Everyone!
I had previously sent a mail on this list, stating that I would like to work on pip's dependency resolution for my GSoC 2017. I now have drafted a proposal for the same, with help from my mentors - Donald Stufft and Justin Cappos. I'd also like to take this opportunity to thank them for agreeing to be my mentors for this GSoC. I would like to request comments on my proposal for GSoC - it is hosted at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. Please find trailing a MarkDown version of the proposal. Thanks, Pradyun Gedam ----- # Adding Proper Dependency Resolution to pip - **Name:** Pradyun S. Gedam - **Email:** [pradyunsg at gmail.com][mailto-email] - **Github:** [pradyunsg][github-profile] - **University:** [VIT University, Vellore, India][vit-homepage] - **Course:** Bachelor of Technology in Computer Science and Engineering - **Course Term:** 2016/17 - 2019/20 (4 Year) - **Timezone:** IST (GMT +5:30) - **GSoC Blog RSS Feed URL:** < https://pradyunsg.github.io/gsoc-2017/feed.xml> [github-profile]: http://github.com/pradyunsg/ [vit-homepage]: http://vit.ac.in/ [mailto-email]: mailto:pradyunsg at gmail.com ## About Me I was introduced to Python about five years ago, through "Core Python Programming" by Wesley J. Chun. After the initial struggle with Python 2/3, the ball was set rolling and hasn't stopped since. I have fiddled around with Game Programming (PyGame), Computer Vision (OpenCV), Data Analytics (Pandas, SciPy, NumPy), transcompilers (Py2C) and more. As a high school student, I did internships at Enthought in 2013 and 2014. The two summers that I spent at Enthought were a great learning experience and I thoroughly enjoyed the environment there. Other than Python, I have also used C, C++, Web Technologies (JavaScript, HTML, CSS) and preprocessors (Pug, TypeScript, LESS/SASS/SCSS), Java and Bash/Zsh for some other projects. Curious to understand how pip works, I began looking into pip's source code. I started contributing to pip in May 2016. I am now fairly familiar with the design of pip and have a fair understanding of how it works, due to the various contributions I have made to pip in the past year. ## Mentors - Donald Stufft (@dstufft on GitHub) - Justin Cappos (@JustinCappos on GitHub) Communication with the mentors will be done mostly on issues and pull requests on pip's GitHub repository. If at any point in time a real-time discussion is needed with the mentors, it will be done over IRC or Skype. Email can also be used if needed. ## Proposal This project is about improving the dependency resolution performed within pip by implementing a proper dependency resolver within it. ### Abstract Currently, pip does not resolve dependencies correctly when there are conflicting requirements. The lack of dependency resolution has caused hard-to-debug bugs/failures due to the installation of incompatible packages. The lack of a dependency resolver is also a blocker for various other features - adding an upgrade-all functionality to pip and properly determining build-time dependencies for packages are two such features. ### Deliverables At the end of this project, pip will have the ability to: - detect requirement conflicts - resolve conflicting dependency requirements where possible ### Implementation The implementation language will be Python. The code will maintain the compatibility requirements of pip - the same source code will support multiple Python implementations and versions, including but not limited to, CPython 2.7, CPython 3.3+, PyPy 2, PyPy3.
New tests for the functionality introduced will be added to pip's current test suite. User documentation would not be a major part of this project. The only changes would be to mention pip can now resolve dependencies properly. There are certain sections that might need updating. #### Overview The project will be composed of the following stages: 1. Refactor the dependency resolution logic of pip into a separate module. 1. Implement dependency information caching in pip. 1. Implement a backtracking dependency resolver to resolve the dependencies. Every stage depends on the previous ones being completed. This step-wise approach would make incremental improvements so that there is constant feedback on the work being done, as well as making it easy to change course without losing existing work, if needed for some unforeseen reason. #### Discussion There is a class in pip - `RequirementSet`, which is currently a God class. It is responsible for the following: 1. Holding a list of things to be installed 1. Downloading Files 1. Dependency Resolution 1. Unpacking downloaded files (preparation for installation) 1. Uninstalling packages 1. Installing packages This is clearly a bad situation since this is most of the heavy lifting involved in pip. These responsibilities need to be separated and moved out into their independent modules/classes, to allow for simplification of `RequirementSet` while providing a clean platform for the remaining work. This is the most tricky portion of this project, given the complexity of `RequirementSet` as it stands today. There are two kinds of distributions that may be used to install a package - wheels (binary) and sdists (source). When installing a package, pip builds a wheel from an sdist and then proceeds to install the wheel. The difference between the two formats of distribution relevant to this project is - wheels store the information about dependencies within them statically; sdists do not. Determining the dependencies of a wheel distribution is merely a matter of fetching the information from a METADATA file within the `.whl` file. The dependency information for an sdist, on the other hand, can only be determined after running its `setup.py` file on the target system. This means that dependencies of an sdist depend on how its `setup.py` behaves, which can change due to variations on target systems or could even vary through random choices. Computation of an sdist's dependencies on the target system is a time-consuming task since it potentially involves fetching a package from PyPI and executing its setup.py to get the dependency information. In order to improve performance, once an sdist's dependencies are computed, they would be stored in a cache so that during dependency resolution, the dependencies of an sdist are not computed every time they are needed. Further, pip caches wheels it downloads or builds, meaning that any installed package or downloaded wheel's dependency information would be available statically, without the need to go through the sdist dependency cache. Like the wheel cache, the sdist-dependency-cache will be a file system based cache. The sdist-dependency-cache would only be used if the corresponding sdist is being used. Since sdist dependencies are non-deterministic, the cached dependency information is potentially incorrect - in certain corner cases such as using random choices in setup.py files. Such uses are not seen as important enough to cater to, compared to the benefits of having a cache.
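To illustrate the "static metadata" case described above, the dependency list of a wheel can be pulled straight out of its ``*.dist-info/METADATA`` file without executing any project code. A minimal sketch (the wheel filename in the final comment is just an example):

```python
# Minimal illustration of the "static metadata" case described above:
# a wheel's dependencies can be read from its *.dist-info/METADATA file
# without running any code from the package being inspected.
import zipfile
from email.parser import Parser


def wheel_requires_dist(wheel_path):
    with zipfile.ZipFile(wheel_path) as whl:
        metadata_name = next(
            name for name in whl.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        raw = whl.read(metadata_name).decode("utf-8")
    # METADATA uses an email-style header format, so the stdlib email
    # parser is enough to pull out the Requires-Dist entries.
    return Parser().parsestr(raw).get_all("Requires-Dist") or []


# e.g. wheel_requires_dist("some_project-1.0-py2.py3-none-any.whl")
# returns a list of requirement strings such as "six (>=1.9)".
```

Nothing comparable exists for an sdist, which is exactly why the cache described above has to store the result of actually running setup.py.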
Further, this is already the case with the information in the wheel cache. SAT solver based resolution is not feasible for pip since a SAT solver needs the entire set of packages and their dependencies to compute a solution, which cannot be generated due to the aforementioned non-deterministic behaviour of setup.py files. Computing dependency information for all of PyPI on a target system for installing a package is simply not feasible. The most reasonable approach is using a backtracking solver. Such a solver can be incrementally provided information about the dependencies of a package and would only need dependency information about packages in the dependency graph of the current system. There is a need to keep a cache of visited packages during dependency resolution. A certain package-version combination may be reached via multiple paths and it is an inefficient use of computation time to re-compute whether it is indeed going to satisfy the requirements or not. By storing information about which package-version combinations have been visited and do (or do not) satisfy the constraints, it is possible to speed up the resolution. Consider the following example: ``` A-1.0 (depends: B) A-2.0 (depends: B and E-1.0) B-1.0 (depends: C and D) C-1.0 (depends: D) D-1.0 D-1.1 (depends: E-2.0) E-1.0 E-2.0 ``` If an installation of A is required, A-2.0 and D-1.1 cannot both be installed because they have conflicting requirements on E. While there are multiple possible solutions to this situation, the "most obvious" one is to use D-1.0, instead of not installing A-2.0. Further, as multiple packages depend on D, the algorithm would "reach it" multiple times. By maintaining a cache for the visited packages, it is possible to achieve a speedup in such a scenario. #### Details Pull requests would be made on a regular basis during the project to ensure that the feedback loop is quick. This also reduces the possibility of conflicts due to unrelated changes in pip. All the code will be tested within pip's existing testing infrastructure. It has everything needed to write tests related to all the changes to be made. Every PR made to pip as a part of this project will contain new tests or modifications to existing ones as needed. ##### Stage 1 Initially, some abstractions will be introduced to the pip codebase to improve the reuse of certain common patterns within the codebase. This includes cleaner temporary directory management through a `TempDirectory` class. `RequirementSet.prepare_files` and `RequirementSet._prepare_file` are downloading and unpacking packages, as well as doing what pip currently does as dependency resolution. Taking these functions apart neatly is going to be a tricky task. The following is a listing of the final modules that will be responsible for the various tasks that are currently being performed by `RequirementSet`: - `pip.resolve` - Dependency Resolution - `pip.download` - Downloading Files & Unpacking downloaded files - `pip.req.req_set` - Holding a list of things to be installed / uninstalled - `pip.operations.install` - Installing Packages (preparation for installation) - `pip.operations.uninstall` - Uninstalling Packages To be able to proceed to the next step, only the dependency resolution related code needs to be refactored into a separate module. Other portions of `RequirementSet` are not required to be refactored but the same will be tackled as optional deliverables.
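The kind of search being proposed can be illustrated with a toy sketch over the example graph above. This is purely illustrative: versions are plain strings, a "requirement" is either a bare name (any version) or an exact "name-version" pin, candidates are tried newest-first, and none of the names correspond to actual pip internals or to the visited-package cache bookkeeping.

```python
# Toy backtracking resolver over the example graph above -- purely
# illustrative, not pip code.
CANDIDATES = {
    "A": ["2.0", "1.0"],
    "B": ["1.0"],
    "C": ["1.0"],
    "D": ["1.1", "1.0"],
    "E": ["2.0", "1.0"],
}
DEPENDS = {
    ("A", "1.0"): ["B"],
    ("A", "2.0"): ["B", "E-1.0"],
    ("B", "1.0"): ["C", "D"],
    ("C", "1.0"): ["D"],
    ("D", "1.0"): [],
    ("D", "1.1"): ["E-2.0"],
    ("E", "1.0"): [],
    ("E", "2.0"): [],
}


def resolve(requirements, chosen=None):
    chosen = dict(chosen or {})          # work on a copy at each level
    if not requirements:
        return chosen                    # everything satisfied
    req, rest = requirements[0], requirements[1:]
    name, _, pinned = req.partition("-")
    pinned = pinned or None
    if name in chosen:
        # Already decided earlier on this path: backtrack on conflict.
        if pinned is not None and chosen[name] != pinned:
            return None
        return resolve(rest, chosen)
    for version in CANDIDATES[name]:
        if pinned is not None and version != pinned:
            continue
        chosen[name] = version
        result = resolve(rest + DEPENDS[(name, version)], chosen)
        if result is not None:
            return result
        del chosen[name]                 # undo and try the next candidate
    return None                          # no candidate worked: backtrack


print(resolve(["A"]))
# Picks A 2.0 with B 1.0, C 1.0, D 1.0 and E 1.0 -- the D 1.1 / E 2.0
# branch is tried first and abandoned by backtracking.
```

Run against the example, this ends up with A 2.0 together with D 1.0 and E 1.0, i.e. it backtracks away from the D 1.1 / E 2.0 conflict exactly as the "most obvious" solution above suggests.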
In other words, only `pip.resolve` needs to be completed to be able to proceed to the next stage in this project. This is needed since in Stage 3, the resolver would be written in `pip.resolve`, independent of the rest of the codebase. ##### Stage 2 A new module `pip.cache` will be created. Within this module, all the caching will be handled. Thus, the code for the current wheel cache would be moved. The new code for a dependency cache for sdists would also be written here. The new cache would hold all of an sdist's egg-info. The information will be stored on the file-system in a sub directory structure much like that of the wheel cache, in a directory structure based on hash of the sdist file holding the egg-info at the end. This will be done in a class named `EggInfoCache`. `EggInfoCache` cache will be used only if a corresponding wheel for an sdist is not available. Installing an sdist results in the creation of a wheel which contains the dependency information, which would be used over the information available in the `EggInfoCache`. To be able to proceed to the next step, it is required that `EggInfoCache` is implemented. ##### Stage 3 The module `pip.resolve` will be modified and a class named `BacktrackingResolver` will be added to it. The class does what you expect it to do - it would resolve dependencies with recursive backtracking. As described above, there will be some global state stored about the packages that have been explored. Other than the maintenance of global state, in the form of the cache, the rest of the algorithm will essentially follow the same structure as any backtracking algorithm. The project would be complete when the aforementioned resolver is implemented. #### Existing Work There is existing work directly related to dependency resolution in pip, done by multiple individuals. - Robert Collins (un-merged closed pull request on pypa/pip) This has an incomplete backtracking dependency resolver and dependency caching. - Sebastien Awwad (branch on a separate fork) This was used for the depresolve project, to investigate the state of Dependency Resolution in PyPI/pip ecosystem. - `pip-compile` (separate project) This has a backtracking dependency resolver implemented to overcome pip's inablity to resolve dependencies. Their work would be used for reference, where appropriate, during the course of the project. Further, there are many package managers which implement dependency resolution, which would also be looked into. ### Tentative Timeline - Community Bonding Period: **5 May - 29 May** - Clean up and finalize my existing pull requests. - Read existing literature regarding dependency resolution. - Inspect existing implementations of dependency resolvers. GOAL: Be ready for the work coming up. - Week 1: **30 May - 5 June** - Introduce abstractions across pip's codebase to make refactoring `RequirementSet` easier. - Begin breaking down `RequirementSet.prepare_file`. - Week 2: **6 June - 12 June** - Continue working on the refactor of `RequirementSet`. - Week 3: **13 June - 19 June** - Continue working on the refactor of `RequirementSet`. - Finish and polish `pip.resolve`. GOAL: `pip.resolve` module will be ready, using the current resolution strategy. - Week 4: **20 June - 26 June** - Finish and polish all work on the refactor of `RequirementSet`. - Week 5: **27 June - 3 July** - Move code for `WheelCache` into a new `pip.cache` module. - Write tests for `pip.cache.EggInfoCache`, based on `WheelCache`. - Begin implementation of `pip.cache.EggInfoCache`. 
- Week 6: **4 July - 10 July** - Finish and polish `pip.cache.EggInfoCache`. GOAL: A cache for storing dependency information of sdists would be ready to add to pip. - Week 7: **10 July - 16 July** - Create a comprehensive set of tests for the dependency resolver. (in `tests/unit/test_resolve.py`) - Begin implementation on the backtracking algorithm. GOAL: A comprehensive test bed is ready for testing the dependency resolver. - Week 8: **17 July - 24 July** - Complete a rough implementation of the backtracking algorithm GOAL: An implementation of a dependency resolver to begin running tests on and work on improving. - Week 9: **25 July - 31 July** - Fixing bugs in dependency resolver - Week 10: **1 August - 6 August** - Finish and polish work on dependency resolver GOAL: A ready-to-merge PR adding a backtracking dependency resolver for pip. - Week 11: **6 August - 13 August** Buffer Week. - Week 12: **13 August - 21 August** Buffer Week. Finalization of project for evaluation. If the deliverable is achieved ahead of schedule, the remaining time will be utilized to resolve open issues on pip's repository in the order of priority as determined under the guidance of the mentors. #### Other Commitments I expect to not be able to put in 40 hour/week for at most 3 weeks throughout the working period, due to the schedule of my university. I will have semester-end examinations - from 10th May 2017 to 24th May 2017 - during the Community Bonding Period. My university will re-open for my third semester on 12 July 2017. I expect mid-semester examinations to be held in my University around 20th August 2017. During these times, I would not be able to put in full 40 hour weeks due to the academic workload. I might take a 3-4 day break during this period, regarding which I would be informing my mentor around a week in advance. I will be completely free from 30th May 2017 to 11 July 2017. ### Future Work There seems to be some interest in being able to reuse the above dependency resolution algorithm in other packaging related tools, specifically from the buildout project. I intend to eventually move the dependency resolution code that would come out of this project into a separate library to allow for reuse by installer projects - pip, buildout and other tools. ## Previous Contributions to pip (As on 12th March 2017) ### Issues Authored: - #3785 - Prefering wheel-based installation over source-based installation (Open) - #3786 - Make install command upgrade packages by default (Closed) - #3787 - Check if pip broke the dependency graph and warn the user (Open) - #3807 - Tests fail since change on PyPI (Closed) - #3809 - Switch to TOML for configuration files (Open) - #3871 - Provide a way to perform non-eager upgrades (Closed) - #4198 - Travis CI - pypy broken dues to dependency change in pycrypto (Closed) - #4282 - What's the release schedule? 
(Closed) Participated: - #59 - Add "upgrade" and "upgrade-all" commands (Open) - #988 - Pip needs a dependency resolver (Open) - #1056 - pip install -U does not remember whether a package was installed with --user (Open) - #1511 - ssl certificate hostname mismatch errors presented badly (Open) - #1668 - Default to --user (Open) - #1736 - Create a command to make it easy to access the configuration file (Open) - #1737 - Don't tell the user what they meant, just do what they meant (Open) - #2313 - Automated the Creation and Upload of Release Artifacts (Open) - #2732 - pip install hangs with interactive setup.py setups (Open) - #3549 - pip -U pip fails (Open) - #3580 - Update requests/urllib3 (Closed) - #3610 - pip install from package from github, with github dependencies (Open) - #3788 - `pip` version suggested is older than the version which is installed (Closed) - #3789 - Error installing Mayavi in Mac OS X (Closed) - #3798 - On using python -m pip install -upgrade pip.. its throwing an error like the one below (Closed) - #3811 - no matching distribution found for install (Closed) - #3814 - pip could not find a version that satisfies the requirement oslo.context (Closed) - #3876 - support git refs in @ syntax (Open) - #4021 - Will you accept PRs with pep484 type hints? (Open) - #4087 - pip list produces error (Closed) - #4149 - Exception thrown when binary is already linked to /usr/local/bin (Open) - #4160 - Pip does not seem to be handling deep requirements correctly (Open) - #4162 - Let --find-links be context aware to support github, gitlab, etc. links (Open) - #4170 - pip list |head raises BrokenPipeError (Open) - #4182 - pip install should install packages in order to avoid ABI incompatibilities in compiled (Open) - #4186 - IOError: [Errno 13] Permission denied: '/usr/local/bin/pip' (Open) - #4206 - Where on Windows 10 is pip.conf or pip.ini located? (Closed) - #4221 - Feature request: Check if user has permissions before downloading files (Closed) - #4229 - "pip uninstall" is too noisy (Open) #### Pull Requests Authored: - #3806 - Change install command's default behaviour to upgrade packages by default (Closed, superseded by #3972) - #3808 - Fix Tests (Merged) - #3818 - Improve UX and tests of check command (Merged) - #3972 - Add an upgrade-strategy option (Merged) - #3974 - [minor] An aesthetic change to wheel command source (Merged) - #4192 - Move out all the config code to a separate module (Merged) - #4193 - Add the ability to autocorrect a user's command (Open) - #4199 - Fix Tests for Travis CI (Merged) - #4200 - Reduce the API exposed by the configuration class (Merged) - #4232 - Update documentation to mention upgrade-strategy (Merged) - #4233 - Nicer permissions error message (Open) - #4240 - [WIP] Add a configuration command (Open) Participated: - #2716 - Issue #988: new resolver for pip (Closed) - #2975 - Different mock dependencies based on Python version (Merged) - #3744 - Add a "Upgrade all local installed packages" (Open) - #3750 - Add a `pip check` command. 
(Merged) - #3751 - tox.ini: Add "cover" target (Open) - #3794 - Use the canonicalize_name function for finding .dist-info (Merged) - #4142 - Optionally load C dependencies based on platform (Open) - #4144 - Install build dependencies as specified in PEP 518 (Open) - #4150 - Clarify default for --upgrade-strategy is eager (Merged) - #4194 - Allow passing a --no-progress-bar to the install script to surpress progress bar (Merged) - #4201 - Register req_to_install for cleanup sooner (Merged) - #4202 - Switch to 3.6.0 final as our latest 3.x release (Merged) - #4211 - improve message when installing requirements file (#4127) (Merged) - #4241 - Python 3.6 is supported (Merged) ## References 1. [pypa/pip#988](https://github.com/pypa/pip/issues/988) Tracking issue for adding a proper dependency resolver in pip. Contains links to various useful resources. 1. [pypa/pip#2716](https://github.com/pypa/pip/issues/2716) Prior work by Robert Collins for adding a proper dependency resolver in pip. 1. [Python Dependency Resolution]( https://docs.google.com/document/d/1x_VrNtXCup75qA3glDd2fQOB2TakldwjKZ6pXaAjAfg/edit?usp=sharing ) A writeup by Sebastian Awwad on the current state of dependency resolution in pip and PyPI in general. 1. [PSF Application Template]( https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2017) For guidance on how to write the application and what information is needed. 1. [Stork: Secure Package Management for VM Environments]( http://isis.poly.edu/~jcappos/papers/cappos_stork_dissertation_08.pdf) A Paper by Justin Cappos about Stork, used for reference regarding Backtracking Resolution -------------- next part -------------- An HTML attachment was scrubbed... URL:
From pradyunsg at gmail.com Sat Mar 25 11:50:17 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sat, 25 Mar 2017 15:50:17 +0000 Subject: [Distutils] GSoC 2017 - RFC on Proposal In-Reply-To: References: Message-ID: I apologize for the duplicate mail. Please respond on this thread. On Sat, 25 Mar 2017 at 21:16 Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to > work on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC.
> > I would like to request for comments on my proposal for GSoC - it is > hosted at > https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > Please find trailing a MarkDown version of the proposal. > > Thanks, > Pradyun Gedam > > ----- > > # Adding Proper Dependency Resolution to pip > > - **Name:** Pradyun S. Gedam > - **Email:** [pradyunsg at gmail.com][mailto-email] > - **Github:** [pradyunsg][github-profile] > - **University:** [VIT University, Vellore, India][vit-homepage] > - **Course:** Bachelor of Technology in Computer Science and Engineering > - **Course Term:** 2016/17 - 2019/20 (4 Year) > - **Timezone:** IST (GMT +5:30) > - **GSoC Blog RSS Feed URL:** < > https://pradyunsg.github.io/gsoc-2017/feed.xml> > > [github-profile]: http://github.com/pradyunsg/ > [vit-homepage]: http://vit.ac.in/ > [mailto-email]: mailto:pradyunsg at gmail.com > > ## About Me > > I was introduced to Python about five years ago, through "Core Python > Programming" by Weasley J Chun. After the initial struggle with Python > 2/3, the > ball was set rolling and hasn't stopped since. I have fiddled around with > Game Programming (PyGame), Computer Vision (OpenCV), Data Analytics > (Pandas, > SciPy, NumPy), transcompilers (Py2C) and more. > > As a high school student, I did internship at Enthought in 2013 and 2014. > The two summers that I spent at Enthought were a great learning experience > and > I thoroughly enjoyed the environment there. > > Other than Python, I have also used C, C++, Web Technologies (JavaScript, > HTML, > CSS) and preprocessors (Pug, TypeScript, LESS/SASS/SCSS), Java and > Bash/Zsh for > some other projects. > > Curious to understand how pip works, I began looking into pip's source > code. > I started contributing to pip in May 2016. I am now fairly familiar with > the > design of pip and have a fair understanding of how it works, due to the > various > contributions I have made to pip in the past year. > > ## Mentors > > - Donald Stufft (@dstufft on GitHub) > - Justin Cappos (@JustinCappos on GitHub) > > Communication with the mentors will be done mostly on issues and pull > requests > on pip's GitHub repository. If at any point in time, a real time > discussion is > to be done with the mentors, it would be done over IRC or Skype. Email can > also > be used if needed. > > ## Proposal > > This project is regarding improving dependency resolution performed within > pip > by implementing a dependency resolver within it. > > ### Abstract > > Currently, pip does not resolve dependencies correctly when there are > conflicting requirements. The lack of dependency resolution has caused > hard-to-debug bugs/failures due to the installation of incompatible > packages. > The lack of a dependency resolver is also a blocker for various other > features - > adding an upgrade-all functionality to pip and properly determining > build-time > dependencies for packages are two such features. > > ### Deliverables > > At the end of this project, pip will have the ability to: > > - detect requirement conflicts > - resolve conflicting dependency requirements where possible > > ### Implementation > > The implementation language will be Python. The code will maintain the > compatibility requirements of pip - the same source code will support the > multiple Python implementations and version, including but not limited to, > CPython 2.7, CPython 3.3+, PyPy 2, PyPy3. > > New Tests for the functionality introduced will be added to pip's current > test > suite. 
> > User documentation would not be a major part of this project. The only > changes > would be to mention pip can now resolve dependencies properly. There are > certain > sections that might need updating. > > #### Overview > > The project will be composed of the following stages: > > 1. Refactor the dependency resolution logic of pip into a separate module. > 1. Implement dependency information caching in pip. > 1. Implement a backtracking dependency resolver to resolve the > dependencies. > > Every stage depends on the previous ones being completed. This step-wise > approach would make incremental improvements so that there is a constant > feedback on the work being done as well as making it easy to change course > without losing existing work; if needed for some unforeseen reason. > > #### Discussion > > There is a class in pip - `RequirementSet`, which is currently a God > class. It > is responsible for the following: > > 1. Holding a list of things to be installed > 1. Downloading Files > 1. Dependency Resolution > 1. Unpacking downloaded files (preparation for installation) > 1. Uninstalling packages > 1. Installing packages > > This is clearly a bad situation since this is most of the heavy lifting > involved in pip. These responsibilities need to be separated and moved out > into > their independent modules/classes, to allow for simplification of > `RequirementSet` while providing a clean platform for the remaining work. > This is the most tricky portion of this project, given the complexity of > `RequirementSet` as it stands today. > > There are two kinds of distributions that may be used to install a package > - > wheels (binary) and sdists (source). When installing a package, pip builds > a > wheel from an sdist and then proceeds to install the wheel. The difference > between the two formats of distribution relevant to this project is - > wheels > store the information about dependencies within them statically; sdists do > not. > > Determining the dependencies of a wheel distribution is merely the matter > of > fetching the information from a METADATA file within the `.whl` file. The > dependency information for an sdist, on the other hand, can only be > determined > after running its `setup.py` file on the target system. This means that > dependencies of an sdist depend on how its `setup.py` behaves which can > change > due to variations on target systems or could even contain through random > choices. > > Computation of an sdist's dependencies on the target system is a > time-consuming > task since it potentially involves fetching a package from PyPI and > executing > it's setup.py to get the dependency information. In order to improve > performance, once an sdist's dependencies are computed, they would be > stored in > a cache so that during dependency resolution, the dependencies of an sdist > are > not computed every time they are needed. Further, pip caches wheels it > downloads or builds meaning that any installed package or downloaded > wheel's > dependency information would available statically, without the need to go > through the sdist dependency cache. > > Like the wheel cache, sdist-dependency-cache will be a file system based > cache. > The sdist-dependency-cache would only be used if the corresponding sdist is > being used. > > Since sdist dependencies are non-deterministic, the cached dependency > information is potentially incorrect - in certain corner cases such as > using > random choices in setup.py files. 
Such uses are not seen as important enough
> to cater to, compared to the benefits of having a cache. Further, this is
> already the case with the information in the wheel cache.
>
> SAT solver based resolution is not feasible for pip since a SAT solver
> needs the entire set of packages and their dependencies to compute a
> solution, which cannot be generated due to the aforementioned
> non-deterministic behaviour of setup.py files. Computing dependency
> information for all of PyPI on a target system for installing a package
> is simply not feasible.
>
> The most reasonable approach is using a backtracking solver. Such a
> solver can be incrementally provided information about the dependencies
> of a package and would only need dependency information about packages
> in the dependency graph of the current system.
>
> There is a need to keep a cache of visited packages during dependency
> resolution. A certain package-version combination may be reached via
> multiple paths and it is an inefficient use of computation time to
> re-compute whether it is indeed going to satisfy the requirements or
> not. By storing information about which package-version combinations
> have been visited and do (or do not) satisfy the constraints, it is
> possible to speed up the resolution.
>
> Consider the following example:
>
> ```
> A-1.0 (depends: B)
> A-2.0 (depends: B and E-1.0)
> B-1.0 (depends: C and D)
> C-1.0 (depends: D)
> D-1.0
> D-1.1 (depends: E-2.0)
> E-1.0
> E-2.0
> ```
>
> If an installation of A is required, A-2.0 and D-1.1 cannot both be
> installed because they require conflicting versions of E. While there
> are multiple possible solutions to this situation, the "most obvious"
> one is to use D-1.0, rather than giving up on installing A-2.0. Further,
> as multiple packages depend on D, the algorithm would "reach it"
> multiple times. By maintaining a cache for the visited packages, it is
> possible to achieve a speedup in such a scenario.
>
> #### Details
>
> Pull requests would be made on a regular basis during the project to
> ensure that the feedback loop is quick. This also reduces the
> possibility of conflicts due to unrelated changes in pip.
>
> All the code will be tested within pip's existing testing
> infrastructure. It has everything needed to write tests related to all
> the changes to be made. Every PR made to pip as a part of this project
> will contain new tests or modifications to existing ones as needed.
>
> ##### Stage 1
>
> Initially, some abstractions will be introduced to the pip codebase to
> improve the reuse of certain common patterns. This includes cleaner
> temporary directory management through a `TempDirectory` class.
>
> `RequirementSet.prepare_files` and `RequirementSet._prepare_file`
> currently handle downloading and unpacking packages, as well as what
> pip currently does for dependency resolution. Taking these functions
> apart neatly is going to be a tricky task.
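To make the backtracking-with-caching idea from the Discussion section above concrete, here is a minimal, self-contained sketch over the A/B/C/D/E example; it is not pip's code, and the toy index format, the function names and the newest-version-first preference are all assumptions made purely for illustration:

```python
# A toy candidate index mirroring the example above: each candidate maps to
# the requirements it introduces.  A real resolver would fetch this lazily.
INDEX = {
    "A": {"2.0": ["B", "E==1.0"], "1.0": ["B"]},
    "B": {"1.0": ["C", "D"]},
    "C": {"1.0": ["D"]},
    "D": {"1.1": ["E==2.0"], "1.0": []},
    "E": {"2.0": [], "1.0": []},
}


def parse(req):
    """Split a requirement like 'E==1.0' into (name, pinned-version-or-None)."""
    name, _, version = req.partition("==")
    return name, version or None


def resolve(requirements, chosen=None, seen_conflicts=None):
    """Return a {name: version} mapping satisfying all requirements, or None."""
    chosen = dict(chosen or {})
    # Combinations already known to lead to a dead end, shared across branches
    # so the same failure is not re-explored when reached via another path.
    seen_conflicts = seen_conflicts if seen_conflicts is not None else set()

    if not requirements:
        return chosen
    name, pin = parse(requirements[0])
    rest = requirements[1:]

    if name in chosen:
        # Already decided: either the existing choice is consistent, or we fail.
        if pin is None or chosen[name] == pin:
            return resolve(rest, chosen, seen_conflicts)
        return None

    # Try candidate versions newest first (plain string sort is enough here).
    for version in sorted(INDEX[name], reverse=True):
        if pin and version != pin:
            continue
        key = (name, version, frozenset(chosen.items()))
        if key in seen_conflicts:
            continue
        chosen[name] = version
        result = resolve(INDEX[name][version] + rest, chosen, seen_conflicts)
        if result is not None:
            return result
        seen_conflicts.add(key)  # Remember the dead end, then backtrack.
        del chosen[name]
    return None


print(resolve(["A"]))
# {'A': '2.0', 'B': '1.0', 'C': '1.0', 'D': '1.0', 'E': '1.0'}
```

Running it resolves "A" to A-2.0 together with D-1.0 and E-1.0, matching the "most obvious" solution described above; the `seen_conflicts` set plays the role of the visited-package cache, remembering combinations already known to fail so they are not recomputed when the same state is reached through another path.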
> > The following is a listing of the final modules that will be responsible > for > the various tasks that are currently being performed by `RequirementSet`: > > - `pip.resolve` - Dependency Resolution > - `pip.download` - Downloading Files & Unpacking downloaded files > - `pip.req.req_set` - Holding a list of things to be installed / > uninstalled > - `pip.operations.install` - Installing Packages (preparation for > installation) > - `pip.operations.uninstall` - Uninstalling Packages > > To be able to proceed to the next step, only the dependency resolution > related > code needs to be refactored into a separate module. Other portions of > `RequirementSet` are not required to be refactored but the same will be > tackled > as optional deliverables. In other words, only `pip.resolve` needs to be > completed to be able to proceed to the next stage in this project. This is > needed since in Stage 3, the resolver would be written in `pip.resolve`, > independent of the rest of the codebase. > > ##### Stage 2 > > A new module `pip.cache` will be created. Within this module, all the > caching > will be handled. Thus, the code for the current wheel cache would be moved. > The new code for a dependency cache for sdists would also be written here. > > The new cache would hold all of an sdist's egg-info. The information will > be > stored on the file-system in a sub directory structure much like that of > the > wheel cache, in a directory structure based on hash of the sdist file > holding > the egg-info at the end. This will be done in a class named `EggInfoCache`. > > `EggInfoCache` cache will be used only if a corresponding wheel for an > sdist is > not available. Installing an sdist results in the creation of a wheel which > contains the dependency information, which would be used over the > information > available in the `EggInfoCache`. > > To be able to proceed to the next step, it is required that `EggInfoCache` > is > implemented. > > ##### Stage 3 > > The module `pip.resolve` will be modified and a class named > `BacktrackingResolver` will be added to it. The class does what you expect > it > to do - it would resolve dependencies with recursive backtracking. As > described > above, there will be some global state stored about the packages that have > been > explored. Other than the maintenance of global state, in the form of the > cache, > the rest of the algorithm will essentially follow the same structure as any > backtracking algorithm. > > The project would be complete when the aforementioned resolver is > implemented. > > #### Existing Work > > There is existing work directly related to dependency resolution in pip, > done by > multiple individuals. > > - Robert Collins (un-merged closed pull request on pypa/pip) > > This has an incomplete backtracking dependency resolver and dependency > caching. > > - Sebastien Awwad (branch on a separate fork) > > This was used for the depresolve project, to investigate the state of > Dependency Resolution in PyPI/pip ecosystem. > > - `pip-compile` (separate project) > > This has a backtracking dependency resolver implemented to overcome pip's > inablity to resolve dependencies. > > Their work would be used for reference, where appropriate, during the > course of > the project. Further, there are many package managers which implement > dependency > resolution, which would also be looked into. > > ### Tentative Timeline > > - Community Bonding Period: **5 May - 29 May** > > - Clean up and finalize my existing pull requests. 
> - Read existing literature regarding dependency resolution. > - Inspect existing implementations of dependency resolvers. > > GOAL: Be ready for the work coming up. > > - Week 1: **30 May - 5 June** > > - Introduce abstractions across pip's codebase to make refactoring > `RequirementSet` easier. > - Begin breaking down `RequirementSet.prepare_file`. > > - Week 2: **6 June - 12 June** > > - Continue working on the refactor of `RequirementSet`. > > - Week 3: **13 June - 19 June** > > - Continue working on the refactor of `RequirementSet`. > - Finish and polish `pip.resolve`. > > GOAL: `pip.resolve` module will be ready, using the current resolution > strategy. > > - Week 4: **20 June - 26 June** > > - Finish and polish all work on the refactor of `RequirementSet`. > > - Week 5: **27 June - 3 July** > > - Move code for `WheelCache` into a new `pip.cache` module. > - Write tests for `pip.cache.EggInfoCache`, based on `WheelCache`. > - Begin implementation of `pip.cache.EggInfoCache`. > > - Week 6: **4 July - 10 July** > > - Finish and polish `pip.cache.EggInfoCache`. > > GOAL: A cache for storing dependency information of sdists would be > ready to > add to pip. > > - Week 7: **10 July - 16 July** > > - Create a comprehensive set of tests for the dependency resolver. > (in `tests/unit/test_resolve.py`) > - Begin implementation on the backtracking algorithm. > > GOAL: A comprehensive test bed is ready for testing the dependency > resolver. > > - Week 8: **17 July - 24 July** > > - Complete a rough implementation of the backtracking algorithm > > GOAL: An implementation of a dependency resolver to begin running tests > on > and work on improving. > > - Week 9: **25 July - 31 July** > > - Fixing bugs in dependency resolver > > - Week 10: **1 August - 6 August** > > - Finish and polish work on dependency resolver > > GOAL: A ready-to-merge PR adding a backtracking dependency resolver for > pip. > > - Week 11: **6 August - 13 August** > > Buffer Week. > > - Week 12: **13 August - 21 August** > > Buffer Week. Finalization of project for evaluation. > > If the deliverable is achieved ahead of schedule, the remaining time will > be > utilized to resolve open issues on pip's repository in the order of > priority as > determined under the guidance of the mentors. > > #### Other Commitments > > I expect to not be able to put in 40 hour/week for at most 3 weeks > throughout > the working period, due to the schedule of my university. > > I will have semester-end examinations - from 10th May 2017 to 24th May > 2017 - > during the Community Bonding Period. My university will re-open for my > third > semester on 12 July 2017. I expect mid-semester examinations to be held in > my > University around 20th August 2017. During these times, I would not be > able to > put in full 40 hour weeks due to the academic workload. > > I might take a 3-4 day break during this period, regarding which I would be > informing my mentor around a week in advance. > > I will be completely free from 30th May 2017 to 11 July 2017. > > ### Future Work > > There seems to be some interest in being able to reuse the above dependency > resolution algorithm in other packaging related tools, specifically from > the > buildout project. I intend to eventually move the dependency resolution > code > that would come out of this project into a separate library to allow for > reuse > by installer projects - pip, buildout and other tools. 
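Relating back to the `EggInfoCache` described in Stage 2, the sketch below shows one way a file-system cache keyed by the hash of an sdist could be laid out; the class name, directory layout, JSON format and the `compute_sdist_deps()` helper are illustrative assumptions rather than pip's actual design:

```python
import hashlib
import json
from pathlib import Path


class SdistDependencyCache:
    """Toy file-system cache mapping an sdist file to its computed dependencies."""

    def __init__(self, root):
        self.root = Path(root).expanduser()

    def _entry(self, sdist_path):
        # Key the entry on a hash of the sdist's contents, bucketed into
        # subdirectories, similar in spirit to how the wheel cache is keyed.
        digest = hashlib.sha256(Path(sdist_path).read_bytes()).hexdigest()
        return self.root / digest[:2] / digest[2:] / "deps.json"

    def get(self, sdist_path):
        entry = self._entry(sdist_path)
        return json.loads(entry.read_text()) if entry.exists() else None

    def set(self, sdist_path, requirements):
        entry = self._entry(sdist_path)
        entry.parent.mkdir(parents=True, exist_ok=True)
        entry.write_text(json.dumps(sorted(requirements)))


# Usage sketch: compute dependencies (for example, by running egg_info in a
# temporary directory -- compute_sdist_deps() is a hypothetical helper here)
# only when the cache misses:
#
#     cache = SdistDependencyCache("~/.cache/pip/sdist-deps")
#     deps = cache.get("requests-2.13.0.tar.gz")
#     if deps is None:
#         deps = compute_sdist_deps("requests-2.13.0.tar.gz")
#         cache.set("requests-2.13.0.tar.gz", deps)
```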
> > ## Previous Contributions to pip > > (As on 12th March 2017) > > ### Issues > > Authored: > > - #3785 - Prefering wheel-based installation over source-based > installation (Open) > - #3786 - Make install command upgrade packages by default (Closed) > - #3787 - Check if pip broke the dependency graph and warn the user (Open) > - #3807 - Tests fail since change on PyPI (Closed) > - #3809 - Switch to TOML for configuration files (Open) > - #3871 - Provide a way to perform non-eager upgrades (Closed) > - #4198 - Travis CI - pypy broken dues to dependency change in pycrypto > (Closed) > - #4282 - What's the release schedule? (Closed) > > Participated: > > - #59 - Add "upgrade" and "upgrade-all" commands (Open) > - #988 - Pip needs a dependency resolver (Open) > - #1056 - pip install -U does not remember whether a package was installed > with --user (Open) > - #1511 - ssl certificate hostname mismatch errors presented badly (Open) > - #1668 - Default to --user (Open) > - #1736 - Create a command to make it easy to access the configuration > file (Open) > - #1737 - Don't tell the user what they meant, just do what they meant > (Open) > - #2313 - Automated the Creation and Upload of Release Artifacts (Open) > - #2732 - pip install hangs with interactive setup.py setups (Open) > - #3549 - pip -U pip fails (Open) > - #3580 - Update requests/urllib3 (Closed) > - #3610 - pip install from package from github, with github dependencies > (Open) > - #3788 - `pip` version suggested is older than the version which is > installed (Closed) > - #3789 - Error installing Mayavi in Mac OS X (Closed) > - #3798 - On using python -m pip install -upgrade pip.. its throwing an > error like the one below (Closed) > - #3811 - no matching distribution found for install (Closed) > - #3814 - pip could not find a version that satisfies the requirement > oslo.context (Closed) > - #3876 - support git refs in @ syntax (Open) > - #4021 - Will you accept PRs with pep484 type hints? (Open) > - #4087 - pip list produces error (Closed) > - #4149 - Exception thrown when binary is already linked to /usr/local/bin > (Open) > - #4160 - Pip does not seem to be handling deep requirements correctly > (Open) > - #4162 - Let --find-links be context aware to support github, gitlab, > etc. links (Open) > - #4170 - pip list |head raises BrokenPipeError (Open) > - #4182 - pip install should install packages in order to avoid ABI > incompatibilities in compiled (Open) > - #4186 - IOError: [Errno 13] Permission denied: '/usr/local/bin/pip' > (Open) > - #4206 - Where on Windows 10 is pip.conf or pip.ini located? 
(Closed) > - #4221 - Feature request: Check if user has permissions before > downloading files (Closed) > - #4229 - "pip uninstall" is too noisy (Open) > > #### Pull Requests > > Authored: > > - #3806 - Change install command's default behaviour to upgrade packages > by default (Closed, superseded by #3972) > - #3808 - Fix Tests (Merged) > - #3818 - Improve UX and tests of check command (Merged) > - #3972 - Add an upgrade-strategy option (Merged) > - #3974 - [minor] An aesthetic change to wheel command source (Merged) > - #4192 - Move out all the config code to a separate module (Merged) > - #4193 - Add the ability to autocorrect a user's command (Open) > - #4199 - Fix Tests for Travis CI (Merged) > - #4200 - Reduce the API exposed by the configuration class (Merged) > - #4232 - Update documentation to mention upgrade-strategy (Merged) > - #4233 - Nicer permissions error message (Open) > - #4240 - [WIP] Add a configuration command (Open) > > Participated: > > - #2716 - Issue #988: new resolver for pip (Closed) > - #2975 - Different mock dependencies based on Python version (Merged) > - #3744 - Add a "Upgrade all local installed packages" (Open) > - #3750 - Add a `pip check` command. (Merged) > - #3751 - tox.ini: Add "cover" target (Open) > - #3794 - Use the canonicalize_name function for finding .dist-info > (Merged) > - #4142 - Optionally load C dependencies based on platform (Open) > - #4144 - Install build dependencies as specified in PEP 518 (Open) > - #4150 - Clarify default for --upgrade-strategy is eager (Merged) > - #4194 - Allow passing a --no-progress-bar to the install script to > surpress progress bar (Merged) > - #4201 - Register req_to_install for cleanup sooner (Merged) > - #4202 - Switch to 3.6.0 final as our latest 3.x release (Merged) > - #4211 - improve message when installing requirements file (#4127) > (Merged) > - #4241 - Python 3.6 is supported (Merged) > > ## References > > 1. [pypa/pip#988](https://github.com/pypa/pip/issues/988) > > Tracking issue for adding a proper dependency resolver in pip. Contains > links to various useful resources. > > 1. [pypa/pip#2716](https://github.com/pypa/pip/issues/2716) > > Prior work by Robert Collins for adding a proper dependency resolver in > pip. > > 1. [Python Dependency Resolution]( > https://docs.google.com/document/d/1x_VrNtXCup75qA3glDd2fQOB2TakldwjKZ6pXaAjAfg/edit?usp=sharing > ) > > A writeup by Sebastian Awwad on the current state of dependency > resolution > in pip and PyPI in general. > > 1. [PSF Application Template]( > https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2017) > > For guidance on how to write the application and what information is > needed. > > 1. [Stork: Secure Package Management for VM Environments]( > http://isis.poly.edu/~jcappos/papers/cappos_stork_dissertation_08.pdf) > > A Paper by Justin Cappos about Stork, used for reference regarding > Backtracking Resolution > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Mar 25 12:47:25 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Mar 2017 02:47:25 +1000 Subject: [Distutils] GSoC 2017 - RFC on Proposal In-Reply-To: References: Message-ID: On 26 March 2017 at 01:46, Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to work > on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. 
I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC. The problem description and proposed resolution plan both look excellent to me - thank you to you all, and I hope the project goes well! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Sat Mar 25 13:32:53 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 25 Mar 2017 17:32:53 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: On 25 March 2017 at 15:44, Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to work > on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC. > > I would like to request for comments on my proposal for GSoC - it is hosted > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > Please find trailing a MarkDown version of the proposal. Hi Pradyun, Your proposal looks pretty impressive - well structured and thought out. I've looked through it and the plans seem reasonable - I've looked more at what you're proposing than at the timescales, but your staged approach seems sensible - if you do hit issues with time, it looks like you'll be able to deliver real improvements even if you don't complete everything, which is fantastic. Best of luck - assuming you complete all the work you've planned, it will be a significant benefit to pip. If I can be of any help, with PR reviews or anything similar, feel free to ping me. Paul From waynejwerner at gmail.com Sat Mar 25 22:31:15 2017 From: waynejwerner at gmail.com (Wayne Werner) Date: Sun, 26 Mar 2017 02:31:15 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: Tiny change: I believe its Wesley, not Weasley :) On Sat, Mar 25, 2017, 12:33 PM Paul Moore wrote: > On 25 March 2017 at 15:44, Pradyun Gedam wrote: > > Hello Everyone! > > > > I had previously sent a mail on this list, stating that I would like to > work > > on pip's dependency resolution for my GSoC 2017. I now have drafted a > > proposal for the same; with help from my mentors - Donald Stufft and > Justin > > Cappos. I'd also take this opportunity to thank them for agreeing to be > my > > mentors for this GSoC. > > > > I would like to request for comments on my proposal for GSoC - it is > hosted > > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > > > Please find trailing a MarkDown version of the proposal. > > Hi Pradyun, > Your proposal looks pretty impressive - well structured and thought > out. I've looked through it and the plans seem reasonable - I've > looked more at what you're proposing than at the timescales, but your > staged approach seems sensible - if you do hit issues with time, it > looks like you'll be able to deliver real improvements even if you > don't complete everything, which is fantastic. > > Best of luck - assuming you complete all the work you've planned, it > will be a significant benefit to pip. If I can be of any help, with PR > reviews or anything similar, feel free to ping me. 
> > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 26 00:29:56 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 26 Mar 2017 04:29:56 +0000 Subject: [Distutils] GSoC 2017 - RFC on Proposal In-Reply-To: References: Message-ID: Thank You Nick! ^.^ On Sat, Mar 25, 2017, 22:17 Nick Coghlan wrote: On 26 March 2017 at 01:46, Pradyun Gedam wrote: > Hello Everyone! > > I had previously sent a mail on this list, stating that I would like to work > on pip's dependency resolution for my GSoC 2017. I now have drafted a > proposal for the same; with help from my mentors - Donald Stufft and Justin > Cappos. I'd also take this opportunity to thank them for agreeing to be my > mentors for this GSoC. The problem description and proposed resolution plan both look excellent to me - thank you to you all, and I hope the project goes well! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 26 00:37:56 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 26 Mar 2017 04:37:56 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: On Sat, Mar 25, 2017, 23:02 Paul Moore wrote: > On 25 March 2017 at 15:44, Pradyun Gedam wrote: > > Hello Everyone! > > > > I had previously sent a mail on this list, stating that I would like to > work > > on pip's dependency resolution for my GSoC 2017. I now have drafted a > > proposal for the same; with help from my mentors - Donald Stufft and > Justin > > Cappos. I'd also take this opportunity to thank them for agreeing to be > my > > mentors for this GSoC. > > > > I would like to request for comments on my proposal for GSoC - it is > hosted > > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > > > Please find trailing a MarkDown version of the proposal. > > Hi Pradyun, > Your proposal looks pretty impressive - well structured and thought > out. I've looked through it and the plans seem reasonable - I've > looked more at what you're proposing than at the timescales, but your > staged approach seems sensible - if you do hit issues with time, it > looks like you'll be able to deliver real improvements even if you > don't complete everything, which is fantastic. > > Best of luck - assuming you complete all the work you've planned, it > will be a significant benefit to pip. If I can be of any help, with PR > reviews or anything similar, feel free to ping me. > > Paul > Thank You Paul! Hopefully, I'll be able to overcome any issue that come up. I definitely will ping you. :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradyunsg at gmail.com Sun Mar 26 00:40:06 2017 From: pradyunsg at gmail.com (Pradyun Gedam) Date: Sun, 26 Mar 2017 04:40:06 +0000 Subject: [Distutils] GSoC 2017 - Request for Comments on Proposal In-Reply-To: References: Message-ID: On Sun, Mar 26, 2017, 08:01 Wayne Werner wrote: > Tiny change: I believe its Wesley, not Weasley :) > Indeed. My bad. I'll fix it. Thank you! > On Sat, Mar 25, 2017, 12:33 PM Paul Moore wrote: > > On 25 March 2017 at 15:44, Pradyun Gedam wrote: > > Hello Everyone! 
> > > > I had previously sent a mail on this list, stating that I would like to > work > > on pip's dependency resolution for my GSoC 2017. I now have drafted a > > proposal for the same; with help from my mentors - Donald Stufft and > Justin > > Cappos. I'd also take this opportunity to thank them for agreeing to be > my > > mentors for this GSoC. > > > > I would like to request for comments on my proposal for GSoC - it is > hosted > > at https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb. > > > > Please find trailing a MarkDown version of the proposal. > > Hi Pradyun, > Your proposal looks pretty impressive - well structured and thought > out. I've looked through it and the plans seem reasonable - I've > looked more at what you're proposing than at the timescales, but your > staged approach seems sensible - if you do hit issues with time, it > looks like you'll be able to deliver real improvements even if you > don't complete everything, which is fantastic. > > Best of luck - assuming you complete all the work you've planned, it > will be a significant benefit to pip. If I can be of any help, with PR > reviews or anything similar, feel free to ping me. > > Paul > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Mar 26 22:28:53 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Mar 2017 12:28:53 +1000 Subject: [Distutils] Wheel files for PPC64le In-Reply-To: References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br> Message-ID: On 24 March 2017 at 12:58, Nick Coghlan wrote: > On 24 March 2017 at 05:00, Leonardo Bianconi > wrote: >> Docker Image (Will be implemented when CentOS be available on Docker) >> ------------ >> >> The Docker Image is based on CentOS 7 [4]_, which is the first PowerPC 64 >> little endian CentOS release. The Image contains all necessary tools in >> the >> requested version to build wheel files (gcc, g++ and gfortran 4.8.5). > > > These seem to be present now: https://hub.docker.com/r/ppc64le/centos/tags/ > > I'm not clear on the provenance of the 'ppc64le' user account though, so > I've asked for clarification: > ttps://twitter.com/ncoghlan_dev/status/845099237117329408 Looks like these are genuinely official images maintained by IBM+Docker engineers: https://twitter.com/estesp/status/845296651363246080 And they're referenced from the Docker "official images" README: https://github.com/docker-library/official-images/blob/master/README.md#architectures-other-than-amd64 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From kdreyer at redhat.com Fri Mar 24 18:24:18 2017 From: kdreyer at redhat.com (Ken Dreyer) Date: Fri, 24 Mar 2017 16:24:18 -0600 Subject: [Distutils] package ownership transfer: kobo Message-ID: Hi folks, Would someone please give me access to upload new versions of kobo to PyPI? I've filed the transfer request [1] a while back, and several users are asking for new versions. [2] - Ken [1] https://sourceforge.net/p/pypi/support-requests/632/ [2] https://github.com/release-engineering/kobo/issues/24 From brett at python.org Mon Mar 27 14:39:45 2017 From: brett at python.org (Brett Cannon) Date: Mon, 27 Mar 2017 18:39:45 +0000 Subject: [Distutils] Is the SourceForge repo still used? 
(was: package ownership transfer: kobo In-Reply-To: References: Message-ID: On Mon, 27 Mar 2017 at 05:02 Ken Dreyer wrote: > [SNIP] > [1] https://sourceforge.net/p/pypi/support-requests/632/ I didn't even know this repo existed until I noticed that the sidebar on pypi.python.org. It isn't mentioned anywhere on https://pypi.org/help/ . Should people still be using that project or the Google Form? Or do we need a GitHub repo to track things like package ownership transfers? -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Wed Mar 29 01:29:44 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Wed, 29 Mar 2017 07:29:44 +0200 Subject: [Distutils] Source of confusion Message-ID: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Quoting https://www.pypa.io/en/latest/ {{{ They host projects on github and bitbucket, and discuss issues on the pypa-dev and distutils-sig mailing lists. }}} I don't know where to go to. ... Choices .... too many choices ..... ... giving me the feeling of uncertainty. I am feeling fear. I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. Please help me. Regards, Thomas G?ttler -- I am looking for feedback for my personal programming guidelines: https://github.com/guettli/programming-guidelines From p.f.moore at gmail.com Wed Mar 29 03:51:58 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 29 Mar 2017 08:51:58 +0100 Subject: [Distutils] Source of confusion In-Reply-To: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 29 March 2017 at 06:29, Thomas G?ttler wrote: > I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. To do what? Paul From ncoghlan at gmail.com Wed Mar 29 04:27:01 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 Mar 2017 18:27:01 +1000 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 29 March 2017 at 17:51, Paul Moore wrote: > On 29 March 2017 at 06:29, Thomas G?ttler wrote: >> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. > > To do what? As far as I can tell, to get a customer experience instead of a prospective co-contributor one. I'm sorry Thomas, as long as you continue looking for a coherent customer experience from a collaborative collection of volunteer-run community projects, you're going to continually be confused and disappointed. The Python ecosystem *does* include commercial vendors that offer to make opinionated technical decisions on behalf of their customers, as well as providing a single point of contact for support questions and feature requests, but beyond that, offering an overwhelming array of confusing choices is pretty much the way open source *works*. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guettliml at thomas-guettler.de Wed Mar 29 05:31:35 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Wed, 29 Mar 2017 11:31:35 +0200 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: Am 29.03.2017 um 09:51 schrieb Paul Moore: > On 29 March 2017 at 06:29, Thomas G?ttler wrote: >> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. > > To do what? To find canonical docs. With "canonical" I mean current docs from the upstream. Regards, Thomas -- Thomas Guettler http://www.thomas-guettler.de/ From p.f.moore at gmail.com Wed Mar 29 05:47:10 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 29 Mar 2017 10:47:10 +0100 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 29 March 2017 at 10:31, Thomas G?ttler wrote: > Am 29.03.2017 um 09:51 schrieb Paul Moore: >> >> On 29 March 2017 at 06:29, Thomas G?ttler >> wrote: >>> >>> I am stupid and missing a guiding hand which gives me simple straight >>> forward step by step instruction. >> >> >> To do what? > > To find canonical docs. With "canonical" I mean current docs from the > upstream. I think Nick's point probably covers this discussion, but you haven't said what you want docs *for*. pip? setuptools? wheel? something else? They are in various places, which you can hunt out via pypi or google. It's not hard to do, but certainly it's true that it's harder to find things than you'd want if you were paying for a well-documented service. But given that you're not paying anything, and no-one working on Python packaging has any obligation to meet your expectations, you'll need to either lower the level of your expectations, pay someone to provide what you're looking for, or offer your own time and energy to address the issues you find. Simply making vague complaints on this list isn't particularly productive. Sorry if that's not the response you were hoping for, and in particular if you have a pressing need for support that we're not providing, I do understand how that can be a problem for you, but as Nick says, this is the reality of relying on software that's provided to you free of charge. Paul From jelle.zijlstra at gmail.com Wed Mar 29 11:27:25 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Wed, 29 Mar 2017 08:27:25 -0700 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: 2017-03-29 2:31 GMT-07:00 Thomas G?ttler : > > > Am 29.03.2017 um 09:51 schrieb Paul Moore: > >> On 29 March 2017 at 06:29, Thomas G?ttler >> wrote: >> >>> I am stupid and missing a guiding hand which gives me simple straight >>> forward step by step instruction. >>> >> >> To do what? >> > > To find canonical docs. With "canonical" I mean current docs from the > upstream. > > Are you aware of https://packaging.python.org/ ? > Regards, > Thomas > > > > > > -- > Thomas Guettler http://www.thomas-guettler.de/ > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.jerdonek at gmail.com Wed Mar 29 14:41:27 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Wed, 29 Mar 2017 11:41:27 -0700 Subject: [Distutils] obtaining project name and packages Message-ID: Hi, this seems like a simple question, but I haven't been able to find the answer online: What is the current recommended way to get (1) the name of a project, and (2) the names of the top-level packages installed by a project (not counting the project's dependencies). You have access to / can run the project's setup.py, and you're also allowed to assume that the project is installed. For example, for (1) I know you can do-- $ python setup.py --name But I'm not sure if accessing setup.py is no longer recommended (as opposed to going through a tool like pip). Thanks a lot, --Chris From thomas at kluyver.me.uk Wed Mar 29 14:55:28 2017 From: thomas at kluyver.me.uk (Thomas Kluyver) Date: Wed, 29 Mar 2017 19:55:28 +0100 Subject: [Distutils] obtaining project name and packages In-Reply-To: References: Message-ID: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> I have a tool that does this from a wheel: https://github.com/takluyver/wheeldex >From an sdist, I think you need to either build a wheel or install it before you can get this information reliably. Some of my installed packages have a 'top_level.txt' file in the .dist-info folder, containing a list of the top-level package names installed by that distribution. I don't believe this is formally specified anywhere, though, and packages created by flit do not have it. Thomas On Wed, Mar 29, 2017, at 07:41 PM, Chris Jerdonek wrote: > Hi, this seems like a simple question, but I haven't been able to find > the answer online: > > What is the current recommended way to get (1) the name of a project, > and (2) the names of the top-level packages installed by a project > (not counting the project's dependencies). You have access to / can > run the project's setup.py, and you're also allowed to assume that the > project is installed. > > For example, for (1) I know you can do-- > > $ python setup.py --name > > But I'm not sure if accessing setup.py is no longer recommended (as > opposed to going through a tool like pip). > > Thanks a lot, > --Chris > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From wes.turner at gmail.com Wed Mar 29 16:19:48 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 29 Mar 2017 15:19:48 -0500 Subject: [Distutils] obtaining project name and packages In-Reply-To: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> References: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> Message-ID: On Wed, Mar 29, 2017 at 1:55 PM, Thomas Kluyver wrote: > I have a tool that does this from a wheel: > https://github.com/takluyver/wheeldex > > From an sdist, I think you need to either build a wheel or install it > before you can get this information reliably. > Src: https://code.launchpad.net/~tseaver/pkginfo/trunk PyPI: https://pypi.python.org/pypi/pkginfo This package provides an API for querying the distutils metadata written in > the PKG-INFO file inside a source distriubtion (an sdist) or a binary > distribution (e.g., created by running bdist_egg). It can also query the > EGG-INFO directory of an installed distribution, and the *.egg-info stored > in a ?development checkout? (e.g, created by running setup.py develop). 
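For the case Chris describes, where the project is already installed, the `top_level.txt` convention Thomas mentions can be read through `pkg_resources` without running setup.py. A small sketch (the example project name is arbitrary, and `top_level.txt` may simply be absent for some build tools):

```python
import pkg_resources


def project_info(project_name):
    """Return (canonical name, version, top-level packages) for an installed project."""
    dist = pkg_resources.get_distribution(project_name)
    top_level = []
    # top_level.txt is a setuptools convention rather than a formal standard,
    # so it can legitimately be missing (e.g. for flit-built packages).
    if dist.has_metadata("top_level.txt"):
        top_level = dist.get_metadata("top_level.txt").split()
    return dist.project_name, dist.version, top_level


print(project_info("pip"))  # e.g. ('pip', '9.0.1', ['pip'])
```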
> Docs: https://pythonhosted.org/pkginfo/ https://bazaar.launchpad.net/~tseaver/pkginfo/trunk/files/head:/pkginfo/tests/ > Some of my installed packages have a 'top_level.txt' file in the > .dist-info folder, containing a list of the top-level package names > installed by that distribution. I don't believe this is formally > specified anywhere, though, and packages created by flit do not have it. > > Thomas > > On Wed, Mar 29, 2017, at 07:41 PM, Chris Jerdonek wrote: > > Hi, this seems like a simple question, but I haven't been able to find > > the answer online: > > > > What is the current recommended way to get (1) the name of a project, > > and (2) the names of the top-level packages installed by a project > > (not counting the project's dependencies). You have access to / can > > run the project's setup.py, and you're also allowed to assume that the > > project is installed. > > > > For example, for (1) I know you can do-- > > > > $ python setup.py --name > > > > But I'm not sure if accessing setup.py is no longer recommended (as > > opposed to going through a tool like pip). > > > > Thanks a lot, > > --Chris > > _______________________________________________ > > Distutils-SIG maillist - Distutils-SIG at python.org > > https://mail.python.org/mailman/listinfo/distutils-sig > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Mar 29 17:26:27 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 29 Mar 2017 16:26:27 -0500 Subject: [Distutils] obtaining project name and packages In-Reply-To: References: <1490813728.1343566.927743488.1F6614AA@webmail.messagingengine.com> Message-ID: cd ./lib/python2.7/site-packages/notebook-4.4.1.dist-info cat metadata.json | python -m json.tool { "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: System Administrators", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3" ], "extensions": { "python.commands": { "wrap_console": { "jupyter-nbextension": "notebook.nbextensions:main", "jupyter-notebook": "notebook.notebookapp:main", "jupyter-serverextension": "notebook.serverextensions:main" } }, "python.details": { "contacts": [ { "email": "jupyter at googlegroups.com", "name": "Jupyter Development Team", "role": "author" } ], "document_names": { "description": "DESCRIPTION.rst" }, "project_urls": { "Home": "http://jupyter.org" } }, "python.exports": { "console_scripts": { "jupyter-nbextension": "notebook.nbextensions:main", "jupyter-notebook": "notebook.notebookapp:main", "jupyter-serverextension": "notebook.serverextensions:main" } } }, "extras": [ "doc", "test" ], "generator": "bdist_wheel (0.29.0)", "keywords": [ "Interactive", "Interpreter", "Shell", "Web" ], "license": "BSD", "metadata_version": "2.0", "name": "notebook", "platform": "Linux", "run_requires": [ { "extra": "doc", "requires": [ "Sphinx (>=1.1)" ] }, { "requires": [ "ipykernel", "ipython-genutils", "jinja2", "jupyter-client", "jupyter-core", "nbconvert", "nbformat", "tornado (>=4)", "traitlets" ] }, { "extra": "test", "requires": [ "nose", "requests" ] }, { "environment": "python_version == \"2.7\"", "extra": "test", "requires": [ "mock" ] }, { "environment": "sys_platform != \"win32\"", "requires": [ 
"terminado (>=0.3.3)" ] } ], "summary": "A web-based notebook environment for interactive computing", "version": "4.4.1" } On Wed, Mar 29, 2017 at 3:19 PM, Wes Turner wrote: > > > On Wed, Mar 29, 2017 at 1:55 PM, Thomas Kluyver > wrote: > >> I have a tool that does this from a wheel: >> https://github.com/takluyver/wheeldex >> >> From an sdist, I think you need to either build a wheel or install it >> before you can get this information reliably. >> > > Src: https://code.launchpad.net/~tseaver/pkginfo/trunk > > PyPI: https://pypi.python.org/pypi/pkginfo > > This package provides an API for querying the distutils metadata written >> in the PKG-INFO file inside a source distriubtion (an sdist) or a binary >> distribution (e.g., created by running bdist_egg). It can also query the >> EGG-INFO directory of an installed distribution, and the *.egg-info stored >> in a ?development checkout? (e.g, created by running setup.py develop). >> > > Docs: https://pythonhosted.org/pkginfo/ > > https://bazaar.launchpad.net/~tseaver/pkginfo/trunk/files/ > head:/pkginfo/tests/ > > >> Some of my installed packages have a 'top_level.txt' file in the >> .dist-info folder, containing a list of the top-level package names >> installed by that distribution. I don't believe this is formally >> specified anywhere, though, and packages created by flit do not have it. >> >> Thomas >> >> On Wed, Mar 29, 2017, at 07:41 PM, Chris Jerdonek wrote: >> > Hi, this seems like a simple question, but I haven't been able to find >> > the answer online: >> > >> > What is the current recommended way to get (1) the name of a project, >> > and (2) the names of the top-level packages installed by a project >> > (not counting the project's dependencies). You have access to / can >> > run the project's setup.py, and you're also allowed to assume that the >> > project is installed. >> > >> > For example, for (1) I know you can do-- >> > >> > $ python setup.py --name >> > >> > But I'm not sure if accessing setup.py is no longer recommended (as >> > opposed to going through a tool like pip). >> > >> > Thanks a lot, >> > --Chris >> > _______________________________________________ >> > Distutils-SIG maillist - Distutils-SIG at python.org >> > https://mail.python.org/mailman/listinfo/distutils-sig >> _______________________________________________ >> Distutils-SIG maillist - Distutils-SIG at python.org >> https://mail.python.org/mailman/listinfo/distutils-sig >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Mar 29 20:52:31 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Mar 2017 10:52:31 +1000 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: On 30 March 2017 at 01:27, Jelle Zijlstra wrote: > > > 2017-03-29 2:31 GMT-07:00 Thomas G?ttler : >> >> >> >> Am 29.03.2017 um 09:51 schrieb Paul Moore: >>> >>> On 29 March 2017 at 06:29, Thomas G?ttler >>> wrote: >>>> >>>> I am stupid and missing a guiding hand which gives me simple straight >>>> forward step by step instruction. >>> >>> >>> To do what? >> >> >> To find canonical docs. With "canonical" I mean current docs from the >> upstream. >> > Are you aware of https://packaging.python.org/ ? 
As an opinionated-but-still-free combination of tools, there's also Kenneth Reitz's pipenv: https://github.com/kennethreitz/pipenv Understandably, that's mainly geared towards network service hosting environments like Heroku, but it also works pretty well for command line apps, testing environment setups, etc. However, none of the available options will get away from the fact that only end users know their own operational requirements - we can't provide a single universal right answer, because there isn't a single universal use case. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guettliml at thomas-guettler.de Thu Mar 30 03:31:52 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Thu, 30 Mar 2017 09:31:52 +0200 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: <00ecbf9b-5f11-46fe-482d-2db07e52524e@thomas-guettler.de> Am 29.03.2017 um 10:27 schrieb Nick Coghlan: > On 29 March 2017 at 17:51, Paul Moore wrote: >> On 29 March 2017 at 06:29, Thomas G?ttler wrote: >>> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. >> >> To do what? > > As far as I can tell, to get a customer experience instead of a > prospective co-contributor one. You are right > I'm sorry Thomas, as long as you continue looking for a coherent > customer experience from a collaborative collection of volunteer-run > community projects, you're going to continually be confused and > disappointed. You are right > The Python ecosystem *does* include commercial vendors that offer to > make opinionated technical decisions on behalf of their customers, as > well as providing a single point of contact for support questions and > feature requests, but beyond that, offering an overwhelming array of > confusing choices is pretty much the way open source *works*. You are right. Regards, Thomas -- Thomas Guettler http://www.thomas-guettler.de/ From guettliml at thomas-guettler.de Thu Mar 30 03:36:59 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Thu, 30 Mar 2017 09:36:59 +0200 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> Am 29.03.2017 um 11:47 schrieb Paul Moore: > On 29 March 2017 at 10:31, Thomas G?ttler wrote: >> Am 29.03.2017 um 09:51 schrieb Paul Moore: >>> >>> On 29 March 2017 at 06:29, Thomas G?ttler >>> wrote: >>>> >>>> I am stupid and missing a guiding hand which gives me simple straight >>>> forward step by step instruction. >>> >>> >>> To do what? >> >> To find canonical docs. With "canonical" I mean current docs from the >> upstream. > > I think Nick's point probably covers this discussion, but you haven't > said what you want docs *for*. pip? setuptools? wheel?something else? If you are wearing new comer glasses, you don't know exactly what you are looking for. If you would know that, then you would be an expert. And then you don't need a guiding hand. > They are in various places, which you can hunt out via pypi or google. > It's not hard to do, but certainly it's true that it's harder to find > things than you'd want if you were paying for a well-documented > service. 
But given that you're not paying anything, and no-one working > on Python packaging has any obligation to meet your expectations, > you'll need to either lower the level of your expectations, pay > someone to provide what you're looking for, or offer your own time and > energy to address the issues you find. Simply making vague complaints > on this list isn't particularly productive. The complaint is vague? Here is it more precise: Quoting https://www.pypa.io/en/latest/ {{{ They host projects on github and bitbucket, and discuss issues on the pypa-dev and distutils-sig mailing lists. }}} Why two repo providers, why two mailing lists. This confuses new comers. I think this is precise feedback. > Sorry if that's not the response you were hoping for, and in > particular if you have a pressing need for support that we're not > providing, I do understand how that can be a problem for you, but as > Nick says, this is the reality of relying on software that's provided > to you free of charge. Yes, Nich is right. Regards, Thomas -- Thomas Guettler http://www.thomas-guettler.de/ From p.f.moore at gmail.com Thu Mar 30 04:07:06 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 30 Mar 2017 09:07:06 +0100 Subject: [Distutils] Source of confusion In-Reply-To: <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> Message-ID: On 30 March 2017 at 08:36, Thomas G?ttler wrote: > Why two repo providers, why two mailing lists. This confuses new comers. > > I think this is precise feedback. Because there is more than one project, and because the topics of discussion are different on the two. Paul From guettliml at thomas-guettler.de Thu Mar 30 04:53:00 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Thu, 30 Mar 2017 10:53:00 +0200 Subject: [Distutils] Which commercial vendor? In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> Message-ID: <2ea413e9-9fce-9125-897d-caef87a51dfd@thomas-guettler.de> Am 29.03.2017 um 10:27 schrieb Nick Coghlan: > On 29 March 2017 at 17:51, Paul Moore wrote: >> On 29 March 2017 at 06:29, Thomas G?ttler wrote: >>> I am stupid and missing a guiding hand which gives me simple straight forward step by step instruction. >> >> To do what? > > As far as I can tell, to get a customer experience instead of a > prospective co-contributor one. > > I'm sorry Thomas, as long as you continue looking for a coherent > customer experience from a collaborative collection of volunteer-run > community projects, you're going to continually be confused and > disappointed. > > The Python ecosystem *does* include commercial vendors that offer to > make opinionated technical decisions on behalf of their customers, as > well as providing a single point of contact for support questions and > feature requests, but beyond that, offering an overwhelming array of > confusing choices is pretty much the way open source *works*. My frustration has reached a limit. Yes, I am willing to pay money. Which vendor do you suggest to give me a reliable package management? Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ From ncoghlan at gmail.com Thu Mar 30 05:38:09 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Mar 2017 19:38:09 +1000 Subject: [Distutils] Which commercial vendor? 
In-Reply-To: <2ea413e9-9fce-9125-897d-caef87a51dfd@thomas-guettler.de> References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> <2ea413e9-9fce-9125-897d-caef87a51dfd@thomas-guettler.de> Message-ID: On 30 March 2017 at 18:53, Thomas G?ttler wrote: > Am 29.03.2017 um 10:27 schrieb Nick Coghlan: >> The Python ecosystem *does* include commercial vendors that offer to >> make opinionated technical decisions on behalf of their customers, as >> well as providing a single point of contact for support questions and >> feature requests, but beyond that, offering an overwhelming array of >> confusing choices is pretty much the way open source *works*. > > My frustration has reached a limit. Yes, I am willing to pay money. > > Which vendor do you suggest to give me a reliable package management? For cross-platform use cases, the best known options are ActiveState, Enthought, and Continuum Analytics (with the latter two focusing primarily on data analysis tasks). Another option if you're looking to bundle your own applications is PyRun, from eGenix: http://www.egenix.com/products/python/PyRun/ Finally, if you're solely interested in Linux, then Python runtimes are generally covered by commercial Linux support agreements. However, exactly which of the available commercial support options will be the best fit for your needs will depend on what you're aiming to do, and how it aligns with their product offerings. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From donald at stufft.io Thu Mar 30 07:17:05 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 30 Mar 2017 07:17:05 -0400 Subject: [Distutils] Source of confusion In-Reply-To: References: <07e85472-5e08-32ee-3884-af0e327ac466@thomas-guettler.de> <4be84a88-2c3c-5a57-909f-6a3f8fb96ebd@thomas-guettler.de> Message-ID: <63423554-7721-4B45-B7CE-8CF4F4C3FF1B@stufft.io> > On Mar 30, 2017, at 4:07 AM, Paul Moore wrote: > > On 30 March 2017 at 08:36, Thomas G?ttler wrote: >> Why two repo providers, why two mailing lists. This confuses new comers. >> >> I think this is precise feedback. > > Because there is more than one project, and because the topics of > discussion are different on the two. > Paul > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig To expand upon this a little more, PyPA is not really a formal organization where we dictate top down about how projects must operate. About the only requirements we have are that your project relates in some way to Python?s packaging toolchain and that you accept being bound by our CoC. Beyond that projects under the PyPA banner are operated independently of each other with their own policies and procedures and such. It provides some loose organization and a ?brand? but that?s really about all, so while most projects have opted to use GitHub, not all of them have (and that?s ok!). The two mailing lists are *largely* historical really, there was a time when distutils-sig (and before that even, catalog-sig) was not a particularly pleasant place to discuss things at and as such it made sense to try and sequester yourself away from it for some kinds of discussions. This has gotten a lot better in recent years and *most* mailing list like discussion tends to happen here on distutils-sig. 
Couple that with the fact that the individual projects tend to use their
issue trackers to hold discussions that are specific to their particular
project, and we could probably consolidate, but I also don't think it's
a big deal either.

--
Donald Stufft

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bruno.rosa at eldorado.org.br  Thu Mar 30 16:09:20 2017
From: bruno.rosa at eldorado.org.br (Bruno Alexandre Rosa)
Date: Thu, 30 Mar 2017 20:09:20 +0000
Subject: [Distutils] Wheel files for PPC64le
In-Reply-To: <46301d400edb469099abac67f6fed74a@serv030.corp.eldorado.org.br>
References: <840906168df74413ac361803c1bbf1b8@serv030.corp.eldorado.org.br>
 <46301d400edb469099abac67f6fed74a@serv030.corp.eldorado.org.br>
Message-ID: <53d512c13781484db994797dbacf34d4@serv030.corp.eldorado.org.br>

Hi there,

First of all, thanks for checking out the information about Docker
images, Nick!

Since Leonardo's last email had some formatting issues, I'm fixing it
(mostly manually) and sending it here again.

Kind regards,
Bruno Rosa

--------------------------------------------------

> Having manylinuxN consistently align with CentOS(N+4) seems reasonable
> to me for simplicity's sake, but there should be a discussion in the
> PEP around how that aligns with ppc64le support on other LTS distros
> (mainly Debian and Ubuntu).
> Given the relative dates involved, I'd expect manylinux-style binaries
> compiled on CentOS 7 to also work on Ubuntu 14.04, 16.04 and Debian 8,
> but the PEP should explicitly confirm that the nominated symbol
> versions above are available on all of those distros.

Ok, I can add it to the PEP, but regarding the supported distros: those
older than CentOS 7 may not be compatible, because the backward
compatibility rules only guarantee compatibility with newer versions,
not with older ones. I sent a message about it here:
https://mail.python.org/pipermail/wheel-builders/2017-March/000265.html

> I don't think it is quite that simple, as installers need to be able to
> figure out:
> - on manylinux3 compatible platforms, prefer manylinux3 to manylinux1
> - on manylinux3 *in*compatible platforms, only consider manylinux1
> And that means asking the question: when combined with the option of
> the distro-provided `_manylinux` module, is "have_compatible_glibc(2, 5)
> and not have_compatible_glibc(2, 17)" an adequate check for the latter
> case? (My inclination is to say "yes", but it would be helpful to have
> some more concrete data on glibc versions in different distros of
> interest)

Well, I didn't realize that proposing a new tag would require an
additional check for the tags, which will be a requirement for
manylinux2 as well, once CentOS 5 is replaced by CentOS 6 for
x86_64/i686. I need to check where and how the `is_manylinux1_compatible`
method is used in order to work out how it would be done. I will check
that and propose how to do it.

> Beyond that, I think the main open question would be: do we go ahead
> and define the full `manylinux3` specification now? CentOS 7+, Ubuntu
> 14.04+, Debian 8+ compatibility still covers a *lot* of distros and
> deployments, and doing so means folks can bring the latest versions of
> gcc to bear on their code, rather than being limited to the last
> version that was made available for RHEL/CentOS 5 (gcc 4.8).

Actually, the idea was to make it available for PPC64le, just as it is
available for x86_64/i686 nowadays, essentially porting it.
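
For readers following the "have_compatible_glibc(2, 5) and not
have_compatible_glibc(2, 17)" question quoted above, here is a minimal
sketch of that check, modelled on the ctypes-based reference
implementation in PEP 513; the 2.17 threshold for a CentOS-7-based tag
is this thread's working assumption, not anything standardized:

    import ctypes

    def have_compatible_glibc(major, minimum_minor):
        # Ask the glibc this interpreter is actually linked against for
        # its version string (e.g. "2.17") via gnu_get_libc_version().
        process_namespace = ctypes.CDLL(None)
        try:
            gnu_get_libc_version = process_namespace.gnu_get_libc_version
        except AttributeError:
            # Symbol is missing, so we are not linked against glibc at
            # all (e.g. musl-based distros).
            return False
        gnu_get_libc_version.restype = ctypes.c_char_p
        version_str = gnu_get_libc_version()
        if not isinstance(version_str, str):
            version_str = version_str.decode("ascii")  # bytes on Python 3
        found_major, found_minor = (int(p) for p in version_str.split(".")[:2])
        return found_major == major and found_minor >= minimum_minor

    # "manylinux1-era glibc, but not CentOS-7-era glibc", as quoted above:
    only_manylinux1 = (have_compatible_glibc(2, 5)
                       and not have_compatible_glibc(2, 17))

The same pattern works for any glibc-based cutoff, which is why the
exact threshold chosen for the newer tag matters for the "incompatible
platforms only consider manylinux1" case.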
I didn't think about defining all the requirements for manylinux3 for
all architectures, as they can change before x86_64/i686 get to
manylinux3. Being limited to an old version such as CentOS 5 (gcc 4.8)
is a requirement from PEP 513, which guarantees the backward
compatibility, right? I do not want to change that; this proposal is
just to create a tag for PPC64le until both architectures get to the
same base distro version.

As I said above, I have already sent a message about basing it on
CentOS 7, which does not guarantee compatibility with older distros
(for example, Ubuntu 14.04). Is there any thinking about basing it on a
newer distro while keeping the wheel files compatible with distros
older than it? Sorry if I'm missing something here.

I'm copying Bruno Rosa, who will be involved with this PEP as well.

Cheers,
Leonardo Bianconi.

--------------------------------------------------
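
To make the shape of the proposed PPC64le tag concrete, a hypothetical
compatibility check for a CentOS-7-based tag could follow the same
pattern as PEP 513's `is_manylinux1_compatible`: honour a
distro-provided `_manylinux` override module first, then fall back to a
glibc heuristic. The tag name and the `_manylinux` attribute below are
placeholders invented for illustration, not part of any accepted spec,
and the sketch reuses the `have_compatible_glibc` helper shown earlier
in the thread:

    import platform

    def is_manylinux_ppc64le_compatible():
        # Hypothetical check for a CentOS-7-based ppc64le tag; the tag
        # name and the _manylinux attribute are illustrative placeholders.
        if platform.machine() != "ppc64le":
            return False
        # Let the distro declare (in)compatibility explicitly, as PEP 513
        # allows for manylinux1 via a _manylinux module on sys.path.
        try:
            import _manylinux
            return bool(_manylinux.manylinux_ppc64le_compatible)
        except (ImportError, AttributeError):
            pass
        # Otherwise fall back to the glibc heuristic: CentOS 7 ships
        # glibc 2.17, so require at least that (have_compatible_glibc as
        # sketched earlier in this thread).
        return have_compatible_glibc(2, 17)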