From support at github.com Fri May 6 13:47:57 2016
From: support at github.com (GitHub)
Date: Fri, 06 May 2016 10:47:57 -0700
Subject: [core-workflow] [GitHub] Brett Cannon has invited you to join the Python organization
Message-ID: <572cd8cd9e15_596d3fd1680172b8410f9@github-fe137-cp1-prd.iad.github.net.mail>

Hi The Knights Who Say "Ni",

Brett Cannon has invited you to join the Python organization on GitHub. Head over to https://github.com/python to check out Python's profile.

To join Python, follow this link:

https://github.com/orgs/python/invitation?via_email=1

Some helpful tips:

- If you get a 404 page, make sure you're signed in as the-knights-who-say-ni.
- You can also accept the invitation by visiting the organization page directly at https://github.com/python

If you were not expecting this invitation, you can ignore this email.

Thanks,
The GitHub Team

From rosuav at gmail.com Fri May 6 13:54:19 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 7 May 2016 03:54:19 +1000
Subject: [core-workflow] [GitHub] Brett Cannon has invited you to join the Python organization
In-Reply-To: <572cd8cd9e15_596d3fd1680172b8410f9@github-fe137-cp1-prd.iad.github.net.mail>
References: <572cd8cd9e15_596d3fd1680172b8410f9@github-fe137-cp1-prd.iad.github.net.mail>
Message-ID:

On Sat, May 7, 2016 at 3:47 AM, GitHub wrote:
> Brett Cannon has invited you to join the Python organization on GitHub. Head over to https://github.com/python to check out Python's profile.
>
> To join Python, follow this link:
>
> https://github.com/orgs/python/invitation?via_email=1
>
> Some helpful tips:
>
> - If you get a 404 page, make sure you're signed in as the-knights-who-say-ni.

Were you intending to invite all people on the core-workflow list? The
invitation works only for the specific recipient.

ChrisA From nicolas.alvarez at gmail.com Fri May 6 14:06:19 2016 From: nicolas.alvarez at gmail.com (=?UTF-8?Q?Nicol=C3=A1s_Alvarez?=) Date: Fri, 6 May 2016 15:06:19 -0300 Subject: [core-workflow] [GitHub] Brett Cannon has invited you to join the Python organization In-Reply-To: References: <572cd8cd9e15_596d3fd1680172b8410f9@github-fe137-cp1-prd.iad.github.net.mail> Message-ID: 2016-05-06 14:54 GMT-03:00 Chris Angelico : > On Sat, May 7, 2016 at 3:47 AM, GitHub wrote: >> Brett Cannon has invited you to join the Python organization on GitHub. Head over to https://github.com/python to check out Python's profile. >> >> To join Python, follow this link: >> >> https://github.com/orgs/python/invitation?via_email=1 >> >> Some helpful tips: >> >> - If you get a 404 page, make sure you?re signed in as the-knights-who-say-ni. > > Were you intending to invite all people on the core-workflow list? The > invitation works only for the specific recipient. I assume he intended the Github account corresponding to the list address itself to be in the github organization. -- Nicol?s From brett at snarky.ca Fri May 6 14:07:12 2016 From: brett at snarky.ca (Brett Cannon) Date: Fri, 06 May 2016 18:07:12 +0000 Subject: [core-workflow] [GitHub] Brett Cannon has invited you to join the Python organization In-Reply-To: References: <572cd8cd9e15_596d3fd1680172b8410f9@github-fe137-cp1-prd.iad.github.net.mail> Message-ID: No, I thought I clicked on "Discard" for the email (I have the CLA bot's account set to core-workflow@ as backup in case someone needs access to the account and I can't be reached). Doesn't really matter as no one can do anything with the link without access to the account and I'm the only one who knows the password. On Fri, 6 May 2016 at 10:54 Chris Angelico wrote: > On Sat, May 7, 2016 at 3:47 AM, GitHub wrote: > > Brett Cannon has invited you to join the Python organization on GitHub. > Head over to https://github.com/python to check out Python's profile. 
> > > > To join Python, follow this link: > > > > https://github.com/orgs/python/invitation?via_email=1 > > > > Some helpful tips: > > > > - If you get a 404 page, make sure you?re signed in as > the-knights-who-say-ni. > > Were you intending to invite all people on the core-workflow list? The > invitation works only for the specific recipient. > > ChrisA > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri May 6 18:59:10 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 7 May 2016 08:59:10 +1000 Subject: [core-workflow] [GitHub] Brett Cannon has invited you to join the Python organization In-Reply-To: References: <572cd8cd9e15_596d3fd1680172b8410f9@github-fe137-cp1-prd.iad.github.net.mail> Message-ID: On Sat, May 7, 2016 at 4:07 AM, Brett Cannon wrote: > No, I thought I clicked on "Discard" for the email (I have the CLA bot's > account set to core-workflow@ as backup in case someone needs access to the > account and I can't be reached). Doesn't really matter as no one can do > anything with the link without access to the account and I'm the only one > who knows the password. Ah, gotcha. ChrisA From senthil at uthcode.com Sun May 8 04:08:35 2016 From: senthil at uthcode.com (Senthil Kumaran) Date: Sun, 8 May 2016 01:08:35 -0700 Subject: [core-workflow] Time to decide how to convert hg repos to git In-Reply-To: References: Message-ID: Hello Core-Workflow Group, On Fri, Apr 22, 2016 at 6:45 PM, Senthil Kumaran wrote: > > Here's my plan and a to do: > > 1. Even though it is a one-time operation, I plan to convert above steps > into a trivial tool that we can use and verify independently. > 2. Once we are satisfied with our local trials, you could use this tool > once to convert the hg repo and push to canonical git repo. > This was the tool I mentioned in the above point. 
https://github.com/orsenthil/cpython-hg-to-git

I used this to test migrating small hg repos to GitHub repos, and the operations were successful. As a test, I could migrate the CPython repo, but it took multiple hours on a fast machine. I assume that is due to Python and subprocess overhead. The faster way is to just use the hg-git extension directly and follow the steps documented in the orsenthil/cpython-hg-to-git repo for the sake of consistency.

Brett and I discussed that we might need a way to verify that two repos, one hg and one git, are the same (that is, have the same graph) as we undertake this process. I don't know any comparison commands offhand, but I assume it should be possible. I plan to add that to the tool.

Please share your comments.

Thank you,
Senthil

From vadmium+py at gmail.com Sun May 8 04:48:47 2016
From: vadmium+py at gmail.com (Martin Panter)
Date: Sun, 8 May 2016 08:48:47 +0000
Subject: [core-workflow] Time to decide how to convert hg repos to git
In-Reply-To:
References:
Message-ID:

On 8 May 2016 at 08:08, Senthil Kumaran wrote:
> Brett and I discussed that we might need a way to verify if two repos, hg
> and git repos are same (that's have the same graph) as we undertake this
> process. I don't know any offhand comparison commands, but I assume it
> should be possible. I plan to add that to that tool.

One starting point that comes to mind is to compare the number of
revisions (including all merges and merged revisions) for each branch,
tag, etc. With Git you can do it like:

$ git rev-list --count master
489

I don't know what the equivalent command in Mercurial is. Perhaps you
could clone the relevant branch to a fresh repository and check the
numerical revision number.

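The revision-count check suggested above could be sketched roughly as follows. This is only an illustration, not part of the cpython-hg-to-git tool: the function names are hypothetical, the Mercurial counts are assumed to be gathered separately by whatever means works, and the caller is assumed to have already mapped branch names between the two systems.

```python
import subprocess


def git_revision_count(repo_path, ref="master"):
    """Return the number of commits reachable from ref.

    Wraps the `git rev-list --count <ref>` command quoted in the thread.
    """
    result = subprocess.run(
        ["git", "rev-list", "--count", ref],
        cwd=repo_path,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def compare_counts(git_counts, hg_counts):
    """Return {name: (git_count, hg_count)} for every ref whose counts differ.

    Both arguments map a branch or tag name to a revision count.
    A ref that is missing on one side is reported with None for that side.
    """
    mismatches = {}
    for name in sorted(set(git_counts) | set(hg_counts)):
        pair = (git_counts.get(name), hg_counts.get(name))
        if pair[0] != pair[1]:
            mismatches[name] = pair
    return mismatches
```

An empty result from `compare_counts` only says the counts agree; it does not show that the commits themselves match, so it is a first filter rather than a full verification.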
From brett at python.org Sun May 8 13:38:51 2016 From: brett at python.org (Brett Cannon) Date: Sun, 08 May 2016 17:38:51 +0000 Subject: [core-workflow] Time to decide how to convert hg repos to git In-Reply-To: References: Message-ID: On Sun, 8 May 2016 at 01:48 Martin Panter wrote: > On 8 May 2016 at 08:08, Senthil Kumaran wrote: > > Brett and I discussed that we might need a way to verify if two repos, hg > > and git repos are same (that's have the same graph) as we undertake this > > process. I don't know any offhand comparison commands, but I assume it > > should be possible. I plan to add that to that tool. > > One starting point that comes to mind is to compare the number of > revisions (including all merges and merged revisions) for each branch, > tag, etc. With Git you can do it like: > > $ git rev-list --count master > 489 > > I don?t know what the equivalent command in Mercurial is. Perhaps you > could clone the relevant branch to a fresh repository and check the > numerical revision number. > SO to the rescue (and Martin is right about how to figure it out): http://stackoverflow.com/questions/16672788/total-count-of-change-sets-for-mercurial-and-git Senthil has also suggested verifying the hashes of all the files in a repository that are not in .hg or .git directories. The reason this is important for us to all figure out is that if the current unofficial mirrors for peps and cpython pass verification then we can skip the conversion steps and simply make the unofficial mirrors the official repositories (on top of making sure any conversion succeeds). -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun May 8 16:21:36 2016 From: brett at python.org (Brett Cannon) Date: Sun, 08 May 2016 20:21:36 +0000 Subject: [core-workflow] devinabox has been migrated! Message-ID: https://github.com/python/devinabox now exists! 
A huge thank you to everyone who has helped out so far in making this transition reach this point, especially Maciej and Ezio for the b.p.o stuff and Senthil for leading the git migration work. Next up is the peps and devguide repos. For peps I need to: 1. See if the unofficial mirror is actually good enough to use 2. Find out where the command is being run to checkout and build the peps from hg so it can use git instead 3. Figure out how to turn off the mirroring 4. If the unofficial mirror isn't good enough, then migrate the repo The devguide is similar but w/o the worry of checking for any mirror. The benchmarks migration is on hold until a decision is made over on the speed@ mailing list as to whether that repo will be started from scratch. The cpython repo can't be migrated until I get the Python core team to tell me exactly what functionality must exist before a migration can occur. That will be discussed at the language summit at PyCon US this year, so I should have a better idea of what takes priority after having that discussion. -------------- next part -------------- An HTML attachment was scrubbed... URL: From senthil at uthcode.com Sun May 8 18:38:32 2016 From: senthil at uthcode.com (Senthil Kumaran) Date: Sun, 8 May 2016 15:38:32 -0700 Subject: [core-workflow] Time to decide how to convert hg repos to git In-Reply-To: References: Message-ID: Hi Martin, Brett: On Sun, May 8, 2016 at 10:38 AM, Brett Cannon wrote: > $ git rev-list --count master >> 489 >> >> I don?t know what the equivalent command in Mercurial is. Perhaps you >> could clone the relevant branch to a fresh repository and check the >> numerical revision number. >> > > SO to the rescue (and Martin is right about how to figure it out): > http://stackoverflow.com/questions/16672788/total-count-of-change-sets-for-mercurial-and-git > > Senthil has also suggested verifying the hashes of all the files in a > repository that are not in .hg or .git directories. 
> Are these validations enough for our purposes? Two files in the different version-control system can have same SHA and same commit of commits, but have a possibility of changesets/diffs associated with those commits different. I was thinking, how we should go about with this when evaluating the existing git repo. When we do migration afresh using a tool like hg-git, we assume that this verification step is asserted as part of unit tests of the tool. -- Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From vgr255 at live.ca Sun May 8 19:12:05 2016 From: vgr255 at live.ca (=?iso-8859-1?Q?=C9manuel_Barry?=) Date: Sun, 8 May 2016 19:12:05 -0400 Subject: [core-workflow] Some questions Message-ID: Hey there fellow core-workflowers; I've been following the GitHub transition for a while now, and have some questions (which may have been answered already, I apologize in advance if so!). So, I read the PEP again to see if it was answered, seems it wasn't clear (or I'm visually impaired, you get to choose). I understand that there's already a semi-official mirror of the cpython repo on GitHub, and I've been wondering why it isn't enough for our needs. Sure, a bunch of stuff needs to be done (like the CLA bot and the PR <-> issue linking), but surely they could be done on the current mirror. My workflow uses the GitHub mirror, and my patches are compatible with b.p.o and Rietveld. Is there something I'm missing as to why we can't re-use this one? Then, as someone who's been using git and GitHub for almost everything code-related and never touched Mercurial, is there something I can do to help with the transition? I would really love a more accessible workflow for both core developers and external contributors as soon as possible (don't we all?), so if I can help I'd love to; I do realize I'm a bit late to the party though. Keep up the good work! 
-Emanuel From senthil at uthcode.com Sun May 8 19:28:56 2016 From: senthil at uthcode.com (Senthil Kumaran) Date: Sun, 8 May 2016 16:28:56 -0700 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: On Sun, May 8, 2016 at 4:12 PM, ?manuel Barry wrote: > I understand that there's > already a semi-official mirror of the cpython repo on GitHub, and I've been > wondering why it isn't enough for our needs. > It is suitable for our needs. Our last discussion was about how do we ascertain that cpython git repo has the same history as the hg repo, so that after migrate we do not loose any information from the old system. This could be done using: * check the number of commits in both repos for each branch * checking the hash of the source files in two repos. * (And do we go about validating each piece of commit log graph too)? If you have any suggestions, since you are using the cpython git mirror, please feel free to share your thoughts. Welcome to the party! Thanks, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From vgr255 at live.ca Sun May 8 19:40:15 2016 From: vgr255 at live.ca (=?UTF-8?Q?=C3=89manuel_Barry?=) Date: Sun, 8 May 2016 19:40:15 -0400 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: Why thank you! I probably missed that last discussion. Do you need some help? I can probably generate a file with that information and pass it over so you can check it matches the Mercurial one. I?m not used to dealing with the log graph though, but I can probably manage something. Here?s what I have in mind, let me know if you have another/better idea: Take each X commit (say, every 100th or 1000th commit, or even every commit if we decide to be insane^Wprecise), store hashes of all files at that revision with possibly the file tree, in a .py file as a list or dict, or json or anything you prefer. Then I upload it for you to look at and you can compare with the mercurial repo. 
Or we run the same script on the mercurial repo and compare the resulting files. I can work on that this week, probably. Sounds like a good idea? -Emanuel From: Senthil Kumaran [mailto:senthil at uthcode.com] Sent: Sunday, May 08, 2016 7:29 PM To: ?manuel Barry Cc: core-workflow Subject: Re: [core-workflow] Some questions On Sun, May 8, 2016 at 4:12 PM, ?manuel Barry > wrote: I understand that there's already a semi-official mirror of the cpython repo on GitHub, and I've been wondering why it isn't enough for our needs. It is suitable for our needs. Our last discussion was about how do we ascertain that cpython git repo has the same history as the hg repo, so that after migrate we do not loose any information from the old system. This could be done using: * check the number of commits in both repos for each branch * checking the hash of the source files in two repos. * (And do we go about validating each piece of commit log graph too)? If you have any suggestions, since you are using the cpython git mirror, please feel free to share your thoughts. Welcome to the party! Thanks, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun May 8 19:41:52 2016 From: brett at python.org (Brett Cannon) Date: Sun, 08 May 2016 23:41:52 +0000 Subject: [core-workflow] Time to decide how to convert hg repos to git In-Reply-To: References: Message-ID: On Sun, 8 May 2016 at 16:33 Senthil Kumaran wrote: > Hi Martin, Brett: > > On Sun, May 8, 2016 at 10:38 AM, Brett Cannon wrote: > >> $ git rev-list --count master >>> 489 >>> >>> I don?t know what the equivalent command in Mercurial is. Perhaps you >>> could clone the relevant branch to a fresh repository and check the >>> numerical revision number. 
>>> >> >> SO to the rescue (and Martin is right about how to figure it out): >> http://stackoverflow.com/questions/16672788/total-count-of-change-sets-for-mercurial-and-git >> >> Senthil has also suggested verifying the hashes of all the files in a >> repository that are not in .hg or .git directories. >> > > Are these validations enough for our purposes? > Don't know, but it's at least a start. > > Two files in the different version-control system can have same SHA and > same commit of commits, but have a possibility of changesets/diffs > associated with those commits different. I was thinking, how we should go > about with this when evaluating the existing git repo. > > When we do migration afresh using a tool like hg-git, we assume that this > verification step is asserted as part of unit tests of the tool. > That would be my hope. It obviously doesn't hurt to check, though, if it isn't too difficult. -Brett > > -- > Senthil > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From senthil at uthcode.com Sun May 8 20:43:12 2016 From: senthil at uthcode.com (Senthil Kumaran) Date: Sun, 8 May 2016 17:43:12 -0700 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: Hi ?manuel, On Sun, May 8, 2016 at 4:40 PM, ?manuel Barry wrote: > Take each X commit (say, every 100th or 1000th commit, or even every > commit if we decide to be insane^Wprecise), store hashes of all files at > that revision with possibly the file tree, in a .py file as a list or dict, > or json or anything you prefer. Then I upload it for you to look at and you > can compare with the mercurial repo. Or we run the same script on the > mercurial repo and compare the resulting files. If we store anything externally, that could start limiting us. I looked at the problem in this angle - final cpython git repo has ~10000 commits in master branch. That's not a large number to deal with. The orginal hg repo should have exact number of commits. 
We have to do a diff between each of these commits, including merge commits. and check if contents of those commits are same, if we encounter anything where git-repo differs in content or history from hg-repo, we alert and fail. Since this is a history checking operation and we could complete this in O(minutes) or ~1 hour to validate the repos. This will give us confidence on the migration, and will help us evaluate multiple hg -> git repos that have been migrated at different points in time. This feature will go in this tool: https://github.com/orsenthil/cpython-hg-to-git , which we will use to migrate, sync, and validate hg->git repos. If interested, you could research for efficient way to do the above operation and submit a pull request against that tool. HTH, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From vadmium+py at gmail.com Sun May 8 20:45:43 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Mon, 9 May 2016 00:45:43 +0000 Subject: [core-workflow] Time to decide how to convert hg repos to git In-Reply-To: References: Message-ID: On 8 May 2016 at 22:38, Senthil Kumaran wrote: > Hi Martin, Brett: > > On Sun, May 8, 2016 at 10:38 AM, Brett Cannon wrote: >>> >>> $ git rev-list --count master >>> 489 >>> >>> I don?t know what the equivalent command in Mercurial is. Perhaps you >>> could clone the relevant branch to a fresh repository and check the >>> numerical revision number. >> >> >> SO to the rescue (and Martin is right about how to figure it out): >> http://stackoverflow.com/questions/16672788/total-count-of-change-sets-for-mercurial-and-git >> >> Senthil has also suggested verifying the hashes of all the files in a >> repository that are not in .hg or .git directories. > > > Are these validations enough for our purposes? 
>
> Two files in the different version-control system can have same SHA and same
> commit of commits, but have a possibility of changesets/diffs associated
> with those commits different. I was thinking, how we should go about with
> this when evaluating the existing git repo.

In my experience, mainly with converting Subversion → Git, there are
sometimes subtle variations that mean two different tools end up with
slightly different repositories (different commit hashes). Some of
these we might like to watch out for; others, maybe we don't care.
Brainstorm off the top of my head:

* Trivial commits that don't touch any files may or may not be removed
from history
* Messages with non-ASCII bytes (UTF-8, or non-UTF-8)
* User names. Git requires separate, non-empty name and email fields
(xxx ) but I don't think Mercurial is so strict.
* Trailing newlines and whitespace in commit messages. Different
utilities have different rules about how they strip trailing newlines,
e.g. they may leave exactly one, none, or the original.
* Sub-second timestamps and time zones?

From ethan at stoneleaf.us Sun May 8 20:54:03 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 08 May 2016 17:54:03 -0700
Subject: [core-workflow] Some questions
In-Reply-To:
References:
Message-ID: <572FDFAB.1020807@stoneleaf.us>

On 05/08/2016 05:43 PM, Senthil Kumaran wrote:
> On Sun, May 8, 2016 at 4:40 PM, Émanuel Barry wrote:
>>
>> Take each X commit (say, every 100th or 1000th commit, or even
>> every commit if we decide to be insane^Wprecise), store hashes of
>> all files at that revision with possibly the file tree, in a .py
>> file as a list or dict, or json or anything you prefer. Then I
>> upload it for you to look at and you can compare with the mercurial
>> repo. Or we run the same script on the mercurial repo and compare
>> the resulting files.
>
> If we store anything externally, that could start limiting us.

I read that as generating a temp file from each tool (git and hg) and then comparing them -- not as storing those files. (I could be wrong, though.) -- ~Ethan~ From vgr255 at live.ca Sun May 8 21:10:29 2016 From: vgr255 at live.ca (=?UTF-8?Q?=C3=89manuel_Barry?=) Date: Sun, 8 May 2016 21:10:29 -0400 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: (I apologize for top-posting, I still haven?t figured out how to fix my email client) There?s nearly 94k commits in the git repo, and I expect the hg repo has that same number. It?s a tad more than 10,000. I?ll definitely take a look at that tool; my main weakness is that I don?t know hg commands or similar, but comparing separate commits is most definitely better. @Ethan: I meant that I would write all the output to a file for comparison, but apparently that?s not a very good idea, so here I drop it instead. I?ll look at the tool and see what I can do. I?ll try to document my findings if I can?t come up with a good solution, and probably even if I do. Cheers, -Emanuel From: Senthil Kumaran [mailto:senthil at uthcode.com] Sent: Sunday, May 08, 2016 8:43 PM To: ?manuel Barry Cc: core-workflow Subject: Re: [core-workflow] Some questions Hi ?manuel, On Sun, May 8, 2016 at 4:40 PM, ?manuel Barry > wrote: Take each X commit (say, every 100th or 1000th commit, or even every commit if we decide to be insane^Wprecise), store hashes of all files at that revision with possibly the file tree, in a .py file as a list or dict, or json or anything you prefer. Then I upload it for you to look at and you can compare with the mercurial repo. Or we run the same script on the mercurial repo and compare the resulting files. If we store anything externally, that could start limiting us. I looked at the problem in this angle - final cpython git repo has ~10000 commits in master branch. That's not a large number to deal with. The orginal hg repo should have exact number of commits. 
We have to do a diff between each of these commits, including merge commits. and check if contents of those commits are same, if we encounter anything where git-repo differs in content or history from hg-repo, we alert and fail. Since this is a history checking operation and we could complete this in O(minutes) or ~1 hour to validate the repos. This will give us confidence on the migration, and will help us evaluate multiple hg -> git repos that have been migrated at different points in time. This feature will go in this tool: https://github.com/orsenthil/cpython-hg-to-git , which we will use to migrate, sync, and validate hg->git repos. If interested, you could research for efficient way to do the above operation and submit a pull request against that tool. HTH, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Mon May 9 04:32:59 2016 From: phd at phdru.name (Oleg Broytman) Date: Mon, 9 May 2016 10:32:59 +0200 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: <20160509083259.GA21689@phdru.name> Hi! On Sun, May 08, 2016 at 07:40:15PM -0400, ??manuel Barry wrote: > Take each X commit (say, every 100th or 1000th commit, or even every commit if we decide to be insane^Wprecise), store hashes of all files at that revision with possibly the file tree, in a .py file as a list or dict, or json or anything you prefer. Then I upload it for you to look at and you can compare with the mercurial repo. Or we run the same script on the mercurial repo and compare the resulting files. IMO the tool can be designed like this: 1. Generate the list of commits in a branch:: git log -m --first-parent --format='%H' 2. 
   For every commit in the list generate the list of files in the commit::

      git cat-file -p $SHA1^{tree}

   This produces a list in the format like this:

040000 tree e0fd616e5707b006f1a2df8be85d0be973192ee0    Doc
040000 tree 33e09fe5cdcd421c989de911c97fd1d901ac0e8e    Grammar
040000 tree 39ca3d725f190d61aa45ea1c8bf4802f44f52e47    Include
100644 blob 84a3337c2e5289fb8e50e5ef6d8ac2ac78be70b2    LICENSE
040000 tree 08eeead22b72c75d84624509286e6c54ec6656ec    Lib
040000 tree b1a2357d3d461d161d92d73aabb74f0a9ab52294    Mac
100644 blob 2a687e58c9141b44520db9ad0b07b71525fd051d    Makefile.pre.in

   For every blob in the list store its hash. For every tree use

      git cat-file -p $SHA1^{tree}

   recursively.

   I don't have any idea how to do that in Mercurial, though.

> -Emanuel

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From phd at phdru.name Mon May 9 04:40:55 2016
From: phd at phdru.name (Oleg Broytman)
Date: Mon, 9 May 2016 10:40:55 +0200
Subject: [core-workflow] Some questions
In-Reply-To: <20160509083259.GA21689@phdru.name>
References: <20160509083259.GA21689@phdru.name>
Message-ID: <20160509084055.GA24362@phdru.name>

On Mon, May 09, 2016 at 10:32:59AM +0200, Oleg Broytman wrote:
> Hi!
>
> On Sun, May 08, 2016 at 07:40:15PM -0400, Émanuel Barry wrote:
> > Take each X commit (say, every 100th or 1000th commit, or even every commit if we decide to be insane^Wprecise), store hashes of all files at that revision with possibly the file tree, in a .py file as a list or dict, or json or anything you prefer. Then I upload it for you to look at and you can compare with the mercurial repo. Or we run the same script on the mercurial repo and compare the resulting files.
>
> IMO the tool can be designed like this:
>
> 1. Generate the list of commits in a branch::
>
>     git log -m --first-parent --format='%H'
>
> 2.
>    For every commit in the list generate the list of files in the
> commit::
>
>     git cat-file -p $SHA1^{tree}
>
> This produces a list in the format like this:
>
> 040000 tree e0fd616e5707b006f1a2df8be85d0be973192ee0    Doc
> 040000 tree 33e09fe5cdcd421c989de911c97fd1d901ac0e8e    Grammar
> 040000 tree 39ca3d725f190d61aa45ea1c8bf4802f44f52e47    Include
> 100644 blob 84a3337c2e5289fb8e50e5ef6d8ac2ac78be70b2    LICENSE
> 040000 tree 08eeead22b72c75d84624509286e6c54ec6656ec    Lib
> 040000 tree b1a2357d3d461d161d92d73aabb74f0a9ab52294    Mac
> 100644 blob 2a687e58c9141b44520db9ad0b07b71525fd051d    Makefile.pre.in
>
> For every blob in the list store its hash. For every tree use
>
>     git cat-file -p $SHA1^{tree}

Oops, my mistake::

    git cat-file -p $TREE-SHA1

> recursively.
>
> I don't have any idea how to do that in Mercurial, though.
>
> > -Emanuel

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From brett at python.org Mon May 9 12:24:00 2016
From: brett at python.org (Brett Cannon)
Date: Mon, 09 May 2016 16:24:00 +0000
Subject: [core-workflow] Some questions
In-Reply-To:
References:
Message-ID:

On Sun, 8 May 2016 at 16:29 Senthil Kumaran wrote:
>
> On Sun, May 8, 2016 at 4:12 PM, Émanuel Barry wrote:
>
>> I understand that there's
>> already a semi-official mirror of the cpython repo on GitHub, and I've
>> been
>> wondering why it isn't enough for our needs.
>>
>
> It is suitable for our needs. Our last discussion was about how do we
> ascertain that
> cpython git repo has the same history as the hg repo, so that after
> migrate we do not loose any information from the old system.
>

Right, we *hope* the mirror is good enough, but when Eli created it he didn't worry too much about accuracy so we need to evaluate if it's good enough to simply switch to or if it needs to be thrown out. Hence, the discussion about how to ascertain if the mirror is acceptable.

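One VCS-neutral way to approach that evaluation, along the lines of the earlier suggestion in this thread to hash every file outside the .hg and .git directories, might look like the sketch below. It assumes both repositories have been checked out at the same revision into two separate directories; the function names and the choice of SHA-256 are illustrative assumptions, not something the thread settled on.

```python
import hashlib
import os

# VCS metadata directories that are excluded from the comparison.
SKIP_DIRS = {".hg", ".git"}


def hash_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as handle:
                digests[rel] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def diff_trees(hg_root, git_root):
    """Return sorted relative paths that are missing on one side or differ."""
    hg_files, git_files = hash_tree(hg_root), hash_tree(git_root)
    return sorted(
        path
        for path in set(hg_files) | set(git_files)
        if hg_files.get(path) != git_files.get(path)
    )
```

Tracked files that exist for only one system (for example .hgtags or .hgignore on the Mercurial side) would probably show up as differences and need to be filtered by hand. This checks file contents at a single revision only and says nothing about history.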
>
> This could be done using:
>
> * check the number of commits in both repos for each branch
> * checking the hash of the source files in two repos.
> * (And do we go about validating each piece of commit log graph too)?
>
> If you have any suggestions, since you are using the cpython git mirror,
> please feel free to share your thoughts.
>

We will also want a mapping of hg commits to git commits for https://hg.python.org/lookup which might help with the validation of the mirror.

>
> Welcome to the party!
>

+1!

From victor.stinner at gmail.com Fri May 20 05:08:52 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 20 May 2016 11:08:52 +0200
Subject: [core-workflow] Pagure and Fedora
Message-ID:

Hi,

I just read a very interesting article about a new forge, Pagure:
"Pagure and Fedora"

https://lwn.net/SubscriberLink/687821/ddb9fc2c985a606a/

Pagure looks like a clone of GitHub implemented in Python (!) (Python 2 only for now, oooooh, but a Python 3 port is ongoing) and storing all data in Git! Excellent. Data: code, documentation, tickets, pull requests, etc. Just everything.

https://pagure.io/pagure

The main difference from GitHub is that you can more easily extract data to move to a new forge later. I also understand that it's free to host your own server, since Pagure is libre (free) software (GitHub requires a license, no?).

As written in the article, GitHub still has a major advantage: its "network" (its community).

I also shared the article because I read another very interesting article about Gerrit. Mike Bayer writes that Gerrit reviews are as important as the changes themselves. IMHO he's right, the information in reviews is very important and we should take care to keep it... especially if tomorrow we move to another forge ;-)

http://techspot.zzzeek.org/2016/04/21/gerrit-is-awesome/

It looks like we are going to lose all Rietveld reviews when moving to GitHub.
What if we move to Pagure tomorrow? :-p The CPython move to GitHub seems to have started. It looks like Pagure is still young, and GitHub has many advantages, but well, I wanted to share this project with you ;-) Note: GitHub was down a few minutes this morning ;-) https://status.github.com/ Victor From soltysh at gmail.com Fri May 20 16:21:33 2016 From: soltysh at gmail.com (Maciej Szulik) Date: Fri, 20 May 2016 22:21:33 +0200 Subject: [core-workflow] Pagure and Fedora In-Reply-To: References: Message-ID: On Fri, May 20, 2016 at 11:08 AM, Victor Stinner wrote: > Hi, > > I just read a very interesting article about a new forge, Pagure: > "Pagure and Fedora" > > https://lwn.net/SubscriberLink/687821/ddb9fc2c985a606a/ > > Pagure looks like a clone of GitHub implemented in Python (!) (Python > 2 only yet, oooooh, but a Python 3 port is ongoing) and storing all > data in Git! Excellent. Data: code, documentation, tickets, pull > requests, etc. Just everything. > > https://pagure.io/pagure > > The main difference with GitHub is that you can more easily extract > data to move to a new forge later. I also understand that it's free to > host your own server, since Pagure is a libre (free) software (GitHub > requires a license, no?). > > As written in the article, the GitHub still has a major advantage: its > "network" (its community). > > I also shared the article because I read another very interesting > article about Gerrit. Mike Bayer writes that Gerrit reviews are as > much importants as changes themself. IMHO he's right, the information > of reviews are very important and we should take to keep... especially > if tomorrow we move to another forge ;-) > http://techspot.zzzeek.org/2016/04/21/gerrit-is-awesome/ > > It looks like we are going to loose all Rietveld reviews when moving > to GitHub. What if we move to Pagure tomorrow? :-p > > The CPython move to GitHub seems to have started. 
It looks like Pagure > is still young, and GitHub has many advantages, but well, I wanted to > share this project with you ;-) > > Note: GitHub was down a few minutes this morning ;-) > > https://status.github.com/ > Victor, I'm positive Brett will "hug" you a lot during the language summit when talking about the GitHub migration. Anyway, as you've mentioned, one of the biggest advantages gh has over its competitors is its popularity, which greatly opens the gates to new contributors. Personally I'd prefer we focus on improving the workflow on top of gh to be as automated as possible. Here I'm talking about something similar to what we already have for OpenShift or Kubernetes, where one of the core contributors can ask the bot to test the PR (part of the tests or the full suite) and then mark it for merge. Additionally, since we will still keep b.p.o around for issues, we need to make the connection between it and gh as automatic as possible (including turning patches from b.p.o into PRs). The last one will actually be worked on by our GSoC student Anish Shah. Having said that, once we have all of those pieces in place I don't see any problem with yet another migration to e.g. Pagure, although I'm pretty sure you'll need to find a volunteer who will be able to spend a ton of time doing the migration, as Brett is currently doing, for which he deserves eternal gratitude :) Maciej PS. Every service, even the best one, has its downtime ;) From vgr255 at live.ca Fri May 20 22:38:04 2016 From: vgr255 at live.ca (Émanuel Barry) Date: Fri, 20 May 2016 22:38:04 -0400 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: I just wanted to let you guys know that I haven't been able to make any progress last week or this week on this, and it sucks because I really wanted to help on it, but a lot has happened in a short span of time and I don't know when I'll be able to work on anything for this.
I'll still try to free up some time to work on it, but it won't be for a while, sadly. -Emanuel From brett at python.org Sat May 21 16:11:17 2016 From: brett at python.org (Brett Cannon) Date: Sat, 21 May 2016 20:11:17 +0000 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: No worries, thanks for the update! On Fri, May 20, 2016, 21:39 Émanuel Barry wrote: > I just wanted to let you guys know that I haven't been able to make any > progress last week or this week on this, and it sucks because I really > wanted to help on it, but a lot has happened in a short span of time and I > don't know when I'll be able to work on anything for this. I'll still try > to free up some time to work on it, but it won't be for a while, sadly. > > > > -Emanuel > From tomnyberg at gmail.com Sat May 21 18:30:31 2016 From: tomnyberg at gmail.com (Thomas Nyberg) Date: Sat, 21 May 2016 18:30:31 -0400 Subject: [core-workflow] Some questions In-Reply-To: References: Message-ID: <5740E187.4060402@gmail.com> Hello, I'm attaching a script that is an initial attempt at doing this for the git side of things. Everything is done in bash at the moment. It does make use of GNU parallel (which is the parallel package in both Ubuntu and Debian). I think everything else it uses is a standard Linux tool (except git). Basically to run it do the following: 1) clone cpython.git into the current directory (i.e. try not to have any "generated" files) 2) put scan.sh in the current directory and run it there What it does is the following: It checks out every 1000th commit (can be changed) going backwards on the current branch and computes the md5sum for every file (except those found in .git) and puts the md5sums in a file in an outdir/ directory that it creates.
The names of the files are $num-$commit where the $num is the number of commits _backwards_ from the current commit (which makes sense if you think about iterating backwards from the current commit). Running this on my laptop took ~11 minutes. I uploaded the output directory here in case you don't feel like running it: http://thomasnyberg.com/outdir.tar.bz2 (Ignore the front page of my "website". I'm obviously not all that concerned by it...) In any case, this might be helpful for others in addition to myself. I figured it was best to email the list before continuing (maybe this isn't really what's needed...). Possible things to add to this: * doing something similar with comments * doing the same thing on all branches * maybe only compute the md5sum for changed files * little thought has gone into efficiency...there may be obvious gains hiding Of course something similar would have to be run with the hg version and then a comparison would need to be done. Hopefully this is helpful... Cheers, Thomas On 05/08/2016 07:28 PM, Senthil Kumaran wrote: > > On Sun, May 8, 2016 at 4:12 PM, Émanuel Barry > wrote: > > I understand that there's > already a semi-official mirror of the cpython repo on GitHub, and > I've been > wondering why it isn't enough for our needs. > > > It is suitable for our needs. Our last discussion was about how we > ascertain that > the cpython git repo has the same history as the hg repo, so that after > migrating we do not lose any information from the old system. > > This could be done using: > > * check the number of commits in both repos for each branch > * checking the hash of the source files in two repos. > * (And do we go about validating each piece of commit log graph too)? > > If you have any suggestions, since you are using the cpython git mirror, > please feel free to share your thoughts. > > Welcome to the party!
> > Thanks, > Senthil > > > _______________________________________________ > core-workflow mailing list > core-workflow at python.org > https://mail.python.org/mailman/listinfo/core-workflow > This list is governed by the PSF Code of Conduct: https://www.python.org/psf/codeofconduct > -------------- next part -------------- A non-text attachment was scrubbed... Name: scan.sh Type: application/x-shellscript Size: 997 bytes Desc: not available URL: From shah.anish07 at gmail.com Wed May 25 16:29:36 2016 From: shah.anish07 at gmail.com (Anish Shah) Date: Thu, 26 May 2016 01:59:36 +0530 Subject: [core-workflow] GitHub integration In-Reply-To: References: Message-ID: Hello everyone, My name is Anish Shah and I will be working on GitHub integration under GSoC 2016. This mail is briefing you all about the progress till now. [1] A new "GitHub PR URL" field on issue page - A developer can submit a link to GH pull request that he made. [2] Link PR in comments - currently, issue number and PEP gets link automatically in comments. Same way, "PR 123", "pull request 123" or "pullrequest 123" will get linked automatically. [3] Link GitHub PR automatically to b.p.o issues. - On GitHub, if there's a string like "fixes #123" in PR comments/title/body, then it gets linked to issue 123 automatically. Using GitHub webhooks, we can link PR to b.p.o issues if there's a string like "fixes bpo123". [4] Show PR status (open/closed/merged) on b.p.o issues page. Some things that I will be working over the next few weeks are :- - Posting some of the PR review comments on b.p.o. - Auto-conversion of patches to PR [1] http://psf.upfronthosting.co.za/roundup/meta/issue586 [2] http://psf.upfronthosting.co.za/roundup/meta/issue587 [3] http://psf.upfronthosting.co.za/roundup/meta/issue589 [4] http://psf.upfronthosting.co.za/roundup/meta/issue590 Thank You, Anish Shah -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From senthil at uthcode.com Wed May 25 16:45:11 2016 From: senthil at uthcode.com (Senthil Kumaran) Date: Wed, 25 May 2016 13:45:11 -0700 Subject: [core-workflow] GitHub integration In-Reply-To: References: Message-ID: Hello Anish, On Wed, May 25, 2016 at 1:29 PM, Anish Shah wrote: > My name is Anish Shah and I will be working on GitHub integration under > GSoC 2016. This mail is briefing you all about the progress till now. Welcome and good progress. For us to be aware, who is your assigned GSoC mentor for this project? You could also share the proposal that you submitted. Thanks, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From soltysh at gmail.com Thu May 26 16:02:16 2016 From: soltysh at gmail.com (Maciej Szulik) Date: Thu, 26 May 2016 22:02:16 +0200 Subject: [core-workflow] GitHub integration In-Reply-To: References: Message-ID: On behalf of the entire python community and as Anish mentor I'd like to thank him for the work he already did and I'm looking forward to see more :) To anyone else, the fact I'm Anish mentor does not stop you from reviewing his patches, in fact all reviews/comments are welcome both by Anish and myself :) Once again thanks, Maciej On Wed, May 25, 2016 at 10:45 PM, Senthil Kumaran wrote: > Hello Anish, > > On Wed, May 25, 2016 at 1:29 PM, Anish Shah wrote: >> >> My name is Anish Shah and I will be working on GitHub integration under >> GSoC 2016. This mail is briefing you all about the progress till now. > > > Welcome and good progress. For us to be aware, who is your assigned GSoC > mentor for this project? You could also share the proposal that you > submitted. 
> > Thanks, > Senthil > > > > _______________________________________________ > core-workflow mailing list > core-workflow at python.org > https://mail.python.org/mailman/listinfo/core-workflow > This list is governed by the PSF Code of Conduct: > https://www.python.org/psf/codeofconduct From mail at kushaldas.in Fri May 27 01:41:48 2016 From: mail at kushaldas.in (Kushal Das) Date: Thu, 26 May 2016 22:41:48 -0700 Subject: [core-workflow] Pagure and Fedora In-Reply-To: References: Message-ID: <20160527054148.GB7081@kdas-laptop.hp.lan> On 20/05/16, Maciej Szulik wrote: > On Fri, May 20, 2016 at 11:08 AM, Victor Stinner > Victor I'm positive Brett will "hug" you a lot during language summit > when talking about > GitHub migration. Anyway, as you've mentioned one of the biggest > advantages gh has > over its competitors is its popularity which greatly opens the gates > to new contributors. > Personally I'd prefer we focus on improving the workflow on top of gh > to be as much > automated as possible. Here I'm talking about something similar we > already have for > OpenShift or Kubernetes where one of the core contributors can either > ask the bot to > test the PR (part of the tests or full suite) and then mark it for > merge. We actually have a option of testing a patch using ci.centos.org, and sadly not many devels are using it :( I will be talking about it in the language summit. Kushal -- Fedora Cloud Engineer CPython Core Developer https://kushaldas.in https://dgplug.org From brett at python.org Mon May 30 18:22:17 2016 From: brett at python.org (Brett Cannon) Date: Mon, 30 May 2016 22:22:17 +0000 Subject: [core-workflow] Update from the PyCon language summit Message-ID: Basically the key things that came out of the language summit are two things. 
One is that moving cpython over will require matching what we have now: when a commit lands on the repo, bugs.python.org gets a message, and there is a way to associate an issue with a PR (probably something like detecting "Issue #" in the title of the PR and then making the association on bugs.python.org). Otherwise, dealing w/ sys._mercurial/sys._git and updating the devguide are the only requirements. Two, in order to shut down Rietveld we will need to back up the database of code reviews and dump the review information in some readable format (maybe something as simple as running wget over Rietveld).
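For the issue/PR association mentioned above, one possible shape of the "Issue #" detection is sketched below in Python. This is only an assumption about how such matching could look: the pattern and the `issue_numbers` helper are hypothetical and do not describe what bugs.python.org actually implements.

```python
import re

# Hypothetical pattern: the word "issue", optional whitespace, "#", then digits.
ISSUE_RE = re.compile(r"\bissue\s*#(\d+)", re.IGNORECASE)


def issue_numbers(pr_title):
    """Return every issue number referenced as "Issue #NNN" in a PR title."""
    return [int(number) for number in ISSUE_RE.findall(pr_title)]
```

A webhook handler could call `issue_numbers(title)` on each new pull request and record the PR on every issue returned; titles without a reference simply yield an empty list.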