[core-workflow] Help needed: best way to convert hg repos to git?

Pierre-Yves David pierre-yves.david at ens-lyon.org
Mon Feb 15 16:55:01 EST 2016



On 02/15/2016 10:12 AM, Petr Viktorin wrote:
> On 02/14/2016 06:24 PM, Pierre-Yves David wrote:
>> On 02/13/2016 02:56 AM, Nick Coghlan wrote:
>>> On 13 February 2016 at 11:15, Brett Cannon <brett at python.org> wrote:
>>>> I don't remember the story behind cpython-fullhistory, but it's
>>>> obviously incomplete since it just stopped post-conversion. You will
>>>> need to find someone who knows (I'd ask on python-dev).
>>>>
>>>> Also realize that this will be our fourth VCS (cvs, svn, hg, now
>>>> git). This is not going to be a perfect history of the semantic
>>>> actions of commits from the beginning of time, simply because these
>>>> VCS tools all use different concepts.
>>>
>>> There's also the fact that prior to the move to SourceForge in 2000,
>>> all changes had to be funneled through the half dozen or so people
>>> with write access to the CVS tree:
>>> https://docs.python.org/3/whatsnew/2.0.html#new-development-process
>>>
>>> I think it's definitely OK if future code archaeologists need to dig
>>> into the SVN repository to get a more complete view of CPython's
>>> history.
>>
>> I've never met a project that did not regret such a decision at some
>> point. Keeping older history is usually valuable. Mercurial has tools
>> powerful enough to let you put all the history back together, and I
>> assume git has that power too.
>>
>> This is your call, but I strongly recommend taking advantage of this
>> migration to put everything back together.
>>
>
> While "putting everything back together" would be great, it doesn't
> *have* to block the migration. Git has a command called "git replace"
> that lets you do this later.
>
> The Linux kernel (which switched to Git before Git migration tools
> existed) has a separate "early history" repo that you can "prepend" to
> the main one. Then, in your local copy, it looks like one unbroken
> history. Since Git commits are snapshots and not deltas, this works
> amazingly well -- it's just telling Git's object retrieval routine to
> retrieve <object X> instead of <object Y>. The disadvantage is that it
> has to be done in each clone individually -- no one can rewrite history
> for others.
>
> Two commands every future historian would have to run:
>      git fetch <url_for_old_history>
>      git replace --graft <first_commit_of_new_history> \
>          <last_commit_of_old_history>
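
For anyone who wants to try the two quoted commands, here is a minimal
sketch on throwaway repositories. The `old` and `new` repository names,
the demo identity, and the commit messages are made up for illustration;
only `git fetch` and `git replace --graft` come from the message above.

```shell
#!/bin/sh
# Sketch only: graft a stand-in "old history" repo under a "new history" repo.
set -eu

# Demo identity so commits work on a machine with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)   # scratch directory, hypothetical layout

# Two throwaway repositories with two empty commits each.
for repo in old new; do
    git init -q "$work/$repo"
    git -C "$work/$repo" commit -q --allow-empty -m "$repo: first commit"
    git -C "$work/$repo" commit -q --allow-empty -m "$repo: second commit"
done

# Step 1: fetch the old history into the new repository (lands in FETCH_HEAD).
git -C "$work/new" fetch -q "$work/old" HEAD
last_old=$(git -C "$work/new" rev-parse FETCH_HEAD)

# Step 2: make the root commit of the new history a child of the old tip.
first_new=$(git -C "$work/new" rev-list --max-parents=0 HEAD)
git -C "$work/new" replace --graft "$first_new" "$last_old"

# Count the commits now reachable from HEAD in this clone.
count=$(git -C "$work/new" rev-list --count HEAD)
echo "commits visible after graft: $count"
```

The graft is recorded under refs/replace/ in this clone only, which
matches the point that every clone has to repeat the step. Running
`git replace -d <first_commit_of_new_history>` should undo it.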

While this exists, I believe it would be much more convenient to get
the history right in the first place.

But that's not my call.

-- 
Pierre-Yves David

