[Python-Dev] Actual Mercurial Roadmap for February (Was: svn outage on Friday)

John Arbash Meinel john at arbash-meinel.com
Tue Feb 22 17:56:10 CET 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2/22/2011 9:41 AM, anatoly techtonik wrote:
> On Fri, Feb 18, 2011 at 4:00 PM, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
>> On Fri, Feb 18, 2011 at 14:41, anatoly techtonik <techtonik at gmail.com> wrote:
>>> Do you have a public list of stuff to be done (i.e. Roadmap)?
>>> BTW, what is the size of Mercurial clone for Python repository?
>>
>> There is a TODO file in the pymigr repo (though I think that is
>> currently inaccessible).
> 
> Can you provide a link? I don't know where to search. Should we open a
> src.python.org domain?
> 
>> I don't have a recent optimized clone to check the size of, yet.
> 
> What is the size of non-optimized clone then? I know that a clone of
> such relatively small project as MoinMoin is about 250Mb. ISTM that
> Python repository may take more than 1GB, and that's not acceptable
> IMHO. BTW, what do you mean by optimization - I hope not stripping the
> history?

Mercurial repositories are sensitive to the order in which data is
inserted into them. So re-ordering the revisions at insert time (while
keeping the order topologically valid) can dramatically improve
compression.

The quick overview is that in a given file's history, each diff is
computed against the text stored immediately before it for that file,
which is not necessarily its parent revision. So if you have a history
like:

 foo
  | \
 foo baz
 bar foo
  | /
  baz
  foo
  bar

Depending on which branch is inserted first, this can be stored as, for
example (left branch first, one delta per revision):

 foo

 +bar

 -bar
 +baz

 +bar
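
As a rough illustration of the storage scheme described above, the
following Python sketch stores each text as a line delta against whatever
text was stored immediately before it. The helper names `delta` and
`store` are made up for this example, and `difflib.SequenceMatcher` only
stands in for Mercurial's own delta algorithm, which differs in detail.

```python
import difflib


def delta(prev, curr):
    """Return '-line' / '+line' entries that turn prev into curr."""
    ops = []
    matcher = difflib.SequenceMatcher(a=prev, b=curr, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            ops += ["-" + line for line in prev[i1:i2]]
        if tag in ("insert", "replace"):
            ops += ["+" + line for line in curr[j1:j2]]
    return ops


def store(texts):
    """Delta each text against the one stored just before it."""
    stored = []
    prev = []
    for text in texts:
        stored.append(delta(prev, text))
        prev = text
    return stored


# The four texts from the small example history (assumed line contents).
root = ["foo"]
left = ["foo", "bar"]
right = ["baz", "foo"]
merged = ["baz", "foo", "bar"]

left_first = store([root, left, right, merged])
right_first = store([root, right, left, merged])

for entry in left_first:
    print(" ".join(entry))
```

In a history this small both insertion orders cost about the same; the
order only starts to matter once the branches get long.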

This matters more if the history stays diverged for a long stretch:

 A
 |\
 B C
 | |
 D E
 | |
 F G
 : :
 X Y
 |/
 Z


In this case, you could end up with contents that look like:

 A +B +D +F +X -BDFX+C +E +G +Y +BDFXZ

Or you could have the history 'interleaved':

 A +B -B+C -C+BD -BD+CE -CE+BDF -BDF+CEG -...
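
To put rough numbers on the difference, here is a toy Python model (not
Mercurial's actual storage format): every revision on a branch appends
one line, and the cost of storing a revision is the number of lines
removed plus the number added relative to the previously stored text. The
names `delta_cost`, `total_cost` and `branch`, and the branch length of
10, are assumptions made for this sketch.

```python
def delta_cost(prev, curr):
    """Lines removed plus lines added when going from prev to curr."""
    prev_set, curr_set = set(prev), set(curr)
    return len(prev_set - curr_set) + len(curr_set - prev_set)


def total_cost(revisions):
    """Sum of delta costs when each revision is stored after the last."""
    cost = 0
    prev = []
    for rev in revisions:
        cost += delta_cost(prev, rev)
        prev = rev
    return cost


def branch(base, additions):
    """Build a branch where each revision appends one line."""
    revs = []
    current = list(base)
    for line in additions:
        current = current + [line]
        revs.append(current)
    return revs


n = 10  # revisions per branch (arbitrary choice for the example)
base = ["A"]
left = branch(base, [f"L{i}" for i in range(n)])
right = branch(base, [f"R{i}" for i in range(n)])
merge = left[-1] + right[-1][1:] + ["Z"]

# One branch after the other versus strictly alternating between them.
grouped = [base] + left + right + [merge]
interleaved = [base] + [rev for pair in zip(left, right) for rev in pair] + [merge]

grouped_cost = total_cost(grouped)
interleaved_cost = total_cost(interleaved)
print("grouped:", grouped_cost, "interleaved:", interleaved_cost)
```

In this model the grouped order grows linearly with the branch length,
while the interleaved order grows roughly quadratically, because every
switch between branches has to undo everything the other branch has
accumulated so far.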

There are tools that take a history file and rewrite it to be more
compact. I don't know much more than that myself. But especially with
something like an svn conversion, which probably works on one revision
at a time while people are committing to different branches
concurrently, I would imagine the conversion could easily end up in the
worst case.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1j6qoACgkQJdeBCYSNAAPzPgCdEOJsHf4Xf4lZH+jHX42FQb8J
sQoAn3JuCmDcsyv0JZpXtbVJoGewA+7t
=M8DI
-----END PGP SIGNATURE-----

