Saturday, April 4, 2009

One DVCM to rule them all (follow up)


My original story about the performance of some DVCMs drew a few comments. I was told that bazaar degrades considerably once a repository holds thousands upon thousands of revisions, and that the repositories could be packed. I decided to follow up and see where git and bzr would stand with some thousands of revisions in them.

First I used git-svn to import some 20,000 revisions of a project into git. I got the first 20,696 revisions from kde. There were roughly a million, but I thought that would be enough; as a matter of fact, I spent a couple of days getting to those 20,696 revisions.

I then exported the content of that git repository and imported it into each of the VCMs to see how they would match up on that task.

Bazaar took hours to complete this import. The first 2000 revisions were imported in about 6 minutes... but by the end, every 100 (hundred, not thousand) revisions took roughly 10 minutes to import (that is one commit every 6 seconds). The repository ended up at about 554 MB (after packing).
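
To make that slowdown concrete, here is a minimal sketch of the arithmetic behind those figures. The helper name `seconds_per_revision` is made up for illustration; the inputs are just the rough timings quoted above (6 minutes for the first 2000 revisions, 10 minutes per 100 revisions near the end).

```shell
# Hypothetical helper: average seconds spent per revision,
# given total seconds and the number of revisions imported.
seconds_per_revision() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.2f\n", s / n }'
}

# First 2000 revisions in about 6 minutes (360 seconds)
seconds_per_revision 360 2000   # prints 0.18

# Near the end: 100 revisions in about 10 minutes (600 seconds)
seconds_per_revision 600 100    # prints 6.00
```

Going by these rough numbers, each revision cost about 33 times more at the end of the import than at the beginning.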

Git did the same import (so that I compared apples to apples) in less than 5 minutes and ended up with a repository of about 283 MB (after gc).

A diff from the halfway revision of the project to the last revision took bazaar some 9 minutes and 15 seconds. Git did it in about 28 seconds. I think bzr won't recover from that liver hook.

When I tried to move to that halfway revision, git took 17 seconds to do it (reset --hard revid). bzr took... well, to tell you the truth, I forgot about it. I went for lunch, came back, and it was still working on it. In Tenchu terms, git got a Grand Master (by the way... I'd love to play Tenchu!).
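
For anyone who wants to repeat this kind of measurement, a rough sketch of a wall-clock timer follows. The function name `elapsed_seconds` and the `halfway_rev` variable are made up for illustration, and this is not necessarily how the timings above were taken.

```shell
# Hypothetical helper: run a command, discard its output,
# and print the elapsed wall-clock time in whole seconds.
elapsed_seconds() {
    local start end
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example usage inside a git working copy
# (halfway_rev is a placeholder for the revision id):
#   elapsed_seconds git diff "$halfway_rev" HEAD
#   elapsed_seconds git reset --hard "$halfway_rev"
```

Whole-second resolution is coarse, but it is plenty when one tool answers in seconds and the other in minutes.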

Well... git did mop the floor with bzr on a big repo after all, both in terms of repository size and performance.

Should I include mercurial? Could it withstand git? How would I do the import to begin with? I tried with hg import -, but it was using massive amounts of memory (bzr did too, by the way... I barely managed the import with the memory I had) and I didn't know if it was the right way to do it.

bzr finally reverted. It took 46 minutes.

2 comments:

  1. I wonder what repo format you are using for bzr. The default (pack-0.92) isn't very good. 1.9-rich-root is much better.
    You could try running the numbers after upgrading.

    [foo]% bzr init
    Created a standalone tree (format: pack-0.92)
    [foo]% bzr upgrade --1.9-rich-root

    bzr 1.14 is planning to land a new format (brisbane-core) which is supposed to perform better.

    1.9-rich-root should perform better, though I doubt it will be as fast as git. I would certainly be interested in the numbers.

    bzr is my preferred DVCS, but performance is certainly not the reason for that :)

  2. Let's make a deal. When bzr 1.14 comes out, I'll run this test again. Fair enough? :-)