July 21, 2015

Git Push Just Got a Whole Lot Faster with Git Fusion 2015.2

Git at Scale

Git Fusion 2015.2 includes a major update to its git push implementation that:

  • Improves performance by 10x or better
  • Reduces memory consumption, usually keeping it under a few GB
  • Fully populates all branches within Helix with revisions de-duplicated automatically

Let's take a look at each of these in turn.

Faster

Git Fusion’s original implementation repeated a loop for each Git commit:

  • Determine which of the files are already in Helix
  • Prepare the actions that need to be made
  • Submit the actions to the various files

That process was fine for small pushes, but it took far too long when a push involved hundreds (or hundreds of thousands) of commits. In contrast, Git Fusion 2015.2’s fast convert code does its work locally on the Git Fusion server: it writes a set of Helix journal files with entries for each file revision and changelist, then records all of that work in a single operation.
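To make the difference concrete, here is a minimal sketch that counts server round trips under each approach. It is not Git Fusion's code: FakeServer, push_per_commit, and push_batched are hypothetical names invented for this illustration, and the "journal" is just a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical stand-in for a Helix server; it only counts round trips."""
    round_trips: int = 0
    revisions: list = field(default_factory=list)

    def query_existing(self, paths):
        # One round trip to ask which paths the server already knows about.
        self.round_trips += 1
        known = {path for path, _commit in self.revisions}
        return known & set(paths)

    def submit(self, records):
        # One round trip to record a batch of (path, commit) revisions.
        self.round_trips += 1
        self.revisions.extend(records)


def push_per_commit(server, commits):
    """Original style: determine, prepare, and submit once per commit."""
    for commit_id, paths in commits:
        existing = server.query_existing(paths)
        actions = [(path, "edit" if path in existing else "add") for path in paths]
        server.submit([(path, commit_id) for path, _action in actions])


def push_batched(server, commits):
    """Fast-convert style: build every record locally, then submit once."""
    journal = []
    for commit_id, paths in commits:
        journal.extend((path, commit_id) for path in paths)
    server.submit(journal)


commits = [(f"c{i}", ["README", f"src/file{i}.c"]) for i in range(1000)]

slow = FakeServer()
push_per_commit(slow, commits)

fast = FakeServer()
push_batched(fast, commits)

print(slow.round_trips, fast.round_trips)  # 2000 1
```

Both servers end up with the same revisions. Only the number of conversations with the server changes, and that is where the per-commit loop loses its time.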

The result is dramatically faster. Repos that used to take hours to convert now take minutes. In fact, the new approach is so much faster that many of our internal test repos push so quickly that their times amount to little more than statistical noise. We’ve had to find much larger repos to get meaningful results. And that’s a nice problem to have.

Less Memory

Git Fusion’s original git push accumulated an in-memory list of every Git commit and tree object so that it could back them up into Helix for later recovery. Anyone watching Git Fusion’s memory usage during a push could see it march inexorably upward. Huge repos could consume 100GB or more of memory, far more than most Git Fusion servers had available.

Git Fusion 2015.2’s fast convert code offloads all that storage to disk, storing Git objects directly as p4 unzip payloads it will eventually send to Helix. It also uses a SQLite database to hold its analogue of the Helix Versioning Engine’s librarian.
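The general idea of spilling objects to disk and keeping only a small index can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not Git Fusion's actual schema or file layout: the objects table, the open_index and record_object helpers, and the use of a plain SHA-1 of the data as a key are all invented for the example.

```python
import hashlib
import os
import sqlite3
import tempfile


def open_index(db_path):
    """Open (or create) a small on-disk index of stored objects."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        " key TEXT PRIMARY KEY,"
        " kind TEXT NOT NULL,"
        " payload_path TEXT NOT NULL)"
    )
    return conn


def record_object(conn, spool_dir, kind, data):
    """Write the payload to a file and keep only one index row for it."""
    # Illustrative key only; real Git object ids also hash a type/size header.
    key = hashlib.sha1(data).hexdigest()
    payload_path = os.path.join(spool_dir, key)
    if not os.path.exists(payload_path):
        with open(payload_path, "wb") as handle:
            handle.write(data)
    conn.execute(
        "INSERT OR IGNORE INTO objects VALUES (?, ?, ?)",
        (key, kind, payload_path),
    )
    return key


with tempfile.TemporaryDirectory() as workdir:
    conn = open_index(os.path.join(workdir, "index.db"))
    for i in range(100):
        record_object(conn, workdir, "commit", f"commit {i}".encode())
    conn.commit()
    row_count = conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
    conn.close()

print(row_count)  # 100
```

The process never holds more than one payload in memory at a time. Everything else lives in files and in the SQLite index, so memory use stays roughly flat however many objects are pushed.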

The result is that most repos now push with under 1GB of active memory consumed. In fact, the highest consumption we’ve seen with the new approach is 10GB of memory, a huge improvement over 100GB or more.

Fully Populated Branches Everywhere

Git Fusion does a lot of work to minimize the number of files it branches within Helix. Fewer db.rev records and fewer db.integed records mean big savings for the Helix Versioning Engine’s database.

But this work costs a lot of compute time and slows git push. It also creates “lightweight branches” that Helix users cannot immediately use.

Git Fusion 2015.2’s fast convert code creates fully populated branches for each branch of Git history. The result is a much faster git push that leaves Helix depot branches available to any authorized user.

Git Fusion still minimizes the number of branches. But it will duplicate a series of Git commits across several Helix depot branches, preferring a few (or a few hundred) duplicate changelists to creating another fully populated depot branch.

File Revisions De-duplicated

Git Fusion 2015.2’s fast convert code stores exactly one copy of each blob that it copies from Git. It then uses the lazy copy ability of the Helix Versioning Engine to refer to this one copy whenever a file revision holds that data. This saves a lot of disk space on the Perforce server.
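The store-once, reference-many idea looks roughly like the following sketch. BlobStore and its methods are hypothetical names for illustration; the real lazy copy mechanism lives inside the Helix Versioning Engine and is not modeled here.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: each distinct blob is kept exactly once."""

    def __init__(self):
        self.blobs = {}      # content hash -> bytes, stored once
        self.revisions = {}  # (depot_path, rev) -> content hash

    def add_revision(self, depot_path, rev, data):
        digest = hashlib.sha1(data).hexdigest()
        # setdefault stores the bytes only the first time this content appears.
        self.blobs.setdefault(digest, data)
        # Later revisions with the same content just point at the stored copy.
        self.revisions[(depot_path, rev)] = digest

    def read(self, depot_path, rev):
        return self.blobs[self.revisions[(depot_path, rev)]]


store = BlobStore()
license_text = b"Example license text\n"
for branch in ("main", "dev", "release"):
    store.add_revision(f"//depot/{branch}/LICENSE", 1, license_text)

print(len(store.revisions), len(store.blobs))  # 3 1
```

Three file revisions on three branches share one stored copy. That sharing is what keeps fully populated branches from multiplying disk usage.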

First Push Only

The new fast convert code applies only to the first push of a new repo. One of the reasons it can run so quickly is that it does not check with the Helix Versioning Engine for existing history.

But this isn’t as limiting as it sounds. Usually the first push of a group’s repo is the largest. Pushes after that are typically smaller, incremental pushes that complete within a few seconds.

And yes, before you ask, we’re considering extending “fast convert” to second-or-later pushes.

How to Get It

Getting started with Git Fusion 2015.2 is easy:

Step 1: Upgrade your Perforce Helix Versioning Engine
Git Fusion 2015.2 requires the very latest version of P4D (2015.1/1171507+) because it includes the necessary support for Git Fusion’s use of p4 unzip.

Step 2: Install Git Fusion 2015.2

Step 3: Push a new repo
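
For Step 1, a quick way to check a server against the 2015.1/1171507 minimum is to compare the release and change numbers from its version string. The sketch below assumes the string follows a P4D/platform/release/change pattern, as in the server version line that p4 info reports. The helper names and the sample strings (including their dates) are made up for illustration.

```python
import re

# Minimum from Step 1: release 2015.1, change 1171507.
MINIMUM = (2015, 1, 1171507)


def parse_p4d_version(server_version):
    """Extract (year, release, change) from a P4D version string."""
    match = re.search(r"P4D/[^/]+/(\d+)\.(\d+)/(\d+)", server_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {server_version!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(server_version, minimum=MINIMUM):
    """True if the server is at or beyond the required release and change."""
    return parse_p4d_version(server_version) >= minimum


# Sample strings invented for this example.
print(meets_minimum("P4D/LINUX26X86_64/2015.1/1171507 (2015/06/18)"))  # True
print(meets_minimum("P4D/LINUX26X86_64/2014.2/978861 (2015/01/01)"))   # False
```

Comparing integer tuples avoids the pitfalls of comparing version strings as text, where "2015.10" would sort before "2015.2".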

Once you upgrade, you'll find that Git Fusion 2015.2 makes Git a faster and better Perforce Helix client than ever before.