October 20, 2015

Narrow Cloning with GitSwarm

Git developers working with repos that contain large numbers of files or large binary files know all too well how performance degrades in such cases. As a result, the Git community has long wanted the ability to clone only part of a repo, so-called “narrow cloning”, but Git still lacks the feature and it doesn’t seem to be a development priority.

In contrast, the DVCS features of Perforce Helix include narrow cloning right out of the box, and even Git developers can now enjoy narrow cloning courtesy of GitSwarm and the Helix Versioning Engine. It isn’t yet as simple with Git as with Helix’s native DVCS features, because GitSwarm is still in its infancy. A simpler, more integrated user interface will be included in a future version, but our users don’t have to wait. This blog post explains how to make use of this helpful capability in the meantime.

I’ve already written in a previous blog post about how GitSwarm can import projects from Helix, but what isn’t so obvious is how these projects are served up by the Helix Versioning Engine courtesy of Git Fusion. Configuring a new repository for import is, in fact, as simple as defining it in a Git Fusion configuration file[1]. Let’s take a look at an example:

[@repo]
description = Talkhouse
charset = utf8
ignore-author-permissions = no
enable-git-branch-creation = yes
enable-git-merge-commits = yes

[Talkhouse-master]
git-branch-name = master
view = //depot/Talkhouse/main-dev/... ...

[Talkhouse-rel1.0]
git-branch-name = rel1.0
view = //depot/Talkhouse/rel1.0/... ...

[Talkhouse-rel1.5]
git-branch-name = rel1.5
view = //depot/Talkhouse/rel1.5/... ...

The above is taken from the configuration file for our Talkhouse demo project, which is used in a variety of demonstration and other materials. It lives in the depot named “.git-fusion”; in my case its full path is “//.git-fusion/repos/Talkhouse/p4gf-config”. Git Fusion lets you define an arbitrary number of Git repos, each with its own configuration file that specifies what content to include.

This particular example defines three branches for the resulting Git repo, named “master”, “rel1.0”, and “rel1.5”. Each branch has its own view specification that tells Git Fusion how to map content from the Helix depots into the branch. It may look confusing at first, but it becomes easier with a little practice.
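
To make the mapping concrete, here is a single branch section annotated line by line. Treat it as a sketch: the “rel2.0” branch and its depot path are hypothetical additions, not part of the Talkhouse configuration above, and the comment lines assume the file accepts “#” comments on lines of their own.

# Section name: an identifier for this branch, unique within the file.
[Talkhouse-rel2.0]
# Name of the branch as Git users will see it.
git-branch-name = rel2.0
# Left side: the depot path to read from; "..." matches everything below it.
# Right side: the destination in the Git work tree; a bare "..." is the root.
view = //depot/Talkhouse/rel2.0/... ...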

What might not be immediately apparent is just how much power and flexibility this provides for defining Git repos. They may be aggregated from multiple Helix depots, include as much or as little content as desired, and even remap content to different file/folder names. This makes it possible for GitSwarm to enable what is essentially narrow cloning like no other Git management solution, carving off Git-sized slices from huge Helix monorepos with ease. Consider the following:

[@repo]
description = JustTheCode
charset = utf8
ignore-author-permissions = no
enable-git-branch-creation = yes
enable-git-merge-commits = yes

[JustTheCode-master]
git-branch-name = master
view = //depot/MyHugeProject/main/source/... ...

That example exposes a Git Fusion repo named “JustTheCode” that includes only the source folder from the main stream of a Helix depot, remapping it into the root folder for the master branch of the resulting Git repo. I could use the same technique to expose just the artwork, include other folders, limit my repo to just a few files, or whatever I wish. The possibilities are endless.
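
As a sketch of one such variation, the configuration below combines two folders into a single repo and remaps each into its own subfolder. The repo name, the “artwork” depot path, and the target folder names are all hypothetical, and the indented continuation line assumes Git Fusion accepts a multi-line view with one mapping per line; check the Git Fusion guide[1] before relying on it.

[@repo]
description = CodeAndArt
charset = utf8

[CodeAndArt-master]
git-branch-name = master
view = //depot/MyHugeProject/main/source/... src/...
       //depot/MyHugeProject/main/artwork/... art/...

With a view like this, the source folder would land in “src” and the artwork in “art” inside the Git repo, rather than at the root as in the JustTheCode example.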

Once the configuration file is in its proper place, the repo will be available for import and mirroring in GitSwarm. You need only create a new GitSwarm project and make the connection. Git developers may then clone immediately, just as they’re accustomed to doing. All of their changes will be mirrored in Helix when pushed to GitSwarm, just as all changes made in Helix to any of the included content will be mirrored in GitSwarm and picked up when Git developers next pull.

In conclusion, GitSwarm with Helix is the only complete Git management solution for the enterprise that lets you work locally and scale globally. It may be downloaded for free at the Perforce web site, so why not give it a spin and make all of your stakeholders much happier with narrow cloning?


[1] Full details are available in the Git Fusion guide section on “Setting up Repos”, available online at http://www.perforce.com/perforce/r15.3/manuals/git-fusion/chapter_dyn_ngj_3l.html