March 3, 2014

Git Fusion Grab Bag

Git at Scale


I recently answered a bunch of questions about Git Fusion from one of our customers. If you're using or considering Git Fusion, you might have the same questions.

How do Git remote branches work in Git Fusion?

Git users sure do make a lot of branches. At any single time, I probably have 4 or 5 task branches active in my own repo. And yes, I push them to Perforce just fine, and my fellow Git-using coworkers pull them down and see them, too.

Git Fusion stores every file revision of every commit pushed, whether those commits are pushed to some master-ish branch mapped to //depot/main/..., or some ephemeral branch like "job01234", or part of some merged history that does not even have a Git branch name.

When you work in a branch other than master (or any other branch that maps to Perforce), Git Fusion still stores your commits as changelists, but puts the files in depot paths under //.git-fusion/branches/... instead of the usual //depot/main/... or wherever.

A Git merge commit between branches becomes a Perforce changelist with a bunch of p4 integ actions between the corresponding depot branches. This is also how Git Fusion stores any review pushed from Git: Git Fusion merges the reviewed code into master or wherever, then stores that merge commit as a shelved pending changelist for Swarm review.

A rebased Git commit becomes a Perforce changelist with p4 add/edit/delete actions. So if you rebase a bunch of work onto master then push, you get a linear history in //depot/main/... .

Can you delete a remote branch?

Yep. In Git, "delete a [remote] branch" really means "delete a [remote] branch reference", not "delete the history referred to by that branch." Both Git* and Perforce hang onto the history that the doomed branch referred to. It's actually a common workflow: work on a branch for a while, merge or rebase it into permanent history, then delete the branch reference.

*(Yes, Git will eventually garbage collect old history if not merged into some surviving branch or tag. But that won't affect Perforce or its copy of that doomed history.)

Merge vs. Rebase?

Merge and rebase have an effect on the shape of Git history, which in turn has an identical effect on Perforce history:

  • Merge commits isolate the development of a feature or fix, then integrate that into a master branch. A merge workflow will create many 'p4 integ' actions in Perforce. You end up with p4 filelog or P4V Revision Graph showing a file's threaded history as it is changed and merged across branches.
  • Rebased history keeps history linear. It creates a single Perforce depot branch, filled with p4 add/edit/delete actions.

Both workflows are common and supported by Git Fusion.

I tend to prefer rebase, because I like linear history and git bisect. Wait, let me rephrase that. I like linear history. I love git bisect.

Oh no! I pushed something I didn't mean to!

Wait, you mean I'm not the only person wracked with post-push regret? Nothing like that sinking feeling when you realize you just pushed in some code with a big giant debugging dump not commented out, or some experimental feature that wasn't supposed to be enabled.

Don't worry about it.

If you're on the Git side, just fix it and push the fix. This is how I fix breakages, if I can find and fix them easily.

If you're on the Perforce side, try P4V's "Back Out Submitted Changelist". This is how I fix breakages that take longer than a few minutes to fix.

In both cases, you end up with a new Perforce changelist that undoes the damage. Git users will see the new changelist/commit next time they pull. Perforce users will see it next time they p4 sync.

Which Perforce changelist goes with this Git commit?

Git Fusion decorates each Perforce changelist description with the sha1 of the Git commit that produced it. This was originally added as a way for scripts and tools to find which Git commit goes with which Perforce changelist. But it turns out to be mighty useful to humans, too.
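If you want a script to do the lookup, here is a minimal sketch. The exact layout of the decoration isn't spelled out here, so this assumes only that the sha1 shows up somewhere in the changelist description (say, the output of p4 describe -s) as a full 40-character lowercase hex token. The function name is made up for illustration.

```python
import re

# Assumption: the Git commit sha1 appears in the description as a
# standalone 40-character lowercase hex token.
SHA1_RE = re.compile(r"\b[0-9a-f]{40}\b")


def sha1_from_description(description):
    """Return the first sha1-looking token in a changelist description, or None."""
    match = SHA1_RE.search(description)
    return match.group(0) if match else None
```

Feed it the description text of any changelist that Git Fusion created. A changelist submitted directly from Perforce has no such token, so you get None back.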

But what about the other way?

Git Fusion does not decorate each Git commit message with the Perforce changelist number that produced it. That would change the commit sha1, which would change published history. Prohibited. Git users don't take kindly to having their sha1s change out from under them.

The Swarm team put in a spiffy little feature for us Git users: append a Git commit sha1 to Swarm's URL, and Swarm redirects to the corresponding Perforce changelist.

It's awesome. A clever programmer could probably write a wget/curl script to return the changelist number for any given sha1.
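Here is roughly what such a script might look like, in Python rather than wget/curl. Treat it as a sketch built on guesses: it assumes the sha1 goes directly after the Swarm base URL, that Swarm answers with an HTTP redirect, and that the redirect target ends in the changelist number. The host name in the usage note and both function names are hypothetical.

```python
import re
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so we can read the Location header."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def changelist_from_location(location):
    """Assumption: the redirect target ends with the changelist number."""
    match = re.search(r"(\d+)/?$", location)
    return int(match.group(1)) if match else None


def lookup_changelist(swarm_base_url, sha1):
    """Ask Swarm where a sha1 redirects to; return the changelist or None."""
    opener = urllib.request.build_opener(_NoRedirect)
    url = swarm_base_url.rstrip("/") + "/" + sha1
    try:
        opener.open(url)
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            return changelist_from_location(err.headers.get("Location", ""))
        raise
    return None  # no redirect happened
```

You would call it with your own Swarm address, something like lookup_changelist("https://swarm.example.com", sha1). If your Swarm requires a login, you will need to add authentication on top of this.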

Not using Swarm? Git Fusion actually stores what you need, encoded into the depot path of its copy of the original Git commit object:

$ p4 files //.git-fusion/objects/repos/gfmain/commits/a3/21/b55*
  4fdff24a2c6ea8cab1292-master,791313#1 - add change 791314 (ubinary)

You need to know the name of the repo (gfmain here), and then insert a few slashes to break up the sha1 after the second and fourth characters. The Perforce changelist number is stored at the end of the depot file path, after a comma. In the output above that is 791313, not the "change 791314" that added the commit object file itself.
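The slash-inserting and comma-hunting are easy to get wrong by hand, so here is a small sketch of both steps. It follows the path layout shown in the p4 files example above; the function names are mine, not part of Git Fusion.

```python
import re


def commit_object_pattern(repo, sha1):
    """Build a `p4 files` wildcard for Git Fusion's copy of a commit object.

    Accepts a full or abbreviated sha1 (at least 5 hex characters).
    """
    sha1 = sha1.lower()
    if not re.fullmatch(r"[0-9a-f]{5,40}", sha1):
        raise ValueError("expected 5 to 40 hex characters: %r" % sha1)
    # Slashes go after the second and fourth characters of the sha1.
    return "//.git-fusion/objects/repos/%s/commits/%s/%s/%s*" % (
        repo, sha1[:2], sha1[2:4], sha1[4:])


def changelist_from_files_line(line):
    """Pull the changelist number out of one line of `p4 files` output.

    The number sits at the end of the depot path, after the last comma
    and before the '#rev'. It is NOT the 'change N' later in the line.
    """
    depot_path = line.split("#", 1)[0]
    number = depot_path.rpartition(",")[2]
    if not number.isdigit():
        raise ValueError("no changelist number found in: %r" % line)
    return int(number)
```

Pass the result of commit_object_pattern("gfmain", "a321b55") to p4 files, then hand each output line to changelist_from_files_line.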

How does Git Fusion handle executable files? Binary files?

Git Fusion translates between Git file modes and Perforce file types. When a file gains or loses the Perforce +x type modifier, the corresponding Git commit sets the file mode to 100755 (executable) or 100644 (not executable). And vice versa.
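To make the mapping concrete, here is a toy version of that translation in both directions. This is my illustration of the rule, not Git Fusion's actual code. It only understands the base+modifiers spelling of Perforce file types (text+x, binary+kx), not older aliases such as xtext, and it ignores symlinks.

```python
GIT_MODE_EXECUTABLE = "100755"
GIT_MODE_REGULAR = "100644"


def git_mode_for(p4_filetype):
    """Perforce file type (e.g. 'text+x') -> Git file mode."""
    modifiers = p4_filetype.partition("+")[2]
    return GIT_MODE_EXECUTABLE if "x" in modifiers else GIT_MODE_REGULAR


def p4_filetype_for(p4_filetype, git_mode):
    """Set or clear the +x modifier on a Perforce file type to match a Git mode."""
    base, _, modifiers = p4_filetype.partition("+")
    modifiers = modifiers.replace("x", "")  # clear any existing x
    if git_mode == GIT_MODE_EXECUTABLE:
        modifiers += "x"
    return base + "+" + modifiers if modifiers else base
```

So a chmod +x committed in Git would turn text into text+x on the Perforce side, and clearing +x in Perforce would come back to Git as mode 100644.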

Binary files go into Perforce as binary. This skips Perforce's usual line-ending translation and, on Unicode-enabled Perforce servers, any character set conversion. Because changing bytes in a binary file tends to be bad.

It really does work.

Many of these questions about Git Fusion details are just tests around the edges of a larger question: "Does it really work? Will it work for me?"

Yeah. It really works. We use it here at Perforce. We've got branches and merges and continuous integration servers and Swarm reviews and all the joys and pain that active projects enjoy. We have Git and Perforce users all working on the same code. Git Fusion lets everyone use the tool best suited for their work.