August 14, 2018

Multiple Git Repositories: What’s the Best Way to Manage Them?

Git at Scale

In many large projects using Git, developers need to work with more code than can be comfortably managed in a single Git repository. Teams have come up with many creative ways to break code into smaller pieces, work on them independently, and then bring them back together into shippable products. There has to be a better way! Let’s take a look at some of the alternatives.

Git Submodules

Git itself has a built-in way to work with subprojects: submodules. A submodule embeds the content of a "foreign" Git repository inside another one. Each submodule is locked to a specific commit of the other project. If you need to track a newer version, you'll need to update the submodule, then commit the change in the outer repository.
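
As a rough illustration of that update-then-commit cycle, the Python sketch below only builds the Git command lines involved; it does not run them. The function name, the submodule path "libs/foo", and the commit message are hypothetical, and the sketch assumes the submodule follows a branch on its remote so that `git submodule update --remote` has something to move to.

```python
from typing import List


def submodule_bump_commands(path: str, message: str) -> List[List[str]]:
    """Return the Git commands that move a submodule to a newer commit
    and record the new pin in the outer repository."""
    return [
        # Fetch the submodule's remote and check out the tip of its tracked branch.
        ["git", "submodule", "update", "--remote", path],
        # Stage the changed submodule pointer in the outer repository.
        ["git", "add", path],
        # Commit the new pin so everyone else gets the same version.
        ["git", "commit", "-m", message],
    ]


if __name__ == "__main__":
    # "libs/foo" is a hypothetical submodule path used only for illustration.
    for cmd in submodule_bump_commands("libs/foo", "Bump libs/foo"):
        print(" ".join(cmd))
```

Until the last command is run, the outer repository still points at the old submodule commit, which is why the bump is a deliberate two-step operation.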

Why Choose Submodules?

  • Atomic commits; can't have a partial change to subproject state.
  • Tools are bundled with the default Git install (although subtree is only in new versions).

Things to Consider With Submodules

  • Requires additional setup.
  • Requires developers to learn new commands to handle changes in subprojects.

Security Concerns With Git Submodules

A few months ago, a major security flaw involving Git submodules was discovered. A defect in the Git source code allowed configuration elements, such as the content of the .git/config file and hook scripts, to execute code on a remote system (i.e., your system). This means that simply by pulling down some code, you could give an attacker control of your computer, and you might never even know it. After the vulnerability was discovered, Microsoft, GitHub, and GitLab all released patches. But what about all the open source Git servers out in the world that aren’t regularly updated? Chances are good that they would not be easy to patch, or that there is no one to patch them, leaving them vulnerable.

Repo: The Android Git Wrapper

Android is an extreme case of a project that has outgrown a single Git repository. Its codebase spans more than one thousand repositories. And the management task extends beyond the OS code itself to device-specific code, applications, and multiple versions and releases that need to be supported over a long period of time.

It's impractical to keep up-to-date with all of the Android repositories without some kind of tool on top of Git. Running "repo sync" automates the "git clone" and "git pull" operations needed to catch up with Android and get everything in sync. There are also "repo upload" and "repo download" commands, which automate the client side of interacting with the Gerrit code review system.
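
To make the idea of automating "git clone" and "git pull" across many repositories concrete, here is a minimal Python sketch. It is not how repo itself is implemented; the function name, the project mapping, and the example.com URLs are all hypothetical, and the function only plans the commands rather than executing them.

```python
import os
from typing import Dict, List


def plan_sync(projects: Dict[str, str], root: str) -> List[List[str]]:
    """For each project (local path -> clone URL), plan a clone if the
    checkout is missing locally, otherwise plan a pull."""
    commands = []
    for path, url in sorted(projects.items()):
        target = os.path.join(root, path)
        if os.path.isdir(os.path.join(target, ".git")):
            # Already cloned: bring the existing checkout up to date.
            commands.append(["git", "-C", target, "pull"])
        else:
            # Not present yet: fetch a fresh copy.
            commands.append(["git", "clone", url, target])
    return commands


if __name__ == "__main__":
    # Hypothetical two-project manifest; Android has more than a thousand.
    manifest = {
        "kernel": "https://example.com/kernel.git",
        "apps/mail": "https://example.com/mail.git",
    }
    for cmd in plan_sync(manifest, "."):
        print(" ".join(cmd))
```

Even this toy version shows why a wrapper is needed: the loop has to be repeated for every repository, and nothing ties the separate pulls together into one atomic update.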

Ironically, it has been widely discussed that the development teams at Google don’t even use Git or repo internally to manage their Android codebases.

Why Choose Repo?

  • Well-documented on the Android site.
  • Connects well to Gerrit.

Things to Consider With Repo

  • Designed to implement one workflow: the Android one. May not work well for alternate workflow designs.


Helix4Git

Helix4Git is a native Git server inside the Perforce Helix Core server. It looks just like Git to developers, but it provides several benefits to developers and administrators:

  • It scales globally.
  • It gives you a single source of truth.
  • It protects your intellectual property with compliance, security and reliability.

Each developer can work in a combined Git repository that can include code and history from multiple subprojects. Unlike wrapper tools, commits are atomic, since there are no separate repositories to deal with. And unlike submodules, there are no extra Git commands to learn.

If you're familiar with how Git uses content hashing for security, this might sound too good to be true. After all, every commit is dependent on the entire history of the project before it, through a chain of SHA-1 hashes. However, Helix4Git repositories have completely valid hashes, so they'll work fine with any Git implementation.
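
The dependency on earlier history can be shown with a simplified Python model. This is a sketch only: real Git commit objects hash a header plus tree, parent, author, and message fields, whereas the hypothetical `chain_ids` function below just hashes each change together with its parent's ID. It says nothing about how Helix4Git builds its repositories.

```python
import hashlib
from typing import List


def chain_ids(changes: List[bytes]) -> List[str]:
    """Simplified model: each commit ID is the SHA-1 of the parent's ID
    plus the commit's content, so it depends on all earlier history."""
    ids = []
    parent = ""  # the first commit has no parent
    for content in changes:
        commit_id = hashlib.sha1(parent.encode("ascii") + b"\n" + content).hexdigest()
        ids.append(commit_id)
        parent = commit_id
    return ids


if __name__ == "__main__":
    original = chain_ids([b"add README", b"add build script", b"fix typo"])
    rewritten = chain_ids([b"add LICENSE", b"add build script", b"fix typo"])
    # Only the first change differs between the two histories.
    for old, new in zip(original, rewritten):
        print(old[:10], new[:10], "same" if old == new else "different")
```

Changing the first entry changes every later ID as well, which is why a combined repository must produce a consistent hash chain for standard Git clients to accept it.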

Helix4Git lets you manage large-scale DevOps with continuous build, integration, and testing.

Why Choose Helix4Git?

  • Works just like native Git.
  • No extra client-side tools or skills required.
  • Scales globally.
  • Can replicate content around the globe.
  • 40-80% faster builds than other Git servers.

Things to Consider With Helix4Git

  • Server requires some additional admin expertise for initial setup.