Managing Multiple Git Repositories — How to Do It
A single repository won't cut it for large Git projects today. That's why many teams are managing multiple Git repositories.
When to Use Multiple Git Repositories
You should use multiple Git repositories if your codebase is too large to maintain in a single Git repository. Git can't scale to handle tens of thousands of users or hundreds of petabytes of data in one repository.
In that situation, using multiple Git repositories is the practical way to work efficiently. It lets each team work independently and finish its work faster. You can also make sure that developers have access only to the repositories they need, which makes your Git setup more secure.
Challenges With Managing Multiple Git Repositories
There are some serious challenges to managing multiple repositories in Git.
These challenges include:
- Finding a reliable source of truth.
- Managing dependencies across repos.
- Reviewing changes with Git pull requests.
- Making changes (or rollbacks) that sync across repos.
- Identifying repositories with uncommitted changes.
- Enforcing workflows.
- Resolving conflicts across repos.
Luckily, these challenges can be solved by selecting a better way to manage repositories in Git.
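To make one of these challenges concrete, here is a minimal Python sketch of identifying repositories with uncommitted changes. It assumes that all repositories are checked out as direct subdirectories of one workspace folder and that the git command is on the PATH; the function names are illustrative and do not come from any of the tools discussed here.

```python
import subprocess
from pathlib import Path


def find_repos(root):
    """Return direct subdirectories of root that contain a .git entry."""
    return sorted(
        p for p in Path(root).iterdir()
        if p.is_dir() and (p / ".git").exists()
    )


def has_uncommitted_changes(porcelain_output):
    """True if `git status --porcelain` printed at least one entry."""
    return any(line.strip() for line in porcelain_output.splitlines())


def dirty_repos(root):
    """Run `git status --porcelain` in each repo and keep the dirty ones."""
    dirty = []
    for repo in find_repos(root):
        result = subprocess.run(
            ["git", "-C", str(repo), "status", "--porcelain"],
            capture_output=True, text=True, check=True,
        )
        if has_uncommitted_changes(result.stdout):
            dirty.append(repo)
    return dirty
```

Calling dirty_repos on the workspace folder returns the paths that still need a commit or a stash. A loop like this is workable for a handful of repositories, but it has to be rerun by hand and knows nothing about dependencies between them.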
3 Git Repository Management Tools: Pros & Cons
Here are three options for Git repository management tools — and their pros and cons.
1. Add Git Submodules For Git Repositories
Git submodules are a built-in way to work with subprojects — and bring other Git repositories in.
What Is a Git Submodule?
A submodule is a way to embed the content of a "foreign" Git repository in another one. A submodule is locked to a given version (a specific commit) of the other project. If you need to track a newer version, you'll need to update the submodule, then commit the change in the outer repository.
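The add-then-update cycle described above can be sketched as two short command sequences. This is an illustrative Python sketch that only builds the command lines; the URL and path in the usage part are hypothetical, and the commit messages are placeholders.

```python
import shlex


def submodule_add_commands(url, path):
    """Commands to embed another repository at `path` in the outer repo."""
    return [
        # Clones the foreign repo and stages .gitmodules plus the new path.
        ["git", "submodule", "add", url, path],
        ["git", "commit", "-m", f"Add submodule {path}"],
    ]


def submodule_bump_commands(path):
    """Commands to move the submodule to a newer commit and record it."""
    return [
        # Fetch and check out the tip of the submodule's tracked branch.
        ["git", "submodule", "update", "--remote", path],
        # Stage the new pinned commit in the outer repository.
        ["git", "add", path],
        ["git", "commit", "-m", f"Update submodule {path}"],
    ]


if __name__ == "__main__":
    # Hypothetical URL and path, for illustration only.
    steps = submodule_add_commands("https://example.com/shared.git", "libs/shared")
    steps += submodule_bump_commands("libs/shared")
    for cmd in steps:
        print(shlex.join(cmd))
```

The point of the second sequence is that the outer repository keeps pointing at the old commit until the update is explicitly staged and committed there.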
Pros: Why Use Submodules?
There are two pros of using Git submodules:
- Atomic commits: the outer repository can't record a partial change to subproject state.
- Tools are bundled with the default Git install (although the related git subtree command is included only in newer versions).
Cons: Security Concerns & More
The biggest con to using Git submodules is the security risk.
A major security flaw was discovered in Git submodules in 2018. A defect in the Git source code allowed configuration elements, such as the contents of the .git/config file and hook scripts, to be used to execute code on the machine fetching a repository (i.e., your system). This means that when you pull down some code, an attacker could get control of your computer and you might never even know it.
After the vulnerability was discovered, Microsoft, GitHub, and GitLab all released patches. But what about all the open source Git servers out in the world that aren't regularly updated? Chances are good that they are not easy to patch, or that there is no one to patch them, which leaves them vulnerable.
Security risks aren't the only con to Git submodules. Other cons include:
- Additional setup.
- Developers need to learn new commands to handle changes in subprojects.
2. Pull in Multiple Git Repositories With Repo
Repo is a Google-built tool to manage multiple Git repositories. It's designed for Android development (working on Android code requires Git).
What Repo Does
Repo makes it possible to keep up with the more than 1,000 repositories that make up Android.
Android is an extreme case of a project that has outgrown a single Git repository. Android has more than 1,000 repos in its codebase. And, the management task extends beyond the OS code itself. It extends to device-specific code, applications, and multiple versions/releases that need to be supported over a long period of time.
It's impractical to keep up-to-date with all of the Android repositories without some kind of tool on top of Git. To automate the "git clone" and "git pull" operations to catch up with Android, you need to run a "repo sync". There are also "repo upload" and "repo download" commands, which automate the client side of interacting with the Gerrit code review system.
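As a rough illustration of what automating "git clone" and "git pull" across many repositories involves, here is a conceptual Python sketch. It is not how Repo is implemented: Repo reads its project list from an XML manifest, while this sketch assumes a plain dictionary of hypothetical directory names and URLs, and it only plans the commands rather than running them.

```python
from pathlib import Path


def plan_sync(projects, workspace):
    """Return one git command per project: clone if absent, otherwise pull.

    `projects` maps a local directory name to a remote URL.
    """
    commands = []
    for name, url in sorted(projects.items()):
        target = Path(workspace) / name
        if (target / ".git").exists():
            # Already checked out: catch up with the remote.
            commands.append(["git", "-C", str(target), "pull"])
        else:
            # Not present yet: fetch a fresh copy.
            commands.append(["git", "clone", url, str(target)])
    return commands
```

Each planned command could then be passed to subprocess.run. With more than 1,000 projects, the value of a dedicated tool lies in everything this sketch leaves out, such as parallel fetches, branch handling, and the review upload step.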
Ironically, it is widely reported that the development teams at Google don't even use Git or Repo internally to manage their Android codebases.
Pros: Why Use Repo?
There are two main pros to using Repo:
- Well-documented on the Android site.
- Connects well to Gerrit.
Cons: Designed For Android
The biggest con to using Repo is that it's designed to implement one workflow: the Android one. It may not work well for alternative workflow designs.
3. Manage Multiple Git Repositories in One Project With Perforce
Perforce offers two Git tools that make it easy to manage multiple Git repositories — even in one project.
What Helix TeamHub + Helix4Git Do
With Helix TeamHub and Helix4Git, commits are atomic, unlike with wrapper tools. There are no separate repositories to deal with. And unlike with submodules, there are no extra Git commands to learn.
If you're familiar with how Git uses content hashing for security, this might sound too good to be true. After all, every commit is dependent on the entire history of the project before it, through a chain of SHA-1 hashes. However, Helix4Git repositories have completely valid hashes, so they'll work fine with any Git implementation.
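To see why every commit depends on the history before it, here is a minimal Python sketch of how Git derives object IDs. It shows standard Git hashing only, not anything specific to Helix4Git. The author identity and timestamp are made up, and the commit body is simplified to the fields needed for the illustration.

```python
import hashlib


def git_object_id(obj_type, body):
    """SHA-1 over '<type> <size>', a NUL byte, and then the object body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree_id, parent_id, message):
    """Build a simplified commit body (fixed, made-up author and timestamp)."""
    who = "Example Dev <dev@example.com> 0 +0000"
    lines = [f"tree {tree_id}"]
    if parent_id:
        # The parent's ID is part of the hashed content.
        lines.append(f"parent {parent_id}")
    lines += [f"author {who}", f"committer {who}", "", message, ""]
    return "\n".join(lines).encode()


if __name__ == "__main__":
    empty_tree = git_object_id("tree", b"")
    first = git_object_id("commit", commit_body(empty_tree, None, "first"))
    second = git_object_id("commit", commit_body(empty_tree, first, "second"))
    print(first)
    print(second)
```

Because the parent ID is hashed into each commit, changing anything in an earlier commit changes the ID of every commit after it. That is why a server has to produce exactly the right hashes for standard Git clients to accept its history.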
Pros: Why Use Perforce Tools?
Helix TeamHub can host your repositories as they are today. And Helix4Git looks just like Git to developers, but it provides several benefits to developers and administrators. These benefits include:
- Global scale.
- Single source of truth.
- IP protection with compliance, security, and reliability.
- No extra client-side tools or skills required.
- Replication available globally.
- 40–80% faster builds.
Cons: Admin Expertise
The biggest con to using Helix4Git is that the server requires some additional admin expertise for initial setup. However, Perforce experts can help you with this.
Get Git Repository Management Tools
Git repository management tools — like Helix TeamHub and Helix4Git — make it easy to manage multiple repositories.
You can get started with Git repository hosting in Helix TeamHub for free, for up to 5 users and 1 GB of data.