November 15, 2019

Is Git Free?

The best thing about open source software is arguably freedom. Its proponents often break this down in two senses:

  1. Free as in beer (software someone lets you use for free).
  2. Free as in speech (software that is free for all to use and modify).

Is Git Free?

Git is free. It's a classic example because it’s free in both senses. It costs nothing for you to download and use as you wish. And you have the right to leverage and/or rework the source code as well.

But here’s the catch: free isn’t actually free. There are major hidden costs of using Git.

Why Git Isn't Free

Git isn't free for many reasons. It can't scale, and using it comes at a cost.

Definition of Scale

There are three dimensions of scale:

  • Vertical.
  • Horizontal.
  • Geographical.

Vertical scaling means wringing the most from existing assets. Horizontal scaling means spreading work across more assets. And geographical scaling refers to that special set of headaches involved in uniting resources and personnel scattered around the globe.

Git wasn’t designed with any of these elements of scale in mind, which leads to several costs (noted below). 

7 Costs of Using Git

Cost #1: Storage Model

Git's storage model costs your team time, because operations slow down as repositories grow.

This model is built around filing assets by their SHA-1 hash values. This is great for deduplicating storage. But it leaves much to be desired as repositories grow. A variety of Git commands require computing (and possibly recomputing) many hash values.

This process only grows slower as more content is added, so the model can't scale.
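To make the hashing concrete, here is a minimal Python sketch of how Git derives the ID it files a blob under. It assumes the default SHA-1 object format (Git also supports SHA-256 repositories), and the function name git_blob_id is ours for illustration, not part of Git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content.

    Git hashes a short header ("blob <size>" plus a NUL byte) followed by
    the raw file bytes. Assumes the default SHA-1 object format.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Identical content always maps to the same ID, which is what makes
# deduplication cheap.
first = git_blob_id(b"hello world\n")
second = git_blob_id(b"hello world\n")

# Any change to the content yields a different ID, so the whole file has
# to be hashed again. That work grows with the amount of content.
changed = git_blob_id(b"hello world!\n")

print(first == second)   # same bytes, same ID
print(first == changed)  # different bytes, different ID
```

The ID should match what git hash-object reports for the same bytes.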

Cost #2: Performance

Git's performance leads to extra costs as well.

Native Git does little to maximize use of computing hardware. Throwing additional processor cores or memory into the mix isn’t going to help you scale vertically.

Adding a Git hosting solution helps by putting a web-based front end on top of native Git. Doing so allows web applications to address the problem by handling multiple requests asynchronously.

Git hosting services start free, but ramp up costs as you add users and storage. (Check out Helix TeamHub's pricing for this service.) 

Cost #3: Large Binaries

Native Git can't handle large binaries. 

The Git community has responded to this largely through a series of extensions and workflow changes. The goal is to keep big files outside the actual version control repositories. Instead, they're integrated into local, working folders as needed.

Tools like git-annex and Git LFS are designed to simplify this process. But adding another tool adds on costs as well. 
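As a rough illustration of how this works in Git LFS, the repository stores a small text pointer in place of the large file, and the real bytes live elsewhere. The sketch below builds such a pointer in Python. The three-line layout follows the Git LFS pointer format as we understand it; the function name lfs_pointer and the 10 MiB stand-in asset are assumptions for the example.

```python
import hashlib

# Spec URL that appears on the first line of a Git LFS pointer file.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(content: bytes) -> str:
    """Build the text of a Git LFS-style pointer for the given file bytes."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_URL}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


# Hypothetical 10 MiB binary asset (all zero bytes, for illustration only).
big_asset = bytes(10 * 1024 * 1024)
pointer = lfs_pointer(big_asset)

print(pointer)
print(f"asset: {len(big_asset)} bytes, pointer: {len(pointer)} bytes")
```

The pointer is only a few lines long no matter how large the asset is, which is why the repository itself stays small. The large file still has to be stored, transferred, and backed up somewhere else.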

Worse, the fundamental problem remains unaddressed. Even when tools keep large binaries outside your repositories, the repositories themselves still tend to grow and get split into many more. The practice is so common it has birthed the term “Git sprawl”.

As a result, Git forces you to scale out repositories horizontally just to maintain acceptable performance. But it still adds on costs of time.

Cost #4: Multiple Repositories

Using multiple repositories adds on costs in efficiency and productivity, especially in:

  1. Replicating/mirroring content.
  2. Adopting and maintaining a clear workflow and branching practices.
  3. Developing a plan to address HA/DR concerns.

It places burdens on DevOps teams to unify multiple repositories at build time. It can entail the cost of additional servers sufficient to handle all the concurrent pull requests.

Teams with terabytes of content avoid Git for exactly this reason. When a single terabyte of content might require as many as 1,000 Git repositories to maintain acceptable performance, who wants to manage all that?!
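The arithmetic behind that figure can be sketched in a few lines of Python. The roughly 1 GB per-repository ceiling is only what the numbers above imply (1 TB spread across 1,000 repositories). It is not a limit Git enforces, and the function name repos_needed is ours.

```python
import math


def repos_needed(total_gb: float, per_repo_limit_gb: float) -> int:
    """Number of repositories needed to keep each one under a size ceiling."""
    return math.ceil(total_gb / per_repo_limit_gb)


# Assumption: 1 TB = 1,000 GB, and about 1 GB per repository is the
# ceiling implied by the "1,000 repositories per terabyte" figure.
ONE_TERABYTE_GB = 1000
IMPLIED_LIMIT_GB = ONE_TERABYTE_GB / 1000

print(repos_needed(ONE_TERABYTE_GB, IMPLIED_LIMIT_GB))
print(repos_needed(5 * ONE_TERABYTE_GB, IMPLIED_LIMIT_GB))
```

Under that assumption the repository count grows linearly with content, so a team with five terabytes would be managing thousands of repositories.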

Git is efficient at calculating what to transfer and in its actual network protocol for shuttling bytes around. But it doesn’t offer any good way of unifying multiple teams all working concurrently on multiple repositories.

This creates a dilemma:

  • Host them in a single location that punishes anyone elsewhere, through latency and possibly low bandwidth.
  • Or use unreliable WAN links and host them in multiple locations, which greatly complicates the process of putting it all back together and running unified builds.

Cost #5: Learning Curve

Git's learning curve adds costs, too. Git can be difficult to learn and adopt, which costs time and productivity.

Git is not a novice-friendly versioning tool. It has advanced features and often demands deep knowledge of its data model to avoid problems.

For example, a Git rebase accident can destroy work on a particular branch. Most major Git hosting services now let administrators block force pushes, so one mistake cannot disrupt teams using shared branches. But that only protects other team members from inconvenience. The user who made the mistake must still walk an ugly path to fix their local repo.

There are ways to mitigate Git’s learning curve. One way is to use one of the available graphical user interfaces (GUIs). Git includes a couple of graphical tools by default, but many users prefer third-party offerings. It’s important to assess your non-technical contributors’ skills and match them with appropriate tools, or to ensure that simple plugins are available for the applications they use every day.

Cost #6: Workflows

Git workflows can also add costs to productivity. 

That's why it's important to select the right interfaces for all your contributors. The right ones will enable their preferred workflow without overwhelming them (see the learning curve example above). The best and simplest tools are rarely free. And their costs can add up quickly on a per-year or per-user basis as well.

Cost #7: Security

Git's lack of security is another big cost of using Git. 

In native Git, everyone gets every file and folder and can do anything the file system allows. But do you really want everyone having access to every file and folder of your intellectual property (IP)? Probably not.

Git hosting services often make it possible to “secure” entire repos via different roles, even protect specific branches by allowing pushes only from certain groups. But few offer the kind of down-to-the-file control and granular permissions that centralized versioning systems have traditionally offered.

Anyone who has access to a repo typically has access to everything in it. This results in the need to restrict IP by partitioning it into yet more repos. (And this further contributes to Git sprawl.)

Security is an important cost to consider in an age when outsourcing has become the norm. You need to restrict access to proprietary content while still allowing outsourced teams and consultants to contribute.

Free Version Control (That Isn't Git)

Git is free, both as in beer (for you to use) and as in speech (for everyone to use/modify).

But adopting it and using it long-term adds up in costs. And the cost of using Git only grows over time.

At the end of the day, Git remains a good tool for a variety of jobs. It works well for small teams working on small projects. But when those teams succeed and suddenly grow into big teams working on big projects, they’re often blindsided by the hidden costs.

In contrast, Helix Core is the one system you won’t outgrow. And you can get started for free for up to 5 users.

Helix Core has:

  • A fast storage model.
  • High performance.
  • Large binary file management capabilities.
  • The ability to handle everything in one spot, regardless of how many repositories you have.
  • A straightforward learning curve.
  • Efficient workflows.
  • Strong security.

Already working in Git? You can even bring your Git projects into your pipeline (alongside Helix Core) with Helix4Git.

Ready to get started? Try Helix Core for free. 

Want to learn more? Explore Git best practices.

Note: This blog was originally published on December 16, 2016 and has been updated for accuracy and comprehensiveness.