March 8, 2016

A Preview of My Talk at MERGE 2016


I’ve had the privilege of studying high-performing technology organizations since 1999. One of the most exciting parts of my studies has been surveying over 20,000 technology professionals, with the goal of understanding what high performance looks like, and the practices that predict it. This research was done with Jez Humble, Dr. Nicole Forsgren, Puppet Labs, and later, PwC.

In my presentation at this year’s Perforce MERGE conference, I will discuss how version control has moved from a development concern to a fundamental practice for everyone in the value stream. For me, one of the most startling findings in the research we’ve done is the importance of version control. Version control, of course, is nothing new for developers. After all, for decades, comprehensive use of version control has increasingly become a mandatory practice to support individual developers and development teams.

When all application source files and configurations are in version control, it becomes the "single repository of truth" that contains the precise intended state of the system. Version control also provides the history of all the changes (i.e., "code check-ins") that led us to the current state, including who made those changes, why they made them, etc.

However, because delivering value to the customer requires both our code and the environments it runs in, we need our environments in version control as well. In other words, version control is no longer just for developers, but for all the participants in our value stream. When everything is checked in, our version control repository becomes the basis of our ability to repeatedly and reliably reproduce all components of a working software system: not just the application and the production environment, but all of our pre-production environments as well.

To ensure that we can restore production service repeatedly and predictably (and, ideally, quickly), even when a catastrophic event occurs, we must check a number of crucial assets into our shared version control repository.

It is not sufficient for us to merely be able to recreate any previous state of the production environment; we must also be able to recreate the entire pre-production and build processes. Consequently, we also need to put into version control everything our build processes rely on, including our tools (e.g., compilers and testing tools) and the environments they depend upon.

In the 2014 State of DevOps Report from Puppet Labs[1], the use of version control by Ops was the highest predictor of both IT performance and organizational performance. In fact, whether Ops used version control was a higher predictor than whether Dev used version control!

But why does using version control for our environments predict IT and organizational performance more than using version control on our code? In many cases, there are orders of magnitude more configurable settings in the environment than in the code. If the environment is where the greatest entropy is, and where the most things can go wrong, then the environment is where we most need version control.

(Anyone who has done a code migration for an ERP system (e.g., SAP, Oracle Financials, etc.) may recognize the following situation: When a code migration fails, it is rarely due to a coding error. Instead, it’s far more likely that the migration failed due to some difference in the environments, such as between Dev and Test, or Test and Production.)
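To make the point about environment differences concrete, here is a minimal sketch of how two environments could be compared once their settings live in version control. The setting names and values are hypothetical, and `config_drift` is an illustrative helper, not part of any particular tool.

```python
MISSING = "<missing>"  # placeholder for a setting absent from one environment


def config_drift(source, target):
    """Return {setting: (source_value, target_value)} for every setting that differs."""
    drift = {}
    for key in sorted(source.keys() | target.keys()):
        left = source.get(key, MISSING)
        right = target.get(key, MISSING)
        if left != right:
            drift[key] = (left, right)
    return drift


# Hypothetical settings, as they might be loaded from files checked into version control.
test_env = {"db_pool_size": 20, "tls_version": "1.2", "jvm_heap": "4g"}
prod_env = {"db_pool_size": 50, "tls_version": "1.2", "feature_flag_x": True}

for setting, (test_value, prod_value) in config_drift(test_env, prod_env).items():
    print(f"{setting}: test={test_value!r} prod={prod_value!r}")
```

A comparison like this is only possible when both environments are described in files that can be checked in and diffed, which is the practical argument for putting environments under version control.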

Version control also provides a means of communication for everyone working in the value stream — having Development, QA, Infosec, and Operations able to see each other’s changes helps reduce surprises, creates visibility into each other’s work, and helps build and reinforce trust.

Furthermore, one of the most powerful mechanisms for integrating local discoveries across our organization is an organization-wide shared source repository. When we update anything in the source repository (e.g., a shared library), the change can be rapidly and automatically propagated to every other service that uses that library, integrated through each team’s deployment pipeline.

One benefit of a shared repository is that engineers can leverage the diverse expertise of everyone in the organization. As Rachel Potvin, a Google engineering manager overseeing the Developer Infrastructure group, notes, “Any Google engineer has a wealth of libraries already available to them. Almost everything has already been done.”

Tom Limoncelli further expounded, "The value of having one repo for the entire company is so powerful that it is difficult to even explain. You can write a tool exactly once and have it be usable for all projects. You have 100% accurate knowledge of who depends on a library; therefore you can refactor it and be 100% sure of who will be affected and who needs to test for breakage. I could probably list 100 more examples. I can't express in words how much of a competitive advantage this is for Google."[2]
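The claim about knowing exactly who depends on a library can be illustrated with a small sketch. It assumes every service declares its dependencies in one place in the shared repository; the `MANIFESTS` data and the service and library names are hypothetical, and this is not a description of how Google's tooling works.

```python
from collections import defaultdict

# Hypothetical dependency declarations, one entry per service in the shared repository.
MANIFESTS = {
    "billing": ["auth-lib", "money-lib"],
    "checkout": ["auth-lib", "cart-lib"],
    "search": ["index-lib"],
}


def reverse_dependencies(manifests):
    """Map each library to the sorted list of services that declare it."""
    dependents = defaultdict(set)
    for service, libraries in manifests.items():
        for library in libraries:
            dependents[library].add(service)
    return {library: sorted(services) for library, services in dependents.items()}


# Services that would need to retest if "auth-lib" were refactored.
print(reverse_dependencies(MANIFESTS)["auth-lib"])
```

Because all the declarations sit in a single repository, the lookup covers every consumer; with many separate repositories the same question would require discovering and scanning each one.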

I’ll be discussing more of the research and how this has impacted my own thinking at MERGE — see you there!

If you haven’t already, register for MERGE 2016 here.


[1] Puppet Labs, IT Revolution, & ThoughtWorks, "2014 State of DevOps Report," (2014)

[2] Tom Limoncelli, "Yes, You Can Really Work from Head," (2014)