April 13, 2007

Branching Strategies

Surround SCM

Seapine Software emphasizes the importance of branching when discussing Surround SCM. The proper use of branches can have an incredibly beneficial impact on a software development project, even if used sparingly.

In Surround SCM, a branch is defined as a clone or copy of a source code repository at a specific point in time or at the current point in time. You might call it a virtual copy because no file copies are made inside the Surround SCM database, but it looks like a copy from the user's point of view. Branches exist in Surround SCM in a top-down tree list hierarchy. At the top is the mainline, which is the same as saying the root, head, or tip of a codeline. Branches are created under the mainline as children, and the children can have children of their own (to infinity). A parent-child and sibling relationship begins to appear as new branches are created.

Figure 1 - Surround SCM Branch Tree

The relationships between branches represent visual cues for the different codelines being managed with Surround SCM. Branches that appear lower in the tree represent a greater distance from the mainline. Distance between branches can be good or bad, depending on the type of development environment you are working in. Different branching strategies can be adopted to manage the different facets of the development process, including concurrent development, maintaining multiple product releases, and capturing software configurations. When branching is properly implemented, it is easier to generate metrics, get maintenance releases out the door faster, and incorporate automated build techniques.
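The idea of distance from the mainline can be made concrete with a small model. The following Python sketch is not Surround SCM code: the `Branch` class and the branch names are hypothetical, and it only illustrates how depth in the branch tree corresponds to distance from the mainline.

```python
from typing import List, Optional


class Branch:
    """Hypothetical in-memory model of a branch; not a Surround SCM API."""

    def __init__(self, name: str, parent: Optional["Branch"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Branch"] = []

    def add_child(self, name: str) -> "Branch":
        """Create a child branch under this branch and return it."""
        child = Branch(name, parent=self)
        self.children.append(child)
        return child

    def distance_from_mainline(self) -> int:
        """Count the parent links between this branch and the root."""
        depth = 0
        node = self
        while node.parent is not None:
            depth += 1
            node = node.parent
        return depth


# Illustrative tree: a mainline, one baseline child, one snapshot grandchild.
mainline = Branch("Mainline")
baseline = mainline.add_child("1.0.x")
snapshot = baseline.add_child("1.0.1")
```

In this model the mainline has distance 0, the baseline 1, and the snapshot 2, which mirrors the way lower branches in the tree sit further from the mainline.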

Things to consider

Keep it simple

Keep it short and simple. Branching in Surround SCM is extremely fast and simple. While this is a 'good thing', it can quickly lead to overusing branches and complicating the SCM system. Complexity can be avoided through simple strategies, like branch by purpose. Remember -- just because you can branch doesn't mean you should.

Merging and Complexity

Complexity in SCM is usually associated with the maintenance of multiple codelines throughout the process. Questions of how to track specific changes and then how to apply those specific changes to multiple release codelines are often the most difficult to answer. Surround SCM is built to facilitate merging through multiple codelines. The 3-way merge utility that runs on the server side will, in most cases, be used to automatically merge changes between multiple releases. But the merge utility is not a programmer, and only a programmer knows if a merge is good or not. Merging can result in new bugs or require additional code changes, and it can create a lot of overhead to manage. This is the primary reason it is important to keep the number of branches to a minimum and to try to keep the codelines as close to the mainline as possible. For example, the following strategy may cause much more overhead...

Figure 2 - Wrong approach

Figure 2 reflects creating a new branch off the old one for each new revision that is going to be generated. Merging a change between revisions will be cumbersome and tricky as the codelines diverge from each other over time. The relationship between the parent and child branches also becomes more and more estranged as the codelines evolve concurrently in their own directions. It is also necessary to jump from one branch to the next, going up all the 'steps', to apply a patch using merging. Otherwise the changes need to be replicated by duplicating check ins on each codeline. However, a strategy as illustrated in Figure 3 can alleviate these challenges.

Figure 3 - A better approach

This strategy reflects the tenets of branch by purpose, where the goal is to keep the multiple codelines as close as possible to the mainline. This reduces the overhead of merging common changes between multiple product versions. A change no longer has to be merged up and down a large number of steps, and check ins no longer have to be duplicated. The mainline will have the combined history of all prior branches, which allows any specific changes to be replicated using the Rebase action in any given codeline. Always remember that merging is a high maintenance aspect of SCM tools--in the same way that a check in can cause a bug so can a merge. The key is to adopt a solution that reduces complexity as much as possible, allowing you to spend more time developing software and less time resolving merge conflicts.
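The difference between the two approaches can be estimated by counting how many single parent-child merges a change needs to travel between two branches. The sketch below is an illustration only: the two branch layouts are assumptions about what Figures 2 and 3 depict (a chain of branches versus branches created directly under the mainline), and `merge_hops` is a hypothetical helper, not a Surround SCM feature.

```python
from typing import Dict, List, Optional

# Maps each branch name to its parent; the root maps to None.
Parents = Dict[str, Optional[str]]


def path_to_root(parents: Parents, branch: str) -> List[str]:
    """Return the branch followed by each of its ancestors up to the root."""
    path = [branch]
    parent = parents[branch]
    while parent is not None:
        path.append(parent)
        parent = parents[parent]
    return path


def merge_hops(parents: Parents, source: str, target: str) -> int:
    """Count parent-child steps from source to target via their nearest common ancestor."""
    up = path_to_root(parents, source)
    down = path_to_root(parents, target)
    common = next(b for b in up if b in down)
    return up.index(common) + down.index(common)


# Assumed shape of Figure 2: each release is branched off the previous one.
staircase: Parents = {
    "Mainline": None, "1.0": "Mainline", "2.0": "1.0", "3.0": "2.0", "4.0": "3.0",
}
# Assumed shape of Figure 3: each release is branched directly off the mainline.
flat: Parents = {
    "Mainline": None, "1.0": "Mainline", "2.0": "Mainline", "3.0": "Mainline", "4.0": "Mainline",
}

staircase_hops = merge_hops(staircase, "4.0", "Mainline")
flat_hops = merge_hops(flat, "4.0", "Mainline")
```

With these assumed layouts, a fix made in 4.0 needs four merges to reach the mainline in the chained layout and one merge in the flat layout. Each extra merge is another opportunity for a conflict or a regression.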

Consider Rebases and Promotes

When designing a branch model, rebases and promotes have to be taken into consideration since these are the methods used to update changes within branches. With rebases, the only option is to update changes from the immediate parent branch. With promotes, there is an option to select branches other than the immediate parent. This can give the impression that promoting changes to branches other than the immediate parent is a valid option when designing a branch model. However, when a promote is done from a grandchild to a grandparent, merge conflicts may occur. Skipping a generation when promoting is not recommended and should only be done in critical and rare situations. If a branch structure dictates doing this often, then the branch structure should be reconsidered.
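When reviewing a proposed branch model, it can help to check which planned promotes would skip a generation. The following sketch assumes a simple child-to-parent mapping; the branch names and the `generations_up` and `skips_generation` helpers are hypothetical and are not part of Surround SCM.

```python
from typing import Dict, Optional


def generations_up(parents: Dict[str, Optional[str]], source: str, target: str) -> int:
    """Count parent links from source up to target; fail if target is not an ancestor."""
    steps = 0
    node = source
    while node != target:
        parent = parents.get(node)
        if parent is None:
            raise ValueError(f"{target!r} is not an ancestor of {source!r}")
        node = parent
        steps += 1
    return steps


def skips_generation(parents: Dict[str, Optional[str]], source: str, target: str) -> bool:
    """Return True when a promote from source to target bypasses the immediate parent."""
    return generations_up(parents, source, target) > 1


# Hypothetical three-level tree: grandparent, parent, and child.
tree: Dict[str, Optional[str]] = {
    "Mainline": None,
    "Staging": "Mainline",
    "Workspace": "Staging",
}

to_parent = skips_generation(tree, "Workspace", "Staging")
to_grandparent = skips_generation(tree, "Workspace", "Mainline")
```

Here the promote to Staging is a normal single-generation promote, while the promote straight to Mainline is flagged as skipping a generation. A model that flags many such promotes is a candidate for restructuring.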

Branch Models

Branch By Version

This branch model is also known as 'Branch by Release'. This is probably the easiest to understand and simplest of all the branching models. In this model, when development has a release ready for QA, a new branch is created for the developers to start coding the next release. The old branch is then used by QA, and eventually is left behind. While this approach is straightforward, it has two drawbacks:
  • Requires serial changes to the code, such as sequential check ins and check outs, rather than parallel development.
  • Adds complexity and overhead to the support of released versions.
When the software is released, and bugs are reported in a supported version, bug fixes must be made. In a branch by release model, this means you have to go back to that baseline and make the fix. Then, the fix has to be propagated to each subsequent release branch. Because this approach creates so many baseline branches, propagating the bug fix to all of the branches can become complicated and cause merge conflicts.

Figure 4 - Branch by version

Branch By Purpose

This is the recommended approach for many development projects and the one we recommend for use with Surround SCM. The main point of branching by purpose is to only branch when it is absolutely necessary. The goal is to not have to change the branch that you are working on. All feature development and bug fixes are made in the mainline branch. A new branch is only created when work on the next release must start and there is still a need to maintain old versions. The new feature development and work on the new release remains on the mainline, and the maintenance of the old release is done in a baseline branch. As maintenance releases come out, snapshot branches are created. The bug fixes made in the baseline maintenance branch are promoted to the mainline branch to ensure the new version includes all bug fixes from the previous version.

In the following screenshot, feature development is performed in the mainline branch (called "Branch By Purpose" for illustrative reasons). Maintenance of old releases is done in each baseline branch (1.0.x and 2.0.x). As each major and maintenance version is released, a snapshot branch is created (1.0.0, 1.0.1, 1.0.2, 2.0.0 and 2.0.1).

For example, if a bug is discovered in version 1.0.2 the fix is done in the 1.0.x branch. A maintenance version is released, and a corresponding snapshot branch is created (1.0.3). If the bug also affects the 2.0.x releases, the fix could be promoted to the mainline branch where it would be merged with the current codeline. It could then be rebased into the 2.0.x baseline branch. This bug fix could now easily be included in the next 2.0.x release.

Figure 5 - Branch by purpose

If old releases are not maintained, then all development could take place in the mainline branch. Snapshot branches are created simply to capture milestones. Baseline or workspace branches would only be created for special projects and sandboxes.
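The 1.0.x to 2.0.x bug-fix flow described in this section can be traced with a toy model in which each branch is just a set of change identifiers. This is a sketch under stated assumptions: the change names (`fix-123`, `feature-A`, and so on) and the `promote`, `rebase`, and `snapshot` functions are hypothetical stand-ins for the Surround SCM actions of the same name, and real promotes and rebases merge file contents rather than copy identifiers.

```python
from typing import Dict, FrozenSet, Set

# Each branch holds the identifiers of the changes it contains (hypothetical data).
branches: Dict[str, Set[str]] = {
    "Mainline": {"feature-A", "feature-B", "feature-C"},
    "1.0.x": {"feature-A"},
    "2.0.x": {"feature-A", "feature-B"},
}
snapshots: Dict[str, FrozenSet[str]] = {}


def _copy(changes: Set[str], source: str, target: str) -> None:
    """Copy the selected changes from source to target, refusing unknown changes."""
    missing = changes - branches[source]
    if missing:
        raise ValueError(f"{sorted(missing)} not present in {source}")
    branches[target] |= changes


def promote(changes: Set[str], child: str, parent: str) -> None:
    """Model of a promote: selected changes move from a child up to its parent."""
    _copy(changes, child, parent)


def rebase(changes: Set[str], parent: str, child: str) -> None:
    """Model of a rebase: selected changes move from a parent down to a child."""
    _copy(changes, parent, child)


def snapshot(source: str, name: str) -> None:
    """Model of a snapshot branch: a read-only copy of the branch contents."""
    snapshots[name] = frozenset(branches[source])


branches["1.0.x"].add("fix-123")          # fix the bug in the 1.0.x baseline
snapshot("1.0.x", "1.0.3")                # capture the maintenance release
promote({"fix-123"}, "1.0.x", "Mainline") # carry the fix up to the mainline
rebase({"fix-123"}, "Mainline", "2.0.x")  # carry the fix down into 2.0.x
```

After these four steps the fix is present in 1.0.x, the mainline, and 2.0.x, the 1.0.3 snapshot records exactly what shipped, and the unrelated mainline work (`feature-C`) has not leaked into the 2.0.x baseline.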

Branch by Customer

In some development environments, software releases are customer-specific. They all may be deviations of some standard software package, but each customer receives a customized release. When a customer reports a critical bug, it is essential to be able to deliver a fix quickly. This can be accomplished through an adaptation of the branch by purpose model. New development takes place in the mainline branch. The 'latest and greatest' version of the 'standard' software package is developed on the mainline. When a deviation is needed for a particular customer, a baseline branch is created. All custom development will take place on this branch. Snapshot branches are created for the various builds and releases for this customer. Once released, these branches are left alone until the customer reports a defect or requests a change to the software package. Since the baseline branch has the code of their release, changes are easily made without having to worry about introducing new defects or unwanted features.

Figure 6 - Branch by Customer

If the defect reported by the customer is not specific to their release, then the bug fix can be promoted to the mainline and rebased to other customer branches.

The Workspace model

Surround SCM also supports a workspace branch model, where every developer has a private branch. The advantages of a workspace branch are that a developer can work off the main development area and has room to experiment, check in often, and not have to worry about impacting others. Once changes are ready, they are promoted either to a main development branch or to a staging branch. If changes are promoted to the staging branch, they can be reviewed before promoting them to a main development branch. A staging branch may be needed because changes from these workspace branches may come at separate intervals, and the age of the changes will vary as well. A staging branch ensures that all merged changes are reviewed before they are integrated into a main development or production branch.

Figure 7 - Workspace model

Note: The image in Figure 7 is for illustrative purposes only. In a real environment, each user would only see his or her workspace branch. A user who does not own a workspace branch will not see any workspace branches after logging in.

Branch by Module

Some software shops produce releases that are composed of several separate modules. Each release may not necessarily contain the latest version of each module. For example, a software shop (we'll call it "Modules R Us") develops three modules: module A, module B, and module C. Each software release for their customers contains specific versions of each module, depending on the customer's need. So even though each module may be on version 5, for example, they need a release that contains version 2 of module A, version 4 of module B and version 3 of module C. To branch in this situation, use an approach very similar to the Branch by Purpose and Branch by Customer models. Like both models, all modules would be stored in the mainline branch. All new feature development would take place here.

Figure 8 - Modules R Us mainline branch

Then, as feature freeze points for each module arrive, a baseline branch would be created (branch off the specific repository for the module). When that specific version of the module is finished, a snapshot branch is created.

Figure 9 - Module branches

You may be wondering why baseline branches need to be created or why the snapshot can't be created directly off the mainline branch. If a software release is made containing version 2 of Module A, and the customer reports a defect, the fix must be made on version 2. What if development on the mainline for Module A is on version 4? Creating baseline branches for each version of each module allows for maintenance of each version separately.

Putting Together the Release

To put together each software release for each customer, another baseline branch is created. Unlike the baseline branches for the modules, this baseline branch is created off the root of the mainline branch (contains all three module repositories). For this example, we'll call this branch "Software Release Staging".

Figure 10 - Software release staging branch

The next step is to set a working directory for this branch, which is where the software release will be put together. After this is set, and we know which version of each module we need, we go to each specific snapshot branch and get the files to the corresponding working directory for the staging branch. For example, customer "XYZ Company" needs a software release that contains version 2 of Module A, version 1 of Module B and version 3 of Module C. We go to the snapshot branch that contains the latest release of version 2 of Module A (2.0) and perform a get to the working directory associated with Module A on the staging branch. We then go to the snapshot branch containing the latest release of version 1 for Module B (1.1) and perform a get to the working directory associated with Module B on the staging branch. Finally, we go to the snapshot branch containing the latest release of version 3 for Module C (3.1) and perform a get to the working directory associated with Module C on the staging branch.

Figure 11 - Get Files dialog
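Choosing the snapshot branch for each module, as in the "XYZ Company" example above, amounts to picking the highest snapshot within the requested major version. The sketch below shows that selection as a manifest calculation. It is an assumption-laden illustration: apart from snapshots 2.0, 1.1, and 3.1, the snapshot lists are invented, and `latest_snapshot` is a hypothetical helper that performs no Surround SCM operations.

```python
from typing import Dict, List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '3.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def latest_snapshot(snapshots: List[str], major: int) -> str:
    """Return the highest snapshot whose first version component equals major."""
    candidates = [s for s in snapshots if parse(s)[0] == major]
    if not candidates:
        raise LookupError(f"no snapshot found for major version {major}")
    return max(candidates, key=parse)


# Hypothetical snapshot branches available for each module.
available: Dict[str, List[str]] = {
    "Module A": ["1.0", "2.0", "3.0", "4.0"],
    "Module B": ["1.0", "1.1", "2.0"],
    "Module C": ["2.0", "3.0", "3.1"],
}

# Major versions requested for the XYZ Company release.
wanted: Dict[str, int] = {"Module A": 2, "Module B": 1, "Module C": 3}

manifest = {module: latest_snapshot(available[module], major)
            for module, major in wanted.items()}
```

The resulting manifest lists the snapshot to get from for each module (2.0, 1.1, and 3.1 in this example). Recording the same mapping in the check in comments of the release snapshot documents which module versions the release contains.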

If you have Surround SCM set to allow check ins without check outs, then you need to perform a recursive check in of the entire "Software Release Staging" branch. If you do not, you must first check out the entire branch recursively, setting the "Overwrite" option to "Skip". Another option would be to check out the entire branch tree prior to performing each get. Once all the files are checked in to the staging branch, you can create a snapshot branch for that release. Make sure to indicate in the comments which version of each module is contained in the release.

Figure 12 - Customer release branches

Branching for Web Projects

Because Web projects tend to be continuous, they have different branching requirements. Requirements are released as they are developed, rather than bundled into a packaged release as you would do with a C++ application. Many Web developers do not use branching. Instead, all of the work is checked into a mainline or baseline branch. Snapshot branches are used to capture the Web site at different stages of development.

Waterfall Branch model

When branching is used, the branches often represent different approval stages for a specific change or changeset. One approach is to use a waterfall branching tree with the most recent changes in the mainline. Changes then trickle down through various 'stages' like QA, Staging, and Production. Snapshot branches can be used to capture the code at specific milestones.

Figure 13 - Waterfall model

With this branching approach, users can rebase to move changes between each branch, where each branch represents a different 'stage' in the change lifecycle.

Another advantage is that you can rebase by label. A user can check in a code change and add a label like "Bug Fix 100". A code-admin can then perform a rebase by label "Bug Fix 100", which is a point and click process performed through the Surround GUI rebase dialog. That action is then repeated through each stage: as QA approves a change, it is rebased again to staging, where it can await approval for production. For more information on using labels in the waterfall model, read the Using Labels in a Waterfall Model article. The snapshots can then be created for milestones or after each rebase to production occurs. This makes it easy to roll back to an earlier Web site release if necessary.

Here is a great example of when the Surround SCM workflow feature can be a huge benefit to users. New changes can be marked for Review. After the reviewer approves or signs off on the change, that file or group of changes can then be rebased to the next branch. The rebasing can even be automated using a simple trigger that runs after a change is set to the Approved state.

You can also use triggers or shadow folders to automatically update internal Web servers. As changes move through the different branch stages, you can use either feature to have those changes automatically update Web sites. As developers check in changes, they can jump out to the public dev server to see those changes integrated with other users' changes in real-time. Or, as you rebase through stages, QA and project managers can see those updates and make approvals; they can even send changes to Production when the approved changes are rebased into the Production branch. If changes are not needed immediately, then shadow folders may be used. A whitepaper on using the waterfall model with shadow folders is also available.
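The stage-by-stage movement of a labeled change in the waterfall model can be pictured with a small simulation. This is only a sketch: the stage list follows the stages named in this section, while the `check_in` and `rebase_by_label` functions are hypothetical models of the actions and do not call Surround SCM.

```python
from typing import Dict, List, Set

# Waterfall stages, ordered from the mainline down to production.
STAGES: List[str] = ["Mainline", "QA", "Staging", "Production"]

# Labels currently present in each stage branch.
contents: Dict[str, Set[str]] = {stage: set() for stage in STAGES}


def check_in(label: str) -> None:
    """Model a labeled check in to the mainline."""
    contents["Mainline"].add(label)


def rebase_by_label(label: str, target: str) -> None:
    """Model a rebase by label: pull the labeled change from the stage directly above."""
    index = STAGES.index(target)
    if index == 0:
        raise ValueError("the mainline has no parent stage to rebase from")
    source = STAGES[index - 1]
    if label not in contents[source]:
        raise ValueError(f"{label!r} has not reached {source} yet")
    contents[target].add(label)


check_in("Bug Fix 100")
rebase_by_label("Bug Fix 100", "QA")       # picked up for testing
rebase_by_label("Bug Fix 100", "Staging")  # QA approved; awaiting production approval
```

Because each rebase only pulls from the stage directly above, a change cannot reach Production in this model without first passing through QA and Staging, which is the property the waterfall tree is meant to enforce.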

Reverse Waterfall Branching

In some instances, a reverse waterfall approach is preferred. This is similar to the waterfall model, but development is done at the lowest branch and changes are promoted up the stages instead of rebased.

Figure 14 - Reverse waterfall model

Feature Branches

A feature branch is used to do the bulk of the development work on a codeline. In the 'branch by purpose' method, the feature branch is often the mainline that provides the programmers with consistent work areas for the majority of the work performed. A feature branch is used to do all the big feature development for major releases. Using branch by version, feature branches are created prior to starting any new major release. For example, versions 2.0, 2.1, 2.5 would all be feature branches while versions 2.0.1, 2.1.1 would not be. How and when you use feature branches depends on the branching methodology and what is subjectively deemed a 'feature' release as opposed to a 'maintenance' release.

Task Branches

Task branches are areas where major feature work may be performed. A task branch is designed for a specific requirement or to make a major update. The task branch is usually temporary, and can be a private workspace or a public baseline.

As a workspace, a task branch allows a programmer to clone a public codeline, either mainline or baseline, and use that branch to check in changes for a lengthy task (such as adding a new feature or fixing a complex bug). The programmer can check in changes to the server, review code changes, and perform rebases to stay current with ongoing development. Changes are stored on the server instead of on the programmer's hard drive, ensuring they will be backed up in case of a power outage. After the task is complete and the code is reviewed, the changes can be promoted into the public codeline.

As a baseline, the task branch allows a group of programmers to clone a public codeline, either mainline or baseline, and use that branch to check in changes for a lengthy task, requirement, or feature. If a task branch is used for a specific requirement, and the requirement is pulled from the release, you can freeze the branch and essentially 'put it away' for later.

A recent Seapine example illustrates the use of a task branch. The TestTrack and Surround SCM GUI Clients both use Qt, a third-party cross-platform GUI library. When a major Qt update was released, from 3.3 to 4.0, the TestTrack and Surround code was branched into task branches. The new Qt code was checked into the branches and development work started on making the code changes necessary to work with the new library updates. If the update proved overly complex, and the release date might slip as a result, the Qt task branch could be frozen and feature development would continue in the mainline. If the update went as planned, the entire task branch could be promoted back to the mainline, with the Qt updates and any code updates, which would then be part of the next TestTrack and Surround SCM releases.

Third-party Library Branches

Managing third-party code can be tricky if there are multiple projects that depend on a single library. Each project often requires a specific version, making it even more complex to control. Following are two ways to manage third-party libraries with Surround SCM.
  • Use separate branches for each library release: This method allows you to store third-party libraries in separate branches, but requires performing two separate gets when doing a build. The first get being your project source code and the second get being the specific library version you need. This can be automated into a build script to make it easy.
  • Use file sharing into the common project areas: Create a separate folder, named Common, for the libraries and share the common source code into the project repositories (e.g., Project1/common). This allows you to branch a project and perform a single get to compile it because all the dependencies exist in the common sub-repository. You can also create root-level task branches when you need to make library updates that affect all projects (cloaking ones that do not share that project), making it easy to maintain the library even if it is shared across multiple projects.
To update third-party code, use the task branch approach described above.

Managing Builds

A common task with any change management tool is to capture the source code at a specific milestone. More often than not, these milestones are builds. Some tools have a feature that allows you to "tag" or "label" a specific version of a file. While Surround SCM does provide labels, it is recommended that snapshot branches be used for this instead. Snapshot branches capture the file content and the directory structure. Any directory structure change made on other branches is not propagated to the snapshot branch, thus guaranteeing a repeatable build.

When to Create a Snapshot

The first item that needs to be identified is the check in policy. There are generally two contradictory, but commonly recommended best practices:
  • Check in often.
  • Check in only after unit test and review.
With the latter, one should feel a certain level of confidence that a build can be created at any point in time. If only "stable" check ins are made, then the chances of a compile error or build error are minimized. If users check in often, before changes are complete, or you have a mix of both check in practices, then there should be a method to determine when it is safe to do a build, where it is safe to do a build from, or at least a way to determine which revisions are stable. One common approach with Surround SCM is the workflow. With the use of states, files that are ready to be included in a build are designated by a specific state. The build user or script can then do a get of the source code files and use the "latest version to be in this state" flag. Another approach is to merge changes into another branch, where the builds are created, as the changes are approved.
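The effect of getting the latest version in a given state can be illustrated with a small selection function. This is a conceptual sketch, not the Surround SCM implementation: the `Revision` type, the state names, and the sample history are hypothetical, and the function assumes that "latest" means the highest revision number carrying the requested state.

```python
from typing import List, NamedTuple, Optional


class Revision(NamedTuple):
    """Hypothetical record of one file revision and its workflow state."""
    number: int
    state: str


def latest_in_state(history: List[Revision], state: str) -> Optional[Revision]:
    """Return the highest-numbered revision in the given state, or None if there is none."""
    matches = [rev for rev in history if rev.state == state]
    if not matches:
        return None
    return max(matches, key=lambda rev: rev.number)


# Sample history: revision 3 was checked in early and is not yet approved.
history = [
    Revision(1, "Approved"),
    Revision(2, "Approved"),
    Revision(3, "In Progress"),
]

build_revision = latest_in_state(history, "Approved")
```

In this example the build would use revision 2 and ignore the unfinished revision 3, which is how a state-based get lets frequent check ins coexist with stable builds.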

Failed Builds

If the snapshot is created before the build, the snapshot may capture a failed build. This may or may not be desirable. One approach is to create the snapshot branch prior to the build. Depending on the need to capture failed builds, the snapshot could be deleted, or simply renamed to indicate a failed build. If you start seeing several snapshot branches for failed builds, that could be an indicator that a process change is needed. If one is only interested in successful builds, it may be a better approach to create the snapshot branch after the build.

Important vs. Unimportant Builds

Depending on the process in place, the nature of the business, the company culture, etc., there may be two types of builds:
  • Important Builds: These are major milestones, such as an initial release.
  • Unimportant Builds: These are nightly builds that may or may not make it to testing. This may be the case in a continuous integration environment.
If you create all of the snapshot branches together, the snapshot branch list may get cluttered. It may also become difficult to isolate the important builds from the unimportant builds. There are several approaches and these depend on the process in place and how fluid the process needs to be.

One-Tiered Development Branch

This approach is just an extension of the branch by purpose example. Two baseline branches are added below the development branch, but these are more for organizational purposes than anything else. Each baseline branch separates the important builds from the unimportant builds. Changes are rebased from the development branch to the corresponding baseline, and then a snapshot branch is created.

Figure 15 - One-Tiered Development Approach

The figure above shows what this approach may look like. Development takes place in the WysiWrite 1.x development branch. As builds are needed, changes are rebased into the corresponding baseline branch, where the snapshots are created.

Two-Tiered Development Branch

With this approach, create a baseline branch under the main development branch. Daily development takes place in this new branch. Daily builds are captured as child branches (snapshot) of this branch (baseline). Any time an important build is needed, changes are promoted to the main development branch and then a snapshot branch is created. This would place the snapshots for important builds at the same level as the daily development branch. The two development branches will contain different states of the source code. The daily development branch contains the "latest and greatest". The main development branch contains the "latest and greatest" as of the last important build.

Figure 16 - Two-Tiered Development Approach

The figure above shows what this approach may look like. Daily development now takes place in the WysiCM 1.0.x Daily branch. Snapshots for daily builds are created here. Whenever an important build needs to be created, changes are promoted to the WysiCM 1.0.x branch, where the snapshot is created.

Additional resources

Following are links to other wiki resources that may help you with your build process:
  • Using the Workflow - Includes a couple of ideas on how to implement a workflow to complement your build process.
  • E-mail File for Review - Uses a combination of the workflow, custom fields, triggers and scripts to give the ability to email a file for review before it is checked in.
  • Automating a .NET Build With MSBuild and Surround SCM - An example of using triggers to automate a .NET build.
  • CruiseControl.NET Integration - How to integrate CruiseControl.NET with Surround SCM.
  • CruiseControl.NET Example - A configuration example for continuous integration using CruiseControl.NET.
  • Apache Ant Integration - How to integrate Apache Ant with Surround SCM.
  • Nant Integration - How to integrate Nant with Surround SCM.
  • CruiseControl Integration - How to integrate CruiseControl and Surround SCM.

Branches and IDEs

Visual Studio 2005 Web Projects

Branching with Visual Studio 2005 projects can be tricky because the solution files reference HTTP addresses. According to Microsoft, version control only works if users have the exact same working directories on every machine. This is often not the case for most users since new Visual Studio Web projects are stored in each user's profile directory. Following is our recommendation:
  • If every user has a common working directory (e.g., C:\Web Projects) the version control works, as long as the Solution file and Project files are located in that directory tree. This may require editing the solution file so that it knows where the project and source code files are located. This is the best approach to take if you have multiple projects in version control and are planning on using branches.
  • If you are using the default user profiles directory, then set your working directory to the My Documents folder.
C:\Documents and Settings\UserA\My Documents
The Visual Studio SCC integration gets confused when looking for the Solution and Project files even if you set the working directories properly to each target folder, but setting the working directory to the My Documents folder makes Visual Studio happy. This is the best approach if you plan on having multiple projects in version control but will not use branching frequently.