October 30, 2006

File Sharing Strategy

File sharing in Surround SCM allows a single copy of a file to exist in multiple repository locations. Because the file is shared, an update made in one location is reflected in all locations simultaneously. File sharing comes with several best practices and caveats. Understanding how the Surround SCM architecture implements file sharing helps you devise a sharing strategy early and get the most out of the feature with the least amount of effort.

What is File Sharing?

File sharing is the practice of taking one copy of a file and making it available in multiple repository locations. This is similar to how symlinks work on UNIX or how junctions work on NTFS. The Surround SCM database uses reference points to identify where the file is available, but internally there is only a single database object that records the changes made to the shared file. In other words, it is a true share with only one copy of the file in existence, rather than a set of unique copies that are synchronized whenever a change is made.
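The idea of one database object with several reference points can be illustrated with a small conceptual model. This is only a sketch: the `FileObject` and `Database` classes and the repository paths are hypothetical names chosen for illustration, not part of Surround SCM or its API.

```python
class FileObject:
    """One stored file and its version history (hypothetical model)."""

    def __init__(self, content):
        self.versions = [content]

    @property
    def latest(self):
        return self.versions[-1]

    def check_in(self, content):
        self.versions.append(content)


class Database:
    """Maps repository paths (reference points) to file objects."""

    def __init__(self):
        self.paths = {}

    def add(self, path, content):
        self.paths[path] = FileObject(content)

    def share(self, source, target):
        # No copy is made: the new path refers to the same object.
        self.paths[target] = self.paths[source]


db = Database()
db.add("ProjectA/EULA.txt", "EULA v1")
db.share("ProjectA/EULA.txt", "ProjectB/EULA.txt")

# A check-in through one location is visible through the other.
db.paths["ProjectA/EULA.txt"].check_in("EULA v2")
print(db.paths["ProjectB/EULA.txt"].latest)
```

Because both paths refer to the same object, there is nothing to synchronize after the check-in.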

Why Use File Sharing?

Sharing is a useful feature with many applications. For example, if a EULA file is identical across several projects, the EULA text file or Word doc can be shared in each project repository. If the EULA file is updated, the update is automatically included in the next release of every project that contains the shared document.

Moving Files

Sharing also plays an important role in moving files in Surround SCM while retaining all of their past history. To move a file, first share it to the new location, then remove the original file as a separate action. To copy a file instead, share it to the new location, then perform a break share on it.
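The two recipes (share then remove, and share then break share) can be sketched as a conceptual model. The function names, the `paths` dictionary, and the file names are hypothetical and are not Surround SCM commands. Modeling break share as a deep copy is a simplification made for the example; it does not describe how Surround SCM stores broken shares internally.

```python
import copy


class FileObject:
    """One stored file and its version history (hypothetical model)."""

    def __init__(self, content):
        self.versions = [content]


def share(paths, source, target):
    # The target path refers to the same object as the source path.
    paths[target] = paths[source]


def remove(paths, path):
    del paths[path]


def break_share(paths, path):
    # Simplification: give this path its own independent object.
    paths[path] = copy.deepcopy(paths[path])


def move(paths, source, target):
    share(paths, source, target)
    remove(paths, source)


def copy_file(paths, source, target):
    share(paths, source, target)
    break_share(paths, target)


paths = {"Old/util.c": FileObject("v1")}
paths["Old/util.c"].versions.append("v2")

move(paths, "Old/util.c", "New/util.c")        # history travels with the file
copy_file(paths, "New/util.c", "Copy/util.c")  # independent from here on
paths["Copy/util.c"].versions.append("v3")
```

After these steps, `New/util.c` carries the full `v1`, `v2` history, and the later `v3` change affects only `Copy/util.c`.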

Sharing and Branches

Sharing in Surround SCM is performed between repositories. When a branch is created in Surround SCM, it is like creating a duplicate copy of the repository for the purpose of isolating changes between the two copies (although no file copies are actually created in the Surround SCM database). Existing file shares are replicated on new branches when they are created. Changes to these shares can be promoted and rebased like non-shared files. If new shares are created in a branch then promote and rebase can be used to re-create that share in other branch locations.

Consider the Common Ancestor

Surround SCM branches support recursive merge operations (promote and rebase). These activities use what is called 'Common Ancestor Merging'. The common ancestor is the originating file against which one or two sets of changes are compared, allowing 2-way and 3-way merges to be performed automatically during promote and rebase. Only new changes are incorporated into these merge operations, making it impossible to double merge a change or re-merge an old change.

Common ancestor detection can affect file sharing in a couple of ways. In order to promote and rebase in Surround SCM, a file can only have one common ancestor. You cannot merge a file that is shared on one branch and unshared on another branch, since the two unique files cannot be merged into a single shared file.

If a user tries to establish a share on a branch, and the share would create a scenario where two or more common ancestors exist, an error message is returned and the share is not created. For example, you run into this scenario if Repo1/filea.cpp and Repo2/filea.cpp are unshared files in some branches. You will not be allowed to establish a share between the two filea.cpp files on another branch. To share the file, you need to remove filea.cpp from one of the two repositories, then share the remaining file.

If a share exists in a branch and you want to re-create that share in another branch, you must use promote and rebase to create it. This allows Surround SCM to establish the correct common ancestor with the share. To create a share using promote or rebase, select the file and choose Branch > Promote File or Branch > Rebase File. The file shows up as a new file in the Promote/Rebase Preview, but the shared file is created when the promote or rebase happens. The same procedure can be used if you removed a shared file and then want to re-establish the file as shared between the two locations.
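The Repo1/Repo2 scenario can be sketched as a simple check. This is an illustrative model only: the `branches` dictionary, the file identifiers, the branch names, and the rule as coded here (a share is refused when the target path already refers to a different file on any branch) are assumptions made for the example, not Surround SCM's actual implementation.

```python
class CommonAncestorError(Exception):
    """Raised when a share would give one path two different ancestors."""


def share(branches, branch, source, target):
    """Share source to target on one branch.

    branches maps branch name -> {repository path: file id}.
    """
    file_id = branches[branch][source]
    for name, paths in branches.items():
        existing = paths.get(target)
        if existing is not None and existing != file_id:
            raise CommonAncestorError(
                f"{target} is already a separate file on branch {name}"
            )
    branches[branch][target] = file_id


branches = {
    "Mainline": {"Repo1/filea.cpp": "file-1", "Repo2/filea.cpp": "file-2"},
    "Feature": {"Repo1/filea.cpp": "file-1"},
}

# The two filea.cpp files are unshared on Mainline, so the share is refused.
try:
    share(branches, "Feature", "Repo1/filea.cpp", "Repo2/filea.cpp")
    blocked = False
except CommonAncestorError:
    blocked = True

# Remove one of the unshared copies, then share the remaining file.
del branches["Mainline"]["Repo2/filea.cpp"]
share(branches, "Feature", "Repo1/filea.cpp", "Repo2/filea.cpp")
```

In this model the second attempt succeeds because only one candidate ancestor for `Repo2/filea.cpp` remains.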

Break Share

Surround SCM allows a share to be broken at any time. This allows the two files to function as unique copies, storing any future changes independent of one another. To move a file in Surround SCM, you can share it to the new location then remove the original copy. The share will be broken, and the file and its complete history will only exist in the new location. Breaking shares can also be propagated between branches. If you break a share in the mainline and rebase the file that was broken, it is possible to have that break share operation replicated to the child branch.

Global Break Share

Sharing can be replicated across many different branches. There may be a scenario where a file should be unshared on all branches. Global Break Share, which is an administrator feature, executes a break share for all branches.

Caveats and Best Practices

To avoid any common ancestor conflicts, it is easier to share files before branches are created. For this reason, Seapine recommends developing a sharing strategy before creating branches in Surround SCM. However, it is easy to share new files that are added to Surround SCM because there are not any common ancestor concerns. Existing files are harder to share if they exist across multiple branches. You may need to remove copies from other branches to ensure there are no common ancestor conflicts.

Using Labels and Broken Shares

When a break share operation is performed, Surround SCM simply keeps track of all the new changes (or deltas) made to the independent copies of the file (similar to how file changes are stored between branches). This can lead to a potential issue if you use labels to identify builds. Because there is only one copy of the file in the Surround SCM database, a label applied to the file is associated with only one version of the file. A label cannot be applied simultaneously to multiple versions of a single file (though the label can be moved to different versions). The result is that a build may get the wrong copy of a file from one of the old share-to repository locations. Use the Global Break Share command to solve this problem. After you globally break shares, a new file object is created in the Surround SCM database. This allows deltas and labels to be applied to these now-unique database objects independently of each other.
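The label problem can be illustrated with a small model in which each database object stores at most one version per label. The `FileObject` class, the label name, the paths, and the version numbers are hypothetical and chosen only for this sketch.

```python
class FileObject:
    """A database object whose labels each point at one version (model)."""

    def __init__(self, name):
        self.name = name
        self.labels = {}  # label name -> version number

    def apply_label(self, label, version):
        # Applying a label again moves it; it never points at two versions.
        self.labels[label] = version

    def version_for(self, label):
        return self.labels[label]


# After an ordinary break share, both locations still use one object.
obj = FileObject("filea.cpp")
locations = {"Client/filea.cpp": obj, "Server/filea.cpp": obj}
locations["Client/filea.cpp"].apply_label("Build-1.0", 5)
locations["Server/filea.cpp"].apply_label("Build-1.0", 3)  # moves the label
client_before = locations["Client/filea.cpp"].version_for("Build-1.0")

# After a global break share (modeled), each location has its own object.
locations = {
    "Client/filea.cpp": FileObject("filea.cpp"),
    "Server/filea.cpp": FileObject("filea.cpp"),
}
locations["Client/filea.cpp"].apply_label("Build-1.0", 5)
locations["Server/filea.cpp"].apply_label("Build-1.0", 3)
client_after = locations["Client/filea.cpp"].version_for("Build-1.0")
server_after = locations["Server/filea.cpp"].version_for("Build-1.0")
```

In the first half, labeling the Server location silently moves the label away from the version the Client build expected. In the second half, each object keeps its own label.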

Working Directory Copy

Keep in mind that the Surround SCM database is not a filesystem. The repository view may look like your filesystem, but the Surround SCM database does not function like one. When a file is shared in Surround SCM, the backend system uses special database markers to identify that the file is shared, and to which locations, and it maintains this link for users across multiple repositories and branches. However, when a user performs a Get to their local filesystem, a file shared to multiple repositories with multiple working directories exists as separate copies. This means that while a file shared in Surround SCM is always identical across all the repositories it is shared to, that same file might exist as different copies in your local working directory.

For example, the following files exist in Surround SCM: C:\Client\filea.cpp and C:\Server\filea.cpp. If you modify the Client copy of filea.cpp in your working directory, the Server copy is not updated, even if filea.cpp is shared between the Client and Server repositories. The local copies remain different until the Client copy of filea.cpp is checked in and a Get is performed on the Server copy of filea.cpp. To avoid these types of issues, some administrators disable sharing in Surround SCM and rely on a Common repository instead (an approach commonly referred to as using include paths in project/solution workspaces).
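The Client and Server example can be reproduced with a short simulation. The `get` and `check_in` functions here are hypothetical stand-ins that only mimic the behavior described above using a temporary directory. They are not Surround SCM commands, and the file contents are made up.

```python
import tempfile
from pathlib import Path

# One shared object behind two repository paths (hypothetical model).
shared = {"content": "int main() { return 0; }\n"}
repo_paths = {"Client/filea.cpp": shared, "Server/filea.cpp": shared}


def get(repo_path, workdir):
    """Write the server content to a separate local file."""
    local = Path(workdir) / repo_path
    local.parent.mkdir(parents=True, exist_ok=True)
    local.write_text(repo_paths[repo_path]["content"])
    return local


def check_in(repo_path, workdir):
    """Store the local file's content in the shared object."""
    repo_paths[repo_path]["content"] = (Path(workdir) / repo_path).read_text()


with tempfile.TemporaryDirectory() as workdir:
    client = get("Client/filea.cpp", workdir)
    server = get("Server/filea.cpp", workdir)

    # Editing the Client copy does not touch the Server copy on disk.
    client.write_text("int main() { return 1; }\n")
    diverged = client.read_text() != server.read_text()

    # The copies match again only after a check-in and a fresh Get.
    check_in("Client/filea.cpp", workdir)
    get("Server/filea.cpp", workdir)
    in_sync = client.read_text() == server.read_text()
```

The share links the two paths on the server side only. The two local files are ordinary, unrelated files until Get refreshes them.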