Helix Swarm Guide (2020.1)

Models

There are three code review models: pre-commit, post-commit, and the Git Fusion model. Which model you use for code reviews with Swarm is up to you.

Pre-commit model

The pre-commit model is possible due to Helix server's shelving feature. Shelving enables you to temporarily make copies of your files available to other users without committing the changes into the depot. Shelving is also a handy way for developers to back up work in progress, or to set aside local workspace changes that might otherwise be lost, without committing code that might destabilize the codebase.

Swarm uses the shelving feature in Helix server to manage code reviews. Shelving allows reviewers to easily acquire a copy of the code to be reviewed, and allows updates to the reviewed code prior to submission.

For more information on shelving, see Shelve Changelists in Helix Core Server User Guide.

Post-commit model

The post-commit model can be used if your team's development processes preclude the use of shelving. Code must be committed to the Helix server before code review can begin, which reduces the opportunity to fix problems before, for example, a continuous integration system notices problems. However, code reviews can be started for any existing code regardless of how long it has been committed.

Git Fusion model

Perforce Git Fusion provides repo management for Git repositories, and provides workflows that enable Git and Helix server users to collaborate on the same projects using their preferred tools.

The Git Fusion model is similar to the pre-commit model. Changes in your local repo can be pushed for review to a named Helix server branch in the Git Fusion repo configuration, which makes your proposed changes available for others to review and comment on before they are committed to the target branch. Git Fusion and Swarm work together to create a review branch and container for the pre-commit collaboration.

The Git Fusion model has several limitations that you should be aware of:

  • The target branch for Git Fusion-created reviews must be a fully populated branch, and must be listed in the repo-specific Git Fusion configuration.

    See Setting up Repos in the Git Fusion Guide for details on converting a lightweight branch into a fully populated Helix server branch.

  • Reviews created with Git Fusion can only be updated from Git Fusion.
  • You cannot clean up history and then push your changes to the same review. If you perform a Git rebase, you should push your changes as a new review.
  • A Git Fusion review does not currently display the individual task branch commits that make up the review. Only the merged commit diffs are shown.

For more information on Git Fusion, see the Git Fusion Guide.

Internal representation

Swarm-managed changelists

A code review consists of one or more shelved changelists that Swarm manages. A shelved changelist is a pending changelist that has a snapshot of its files on a shelf associated with the changelist.

When a review is started, Swarm creates a new changelist that becomes the review changelist. What happens afterwards varies:

  • If the review contains uncommitted work (the pre-commit model), Swarm copies the shelved files from the user's changelist that initiated the review into the review's changelist.
  • Any time that a user's changelist associated with the review has its shelved files updated, Swarm copies the shelved files into its review changelist and creates an archive changelist. An archive changelist is no different from any other pending changelist with shelved files, but it allows Swarm to provide versioning and diffs within a review.
  • If the head version of a review is committed (the post-commit model), the review's changelist is emptied of files.

The review's changelist is never actually committed; this allows the review to be opened later with additional shelved changes.

Important

Swarm's managed review changelists should only be deleted if you are uninstalling Swarm.

Swarm's review changelists maintain the history of a review and all of its feedback. Deleting a Swarm-managed shelved changelist causes instability and potential data loss, and can be very challenging to recover from, even with the engagement of Perforce consultants.

You can display a list of all of the Swarm-managed changelists using the p4 changelists command:

$ p4 changelists -u swarm
Change 1212285 on 2015/07/31 by [email protected] *pending* 'Add requirements and instructions'
Change 1212284 on 2015/07/31 by [email protected] *pending* 'Add requirements and instructions'
...

swarm is the userid with admin-level privileges within the Helix server that Swarm is configured to use. Use the appropriate userid when you run the p4 changelists command.
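
If you want to process that listing in a script, each summary line can be split into its fields. The following is a minimal sketch, assuming the line layout shown in the sample output above; the names parse_change_line and ChangeSummary are hypothetical, and the client name swarm-example in the usage line is made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# One summary line, modelled on the sample output above:
#   Change N on YYYY/MM/DD by USER@CLIENT [*STATUS*] 'DESCRIPTION'
_CHANGE_LINE = re.compile(
    r"^Change (?P<change>\d+) on (?P<date>\d{4}/\d{2}/\d{2}) "
    r"by (?P<user>[^@\s]+)@(?P<client>\S+)"
    r"(?: \*(?P<status>\w+)\*)? '(?P<description>.*)'$"
)


class ChangeSummary(NamedTuple):
    change: int
    date: str
    user: str
    client: str
    status: Optional[str]  # None when the line carries no *status* marker
    description: str


def parse_change_line(line: str) -> Optional[ChangeSummary]:
    """Parse one summary line; return None for lines that do not match."""
    match = _CHANGE_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["change"] = int(fields["change"])
    return ChangeSummary(**fields)


# Hypothetical input line; the client name is an assumption.
sample = ("Change 1212285 on 2015/07/31 by swarm@swarm-example "
          "*pending* 'Add requirements and instructions'")
print(parse_change_line(sample))
```

In practice the lines would come from running the command with the userid that your Swarm installation is configured to use.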

Swarm-managed workspaces

Whenever Swarm creates a changelist for a review, it uses a client workspace (or just workspace) associated with the configured Helix server userid that has admin privileges. Whenever a user commits a change via Swarm's user interface, Swarm uses a workspace associated with that user.

To learn more about workspaces, see Helix server as a version control implementation in the Solutions Overview: Helix Version Control System Guide.

The workspaces that Swarm creates and uses live in the SWARM_ROOT/data/clients folder.

Inside the clients folder, Swarm maintains a user-specific folder that contains any workspace folders that may be required. Each user-specific folder is named by converting their Helix server userid into hexadecimal to avoid any characters that would be problematic in the filesystem, such as slashes, accents, UTF-8 characters, etc. For example, the folder for the user steve.russell would be named 73746576652e72757373656c6c.
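
The conversion can be reproduced in a few lines, which is useful when you need to find the folder that belongs to a particular user. This is a minimal sketch, assuming the userid is encoded as UTF-8 before being converted to hexadecimal; the function names are hypothetical.

```python
def user_folder_name(userid: str) -> str:
    """Hex-encode a userid so it is safe to use as a folder name."""
    # Assumption: the userid is UTF-8 encoded before hex conversion.
    return userid.encode("utf-8").hex()


def userid_from_folder(folder_name: str) -> str:
    """Reverse the conversion: recover the userid from a folder name."""
    return bytes.fromhex(folder_name).decode("utf-8")


print(user_folder_name("steve.russell"))   # 73746576652e72757373656c6c
print(userid_from_folder("73746576652e72757373656c6c"))  # steve.russell
```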

Within the user-specific folder are the folders that become the root of each workspace. Each of these folders is named with a globally-unique identifier (GUID) prefixed with swarm-, for example swarm-438d482b-f107-9a35-c06c-86ac68136b00. Accompanying each folder is a lock file with the same name plus the .lock extension. Finally, the user-specific clients folder contains a management lock file called manage.lock.

Here is an example of the folder structure:

SWARM_ROOT/
  data/
    clients/
      73746576652e72757373656c6c/
        manage.lock
        swarm-438d482b-f107-9a35-c06c-86ac68136b00/
        swarm-438d482b-f107-9a35-c06c-86ac68136b00.lock
        swarm-8388362a-233d-0cb9-3e90-895eaaa99f6c/
        swarm-8388362a-233d-0cb9-3e90-895eaaa99f6c.lock
      7061756c612e626f796c65/
        manage.lock
        swarm-da7de4b4-0ecb-12c8-1b35-f3e32bb18033/
        swarm-da7de4b4-0ecb-12c8-1b35-f3e32bb18033.lock

Here are the steps Swarm takes when it needs to use a client:

  1. Convert the current connection's userid to hexadecimal.
  2. Check to see whether a user-specific folder exists within SWARM_ROOT/data/clients; if not, create the folder.
  3. Within the user-specific folder, loop over any existing workspace folders and attempt to lock each in turn:

    If a lock is acquired, skip to step 4. Otherwise, perform the following procedure.

    Create workspace procedure:

    1. Check if the max number of clients for the current user has been reached:

      • If so, wait a short amount of time (50 milliseconds), and start step 3 again.
      • If not, proceed to the next step.
    2. Take a lock on manage.lock.
    3. Check if the max number of clients for the current user has been reached:

      • If so, release the manage.lock, wait a short amount of time (50 milliseconds), and start step 3 again.
      • If not, proceed to the next step.
    4. Create a new workspace folder using a GUID-based filename, and take a lock on the folder.
    5. Release the manage.lock lock.
  4. Perform the necessary file operations using the locked workspace folder.
  5. Revert the file content within the workspace folder so that disk usage does not grow continually.

    Note

    There may occasionally be stray files left; Swarm is not aggressive about cleaning up.

  6. Release the lock on the workspace folder.
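
The locking steps above can be sketched as follows. This is an illustration only, not Swarm's actual implementation: it assumes a POSIX system, uses advisory flock locks on the .lock files, and treats the per-user limit (max_clients) as a plain parameter. All function names are hypothetical, and step 5 (reverting file content) is left as a comment.

```python
import fcntl
import os
import time
import uuid
from contextlib import contextmanager

RETRY_DELAY = 0.05  # 50 milliseconds, as in the steps above


def _try_lock(lock_path):
    """Return an open, exclusively locked file, or None if already locked."""
    handle = open(lock_path, "a")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def _workspaces(user_dir):
    """List the existing workspace folder names for one user."""
    return sorted(
        name for name in os.listdir(user_dir)
        if name.startswith("swarm-")
        and os.path.isdir(os.path.join(user_dir, name))
    )


def _acquire(user_dir, max_clients):
    while True:
        # Step 3: try to lock each existing workspace folder in turn.
        for name in _workspaces(user_dir):
            lock = _try_lock(os.path.join(user_dir, name + ".lock"))
            if lock is not None:
                return name, lock
        # Create-workspace procedure: check the limit, take manage.lock,
        # check the limit again, then create and lock a new folder.
        if len(_workspaces(user_dir)) < max_clients:
            manage_path = os.path.join(user_dir, "manage.lock")
            with open(manage_path, "a") as manage:
                fcntl.flock(manage, fcntl.LOCK_EX)  # released on close
                if len(_workspaces(user_dir)) < max_clients:
                    name = "swarm-" + str(uuid.uuid4())
                    os.mkdir(os.path.join(user_dir, name))
                    lock = _try_lock(os.path.join(user_dir, name + ".lock"))
                    if lock is not None:
                        return name, lock
        # Limit reached (or lost a race): wait, then start step 3 again.
        time.sleep(RETRY_DELAY)


@contextmanager
def locked_workspace(clients_root, userid, max_clients=2):
    # Steps 1 and 2: hex-encode the userid and ensure its folder exists.
    user_dir = os.path.join(clients_root, userid.encode("utf-8").hex())
    os.makedirs(user_dir, exist_ok=True)
    name, lock = _acquire(user_dir, max_clients)
    try:
        yield os.path.join(user_dir, name)  # Step 4: caller does file work.
    finally:
        # Step 5 (reverting file content) would happen here.
        lock.close()  # Step 6: closing the file releases the lock.
```

A caller would wrap its file operations in `with locked_workspace(root, userid) as path:`. Checking the limit both before and after taking manage.lock mirrors the procedure above: the first check avoids contending for the management lock when the limit is already reached, and the second guards against another process having created a workspace in the meantime.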

Most users should only require one or two workspaces, and those are only required if they commit from Swarm. The admin user that Swarm is configured to use should only use one workspace per configured worker.

By default, the number of workspaces that could be active at any given instant is two times the number of configured workers. Since the default worker count is three, Swarm would use at most six workspaces simultaneously.
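
As a quick check of that arithmetic, here is a small sketch; the constant and function names are hypothetical, and the factor of two and the default of three workers are taken from the paragraph above.

```python
DEFAULT_WORKERS = 3        # default worker count stated above
WORKSPACES_PER_WORKER = 2  # "two times the number of configured workers"


def max_active_workspaces(workers: int = DEFAULT_WORKERS) -> int:
    """Upper bound on simultaneously active workspaces by default."""
    return WORKSPACES_PER_WORKER * workers


print(max_active_workspaces())   # 6
print(max_active_workspaces(5))  # 10
```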

If the workspace limit is reached, further file processing is blocked until a workspace becomes available. Potentially, this means that users could encounter timeouts. Configuring Swarm to use more workers could solve that issue.

Removal considerations

Administrators might wish to remove Swarm-managed workspaces. There are a few considerations that should be assessed prior to removal:

  • Ideally, you should stop the web server (taking Swarm out of service) before removing a Swarm-managed workspace from the Swarm server; this eliminates the risk of removing a workspace that is in use.

    If you do not stop the web server first, Swarm may encounter an error during a submit.

  • Removal of a Swarm-managed workspace folder does not remove the client spec from the Helix server. Unless the client spec is removed, that workspace effectively becomes orphaned. Orphaned clients are not, in themselves, a big concern because their storage and performance impact is negligible.
  • You can remove a Swarm-managed workspace's corresponding client spec from the Helix server. However, you should never remove a client spec that has associated shelved files.

    Usually, the only client specs that should have associated shelved files belong to the admin account that Swarm is configured to use. All other workspaces that may exist for other users are primarily used for submitting changes, and so should not have shelved files associated.
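
One way to apply the last consideration in a script is to check for shelved changelists before deleting a client spec. The sketch below is an illustration rather than a supported procedure. It assumes the p4 command-line client is on the PATH and already authenticated, and that `p4 changes -s shelved -c <client>` lists the shelved changelists owned by that client. The helper names (run_p4, client_has_shelved_files, remove_client_spec_if_safe) are hypothetical.

```python
import subprocess
from typing import Callable, List


def run_p4(args: List[str]) -> str:
    """Run a p4 command and return its standard output."""
    result = subprocess.run(
        ["p4", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def client_has_shelved_files(
    client: str, run: Callable[[List[str]], str] = run_p4
) -> bool:
    """True if any shelved changelist is associated with the client."""
    return bool(run(["changes", "-s", "shelved", "-c", client]).strip())


def remove_client_spec_if_safe(
    client: str, run: Callable[[List[str]], str] = run_p4
) -> bool:
    """Delete the client spec only when it has no shelved files.

    Returns True if the spec was deleted, False if it was left alone.
    """
    if client_has_shelved_files(client, run):
        return False
    run(["client", "-d", client])
    return True
```

Passing the command runner as a parameter keeps the decision logic testable without a live Helix server; in real use, stop the web server first as described above.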