March 5, 2019

Build at Massive Scale With Docker Volumes

Continuous Integration
DevOps

Before Docker, companies would often use virtual machines (VMs) to run builds. VMs provided a way to take a snapshot of the machine’s state, a kind of versioning. But this method lacked the formal management needed to control the machine’s environment. Drawbacks included:

  • Additional complexity.
  • Need to be provisioned and configured.
  • Drain on IT resources.

Builds would often break when provisioning an entire environment on a VM just to check the latest code change. This caused significant delays to development and downstream pipelines. It is why teams started to embrace Docker containers to develop, test, and deploy.

Docker For Continuous Integration

Docker containers have become an integral part of test, development, and Continuous Integration workflows. One of the benefits of using containers is their relatively small size and short-lived nature. They give teams a clean and managed environment.

This lightweight alternative to VMs can be created in seconds and is killed off when the purpose has been fulfilled. With Docker, teams can generate clean, reproducible builds that are quick to deploy. Plus, there is no leftover environment data or custom tools to break the build. 

But teams managing a lot of large assets need to deploy containers without slowing things down. Copying large numbers of very large files, such as graphics, movies, and sounds, thousands of times severely impacts build performance.

What are Docker Volumes?

Docker volumes are the solution to help quickly orchestrate massive builds, because they provide another location for files. This persistent external storage can live on the Docker host or on a remote machine. Pulling files from this location significantly reduces the time it takes to get large amounts of code and non-code assets into a container. By setting up Docker volumes, you can accelerate your container workflow and boost team productivity.
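For example, a Jenkins declarative pipeline can mount a named Docker volume into the build container so that large assets persist after the container is destroyed. This is a minimal sketch; the image name, volume name, mount path, and build command are illustrative:

```groovy
pipeline {
  agent {
    docker {
      image 'gcc:13'                 // illustrative build image
      // Mount the named volume 'build-assets' so its contents survive
      // container teardown and are reused by the next build
      args '-v build-assets:/assets'
    }
  }
  stages {
    stage('Build') {
      steps {
        // Assets are already present in /assets; no mass copy into the container
        sh 'make -C /assets/project'   // illustrative build step
      }
    }
  }
}
```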

Build the 1 TB Project With Docker Volumes

For companies with over 1 TB of project code, throwing away a workspace with the container only to resync all the files simply isn’t an option. By implementing Docker volumes, you can stage these assets outside of the container, and then pull them into the build.

It is not just about scaling the container. Instead, the code and non-code binary assets (graphics, etc.) are placed outside the Jenkins build machine. This saves time compiling and copying large files. These large assets are readily available on external storage, ready for use again and again.

But where do you start? The steps below walk you through setting up your workspace and Docker volume. 

How to Use Docker Volumes + Helix Core

The version control system built to manage massive amounts of data is Helix Core. It can handle tens of thousands of users, tens of millions of daily transactions, and hundreds of terabytes of data. When combined with Docker volumes, you are able to scale your builds to deliver feedback to developers fast.


Helix Core Workspace Options

Since the container is short-lived, there are a couple of options in Helix Core. Should you use a new Helix Core workspace every time the container starts? Do you even need a Helix Core workspace if the source is synced each run?

Using a Helix Core workspace has benefits. You can:

  • Control the view.
  • Map source files.
  • Pin revisions.
  • Apply filters.

The more important question: should you track the file revisions when syncing with the db.have list? Let’s review your options.

Provisioning a New Helix Core Workspace

Setting up a new workspace for each container has benefits. When running a build, you know there is nothing left over in the workspace, and new workspaces are quick to start.

But each time you create a container, you need to delete the old workspace. If you do not delete it, obsolete records clog the Helix Core db.have list. Better still, since the workspace is only used once, skip recording the have list entirely by using the p4 sync -p command. In the Jenkins p4-plugin, choose the ‘SyncOnly’ populate option and uncheck the ‘Populate Have List’ option to avoid populating the ‘have list.’

checkout perforce(
  credential: 'myID',
  populate: syncOnly(have: false),
  workspace: manualSpec(cleanup: true,
    spec: clientSpec(view: '//depot/project/... //${P4_CLIENT}/...')))

 

The keen-eyed may have spotted a new ‘cleanup’ option in the Jenkins p4-plugin. It is not exposed in the Pipeline Syntax Snippet Generator. Setting this option to true deletes the Helix Core workspace after the initial sync.

But be careful. It is important to ensure that no subsequent pipeline steps require the Helix Core workspace. Alternatively, there is a ‘p4cleanup’ pipeline step (an alias of ‘cleanup’) that, if set to true, will remove the Helix Core workspace and all local files.

cleanup(true)

 

Reusing a Helix Core Workspace

If you choose to reuse the Helix Core workspace, you can skip the cleanup and deletion steps. You would need to ignore the ‘have list,’ as a previous sync may no longer hold the relevant information. You can achieve this using the ‘Force sync’ populate option. But again, because the ‘have list’ information is not used, it may be tempting to uncheck the ‘Populate Have List’ option too.

Note: You cannot set ‘Force Sync’ and unset ‘Populate Have List’ because this is an invalid state. You cannot ignore a list you never created.
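Put together, a reusable-workspace checkout might look like the sketch below. The ‘Force sync’ populate option appears as forceClean in the p4-plugin Pipeline DSL; the credential and view here are placeholders, so check the Snippet Generator for your setup:

```groovy
checkout perforce(
  credential: 'myID',
  // Force sync every file and record the result in the have list
  populate: forceClean(quiet: true),
  workspace: manualSpec(
    spec: clientSpec(view: '//depot/project/... //${P4_CLIENT}/...')))
```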

How to Create Docker Volumes in Helix Core

Once you have figured out workspaces, you can create persistent, external storage for Docker volumes. 

  1. Mount your build area outside of the container, so it can persist through builds.

  2. Use 'AutoClean' (a Perforce reconcile/clean) if the area is shared or is at risk of pollution.

  3. Alternatively, use 'SyncOnly' if you can guarantee the Docker volume is always used by the same Jenkins job, agent, and Helix Core workspace. 
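Combining the steps above, a sketch of a job that syncs into a persistent Docker volume with ‘AutoClean’ might look like this (the image, volume, and path names are illustrative):

```groovy
pipeline {
  agent {
    docker {
      image 'gcc:13'            // illustrative build image
      args '-v p4workspace:/ws' // named volume persists between builds
    }
  }
  stages {
    stage('Sync') {
      steps {
        dir('/ws') {
          // autoClean runs a reconcile/clean, so leftover or modified files
          // in the shared area are repaired before the build
          checkout perforce(
            credential: 'myID',
            populate: autoClean(),
            workspace: manualSpec(
              spec: clientSpec(view: '//depot/project/... //${P4_CLIENT}/...')))
        }
      }
    }
  }
}
```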

Dealing With Concurrent Access

It is vital to prevent concurrent access to a shared workspace mounted by Docker. It would be a bad idea to have multiple Docker instances building from the same set of external files — unless you are sure that generated assets or intermediary files are not visible to another container.

While Helix Core workspace reuse has its advantages, concurrent access to the same workspace will lead to issues. This is especially true if two or more executors try to update the same ‘have list.’
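If a shared volume must occasionally be used by more than one job, one way to serialize access in Jenkins is the Lockable Resources plugin’s lock step, so only one build touches the workspace at a time. A sketch, with an illustrative resource name:

```groovy
lock(resource: 'p4workspace-volume') {
  // Only one build at a time syncs and builds in the shared volume
  checkout perforce(
    credential: 'myID',
    populate: syncOnly(have: true))
  sh 'make build'   // illustrative build step
}
```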

Get More With Helix Core

Docker is an incredibly powerful tool for accelerating CI/CD and giving developers near-instant feedback. It also provides better visibility into what you are running in production.

By using external storage, like Docker volumes, to persist some of the code and binaries, you can realize these benefits even with massive projects. But before you start setting anything up, take a look at Helix Core. It is the only version control system that supports all your digital assets and gives you the performance you need to accelerate development.

See Helix Core in Action