DevOps Digest 310: Final CI Tasks
In our last article, we looked at how to promote code automatically from lower-level, less stable streams to higher-level, more stable streams as a prelude to producing release builds for testing and distribution.
However, organizations often need to (1) archive various build artifacts (and perhaps other data), and (2) label the files that went into producing said builds for future reference. In this article, we’ll leverage Helix for those tasks as we end our chapter on Continuous Integration.
Helix shines thanks to its unparalleled ability to store build artifacts that would choke other systems. For our purposes, the term “build artifacts” refers to any by-products of the process of assembling what you actually ship from various input files.
For example, while the ultimate, desired result of compiling and linking legacy C++ source code is typically an executable, the larger process may also produce all sorts of things you may need in the future:
- Symbol files
- Dictionary files
- Library files
- Hash signatures
- Unit test results
- Screenshots from UI tests
Deciding what to keep is important. Bugs reported from older software can become more painful to fix than they should be, simply because the stack dump collected can’t be matched against re-generated symbol files with time stamps that don’t match the original build. Storing original symbols can mean the difference between a simple debugging session and telling your customers you can’t find the problem.
You should always store in your version control system any files you can’t regenerate easily, files you must keep for compliance, and files that have important time stamps. Less important or easily replaced data can be left in the folders of your CI tool or deleted altogether.
With other version control systems, keeping such build artifacts can be painful, even impossible in the case of large, binary files. But Helix can easily handle files of all types and sizes, so storing them is a simple matter of submitting them after the build process. This is easily accomplished with commands like the following:
p4 rec
p4 submit -d "Storing build artifacts."
Those two commands reconcile new files in your Helix workspace, respecting the contents of the ignore file, and submit your work back to the server. If you use DVCS, then you’ll need one more command:
p4 push
A traditional workspace sends content back to the server with a “submit” command, but DVCS lets you submit locally and requires you to “push” work back to the server as desired. Whichever approach you take, your build artifacts will be safely stored and versioned in perpetuity.
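For a CI job, the steps above can be sketched as a small Python helper. This is only a sketch under stated assumptions: the function names and the `dvcs` flag are ours, not part of Helix, and `p4 reconcile` is simply the long form of `p4 rec`.

```python
import subprocess


def archive_commands(description, dvcs=False):
    """Return the p4 commands that store build artifacts, in run order."""
    commands = [
        ["p4", "reconcile"],                  # long form of "p4 rec"
        ["p4", "submit", "-d", description],  # submit the reconciled files
    ]
    if dvcs:
        # DVCS only: send locally submitted work back to the server.
        commands.append(["p4", "push"])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_all(archive_commands("Storing build artifacts.", dvcs=True))` from a post-build step would run all three commands. Building the list separately from running it makes the sequence easy to log or review first.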
Having talked about what to keep, let’s now prepare for the need to regenerate other items at some future date. Implicit in that process is the assumption that you can somehow retrieve everything that went into a particular build in the first place. As you might expect by now, Helix offers a number of options to assist.
The Change List
The simplest, most performant option is the humble change list. Those accustomed to using other systems often overlook how useful such a simple integer can be. But in Helix, a single submitted change list number is all you need to get back to a particular point in time. Consider the following command:
p4 sync @13
We’ve used “p4 sync” many times; however, the ‘@’ symbol tells the command that we want our workspace to contain the files as they stood at a particular submitted change list number (here, number 13) rather than the latest revisions.
When users submit content to Helix Versioning Engine, it creates a new snapshot at that point in time that is forever linked with the submitted change list number. To retrieve the files for any build, we look in our Jenkins history and sync our workspace to the corresponding submitted change list number.
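In automation, the change list number usually arrives as text from the build record. Here is a minimal sketch of turning it into the sync command shown earlier; the function name is hypothetical, and how you read the number out of your Jenkins history is left as an assumption.

```python
def sync_command(changelist):
    """Build the "p4 sync" command for one submitted change list number."""
    number = int(changelist)  # accepts 13 or "13"; raises ValueError on junk
    if number <= 0:
        raise ValueError("change list numbers are positive integers")
    return ["p4", "sync", "@%d" % number]
```

Validating the number before building the command keeps a malformed build record from syncing the workspace to something unexpected.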
Of course, numbers like “13” are much harder to remember than something more meaningful like “ReleaseCandidate1”. Thus, Helix offers a number of ways to tie more meaningful text to versions of content.
The most lightweight method is to use a job. Jobs can be very helpful in bringing together disparate tools for Application Lifecycle Management (ALM), since every submitted change list related to fulfilling the requirements for a fix or enhancement can be linked to a job.
Because Helix jobs have both names and submitted change list numbers, they may effectively serve to link the two forever. It takes two commands to create a job and assign it a particular change list number:
p4 job ReleaseCandidate1
p4 fix -c 13 ReleaseCandidate1
The first creates a new job named “ReleaseCandidate1” by invoking your editor with a sample job template. Give it any description you like to remember the details of the “ReleaseCandidate1” job. When you save and exit your editor, the job is created and left in the “open” state to indicate that work on it is still pending. Helix jobs are very powerful yet oft-underutilized for managing workflow.
The second command changes the status of the newly-opened job to “closed”, forever linking it with submitted change list number “13”. Now you can determine the corresponding submitted change list number(s) from the job’s name:
p4 fixes -j ReleaseCandidate1
That command produces output of the form “[JobName] fixed by change [SubmittedChangeListNumber] on [Date] by [User]@[Workspace] (closed)”. In other words, one command simplifies translating a given job name into the corresponding submitted change list number(s). Jobs are easily managed through automation, so they’re a great tool to leverage.
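Since the output format is regular, a script can pull the change list numbers out of it. The following is a sketch that assumes each line follows the form quoted above; the function name and the sample user and workspace are made up for illustration.

```python
import re

# Matches "[JobName] fixed by change [Number] on ..." at the start of a line.
FIX_LINE = re.compile(r"^(?P<job>\S+) fixed by change (?P<change>\d+) on ")


def changes_for_job(fixes_output, job):
    """Return the sorted change list numbers that "p4 fixes" reports for a job."""
    numbers = []
    for line in fixes_output.splitlines():
        match = FIX_LINE.match(line.strip())
        if match and match.group("job") == job:
            numbers.append(int(match.group("change")))
    return sorted(numbers)


# Hypothetical sample output for illustration only.
sample = (
    "ReleaseCandidate1 fixed by change 13 on 2017/01/24 by alice@alice-ws (closed)\n"
    "ReleaseCandidate1 fixed by change 17 on 2017/01/25 by alice@alice-ws (closed)\n"
)
```

With that sample, `changes_for_job(sample, "ReleaseCandidate1")` returns both numbers, which also shows why a job may hand you more than one change list to choose from.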
But both mechanisms have limits. A submitted change list is forever tied to a particular point in time, and you can’t later update it except in certain, rare instances. And while jobs are great for a variety of purposes, a job can be associated with more than one submitted change list number, so it might offer either a little less or far more than what you’re really after.
This is why Helix also offers labels. Users can work with labels using several commands, two of which are all too easily conflated by new users. Labels also carry a higher overhead on the server than the mechanisms we’ve already discussed. For many use cases, labels work without any issues, but labelling millions of files can take time, so bear that in mind when using labels with automated systems.
The “tag” Command
The first way of working with Helix labels is via the tag command:
p4 tag -l ReleaseCandidate1 //DevOps/Main/...
Like many Helix commands, this one creates what it needs: if the label doesn’t already exist, it creates a label specification on the server, and then it associates the label with the latest revision of all files in our “Main” stream. This is exactly the sort of operation one might perform as part of an automated release-build process. Just as we previously saw how to sync a workspace to a particular change list number, our new label may now be used in its place:
p4 sync @ReleaseCandidate1
That command syncs the workspace to the versions of files tagged with the “ReleaseCandidate1” label. It offers a more direct way to access those files compared to using a job to look up the number and then syncing to that number.
Labels also offer other interesting functionality. Just as with other specifications stored on the Helix server, a label specification may later be examined and/or altered using the following command:
p4 label ReleaseCandidate1
This invokes your editor and shows you what the “p4 tag” command created behind the scenes. It also shows that the label is marked with the attributes “unlocked” and “noautoreload”. Unlike many systems, you can lock Helix labels to prevent future changes.
If you update the label spec to “locked”, then the set of files/revisions associated with that label may no longer be changed by anyone, even the label owner. The label owner may later edit the spec and unlock the label, but until then it’s effectively frozen. This is great for identifying content securely for compliance.
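An automated release build can lock a label without opening an editor by reading the spec with “p4 label -o”, editing the text, and writing it back with “p4 label -i”. Here is a sketch of the editing step only; the function name is ours, and it assumes the spec contains an “Options:” line with the attributes on that same line.

```python
def lock_label_spec(spec_text):
    """Return the label spec with "unlocked" switched to "locked" in Options."""
    lines = []
    for line in spec_text.splitlines():
        if line.startswith("Options:"):
            words = line[len("Options:"):].split()
            # Leave every other attribute (such as noautoreload) untouched.
            words = ["locked" if word == "unlocked" else word for word in words]
            line = "Options:\t" + " ".join(words)
        lines.append(line)
    return "\n".join(lines) + "\n"
```

A spec that is already locked passes through unchanged, so the step is safe to repeat.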
Let’s set the “noautoreload” attribute aside for now and move on to other storage details. Helix labels offer greater functionality insofar as they may be limited with a view. If you open the label spec for the one we just created, for example, its view section reads:
View:
    //DevOps/...
    //depot/...
That limits the scope of the label. Because we created it on the “Main” stream, both “DevOps” and the default depot were included in the label’s view. Were we to try to apply that label elsewhere, we’d meet with issues.
Note: changing the view after the fact won’t change the files already associated with that label; it merely restricts the scope of the label’s use in future commands.
The “labelsync” Command
That offers a natural segue to another useful label-related command, “labelsync”:
p4 labelsync -l ReleaseCandidate1 //DevOps/Main/...
The best way to describe the differences between “tag” and “labelsync” is to say that “tag” always adds file revisions to a label, whereas “labelsync” replaces the file revisions contained by the label. Note also that “tag” creates a new label specification if one doesn’t already exist, whereas “labelsync” expects the label to have been created beforehand (for example, with “p4 label”).
One subtle difference, however, is that “labelsync” doesn’t select the head revisions of all the files. Rather, it associates the label with whatever revisions are in the current workspace. The “labelsync” command is also exclusive, meaning it removes the label from anything outside the current set of files, unlike “tag” which simply adds. In short, “labelsync” was intended to make it easy to define a label so it matches the content of the current workspace exactly.
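The contrast is easier to see in a toy model. The sketch below treats a label as a plain mapping from depot path to revision number and mimics only the behaviour described above; it is an illustration of the idea, not of how the server stores labels, and the file names are invented.

```python
def tag(label, revisions):
    """Model of "tag": add the given revisions, keep everything else."""
    updated = dict(label)
    updated.update(revisions)
    return updated


def labelsync(label, workspace):
    """Model of "labelsync": make the label match the workspace exactly."""
    # Anything the label held that the workspace lacks is dropped.
    return dict(workspace)


label = {"//DevOps/Main/a.c": 3, "//DevOps/Main/old.c": 1}
after_tag = tag(label, {"//DevOps/Main/b.c": 2})
after_labelsync = labelsync(label, {"//DevOps/Main/a.c": 2})
```

After “tag”, the label holds three files, including the untouched `old.c`. After “labelsync”, it holds only `a.c` at the workspace’s revision 2.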
Clear as mud, right? An example should help illustrate the differences in intended use. Let’s say that you’re not exactly using Trunk-Based Development (TBD) and don’t have the luxury of taking a known-good build from your “Main” stream at any moment.
A common, alternate way of working is to let the release-build system “float” or “roll” a label like “LastKnownGoodBuild” forward at the conclusion of every successful test run. Testers tend to love this approach as they know they can always sync their workspaces to that floating label.
One problem with this approach, however, is that every time that label rolls forward, the previous known-good build is lost. The most frequently used workaround is to copy the existing label to something more specific before rolling it forward:
p4 tag -l LastKnownGoodBuild_01_24_2017 @LastKnownGoodBuild
So here’s the catch: what if the people relying on that floating label need to work with some previous known-good build? Or worse, what if your automated systems that rely on that floating label need to work with some previous known-good build? This is where the “labelsync” command shines:
p4 labelsync -l LastKnownGoodBuild @LastKnownGoodBuild_01_24_2017
That command replaces the contents of the floating label with the contents of the date-specific label, so that when everyone syncs, they’ll get exactly what you want them to get. In short, if you stick with the “tag” command to associate files with a label and the “labelsync” command to reset the contents of a label, you’ll never go wrong.
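A release-build script might generate both commands from the floating label’s name and a date. The following is a sketch under assumptions: the function names are hypothetical, and the month_day_year suffix simply follows the label name used in the example above.

```python
from datetime import date


def dated_label(floating, day):
    """Name the snapshot label, e.g. LastKnownGoodBuild_01_24_2017."""
    return "%s_%s" % (floating, day.strftime("%m_%d_%Y"))


def snapshot_command(floating, day):
    """Copy the floating label's contents to a dated label before rolling it."""
    return ["p4", "tag", "-l", dated_label(floating, day), "@" + floating]


def restore_command(floating, day):
    """Reset the floating label to the contents of a dated label."""
    return ["p4", "labelsync", "-l", floating, "@" + dated_label(floating, day)]
```

Running the snapshot command at the end of each successful test run, just before rolling the label forward, keeps every known-good build reachable by date.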
One last thing remains before we’re done with labels, and that’s a bit of advice for working with them at scale. That is, if you’re working with millions of files, but absolutely must rely on lots of labels, how can you maintain good performance and manage the load on the server as time marches onward?
Helix supplies a little-known feature that allows you to “unload” label data from the server, archiving what you no longer need:
p4 unload -l ReleaseCandidate1
What that does “under the hood” is move the details for the label to a special unload depot. Should you ever need that data back, it’s a simple enough thing to undo:
p4 reload -l ReleaseCandidate1
Of course, unloading labels one at a time could be just as painful as having too many in the first place, so the command (thankfully) can also unload anything older than a specified date:
p4 unload -f -al -d 2017/01/01
That command unloads all labels older than the beginning of 2017, effectively starting the year with a blank slate. Helix provides some nice “housekeeping” abilities that both your server admin(s) and day-to-day users will appreciate. For more details, including information on the aforementioned “noautoreload” attribute, see the Helix documentation for the “p4 unload” command.
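For scheduled housekeeping, the cutoff date can be computed rather than typed by hand. Here is a minimal sketch that builds the same unload command for the start of whatever year a given date falls in; the function names are ours, and the year/month/day format follows the command shown earlier.

```python
from datetime import date


def start_of_year(today):
    """Return January 1 of the year that the given date falls in."""
    return date(today.year, 1, 1)


def unload_labels_before(cutoff):
    """Build the command that unloads all labels older than the cutoff date."""
    return ["p4", "unload", "-f", "-al", "-d", cutoff.strftime("%Y/%m/%d")]
```

Passing `start_of_year(date.today())` gives the cutoff for the current year. Because this affects every older label at once, printing the command and reviewing it before running it is a sensible precaution.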
There are other CI topics we could discuss, but we’ve covered a lot, and it’s time to move on to the next step in our DevOps pipeline: Continuous Testing.
You Ask, We Answer
As previously mentioned, this is your roadmap to creating a successful DevOps pipeline. Don’t understand something? Just ask. Need to dive a little deeper? Send an email to email@example.com with your questions. Then, stay tuned for a live Q&A webinar at the end of this series.