March 27, 2014

Towards a Definition of "Work in Progress"


In order to understand the overall behavior of a distributed installation of Perforce, it's important that you understand the concept of "work in progress." So I thought I'd spend a bit of time trying to explain what it means.

The Perforce server keeps track of the complete history of the files that you submit to the server. This is permanent history; once you submit a file to Perforce, you can always go back and get that same version of the file back, even if you have made other changes to the file subsequently.

The Perforce server also keeps track of the work that you have not submitted, such as

  • Files that you have sync'd to your workspace.
  • Files that you have opened for add, edit, or delete.
  • Integrations that you have already resolved, or that still need to be resolved.
  • Numbered pending changelists and their descriptions.
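If you'd like to see this unsubmitted state for yourself, each item in the list above has a corresponding reporting command. Here is a minimal Python sketch that just builds those command lines; the function name and the workspace name "my_workspace" are made up for illustration, and nothing is actually sent to a server.

```python
import shlex
from typing import List


def work_in_progress_commands(client: str) -> List[List[str]]:
    """Build p4 command lines that report unsubmitted work for one workspace."""
    base = ["p4", "-c", client]  # -c is the global option that selects the workspace
    return [
        base + ["have"],            # files sync'd to the workspace
        base + ["opened"],          # files opened for add, edit, or delete
        base + ["resolved"],        # integrations already resolved, not yet submitted
        base + ["resolve", "-n"],   # preview of files that still need to be resolved
        ["p4", "changes", "-s", "pending", "-c", client],  # pending changelists
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so no server is required.
    for argv in work_in_progress_commands("my_workspace"):
        print(shlex.join(argv))
```

All of these are read-only reporting commands, so they are safe to try in a workspace that has work in progress.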

Up until the point that you submit a changelist, the work that is contained in that changelist is temporary. You can revert your opened files, and delete your pending changelists, and the Perforce server will erase all the information that it was storing about that incomplete work.

Work in progress is the work associated with your particular workspace, tracked separately from every other workspace that is being used with the server: which files you have sync'd, which files you have opened for edit, and which files you have integrated.

In fact, most of the work that you do with the Perforce server involves work in progress. The 'p4 submit' command, although it is a very important command, is not even close to being the most commonly used. Far more common are commands such as 'p4 sync', 'p4 edit', 'p4 diff', 'p4 integrate', 'p4 resolve', 'p4 revert', and 'p4 reconcile', all of which involve work in progress.

In a distributed installation, your work in progress is stored on the Edge Server where you created your workspace, and not on any other Edge Server in the installation, nor on the Commit Server. This is why the distributed configuration provides such benefits in performance and scalability. Work in progress activity is entirely confined to a single Edge Server, and doesn't affect any of the other Edge Servers in your installation, nor does it affect the Commit Server.

It all sounds wonderful, doesn't it? And it is!

But there is a bit of complexity that I haven't yet described, and that involves shelves.

The 'p4 shelve' command can be a bit of a curious creature. Your work in progress is generally work that only you care about. Until you submit your changes, you might change your mind, completely re-think what you're doing, etc., and so nobody else needs to see your work in progress until you submit it.

Quite commonly, shelves are created for purely private reasons, similar to the way that other version control systems let you "stash" your work. When you are working on a moderately complex change, and you want to save a copy of it in its current state, you can just shelve your work, and then continue working. It's a simple and easy way to have the server store a copy of your intermediate changes, without having to create a separate branch to submit them to, and your shelved changes are private to you and don't disrupt anyone else's work.

You can create several shelves with different variants of your work, move back and forth between them, use shelves to put one task on hold while you quickly handle a higher priority emergency of some sort, etc.

But sometimes you WANT to share your unsubmitted work. For example, you might want to have a colleague review your changes before you submit them. Or you might have a test automation tool that can test your changes before you submit them.

Shelves are a very useful tool for these sorts of situations. You can shelve your changes, send the changelist number to your colleague, and your colleague can access the shelved files and give you review feedback. Similarly, your test automation tool can be programmed to unshelve your shelved files, and run a suite of automated test cases against them, to assess their readiness to submit.

If you decide to submit your shelf, you can simply run 'p4 submit -e' with the shelved changelist number to submit the shelved files directly. Or, if you decide that your shelf should not be submitted, you can delete the shelved files and the shelved changelist, and the server will erase all the information about the shelf.
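To make the life cycle of a shared shelf a little more concrete, here is a rough Python sketch of the commands involved at each step: shelving, inspecting, unshelving, and then either submitting or discarding. The function name, the dictionary keys, and the changelist number 1234 are hypothetical; the sketch only assembles command lines and does not run them.

```python
import shlex
from typing import Dict, List


def shelf_review_commands(change: int) -> Dict[str, List[str]]:
    """Build the p4 command lines for each step in the life of a shelf."""
    n = str(change)
    return {
        "shelve": ["p4", "shelve", "-c", n],             # author: shelve the pending changelist
        "inspect": ["p4", "describe", "-S", n],          # reviewer: list and diff the shelved files
        "unshelve": ["p4", "unshelve", "-s", n],         # reviewer or test tool: copy the shelf into a workspace
        "submit": ["p4", "submit", "-e", n],             # author: submit the shelved files directly
        "discard_files": ["p4", "shelve", "-d", "-c", n],  # author: delete the shelved files
        "discard_change": ["p4", "change", "-d", n],     # author: delete the now-empty pending changelist
    }


if __name__ == "__main__":
    # Print the commands instead of running them, so no server is required.
    for step, argv in shelf_review_commands(1234).items():
        print(f"{step:15} {shlex.join(argv)}")
```

A test automation tool would typically run the "unshelve" step in its own workspace, run its tests, and revert afterwards. The "submit" and "discard" steps are alternatives, so a real script would pick one or the other.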

So, a shelf is not permanent, and in that sense it is work in progress.

However, a shelf is also shareable with other users, so in that sense it is more than just work in progress. The server not only knows which files are in the shelf, it has a copy of each of those files stored on the server, not just on your workstation, and if the hard disk on your workstation should crash, your shelved files are safe on the server.

This duality of shelved work causes a bit of extra complexity when you are working in a distributed installation.

When you create a shelf, the shelved files are stored only on the Edge Server where you created your workspace, and not on any other Edge Server in the installation, nor on the Commit Server. This is a significant performance benefit, because the shelved files may be quite large and expensive to transmit around the network.

However, if you INTENDED to share your work with others, the fact that your shelved files are only stored on your Edge Server may be undesirable, for example if your colleagues are located in some other office and are using a different Edge Server.

In the 2014.1 release of the server, we have made a significant enhancement to the behavior of shelves in a distributed installation.

A shelf may now be designated as "promoted". A promoted shelf is visible to the entire installation. Users on other Edge Servers can describe the shelf, diff the shelved files against submitted files, unshelve the shelved files into their own workspaces, etc.

However, a promoted shelf is still resident on the Edge Server where it was created. The difference is that a promoted shelf is ALSO stored on the Commit Server (but not on any other Edge Servers). Users on other servers can see that shelf, but there may be a slight pause as the shelved files are transferred across the network. And you can still update, submit, or delete the shelf on the Edge Server where it was created, though there may be a slight pause when you do so, as the updated shelf is copied to the Commit Server to make it available across the installation.

A promoted shelf is easy to create: just specify the '-p' flag when you issue the 'p4 shelve' command.
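As a small illustration, here is how a script might choose between an ordinary shelf and a promoted one. This is only a sketch: the helper name shelve_command and the changelist number 1234 are invented for the example, and the code just builds the command line without contacting a server.

```python
import shlex
from typing import List


def shelve_command(change: int, promote: bool = False) -> List[str]:
    """Build a 'p4 shelve' command line, optionally promoting the shelf."""
    argv = ["p4", "shelve"]
    if promote:
        argv.append("-p")        # promote: make the shelf visible across the installation
    argv += ["-c", str(change)]  # the pending changelist to shelve
    return argv


if __name__ == "__main__":
    # A private shelf stays on your Edge Server; a promoted one is shareable.
    print(shlex.join(shelve_command(1234)))
    print(shlex.join(shelve_command(1234, promote=True)))
```

Keeping promotion as an explicit opt-in mirrors the trade-off described above: you only pay the cost of copying shelved files to the Commit Server when you actually intend to share them.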

We're excited about the new promoted shelves feature in the 2014.1 server: we think it provides dramatically improved functionality for a distributed installation of Perforce, and will make it much easier for you to build powerful code review and test automation processes that utilize the full performance and scalability benefits of the distributed configuration.

Interested? Download the 2014.1 server and try it out!