June 27, 2013

Task Streams – Even if you are a classic Perforce shop


There is something in it for you: More lightweight branches.

With the 2013.1 release, Perforce enhanced Streams further and added a new stream type called “Task Streams”.

Here are a few details about what Task Streams can do for Perforce users who have not yet adopted Streams. I am assuming you have no experience with Streams and will focus only on what you need to start using Task Streams for lightweight branching.

First the big backend things. We need two new depots. The first one will serve as the container for our Task Streams. It is of type “stream” and I simply call it “tasks”. Here is the depot spec.

Depot: tasks
Description: Container for Task Streams.
Type: stream
Map: tasks/... 


The other one will serve as the container for everything we don’t need on a daily basis. It is a depot of type “unload” in which we store our used Task Streams and I call it “attic”. Another depot spec.

Depot: attic
Description: Task Streams not in use anymore.
Type: unload
Map:  attic/...
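
Both specs can also be fed to the server without opening an editor. What follows is only a minimal sketch, assuming a POSIX shell: depot_spec is a hypothetical helper of mine (not a Perforce command) that prints a spec in the form shown above, and the output is meant to be piped into p4 depot -i.

```shell
#!/bin/sh
# depot_spec NAME TYPE DESCRIPTION
# Hypothetical helper: prints a depot spec with the same fields as the
# two specs shown in the text.
depot_spec() {
    printf 'Depot: %s\n' "$1"
    printf 'Description: %s\n' "$3"
    printf 'Type: %s\n' "$2"
    printf 'Map: %s/...\n' "$1"
}

# Usage (needs a running server and sufficient privileges, so it is
# left commented out here):
#   depot_spec tasks stream 'Container for Task Streams.' | p4 depot -i
#   depot_spec attic unload 'Task Streams not in use anymore.' | p4 depot -i
```

Keeping the spec generation separate from the p4 call makes it easy to inspect what would be sent before touching the server.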


In P4V, the “attic” won’t show up at all. The “tasks” will, but there is no need to look into that.

We have a simple project called “project” to start with and two branches “main” and “release1”.  The project has one file called “foo” and thousands of other files that are represented here for simplicity by one file called “bar”.  “release1” is obviously branched from “main”.  You get the picture. Here it is:

Image Blog Body Task Streams Even If 1

The revision graphs for both files look identical, which is no surprise.

Image Blog Body Task Streams Even If 2

Let’s do some work in classic style on foo in release1 and propagate the change back to main. In order to do so I have created two workspaces “project_main” and “project_release1”.  I’ll make a change to foo, submit it and merge it back to main.  The revision graphs now. No surprises here:

Image Blog Body Task Streams Even If 3

We can count integration records in the Perforce database here. It’s two for foo and one for each of the bars in the project. In terms of storing file content, Perforce (of course) hasn’t even bothered creating the file bar more than once on the server. The database, however, knows about each and every branched file, and this is what Task Streams now optimize.

Let’s create a client “project_task” for our work. It does not need a view right now as we will be assigning this workspace to a Task Stream later on. Here is the client spec:

Client: project_task
Description: Client to be assigned to a Task Stream.


Task Streams for isolated small work items

Now it’s time to introduce our first Task Stream. It will be created in the tasks depot. Streams in a depot are not placed in any special directory structure; they go straight into the root of that depot. Dependencies between codelines are modeled using parent-child relationships, which are constructed by defining the parent stream of a given stream. This relationship can change, so using a directory structure to model these dynamic relationships would be pointless.

With Task Streams we don’t want to model complex codeline relationships at all, so these streams will simply have no parent. As they are stored in the root, and as we might have lots of them, naming becomes a topic to consider. Most importantly, the names have to be unique, which is why we can pick some external identifier or simply use a UUID. That’s what I’m doing in this example. Most operating systems have some way to generate one for you. On my Mac I can simply call uuidgen in a terminal window.

Our first Task Stream will have the beautiful name B9DE0357-85F6-4FCC-956B-0EE39153E4C6. The Perforce path will therefore be //tasks/B9DE0357-85F6-4FCC-956B-0EE39153E4C6. We can create the stream either in P4V or on the command line. Please note “Type” and “Parent” here. The spec is this:

Stream: //tasks/B9DE0357-85F6-4FCC-956B-0EE39153E4C6
Name: B9DE0357-85F6-4FCC-956B-0EE39153E4C6
Parent: none
Type: task
Description: Some Task Stream to be used for some work.
Paths: share ... 


And the quickest way to generate one on the Mac without invoking the editor is probably this:

p4 stream -o -t task //tasks/`uuidgen` | p4 stream -i
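
If the same one-liner has to work on machines other than a Mac, the name generation can be wrapped. This is a sketch under my own assumptions: new_uuid and task_stream_path are hypothetical helpers, the fallback to /proc/sys/kernel/random/uuid only exists on Linux, and the result is upper-cased purely to match the names used in this post.

```shell
#!/bin/sh
# new_uuid: print one upper-case UUID, using uuidgen when available and
# the Linux kernel's random UUID file otherwise.
new_uuid() {
    if command -v uuidgen >/dev/null 2>&1; then
        uuidgen | tr 'a-f' 'A-F'
    elif [ -r /proc/sys/kernel/random/uuid ]; then
        tr 'a-f' 'A-F' < /proc/sys/kernel/random/uuid
    else
        echo "no UUID source available" >&2
        return 1
    fi
}

# task_stream_path DEPOT: print a stream path such as //tasks/<UUID>.
task_stream_path() {
    tsp_uuid=$(new_uuid) || return 1
    printf '//%s/%s\n' "$1" "$tsp_uuid"
}

# Usage (requires a server, so it is left commented out):
#   p4 stream -o -t task "$(task_stream_path tasks)" | p4 stream -i
```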


Our Task Stream is still empty and with the relatively new p4 populate we can even fill it up without the need of a client workspace.

p4 populate //depot/project/main/... //tasks/B9DE0357-85F6-4FCC-956B-0EE39153E4C6/...


In order to get some work done, it’s time to assign (switch) our client workspace to this Task Stream. This can easily be accomplished with P4V, by invoking the following on the command line, or even from inside a script.

p4 client -sS //tasks/B9DE0357-85F6-4FCC-956B-0EE39153E4C6 project_task 


Our client workspace is still empty, so we need to sync it as usual. Once we have done that, we can check out foo, get our work done and submit our change as we normally do.

This is a good time to review our revision graphs.

Image Blog Body Task Streams Even If 4
Image Blog Body Task Streams Even If 5

Well, foo got a little more cluttered but bar is still nice and clean. Let’s do even more and create a second Task Stream.

p4 stream -t task //tasks/E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD
p4 populate //depot/project/main/... //tasks/E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD/...
p4 client -sfS //tasks/E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD project_task
p4 -c project_task sync
p4 -c project_task edit foo
p4 -c project_task submit -d'another change to foo'
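
The setup part of that sequence is the same for every task, so it lends itself to a small wrapper. The following is a sketch, not a finished tool: start_task and run are hypothetical helper names, DRY_RUN is a variable I made up to print the commands instead of executing them, and the stream is created with the -o/-i pipe from earlier so that no editor opens.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_task STREAM SOURCE CLIENT
# Creates a Task Stream, populates it from SOURCE, switches CLIENT to
# it and syncs. All three arguments are taken as given, without checks.
start_task() {
    st_stream=$1; st_source=$2; st_client=$3
    # Create the stream spec non-interactively.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "p4 stream -o -t task $st_stream | p4 stream -i"
    else
        p4 stream -o -t task "$st_stream" | p4 stream -i || return 1
    fi
    # Branch the source into the new stream, then point the client at it.
    run p4 populate "$st_source/..." "$st_stream/..." || return 1
    run p4 client -sfS "$st_stream" "$st_client" || return 1
    run p4 -c "$st_client" sync
}

# Usage: show what would be executed for a new task.
#   DRY_RUN=1 start_task //tasks/E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD \
#       //depot/project/main project_task
```

Editing and submitting are deliberately left out, since that is the actual work and differs for every task.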


The revision graph is no surprise again. Foo is getting bigger and bar remains small.

Image Blog Body Task Streams Even If 6
Image Blog Body Task Streams Even If 7

After work in our Task Streams is done, we are going to think about what changes should make it back into our main codeline. After reviewing it carefully, we decide that the first wasn’t good enough but the second actually passed our quality tests. That means we need to merge changes from the Task Stream E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD back to main.

p4 -c project_main merge //tasks/E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD/... //depot/project/main/...
p4 -c project_main resolve -as
p4 -c project_main submit -d'merge back into main'


We all guessed this revision graph for foo.

Image Blog Body Task Streams Even If 8

As we are done with our work in the task streams and we have no intention to touch them again, it’s now safe to unload them. Let’s go in reverse order and unload Task Stream E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD first.

p4 unload -s //tasks/E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD 
Stream //tasks/E62EB0FE-C105-4E9E-AB05-4D22BD26BEDD unloaded.


The revision graph for foo remains unchanged.

Now we unload our very first Task Stream.

p4 unload -s //tasks/B9DE0357-85F6-4FCC-956B-0EE39153E4C6
Stream //tasks/B9DE0357-85F6-4FCC-956B-0EE39153E4C6 unloaded. 


This does not change the picture either. The only difference is that we can no longer work on the files that were branched into the unloaded Task Streams.
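
Once there are dozens of finished tasks, unloading them one by one gets tedious. Here is a sketch of how this could be scripted, assuming the stream paths are kept in some list, one per line. unload_tasks, reload_task and run are hypothetical helper names, and DRY_RUN is my own switch for printing the commands rather than running them. p4 reload -s is the counterpart that brings an unloaded Task Stream back should more work turn up.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# unload_tasks: read Task Stream paths from stdin, one per line, and
# unload each of them. Blank lines are skipped.
unload_tasks() {
    while IFS= read -r ut_stream; do
        [ -n "$ut_stream" ] || continue
        # Keep p4 away from the list we are still reading on stdin.
        run p4 unload -s "$ut_stream" </dev/null || return 1
    done
}

# reload_task STREAM: bring one unloaded Task Stream back.
reload_task() {
    run p4 reload -s "$1"
}

# Usage, with a hypothetical list file finished_tasks.txt:
#   DRY_RUN=1 unload_tasks < finished_tasks.txt
#   reload_task //tasks/B9DE0357-85F6-4FCC-956B-0EE39153E4C6
```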


What’s happening behind the curtain and why is this interesting?

When we add or edit a file in a branch in Perforce, database records are created. One important table to mention here is the db.rev table. Each unique file revision under Perforce control has an entry in there, regardless of whether the change was made in a classic branch or a Task Stream. That is good, as it lets us track our changes to files in Perforce. Changes cause entries here and they persist as long as the files are not obliterated, which is a completely different subject. Another matter is the file revisions that exist because we branch/merge/copy/integrate one file revision to make or change another, in a different location, in one of our depots.

There are two tables that are important here: db.integed and db.integtx. Integrations whose target location is a Task Stream result in entries in the db.integtx table, which is somewhat special. It’s special because entries are removed from this table when you p4 unload a Task Stream. It’s also special because the revision graph does not consider entries there unless there is also a db.rev entry for this Task Stream, which is the reason why bar hasn’t shown up in the graph: we just branched bar and have not made any edits to it. As a Task Stream is only really important for the individual(s) working on the task, all these branch or integration records of files in there are practically meaningless for everybody else. So we really shouldn’t care too much about them and we shouldn’t reserve any shared database space for them, at least not for long. A change made in a Task Stream, however, should be able to propagate to any regular branch in Perforce. Therefore entries in the db.integed table are created even if the source of that integrate is a Task Stream. That way they persist just like changes that result in entries in the db.rev table, which is the case for edits in regular branches and in Task Streams alike.

Task Streams can now be created, unloaded and reloaded if need be. In terms of database operations and database storage, this means records are created in (create, reload) and removed from (unload) the db.integtx table. Without Task Streams they would just get created in the db.integed table and (except for obliterates) never get removed. Administrators and users can now much more easily protect the db.integed table from uncontrolled growth by using Task Streams and unloading them when there is no further need. At the same time, unloading a Task Stream does not require the same super powers as p4 obliterate. Give Task Streams a try should you want lightweight branching with Perforce.
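
To get a feeling for the difference, here is a back-of-the-envelope calculation. It is a toy model and not a description of the server's actual accounting: it counts integration records the way we did above (one per branched file, one more per file merged back), and all three numbers are made up for illustration.

```shell
#!/bin/sh
# Toy model only; the counting rule and all numbers are assumptions.
files=10000   # files in the project (foo plus all the bars)
tasks=50      # short-lived branches created over time
edited=3      # files changed and merged back to main per task

# Classic branches: branch records and merge-back records all stay.
classic=$(( (files + edited) * tasks ))

# Task Streams, once unloaded: the branch records are gone again and
# only the merge-back records remain.
with_tasks=$(( edited * tasks ))

echo "classic branches: $classic records remain"
echo "task streams:     $with_tasks records remain after unload"
```

With these made-up numbers the script prints 500150 versus 150, which is the kind of gap that matters once many people branch a large project for small tasks.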