May 19, 2016

Integrate Take Two: Out with the Old

I'll start this article by setting the wayback machine for 10 years ago, the time of the 2006.1 release.

The most significant 2006.1 feature was a complete rewrite of "p4 integrate", which we referred to around the office as "p4 integrate, take two".  The purpose of this rewrite was to address two of the major shortcomings of "p4 integrate" that we had identified prior to that point:

1. It didn't handle indirect integrations very gracefully.

2. The base was generally constrained to the source file, leading to suboptimal merges.

With the exception of a few convoluted cases where the older, simpler logic turned out to produce better results (through sheer luck), "take two" (or "version 2" or "v2") was a success compared to what preceded it.  Ladder merges (where you merge back and forth between two branches) worked better because the engine could pick bases on the target when appropriate, so that a "merge down" was naturally followed by a "copy up" rather than requiring re-resolution of conflicts or a manual "copy" action.  Reparenting, while still a bit tricky, no longer resulted in quietly dropping changes that had happened on intermediate branches.

There were still a number of outstanding issues, though, which surfaced over several years of customer feedback:

1. The "closest path" heuristic used to choose bases could produce very poor merges since in many cases it didn't accurately reflect content differences between different revisions.

2. Merging renamed files required either manual adjustment of a branch spec or manual mirroring of renames in the target branch before content could be merged.

3. Conflicts between deleted and existing files were detected, but could not be resolved, causing them to re-surface on every integration unless adjustments were made to the branch spec.

4. Traversing the "tree" of integration credits could be extremely inefficient in cases where one branch had a very large number of child branches.

5. Truly "baseless" merges were not possible; in cases where no base could be found, the only option to proceed was to use an arbitrary base that could result in lost changes.

6. Once granted, credit could never be "undone", because the "closest path" approach could not account for the cumulative effect of later integrations on earlier ones.

To the extent possible, all of these issues were carefully recorded in the form of detailed test cases as they were reported by customers, so that we could maintain an accurate picture of the work to be done and measure our future progress against it.  As the picture grew, it became clear that small adjustments to the integration logic would not do the trick; furthermore, as we tried to develop a system whereby users would no longer need to manually manage views (what eventually became streams), we realized that refactoring workflows which required manual view management were no longer going to be feasible.

The v2 engine was last updated in 2008; after that point, development effort started going into an experimental "v3" engine.  The new engine was made available fairly early on in the form of an undocumented "-3" flag on "p4 integrate"; when customers had problems with the existing engine, we could test their case with the "-3" flag, and the results were generally promising.  We were also able to implement features within the new engine that had previously been out of reach, such as more robust rename support (this 2010 article describes one of the first steps toward merging renamed files).

The v3 engine remained under continuous development in its "undoc" status (with the exception of a premature changeover in 2011.1 that was quickly reversed by a patch in the same release) until the 2013.2 release, when it became the standard.  The development of the new engine included a large number of refinements based on customer feedback, culminating in yet another complete rewrite that was completed in 2012.1 (carefully retaining all of the functionality and test case compliance from the version that it replaced).  Internally we called this rewrite "v3.5" or "v4", but decided that it would be less confusing to have it quietly replace the old v3 than to add yet another number to the mix.  Nobody noticed (except insofar as a bunch of bugs vanished overnight), so it seems to have been the right call.

From 2011.1 (post-patch) to 2013.1, the new engine could be enabled globally via the undoc configurable setting dm.integ.engine=3, with the default setting of dm.integ.engine=2 continuing to use the old v2 engine from 2006.1.  When v3 became the default in 2013.2, we decided that we'd leave the old v2 engine available as a deprecated/undoc option for a while before completely removing it, although of course all of the old v2 bugs are still present if that option is in use.

That brings us to the reason for this blog post.  Over the past couple of months it's come to our attention that there are customers who are, for one reason or another, still using the long-obsolete v2 engine, while also waiting on those decade-old bugs to be fixed.  This has given rise to two requests:

1. Come up with a definite plan for removing the v2 engine entirely so that customers won't continue to have the impression that it's under development.

2. Publish a blog article explaining why we're doing this and what it means.

For the vast majority of our customers, this blog article has no relevance beyond historical interest, since unless you have specifically set the undoc dm.integ.engine configurable, every server from 2013.2 onward has been using the current ("v3") engine with the full benefit of all the latest functionality and fixes.  While we're not positive, we suspect that there may also be customers who are currently using dm.integ.engine=2 without a clear understanding of what exactly that means, and if so, hopefully this article has helped to clear it up to some extent and will encourage them to unset that option.
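For administrators who are unsure whether the option is set on their server, checking for it and removing it might look roughly like the sketch below.  It assumes that the standard "p4 configure show" and "p4 configure unset" subcommands accept this undoc configurable like any other, and that a "p4" binary with super-user access is on the PATH; the helper names are hypothetical and exist only for illustration.

```python
import subprocess
from typing import List

# The undoc configurable discussed in this article.
CONFIGURABLE = "dm.integ.engine"


def show_argv() -> List[str]:
    """Command line that reports the configurable's current value, if set."""
    return ["p4", "configure", "show", CONFIGURABLE]


def unset_argv() -> List[str]:
    """Command line that removes the setting so the server default applies."""
    return ["p4", "configure", "unset", CONFIGURABLE]


def run(argv: List[str]) -> str:
    """Run a p4 command and return its raw standard output (not parsed here)."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


# Typical use (assumption: run by a super user against the right server):
#   print(run(show_argv()))    # inspect the current value first
#   print(run(unset_argv()))   # then remove it once you are sure
```

The output of the "show" step is deliberately left unparsed, since its exact format is not something this sketch tries to guess.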

As for the plan for removing the v2 engine, our current target is the 2017.2 release, to allow some transition time for customers who need it.  The main thing that has been brought to our attention as possibly requiring transitioning is any script that might be written to parse individual messages in the output of "p4 resolve" -- part of fixing the deficiencies in the old integrate/resolve process included adding new functionality to the resolve command, which necessarily included new output and new flags.

As with almost any time we add new output, tools written with old versions of the API continue to receive the old output (preserving "bug-for-bug compatibility").  The command line client always receives the newest output by default where possible, but the -Zapi=N flag can be used to lock it to a particular version as if you were using an old API build, and this is something that we always recommend as standard practice to users who are writing scripts that might be sensitive to output changes.  For the particular case of new resolve types, -Zapi=69 (or lower) will do the trick, as well as restricting all other version-dependent messages to the 2011 timeframe.
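As a rough illustration of that practice, a script that shells out to the command line client could build every invocation through one small helper that pins the API level.  The helper name and its parameters are hypothetical; the only details taken from this article are the -Zapi=N flag and the value 69.

```python
from typing import List, Optional


def p4_argv(command: str, *args: str, api_level: Optional[int] = None) -> List[str]:
    """Build a p4 command line, optionally locked to a specific API level."""
    argv = ["p4"]
    if api_level is not None:
        # -Zapi is a global option, so it goes before the command name.
        argv.append(f"-Zapi={api_level}")
    argv.append(command)
    argv.extend(args)
    return argv


# A preview resolve whose messages stay in the pre-v3 format:
legacy_resolve = p4_argv("resolve", "-n", api_level=69)
```

Routing every call through a single helper means the pinned level can later be raised in one place, once the script has been updated to handle the newer output.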

There is also an (undoc) "-Ro" flag on "p4 integrate" that has the same effect without setting the API level.  As of 2016.1 there is a third option, which lets the -Ro flag effectively be set globally on all integrate commands: dm.integ.tweaks=16.  Note that, as with the other options that force the legacy resolve behavior, this will disable a large amount of merge functionality, although the new base selection logic will continue to function normally.
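The per-command and server-wide variants might be invoked along the lines sketched below.  The function names and the depot paths in the usage comment are hypothetical; the sketch also assumes that "-Ro" is passed like any other "p4 integrate" option and that the configurable is set through the usual "p4 configure set" subcommand.

```python
from typing import List


def legacy_integrate_argv(source: str, target: str) -> List[str]:
    """Integrate with the undoc -Ro flag, for a single command only."""
    return ["p4", "integrate", "-Ro", source, target]


def legacy_global_argv() -> List[str]:
    """Server-wide equivalent (2016.1 and later): every integrate acts as if -Ro were given."""
    return ["p4", "configure", "set", "dm.integ.tweaks=16"]


# Example with hypothetical depot paths:
#   legacy_integrate_argv("//depot/main/...", "//depot/rel/...")
```

The per-command form keeps the legacy behavior confined to the tools that need it, while the configurable affects every user of the server, so the narrower option is probably the safer starting point.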

As an intermediate step, in 2017.1 we plan to automatically transition sites with dm.integ.engine=2 to the new dm.integ.tweaks=16 setting, on the theory that this will preserve the old-style resolve output that those sites may be dependent on.  If you are one of those sites, I'd first encourage you to try to update your tools to handle the new output, or to specifically disable it for those tools only, but failing that, try the dm.integ.tweaks=16 backward compatibility mode now (this is why we've made it available a year early) before upgrades start forcing the issue for you; if you encounter any problems with that mode, we'll have that much more time to address them.

I hope this trip down memory lane has been as entertaining for everyone to read as it was for me to write.  In the near future I plan to start writing a series of articles discussing the current integration engine in some detail.  Keep watching this space!