July 22, 2010

Not Enough Chefs In The Merge Kitchen

Recently someone asked me if there is any way to automatically save the merge output files with conflict markers while running p4 resolve. In this scenario, we're running a big integration, potentially merging hundreds of files at once. We rely on the automatic merge action (p4 resolve -am) where possible, but some of the files have conflicts. When we run into a merge conflict, we need to grab the person who knows that file and can determine how to manually address the conflict. Rather than having that person walk over to our desk, we'd like to send her the file with conflict markers. She can manually edit the file and send it back to us, and we'll use the edited file to finish the resolve.

After thinking for a few minutes, I remembered that the P4MERGE environment variable lets us solve this problem. P4MERGE, when set, provides an alternative three-way merge program for use by p4 resolve when the user selects the m (merge) resolve option. We can set that variable to point to a custom script, not just a regular diff/merge utility.

It's certainly easy to use P4MERGE and shell scripts to accomplish this goal, but the P4Python API lets us develop a more elegant solution. Specifically, we're going to use the P4.Resolver class. Using P4.Resolver in the P4Python API is even more powerful than setting the P4MERGE environment variable to a custom script. By providing a custom implementation of P4.Resolver, we can use custom logic for handling resolves. We could choose to accept the merged output file for text files, for instance, while choosing the accept theirs action for binary files. In this case, we'll provide a custom P4.Resolver implementation that archives the merged output file, then skips the resolve.
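To make the text-versus-binary idea concrete, here is a minimal sketch of that decision logic. The class name PolicyResolver and the NUL-byte check in looks_binary are assumptions made for illustration, not part of the original script. The class is written as a plain Python class so it runs without P4Python installed; in a real script it would subclass P4.Resolver, and the returned strings ("am" for accept merged, "at" for accept theirs) mirror the usual p4 resolve actions.

```python
def looks_binary(path, sample_size=8192):
    """Heuristic (an assumption of this sketch): a NUL byte in the first
    few kilobytes means the file is treated as binary."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(sample_size)


class PolicyResolver:  # in a real script: class PolicyResolver(P4.Resolver)
    """Accept the merged result for text files, 'theirs' for binary files."""

    def resolve(self, mergeData):
        # their_path is the temporary file holding the 'theirs' contents
        if looks_binary(mergeData.their_path):
            return "at"  # accept theirs
        return "am"  # accept merged
```

A stricter policy might also inspect mergeData.merge_hint and skip the file when the server reports conflicts.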

The example script is available in the Public Depot. The script tries to resolve any files in the workspace that still need resolving. The MyResolver implementation of P4.Resolver simply saves off the merged result file to a temporary directory, after giving it a better name. It then returns s, indicating that the resolve should be skipped. Since run_resolve calls the resolver once for each file, it gives us a one-stop way to archive any files that need resolving.

The script should be run from a client workspace that contains files that need to be resolved. (The script is run standalone, not invoked by setting P4MERGE and running the p4 resolve command.)

Here are the interesting snippets of the script:

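import P4
from os.path import basename
from shutil import copy

# To-do: provide a better archive directory than /tmp
class MyResolver(P4.Resolver):
    def resolve(self, mergeData):
        # Archive the merged result under the name of the local ("yours") file
        dst = "/tmp/" + basename(mergeData.your_name)
        copy(mergeData.result_path, dst)
        return "s"  # skip, leaving the file unresolved

p4 = P4.P4()
try:
    p4.connect()
    p4.run_resolve(resolver=MyResolver())
    p4.disconnect()
except P4.P4Exception as err:
    print(err)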
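One way to address the to-do about the archive directory is sketched below. It creates a fresh directory with tempfile.mkdtemp and flattens the whole file path into the archived name, so that two files with the same basename in different directories do not overwrite each other. The helper names and the naming scheme are hypothetical, and the sketch assumes forward-slash separated names.

```python
import os
import tempfile


def make_archive_dir(prefix="p4-merge-archive-"):
    """Create a unique directory to hold the archived merge results."""
    return tempfile.mkdtemp(prefix=prefix)


def archive_destination(archive_dir, your_name):
    """Flatten a slash-separated name into a single file name inside
    archive_dir, e.g. //depot/main/foo.c becomes depot_main_foo.c."""
    flat = your_name.strip("/").replace("/", "_")
    return os.path.join(archive_dir, flat)
```

Inside the resolver, the destination would then be computed with archive_destination(archive_dir, mergeData.your_name) instead of the hard-coded /tmp path.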

We could archive all four files involved in the merge by enhancing the MyResolver class, since it has access to all the elements in the P4.MergeData class.

Mechanically, then, here's what we'll do:

  • Run the integration command.
  • Automatically merge what we can with p4 resolve -am.
  • Run python savemerge.py. The files with conflict markers are archived.
  • Circulate the files with conflict markers, and obtain updated versions from the developers who understand those files.
  • Run p4 resolve again, and edit the merged file, replacing its contents with the updated version from the developer.
  • Choose the ae (accept edited) resolve action.

Automatically archiving the four files involved in the merge resolution process -- the yours, theirs, base, and merged output files -- may also be useful for other purposes. Particularly when we have conflicts, the conflict markers in the merged output file may be a key piece of that file's audit trail. Or, perhaps we'd like to submit files with merge conflicts to a code review tool.

Whatever the case may be, the P4MERGE environment variable and its equivalents in the APIs give us the flexibility to customize the resolve process for our needs. Script away!
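As a rough illustration of archiving all four files, the helper below copies the yours, theirs, base, and merged files into an archive directory, tagging each copy with its role. The function name archive_merge_files and the ".yours"/".theirs"/".base"/".merged" suffixes are assumptions of this sketch; the attributes it reads (your_name, your_path, their_path, base_path, result_path) are the ones P4.MergeData exposes. A resolver's resolve method could call it before returning "s".

```python
import os
import shutil


def archive_merge_files(mergeData, archive_dir):
    """Copy the yours, theirs, base, and merged files into archive_dir.

    Returns the list of paths written. A path that is None or not on
    disk is skipped rather than treated as an error.
    """
    stem = os.path.basename(mergeData.your_name)
    sources = {
        "yours": mergeData.your_path,
        "theirs": mergeData.their_path,
        "base": mergeData.base_path,
        "merged": mergeData.result_path,
    }
    written = []
    for label, path in sources.items():
        if path and os.path.exists(path):
            dst = os.path.join(archive_dir, stem + "." + label)
            shutil.copy(path, dst)
            written.append(dst)
    return written
```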