ClearCase to Perforce Conversion Guide
This guide consists of two parts:
- a description of the conceptual differences between ClearCase and Perforce
- a guide for transferring information from ClearCase to Perforce
Conceptual Differences
Atomic changes
SUMMARY:
- Perforce supports atomic change transactions; ClearCase doesn't.
Perforce allows the grouping of a number of add, delete, edit and
branch operations as one atomic change. The files the user is
working on are shown with p4 opened. When the user
runs p4 submit, these files (or some user-determined
subset) are submitted as one change.
Thus, if a feature or bugfix is to be added, either all of it will appear or none of it will. Other users can never see an inconsistent state.
ClearCase views versus Perforce clients
SUMMARY:
- ClearCase modifies filesystem semantics; Perforce doesn't.
- ClearCase requires network availability to access files; Perforce doesn't.
- Perforce requires disk space devoted to each client; ClearCase doesn't.
ClearCase users specify "views" which define which files and which versions of files are visible along specific paths. This is termed "transparency", since the user can access files with normal pathnames (although filesystem operations such as mv and rm are not transparent, and there is administrative overhead required to set up VOBs and filesystem mounts). Views are dynamic; new data added to the central repository is visible immediately.
Dynamic views require the interception of all file system calls in order to compute what the user should see. This slows down all file system accesses, even those unrelated to ClearCase, and imposes an added load on the network. It also requires constant network availability. (There is an additional ClearCase product to allow disconnected operation).
Perforce users typically each have their own separate "client workspace" which holds all files of interest to them. The files which appear on the client are determined by the "client view" which the user specifies. The default client view maps all depot files to files with the same name in the client workspace. Specifying a mapping such as
//depot/projecta/... //localclient/projecta/...
will result in only files in "projecta" appearing on the client
"localclient". There is no artificial limit on the number of
mapping lines a view can have. Mappings also allow translation of
names between the depot and the client workspace. For example,
//depot/a/b/c/internationalization/... //localclient/i18n/...
maps a smaller section of the depot the user is interested in and
names it something manageable.
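The way a client view translates names can be sketched in a few lines of Python. This is only an illustration, not Perforce's implementation: the function name `map_depot_to_client` is hypothetical, only a trailing `...` wildcard is handled, and the rule that the last matching line wins is an assumption of this sketch.

```python
def map_depot_to_client(view, depot_path):
    """Return the client-syntax path for depot_path, or None if unmapped.

    view is a list of (depot_pattern, client_pattern) pairs. This sketch
    only understands patterns that end in the "..." wildcard.
    """
    result = None
    for depot_pattern, client_pattern in view:
        if not (depot_pattern.endswith("...") and client_pattern.endswith("...")):
            raise ValueError("this sketch only handles a trailing '...'")
        depot_prefix = depot_pattern[:-3]
        client_prefix = client_pattern[:-3]
        if depot_path.startswith(depot_prefix):
            # Assumption of this sketch: when several lines match,
            # the last matching line wins.
            result = client_prefix + depot_path[len(depot_prefix):]
    return result


# The two mapping lines used as examples in the text above.
view = [
    ("//depot/projecta/...", "//localclient/projecta/..."),
    ("//depot/a/b/c/internationalization/...", "//localclient/i18n/..."),
]

print(map_depot_to_client(view, "//depot/projecta/src/main.c"))
print(map_depot_to_client(view, "//depot/a/b/c/internationalization/fr/msgs.txt"))
print(map_depot_to_client(view, "//depot/projectb/readme"))  # not in the view
```

A file outside every mapping line yields `None`, which corresponds to the file simply not appearing on the client.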
Perforce clients are not modified except by explicit user action.
Changes to depot files which are mapped to a client view will not be
updated in that client view until the user runs p4 sync on that
client.
Users specify which versions of files they want using the p4 sync
command. By default p4 sync will synchronize the client workspace
to the latest revision of files in the depot (including deleting those
whose latest version is "delete"). Users can specify specific files
and specific revisions of those files, however. For example:
| Command | Result |
|---|---|
| p4 sync @42 | synchronize client to state at change 42 |
| p4 sync @awesome | synchronize client to list of files and revisions in label "awesome" |
| p4 sync foo#2 bar@15 | update foo to reflect contents of revision 2, update bar to reflect contents of bar as of change number 15 |
| p4 sync junk/...#none | remove all files in junk directory on client without affecting the depot |
Note that files which are opened on the client (about to be added,
deleted, edited or merged into) are never overwritten by p4
sync.
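The revision syntax used in these examples can be taken apart mechanically. The following Python sketch is illustrative only: `parse_revspec` is a hypothetical helper, and it assumes that a numeric name after `@` is a change number and any other name is a label, which covers the examples above but not every form Perforce accepts.

```python
import re


def parse_revspec(arg):
    """Split 'file#rev' or 'file@name' into (path, kind, value).

    kind is one of "head", "none", "revision", "change" or "label".
    """
    match = re.fullmatch(r"([^#@]*)(?:([#@])(.+))?", arg)
    if match is None:
        raise ValueError("cannot parse %r" % arg)
    path, sigil, value = match.groups()
    if sigil is None:
        return (path, "head", None)        # no specifier: latest revision
    if sigil == "#":
        if value == "none":
            return (path, "none", None)    # remove from the client
        return (path, "revision", int(value))
    # Assumption of this sketch: digits after "@" mean a change number,
    # anything else is treated as a label name.
    if value.isdigit():
        return (path, "change", int(value))
    return (path, "label", value)


for arg in ["@42", "@awesome", "foo#2", "bar@15", "junk/...#none"]:
    print(arg, "->", parse_revspec(arg))
```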
Perforce keeps track of the files and revisions on each client. All client workspaces must be disjoint; otherwise Perforce's records will be inaccurate. The way to share information is through the depot, not by "sharing" local client files.
Special naming conventions
SUMMARY:
- ClearCase uses "extended pathnames" to access specific revisions.
- Perforce allows the specification of files using either "client syntax" or "depot syntax".
- Both systems disallow some filenames.
ClearCase provides an "extended namespace" to allow viewing of
files other than those defined by the current view. The symbol
used to denote extended names (by default @@) is not allowed
at the end of filenames.
In Perforce, files may be specified in either the "client" syntax or the "depot" syntax. The former is the name used to refer to the file on the local operating system (e.g. /usr/jane/foo or ../foo or c:\dick\foo); the latter is the canonical name of the file stored in the depot (e.g. //depot/foo). Note that the client view determines the mapping between the two namespaces. Any Perforce command which takes a file argument can take either client or depot syntax.
Perforce also defines two wildcards: * (matches any string of characters within a single directory level, i.e. not containing "/") and ... (matches any string, including "/").
The latter is most often used to specify whole directory trees. For example
p4 files //depot/main/... will list all files whose names start with
//depot/main/ and hence are in the //depot/main directory. The command p4 files //.../blast.c
will list all files ending with "/blast.c" and hence will show which directories
contain the files "blast.c". The command p4 files //...blast.c
will also list things such as "//depot/main/whatablast.c".
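The difference between these patterns is easy to check by translating the wildcards into a regular expression. The Python sketch below is an illustration under that reading of the wildcards; `p4_pattern_to_regex` and `matches` are hypothetical names, and no Perforce server is involved.

```python
import re


def p4_pattern_to_regex(pattern):
    """Translate '...' and '*' into an equivalent compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")        # any string, including "/"
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")     # any string without "/"
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches(pattern, depot_path):
    """True if the whole depot_path matches the wildcard pattern."""
    return p4_pattern_to_regex(pattern).fullmatch(depot_path) is not None


files = [
    "//depot/main/blast.c",
    "//depot/main/whatablast.c",
    "//depot/v3/src/blast.c",
]
for pattern in ["//depot/main/...", "//.../blast.c", "//...blast.c"]:
    print(pattern, [f for f in files if matches(pattern, f)])
```

Only the last pattern picks up "whatablast.c", because nothing forces a "/" immediately before "blast.c".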
Perforce users can use the p4 print command to quickly view the
contents of a file without having to map the file to the client
workspace. For example, p4 print -q //depot/foo#2 will print revision
2 of the file "//depot/foo" without a header. p4 print //depot/projectb/...
will print all files in "//depot/projectb" with a header line for each file.
Wildcards and the symbols used for specifying revisions (#) or change numbers or labels (@) are not allowed in filenames in Perforce.
Versioned directories versus separate files
SUMMARY:
- ClearCase "versions" directories; Perforce doesn't.
- Both systems support the ability to see files in their correct locations at any point in time.
In ClearCase adding a file requires checking out the directory first.
Files are moved with the "ct mv" command. The version of the directory
selected by the view determines where the file will exist. The movement
of a file is not immediately obvious from user output (for example
cleartool lshistory shows the file as existing in its current location
for all revisions, even if it was moved).
In Perforce, files are uniquely specified by a path in depot syntax.
Files cannot be moved as such; instead they are copied and deleted by
running p4 integrate from to, then p4 delete from, and at some point
p4 submit. The p4 sync command will ensure that files which have been
deleted in the depot are deleted on the client. File histories in
Perforce clearly show which versions of a file have been integrated
into another file. An example may make this clear:
# echo 'hi' >hi
# p4 add hi
//depot/a/hi#1 - opened for add
# p4 submit
Change 1 created with 1 open file(s).
Submitting change 1.
Locking 1 files ...
add //depot/a/hi#1
Change 1 submitted.
# p4 edit hi
//depot/a/hi#1 - opened for edit
# echo 'Hi!' >hi
# p4 submit
Change 2 created with 1 open file(s).
Submitting change 2.
Locking 1 files ...
edit //depot/a/hi#2
Change 2 submitted.
# p4 integ hi ../b/hi
//depot/b/hi#1 - branch/sync from //depot/a/hi#1,#2
# p4 delete hi
//depot/a/hi#2 - opened for delete
# p4 submit
Change 3 created with 2 open file(s).
Submitting change 3.
Locking 2 files ...
delete //depot/a/hi#3
branch //depot/b/hi#1
Change 3 submitted.
# p4 changes
Change 3 on 1998/02/15 by fish@shark 'moved hi from a to b '
Change 2 on 1998/02/15 by fish@shark 'More emphatic greeting! '
Change 1 on 1998/02/15 by fish@shark 'Greetings '
# p4 describe 3
Change 3 by fish@shark on 1998/02/15 12:09:23
moved hi from a to b
Affected files ...
... //depot/a/hi#3 delete
... //depot/b/hi#1 branch
Differences ...
# p4 filelog //depot/b/hi
//depot/b/hi
... #1 change 3 branch on 1998/02/15 by fish@shark 'moved hi from a to b '
... ... branch from //depot/a/hi#1,#2
# p4 filelog //depot/a/hi
//depot/a/hi
... #3 change 3 delete on 1998/02/15 by fish@shark 'moved hi from a to b '
... #2 change 2 edit on 1998/02/15 by fish@shark 'More emphatic greeting! '
... ... branch into //depot/b/hi#1
... #1 change 1 add on 1998/02/15 by fish@shark 'Greetings '
# p4 sync @2
//depot/a/hi#2 - added as /home/fish/a/hi
//depot/b/hi#1 - deleted as /home/fish/b/hi
# p4 sync
//depot/a/hi#2 - deleted as /home/fish/a/hi
//depot/b/hi#1 - added as /home/fish/b/hi
Branching
SUMMARY:
- ClearCase associates a revision tree with every file and directory; Perforce associates a simple linear list of revisions for every file.
- ClearCase uses merge "hyperlinks" to store merge information; Perforce stores "integration records".
- ClearCase creates branches for you in a piecemeal fashion; in Perforce the branching is done by the user once, up front.
Branches in ClearCase follow the traditional model of having a tree of versions for every file. As well, ClearCase "versions" directories, so there is a tree of versions for every directory. Implicit creation of branches is encouraged through view configuration specifications. VOB-extended pathnames include version information which specifies the revision of each element along the path. For example,
/vobs/ dbtools/ .@@/ main/ spm_bldmdl/ 1/ dbcore/ main/ spm_bldmdl/ 8/ factory/ main/ 2/ DB/ main/ 1/ ctbasic/ main/ CHECKEDOUT.41495/ mmframei.cpp@@/ main/ 1
(The example has spaces after slashes so that it doesn't force your browser to display obnoxiously wide paragraphs!)
Careful examination of this path reveals that the file is really on the "spm_bldmdl" branch. Files which exist along paths which have not been branched may also be on this branch as well, though - the config spec is usually "if there's a branch, follow it, otherwise give the latest version on main".
ClearCase advocates a model in which the main line is used for releases, not for development. Development requires creating a branch, then merging code back into the main line.
ClearCase uses merge "hyperlinks" to denote which versions have been merged between branches.
Perforce uses a simple integer to denote a version of a file. Branching is done by simply copying the file to a file with another name. Perforce does this quickly and efficiently, by doing a "lazy copy"; the copy doesn't actually exist in the depot initially, rather the metadata stores the relationship between the copied file and its source.
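The idea of a "lazy copy" can be pictured with a toy model. The Python sketch below is purely illustrative and says nothing about how the Perforce server actually stores its data; the `ToyDepot` class and all of its fields are hypothetical. It only shows that a branch can be recorded as a pointer to its source revision, with contents stored again only when the branched file is changed.

```python
class ToyDepot:
    """Toy model: linear integer revisions per file, lazy branch copies."""

    def __init__(self):
        self.contents = {}  # (path, rev) -> stored text
        self.lazy = {}      # (path, rev) -> (source_path, source_rev)
        self.head = {}      # path -> latest revision number

    def _next_rev(self, path):
        rev = self.head.get(path, 0) + 1
        self.head[path] = rev
        return rev

    def submit(self, path, text):
        """Store new contents as the next revision of path."""
        rev = self._next_rev(path)
        self.contents[(path, rev)] = text
        return rev

    def branch(self, source, target):
        """Record target as a copy of source's head without copying text."""
        source_rev = self.head[source]
        rev = self._next_rev(target)
        self.lazy[(target, rev)] = (source, source_rev)
        return rev

    def read(self, path, rev):
        """Follow lazy-copy pointers until real contents are found."""
        key = (path, rev)
        while key in self.lazy:
            key = self.lazy[key]
        return self.contents[key]


depot = ToyDepot()
depot.submit("//depot/main/hi", "hi")
depot.submit("//depot/main/hi", "Hi!")
depot.branch("//depot/main/hi", "//depot/v3/hi")
print(depot.read("//depot/v3/hi", 1))   # contents come from main, revision 2
print(len(depot.contents))              # still only two stored texts
```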
The naming convention is up to the user, but one typical
convention is to have the first pathname component after "//depot"
denote the branch name. For example, if the main line of development
is in //depot/main and it is time to create a branch to release
version 3, the user would simply type
p4 integ //depot/main/... //depot/v3/.... Every file in the directory
//depot/main will then have a corresponding file in //depot/v3.
Users working on bug fixes for the release would simply look at files in //depot/v3,
those continuing main development would work on files in //depot/main.
Bugfixes may be integrated from v3 back into main, and new development may be
integrated from main into v3.
Note that the model Perforce advocates is different from the one ClearCase advocates. Perforce suggests calling your main line of development "main" and creating branches (copies) for each release. There is no need to use labels.
The output of p4 filelog in the example given above for moving a file
illustrates the "integration records" which are kept.
For more information on branching see the papers on Inter-File Branching and Software Life-Cycle Modelling.
Labels
SUMMARY:
- Use of labels is important in ClearCase; in Perforce labels are not usually needed.
- Both systems support labels.
Both systems support the notion of an object which identifies a
specific set of files and revisions. In Perforce this is stored as a
list of (file#revision) entries, and is updated with the p4 labelsync
command. Note that labels are often used to simply denote a snapshot
in time for a particular easily-specified set of files; this is
inherently easy to do in Perforce without using a label, due to
Perforce's use of atomic change transactions and file naming syntax.
For example, the state of all the files in //depot/projecta as of
change 42 can be obtained with
p4 sync //depot/projecta/...@42
Symbolic links
SUMMARY:
- Perforce does not support symbolic links on Windows/NT; ClearCase does.
- Perforce maps depot files to client files as a 1:1 mapping.
ClearCase users may encounter problems moving to Perforce because of the different way Perforce handles symbolic links. Since ClearCase intercepts file system calls it can implement symbolic links with UNIX semantics on Windows/NT. Perforce uses the native operating system, so it cannot do this.
It is also common for ClearCase users to use symbolic links to point to common code. These symbolic links can be transferred to Perforce successfully if they are added as type "symlink", and the client mapping does not render the symlink pointing to an incorrect location.
If it is necessary for the files to appear whole rather than as symbolic links, the best solution is to use Perforce's branching mechanism. This is not transparent, but in most cases common files should be "released" to the rest of the company anyway.
For example, if all files in //depot/main/common are supposed to appear in //depot/main/projecta and //depot/main/projectb, then when a change is made to a common file the following commands should be run:
p4 integ //depot/main/common/... //depot/main/projecta/...
p4 integ //depot/main/common/... //depot/main/projectb/...
p4 resolve -am
p4 submit
It is recommended that this be put in a script.
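One possible shape for such a script is sketched below in Python. It simply issues the same four commands shown above; the function names are hypothetical, and the script assumes that p4 is on the PATH and that P4PORT and P4CLIENT are already set. Defining the script only prints the commands; nothing is run against a server until `run_all` is called.

```python
import subprocess


def common_integration_commands(common, targets):
    """Build the p4 command lines that propagate common into each target."""
    commands = [["p4", "integ", common, target] for target in targets]
    commands.append(["p4", "resolve", "-am"])
    commands.append(["p4", "submit"])
    return commands


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


COMMANDS = common_integration_commands(
    "//depot/main/common/...",
    ["//depot/main/projecta/...", "//depot/main/projectb/..."],
)

for command in COMMANDS:
    print(" ".join(command))

# To execute against a real server, call: run_all(COMMANDS)
```

Keeping the command list separate from its execution makes it easy to review what will be run, and to add further projects by extending the target list.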
Multiple VOBs, "ClearCase Multisite" product, "ClearMake" and wink-in
SUMMARY:
- Many ClearCase concepts do not apply.
All of these concepts are irrelevant to Perforce. Everything in
Perforce is typically stored in one depot handled by one server
(although Perforce does support read-only "remote depots"). Builds are a
separate issue; in Perforce it is simply a matter of doing the
appropriate p4 sync command and running your favourite build tool.
Perforce hosts a freely available build tool called Jam which has
advantages over traditional make utilities, but Jam is independent
of Perforce.
Transferring information from ClearCase to Perforce
Difficulty with complete conversion utility
Preservation of existing data would be simple if there were a conversion utility which transferred all data in a consistent manner, taking into account the different concepts used by ClearCase and Perforce. Unfortunately, ClearCase's complex versioning and branching of directories and its poor event record output make such a converter difficult.
Furthermore, it seems likely that such a utility would be of little practical value because of the speed at which it would run. A prototype converter which did not handle directory versioning ran at the blistering rate of 8 revisions per minute, or about one to two weeks to convert a mid-sized site. The main limitation seems to be the time it takes to extract revisions from ClearCase.
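The arithmetic behind that estimate is easy to reproduce. The Python sketch below assumes the converter runs around the clock at a constant 8 revisions per minute; under that assumption, one to two weeks corresponds to roughly 80,000 to 160,000 revisions. The function names are illustrative.

```python
def revisions_in(days, revisions_per_minute=8):
    """Revisions converted in the given number of days of continuous running."""
    return int(days * 24 * 60 * revisions_per_minute)


def conversion_days(revisions, revisions_per_minute=8):
    """Days of continuous running needed to convert the given revisions."""
    minutes = revisions / revisions_per_minute
    return minutes / (60 * 24)


print(revisions_in(7))            # one week
print(revisions_in(14))           # two weeks
print(conversion_days(100_000))   # a site with 100,000 revisions
```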
The simple snapshot approach
The other extreme to preserving everything is to simply start using Perforce with a snapshot of the existing data. Here are the steps required:
- Decide on the desired organization of files, including branching issues. Recall that branches are simply different files in Perforce; hence branch names are part of the file namespace. Typically, files should be branched as a group at the highest level in the tree which is in common. Typically, as well, there is a "main" development line which is branched into all others. Thus, you could have //depot/main contain all files, with each branch x being under //depot/x. Or you could have //depot/projecta/main branched into //depot/projecta/v3.2 and //depot/projecta/v3.3, plus //depot/projectb/main branched into //depot/projectb/v2.0, and so on.
- Decide on a location for the Perforce server: choose a disk partition with lots of space, and preferably run the server on a machine which is not running ClearCase (for performance reasons). Then, most likely as root, run p4d -r /usr/perforce. If you want to choose a port other than the default, specify it with -p
- Set your login script to set P4PORT to the machine:port# the server is listening on, and set P4CLIENT to a reasonable client name (e.g. your user name)
- Decide on a location for your client, say /home/me/work, make that your current working directory and run p4 client, accepting the default.
- Copy all desired files from ClearCase to /home/me/work, doing whatever reorganization is desired. If you are copying the "main line" of development, for example, you would likely run, in order: mkdir main, then cleartool setview (as appropriate), then cp -R /vobs main
- Add all of these files to Perforce with find . -type f -print | p4 -x - add followed by p4 submit

Now the Perforce depot contains all the files, submitted as one change.
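The "add everything" step above can also be scripted. The Python sketch below builds the same kind of file list that find . -type f -print produces and formats it for p4 -x - add, which reads one file name per line from standard input. The function names are hypothetical, and the p4 invocation is shown only as a comment, on the assumption that p4 is on the PATH and the client is already set up.

```python
import os


def list_files(root="."):
    """Regular files under root, roughly what `find root -type f` prints."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            # Skip symbolic links, as `find -type f` does.
            if os.path.isfile(full) and not os.path.islink(full):
                paths.append(full)
    return paths


def add_input(paths):
    """Text to feed to `p4 -x - add`: one path per line."""
    return "".join(path + "\n" for path in paths)


# Possible usage against a real client workspace:
#   import subprocess
#   subprocess.run(["p4", "-x", "-", "add"],
#                  input=add_input(list_files(".")), text=True, check=True)
```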
Note that the simple snapshot approach does not result in any merge information; future integrations will at first be a two-way merge. This problem can be minimized by merging as many branches as possible in ClearCase before taking the snapshot.
The middle ground
The first merge after the "simple snapshot approach" will be difficult. To avoid this problem, one could essentially do multiple snapshots so that the merge relationship is correct (well, mostly correct). Here is the procedure:
- Determine the oldest common ancestor of all branches you will be importing. Perform the simple snapshot approach as above, putting the code in //depot/main.
- For each branch x do p4 integ //depot/main/... //depot/x/...
- Run p4 submit. Now the Perforce depot will contain a copy of the snapshot for all branches. Perforce now knows that each branch was copied from the particular version that exists in //depot/main
- Delete all files on the client with rm -rf. This is *not* something you would do normally when using Perforce!
- Copy the latest version of each file from ClearCase to the appropriate location.
- Follow the instructions in Perforce tech note 2 to construct a changelist which will add, delete, and edit the appropriate files, then run p4 submit.
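Deciding which files to add, edit, or delete in that last changelist amounts to comparing two sets of names. The Python sketch below shows one way to do the comparison; it is not taken from the tech note, and `plan_changelist` is a hypothetical helper that works on plain lists of paths relative to the branch root.

```python
def plan_changelist(depot_files, snapshot_files):
    """Classify files by comparing the depot's list with the new snapshot.

    depot_files:    paths Perforce already knows about in the branch
    snapshot_files: paths present in the files copied from ClearCase
    """
    depot_files = set(depot_files)
    snapshot_files = set(snapshot_files)
    return {
        "add": sorted(snapshot_files - depot_files),     # new in snapshot
        "edit": sorted(snapshot_files & depot_files),    # present in both
        "delete": sorted(depot_files - snapshot_files),  # gone from snapshot
    }


plan = plan_changelist(
    depot_files=["a.c", "b.c", "old.h"],
    snapshot_files=["a.c", "b.c", "new.h"],
)
for action in ("add", "edit", "delete"):
    print(action, plan[action])
```

Each list can then be written to a file and passed to the matching command with p4 -x, in the same way as the add step of the simple snapshot approach.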
A simpler alternative to the above is to replace the first step with just getting the latest version of the main line. This will result in merge information which is definitely wrong, but it is preferable to the simple snapshot approach which would require a two-way merge.
Another alternative is to abandon the notion of keeping integration records and simply transfer snapshots based on time (every day or every week) or labels. The procedure for each snapshot would be:
- Set the ClearCase view according to the criteria (label or date).
- Create a list of all files called "filelist" and add each file to Perforce with p4 -x filelist add.
- Run p4 submit with a change description saying "snapshot as of (date/label)"
- If it's a label, run p4 label and p4 labelsync to create the same label in Perforce
- Run p4 -x filelist delete to mark all files as deleted, then run p4 submit OR modify the "add files to Perforce" step to add the file if it doesn't exist or edit it if it does.
This approach eliminates the filename mapping problem, and may solve the performance problem because it transfers only some of the history. Anyone wishing to pursue this approach may wish to contact us; we will be glad to assist.
The phased approach
Due to the different branching model suggested by ClearCase and the need to become accustomed to a different system, it may be best to take a phased approach to conversion. The best approach would seem to be
- Take a simple snapshot of the main line only. Due to ClearCase's model this will probably currently be a branch called "version n" where n is the next version to be released.
- Switch the main line developers over to using Perforce. When it is time to release "version n", create a branch for version n.
- Code maintainers working on version n will now switch to Perforce. The process continues until all supported releases are in Perforce.
The advantage of this approach is that the Perforce history is very clean, the transition is more manageable, and experience with Perforce can be passed on from developer to developer.
The disadvantage, of course, is that the interim solution involves two CM systems, which may introduce short term complications.
Other conversion issues
Multiple names for one directory
ClearCase installations will typically have a myriad of NFS mounts
and symbolic links. This can result in the same directory having
multiple names. Perforce maps files between client syntax and
depot syntax based on the current working directory. To ensure
that Perforce uses the canonical name for the directory, p4 can be
aliased to p4 -d `pwd`.
Performance problems
ClearCase slows down all filesystem accesses. Thus it is best for the Perforce server to be run on a machine which does not have ClearCase installed. Users who will never need to access ClearCase should have ClearCase de-installed from their machines.