March 20, 2014

When Should You Avoid RCS Archives for Text in Perforce?

Storing text files in Perforce

In Perforce, the default storage format for text files is a Revision Control System (RCS) archive, shown in the file history as type text. An RCS archive is an efficient way to store text documents that change only slightly between revisions, which is typical for the vast majority of source code files.

Alternatively, the Perforce Server can store each revision of a text file individually, either as a full file (text+F, also known as ltext for historical reasons) or as a compressed file (text+C, or ctext).

In some situations the RCS archive is not the optimal choice, and an alternative storage type should be used:

Large text files

As file size increases, saving and loading a large RCS archive becomes inefficient, because the Perforce Server needs to load the whole archive into memory before it can add or extract a revision. Before Perforce release 2009.1, RCS archives were limited to 1.5 GB to avoid memory problems; this limit has since been lifted now that 64-bit servers are commonplace.

Nowadays the Perforce Server automatically stores large text files as compressed single files if the file size exceeds a configurable limit, filetype.maxtextsize, which defaults to 10M.
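The threshold rule can be pictured with a few lines of Python. This is only an illustration of the decision described above, not server code: the function name storage_for_text is hypothetical, and reading "10M" as 10 x 1024 x 1024 bytes is an assumption.

```python
# Illustrative only: mimics the size-based decision described in the text.
DEFAULT_MAX_TEXT_SIZE = 10 * 1024 * 1024  # assumption: "10M" read as 10 MiB


def storage_for_text(size_bytes: int,
                     max_text_size: int = DEFAULT_MAX_TEXT_SIZE) -> str:
    """Return the storage the article describes for a text file of this size."""
    if size_bytes > max_text_size:
        # Files above the limit are kept as individual compressed revisions.
        return "compressed single file"
    # Everything else stays in the default RCS archive.
    return "RCS archive"


if __name__ == "__main__":
    for size in (4_096, 50 * 1024 * 1024):
        print(size, "->", storage_for_text(size))
```

Passing a different max_text_size stands in for an administrator changing the filetype.maxtextsize configurable.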

Many revisions

If you store content such as logs in Perforce and overwrite each revision with a completely new copy that has little in common with previous versions, RCS archives are an inefficient way to store the data. Compressed individual files are usually a better choice for these kinds of files. We have seen cases where users created thousands of revisions of such log files, resulting in very large RCS archives on the server.
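To see why rewritten files suit delta storage so poorly, here is a rough Python sketch that compares the cost of a line-based delta with the cost of a compressed full copy. It does not reproduce the real RCS format or the server's compression: difflib and zlib are stand-ins, the 8-byte overhead per edit command is an assumption, and the synthetic log generator is purely hypothetical.

```python
import difflib
import random
import zlib


def delta_size(old: str, new: str) -> int:
    """Approximate bytes needed to store `new` as a line-based delta from `old`."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    size = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged lines cost nothing in a delta
        size += 8  # assumption: a few bytes for the edit command itself
        size += sum(len(line.encode()) for line in new_lines[j1:j2])
    return size


def compressed_size(text: str) -> int:
    """Bytes needed to store the revision as its own compressed file."""
    return len(zlib.compress(text.encode()))


def make_log(seed: int, lines: int = 500) -> str:
    """Generate a synthetic log; different seeds share essentially no lines."""
    rng = random.Random(seed)
    return "".join(
        f"{rng.random():.12f} build step {rng.randrange(10**6)}\n"
        for _ in range(lines)
    )


if __name__ == "__main__":
    base = make_log(1)
    lines = base.splitlines(keepends=True)
    lines[250] = "changed line\n"  # a typical small source-style edit
    edited = "".join(lines)
    rewritten = make_log(2)  # a log overwritten with unrelated content

    print("small edit: delta", delta_size(base, edited),
          "vs compressed", compressed_size(edited))
    print("rewrite:    delta", delta_size(base, rewritten),
          "vs compressed", compressed_size(rewritten))
```

For the small edit the delta is tiny compared with a compressed full copy. For the rewritten log the delta has to carry every line uncompressed, so the compressed individual file wins.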

Remember that you can change the type of a file on a revision-by-revision basis, even if the file was first stored as an RCS archive.
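One way to do this is to open the file for edit with a new type and submit it. The following is a minimal sketch, assuming the p4 command-line client is installed and the workspace is already configured; the helper name retype and the depot path are hypothetical.

```python
from __future__ import annotations

import subprocess


def retype(depot_path: str, new_type: str = "text+C",
           dry_run: bool = True) -> list[str]:
    """Open a file for edit with a new file type and return the command used.

    With dry_run=True (the default) nothing is executed, so the command
    can be inspected first.
    """
    cmd = ["p4", "edit", "-t", new_type, depot_path]
    if not dry_run:
        # Requires a working p4 client; the change still has to be submitted.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical depot path, shown as a dry run.
    print(" ".join(retype("//depot/logs/build.log")))
```

The new type takes effect with the revision you submit after running the command.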

Temporary revisions

Perforce can purge old file revisions automatically, using the +S and +Sn type modifiers, to save space on the server. This is useful for automatically generated binary files where only the last few revisions in the history will ever be accessed.

Creating a file with type text+S, on the other hand, is counterproductive. Normal text files are already stored efficiently in an RCS archive. Text files that change completely between revisions (such as log files) and that need to be purged automatically should be stored as text+FS or text+CS instead.
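The retention rule itself is simple: +S keeps the content of the head revision only, while a numeric suffix such as +S3 keeps the three most recent revisions. A small Python sketch of that rule follows; the function name retained_revisions is hypothetical, and the sketch models only which revisions keep their content, not how the server purges them.

```python
from __future__ import annotations


def retained_revisions(revisions: list[int], keep: int = 1) -> list[int]:
    """Return the revision numbers whose content is still stored.

    keep=1 models +S (head revision only); keep=n models +Sn.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    return sorted(revisions)[-keep:]


if __name__ == "__main__":
    history = list(range(1, 11))  # revisions #1 to #10
    print("+S  keeps", retained_revisions(history))
    print("+S3 keeps", retained_revisions(history, keep=3))
```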

Conclusion

Choosing the correct storage type for file revisions can be important for the performance of a Perforce Server. Think about the size of the file and the nature of changes to this file when you make a decision on the storage type. Remember that your administrator can use p4 typemap to define the default storage type for your files if you do not want to specify the file type manually for each file.
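As a rough illustration of what a typemap looks like, the Python sketch below renders a typemap form from a list of (file type, depot pattern) pairs. The helper name render_typemap and the depot patterns are hypothetical; pick types and patterns that match your own depot layout.

```python
from __future__ import annotations


def render_typemap(entries: list[tuple[str, str]]) -> str:
    """Render a typemap form: one tab-indented 'filetype pattern' line per entry."""
    lines = ["TypeMap:"]
    lines += [f"\t{filetype} {pattern}" for filetype, pattern in entries]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical mappings: log files anywhere under //depot are stored as
    # compressed individual files; logs under //depot/tmp are also purged.
    form = render_typemap([
        ("text+C", "//depot/....log"),
        ("text+CS", "//depot/tmp/....log"),
    ])
    print(form, end="")
```

An administrator could feed such a form to p4 typemap -i. Because the submitted form replaces the existing table, a real script would start from the output of p4 typemap -o and add to it.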

Speak to Perforce technical support if you need any more details or if you want help to improve the performance of your Perforce Server.

Happy hacking.