Build server strategies
With the advent of agile programming and continuous integration, the number of builds run in a company has increased considerably. This is great for testing and spotting bugs early, but it puts considerable strain on the Perforce server and can hamper its performance.
Performance is what a Perforce server is all about; therefore we need strategies to cope with the additional load from build servers.
Why is there an impact on performance?
Build servers tend to populate a fresh workspace with all files required for the build, which often means thousands of files. This affects the Perforce server in three ways:
- The server first needs to calculate which files need to be synced, locking several database tables in the process.
- The files need to be retrieved from the depots, which can lead to disk contention and file system cache thrashing.
- The db.have table needs to be updated after the files are synced, which locks that table.
All these problems can be solved easily. This will lead to a more responsive Perforce server and faster builds.
Use "sync -p" for build workspaces
The option "-p" was officially introduced in 2007.2 for build workspaces. From the release notes:
'p4 sync' now sports a '-p' option. This allows the user to sync files without the server keeping track of it. This option is very useful when populating build clients or when publishing content when there is no requirement for saving the client workspace state.
Using this option reduces the amount of metadata stored, and it indirectly also reduces the amount of metadata the server has to load when it calculates which files need to be synced.
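As a quick illustration, here is how a build script might populate a workspace with "-p" (the workspace name and changelist number are placeholders, not from any particular site):

```shell
# Populate the build workspace without the server recording
# the synced files in db.have:
p4 -c build-ws-linux sync -p //depot/main/...@12345

# A plain sync, by contrast, writes a db.have record for every
# file it delivers:
#   p4 -c build-ws-linux sync //depot/main/...@12345
```

Syncing to a fixed changelist number also makes the build reproducible, which is good practice regardless of the "-p" option.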
Use a Perforce Proxy for builds
This leaves us with the second problem: the files pulled from the depots. We cannot change the fact that we need to load the files, but we can change the location they come from. Enter the Perforce Proxy.
Most Perforce users and administrators are aware of the Perforce Proxy that improves the speed and responsiveness for users in remote offices, but (maybe surprisingly) there is also a case for a Perforce Proxy in the main office where your Perforce server resides. The archive files are then pulled off the drives of the proxy server rather than the main server. This not only reduces contention on the disks but also, in the case of large build farms, reduces network traffic to the Perforce server.
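A minimal sketch of this setup, with example host names, ports and paths (substitute your own):

```shell
# On a machine near the build farm, start a Perforce Proxy:
#   -p  port the proxy listens on
#   -t  address of the central Perforce server
#   -r  directory for the proxy's archive cache
#   -d  run as a daemon
p4p -p 1667 -t perforce-main:1666 -r /p4proxy/cache -d

# Point the build clients at the proxy instead of the main server:
export P4PORT=proxy-host:1667
p4 sync -p //depot/main/...
```

After the first build has warmed the cache, subsequent syncs of the same file revisions are served from the proxy's disks.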
Separate Perforce server for builds
Taking these ideas to their logical conclusion, one could also envisage a separate Perforce server dedicated to builds. The server would be created and updated via replication from the main Perforce server. The replication process could even filter out tables that are non-essential for builds, such as db.have. No officially supported replication engine with this capability exists (yet), but there are some scripts in the Perforce public depot which provide this functionality.
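To give a flavour of the idea (this is a rough sketch of the approach, not the public-depot scripts themselves, and all file names and paths are made up): the build replica is seeded from a checkpoint of the main server, and its journal is then replayed with the db.have records stripped out.

```shell
# Seed the build replica from a checkpoint of the main server:
p4d -r /p4build/root -jr checkpoint.1234

# Journal records name the table they touch, e.g. '@db.have@'.
# A crude filter can drop those records before replay; a real
# script would parse the journal properly, since record values
# may span multiple lines:
grep -v '@db.have@' journal.1234 > journal.1234.filtered
p4d -r /p4build/root -jr journal.1234.filtered
```

The build server then carries all the metadata needed to resolve file revisions, without ever paying the cost of maintaining db.have.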
Alternative: Trust Perforce
One could argue that there is actually no need for any of the above techniques if one were to trust Perforce to do the right thing. The db.have table is there for a purpose: to remember which file revisions are already in the workspace, so that a sync only transfers the files that have changed.
If the build scripts separate the build artefacts cleanly from the sources, there is really no reason to clear out and repopulate a build workspace for every build. Reusing the workspace instead reduces the strain on the Perforce server considerably. How many files actually change in the 10 minutes between two continuous integration builds?
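The reuse pattern is simply an ordinary incremental sync into a long-lived workspace, with the build output kept outside the client root (the names and paths below are illustrative only):

```shell
# Reuse the same workspace between builds; a plain sync is
# incremental, so only the revisions changed since the last
# build are transferred:
p4 -c build-ws-linux sync //depot/main/...@12350

# Keep build artefacts outside the client root so nothing
# needs to be cleaned out between builds, e.g.:
#   /builds/ws/    <- synced sources (Perforce-managed)
#   /builds/out/   <- build output (outside the client root)
```

Because the server consults db.have, a sync where little has changed is close to a no-op on both the network and the disks.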
There is a large and justifiable demand to run many builds a day, often for many products. With a little forward thinking and some of the ideas above put into practice, the Perforce server can handle these builds without sacrificing responsiveness for its users.
Why don't you have a talk with your build masters? If you are unsure about any of the strategies, you can always ask one of your favourite consultants to help you out - for a small fee :-)