November 16, 2009

Story of a Perforce Hero


I came across a rather perplexing issue while working with a customer.  When a Graphics Designer pointed P4V to a particular directory that was known to contain "many large files", it took "forever" to expand the folder structure.


"Many" was only about 900 files.  I'm not sure how many files P4V can show in a single directory, but it's some wicked big number, not a measly 1,000 files.  Several files were a few hundred MB, and a few were over 1 GB.  That's big for a versioned file, but not overly so.  I wondered why the size of the files would affect the time it took to display them in P4V at all.  I was stumped, so I called on Support.

Tim drops in to help solve my tech support problem ...

Tim O'Mahony came to the rescue.  Once given the facts, he speculated about what the problem might be, and he was spot on!  The "forever" it had been taking (20 to 45 minutes) was cut to less than 2 seconds.  The problem turned out to be that the metadata on the server didn't have MD5 checksums (aka "digests") calculated for all files.  Normally, the size of files doesn't affect the time it takes to show them in the GUI -- but it does if the MD5 checksums don't exist on the server.

Since 2003.2, Perforce has automatically calculated MD5 digests on the fly during submits, so this isn't a problem we see every day.  However, this loyal customer had been using Perforce for a long time and hadn't upgraded the server until recently.  The fix was a once-in-history operation: executing a series of commands to calculate MD5 digests for all their existing versioned files.

You Can Try This at Home!

If you're a Perforce administrator at a site where Perforce has been around for a long time, try this.  Keep running commands like this:

   p4 verify -u -m 1000 //each/depot/...

until Perforce happily reports "file(s) already have digests."  You can do this against a live, actively used server.  The '-m 1000' limits each run to calculating at most 1,000 MD5 digests, which keeps the performance impact from being noticeable to users.  Use larger values if you like.  I chose 1000 as a number that should be safe even if your server is a 10-year-old tower with the amber warning light glowing on the hard drive. :)
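
If you'd rather not babysit the command, the "keep running it until it says it's done" routine can be wrapped in a small shell loop.  What follows is only a minimal sketch, not an official Perforce script.  The function name compute_digests, the P4_CMD and PAUSE variables, and the safety cap on the number of passes are all made up for illustration.  It assumes the completion message contains the text "already have digests" as quoted above, and since I can't say for certain whether that message arrives on stdout or stderr, the sketch captures both.

```shell
#!/bin/sh
# Sketch: repeat 'p4 verify -u -m N' on one depot path until the server
# reports that no files are left without digests.
# P4_CMD is a hypothetical override so the loop can be dry-run against a stub.
P4_CMD="${P4_CMD:-p4}"

compute_digests() {
    depot_path="$1"             # e.g. //each/depot/...
    batch="${2:-1000}"          # value passed to -m
    max_passes="${3:-10000}"    # hypothetical safety cap so the loop always ends
    pass=0
    while [ "$pass" -lt "$max_passes" ]; do
        pass=$((pass + 1))
        # Capture stdout and stderr together; which stream carries the
        # completion message is an assumption left open here.
        out=$("$P4_CMD" verify -u -m "$batch" "$depot_path" 2>&1)
        case "$out" in
            *"already have digests"*)
                echo "done after $pass pass(es)"
                return 0
                ;;
        esac
        # Optional breather between batches (seconds); defaults to none.
        sleep "${PAUSE:-0}"
    done
    echo "gave up after $max_passes passes" >&2
    return 1
}
```

To use it, load the function and call it once per depot, for example: compute_digests //each/depot/... 1000.  The cap on passes is there because the loop only recognizes the success message; if the server is unreachable or the path is wrong, it stops after max_passes attempts rather than spinning forever.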