February 5, 2013

Intelligent Replication - Sending Data Only Where You Need It

Perforce replication has followed a steady arc of improving performance and behavior. For several years, proxy servers have offered a way to keep archive content closer to end users. In the early days of replication, replicas were vanilla read-only copies of a Perforce repository. Then came role-specific replication, with behaviors crafted for particular use cases such as supporting build automation.

[Figure: replica progress]

The 2013.1 release introduces the next phase: the ability to chain replicas together and to restrict how much data each replica receives. I covered replica chains in a previous article, so now let's talk about replica filtering.

The appeal of restricted replication is obvious. If you have a repository that contains 500 TB of data, you probably don't need all of it in each location. Indeed, you may need only 50 GB in a small office that works on a single project. It's a little indulgent to throw around numbers in an example like this, but indulge me anyway: here, filtered replication reduces the amount of data transferred by 99.99%. Since standing up a new replica usually means starting from an existing copy of the repository, the replica also becomes much easier to create in the first place. Shipping 50 GB over a WAN takes a few hours. Shipping 500 TB means a call to FedEx.
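For anyone who wants to check the arithmetic, here is a back-of-the-envelope sketch in Python. The 50 Mbit/s link speed is purely my assumption for illustration, the function names are hypothetical, and the estimate ignores protocol overhead and compression.

```python
# Rough estimate of the savings from filtered replication.
# Sizes use decimal units (1 GB = 10**9 bytes, 1 TB = 10**12 bytes).
GB = 10**9
TB = 10**12

def reduction_percent(full_bytes, filtered_bytes):
    """Percentage of data that no longer needs to be transferred."""
    return 100.0 * (1 - filtered_bytes / full_bytes)

def transfer_hours(size_bytes, link_mbps):
    """Hours needed to move size_bytes over a link of link_mbps megabits/s."""
    bits = size_bytes * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600

full_repo = 500 * TB      # whole repository, from the example above
filtered_repo = 50 * GB   # the single project the small office needs
LINK_MBPS = 50            # assumed WAN speed, not a measured value

print(f"reduction: {reduction_percent(full_repo, filtered_repo):.2f}%")
print(f"filtered copy: {transfer_hours(filtered_repo, LINK_MBPS):.1f} hours")
print(f"full copy: {transfer_hours(full_repo, LINK_MBPS) / 24:.0f} days")
```

With a different link speed the absolute times change, but the ratio between the two transfers stays the same.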

[Figure: intelligent replication]

Configuring how much data to send to a replica is simple. In the replica's server definition, specify which part of the master repository it should have. Let's say I want one of the replicas to carry only the gui depot:

> p4 server repl

ServerID:   repl
Type:   server
Name:   repl
Address:   repl:4444
Services:   forwarding-replica
Description:  Replica for the gui project.
RevisionDataFilter:  //gui/...

> p4 configure set "repl#startup.1=pull -i 1 -P repl"

Now, if you decide later that a replica needs more data, you can either change its configuration or just let the replica fetch data on demand as users request it.
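To get a feel for which files a filter like //gui/... would cover, here is a rough Python sketch. It assumes the usual Perforce wildcard meanings (`...` matches any string, slashes included, and `*` matches any string within a single path component) and handles only simple include patterns. It illustrates the pattern, not the server's actual filtering logic, and the function names and sample depot paths are made up.

```python
import re

def pattern_to_regex(pattern):
    """Translate a Perforce-style path pattern into a compiled regex.

    Assumed semantics: '...' matches anything (including '/'),
    '*' matches anything except '/'. Everything else is literal.
    """
    pieces = re.split(r"(\.\.\.|\*)", pattern)
    regex = []
    for piece in pieces:
        if piece == "...":
            regex.append(".*")
        elif piece == "*":
            regex.append("[^/]*")
        else:
            regex.append(re.escape(piece))
    return re.compile("".join(regex))

def is_covered(depot_path, filters):
    """True if depot_path matches at least one include pattern."""
    return any(pattern_to_regex(f).fullmatch(depot_path) for f in filters)

# Hypothetical filter list and depot paths, for illustration only.
filters = ["//gui/..."]
for path in ["//gui/src/main.c", "//server/db/schema.sql", "//guitar/readme.txt"]:
    print(path, is_covered(path, filters))
```

Note that //guitar/readme.txt is not covered: the pattern requires the literal prefix //gui/ before the wildcard.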

Where is all this leading? It's a bit too early to share the news, but I'll just say that I've now seen the promised land. And it is good. Stay tuned for some big updates on Perforce's federated architecture later this year.