March 19, 2013

Supporting Global Perforce and Git Deployments


Supporting a global development team is a constant series of challenges. How can you make sure that all of your developers have fast, consistent access to the data they need, whether they're in Boston or Bangalore? How can you oversee everything they're doing? And how can you avoid giving your administration team a migraine in the process?

Perforce has been working on this problem over our last few major releases, and we've hit another significant milestone. As of the 2013.1 releases of Perforce replication and Git Fusion, a Perforce deployment can now effectively support both Git and Perforce teams worldwide - with no artificial barriers to collaboration.

Let's look at the big picture.

This picture is a simplified representation of a deployment that many of our customers use: a master repository in North America or Europe, with remote offices in several other locations around the world. (One little-known fact: we can indeed support the submerged Git team located somewhere off the coast of Madagascar. Look for P4Submarine in your local boating shop.)

A key underpinning for this picture is Perforce's out-of-the-box data replication. The picture below highlights how Perforce replication has evolved over the years.

[Figure: replica progress]

Check out the latest replication improvements in my other blog posts, but in a nutshell you'll be able to restrict a replica to a subset of the overall repository data and chain several replicas together.

What does this mean for Perforce administrators and users? It means you have a range of choices for supporting a variety of global deployment scenarios.

  • Do you have a remote site that pulls a lot of data but is otherwise a pretty light user of Perforce? A proxy is a cheap and simple solution.
  • Do you need a read-only copy of your repository for backups and reporting? A read-only replica is perfect, and it can contain a full set of data for disaster recovery or a subset for specific reporting needs.
  • Are you trying to support read-only build automation? A build mode replica is a good answer, and again it can have as much or as little data as you need.
  • Do you need to support a large user base or users at a remote location? A forwarding replica (aka smart proxy) is the ticket, and it can have whatever subset of the main repository it needs.
  • Do you need to support a web of build servers or a series of remote sites in a region? Use your first level replica to feed other replicas.

With the latest release of Git Fusion, you can now use this Perforce infrastructure to support a global Git deployment. You can attach one or several Git Fusion instances to one or several Perforce proxies, forwarding replicas (smart proxies), or brokers[1]. You could use two Git Fusion instances to support a large Git user group at your main site, and then a third Git Fusion instance attached to a forwarding replica at a remote site.

There are a few other Git mirror solutions available, of course. But if you use Perforce as the backbone for a global Git deployment, you get several advantages:

  • Out-of-the-box data replication with built-in monitoring and retry technology. You don't need to write your own tools to support remote sites.
  • More consistent end-user experience. Your Git users push and pull using a single Git Fusion remote. They don't need to pull from a local read-only mirror and push to the real remote; Perforce handles those logistics behind the scenes.
  • Consistent, granular access control. Perforce makes sure that users at a site cannot access more data than they need.
  • All users, whether working in Perforce or Git, can access and share data seamlessly. There are no boundaries based on the location of a repository.

To give a specific example on that last point, consider a mobile app project that has three components.

[Figure: folders]

The artwork is developed in Perforce by a team based in California, and the iOS and Android source is developed by separate teams working in Git. The iOS team is based in London and the Android team is based in Auckland. The two development teams need the production art assets, not the huge raw media files.

By using Perforce and Git Fusion as the master repository supported by replication, there's no need to host different repositories at different sites or make one of the teams deal with a slow pull connection. Each team gets a local workspace or repository with the data they need, and a push to a shared component is reflected automatically everywhere that change is needed. And of course everyone has local data for their daily work; for the average user it's only pushes that go across the WAN. And those enormous raw media assets? No need to replicate those to any of the remote sites; the replicas filter out the data they don't need.

Now, if the London office opens a satellite in Edinburgh, you don't need another replica pulling from California. You set up a replica that pulls from London. That saves a considerable number of data hops across the ocean.
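To make the chaining idea concrete, here is a minimal Python sketch of the topology described above. It is only a model for reasoning about who pulls from whom, not Perforce configuration: the `UPSTREAM` map and both helper functions are hypothetical names, and the assumption that Auckland has its own replica pulling directly from California is mine.

```python
# Hypothetical model of a chained replica topology (not Perforce configuration).
# Each remote site names the server it pulls from; the master has no entry.
UPSTREAM = {
    "London": "California",
    "Auckland": "California",   # assumption: Auckland runs its own replica
    "Edinburgh": "London",      # the satellite pulls from London, not the master
}


def replication_path(site, upstream=UPSTREAM):
    """Return the chain of servers from a site back to the master."""
    path = [site]
    while path[-1] in upstream:
        parent = upstream[path[-1]]
        if parent in path:
            # Guard against a misconfigured loop such as A -> B -> A.
            raise ValueError(f"replication cycle at {parent}")
        path.append(parent)
    return path


def direct_pullers(server, upstream=UPSTREAM):
    """Return the sites that pull directly from the given server."""
    return sorted(s for s, p in upstream.items() if p == server)


if __name__ == "__main__":
    print(replication_path("Edinburgh"))
    print(direct_pullers("California"))
```

In this model Edinburgh reaches the master through London, so adding the satellite does not add another site to the list that pulls directly from California.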

You can use Perforce IP address-based access rules to restrict how much data an individual site can get. For example, consider the set of rules shown below.

admin user git-fusion-user * //.git-fusion/...
admin user git-fusion-user 10.0.50.* //jam/...
admin user git-fusion-user 10.0.50.* //gwt/...
admin user git-fusion-user 10.0.40.* //jam/...

The Git Fusion instances on the 10.0.50 sub-net could be used to work on data in the jam or gwt depots. But the Git Fusion instances on the 10.0.40 sub-net could only work on jam, not on gwt.
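If you want to reason about rules like these before deploying them, the following Python sketch shows one way to model the matching. It is a deliberately simplified illustration, not how the Perforce server evaluates protections: it ignores exclusion lines, rule ordering, groups, and the access-level hierarchy, and the function names are hypothetical. It assumes `...` in a depot path matches any characters (including `/`) and `*` in a path matches within one path component.

```python
import fnmatch
import re

# The four rules from the example: (access level, type, name, host, depot path).
RULES = [
    ("admin", "user", "git-fusion-user", "*", "//.git-fusion/..."),
    ("admin", "user", "git-fusion-user", "10.0.50.*", "//jam/..."),
    ("admin", "user", "git-fusion-user", "10.0.50.*", "//gwt/..."),
    ("admin", "user", "git-fusion-user", "10.0.40.*", "//jam/..."),
]


def depot_pattern_to_regex(pattern):
    """Translate a depot path pattern into a regular expression string."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")        # '...' spans directory levels
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")     # '*' stays within one path component
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return "".join(parts)


def is_allowed(user, host, depot_path, rules=RULES):
    """Return True if any rule grants this user, from this host, this path.

    Simplified model: no exclusions, no ordering, no groups, no access levels.
    """
    for _level, kind, name, host_pattern, path_pattern in rules:
        if kind != "user" or name != user:
            continue
        if not fnmatch.fnmatchcase(host, host_pattern):
            continue
        if re.fullmatch(depot_pattern_to_regex(path_pattern), depot_path):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical file paths, used only to exercise the rules.
    print(is_allowed("git-fusion-user", "10.0.50.7", "//gwt/src/Main.java"))
    print(is_allowed("git-fusion-user", "10.0.40.7", "//gwt/src/Main.java"))
```

Under this model a Git Fusion instance at 10.0.50.7 can reach a file in gwt, while one at 10.0.40.7 cannot, which matches the behavior described above.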

Overall, using Perforce as the backbone for a global development scenario for Perforce and Git users solves several deployment problems. Out of the box you get:

  • Reliable and fully supported data replication.
  • Flexible deployment options to suit different sites and teams.
  • Enterprise security and auditing support worldwide.
  • Simple data sharing and collaboration with no extra demands placed on the end user.

Perforce has a lot more in store to support global deployments over the next few releases, so this picture will keep getting better.

[1] If you are using any advanced replica or broker techniques to filter the set of data at a remote site, make sure that you are providing all the shared Git Fusion information to each site to ensure a consistent Git repository.