Kelly Alexander is the Manager of Engineering Services and Rod Hernandez is the Storage Administrator at NVIDIA, Santa Clara, California.
To speed time-to-market, reduce costs, and meet the many operational challenges inherent in chip design and software development processes, most development organizations rely on software configuration management (SCM) platforms and tools. With multiple development teams working on separate project components and with teams often geographically dispersed, SCM solutions can help increase productivity and improve quality by:
- Automating repetitive tasks
- Providing version control
- Managing and tracking changes
NVIDIA Corporation uses Perforce SCM not only to manage its software development process and chip designs, but also to provide change control for critical documents company-wide. This technical case study describes a deployment of Perforce running on clustered Network Appliance™ storage configured to provide simultaneous Fibre Channel SAN and network-attached storage (NAS) access.
A global leader in advanced graphics processing technology, NVIDIA Corporation has received more graphics awards from the PC industry than any other company. Companies worldwide choose NVIDIA graphics processing units to enhance the digital media experience on desktops, workstations, notebooks, handhelds, and other devices. Headquartered in Santa Clara, Calif., NVIDIA has over 2,000 employees worldwide with revenue approaching $2 billion annually.
Over 800 NVIDIA engineers work on chip design and software development. The output from these groups is stored and managed using the Perforce SCM system. In addition to engineering data including all final chip designs and software source trees, other departments use Perforce to store critical documents. In total, NVIDIA has over 1,700 Perforce users—85% of the company.
Ensuring high performance and availability of this critical business application is a key priority for the NVIDIA IT department. NVIDIA's original Perforce SCM infrastructure used a pair of Sun servers configured with third-party clustering software and a monolithic storage system from a major storage vendor. As NVIDIA's processing and storage requirements expanded, the IT team found this fixed architecture was expensive, complicated, and difficult to manage.
The highly complex storage environment on which Perforce was deployed made something as simple as changing configurations a coordinated effort that often required bringing all three vendors on-site. NVIDIA's storage administrator manages all of the engineering group's storage needs, including over 40 NetApp storage systems with more than 400TB of total capacity; even so, the single system supporting the Perforce environment was taking up most of the administrator's time.
Based on its long-term relationship with NetApp, NVIDIA decided to replace its existing storage solution with an easy-to-manage cluster of NetApp fabric-attached storage (FAS) systems capable of simultaneously supporting both Fibre Channel SAN and network-attached storage protocols. Key goals were to reduce the administrative burden while providing high availability and disaster recovery capabilities.
Unified Storage Configuration
Perforce SCM is designed with a client/server architecture. The Perforce server maintains a centralized repository while client workspaces reside on network storage. Perforce tracks all client workspaces, as well as the evolution and contents of its repository, through a central metadata database.
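The client/server split described above is visible in everyday Perforce usage: clients talk to the central server over the network, and every workspace definition, sync, and submit is recorded in the server's metadata database. A minimal sketch of that workflow using standard Perforce client commands (the server address, workspace name, and file paths are illustrative, not NVIDIA's actual configuration):

```shell
# Point the client at the central Perforce server (hostname is hypothetical)
export P4PORT=perforce.example.com:1666
export P4CLIENT=jdoe-gpu-dev          # workspace name, also hypothetical

p4 client                             # define the workspace view (depot -> local paths)
p4 sync                               # populate the workspace from the central repository
p4 edit src/driver.c                  # open a file for edit; the server records the action
p4 submit -d "Fix driver init race"   # changes and history land in the metadata database
```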
Since high availability was a critical requirement, extreme care was taken to ensure redundancy at all levels for the Perforce environment. The original Sun server cluster was already configured for high availability, and two Brocade Fibre Channel switches were used to create a redundant SAN fabric between the Sun servers and the NetApp FAS cluster. Sun servers are configured with dual Fibre Channel HBAs, while each FAS system has four HBAs to provide optimum performance and multipath reliability.
Within the NetApp cluster, each FAS system has different responsibilities under normal operation. One system, designated P4SAN, has primary responsibility for 4TB of SAN storage used by the Perforce SCM metadata database. The second, P4NAS, has primary responsibility for 16TB of network-attached storage, which is directly accessible from NVIDIA's IP networks and used for client workspaces. Should either member of the NetApp cluster fail, the surviving member immediately assumes that workload in addition to its own.
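The failover behavior described above is handled by the cluster failover (cf) facility in Data ONTAP. A rough sketch of how an administrator might check and manually exercise it from the appliance console (a simplified sketch; exact output and options vary by ONTAP release):

```shell
cf status      # confirm both partners are up and the cluster interconnect is healthy
cf takeover    # surviving node assumes the partner's identity and serves its storage
# ...after the failed partner has been repaired and rebooted:
cf giveback    # return the partner's workload to it
```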
The P4SAN system was initially configured with four 250GB LUNs, yielding 1TB of capacity to accommodate the central Perforce repository. The P4NAS system is configured with a single large volume with 11TB total capacity. In both cases, underlying RAID groups consist of 16 disks and dual parity RAID (RAID-DP) is used to provide protection from multiple disk failures in a single RAID group.
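Provisioning the SAN side of this layout maps onto a handful of Data ONTAP console commands. A hedged sketch, assuming FlexVol-era 7-mode syntax; the volume, aggregate, and initiator-group names (and the WWPN placeholder) are illustrative, not NVIDIA's actual configuration:

```shell
# Container volume sized to hold the four 250GB LUNs plus overhead
vol create p4sanvol aggr0 1100g

# One of the four 250GB LUNs presented to the Sun (Solaris) hosts
lun create -s 250g -t solaris /vol/p4sanvol/lun0

# Group the Sun servers' Fibre Channel HBA WWPNs and map the LUN to them
igroup create -f -t solaris p4hosts 10:00:00:00:c9:xx:xx:xx
lun map /vol/p4sanvol/lun0 p4hosts 0
```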
With RAID-DP, each RAID group is allocated a second parity disk in addition to the standard one. This added protection effectively eliminates the risk of data loss from a double disk failure within a RAID group, so larger RAID group sizes can be used for more flexible and simplified disk configuration options.
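The capacity cost of the second parity disk is easy to quantify. A back-of-envelope calculation for the 16-disk RAID groups described above (the per-disk capacity here is illustrative, since the article does not state the drive size used):

```shell
# RAID-DP reserves two parity disks (row parity plus diagonal parity) per group.
disks_per_group=16
parity_disks=2
disk_size_gb=136    # illustrative right-sized disk capacity, not from the article

data_disks=$((disks_per_group - parity_disks))
echo "data disks per group: $data_disks"
echo "usable capacity per group: $((data_disks * disk_size_gb)) GB"
```

With 14 of 16 disks carrying data, the parity overhead is 12.5%, which is why RAID-DP's tolerance of double failures makes such large groups practical.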
A key benefit of the new storage solution is accelerated backups. Every four hours a Snapshot copy is created of changed data on the P4NAS system and data is asynchronously mirrored to a local NearStore® system using NetApp SnapMirror® software. The NearStore system is periodically backed up to NDMP tape using VERITAS NetBackup software.
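On 7-mode systems, a four-hour Snapshot cadence and an asynchronous SnapMirror relationship can be expressed roughly as follows. This is a sketch only: the volume and system names are illustrative, and schedule syntax varies by ONTAP release:

```shell
# Keep six rolling snapshots, taken every four hours on the even hours
snap sched p4nasvol 0 0 6@0,4,8,12,16,20

# /etc/snapmirror.conf entry on the NearStore destination, one line per mirror:
#   source:vol       destination:vol      args  minute hour            dom dow
#   p4nas:p4nasvol   near1:p4nasvol_mir   -     15     0,4,8,12,16,20  *   *
```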
The Perforce servers mount P4NAS and store database checkpoints and journals there, so this critical information is backed up. The database stored on P4SAN is backed up nightly to tape and then restored to a staging area where journals are replayed to verify the integrity of the dumps.
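The nightly verification step can be sketched with the standard p4d checkpoint and journal-replay options (`-jc` to take a checkpoint, `-jr` to replay one). The paths, checkpoint numbers, and staging layout below are illustrative, not NVIDIA's actual procedure:

```shell
P4ROOT=/p4/prod/root       # production server root (hypothetical path)
STAGE=/p4/verify/root      # empty staging root used only for verification

p4d -r "$P4ROOT" -jc       # take a checkpoint and rotate the journal

# Rebuild the database in the staging area, then replay the journal on top;
# a clean replay demonstrates that the dump is restorable.
p4d -r "$STAGE" -jr "$P4ROOT/checkpoint.1234"
p4d -r "$STAGE" -jr "$P4ROOT/journal.1234"
```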
The NetApp storage cluster and SnapMirror software also give NVIDIA the ability to provide complete disaster recovery for its Perforce data. NVIDIA maintains a disaster recovery facility in Sacramento, Calif., about 100 miles from its main facility in Santa Clara. Perforce data is asynchronously mirrored to NetApp NearStore systems in Sacramento for protection from site-wide or regional disasters.
"We wanted to ensure that the new environment would be simple and fault tolerant and that the backend storage would not be a bottleneck," says Kelly Alexander, manager of Engineering Services for NVIDIA. "NetApp helped us achieve those goals.
"Our old system was very expensive to maintain," continues Alexander. "The maintenance cost alone far outweighed the purchase price of our NetApp solution."
Since the rest of NVIDIA's engineering environment already uses NetApp storage, the benefits for Rod Hernandez, NVIDIA's storage administrator, were substantial: "I was already very familiar with NetApp storage, and because the SAN piece is simple it only took a few hours of training for me to learn how to manage that, too. Standardizing on NetApp means I don't have to worry about Perforce storage anymore. My life is much improved and many sleepless nights have been avoided since we upgraded to NetApp."
"Now it's easy for us to accomplish common storage tasks like volume expansion and backup. We don't have to pay someone else a fortune to do it. Not only does NetApp save NVIDIA a lot of administrative time, it's saving us a lot of money that we used to spend on professional services," adds Alexander. "NetApp just does what it's supposed to do without a lot of hassle. It gives us great availability, better data protection, and disaster recovery—all with truly lights-out operation."
With its most critical company data on NetApp storage, NVIDIA puts a lot of faith in NetApp technology. NVIDIA's new storage environment for Perforce SCM provides tremendous benefits for the company in terms of improved availability, data protection, and disaster recovery. NVIDIA depends on NetApp for the innovative storage solutions that allow it to continue to be successful.