Video
Scrambling SAP Data with Perforce Delphix [Demo]
Data Management, Security & Compliance
Overview: Scrambling SAP Data with Perforce Delphix
Managing complex SAP environments introduces a significant challenge: balancing the demand for accelerated project timelines with the critical need to protect sensitive data. Exposing this information in non-production environments creates vulnerabilities, heightens the risk of data breaches, and complicates compliance with regulations like GDPR. Perforce Delphix delivers a powerful solution to this SAP scrambling problem, enabling organizations to automate data compliance for non-production environments and accelerate innovation without compromising security.
With Perforce Delphix, enterprises can effectively transform their SAP data scrambling processes:
Accelerated Sensitive Data Identification: Delphix connects directly to SAP HANA environments using native SAP HANA drivers, enabling the platform to scan and identify sensitive data across tens of thousands of tables, including environments with 80,000+ tables and 100,000+ objects. Built-in accelerators pinpoint roughly 500 high-risk tables out of the box, then continuously scan both metadata and data as sensitivity evolves (see the sketch after this list).
High-Performance Data Masking: The demo showcases Delphix’s ability to mask and scramble sensitive SAP data at exceptional speed: processing 4 to 5 million rows per minute. This high throughput is supported by in-memory database integration, ensuring organizations can secure large datasets without disrupting development timelines.
Automated Policy Application and Consistent Protection: Once sensitive elements are detected, Delphix applies appropriate out-of-the-box or custom masking algorithms, replacing production data with realistic but fictitious values. This preserves testing accuracy while fully de-risking non-production environments.
Maintained Data Relationships: The video highlights how Delphix maintains referential integrity across SAP datasets: relationships between tables and composite fields are preserved, preventing application logic from breaking and ensuring lower environments behave just like production.
Rapid Provisioning and Refresh: With the Delphix engine, teams can provision fully masked SAP HANA environments on demand and refresh them as needed, removing delays and bottlenecks from development and testing while ensuring all test data is protected and compliant.
Enterprise-Scale Support: Leveraging multiple Delphix engines, the platform scales to support performance testing and compliance for the largest SAP environments, transforming datasets at high speed, regardless of volume or complexity.
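To make the identification step concrete, the sketch below shows the general shape of such a metadata scan using the standard JDBC API. It is an illustration only: the host, credentials, schema name (SAPABAP1), and name patterns are hypothetical placeholders, and Delphix's own accelerators use a much richer classifier catalog than a single regular expression.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.util.List;
import java.util.regex.Pattern;

public class SensitiveColumnScan {
    // Hypothetical name patterns; a real profiler ships a far richer classifier
    // catalog and also samples row data, not just column names.
    private static final List<Pattern> SENSITIVE = List.of(
            Pattern.compile("(?i).*(NAME|ADDR|EMAIL|PHONE|SSN|IBAN).*"));

    public static void main(String[] args) throws Exception {
        // jdbc:sap:// is SAP HANA's JDBC URL scheme (the ngdbc.jar driver);
        // host, port, schema, and credentials here are placeholders.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:sap://hana-host:30015", "SCAN_USER", "secret")) {
            DatabaseMetaData meta = conn.getMetaData();
            // Walk every column in the schema and flag names matching a sensitive pattern.
            try (ResultSet cols = meta.getColumns(null, "SAPABAP1", "%", "%")) {
                while (cols.next()) {
                    String table = cols.getString("TABLE_NAME");
                    String column = cols.getString("COLUMN_NAME");
                    if (SENSITIVE.stream().anyMatch(p -> p.matcher(column).matches())) {
                        System.out.printf("flagged %s.%s%n", table, column);
                    }
                }
            }
        }
    }
}
```

A real profiler also samples the row data itself, which is why the transcript below describes scanning "both the metadata and the data."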
See How Delphix Can Transform Your SAP Data Security
Discover how you can accelerate your development workflows while ensuring robust data compliance. Request a personalized demo to see how Delphix enables you to innovate faster and more securely with your SAP data.
Get an SAP Data Scrambling Demo
Full Transcript
Good afternoon. I am Ilker Taskaya, and I work as a field engineer at Perforce Delphix.
Today, we are going to discuss the topic of ERP systems such as SAP. There are many others that we work with at Delphix, like Workday, Salesforce, Oracle Financials, and even PeopleSoft from a while back. Our objective with SAP is to provide the ability to create SAP environments in the non-production part of your enterprise. This allows you to stand up multiple environments so that you can test across multiple teams and versions.
Most importantly, you need to protect this data so you are not proliferating one of the most sensitive datasets you have at your company.
Scrambling sensitive SAP data has always been difficult. In part, this is because data volumes tend to be very large, and it is a heavily integrated dataset across your transaction environments and other SaaS applications like Salesforce.
Protecting this data means transforming it in a way that preserves data relationships so you do not end up with a bunch of siloed datasets.
Delphix provides two specific values in the context of SAP non-production environments. One, we can generate many of these environments. Two, we can generate them in such a way that sensitive data is replaced or scrambled with fictitious data, so you are not proliferating it across many environments.
When customers use SAP masking from Delphix, they start by achieving a zero-trust construct for their SAP data.
These lower environments essentially become useless for any purpose other than testing.
Secondarily, this must be done with speed.
Because Delphix has been in the market for so long and has had hundreds of enterprise customers with large ERP environments, our solutions had to be fast. We can mask four to five million rows a minute using our compliance engines, and you can scale that across multiple Delphix engines.
Ultimately, the measure of our success is that you replace the sensitive data. You have not proliferated this production data across lower environments, and you can still test with it. The data looks realistic, but it is fictitious.
It is also important to mention that we are an SAP-certified product. We have masked SAP HANA for several enterprise customers, and prior to that, we worked with SAP to ensure that we are certified in this context.
With our solution, our customers get two primary values.
They can identify what is sensitive in their SAP environment and de-risk that data, whether it is in the cloud or spread across multiple integrated applications that either feed data into SAP or receive data from it. The reason we can do this is that our products abstract the transformation mechanisms.
This means when we take a value A and mask it to a value B, we can do this consistently across any data source, timeline, or data location, whether it is in the public cloud or on-premises.
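As a concrete illustration of that consistency property, here is a minimal sketch of deterministic substitution built on a keyed hash. This is an assumption, not Delphix's actual algorithm: the class, the key handling, and the tiny replacement dictionary are all hypothetical.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

public class ConsistentMask {
    // Hypothetical replacement dictionary; a production engine ships large curated lookup sets.
    private static final String[] FIRST_NAMES = {"Alice", "Bhavin", "Chen", "Dana", "Elif"};

    private final Mac mac;

    public ConsistentMask(byte[] key) throws Exception {
        mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
    }

    // Same input + same key => same fictitious output, on any engine, against any data source.
    public String maskFirstName(String value) {
        byte[] digest = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
        int index = Math.floorMod((digest[0] << 8) | (digest[1] & 0xFF), FIRST_NAMES.length);
        return FIRST_NAMES[index];
    }
}
```

Because the output depends only on the input value and the key, "Maria" masks to the same fictitious name whether the column lives in SAP HANA, in Salesforce, or in a downstream test database, which is what keeps integrated datasets joinable after masking.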
A capability of our products, which I am going to showcase in this demo, is that we can work with any application whose data lives in a database, a file, or a mainframe.
We can mask any data realistically.
One of the key values we have is we can identify what this data looks like. I would like to show you that with our product line. So, what I have for a demo here is an SAP application that has data from HANA. I have our masking engine here, our compliance engine, where I have a connection to a HANA environment, as you would expect.
Let's pull that up. We use SAP HANA drivers right out of the box so that you can use native drivers to read and write data. This is how you achieve, partially, four to five million rows a minute using the in-memory database.
And when we connect to this data source, we can, in fact, identify what is sensitive across all the tables that you might have. In some instances, an SAP data source has 80,000-plus tables right out of the box. Some of our customers have upwards of 100,000 objects or tables in their environment.
We have what we call accelerators available for SAP specifically that can scan the SAP data model and identify about 500 tables right out of the box, as well as continue to scan both the metadata and the data in your SAP data source to identify what is sensitive.
As a result of this process, the tables that have sensitive data are automatically identified and ready to be transformed, masked, or scrambled so that your lower environments are protected. Let me show you what that looks like. I have an environment in here that has my SAP HANA rule set. And, what I am going to do is kick off a masking job against this environment.
What this is doing right now is connecting to the SAP HANA environment, reading data using the SAP HANA JDBC driver, transforming that data in memory, then writing or persisting that data back either to the source or to a target. It is completely up to you how you want to do this. To showcase what this looks like, I actually have an SAP application available here that is going against one of these modules.
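Here is a minimal sketch of that read, transform-in-memory, write-back loop, assuming plain JDBC and reusing the hypothetical ConsistentMask class from the earlier sketch. Table, column, and connection details are placeholders, and the real engine streams and parallelizes this far more aggressively to reach millions of rows a minute.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class MaskPipeline {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; the demo uses SAP HANA's native JDBC driver.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:sap://hana-host:30015", "MASK_USER", "secret")) {
            conn.setAutoCommit(false);
            ConsistentMask mask = new ConsistentMask("demo-key".getBytes());
            try (Statement read = conn.createStatement();
                 ResultSet rows = read.executeQuery(
                         "SELECT ID, FIRST_NAME FROM SAPABAP1.CONTACTS");
                 PreparedStatement write = conn.prepareStatement(
                         "UPDATE SAPABAP1.CONTACTS SET FIRST_NAME = ? WHERE ID = ?")) {
                int pending = 0;
                while (rows.next()) {
                    // Transform the value in memory, then persist the fictitious value back.
                    write.setString(1, mask.maskFirstName(rows.getString("FIRST_NAME")));
                    write.setLong(2, rows.getLong("ID"));
                    write.addBatch();
                    if (++pending % 10_000 == 0) {
                        write.executeBatch(); // batched writes keep throughput high
                    }
                }
                write.executeBatch();
            }
            conn.commit();
        }
    }
}
```

Writing back to the source versus a separate target, as noted above, is just a matter of pointing the UPDATE (or an INSERT) at a different connection.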
I have some dataset that is clearly a sample dataset, but it will serve the purpose of showcasing how we protect the data. Our compliance engine is actually using a virtual database in here. We can certainly use a physical HANA instance as well. But the beauty is that once this is masked, I can use this masked or scrambled data to generate all of those lower environments.
For example, if we wanted to have five specific environments under HANA, we can take this VDB and create five more VDBs downstream for the application testers to use. My masking job finished, and I am just going to refresh the application so that we pull back the dataset that existed in the environment. Now, as the data values change, the masked data values also change. Let's take a look at the underlying data and how we actually protected it.
I have addresses across multiple tables here. For example, you can see "616 NLC Street." There's another address here, and it is consistently protected.
I think I have some contact information here. Yes.
I had a real dataset under these before that we inserted, and now they are protected. You can see our algorithms at work, transforming the first name and the last name. There is also a full-name instance, a composite field. We can chain two algorithms and run them back-to-back on the same attribute.
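A short sketch of what that composite handling can look like, under the same assumptions as the earlier examples; maskLastName here is a hypothetical analogue of maskFirstName.

```java
// Mask the parts individually, then rebuild the composite from the masked
// parts so the full name can never disagree with its components.
String maskedFirst = mask.maskFirstName(firstName); // e.g. "Maria"   -> "Dana"
String maskedLast  = mask.maskLastName(lastName);   // e.g. "Mueller" -> "Okafor" (hypothetical)
String maskedFull  = maskedFirst + " " + maskedLast;
```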
As the sketch shows, while individually masking each attribute, we can also mask the composite value together. So, in summary, the solution provides the ability to identify what is sensitive in this environment, assign each sensitive element a scrambling algorithm that takes that input and transforms it in memory, and then read and write datasets from an SAP HANA environment very fast (four to five million rows a minute on average), transforming the data in such a way that it is completely replaced by a synthetic value.
You can still test with this. You can still develop with this, but the value of this to somebody who might be interested for malicious reasons is zero. Furthermore, we can consistently protect this dataset within a HANA environment or across different applications because we can provide referential integrity wherever the dataset might be. Thank you so much for your time.