December 19, 2018
File Masking Made Simple With Perforce Delphix Masking APIs
Data Management,
Security & Compliance
In today’s world of multiple data sources and the heavy burden of compliance, customers are looking for solutions that enable the business to operate while addressing security requirements. Sensitive data that needs to be anonymized through masking is stored not only in databases, like Oracle or RDS, but also in file systems, in a variety of file types.
Perforce Delphix supports a number of standard file formats out of the box, including delimited files (e.g., CSV or tab-delimited), XML files, Copybook, and fixed-width files, and also enables masking of other file types, like JSON, through pre- and post-masking steps.
The File Masking Process
Delphix masking technology has a logical flow to masking files that can be simplified into two phases: setup and execution.
Delphix File Masking Flow
How to Do File Masking With Delphix
Setup
Using Delphix, the first phase of masking files is the setup process. Here are six high-level concepts that are important to understand before you get started:
File formats: To mask any file, our process requires a file format definition (column names header) for each uniquely structured file. The file format is then assigned to the rule set that identifies the file(s) for masking. This only has to be done once per unique structure: a defined file format can be reused for all files that share that structure.
Connectors: In order to access the data you wish to mask, you need to create a Connector. A Connector points to a set of data (database or file) that has been connected to the Delphix Data Platform. These data sources can be physical or virtualized. In the case of file masking, for example, it may be the SFTP/FTP server where the files are stored.
Rule sets: A rule set is a group of flat files (or tables for databases) within a particular data source (which you have connected to by creating a Connector) that a user may choose to run profile, masking or tokenization jobs on.
Inventories: An inventory describes all of the data present in a particular data source and defines the methods that will be used to secure it. Inventories typically include the file name, the field name, the data classification, and the chosen algorithm.
Masking jobs: A masking job is what you set up to actually execute the masking of your files. When configuring a masking job, you select the rule sets and inventories you configured, which tell the job which files to mask and which algorithms to apply.
Pre and post-processing scripts: If the file content is not 100 percent in one of our supported predefined formats, the file can be pre-processed into a working format, masked and then post-processed back into its original format by creating a wrapper script/program that calls the pre-processing code, masking job and post-processing code. You can upload or define the pre/post-processing scripts when creating your masking job.
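To make the pre/post-processing idea concrete, here is a minimal sketch of such a wrapper for a JSON file, assuming the JSON is an array of flat objects. The function names are hypothetical, and `run_masking_job` is a stand-in for however you trigger the real masking job; none of this is part of the Delphix product itself.

```python
import csv
import io
import json
from typing import Callable, Sequence


def json_to_delimited(json_text: str, columns: Sequence[str], delimiter: str = ",") -> str:
    """Pre-processing: flatten a JSON array of flat objects into delimited text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(columns), delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # Missing keys become empty fields so every row has the same shape.
        writer.writerow({name: record.get(name, "") for name in columns})
    return buffer.getvalue()


def delimited_to_json(delimited_text: str, delimiter: str = ",") -> str:
    """Post-processing: rebuild a JSON array from the (masked) delimited text."""
    reader = csv.DictReader(io.StringIO(delimited_text), delimiter=delimiter)
    return json.dumps([dict(row) for row in reader])


def mask_json(
    json_text: str,
    columns: Sequence[str],
    run_masking_job: Callable[[str], str],  # hypothetical hook for the real job
) -> str:
    """Wrapper: pre-process, mask, then post-process back to JSON."""
    staged = json_to_delimited(json_text, columns)
    masked = run_masking_job(staged)
    return delimited_to_json(masked)
```

Note that this simple round trip turns every value into a string; a production wrapper would need to restore numeric and nested types if the original format requires them.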
Execution
The next phase is the actual execution. All you have to do is start the masking job(s) you set up. Each job runs any pre-scripts you defined, completes the masking transformations (based on the configuration from the setup phase), and finally runs any post-scripts you defined. At the end, you have files that have been anonymized through masking.
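If you start jobs from a script rather than the UI, you will usually want to wait for each one to finish before touching the output files. The sketch below shows one way to do that, assuming the job runs asynchronously and exposes a status you can query. The status names and the `get_status` callable are assumptions for illustration, not values confirmed by this article.

```python
import time
from typing import Callable

# Assumed terminal status names; check them against your masking engine.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_execution(
    get_status: Callable[[], str],  # hypothetical: returns the job's current status
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status, then return that status."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job did not finish after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.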
Automating with the API
While the Delphix UI is a great way to get familiar with masking and do the initial setup, you will most likely want to automate the process. So how can we automate and simplify file masking so that, even when you have a lot of files to mask, it takes only a few clicks? You can check out this Delphix Masking APIs document for a brief overview.
One of the tools we make available is our Masking API client portal as shown below. It provides an interactive way to learn the individual APIs, the URL as well as the inbound and outbound JSON body content.
While we offer our APIs for users to develop their own automation/scripts around masking, we also provide a number of open source repositories for the masking APIs, including dmx-toolkit and dxapikit. These repositories provide basic shell script examples to help users learn quickly and get up to speed. Here’s one that involves automating file masking.
The code/scripts in this repository take advantage of all the masking object creation APIs to define the file format, create a rule set, assign the domain/algorithm, set up a masking job and then run the job. The sample script requires an existing masking environment and connector, and the connector must contain the valid path for the file.
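The order of operations described above can be sketched as follows. This is not the repository's script: every path and payload key here is an illustrative placeholder, and `send` is a hypothetical callable you would implement against the real endpoints listed in the Masking API client. The point is only the sequencing, where each step's returned id feeds the next call.

```python
from typing import Any, Callable, Dict, Tuple

# send(method, path, payload) -> decoded JSON response; supplied by the caller.
Send = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def mask_delimited_file(
    send: Send,
    connector_id: int,  # the existing connector that holds the file path
    file_name: str,
    columns_to_rules: Dict[str, Tuple[str, str]],  # column -> (domain, algorithm)
) -> Dict[str, Any]:
    """Run the five steps in order. All paths and keys are placeholders."""
    # 1. Define the file format.
    file_format = send("POST", "/file-formats", {"name": f"{file_name}.format"})
    # 2. Create a rule set on the connector and attach the file to it.
    ruleset = send(
        "POST", "/rulesets",
        {"name": f"{file_name}.ruleset", "connectorId": connector_id},
    )
    send(
        "POST", "/ruleset-files",
        {"rulesetId": ruleset["id"], "fileName": file_name,
         "formatId": file_format["id"]},
    )
    # 3. Assign a domain/algorithm to each sensitive column.
    for column, (domain, algorithm) in columns_to_rules.items():
        send(
            "PUT", "/inventory",
            {"rulesetId": ruleset["id"], "field": column,
             "domain": domain, "algorithm": algorithm},
        )
    # 4. Set up the masking job.
    job = send(
        "POST", "/masking-jobs",
        {"name": f"{file_name}.job", "rulesetId": ruleset["id"]},
    )
    # 5. Run the job and return whatever the engine reports for the execution.
    return send("POST", "/executions", {"jobId": job["id"]})
```

Injecting `send` keeps authentication and HTTP details out of the flow, so the same function can be exercised with a fake before pointing it at a real engine.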
The rest of the parameters define the source delimited file, the delimited column names (header information), the mapping of the column names to the masking domains/algorithms and if applicable, the file type and respective delimited parameters.
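As an illustration of the column-to-domain/algorithm mapping, here is a small sketch that parses and validates such a mapping before any API call is made. The `column=DOMAIN:Algorithm` line syntax and the names in the usage example are made up for this sketch; the sample scripts may take these parameters in a different form.

```python
from typing import Dict, Iterable, NamedTuple, Sequence


class FieldRule(NamedTuple):
    domain: str
    algorithm: str


def parse_field_rules(
    columns: Sequence[str], mapping_lines: Iterable[str]
) -> Dict[str, FieldRule]:
    """Parse 'column=DOMAIN:Algorithm' lines (a hypothetical syntax)."""
    rules: Dict[str, FieldRule] = {}
    for raw in mapping_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        column, _, spec = line.partition("=")
        domain, _, algorithm = spec.partition(":")
        column, domain, algorithm = column.strip(), domain.strip(), algorithm.strip()
        if column not in columns:
            raise ValueError(f"unknown column: {column!r}")
        if not domain or not algorithm:
            raise ValueError(f"incomplete rule for column {column!r}")
        rules[column] = FieldRule(domain, algorithm)
    return rules
```

For example, `parse_field_rules(["id", "email"], ["email = EMAIL:EmailLookup"])` maps only `email`, leaving `id` unmasked. Catching a misspelled column name here is cheaper than discovering it after the job has run.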
Some additional functionality that a user may want to add to these scripts to further automate the file masking process includes:
- Additional code for the front end pre-processing of the files to build the desired column header file formats and the mapping of the masking domains/algorithms
- Automation for the connector creation and/or updating an existing connector to change the full file path
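The first of these additions could start from something like the sketch below, which derives a column-name definition from the header row of a sample file. It assumes the file has a header row and that the format definition is a plain list of column names, one per line; verify both against your own files and engine version.

```python
import csv
import io


def derive_format_definition(sample_text: str, delimiter: str = ",") -> str:
    """Build a one-name-per-line column list from a delimited file's header row."""
    reader = csv.reader(io.StringIO(sample_text), delimiter=delimiter)
    header = next(reader, None)
    if not header:
        raise ValueError("sample has no header row")
    columns = [name.strip() for name in header]
    if "" in columns:
        raise ValueError("header contains an empty column name")
    if len(set(columns)) != len(columns):
        raise ValueError("header contains duplicate column names")
    return "\n".join(columns) + "\n"
```

Only the first row of the sample is read, so the function can be fed just the first few lines of a large file.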
Eliminate Compliance Risks with Delphix Data Masking
Sensitive data is everywhere — whether in structured databases or diverse file formats such as CSV, JSON, or XML. Delphix equips organizations to mitigate security risks while streamlining compliance with privacy regulations like GDPR, CCPA, HIPAA, and PCI DSS.
With Delphix APIs, you can mask sensitive data such as names, emails, and payment information across a variety of data sources and file systems. Our technology transforms sensitive data into fictitious, yet realistic values while preserving referential integrity. Support for pre- and post-processing steps allows you to mask even non-standard file formats seamlessly.
Combine Data Masking with Delivery
The Delphix DevOps Data Platform integrates data masking with data delivery, offering masked, virtualized data copies that act as full data copies but require far less storage. With automated data pipelines, your teams can deliver compliant data within minutes to support testing, analytics, AI, and other critical business workflows.
Automate with APIs
To simplify workflows further, Delphix provides robust APIs and open-source toolkits to help your teams automate file masking processes. From defining file formats to executing masking jobs, our APIs enable businesses to scale data protection with just a few automated steps — saving time and reducing manual efforts.
Safeguard Data Across All Formats
Whether your sensitive data resides in relational databases or a variety of file systems, Delphix helps you secure it efficiently and comprehensively. Automate compliance, accelerate innovation, and eliminate barriers to your organization’s success with Delphix.
Get Started with Data Masking
Experience how Delphix can help your organization comply with privacy laws, protect against breaches, and enable fast, automated compliance. Request a no-pressure demo to see why industry leaders trust Delphix as their go-to solution for data masking and delivery.