About Unmasking Jobs
Privitar can unmask data in the same execution engines used for masking:
Columnar CSV, Avro or Parquet data files in HDFS.
Streams of records in Apache NiFi.
Streams of messages in Apache Kafka.
Streams of messages in StreamSets.
Records passed via HTTP REST API to Privitar On Demand.
Records processed within Java-based applications that embed the Privitar SDK.
In all cases, the general process of unmasking data is as follows:
Create an Unmasking Job, either in the Privitar user interface or through the Automation API. The job describes the nature and structure of the data that contains masked tokens. Each Unmasking Job is bound to the Protected Data Domain (PDD) that contains the masked data to unmask.
Include the ID of the Unmasking Job in the corresponding execution engine configuration. Each Unmasking Job has a unique identifier for this purpose.
When the Unmasking Job is applied to the data, it produces a copy of the input in which the relevant values are converted from their de-identified form back to their original values.
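The last step can be pictured with a minimal Python sketch. It is an illustration only and does not use the Privitar SDK or Automation API: the `UnmaskingJob` class, the `unmask_records` function, the `token_vault` dictionary and the sample tokens are all hypothetical names invented for this example. It assumes that tokens can be looked up in a simple token-to-original mapping, which may not match how a given PDD is configured.

```python
from dataclasses import dataclass, field


@dataclass
class UnmaskingJob:
    """Hypothetical stand-in for an Unmasking Job definition (not Privitar's API)."""
    job_id: str      # unique identifier referenced by the execution engine
    pdd_name: str    # the PDD that the job is bound to
    # Columns whose values are masked tokens that should be converted back.
    token_columns: list[str] = field(default_factory=list)


def unmask_records(job, records, token_vault):
    """Return a copy of `records` with tokens in the job's columns replaced.

    `token_vault` is an assumed token -> original value mapping for the PDD.
    Tokens that are not found in the mapping are left unchanged.
    """
    unmasked = []
    for record in records:
        copy = dict(record)  # work on a copy so the input is not modified
        for column in job.token_columns:
            if column in copy:
                copy[column] = token_vault.get(copy[column], copy[column])
        unmasked.append(copy)
    return unmasked


# Illustrative data only; the job ID, PDD name and tokens are made up.
job = UnmaskingJob(job_id="job-123", pdd_name="example_pdd", token_columns=["name"])
vault = {"tok_9f2": "Alice", "tok_1c7": "Bob"}
masked = [{"id": 1, "name": "tok_9f2"}, {"id": 2, "name": "tok_1c7"}]
result = unmask_records(job, masked, vault)
```

In this sketch only the `name` column is converted; the `id` column and the original `masked` list are left untouched, which mirrors the idea that unmasking produces a copy of the input rather than changing it in place.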
Unmasking Job Permissions
To ensure that Unmasking Jobs are only used in an authorised way, Privitar defines a set of Unmasking Job Permissions that a user must hold before an Unmasking Job will run.
Separate Permissions are provided for Create, Edit, Run Batch and Run Data Flow. Only users whose assigned Roles include the relevant Permissions can create, edit or run Unmasking Jobs.
For more information about Permissions for Unmasking Jobs, see Role Permissions.
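As a rough illustration of how Role-based Permissions gate each action, the following Python sketch models the four Permissions named above. It is not Privitar's implementation: the `UnmaskingPermission` enum, the `can_perform` function and the role name are hypothetical, and the sketch assumes that a user's effective Permissions are the union of the Permissions in all assigned Roles.

```python
from enum import Enum


class UnmaskingPermission(Enum):
    """The four Unmasking Job Permissions named in this section."""
    CREATE = "Create"
    EDIT = "Edit"
    RUN_BATCH = "Run Batch"
    RUN_DATA_FLOW = "Run Data Flow"


def effective_permissions(roles):
    """Combine the Permissions of every assigned Role (assumed union semantics)."""
    granted = set()
    for role_permissions in roles.values():
        granted |= role_permissions
    return granted


def can_perform(roles, permission):
    """Return True if any assigned Role grants the requested Permission."""
    return permission in effective_permissions(roles)


# Hypothetical user with a single Role that only allows running batch jobs.
operator_roles = {
    "Unmasking Operator": {UnmaskingPermission.RUN_BATCH},
}

allowed_to_run_batch = can_perform(operator_roles, UnmaskingPermission.RUN_BATCH)
allowed_to_create = can_perform(operator_roles, UnmaskingPermission.CREATE)
```

Here the hypothetical operator could run a Batch Unmasking Job but could not create one, because no assigned Role includes the Create Permission.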
Working with Unmasking Jobs
Unmasking Jobs that process data files on HDFS are referred to as Batch Unmasking Jobs. For more information, see Creating Batch Unmasking Jobs.
Unmasking Jobs that process data in streaming systems such as Apache NiFi, Apache Kafka and StreamSets are referred to as Data Flow Unmasking Jobs. Unmasking Jobs for Privitar On Demand are created and run in a similar way. For more information, see Creating and Running Data Flow/POD Unmasking Jobs.