Closing and Deleting PDDs

If the need for a specific PDD has ended, it is good practice to close the PDD.

Closing a PDD

A PDD should be closed when it is no longer needed. Closing a PDD has the following effects:

The PDD becomes read-only; no new data can be added. Existing de-identified data associated with the PDD is unchanged.
The Token Vault used to ensure token consistency in this PDD is discarded; therefore, closing a PDD also prevents any unmasking of tokens associated with the PDD.

PDD metadata and the history of processing associated with the PDD are not removed. It is possible for a watermark investigation to identify a dataset as originating from a PDD that has subsequently been closed.

A PDD might be closed for data minimization reasons, or to control storage volume:

If de-identified output needs to be made irreversible, the PDD can be closed to prevent unmasking.
If the need for a PDD is eliminated, or if it is only appropriate for it to contain a finite amount of data, closing the PDD will stop new data from being published into it.
PDDs require on-disk storage for Token Vaults, so it may be necessary to close unused PDDs to recover space.

A closed PDD will display a lock next to its name on the Protected Data index.

To close a PDD:

Select Protected Data from the Navigation sidebar. The Protected Data page is displayed, showing an index of all the PDDs that have been created.
Click on the row select button alongside the PDD to select the PDD.
Select Close Protected Data Domain from the Actions list box.
The Close Protected Data Domain dialog box is displayed asking you to confirm that you want to close the PDD.
Click on Close Protected Data Domain.
It may take a few minutes to close the PDD. Click on Run in Background to hide the dialog box.

Alternatively, a PDD can be closed from its details page by clicking Close Protected Data Domain at the top of the page, or programmatically via the Privitar Automation API (v3).

Note

Closing a PDD is an asynchronous process. You can monitor its status in the UI on the Protected Data Domain page or via the Privitar Automation API (v3). The execution mode used for the closing process depends on the type of Token Vault that the PDD uses. For example, HDFS/cloud storage and HBase Token Vaults are closed via Spark Batch/Hadoop, whereas JDBC and DynamoDB Token Vaults are closed via the Privitar application or by Privitar On Demand (POD).
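Because closing is asynchronous, a script that closes a PDD typically needs to wait for the operation to finish. The following is a minimal Python sketch of that polling pattern, not the Privitar Automation API itself: `fetch_status` is a hypothetical caller-supplied function that returns the current PDD status, the status strings `CLOSING`, `CLOSED`, and `FAILED` are assumptions made for illustration, and the dictionary keys are illustrative labels rather than Privitar identifiers. Only the execution modes themselves are taken from the note above.

```python
import time
from typing import Callable

# Execution mode per Token Vault type, as described in the note above.
# The keys are illustrative labels, not Privitar identifiers.
CLOSE_EXECUTION_MODE = {
    "HDFS": "Spark Batch/Hadoop",
    "CLOUD_STORAGE": "Spark Batch/Hadoop",
    "HBASE": "Spark Batch/Hadoop",
    "JDBC": "Privitar application or Privitar On Demand (POD)",
    "DYNAMODB": "Privitar application or Privitar On Demand (POD)",
}


def wait_until_closed(
    fetch_status: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until it reports a terminal state or the timeout expires.

    fetch_status is a hypothetical, caller-supplied function that asks the
    platform for the PDD's current status. The status strings used here
    ("CLOSING", "CLOSED", "FAILED") are assumptions made for this sketch.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in ("CLOSED", "FAILED"):
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"PDD still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


if __name__ == "__main__":
    # Simulated status source: two polls in progress, then closed.
    statuses = iter(["CLOSING", "CLOSING", "CLOSED"])
    result = wait_until_closed(
        lambda: next(statuses), poll_seconds=0.0, sleep=lambda s: None
    )
    print(result)
    print(CLOSE_EXECUTION_MODE["HBASE"])
```

In a real script, `fetch_status` would wrap whatever status call your Automation API client exposes; the sketch deliberately leaves that call out rather than guess at its shape.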

Deleting a PDD

A PDD can be deleted completely, but only after it has been closed. Deleting a closed PDD removes it from Privitar's configuration.

Note

Most of the time, closing a PDD is enough.

Typical reasons why you may delete a PDD include:

If a PDD was created by mistake.
If you want to deliberately prevent watermark investigations from identifying the PDD.
If you need to clean up deprecated or unused PDDs.

To delete a PDD:

Select Protected Data from the Navigation sidebar. The Protected Data page is displayed, showing an index of all the PDDs that have been created.
Click on the row select button alongside the PDD to select the PDD.
Select Delete Protected Data Domain from the Actions list box.
The Delete Protected Data Domain dialog box is displayed asking you to confirm that you want to delete the PDD.
Click on Delete Protected Data Domain.

Note

Deleting a PDD only deletes the PDD from Privitar's configuration. De-identified data written to HDFS/Hive or processed as part of a data flow pipeline or via a Privitar On Demand API will not be deleted.

Note

Delete a PDD with caution: once a PDD is deleted, a watermark investigation can no longer identify it as the origin of a dataset.
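The rules described on this page can be summarized as a small state model. The Python sketch below is purely illustrative: `PddModel` and `PddState` are hypothetical names, not Privitar classes, and the model assumes for simplicity that every open PDD has a Token Vault. It encodes three points from the text: a closed PDD accepts no new data and cannot be unmasked, a PDD must be closed before it can be deleted, and only deletion removes the metadata that a watermark investigation relies on.

```python
from dataclasses import dataclass
from enum import Enum


class PddState(Enum):
    OPEN = "open"
    CLOSED = "closed"
    DELETED = "deleted"


@dataclass
class PddModel:
    """Toy model of the PDD lifecycle rules; not a Privitar class."""

    name: str
    state: PddState = PddState.OPEN
    has_token_vault: bool = True  # simplifying assumption for this sketch

    def can_publish(self) -> bool:
        # Only an open PDD accepts new data.
        return self.state is PddState.OPEN

    def can_unmask(self) -> bool:
        # Unmasking needs the Token Vault, which closing discards.
        return self.state is PddState.OPEN and self.has_token_vault

    def watermark_traceable(self) -> bool:
        # Metadata and processing history survive closing, but not deletion.
        return self.state is not PddState.DELETED

    def close(self) -> None:
        if self.state is not PddState.OPEN:
            raise ValueError(f"cannot close a PDD that is {self.state.value}")
        self.state = PddState.CLOSED
        self.has_token_vault = False

    def delete(self) -> None:
        if self.state is not PddState.CLOSED:
            raise ValueError("a PDD must be closed before it can be deleted")
        self.state = PddState.DELETED


if __name__ == "__main__":
    pdd = PddModel("example")
    pdd.close()
    print(pdd.can_publish(), pdd.can_unmask(), pdd.watermark_traceable())
    pdd.delete()
    print(pdd.watermark_traceable())
```

Calling `delete()` on an open `PddModel` raises `ValueError`, mirroring the requirement that a PDD is closed first.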
