User Guide

Token Vault Environment Configuration

When tokens are generated using rules with the Preserve Data Consistency option enabled, the produced tokens are stored in a Token Vault. (For more information about the Rules that support tokenization, see Masking Rule Types Supporting Unmasking.)

The Environment settings on the Token Vault Configuration tab configure how these tokens are generated and stored.

The following Token Vault Types are supported:

  • None (no Token Vault is used, therefore no consistent tokenization is possible within this environment)

  • HDFS

  • HBase (and Google Cloud Bigtable)

  • JDBC

  • Amazon DynamoDB

HDFS or HBase Token Vaults can only be selected if a Hadoop Cluster is configured in the Environment.

Amazon DynamoDB is the only Token Vault supported when using Privitar AWS in an AWS Glue Environment.

HDFS Token Vault

When selecting HDFS Token Vault, the following settings are available:

Vault Path

HDFS location under which files containing Token Vaults will be written.

Use Derived Tokenization

If checked, random token generation is seeded using a value derived from the encrypted value of the input. The token can be generated from the input without checking a Token Vault, which makes the tokenization process significantly faster.

If this strategy is selected, a Derived Tokenization Key Name must be specified.

(This requires a Key Management System (KMS) to have been configured for use with Privitar. For more information, see Key Management Environment Configuration.)

The tokens produced still exhibit uniform randomness, and the input value from which any token was derived cannot be guessed without knowledge of the encryption key used to encrypt the input value.
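As an illustration only (this is not Privitar's implementation, and the function and key names below are hypothetical), the idea behind derived tokenization can be sketched as seeding a deterministic random generator with a keyed hash of the input value. The same input always derives the same token without any vault lookup, while the mapping cannot be reversed without the key:

```python
import hmac
import hashlib
import random
import string

def derive_token(key: bytes, value: str, length: int = 10) -> str:
    """Sketch of derived tokenization (illustrative only).

    The input is keyed-hashed (HMAC-SHA256) with a secret key, and the
    digest seeds a pseudo-random generator that draws the token
    characters. The same (key, value) pair always yields the same
    token, so no vault lookup is needed, yet the token cannot be
    linked back to the input without knowledge of the key.
    """
    seed = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    rng = random.Random(seed)
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))

# In Privitar, the key material would come from the configured KMS;
# this hard-coded key is purely for demonstration.
key = b"derived-tokenization-key"
print(derive_token(key, "alice@example.com"))
print(derive_token(key, "alice@example.com"))  # identical: derivation is deterministic
print(derive_token(key, "bob@example.com"))    # different input, different token
```

Note how the determinism comes entirely from the seeded generator, which is why no per-token state needs to be stored.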

Token Vault Encryption

If set, a Vault Encryption Key Name must be specified.

(This requires a Key Management System (KMS) to have been configured for use with Privitar. For more information, see Key Management Environment Configuration.)

Any changes to this setting are not automatically applied to existing Token Vaults. Changes are reflected in existing Token Vaults when:

  • A Job is run. Any affected Token Vaults are updated to reflect the new settings.

  • The Update Encryption action is invoked from the Environments list page. This action starts a background process that applies any changed settings to all existing Token Vaults referenced by the Environment.

This applies both to enabling and to disabling Token Vault encryption.

HBase and Google Cloud Bigtable Token Vault

Select HBase Token Vault for both HBase and Google Cloud Bigtable clusters. The following settings are available:

hbase-site.xml

The HBase XML configuration. To generate an hbase-site.xml file for Google Cloud Bigtable, see the Google Cloud documentation.
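As a hedged illustration, a minimal hbase-site.xml for Google Cloud Bigtable typically sets the project ID, instance ID, and the connection implementation class. The values below are placeholders, and the exact connection class depends on your client version; consult the Google Cloud documentation for the correct settings for your setup:

```xml
<configuration>
  <!-- GCP project hosting the Bigtable instance (placeholder value) -->
  <property>
    <name>google.bigtable.project.id</name>
    <value>my-gcp-project</value>
  </property>
  <!-- Bigtable instance ID (placeholder value) -->
  <property>
    <name>google.bigtable.instance.id</name>
    <value>my-bigtable-instance</value>
  </property>
  <!-- Connection class from the Bigtable HBase client; version-dependent -->
  <property>
    <name>hbase.client.connection.impl</name>
    <value>com.google.cloud.bigtable.hbase2_x.BigtableConnection</value>
  </property>
</configuration>
```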

Namespace

The HBase namespace that this Token Vault should reside in. The default is default, but it is strongly recommended to use a custom Namespace instead.

Note

Privitar will not automatically create a Namespace; it must exist before use.

Column Family

The HBase column family this Token Vault should use. The default is F.

Note

Privitar will not automatically create a new column family; it must exist before use.

Batch size

The number of values sent to HBase per batch for tokenization. This is a performance-related parameter. The recommended default is 1000.

PDD Deletion Thread Count

The maximum number of threads used when deleting a Protected Data Domain (PDD).

HBase Jar Paths

The additional HBase Jar files required for using Google Cloud Bigtable. The Jar files required are specific to each setup.

Each Jar file must be given as an absolute pathname, and multiple pathnames must be separated by a semicolon (;).

This setting is optional; it is only required when using Google Cloud Bigtable.
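For example, a value for this setting might look like the following (the jar names and directory are illustrative only, not required values):

```
/opt/privitar/jars/bigtable-hbase-client.jar;/opt/privitar/jars/bigtable-client-core.jar
```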

When using an HBase Token Vault, the location of the client JAR on the Hadoop cluster nodes must be specified in the yarn-site.xml in yarn.application.classpath, in the Hadoop Cluster configuration.

If using HBase with Kerberos:

  • Keytabs must be installed on all nodes connecting to HBase.

  • The following properties must be added to hbase-site.xml:

    <!-- User principal for HBase -->
    <property>
      <name>hbase.kerberos.principal</name>
      <value>hbase@PRIVITAR.COM</value>
    </property>
    <!-- Location of keytab on nodes and Privitar -->
    <property>
      <name>hbase.keytab.file</name>
      <value>/etc/security/keytabs/hbase.headless.keytab</value>
    </property>

Note

Kerberos is not currently supported by the Google Cloud Bigtable Token Vault.

JDBC Token Vault

Only Oracle and PostgreSQL JDBC connections are currently supported.

When selecting JDBC Token Vault, the following settings are available:

URL

The JDBC URL of the RDBMS database.
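For example, typical URL formats for the two supported databases look like the following (hostnames, ports, and database/service names are placeholders for your own values):

```
jdbc:oracle:thin:@//db-host:1521/ORCLPDB1
jdbc:postgresql://db-host:5432/token_vault
```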

Username

The username to authenticate with the database.

Password

The password to authenticate with the database.

JDBC Driver JAR Path

The path to the JDBC Driver on the Privitar filesystem. This driver is used by the Spark Jobs running on the Hadoop cluster to connect to the RDBMS, so this setting is only required when a Hadoop cluster is configured in the Environment.

KMS Key name

The name of the key used to encrypt the database credentials when they are shared with the Spark Jobs in the Hadoop Cluster.

This name must be set when a Hadoop cluster is configured in the Environment, and it also requires a Key Management System (KMS) to have been configured for use with Privitar. For more information, see Key Management Environment Configuration.

Note

If you are using CyberArk to store user credentials, there will also be settings for CyberArk descriptor queries. The descriptor queries are used to retrieve the Token Vault username and password from CyberArk.

For more information about configuring CyberArk for use with the Privitar platform, see the separately provided CyberArk Reference Guide. (Please contact Privitar for further information about CyberArk integration.)

Amazon DynamoDB Token Vault

Choose DynamoDB and configure the settings in the table (double-click a value to edit the field). The table is pre-populated with sensible default values, so in most cases no fields need to be changed. Required fields are marked with an asterisk.