Installation
This section describes how to install the Privitar StreamSets Data Processor.
Pre-requisites
It is assumed that you have installed the following software:
StreamSets v3.16.0
Privitar Data Privacy Platform v3.8.0 (or later)
Installation procedure
Install the plug-in.
Install the necessary Token Vault drivers.
Restart StreamSets.
Confirm that the plug-in is available in StreamSets as a Processor.
Installing the Privitar data connector plug-in
The Privitar StreamSets Data Processor is provided as a tar file called:
privitar-data-flow-streamsets-<x.x.x>.tar
where <x.x.x> is the version of the platform. For example:
privitar-data-flow-streamsets-3.8.0.tar
To un-tar the file and install the plug-in (assuming you are using v3.8 of the platform):
Copy the tar file into the StreamSets directory that is defined by the StreamSets environment variable USER_LIBRARIES_DIR. (You can discover the definition of this variable using the env command from StreamSets.)
Un-tar the file, using the command:
tar -xvf privitar-data-flow-streamsets-3.8.0.tar
This command creates a directory called:
USER_LIBRARIES_DIR/privitar-data-flow-streamsets/
The Privitar StreamSets Data Processor jar file is located in:
privitar-data-flow-streamsets/lib/privitar-data-flow-streamsets-3.8.0.jar
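The copy, un-tar and check steps above can be sketched as a small shell function. This is a minimal sketch rather than a supported installer: the function name install_privitar_plugin is hypothetical, and it assumes that USER_LIBRARIES_DIR is already exported in the shell you run it from and that the archive unpacks into the layout described above.

```shell
# Sketch only: install_privitar_plugin is a hypothetical helper name.
# Assumes USER_LIBRARIES_DIR is exported in the current shell.
install_privitar_plugin() {
  local tar_file="$1"   # path to privitar-data-flow-streamsets-<x.x.x>.tar
  local version="$2"    # platform version, for example 3.8.0
  local target_dir="${USER_LIBRARIES_DIR:?USER_LIBRARIES_DIR is not set}"
  local tar_name
  tar_name="$(basename "$tar_file")"

  # Copy the archive into the user libraries directory,
  # unless it is already the same file in that directory.
  if [ ! "$tar_file" -ef "$target_dir/$tar_name" ]; then
    cp "$tar_file" "$target_dir/" || return 1
  fi

  # Unpack the archive inside the user libraries directory.
  tar -xvf "$target_dir/$tar_name" -C "$target_dir" || return 1

  # Check that the processor jar is where the documentation says it should be.
  local jar="$target_dir/privitar-data-flow-streamsets/lib/privitar-data-flow-streamsets-${version}.jar"
  if [ -f "$jar" ]; then
    echo "Found plug-in jar: $jar"
  else
    echo "Plug-in jar not found: $jar" >&2
    return 1
  fi
}
```

For example, for v3.8.0 you might call it as: install_privitar_plugin ./privitar-data-flow-streamsets-3.8.0.tar 3.8.0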
Installing the Token Vault driver
The Privitar plug-in requires drivers to connect to the Privitar Token Vault. Add these drivers to the same location as the jar file for the Privitar StreamSets Data Processor. That is:
USER_LIBRARIES_DIR/privitar-data-flow-streamsets/lib/
The drivers to include are specific to the type of database you are using to store the Privitar Token Vault. For a StreamSets processing environment, the following types of Token Vault are supported:
Relational Database (JDBC), including PostgreSQL (v9.6 and later) and Oracle (11g, 12c and later).
HBase v2.2.x and later
For JDBC, you can use the drivers that are provided by the PostgreSQL and Oracle vendors.
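As an illustration of placing a vendor JDBC driver next to the plug-in jar, the following sketch copies a driver jar into the lib directory and lists the jars found there. The function name add_token_vault_driver is hypothetical, the driver file name is whatever your database vendor supplies, and USER_LIBRARIES_DIR is assumed to be exported in the current shell.

```shell
# Sketch only: add_token_vault_driver is a hypothetical helper name.
# Assumes USER_LIBRARIES_DIR is exported and the plug-in is already un-tarred.
add_token_vault_driver() {
  local driver_jar="$1"   # path to the vendor-supplied driver jar
  local lib_dir="${USER_LIBRARIES_DIR:?USER_LIBRARIES_DIR is not set}/privitar-data-flow-streamsets/lib"

  # The lib directory only exists once the plug-in tar file has been unpacked.
  if [ ! -d "$lib_dir" ]; then
    echo "Plug-in lib directory not found: $lib_dir" >&2
    return 1
  fi

  # Copy the driver next to the plug-in jar.
  cp "$driver_jar" "$lib_dir/" || return 1

  # List the jars now present so they can be checked before restarting StreamSets.
  ls -1 "$lib_dir"/*.jar
}
```

For example: add_token_vault_driver /path/to/your-jdbc-driver.jar, where the path is a placeholder for the driver you downloaded.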
For HBase, the platform provides a custom version of the HBase driver for use with the HBase Token Vault. The driver to use depends on the Hadoop vendor that the HBase Token Vault is running on:
Hadoop vendor | Privitar HBase driver jar name |
---|---|
Google Bigtable | |
Cloudera CDH6 - HBase | |
For more information on accessing the Privitar HBase drivers, contact support@privitar.com.
Restart StreamSets
To restart StreamSets:
Select the Administration icon in the top-right corner of the StreamSets main page.
Select Restart from the menu.
Confirm that the Privitar data connector is available
The Privitar StreamSets Data Processor should now be available from StreamSets in the Processors list box.
Two processing components are available:
Apply Privitar Policy - use this component to apply a Privitar Policy (de-identify data).
Apply Privitar Unmasking - use this component to Unmask (re-identify data).