Working with Existing Jobs
Details of Jobs and records of their runs remain in the system and can be accessed from the Job Details page. To view this page, click the name of a Job on the Jobs page.
The Jobs page lists all Job types that are enabled in your Privitar installation in their corresponding tabs: Batch Jobs, Data Flow Jobs and Privitar On Demand Jobs.
These frequently asked questions will help when working with existing Jobs.
Batch Jobs
The following sections cover some of the tasks associated with Batch Jobs.
All of the following sections assume you are viewing the Job Details page for a particular Job. To navigate to the Job Details page, follow these steps:
Select Jobs from the Navigation sidebar. The Jobs page is displayed.
Select the Batch Jobs tab.
Find the Job that you want to run. Use the Filter search box to filter the Jobs if required.
Finding the output data for a previously run Batch Job
When a Job is run successfully, its output data is written back to the location specified by the Protected Data Domain. To find the Protected Data Domain:
Click on the name of the Job in the Name column.
The Job Details page is displayed, containing details of the Job together with a list of all the times that the Job has been run.
The PDD output for each Job run is shown in the Protected Data Domain column.
The output data can be found in the Protected Data Domain's details. This can be found from Protected Data in the Navigation sidebar. For more information on how to view the details of a PDD, see Viewing PDD details.
Re-running a Batch Job
You would typically re-run a Batch Job after new data becomes available or when a Policy is changed.
To re-run a Job, click Run on the Jobs page.
Every new invocation of a Job is listed in the Job Details page.
Viewing more details about a Batch Job
To view more detail about the running of a Batch Job, in the Job History tab, click on Show Details in the Details column for the Batch Job you are interested in.
The Job Results dialog box is displayed showing information about the steps involved in executing a Job.
For each step, select the row to see a breakdown of the work done in that step.
| Step | Description |
|---|---|
| LOAD | The time taken to load an input file, the number of rows, and the total data size. |
| PARTITION (Optional) | The number (and size) of Spark partitions before and after the repartitioning of the Spark DataFrame. |
| MASKING | The rule that was applied to each column. |
| OUTPUT | The volume of data written after transformation and the location of the output. |
Viewing diagnostic information about a failed Batch Job
Diagnostics provides access to the log file written while the Job was executing, and snapshots of the Policy and Environment that were used.
The snapshots are important because they provide a record of the Policy and Environment settings at the time of invocation. Privitar does not prevent a Policy or Environment from being modified after being used in a Job. When debugging, it may be useful to refer to these snapshots.
To view diagnostic information about a Batch Job, in the Job History tab, click on Diagnostics in the Diagnostics column for the Batch Job you are interested in. Click on Download to download the diagnostic file.
Finding a Batch Job's API ID
To invoke a Job from the Privitar Automation v3 APIs, its ID is required. This is a unique alphanumeric code associated with the Job.
The Job Details page displays the Job ID field containing the Job ID.
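As a rough illustration, the Job ID from this field would typically be embedded in the request used to invoke the Job. Note that everything in the sketch below (the host, the endpoint path, and the auth scheme) is an assumption for illustration only, not the documented Automation v3 API; consult the API reference for the actual endpoints.

```python
# Hypothetical sketch: assembling a "run Job" request from a Job's API ID.
# The base URL, path, and bearer-token auth are assumptions, not Privitar's
# documented Automation v3 API.
def build_run_job_request(base_url: str, job_id: str, token: str) -> dict:
    """Return the pieces of a hypothetical HTTP request to run a Job by ID."""
    return {
        "method": "POST",
        "url": f"{base_url}/api/v3/jobs/{job_id}/run",  # path is an assumption
        "headers": {"Authorization": f"Bearer {token}"},
    }

req = build_run_job_request("https://privitar.example.com", "a1b2c3d4", "MY_TOKEN")
print(req["url"])
```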
Data Flow Jobs
The following sections cover some of the tasks associated with Data Flow Jobs.
All of the following sections assume you are viewing the Job Details page for a particular Job. To navigate to the Job Details page, follow these steps:
Select Jobs from the Navigation sidebar. The Jobs page is displayed.
Select the Data Flow Jobs tab.
Finding the unique ID of a Data Flow Job
To set up a data flow plug-in Job, the Data Flow Job ID is required. This is a unique alphanumeric code associated with the Job.
The Job Details page displays the Job ID field containing the Job ID.
For more information about Data Flow jobs, see What are Data Flow Jobs?.
Downloading the input and output formats of a Data Flow Job
To configure a data flow plug-in, the input format and the output format of a Data Flow Job are required.
To download the Avro schema definition of the input or output format:
Click on the row-select button to select a Data Flow Job.
Select Export Input/Output Format from the Actions menu. The Export dialog box is displayed.
Select Input or Output as the format to download.
Click on Export to download the file.
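The exported file is an Avro schema, which is plain JSON, so it can be inspected with standard tooling when configuring a plug-in. A minimal sketch, using a made-up schema rather than real Job output:

```python
import json

# Minimal sketch: list the field names and types in a downloaded Avro schema.
# Avro schemas are JSON documents; the schema below is an invented example,
# not the export of any real Data Flow Job.
schema_text = """
{
  "type": "record",
  "name": "customers",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": "string"}
  ]
}
"""
schema = json.loads(schema_text)
for field in schema["fields"]:
    print(field["name"], field["type"])
```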
Privitar On Demand Jobs
The following sections cover some of the tasks associated with On Demand Jobs. For more information about Privitar On Demand Jobs, see About Privitar on Demand.
Finding the unique ID of a Privitar On Demand Job
To set up a Privitar On Demand Job that can be called through the SDK or the Privitar On Demand HTTP API, the Job ID is required. This is a unique alphanumeric code associated with the Job.
To find this for a specific Privitar On Demand Job:
Select Jobs from the Navigation sidebar. The Jobs page is displayed.
Select the Privitar On Demand Jobs tab.
Select the name of the Job in the Name column.
The Job Details page is displayed. The Job ID field contains the Job ID.
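Because the Job ID is passed to the SDK or HTTP API as a string, a simple sanity check before use can catch copy-paste mistakes. This helper is an illustration only; it reflects the "alphanumeric code" description above, and the exact ID format is not specified here.

```python
import re

# Illustrative helper: check that a value looks like an alphanumeric Job ID
# before passing it to the Privitar On Demand SDK or HTTP API. The exact ID
# format is an assumption based on the description above.
def looks_like_job_id(value: str) -> bool:
    return bool(re.fullmatch(r"[A-Za-z0-9]+", value))

print(looks_like_job_id("a1B2c3"))
print(looks_like_job_id("not/an id"))
```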