User Guide

What is a Schema?

A Privitar Schema is a description of the tables and columns of the input data, which is typically a Hive database or an Avro, Parquet, or CSV file. Before data can be de-identified with Privitar, a corresponding Schema must be created in Privitar.

A Schema represents the structure of the input data. It contains:

  • A list of tables. Each table must have a name.

  • For each table, a list of named columns, each with a particular data type.
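
The structure described above can be pictured as a small data model. The following Python sketch is illustrative only: the class names `Column`, `Table`, and `Schema`, the data type labels, and the example table are assumptions made for this illustration and are not part of the Privitar API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str  # hypothetical type label, e.g. "text" or "date"


@dataclass(frozen=True)
class Table:
    name: str
    columns: Tuple[Column, ...]

    def __post_init__(self) -> None:
        # Mirrors the rule that each table must have a name.
        if not self.name:
            raise ValueError("each table must have a name")


@dataclass(frozen=True)
class Schema:
    name: str
    tables: Tuple[Table, ...]

    def table_names(self) -> List[str]:
        return [table.name for table in self.tables]


# Hypothetical example: one table with three typed columns.
customers = Table(
    "customers",
    (
        Column("customer_id", "integer"),
        Column("full_name", "text"),
        Column("date_of_birth", "date"),
    ),
)
schema = Schema("crm", (customers,))
```

In this sketch a Schema holds only structure (names and types), never the data itself, which matches the idea that a Schema describes the input rather than containing it.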

Once the Schema exists, Policies can be created based on it. Any data that conforms to the Schema can then be processed by those Policies.

A known set of tables needs only a single Schema, which can then be reused for as many Policies as required.
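
The relationship between one Schema, several Policies, and conforming data can be sketched as follows. This is a minimal illustration under stated assumptions, not Privitar's implementation: the dictionary layout, the rule names `drop` and `tokenize`, the functions `conforms` and `usable_policies`, and the definition of conformance used here (every Schema table is present with identical column names and types) are all hypothetical.

```python
# Hypothetical Schema: table name -> {column name: data type label}.
schema = {"customers": {"customer_id": "integer", "full_name": "text"}}

# Two hypothetical Policies built on the same Schema; the rule names are invented.
policies = {
    "analytics": {"schema": schema, "rules": {("customers", "full_name"): "drop"}},
    "testing": {"schema": schema, "rules": {("customers", "full_name"): "tokenize"}},
}


def conforms(dataset_structure, schema):
    """Assumed definition: every Schema table is present with identical columns and types."""
    return all(
        dataset_structure.get(table) == columns
        for table, columns in schema.items()
    )


def usable_policies(dataset_structure, policies):
    """Names of the Policies whose Schema the dataset structure conforms to."""
    return sorted(
        name
        for name, policy in policies.items()
        if conforms(dataset_structure, policy["schema"])
    )


matching = {"customers": {"customer_id": "integer", "full_name": "text"}}
mismatched = {"customers": {"customer_id": "text"}}
```

Here `usable_policies(matching, policies)` returns both Policy names, because both were built on the same Schema, while `usable_policies(mismatched, policies)` returns an empty list.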