User Guide

Specifying Automatic Generalization

Before specifying Automatic Generalization, note the following points:

  • To add generalization for a column, you must first add the column to the Masking tab. Generalization processes the output value of the masking step. Usually, a Retain masking Rule is applied.

  • You do not need to specify a generalization strategy for every column.

To specify Automatic Generalization:

  1. Select Policies from the Navigation sidebar. The Policies page is displayed showing all the Policies that have been created.

  2. Select a Policy by clicking on the Policy name in the Name column. The Edit Policy page is displayed showing all the columns together with any rules that have been assigned to the columns.

  3. Select the Automatic Generalization tab.

  4. Click the ON slider to enable the option.

  5. Enter a value for k in the Minimum Cluster Size box.

  6. Select Add Column to add a column to be included in the Automatic Generalization strategy.

    For each column, configure the Automatic Generalization strategy for that column. The choice of strategy determines how the Automatic Generalization algorithm blurs values for that column. The extent to which the values in each column are blurred is determined automatically by the Automatic Generalization algorithm, based on the input data. Different generalization strategies apply to different Privitar data types.

    For a complete definition of the available strategies and which Privitar data types they can be applied to, see Automatic Generalization Strategy Types.

    A column can also be configured as Do not generalize. If you choose this option, you can also mark the column as sensitive by selecting the Sensitive check box. Specifying a column as Sensitive ensures that, for each cluster of rows with the same quasi-identifier values, there is a diverse mix of values in the sensitive columns. For more information, see Automatic Generalization Advanced Settings.

  7. Select the Priority check box if you want the column to be one of the two priority columns in the generalization strategy. Any column that is of interest for an intended analysis, or that has a particular structure that should be preserved, is a candidate for a priority column.

    Note

    A maximum of two priority columns can be selected at the same time.

  8. Click Save to save the Policy.
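The constraints described in the steps above can be summarized in a short sketch. This is not the Privitar API: the class names, field names, and the `validate` function are hypothetical, and the requirement that k is at least 1 is an assumption. The sketch only encodes the rules stated on this page (at most two priority columns, and the Sensitive check box applying to Do not generalize columns).

```python
from dataclasses import dataclass, field
from typing import List

MAX_PRIORITY_COLUMNS = 2               # from the Note under step 7
DO_NOT_GENERALIZE = "Do not generalize"


@dataclass
class ColumnSetting:
    """Hypothetical record of the choices made for one column in step 6 and 7."""
    name: str
    strategy: str                      # DO_NOT_GENERALIZE or a strategy name
    sensitive: bool = False
    priority: bool = False


@dataclass
class AutoGeneralizationSettings:
    """Hypothetical record of the whole Automatic Generalization tab."""
    minimum_cluster_size: int          # the value of k from step 5
    columns: List[ColumnSetting] = field(default_factory=list)


def validate(settings: AutoGeneralizationSettings) -> List[str]:
    """Return a list of problems; an empty list means no rule was broken."""
    problems = []

    # Assumption: k must be a positive integer.
    if settings.minimum_cluster_size < 1:
        problems.append("Minimum Cluster Size (k) must be at least 1")

    priority = [c.name for c in settings.columns if c.priority]
    if len(priority) > MAX_PRIORITY_COLUMNS:
        problems.append(
            f"At most {MAX_PRIORITY_COLUMNS} priority columns are allowed, "
            f"got {len(priority)}"
        )

    # Sensitive is offered only for columns set to Do not generalize.
    for c in settings.columns:
        if c.sensitive and c.strategy != DO_NOT_GENERALIZE:
            problems.append(
                f"Column '{c.name}' is Sensitive but is not set to "
                f"'{DO_NOT_GENERALIZE}'"
            )

    return problems


# "example-strategy" is a placeholder, not a real strategy name.
settings = AutoGeneralizationSettings(
    minimum_cluster_size=5,
    columns=[
        ColumnSetting("age", "example-strategy", priority=True),
        ColumnSetting("postcode", "example-strategy", priority=True),
        ColumnSetting("diagnosis", DO_NOT_GENERALIZE, sensitive=True),
    ],
)
print(validate(settings))
```

For the real list of strategy names and the data types they apply to, see Automatic Generalization Strategy Types.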

For more information about quasi-identifiers and minimum cluster sizes, see What is k-anonymity?
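As a rough illustration of what the minimum cluster size and the Sensitive option mean for the output data, the sketch below groups rows by their quasi-identifier values, checks that every cluster has at least k rows, and counts the distinct sensitive values in each cluster. It is a generic k-anonymity check written in plain Python, not the Automatic Generalization algorithm itself. The function names, column names, and sample rows are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, str]


def clusters(rows: List[Row], quasi_identifiers: List[str]) -> Dict[Tuple[str, ...], List[Row]]:
    """Group rows that share the same values in all quasi-identifier columns."""
    grouped = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in quasi_identifiers)
        grouped[key].append(row)
    return dict(grouped)


def is_k_anonymous(rows: List[Row], quasi_identifiers: List[str], k: int) -> bool:
    """True if every cluster contains at least k rows."""
    return all(len(group) >= k for group in clusters(rows, quasi_identifiers).values())


def distinct_sensitive_values(
    rows: List[Row], quasi_identifiers: List[str], sensitive: str
) -> Dict[Tuple[str, ...], int]:
    """Number of distinct values of the sensitive column within each cluster."""
    return {
        key: len({row[sensitive] for row in group})
        for key, group in clusters(rows, quasi_identifiers).items()
    }


# Invented sample: age and postcode are already generalized,
# diagnosis is left as-is and treated as the sensitive column.
rows = [
    {"age": "30-39", "postcode": "SW1", "diagnosis": "flu"},
    {"age": "30-39", "postcode": "SW1", "diagnosis": "asthma"},
    {"age": "30-39", "postcode": "SW1", "diagnosis": "flu"},
    {"age": "40-49", "postcode": "N1", "diagnosis": "diabetes"},
    {"age": "40-49", "postcode": "N1", "diagnosis": "diabetes"},
]
quasi = ["age", "postcode"]

print(is_k_anonymous(rows, quasi, k=2))   # True: cluster sizes are 3 and 2
print(is_k_anonymous(rows, quasi, k=3))   # False: one cluster has only 2 rows
print(distinct_sensitive_values(rows, quasi, "diagnosis"))
```

In this sample the second cluster contains a single diagnosis value, so although the data satisfies k = 2, every row in that cluster reveals the same sensitive value. That is the situation the Sensitive check box is intended to avoid.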