Data Anonymization


Overview

Purpose: This feature lets developers test applications in environments that simulate production without exposing sensitive data.

It replaces real record values with anonymized, fake ones in the target environment.

Feature Configuration

Developers can define which fields within the SObject should have their real values replaced with fake data, and the specific patterns for generating these fake values.

Example Configuration

Here is an example script configuration that anonymizes the Name field of the Account object:

{
    "objects": [
        {
            "query": "SELECT Id, Name FROM Account",
            "operation": "Insert",
            "externalId": "Name",
            "mockFields": [
                {
                    "name": "Name",
                    "pattern": "name",
                    "excludedRegex": "^DummyAccount$",
                    "includedRegex": "Account\\sTo\\sMask"
                }
            ],
            "updateWithMockData": true
        }
    ]
}

This configuration replaces the original Name values with randomly generated fake names during data insertion. The excludedRegex and includedRegex entries restrict which values are replaced.

Detailed MockField Object Configuration

  • name: Specifies the field to anonymize. Using "name": "all" applies anonymization rules to all fields, except those listed in excludeNames.
  • excludeNames: An array of field names to exclude from anonymization when "name": "all" is used.
  • pattern: Defines the type of fake data to generate for the field.
  • excludedRegex and includedRegex: JavaScript regular expressions that exclude values from, or include values in, anonymization. Adding the --row flag to an expression excludes or includes the entire row based on the matching field value.
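
The selection rules in this list can be sketched in a few lines of Python. This only illustrates the described behavior and is not the tool's own code. The helper names fields_to_mask and should_mask are hypothetical, and Python's re module stands in for JavaScript regular expressions (the two agree on the simple patterns used here). The rule that a value must match includedRegex and must not match excludedRegex when both are set is also an assumption.

```python
import re
from typing import List, Optional


def fields_to_mask(all_fields: List[str], mock_field: dict) -> List[str]:
    """Return the field names one mockFields entry applies to (hypothetical helper)."""
    if mock_field["name"] == "all":
        # "all" targets every field except those listed in excludeNames.
        excluded = set(mock_field.get("excludeNames", []))
        return [f for f in all_fields if f not in excluded]
    return [mock_field["name"]] if mock_field["name"] in all_fields else []


def should_mask(value: str,
                included_regex: Optional[str] = None,
                excluded_regex: Optional[str] = None) -> bool:
    """Decide whether a single value gets anonymized (assumed combination rule)."""
    if excluded_regex and re.search(excluded_regex, value):
        return False  # explicitly excluded values keep their original content
    if included_regex and not re.search(included_regex, value):
        return False  # when an include pattern is set, only matching values qualify
    return True


included, excluded = r"Account\sTo\sMask", r"^DummyAccount$"
print(fields_to_mask(["Id", "Name", "Phone"], {"name": "all", "excludeNames": ["Id"]}))
print(should_mask("Account To Mask 1", included, excluded))
print(should_mask("DummyAccount", included, excluded))
```

With the patterns from the example configuration, "Account To Mask 1" would be anonymized while "DummyAccount" would keep its original value.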

Special Anonymization Functions and Patterns

1. c_seq_number(prefix, from, step)

Generates a sequence of strings, each consisting of the prefix followed by a number. The number starts at "from" and increases by "step" for each subsequent value.

Example: c_seq_number('Account ', 100, 10) generates "Account 100", "Account 110", "Account 120", etc.
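
A minimal Python sketch of this sequence, written only to make the arithmetic concrete. The generator below reuses the pattern's name for readability, but it is an illustration and not the tool's implementation.

```python
from itertools import count, islice
from typing import Iterator


def c_seq_number(prefix: str, start: int, step: int) -> Iterator[str]:
    """Yield prefix + number, beginning at `start` and advancing by `step`."""
    for n in count(start, step):
        yield f"{prefix}{n}"


first_three = list(islice(c_seq_number("Account ", 100, 10), 3))
print(first_three)  # ['Account 100', 'Account 110', 'Account 120']
```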

2. c_seq_date(from, step)

Produces a sequence of dates starting from a given date, incremented by a defined step. Steps can be:

  • "d": Increment by one day.
  • "-d": Decrement by one day.
  • "d0": Use the same date as "from" without incrementing or decrementing.
  • "m": Increment by one month.
  • "-m": Decrement by one month.
  • "y": Increment by one year.
  • "-y": Decrement by one year.
  • "s": Increment by one second.
  • "-s": Decrement by one second.
  • "ms": Increment by one millisecond.
  • "-ms": Decrement by one millisecond.

Example: c_seq_date('2020-01-01', 'd') would generate dates "2020-01-01", "2020-01-02", "2020-01-03", etc.
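
The step codes can be illustrated with a short Python sketch. It is not the tool's implementation: the function reuses the pattern's name for readability, the helper _add_months is hypothetical, and clamping the day to the length of the target month (for example, 31 January plus one month giving the last day of February) is an assumption, since the behavior for such dates is not specified here.

```python
import calendar
from datetime import datetime, timedelta
from itertools import islice
from typing import Iterator

# Fixed-length steps map directly onto timedelta.
_FIXED_STEPS = {
    "d": timedelta(days=1),
    "s": timedelta(seconds=1),
    "ms": timedelta(milliseconds=1),
}


def _add_months(start: datetime, months: int) -> datetime:
    """Shift by whole months, clamping the day to the target month (assumption)."""
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return start.replace(year=year, month=month, day=day)


def c_seq_date(start: str, step: str) -> Iterator[datetime]:
    """Yield datetimes beginning at `start` and moving by `step` each time."""
    origin = datetime.fromisoformat(start)
    sign = -1 if step.startswith("-") else 1
    unit = step.lstrip("-")
    i = 0
    while True:
        offset = sign * i  # computed from the origin so clamped days do not drift
        if unit == "d0":
            yield origin
        elif unit in _FIXED_STEPS:
            yield origin + offset * _FIXED_STEPS[unit]
        elif unit == "m":
            yield _add_months(origin, offset)
        elif unit == "y":
            yield _add_months(origin, 12 * offset)
        else:
            raise ValueError(f"unsupported step: {step!r}")
        i += 1


dates = [d.date().isoformat() for d in islice(c_seq_date("2020-01-01", "d"), 3)]
print(dates)  # ['2020-01-01', '2020-01-02', '2020-01-03']
```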

3. c_set_value(value)

Sets the field to a specified value.

Example: c_set_value("Active") sets the field to "Active".

4. ids

Populates the given field with the source record ID value.

Special Regex Expressions

  • *: Matches any value, which is useful for enforcing anonymization on any non-empty field.
  • ^*: Matches only empty values, which is useful for excluding empty fields from anonymization.

Last updated on 20th Apr 2024