Data Anonymization


Overview

Purpose: This feature lets developers test applications in environments that simulate production without exposing sensitive data.

It replaces real record values with anonymized, fake ones in the target environment.

Feature Configuration

Developers can define which fields within the SObject should have their real values replaced with fake data, and the specific patterns for generating these fake values.

Example Configuration

Here is a script configuration example that anonymizes the Name field in the Account object:

{
    "objects": [
        {
            "query": "SELECT Id, Name FROM Account",
            "operation": "Insert",
            "externalId": "Name",
            "mockFields": [
                {
                    "name": "Name",
                    "pattern": "name",
                    "excludedRegex": "^DummyAccount$",
                    "includedRegex": "Account\\sTo\\sMask"
                }
            ],
            "updateWithMockData": true
        }
    ]
}

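This script replaces the original Name values with randomly generated fake names during data insertion. Because of the two regex filters, only values matching "Account To Mask" are anonymized, and a value equal to "DummyAccount" is left unchanged.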

Detailed MockField Object Configuration

  • name: Specifies the field to anonymize. Using "name": "all" applies the anonymization rule to all fields except those listed in excludeNames.
  • excludeNames: An array of field names to exclude from anonymization when "name": "all" is used. This lets you anonymize most fields of an object while preserving specific field data.
  • pattern: Defines the type of fake data to generate for the field.
  • excludedRegex and includedRegex: JavaScript regular expressions that exclude or include values for anonymization. The --row flag can be added to these expressions to exclude or include the entire row based on the matching field value.
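As an illustration of "name": "all" combined with excludeNames, the fragment below is a minimal sketch of the relevant part of an object entry. The field names listed in excludeNames and the choice of the word pattern are assumptions made for this example. Whether a single pattern suits every field type in your object is something to verify against your own data.

```json
{
    "mockFields": [
        {
            "name": "all",
            "excludeNames": ["Phone", "Website"],
            "pattern": "word"
        }
    ],
    "updateWithMockData": true
}
```

With this setup, every queried field except Phone and Website would be subject to the rule.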

Most Common Anonymization Patterns

Here's a streamlined list of patterns for use with SFDMU:

Address

  • Country: country
  • City: city
  • Street: street
  • Address: address
  • ZIP Code: zip

Personal Information

  • Name: name
  • Full Name: full_name
  • Username: username
  • First Name: first_name
  • Last Name: last_name
  • Email: email

Text

  • Sentence: sentence
  • Title: title
  • Text: text
  • Word: word

Internet

  • IP Address: ip
  • Domain Name: domain
  • URL: url

Numbers and Dates

  • Random Number: integer
  • Date: date
  • Time: time
  • Year: year
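To show how several of these patterns can be combined, here is a hedged sketch of a mockFields array with one entry per field. The Contact field names (FirstName, LastName, Email, MailingCity) are assumptions chosen for illustration. The array would sit inside an object entry in the same way as in the Account example above.

```json
{
    "mockFields": [
        { "name": "FirstName", "pattern": "first_name" },
        { "name": "LastName", "pattern": "last_name" },
        { "name": "Email", "pattern": "email" },
        { "name": "MailingCity", "pattern": "city" }
    ],
    "updateWithMockData": true
}
```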

Special Anonymization Functions and Patterns

1. c_seq_number(prefix, from, step)

Generates a sequence of strings with numbers, starting from a specified number, incremented by a defined step, prefixed by a string.

Example: c_seq_number('Account ', 100, 10) generates "Account 100", "Account 110", "Account 120", etc.
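In a configuration, the function call is passed as the pattern string. The sketch below assumes the Name field as the target and uses single quotes inside the pattern, mirroring the c_set_value example later in this document.

```json
{
    "name": "Name",
    "pattern": "c_seq_number('Account ', 100, 10)"
}
```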

2. c_seq_date(from, step)

Produces a sequence of dates starting from a given date, incremented by a defined step. Steps can be:

  • "d": Increment by one day.
  • "-d": Decrement by one day.
  • "d0": Use the same date as "from" without incrementing or decrementing.
  • "m": Increment by one month.
  • "-m": Decrement by one month.
  • "y": Increment by one year.
  • "-y": Decrement by one year.
  • "s": Increment by one second.
  • "-s": Decrement by one second.
  • "ms": Increment by one millisecond.
  • "-ms": Decrement by one millisecond.

Example: c_seq_date('2020-01-01', 'd') would generate dates "2020-01-01", "2020-01-02", "2020-01-03", etc.
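A possible mockFields setup using this function is sketched below. The field names CloseDate and Start_Date__c are hypothetical and only serve the illustration. The first entry uses the daily increment from the example above, and the second uses "d0" to apply the same date to every record.

```json
{
    "mockFields": [
        {
            "name": "CloseDate",
            "pattern": "c_seq_date('2020-01-01', 'd')"
        },
        {
            "name": "Start_Date__c",
            "pattern": "c_seq_date('2020-01-01', 'd0')"
        }
    ],
    "updateWithMockData": true
}
```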

3. c_set_value(value)

Sets the field to a specified value.

Example: c_set_value("Active") sets the field to "Active".

4. ids

Populates the given field with the source record ID value.
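For example, a MockField entry along these lines could be used to keep a trace of the source record in the target environment. The custom field name Source_Record_Id__c is hypothetical and would need to exist on the target object.

```json
{
    "name": "Source_Record_Id__c",
    "pattern": "ids"
}
```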

RAW_VALUE Keyword

If you want to update a field value based on its existing value, you can use the RAW_VALUE keyword (similar to the RAW_VALUE keyword used in Value Mapping). For example, to append .test to the original email address, you can use the following configuration:

{
    "name": "Email",
    "pattern": "c_set_value('RAW_VALUE.test')",
    "includedRegex": "*"
}

This example appends .test to the current value of the "Email" field if the original email value is present.

Special Regex Expressions

  • *: Matches any non-empty value. Useful, for example, for anonymizing only non-empty fields (when defined as an includedRegex).
  • ^*: Matches only empty values. Useful, for example, for populating all empty values (when defined as an includedRegex).
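The two special expressions can be sketched in a mockFields array as follows. The field names Title and Description are assumptions for this example. The first entry replaces only existing titles, and the second fills in only empty descriptions.

```json
{
    "mockFields": [
        {
            "name": "Title",
            "pattern": "title",
            "includedRegex": "*"
        },
        {
            "name": "Description",
            "pattern": "sentence",
            "includedRegex": "^*"
        }
    ],
    "updateWithMockData": true
}
```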

Full List of Available Patterns

  • The plugin uses the casual library to generate fake values.

  • For a complete catalog of patterns usable with the SFDMU Data Anonymization feature, see the casual library reference page. When using these patterns, refer to the pattern name directly, for example sentences(n = 3) rather than casual.sentences(n = 3), and put it in the pattern attribute of the MockField object. Note that the casual reference page does not cover the Special Anonymization Functions and Patterns listed above.
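As a short illustration of a pattern with arguments, the entry below passes sentences(n = 3) without the casual. prefix. The Description field is an assumption for this example.

```json
{
    "name": "Description",
    "pattern": "sentences(n = 3)"
}
```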

Last updated on 14th Aug 2024