Data Anonymization.

This feature lets you mask real data by replacing it with randomly generated values.


Feature overview.

If you're developing an application, you'll want to make sure you're testing it under conditions that closely simulate a production environment. In production, you'll probably have sensitive data that you usually do not want to expose in the testing environment. To help with this use case, I have added the option to replace a real record value with a fake one before uploading it to the target.

You can define the list of fields whose values need to be replaced with fake data, and the pattern for each field, in the mockFields section of the SObject.

Below is an example of a script that generates a sequence of random fake names in place of the original Name values before the records are inserted into the target org.

  "objects": [
      {
          "query": "SELECT Id, Name FROM Account",
          "operation": "Insert",
          "externalId": "Name",
          "mockFields": [
                {
                    "name": "Name",
                    "pattern": "name",
                    "excludedRegex": "^DummyAccount$",
                    "includedRegex": "Account\\sTo\\sMask"
                }
          ],
          "updateWithMockData": true
      }
  ]

This will replace the original account names and produce Account records like this:

  [
      {
          "Id" : "[RECORD ID1]",
          "Name" : "Miss Perry Larson"
      },
      {
          "Id" : "[RECORD ID2]",
          "Name" : "Ms. Alec Romaguera"
      },
      {
          "Id" : "[RECORD ID3]",
          "Name" : "Ms. Alec Romaguera"
      },
      {
          "Id" : "[RECORD ID4]",
          "Name" : "Ms. Drake Gerlach"
      },
      ...
  ]

MockField properties:

"name" is the name of the field to anonymize. Use "name": "all" to apply the rule to ALL FIELDS listed in the current object's query string. For example, you can anonymize all textual (string) fields except the Name field with the following setting:

  "objects": [
      {
          "query": "SELECT Id, type_string FROM Account",
          "operation": "Insert",
          "externalId": "Name",
          "mockFields": [
                {
                    "name": "all",
                    "excludeNames": ["Name"],
                    "pattern": "name",
                    "excludedRegex": "^DummyAccount$",
                    "includedRegex": "Account\\sTo\\sMask"
                }
          ],
          "updateWithMockData": true
      }
  ]

"excludeNames" is a string array of field names NOT to anonymize when the 'all' keyword is used instead of a specific field name.

"pattern" is the pattern used to generate the value that replaces the original field value.

"excludedRegex" is a JS regex. If the field value matches it, anonymization is skipped and the original value is kept.

"includedRegex" is a JS regex. A value that matches it is anonymized even if it was already excluded by the excludedRegex expression.

Add the "--row" flag to the excludedRegex or includedRegex expression (for example, "excludedRegex": "^DummyAccount$ --row") to skip checking the other fields and immediately exclude / include the entire row from anonymization when the field value matches the given expression.
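The interplay of the two expressions can be easier to follow in code. The sketch below is only one reading of the descriptions above, not the tool's actual implementation: the function name should_anonymize is hypothetical, Python's re module stands in for JS regex, the "--row" flag is not modeled, and it is an assumption that a value matching neither expression is anonymized.

```python
import re


def should_anonymize(value, excluded_regex=None, included_regex=None):
    """Decide whether a single field value gets replaced with fake data.

    Hypothetical helper illustrating the documented rules:
      * includedRegex wins, even over a matching excludedRegex;
      * otherwise a value matching excludedRegex keeps its original value;
      * ASSUMPTION: a value matching neither expression is anonymized.
    """
    text = "" if value is None else str(value)
    if included_regex and re.search(included_regex, text):
        return True
    if excluded_regex and re.search(excluded_regex, text):
        return False
    return True


# The expressions from the example script above.
EXCLUDED = r"^DummyAccount$"
INCLUDED = r"Account\sTo\sMask"

print(should_anonymize("DummyAccount", EXCLUDED, INCLUDED))     # False
print(should_anonymize("Account To Mask", EXCLUDED, INCLUDED))  # True
```

Checking the inclusion first keeps the override rule in one place: anything the includedRegex matches is anonymized regardless of what the excludedRegex says.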

You can find the complete list of available patterns here

(For the "pattern", omit the "casual." prefix, leaving only the name of the function. For example, you can write "name", "city", "street", "address", etc.)

Special Regex expressions:

  • * - matches ANY value. For example, you can force anonymization only when a value exists by specifying includedRegex='*'. Blank field values then remain unchanged.
  • ^* - the negation of '*'; matches only an EMPTY value.
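As a rough illustration of the two special expressions, the sketch below treats them as keywords that are checked before any ordinary regex matching. The helper name matches_expression is hypothetical, Python's re module stands in for JS regex, and it is an assumption that both null and the empty string count as EMPTY.

```python
import re


def matches_expression(expression, value):
    """Match a value against an expression, honoring the special forms.

    Hypothetical helper:
      '*'   matches any non-empty value,
      '^*'  matches only an empty value,
      anything else is treated as an ordinary regular expression.
    ASSUMPTION: None and "" are both considered empty.
    """
    is_empty = value is None or value == ""
    if expression == "*":
        return not is_empty
    if expression == "^*":
        return is_empty
    return re.search(expression, "" if is_empty else str(value)) is not None


print(matches_expression("*", "Acme"))   # True
print(matches_expression("*", ""))       # False
print(matches_expression("^*", None))    # True
```

The special forms have to be handled separately in this sketch because a bare '*' is not a valid pattern for Python's re module.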

Special anonymization patterns:

  • ids - populates the given field with the source Record Id value.
  • c_seq_number(prefix, from, step) - produces a sequence of strings, each ending with a number that starts at the "from" parameter and is incremented by "step". For example, c_seq_number('TheRecord ', 1, 2) will generate the strings "TheRecord 1", "TheRecord 3", "TheRecord 5", etc.
  • c_seq_date(from, step) - produces a sequence of dates starting from the "from" date and advancing by the given "step". For example, c_seq_date('2019-01-01', 'd') will generate the date sequence "2019-01-01", "2019-01-02", "2019-01-03", etc.

    Available values for the step parameter are:

      "d"   : + one day
      "-d"  : - one day
      "d0"  : the same date as "from", without increment / decrement
      "m"   : + one month
      "-m"  : - one month
      "y"   : + one year
      "-y"  : - one year
      "s"   : + one second
      "-s"  : - one second
      "ms"  : + one millisecond
      "-ms" : - one millisecond

  • c_set_value(value) - sets the field to the value passed in the function parameter, e.g. c_set_value(null) will reset the field value to null.
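To make the two sequence patterns concrete, here is a minimal Python sketch that reproduces the documented outputs. It is not the tool's implementation: the Python functions merely borrow the pattern names, the helpers _add_months and _shift are hypothetical, and clamping the day to the end of a shorter month when stepping by month or year is an assumption.

```python
import calendar
from datetime import datetime, timedelta
from itertools import islice


def c_seq_number(prefix, start, step):
    """Yield prefix + number, starting at `start` and increasing by `step`."""
    number = start
    while True:
        yield f"{prefix}{number}"
        number += step


def _add_months(moment, months):
    """Shift by whole months; ASSUMPTION: clamp the day to the month's end."""
    year, month_index = divmod(moment.year * 12 + (moment.month - 1) + months, 12)
    month = month_index + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


_FIXED_STEPS = {
    "d": timedelta(days=1),
    "s": timedelta(seconds=1),
    "ms": timedelta(milliseconds=1),
}


def _shift(moment, step):
    """Apply one documented step value ("d", "-d", "d0", "m", "y", ...)."""
    if step == "d0":
        return moment  # same date every time, no increment / decrement
    sign = -1 if step.startswith("-") else 1
    unit = step.lstrip("-")
    if unit in _FIXED_STEPS:
        return moment + sign * _FIXED_STEPS[unit]
    if unit == "m":
        return _add_months(moment, sign)
    if unit == "y":
        return _add_months(moment, 12 * sign)
    raise ValueError(f"unknown step: {step!r}")


def c_seq_date(start, step):
    """Yield datetimes beginning at the ISO date `start`, moved by `step`."""
    current = datetime.fromisoformat(start)
    while True:
        yield current
        current = _shift(current, step)


print(list(islice(c_seq_number("TheRecord ", 1, 2), 3)))
# ['TheRecord 1', 'TheRecord 3', 'TheRecord 5']
print([d.date().isoformat() for d in islice(c_seq_date("2019-01-01", "d"), 3)])
# ['2019-01-01', '2019-01-02', '2019-01-03']
```

Generators are used because each pattern is an open-ended sequence: one value is drawn per record, however many records the query returns.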

See also:

Full export.json format.

Last updated on Sa Oct 2022