Mask Data Stored in Teradata

You can use Fast Data Masker masking functions as Hive user-defined functions (UDFs) to mask data stored in Teradata. The stored data must be structured data with a defined schema. 
tdm49
You can use Fast Data Masker masking functions as Hive user-defined functions (UDFs) to mask data stored in Teradata. The stored data must be structured data with a defined schema. 
Create Teradata Type Transformation Map in Datamaker
  1. Connect Datamaker to a Teradata data source
  2. Create a project of type Teradata in Datamaker
    • Project name: ORDERS TERADATA
    • File Publish DBMS: Teradata
    • Publish To: Data Target
    • Inherit Tables: Yes
    • Type: DB
  3. Register the tables in the project.
  4. Click
    Project,Transformation Maps
    to create a transformation map.
    Important: Set the DBMS for the map to TERADATA.
  5. Use the drop-down menus under
    Transformation
    to set the masking function for each column that you want to mask.
  6. Save the transformation map.
Teradata supports the following masking functions:
ADDPERCENT,
n
Modifies an existing value by adding
n
percent of the original value.
Applies to:
Numeric data types
Example:
 ADDPERCENT,40
We add 40% to the value in the APPROXIMATE_INCOME column in the table PERSONS. The original value 18000 is masked as 25200.
ADDRANDDAYS,
min
,
max
Adds a random number of days between 
min
 and 
max 
to an existing date value.  
Applies to:
DATE data types
Example: 
ADDRANDDAYS,1,30
We want to add a random number of days between 1 and 30 to the existing value in the EXPIRATION_DATE column in the table CREDIT_CARDS. The original value 2018-01-07 is masked as 2018-01-14.
ADDRANDMONTHS,
min
,
max
Adds a random number of months between 
min
 and 
max 
to an existing date value.
Applies to:
DATE data types
Example:
 ADDRANDMONTHS,1,12
The EXPIRATION_DATE column in the table CREDIT_CARDS adds a random number of months between 1 and 12 to the existing value. The original value 2018-01-07 is masked as 2018-11-07.
ADDRANDOMNUM,
min
,
max
Adds a random number between
min
and
max
to the existing numeric value.
Applies to:
Numerics
Example:
 ADDRANDOMNUM,10,90
You want to add a random number between 10 and 90 to the existing value in the ORDER_TOTAL column in the table ORDERS. The original value 1721 is masked as 1788.
AMEXCARD
Generates a random American Express credit card number. The generated number has a valid number format and check digit.
Applies to:
Character columns
Example:
 AMEXCARD
You want random valid Amex numbers in the CARD_NUMBER column in the table CREDIT_CARDS. The original value 371449635398431 is masked as 375548058820296.
CHARHASH
Hashes a character string to consistently change to a new value. This function does not keep the original format of the string, nor does it guarantee uniqueness.
Applies to:
Character columns
Example: 
CHARHASH
You want to replace the text in the EMAIL column in the table PERSONS by random text. The original value  is masked as 
iBNr35^4\]}CU;<@/Q/qAGm=
.
CHARHASHPAT
Hashes a character string to consistently change to a new value. The function keeps the original format of the string, but does not guarantee uniqueness.
Applies to:
Character columns
Example:
 CHARHASHPAT
You want to replace the text in the EMAIL column in the table PERSONS with similar values. The original value  is masked as [email protected]
CREDITCARDKEEPTYPE
Generates a random credit card number while retaining the card type. For example, an existing Visa card number will be masked as a new Visa card number, Amex as a new Amex number etc. The generated number has a valid number format and check digit.
Applies to:
Character columns
HASH[,
n
]
Hashes a number to consistently change to another number. This function does not guarantee unique values. The optional numeric parameter, for example "
HASH,123",
provides the same function as HASH but with a hash key of 123.
Applies to:
Numeric columns
Example:
HASH
You want to change the PERSON_ID column in the table PERSONS consistently to a new number. The original value 202 is masked as 682671856
HASHLOV,
SEED DATA CATEGORY NAME
Hashes the current value to get an integer value. This integer value is used to consistently look up a value from a seed list.
Applies to:
character columns
Example:
HASHLOV,NAME - FEMALE INDIAN FIRST
You want to change the values in the FIRST_NAME column in the table PERSONS consistently to female Indian first names. The original value Donnachadh is masked as Poonam.
MASTERCARD
Generates a random Mastercard credit card number. The number has a valid number format and check digit.
Applies to:
Character columns
Example:
MASTERCARD
You want to mask the values in the CARD_NUMBER column in the table CREDIT_CARDS by random Mastercard numbers. The original value 4547357219782851 is masked as 5131725103379163.
RANDLOV,
SEED DATA CATEGORY NAME
Picks a random value from a seed list for the given category name.
Applies to:
character columns
Example:
 RANDLOV,NAME - FEMALE INDIAN FIRST
You want to change the value in the FIRST_NAME column in the table PERSONS randomly to a female Indian first name. The original value Lennox is masked as Sumita.
RANDOMNUM,
min
,
max
Generates a random number between the supplied minimum and maximum values.
Applies to:
numeric columns
Example:
 RANDOMNUM,10,70
You want to set the value in the QUANTITY column in the table ORDER_ITEMS to a random number between 10 and 70. The original value 40 is masked as 58.
RANDOMTXT,
min
,
max
Generates random text within the specified minimum and maximum length.
Applies to:
character columns
Example: 
RANDOMTXT,10,20
You want to replace the values in the OBJECT_NOTES column in the table ORDER_ITEMS by random text of length between 10 and 20. The original value "Test notes" is masked as "wwohultvirbaqmg".
REPLACE,
abc
,
xyz
Replaces all instances of the string
abc 
in the original value with the string
xyz
. The replacement is case sensitive.
Applies to:
character columns
Example:
REPLACE,provider.com,post.com
You want to replace all occurrences of yahoo.com in the EMAIL column in the table PERSONS with post.com. The original value [email protected] is masked as [email protected], while [email protected] remains the same.
SHUFFLE
Shuffles the columns values in a table.
Applies to:
numeric, date, and character columns
Example:
 SHUFFLE
You want to shuffle the values in the EMAIL column in the table PERSONS relative to the other column values. The original table (left) is masked as follows (right):
First Name
Last Name
Email
Tevin
Carr
Clennan
Riddle
First Name
Last Name
Email
Tevin
Carr
Clennan
Riddle
SSNKEEPTYPE[,
separator
]
Masks a US social security number, retaining the first 3 digits, which determine the area. Optionally, specify a 
separator
character.
Example: 
SSNKEEPTYPE,-
You want to generate new values for the SSN column in the table PERSONS, keeping the first 3 digits, and using a hyphen as a separator. The original value 473925722 is masked as 473-81-6680.
TAKERANDDAYS,
min
,
max
Subtracts a random number of days between
min
 and 
max 
from an existing date value.
Applies to:
DATE data types
Example: 
TAKERANDDAYS,1,30
You want to mask the EXPIRATION_DATE column in the table CREDIT_CARDS by subtracting a random number of days between 1 and 30 off the existing value. The original value 2018-01-07 is masked as 2017-12-16
TAKERANDMONTHS,
min
,
max
Subtracts a random number of months between 
min
 and 
max 
from an existing date value.
Applies to:
DATE data types
Example: 
TAKERANDMONTHS,1,12
You want to mask the EXPIRATION_DATE column in the table CREDIT_CARDS by subtracting a random number of months between 1 and 12 from the existing value. The original value 2018-01-07 is masked as 2017-09-07
TRANSLATE,
abc
,
xyz
Replaces all occurrences of each character in
abc
by its corresponding character in
xyz
. That is, you want to replace a by x, b by y, and c by z. Characters in the original value that are not in ABC are not replaced. The replacement is case sensitive.
Applies to:
Character data types
Example:
TRANSLATE,abcdef,hrtsgj
You want to replace all occurrences of the characters "abcdef" with the corresponding character in "hrtsgj", that is, replace a with h, replace b with r, etc. The original value [email protected] is masked as [email protected]
VISACARD
Generate a random Visa card credit card number. The number will have a valid number format and check digit.
Applies to:
Character columns
Example:
VISACARD
You want to replace the value in the CARD_NUMBER column in the table CREDIT_CARDS by a random Visa card number. The original 5131725103379163 is masked as 4547357219782851.
XREFLOOKUP
Looks up a masked value from the gtscr_xref table which is in the database where the scramble functions are installed. A prerequisite for this to work is that the gtscr_xref table is pre-populated with old and masked values for the identifier. If the original value is not found in the cross reference table (column rx_old_value) then it is ignored and the original value is retained.
Note:
You must supply the
Cross Ref Ident
value.  
Applies to:
Numeric and Character columns
Example:
XREFLOOKUP
You want to mask values in the column SSN using the lookup identifier SSN in the cross-reference table. The original value 111111112 is masked as 473925722.
Rx Ref Id
Rx Old Value
Rx New Value
SSN
111111112
473925722
SSN
111111115
323632875
Generate Masking Scripts in TDM Subset
Generate scripts to move data.
  1. Open TDM Subset.
  2. Select
    File, Build Database Actions
    . The Database Action - Build Teradata Window opens.
  3. Go to the
    Extra Details
    tab and fill in the following Teradata-specific fields.
    1. Action name
    2. Functions Database
      – if generating scramble scripts, this is the name of the database where the scramble C UDF functions have been installed – choose from drop down list or type database name
    3. Export Log Database
      – the database where the export log tables will be created. Typically this will be the database where the data is extracted from – choose from drop down list or type database name.
    4. Load Log Database
      – the database there the load log tables are created. Typically this will be the target database where the data will be loaded into – choose from drop down list or type database name.
    5. Source Database
      – the database the data will be extracted from– choose from drop down list or type database name, or choose environment variable (name of the variable enclosed in %)
    6. Target Database
      – the database the data will be loaded into – choose from drop down list or type database name, or choose environment variable (name of the variable enclosed in %)
    7. Stage Source Database
      – for stage insert scripts, the database the data will be selected from – choose from drop down list or type database name.
    8. Stage Target Database
      – for stage insert scripts, the database the data will be inserted into - choose from drop down list or type database name, or choose environment variable (name of the variable enclosed in %)
    9. Log Directory
      – (Windows only) the directory the log files will go to. This can be a directory path or a Windows environment variable that points to a directory or choose environment variable (name of the variable enclosed in %)
    10. Extract file directory
      — (Windows only) the directory the fast export data files will go to, and where the load scripts will load from – this can be a directory path or a windows environment variable that points to a directory or choose environment variable (name of the variable enclosed in %)
    11. Get Metadata from repository connection
      — (Optional) Extract the details directly from the Datamaker repository to speed up the response when building masking scripts. 
    12. Click
      Generate
      to extract the details as normal.
  4. Go to the
    Extract Tables
    tab
    1. Click
      Open Extract Directory
      and browse to the directory where you saved extracts definitions.
    2. Select check boxes to add extract tables to the list at the bottom of the screen. If you wish to choose from ALL the schema tables rather than those defined in a Subset extract definition, then check the ‘Use Connection Tables Only’ checkbox. This leaves the ‘Subset Tables’ list box empty and fills the ‘All Data Tables’ list with the tables for the current connection. You can then move tables between this list and the ‘No Data Tables’ list to define which tables have their data extracted.
    3. Click Generate to create the appropriate export and import scripts to populate the subset schema
  5. (Optional) Go to the
    Extra Scripts
    tab.
    1. Drop / Create Indexes – Creates SQL scripts to drop and recreate indexes and primary keys. These are called by the relevant load scripts since Teradata does not support loads into tables with secondary indexes. This is the default.
    2. Truncate / Delete From Load Tables – Clear down data from the target database prior to loading data
    3. Disable / Enable or Drop / Create FKs – For Teradata, this drops foreign key constraints prior to the data load, then recreates them after the load has completed
Use bind variables in Teradata Windows exports when defining extracts against Teradata. For example:
select * from orders.PERSONS
where person_id < :id
AND FIRST_NAME  = :NAME
The binds are added to the driving table SQL. Create environment variables with the same name in the Windows system to resolve the variables at run time. Enclose non-numeric values in single quotes.