Masking Options

The Options tab includes additional parameters to control the masking run, audit options, cross-referencing options, and seed table options. A full list of the available options, what they do, and what values apply is as follows:
tdm481
The
Options
tab includes additional parameters to control the masking run, audit options, cross-referencing options, and seed table options. A full list of the available options, what they do, and what values apply is as follows:
2
2
Audit
  • AUDIT
    =ALL
    All rows are audited.
  • AUDIT
    =ROWnnn
    nnn represents the number of rows to be audited. For example, ROW1000 produces an audit of the first 1000.
  • AUDIT
    =SAMPLEnnn
    Every nnn rows is displayed. For example, SAMPLE100 produces an audit of every 100th row.
  • AUDITDIR
    Set the path to the audit file directory.
  • AUDITEPASSWORD
    Set the encrypted password for the audit ZIP file.
  • AUDITFILE
    The name of the file in which to store the audit information; myaudit.csv.
  • AUDITONLYCOLUMNS
    Mention the specific list of columns to be audited in the format—table1.column1, table2.column2, table3.column3.
  • AUDITPASSWORD
    Set the password for the audit ZIP file.
  • AUDITVALUES
    N—show only new values, not the old values in the AUDIT file. The default value is to show both.
  • AUDITZIP
    Zip and encrypt the audit CSV file. Values are winzip or jzip for the program to use for the zip.
Cross-Reference
  • CASEINSENSITIVEXREF
    Make comparisons case insensitive (for cross-reference).
  • CROSSREFCONNECT
    The name of the connection file to read and write cross-reference data; Connectscramble.txt.
  • CROSSREFTABLE
    The name of the table to read and write cross-reference data to; Gtsrc_xref.
  • ENCRYPTXREF
    Encrypt the old values in the cross-reference table.
  • TRIMMEDXREF
    Trim values before comparing (for cross-reference).
Date
  • CDATE
    Override the date (today) for the purposes of date calculation functions. For example, DOB (format: YYYYMMDD).
  • HIGHDATE
    Override the highest data that offset date functions process. For example, dates later than 22000101 are ignored.
  • LOWDATE
    Override the lowest data that offset date functions process. For example, dates earlier than 19000101 are ignored.
Directory
  • BACKUPDIR
    The directory name for backup files. If this setting is blank, the default from the Fast Data Masker directory is used.
  • ERRORDIR
    The directory name for error files. If this setting is blank, the default from the Fast Data Masker directory is used.
  • LOGDIR
    The directory name for log files. If this setting is blank, the default from the Fast Data Masker directory is used.
  • SEEDFILEDIR
    The directory name where seed data files are stored. The default value is seedtables sub-directory.
Languages
Fast Data Masker currently supports three languages for masking—English, German, and Spanish.
When Fast Data Masker starts, it verifies the current default locale. If the language is supported (one of the three listed), it processes messages in that language. If the language is not supported, it defaults to English unless set in the options file.
You can override the local language by altering the language option as follows:
  • LANGUAGE
    Use one of the three languages—en (English), de (German), or es (Spanish)
Large Tables
  • LARGETABLESPLITENABLED
    Enables large tables processing. Set this parameter to Y to enable, and to N to disable.
    Default: N
  • LARGETABLESPLITSIZE
    Defines the minimal number of rows for FastDataMasker to start using large table processing.
    Default: 1000000
Parallelism
  • PARALLEL
    =n
    Enables users to attempt to run ‘n’ number of concurrent threads. This is constrained by the number of physical cores and processors available.
    Note:
    When masking Microsoft SQL Server tables, use the PARALLEL option only when the table has a primary key or unique index. If the table does not have a primary key or unique index and you use the PARALLEL option, masking is either slow or does not work.
    Fast Data Masker assigns a separate thread for each table in a CSV if there is more than one. However, a CSV would have only one table, which can be split using the WHERE clauses. For example, a CSV using the WHERE clauses below would have four splits:
    WHERE, CUSTID<100
    ….
    WHERE, CUSTID BETWEEN 100 AND 200
    ….
    WHERE, CUSTID BETWEEN 200 AND 300
    ….
    WHERE, CUSTID>100
Seed Tables
  • CASEINSENSITIVESEED
    Makes search for rd_ref_value column, using RANDLOV1 function case insensitive.
  • LOADALLSEEDDATA
    N (default)—loads all seed data into Java memory for the RANDLOV1 function, irrespective of the distribution of rd_ref_values in the table.
  • SEEDTABLECONNECT
    The name of the connection file to get seed data from; Connectscramble.txt.
  • SEEDTABLE
    The name of the table to get the seed data from; Gtsrc_reference_data.
  • SEEDTABLECOLUMNS
    Comma separated list of the columns in SEEDTABLE.
Shuffle
  • SHUFFLEDISTINCT
    This option takes Y or N.
    Y selects distinct values for the shuffle creates.
    N is the default and selects all values.
  • SHUFFLELIMIT
    n—only select n values for the shuffle.
  • SHUFFLEONLY
    This option takes Y or N.
    Y does not update the database. Instead, it just produces the shuffle files or database shuffle values.
Poststep
  • POSTSTEP
    This is the path to a SQL file to perform post-steps, the SQL should be ANSII standard insert, update, or delete operations. The SQL file is executed after the masking.
Prestep
  • PRESTEP
    This is the path to a SQL file to perform pre-steps, the SQL should be ANSII standard insert, update, or delete operations. The SQL file is executed prior to masking.
Other
  • BADDATESTRING
    For DOB/DOD on dates stored in character fields, specify the data to replace the invalid data as YYYY/MM/DD.
  • BATCHSIZE
    Number of lines to commit to a database at a time.
  • BLANKSASNULLS
    Set to Y. For character data types, if the column contains blanks to the column width, treat as a null for keepnulls in the masking CSV.
  • CASEINSENSITIVEHASHLOV
    This option is always used with the HASHLOV function. With this option, you can define whether the HASHLOV function masks the data in a case-sensitive or case-insensitive mode. By default, this option is set to the case-insensitive mode. Enter N as a value to set this option to the case-sensitive mode.
    Example:
    The firstname and lastname columns include the data in the following format:
    firstname
    lastname
    Jean
    Muller
    JEAN
    MULLER
    jean
    muller
    jEAN
    mULER
    When the option is set to the case-insensitive mode (default mode) and you use the HASHLOV function, the data is masked as follows:
    firstname
    lastname
    Isabel
    Rowland
    Isabel
    Rowland
    Isabel
    Rowland
    Isabel
    Rowland
    Now, when you set the option to the case-sensitive mode and use the HASLOV function for masking, the same data is masked as follows:
    firstname
    lastname
    Melany
    Maynard
    Isabel
    Rowland
    Tania
    Sutton
    Daniella
    Buchanan
  • CHUNKSIZE
    (File masking only)
    Number of lines to write to a file at a time.
  • COMMIT
    =nnnn
    Commit after nnn rows for each table to be masked. For example, 1000 forces a commit after 1000 rows for each table.
  • DBUPDATES
    N—run in simulation mode
    S—create SQL file <table name>_UPDATES.sql
    P—see Prestep and Poststep options
    Note:
    DBUPDATES=S only available for non-DB2 databases with unique or primary key columns.
  • DB2BATCHUPDATE
    N (Default). If Y, use fast batched updates rather than standard "update where current of" cursor method (DB2 ONLY).
  • DIAGLEVEL
    Possible values: 0, 1, 2 or 4. Debug info is generated according to value.
  • DROPRESTART
    Set the value to N if you do not want to drop the restart column (which Fast Data Masker creates) after masking is complete. Retaining the restart column is helpful in scenarios where you want to use it for audit purposes. The default value is Y. This option is not applicable if you explicitly specify your own restart column.
  • EMPTYASNULL
    Set to Y. For character data types, if the column contains a blank or spaces, treat as a null for keepnulls in the masking CSV.
  • FASTIGNORE
    Set the value to Y if you want to use this option. This options is always used with the IGNORE function. When used with the IGNORE function, this option improves the masking performance.
    The IGNORE function inserts as well as retrieves data from the cross-reference table, which is why it is row-by-row processing and slow. However, with the FASTIGNORE option, you update the data in one SQL statement, rather than row-by-row (which is very slow).
    Review the following considerations when using the FASTIGNORE option:
    • Ensure that the cross-reference table and the table to be masked are on the same RDBMS.
    • Ensure that the cross-reference table and the table to be masked are on the same server.
    • Ensure that the cross-reference table must have old and new values pre-populated for the chosen cross-reference identifier.
    • Collation of the table to be masked and gtsrc_xref should be the same.
  • FETCHSIZE
    Number of lines to read from a database or file at a time.
  • FORMATENCRYPTDELIMITER
    Defines one or more single-character delimiters, to separate the value to mask into strings. FORMATENCRYPT then masks each string separately.
    - Each delimiter must be either a single character, or the case-insensitive keyword SPACE (to indicate a space, i.e. " "). If you enter more than one consecutive character, FORMATENCRYPT ignores the delimiter.
    - Separate delimiters with a space, for example "
    - , \ space
    " to use the characters hyphen, comma, backslash and space as delimiters.
  • FORMATENCRYPT1DELIMITER
    Defines one or more single-character delimiters, to separate the value to mask into strings. FORMATENCRYPT1 then masks each string separately.
    Delimiter logic for FORMATENCRYPT1DELIMITER is the same as FORMATENCRYPTDELIMITER.
  • GLOBALLICENSE
    Set the path to your global license key.
  • LOWERCASEKEY
    Specify a lowercase key for masking; for example,
    qazwsxedcrfvtgbyhnujmikolp
    . Ensure that the key does not start with the character
    a
    . You can use this option with the FORMATENCRYPT masking function.
    For example, if you use this option with the FORMATENCRYPT function, the function starts making the first occurrence of the lowercase character, which the function ignores if this option is not set.
  • MD5HASHLOV
    Set this value to Y to use an MD5 hashing algorithm with the HASHLOV functions. Leave this value blank to use the default Java hashing algorithm.
  • NUMERICKEY
    Specify a numeric key for masking; ; for example,
    8524569173
    . Maximum 15 digits are allowed. If you want to enter more than 15 digits, provide value in quotes; for example,
    "9182736450514239687"
    .
    You can use this option with the FORMATENCRYPT, FORMATLUHN, FORMATVIN, and HASHTURKISHID masking functions.
  • ORDERBY
    Y (Default)—you might want to turn it off for RMS(VMS). This value decides whether selected data is ordered by PK column or not.
  • PROCESSCOUNT
    Process count to limit the number of rows processed per table.
  • RELAXNONINDEXVALIDATION
    XREF on non-varchar columns of no PK/UK tables.
  • RESETRESTART
    Reset restart column (de_ident_ind) to null, masking starts from Row 1.
  • RESTART
    Restart mask from last fail point. Requires varchar column (de_ident_ind) added to table(s) to be masked, or use existing (empty) column in the tables to be masked. Add this column name to the "restart column" column in your masking CSV.
  • TRIMVALUES
    This option can accept "Y" for yes and "N" for no. Yes implies you want to remove the leading and trailing spaces from all the columns that you have selected for masking. If you do not specify the value, Fast Data Masker uses the default value N.
  • UPPERCASEKEY
    Specify an uppercase key for masking; for example,
    PLOKMIJNUHBYGVTFCRDXESZWAQ
    . You can use this option with the FORMATENCRYPT masking function.
  • USERFAASINDEX
    Use RFA in the WHERE clause of the update SQL (VMS(RMS) DB only).
  • USERRNASINDEX
    For non-indexed tables, use RRN function to get row identifier, and then combine with DB2BATCHUPDATE to use fast method (DB2400 only).
  • WHEREASSUBSET
    Y—Flat files are scrambled and subsetted according to the WHERE clause.
    N—Use WHERE as criteria to mask and generate output of all records.
    Default: Y
  • XMLNAMESPACE
    One or more tags in XML file or data contain xmlnamespace elements.