The Options tab includes additional parameters to control the masking run, audit options, cross-referencing options, and seed table options. A full list of the available options, what they do, and what values apply is as follows:
Optionstab includes additional parameters to control the masking run, audit options, cross-referencing options, and seed table options. A full list of the available options, what they do, and what values apply is as follows:
- AUDIT=ALLAll rows are audited.
- AUDIT=ROWnnnnnn represents the number of rows to be audited. For example, ROW1000 produces an audit of the first 1000.
- AUDIT=SAMPLEnnnEvery nnn rows is displayed. For example, SAMPLE100 produces an audit of every 100th row.
- AUDITDIRSet the path to the audit file directory.
- AUDITEPASSWORDSet the encrypted password for the audit ZIP file.
- AUDITFILEThe name of the file in which to store the audit information; myaudit.csv.
- AUDITONLYCOLUMNSMention the specific list of columns to be audited in the format—table1.column1, table2.column2, table3.column3.
- AUDITPASSWORDSet the password for the audit ZIP file.
- AUDITVALUESN—show only new values, not the old values in the AUDIT file. The default value is to show both.
- AUDITZIPZip and encrypt the audit CSV file. Values are winzip or jzip for the program to use for the zip.
- CASEINSENSITIVEXREFMake comparisons case insensitive (for cross-reference).
- CROSSREFCONNECTThe name of the connection file to read and write cross-reference data; Connectscramble.txt.
- CROSSREFTABLEThe name of the table to read and write cross-reference data to; Gtsrc_xref.
- ENCRYPTXREFEncrypt the old values in the cross-reference table.
- TRIMMEDXREFTrim values before comparing (for cross-reference).
- CDATEOverride the date (today) for the purposes of date calculation functions. For example, DOB (format: YYYYMMDD).
- HIGHDATEOverride the highest data that offset date functions process. For example, dates later than 22000101 are ignored.
- LOWDATEOverride the lowest data that offset date functions process. For example, dates earlier than 19000101 are ignored.
- BACKUPDIRThe directory name for backup files. If this setting is blank, the default from the Fast Data Masker directory is used.
- ERRORDIRThe directory name for error files. If this setting is blank, the default from the Fast Data Masker directory is used.
- LOGDIRThe directory name for log files. If this setting is blank, the default from the Fast Data Masker directory is used.
- SEEDFILEDIRThe directory name where seed data files are stored. The default value is seedtables sub-directory.
Fast Data Masker currently supports three languages for masking—English, German, and Spanish.
When Fast Data Masker starts, it verifies the current default locale. If the language is supported (one of the three listed), it processes messages in that language. If the language is not supported, it defaults to English unless set in the options file.
You can override the local language by altering the language option as follows:
- LANGUAGEUse one of the three languages—en (English), de (German), or es (Spanish)
- PARALLEL=nEnables users to attempt to run ‘n’ number of concurrent threads. This is constrained by the number of physical cores and processors available.Note:When masking Microsoft SQL Server tables, use the PARALLEL option only when the table has a primary key or unique index. If the table does not have a primary key or unique index and you use the PARALLEL option, masking is either slow or does not work.Fast Data Masker assigns a separate thread for each table in a CSV if there is more than one. However, a CSV would have only one table, which can be split using the WHERE clauses. For example, a CSV using the WHERE clauses below would have four splits:WHERE, CUSTID<100….WHERE, CUSTID BETWEEN 100 AND 200….WHERE, CUSTID BETWEEN 200 AND 300….WHERE, CUSTID>100
- CASEINSENSITIVESEEDMakes search for rd_ref_value column, using RANDLOV1 function case insensitive.
- LOADALLSEEDDATAN (default)—loads all seed data into Java memory for the RANDLOV1 function, irrespective of the distribution of rd_ref_values in the table.
- SEEDTABLECONNECTThe name of the connection file to get seed data from; Connectscramble.txt.
- SEEDTABLEThe name of the table to get the seed data from; Gtsrc_reference_data.
- SEEDTABLECOLUMNSComma separated list of the columns in SEEDTABLE.
- SHUFFLEDISTINCTThis option takes Y or N.Y selects distinct values for the shuffle creates.N is the default and selects all values.
- SHUFFLELIMITn—only select n values for the shuffle.
- SHUFFLEONLYThis option takes Y or N.Y does not update the database. Instead, it just produces the shuffle files or database shuffle values.
- POSTSTEPThis is the path to a SQL file to perform post-steps, the SQL should be ANSII standard insert, update, or delete operations. The SQL file is executed after the masking.
- PRESTEPThis is the path to a SQL file to perform pre-steps, the SQL should be ANSII standard insert, update, or delete operations. The SQL file is executed prior to masking.
- BADDATESTRINGFor DOB/DOD on dates stored in character fields, specify the data to replace the invalid data as YYYY/MM/DD.
- BATCHSIZENumber of lines to commit to a database at a time.
- BLANKSASNULLSSet to Y. For character data types, if the column contains blanks to the column width, treat as a null for keepnulls in the masking CSV.
- CASEINSENSITIVEHASHLOVThis option is always used with the HASHLOV function. With this option, you can define whether the HASHLOV function masks the data in a case-sensitive or case-insensitive mode. By default, this option is set to the case-insensitive mode. Enter N as a value to set this option to the case-sensitive mode.Example:The firstname and lastname columns include the data in the following format:firstnamelastnameJeanMullerJEANMULLERjeanmullerjEANmULERWhen the option is set to the case-insensitive mode (default mode) and you use the HASHLOV function, the data is masked as follows:firstnamelastnameIsabelRowlandIsabelRowlandIsabelRowlandIsabelRowlandNow, when you set the option to the case-sensitive mode and use the HASLOV function for masking, the same data is masked as follows:firstnamelastnameMelanyMaynardIsabelRowlandTaniaSuttonDaniellaBuchanan
- COMMIT=nnnnCommit after nnn rows for each table to be masked. For example, 1000 forces a commit after 1000 rows for each table.
- DBUPDATESN—run in simulation modeS—create SQL file <table name>_UPDATES.sqlP—see Prestep and Poststep optionsNote:DBUPDATES=S only available for non-DB2 databases with unique or primary key columns.
- DB2BATCHUPDATEN (Default). If Y, use fast batched updates rather than standard "update where current of" cursor method (DB2 ONLY).
- DIAGLEVELPossible values: 0, 1, 2 or 4. Debug info is generated according to value.
- DROPRESTARTSet the value to N if you do not want to drop the restart column (which Fast Data Masker creates) after masking is complete. Retaining the restart column is helpful in scenarios where you want to use it for audit purposes. The default value is Y. This option is not applicable if you explicitly specify your own restart column.
- EMPTYASNULLSet to Y. For character data types, if the column contains a blank or spaces, treat as a null for keepnulls in the masking CSV.
- FASTIGNORESet the value to Y if you want to use this option. This options is always used with the IGNORE function. When used with the IGNORE function, this option improves the masking performance.The IGNORE function inserts as well as retrieves data from the cross-reference table, which is why it is row-by-row processing and slow. However, with the FASTIGNORE option, you update the data in one SQL statement, rather than row-by-row (which is very slow).Review the following considerations when using the FASTIGNORE option:
- Ensure that the cross-reference table and the table to be masked are on the same RDBMS.
- Ensure that the cross-reference table and the table to be masked are on the same server.
- Ensure that the cross-reference table must have old and new values pre-populated for the chosen cross-reference identifier.
- Collation of the table to be masked and gtsrc_xref should be the same.
- FETCHSIZENumber of lines to read from a database or file at a time.
- GLOBALLICENSESet the path to your global license key.
- LOWERCASEKEYSpecify a lowercase key for masking; for example,qazwsxedcrfvtgbyhnujmikolp. Ensure that the key does not start with the charactera. You can use this option with the FORMATENCRYPT masking function.For example, if you use this option with the FORMATENCRYPT function, the function starts making the first occurrence of the lowercase character, which the function ignores if this option is not set.
- MD5HASHLOVSet this value to Y to use an MD5 hashing algorithm with the HASHLOV functions. Leave this value blank to use the default Java hashing algorithm.
- NUMERICKEYSpecify a numeric key for masking; ; for example,8524569173. Maximum 15 digits are allowed. If you want to enter more than 15 digits, provide value in quotes; for example,"9182736450514239687".You can use this option with the FORMATENCRYPT, FORMATLUHN, FORMATVIN, and HASHTURKISHID masking functions.
- ORDERBYY (Default)—you might want to turn it off for RMS(VMS). This value decides whether selected data is ordered by PK column or not.
- PROCESSCOUNTProcess count to limit the number of rows processed per table.
- RELAXNONINDEXVALIDATIONXREF on non-varchar columns of no PK/UK tables.
- RESETRESTARTReset restart column (de_ident_ind) to null, masking starts from Row 1.
- RESTARTRestart mask from last fail point. Requires varchar column (de_ident_ind) added to table(s) to be masked, or use existing (empty) column in the tables to be masked. Add this column name to the "restart column" column in your masking CSV.
- TRIMVALUESThis option can accept “Y� for yes and “N� for no. Yes implies you want to remove the leading and trailing spaces from all the columns that you have selected for masking. If you do not specify the value, Fast Data Masker uses the default value N.
- UPPERCASEKEYSpecify an uppercase key for masking; for example,PLOKMIJNUHBYGVTFCRDXESZWAQ. You can use this option with the FORMATENCRYPT masking function.
- USERFAASINDEXUse RFA in the WHERE clause of the update SQL (VMS(RMS) DB only).
- USERRNASINDEXFor non-indexed tables, use RRN function to get row identifier, and then combine with DB2BATCHUPDATE to use fast method (DB2400 only).
- WHEREASSUBSETY—Flat files are scrambled and subsetted according to the WHERE clause.N—Use WHERE as criteria to mask and generate output of all records.Default: Y
- XMLNAMESPACEOne or more tags in XML file or data contain xmlnamespace elements.