Create Data Generation Rules

Use the Generation functionality in the CA TDM Portal to create data generation rules. These rules include data functions that help you generate synthetic data when you perform the data publish operation. At the time of data publishing, the CA TDM Portal evaluates each column and resolves the rule that is applied to each column. The application then populates the column with the value that is generated based on the defined rules.
tdm40
Use the Generation functionality in the CA TDM Portal to create data generation rules. These rules include data functions that help you generate synthetic data when you perform the data publish operation. At the time of data publishing, the CA TDM Portal evaluates each column and resolves the rule that is applied to each column. The application then populates the column with the value that is generated based on the defined rules.
  1. Access the CA TDM Portal.
  2. Select an appropriate project and its corresponding version from the
    Project
    drop-down list in the top blue bar.
    This selection sets the required project and version context for all the related operations that you perform in the Portal.
  3. Click
    Generators
     in the left pane.
    The
    Generators
    page opens.
    Note:
    If the left pane is not visible, click the icon (represented by three horizontal bars) in the top left corner.
  4. Click the data generator that you want to use to publish data.
    The
    <Data Generator Name>
    page opens. This page includes information (for example, associated project and project version) about the data generator.
  5. Click the
    Select Tables
    button to open the
    Registered Tables
    dialog.
    This dialog lists the registered tables (
    used
    and
    unused
    ) based on the project that is associated to the selected generator. A 
    used
     table is a table that includes data (or data generation rules). 
    Used
    tables are represented by a tick mark in the
    Used
    column.
    Note:
    You can also view the table relationships. For more information, see View Table Relationships.
  6. (Optional) Click the
    Relational Edit
    button and select the following actions as required. These actions are applicable for the
    used
    tables in the selected data generator. These actions help you automate the data generation process to some extent:
    • Make All Children References
      Lets the foreign key column in the child table refer the primary key column in the parent table using data generation functions. This option helps you automatically generate data generation rules for the foreign key column in the child table, which otherwise is a manual and error-prone effort. This reference, therefore, enables you to establish a relationship between the parent and the corresponding child tables. This approach also maintains the referential integrity of the data at the time of publishing. 
      For example, consider a scenario where the value in the foreign key column
      purchaseOrder_tdm_root_SHRED_ID
      of the child table
      items
      references the primary key column
      SHRED_ID
      in the parent table
      purchaseOrder_tdm_root
      . When you implement this option, the value in the foreign key column of the child table is replaced with an expression
      ^purchaseOrder_tdm_root.SHRED_ID(1)^
      .  This expression is a data generation function that is generated automatically without any manual effort. And, it establishes a parent-child relationship.
    • Make All Parents default
      Replaces all the primary keys in the parent tables with the valid next sequence using data generation functions. 
      For example,
      SHRED_ID
      is a primary key column that includes some value after the sample data is imported into the table
      purchaseOrder_tdm_root
      . Therefore, when you implement this option, the sample value in this column is automatically replaced with the expression 
      @nextval(SHRED_ID_SEQ)@
      . This data generation function is automatically generated without any manual effort. This function establishes a sequence that generates unique, sequential values at the time of publishing.
    Click
    OK
    , review the summary of the affected tables in the
    Relational Editor Results
    dialog, and click
    OK
    .
    All 
    used
    tables are updated based on the selected options.
  7. Click the row corresponding to the table for which you want to create data generation rules. 
    All the table rows are added to the 
    <Data Generator Name>
     page, including columns. You can perform the following actions as required:
    • To add another table to the 
      <Data Generator Name>
      page, repeat this step.
    • Click the +r (Add Row) icon if no row is available in the table or you want to add another row to the table. To delete a row, click on the row number and then X (Delete Row) icon.
    • To view the constraints applicable to a column belonging to a derived object, move the mouse pointer to the column name. A tool tip appears that shows the constraints. When you move out of a table cell, the Portal validates the value based on the constraints applied to the column. 
    • Click the First Row Display button to view the values of first row of a used table.
      Used table first row details help you understand the columns used in the respective table. You can use this information to select appropriate tables for which you can write necessary data generation rules. You can see the following details of the first row of each table:
      • Column Name
      • Definition
      • Default
      • Global Default
      • Data Type
  8. Click the cells in each table row where you want to add the rule. You can also enter variables in the cell. You can type in the values in each cell, or use the pop-up menu (recommended) that appears when you click a cell. If you type in the values in the cells, the values entered in each row are saved only when all the mandatory cells in that row are filled in. If you choose to work with the pop-up menu, follow these steps:
    • Use the Cut and Paste icons to move the values from one cell to another. 
    • Use Copy icon to copy the values from one cell to another cell/row/column. Following are the available copy options:
      • Copy Row Down
        All Rows - all cells
        Copies the values of the cells in focused row to the corresponding cells in the rows below the current row of the same table.
        All Rows - empty cells
        Copies the values of the cells in the focused row to the corresponding empty cells in the rows below the current row of the same table.
        New Rows - all cells
        Copies the value of the cells in the focused row to the corresponding the cells in the new row added below the current row in the same table.
        New Rows - empty cells
        Copies the value of the cells in the focused row to the corresponding empty cells in the new row added below the current row in the same table.
      • Copy Column Down
        Copy Down All
        Copies the value of the focused cell to all the cells below in the same column in the same table.
        Copy Down Null (Empty)
        Copies the value of the focused cell to the empty cells below in the same column in the same table.
        Copy Down All with Increment
        Copies the value of the focused cell to all the cells below in the same column in the same table with the specified increment value.
        Copy Down Null with Increment
        Copies the value of the focused cell to the empty cells below in the same column in the same table with the specified increment value.
      • Copy Column Right
        Copy Right All
        Copies the value of the focused cell to all the cells on the right side in the same row.
        Copy Right Null (Empty)
        Copies the value of the focused cell to the empty cells on the right side in the same row.
        Copy Right All with Increment
        Copies the value of the focused cell to all the cells on the right side in the same row with the specified increment value. 
        Copy Right Null with Increment
        Copies the value of the focused cell to the empty cells on the right side in the same row with the specified increment value.
      • Copy Column Left 
        Copy Left All
        Copies the value of the focused cell to all the cells on the left side in the same row.
        Copy Left Null (Empty)
        Copies the value of the focused cell to the empty cells on the left side in the same row.
        Copy Left All with Increment
        Copies the value of the focused cell to all the cells on the left side with the specified increment value in the same row.
        Copy Left Null with Increment
        Copies the value of the focused cell to the empty cells on the left side with the specified increment value in the same row.
    • Use the icons F, C and V to insert Functions, Columns and Variables respectively in the cells. 
    • Use the Data Painter icon (brush icon) to open the
      Data Painter: <Table_Name>.<Column_Name>
      dialog and create data generation rules.
  9. Use the
    Data Painter: <Table_Name>.<Column_Name>
     dialog that opens when you click the Data Painter icon to edit and test the data functions. The dialog lets you click on objects in each of the three sections—Functions, Columns, and Variables. These objects transfer to the edit section (left side) where you manipulate them. The details of the three sections are as follows:
    Note:
    There is no character limit for data painter expressions in the CA TDM Portal. However, Datamaker has a limit of 16000 characters. If you expect parallel usage of this data generation rule in Datamaker, adhere to the Datamaker character limit.
    • Functions
      Functions can use hard-coded values, columns, or variables as parameters. Functions can also use other functions as parameters. For example, you can use a function as a result from a Boolean operator in the IF function.
    • Columns
      The Columns list contains any other columns in the table. This list also shows the columns in the other
      used
      tables.
    • Variables
      The Variables section contains a combination of system operators and any substitution variables you have created. For more information about creating and managing variables, see Create and Manage Variables.
    Note:
    The dialog also displays information about the constraints applicable to the columns belonging to derived objects. When you click the
    Validate
    button, the Portal validates the value based on the constraints applied to the column. 
  10. Expand the appropriate section (
    Functions
    ,
    Columns
    ,
    Variables
    ) and click the required object.
    The object is added to the edit section.
    In case of any priorpublishkeylist function, you must replace the
    "ld_id"
    value with the respective generator ID. Select the applicable generator from the auto populated list when you add a priorpublishkeylist function to a column.
    In case of any sqllist function, you must replace the
    "connection"
    value with the respective connection profile. Select the applicable connection profile from the auto populated list when you add a sqllist function to a column. The parameter connection is replaced with the name of connection profile adding the character "P" as a prefix.In case of any seedlist function, you must replace the
    "seedname"
    value with the respective seed data type. Select the applicable seed data type from the auto populated list when you add a seedlist function to a column. For more information, see Data Generation Functions and Parameters.
    Note:
    You can also use the search field at the top of the
    Functions
    section to find an object.
  11. Edit the rule and click
    Validate
    to verify that the data generation rule is working correctly. 
    If the validation is unsuccessful, a proper error message is displayed. For more information about Functions, Parameters, and Return Values see Data Generation Functions and Parameters.
  12. Click the
    Insert
    button to insert the validated rule into the cell.
  13. Follow the same steps for other cells in the table.
You have successfully added data generation rules to the table columns. After you add data generation rules to a table, the table becomes a
used
table. You can verify this by reviewing the presence of a tick mark in the
Used
column. You can now publish the data into the target database schema.
Example: Create Data Generation Rules for the Employee Table
In this example, you create data generation rules for the Employee table so that you can generate and insert synthetic data into the table cells. This synthetic data does not include real data; it includes random data that is as close to the real data as possible. This random data is generated based on the data generation rules that you specify.
The Employee table includes the following columns:
  • employee_id
  • first_name
  • last_name
  • email
  • phone
  • birth_date
  • salary
You add data generation rules to the cells in the Employee table as follows:
Note:
For more information about specific data functions and their usage, see Data Generation Functions and Parameters.
employee_id
  1. Select the
    NEXT [Next value for column]
    variable from the
    Variables
    list. This variable adds the next value in the sequence to this cell whenever you publish the table. 
    The variable is added as follows to the edit section in the left: 
    ~NEXT~
  2. Click 
    Validate
    .
    A random number is generated and is displayed in the field next to the 
    Validate
     button.
  3. Click 
    Insert
    .
    The variable is added to the 
    employee_id
     cell in the table.
first_name
  1. Select the
    randlov(percnull,@seedlist(seedname)@)
    function from the
    Functions, List of Values
    list. The data function is added as follows: 
    @randlov(percnull,@seedlist(seedname)@)@
  2. Search for the
    seedname
    value (
    FirstName
    in this case) in the pop-up menu and click the seedlist name to add it to the expression. 
    The data function is updated as follows: 
    @randlov(percnull,@seedlist(FirstName)@)@
  3. Enter
    0
    as a value for percnull. This function allows you to identify the percentage of rows that are null. Selecting 20 means that 20 percent of the values are designated null. 
    The data function is updated as follows: 
    @randlov(0,@seedlist(FirstName)@)@
  4. Click
    Validate
    .
    A random first name is generated based on the provided parameters and is displayed in the field next to the
    Validate
    button.
  5. Click
    Insert
    .
    The data function is added to the
    first_name
    cell in the table.
last_name
  • Follow the same steps as for the first_name column. The only difference is that you must select the seedlist name as
    LastName
    .
email
Note:
The format of the email is.
  1. Select the
    first_name
    column from the
    Column, employee
    list.
    The data function is added as follows: 
    ^first_name^
  2. Add a dot (.) after the column name and select the
    last_name
    column from the
    Column, employee
     list. 
    The data function is updated as follows: 
    ^first_name^.^last_name^
  3. Select the
    atsign()
    function from the
    Functions, String
    list. 
    The data function is updated as follows: 
    ^first_name^.^last_name^@atsign()@
  4. Add
    xyz.com
    after the last
    @
    symbol. 
    The data function is updated as follows: 
    ^first_name^.^last_name^@atsign()@ca.com
  5. Click
    Validate
    .
    A random email ID is generated based on the provided parameters and is displayed in field next to the 
    Validate
     button.
  6. Click
    Insert
    .
    The data function is added to the 
    email
     cell in the table.
phone
  1. Select the
    randlov(percnull,@seedlist(seedname)@)
     function from the
    Functions, List of Values
    list. 
    The data function is added as follows: 
    @randlov(percnull,@seedlist(seedname)@)@
  2. Search for the
    seedname
    value (
    US Phone no
    in this case) in the pop-up menu and click the seedlist name to add it to the expression.
    The data function is updated as follows: 
    @randlov(percnull,@seedlist(US Phone no)@)@
  3. Enter
    0
    as a value for percnull. 
    The data function is updated as follows: 
    @randlov(0,@seedlist(US Phone no)@)@
  4. Click 
    Validate
    .
    A random phone number is generated based on the provided parameters and is displayed in the field next to the 
    Validate
     button.
  5. Click 
    Insert
    .
    The data function is added to the
    phone
     cell in the table.
birth_date
  1. Select the
    addranddays(date,min,max)
    function from the
    Functions, Date &/or Time
    list.
    The data function is added as follows: 
    @addranddays(date,min,max)@
  2. For the
    date
    parameter, select
    SDATE [System Date]
    from the
    Variables, System Variables
    list. 
    The data function is updated as follows: 
    @addranddays(~SDATE~,min,max)@
  3. For the
    min
    parameter, enter -20000 as the minimum value and for the
    max
    parameter, enter -8000 as the maximum value. 
    The data function is updated as follows: 
    @addranddays(~SDATE~,-20000,-8000)@
  4. Click 
    Validate
    .
    A random date is generated based on the specified parameters and is displayed in field next to the 
    Validate
     button.
  5. Click 
    Insert
    .
    The data function is added to the
    birth_date 
    cell in the table.
salary
  1. Select the
     addrand(number,min,max)
    function from the
    Functions, Numeric
    list.
    The data function is added as follows: 
    @addrand(number,min,max)@
  2. Enter 100, 1000, and 100000 as values for the number, min, and max parameters respectively.
    The data function is updated as follows: 
    @addrand(100,1000,100000)@
  3. Click 
    Validate
    .
    A random value is generated based on the specified parameters and is displayed in field next to the 
    Validate
     button.
  4. Click 
    Insert
    .
    The data function is added to the 
    salary 
    cell in the table.
You have successfully generated data creation rules for the Employee table columns. Now, when you publish the data for this table, the data is generated based on the defined rules and then the generated synthetic data is inserted into the table.