End-to-End Scenario for PII Audit

The PII Audit End-to-End Scenario outlines the step-by-step process for a Test Data Engineer (TDE) to identify any Personally Identifiable Information (PII) data across multiple data sources in an environment.
This video describes how a Test Data Engineer (TDE) uses the CA TDM Portal to scan Connection Profiles for PII data against one or more Classifier Packs, confirm the findings, and create a draft report to be signed off. An Internal Data Controller reviews the findings in the draft report and signs off. An Internal Auditor can download and review the final Audit Report, and a Management User or an External Auditor can request the Audit Report from the TDE.
The basic flow for PII Audit is as follows:
Data Profiling Detailed Architecture
Overview of PII Audit process
To run a PII Audit on your data, you must execute the following steps:
  1. Select the required Connection Profile or Environment.
    For more information about Connection Profiles and Environments, see Prepare the Environment for PII Data Scan.
  2. Select the Classifier Packs against which the PII data is matched.
    For more information about Classifier Packs, see Manage Data Classifiers.
  3. Select the required Scan Level.
  4. Select the Matched Samples option if you want to view matched samples for each column in a table.
  5. Add known Connection Profiles, Schemas, and Tables to Include or Exclude in the PII data scan.
Scan Level
The Scan Level ranges from Basic to All, based on the percentage of data you want to scan in your environment. For example, if you set the Scan Level to 10%, CA TDM performs a PII data scan on only 10% of your environment, with a minimum of 10 rows per table.
  • Basic: Performs a PII data scan on 10 samples of data for each column in a table of your environment.
  • All: Performs a PII data scan on all columns and rows for all tables in the selected environment.
    Running a scan on an entire Data Source may take a long time since every record is read.
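To make the Scan Level arithmetic concrete, the following is a minimal sketch of how the number of rows read from one table could be estimated. It is not a CA TDM API: the function name rows_to_scan, the round-up behavior, and the cap at the table size are assumptions for illustration. The optional max_rows argument mirrors the Max Rows field from the scan set-up procedure, which only takes effect when it is lower than the slider-based value.

```python
import math
from typing import Optional

MIN_ROWS_PER_TABLE = 10  # minimum stated in the text for percentage-based scans


def rows_to_scan(total_rows: int, percent: float, max_rows: Optional[int] = None) -> int:
    """Estimate how many rows a percentage-based scan reads from one table.

    Hypothetical helper for illustration only; not part of CA TDM.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    # Percentage of the table, rounded up (the rounding rule is an assumption).
    by_percent = math.ceil(total_rows * percent / 100)
    # Apply the 10-row minimum, but never more rows than the table holds (assumption).
    rows = min(total_rows, max(by_percent, MIN_ROWS_PER_TABLE))
    # Max Rows only overrides the slider result when it is lower.
    if max_rows is not None and max_rows < rows:
        rows = max_rows
    return rows


print(rows_to_scan(1000, 10))               # 10% of 1000 rows
print(rows_to_scan(50, 10))                 # the 10-row minimum applies
print(rows_to_scan(1000, 10, max_rows=25))  # Max Rows is lower, so it wins
```

Under these assumptions, a 10% scan of a 50-row table still reads 10 rows, because the minimum applies before any Max Rows limit.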
Matched Samples
When you select Store Matched Samples, CA TDM collects ten samples of data for each column in a table and stores them in the repository until the Internal Data Controller signs off the report. The collected data is deleted after the Internal Data Controller signs off, and no record of the data is preserved in CA TDM. The tables in the Heat Map include sample data for all columns that are identified as PII data.
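The sample lifecycle described above (keep up to ten samples per column, then delete everything at sign-off) can be pictured with a small in-memory model. This is only a sketch: the class MatchedSampleStore and its methods are hypothetical names, and the real repository storage is not shown in this guide.

```python
from collections import defaultdict

SAMPLES_PER_COLUMN = 10  # the text states ten samples per column


class MatchedSampleStore:
    """Hypothetical in-memory stand-in for the repository's sample storage."""

    def __init__(self):
        # Maps (table, column) to the list of stored sample values.
        self._samples = defaultdict(list)

    def record(self, table, column, value):
        """Keep a matched value unless the column already holds ten samples."""
        bucket = self._samples[(table, column)]
        if len(bucket) < SAMPLES_PER_COLUMN:
            bucket.append(value)
            return True
        return False

    def samples(self, table, column):
        """Return a copy of the samples stored for one column."""
        return list(self._samples.get((table, column), []))

    def sign_off(self):
        """Delete every stored sample once the report is signed off."""
        self._samples.clear()


store = MatchedSampleStore()
for i in range(25):
    store.record("CUSTOMERS", "EMAIL", f"user{i}@example.com")
print(len(store.samples("CUSTOMERS", "EMAIL")))  # 10
store.sign_off()
print(store.samples("CUSTOMERS", "EMAIL"))       # []
```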
Include or Exclude Connection Profiles, Schemas, and Tables
You can apply a filter to include or exclude Connection Profiles, Schemas, and Tables to reduce the size of your scan.
You can use the basic wildcard characters * (matches one or more characters) and ? (matches a single character) in the search terms to include or exclude any matching connection profile, schema, and table from the scan.
For example, when you enter *sys in the tables to be excluded, the scan excludes all tables that end in sys.
You can either include a connection profile, schema, or table in the scan or exclude it from the scan, but not both.
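The wildcard rules can be illustrated with a short sketch that translates a filter term into a regular expression. The helper names are hypothetical and this is not how CA TDM necessarily implements matching. The sketch follows the definitions in the text (* as one or more characters, ? as exactly one character); case-insensitive matching is an assumption here, and if the product treats * as zero or more characters, ".+" would become ".*".

```python
import re


def wildcard_to_regex(pattern):
    """Translate a filter term with * and ? into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".+")  # one or more characters, as the text defines it
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character matches literally
    # Case-insensitive matching is an assumption for this sketch.
    return re.compile("".join(parts), re.IGNORECASE)


def is_match(name, pattern):
    """Return True when the whole name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(name) is not None


def tables_to_scan(tables, exclude_patterns):
    """Drop every table that matches at least one exclude pattern."""
    return [t for t in tables if not any(is_match(t, p) for p in exclude_patterns)]


tables = ["ORDERS", "AUDIT_SYS", "SYSTEM_LOG", "CUSTOMER"]
print(tables_to_scan(tables, ["*sys"]))  # AUDIT_SYS is excluded; SYSTEM_LOG is kept
```

Matching against the whole name is what makes *sys mean "ends in sys" rather than "contains sys".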
Perform a PII Audit
Use CA TDM to run a PII Scan as part of the PII Audit process.
Follow these steps:
  1. Open the CA TDM Portal as administrator for the Project.
  2. Select the required Project in the header bar.
  3. Click PII Audit.
    The PII Audit section of the UI expands.
  4. Click Get Started, or click Set-up under the PII Audit section.
    The PII Data Scan Set-up page opens.
  5. Select whether you want to run a PII Scan on:
    • An Environment. Select from created Environments.
      For more information about how to create an environment, see Create an Environment.
    • One or more Connection Profiles. Check all the Connection Profiles you want to scan.
      For more information about how to create a connection profile, see Create and edit Connection Profiles.
  6. Select one or more Classifier Packs against which the PII data is matched and click Next to confirm your selection.
    For more information about how to create and import classifiers, see Manage Data Classifiers.
  7. Choose how much data to scan. You can choose to either:
    • Scan column names only
      This only scans the names of columns in your environment. This scan uses only classifiers of the type 'column' (see Manage Data Classifiers).
    • Scan column names and data
      This scans the names and contents of columns in your environment. You have two options to set how many rows to scan from each column:
      1. Drag the slider to set the percentage of a column's rows to scan.
      2. (Optional) Set the maximum number of rows to scan for each column with the Max Rows field.
         This maximum only overrides the value from the slider if it is lower than the number of rows that the slider setting yields.
  8. (Optional) Select Store Matched Samples to store the first 10 samples that triggered each Classifier.
    Data for the samples is copied from the Data Source into the CA TDM repository and deleted after the Internal Data Controller signs off.
    Click Next.
  9. Under Include/Exclude Tables, select one of the following:
    • Scan All Tables
      The scan executes on all tables in the data sources.
    • Include / Exclude
      The 'Add Filters' section appears below.
  10. Under 'Add Filters', you can do the following:
    • Click New Filter to add a filter.
    • In the fields for each filter, enter the appropriate Connection Profile, Schema, and a comma-separated list of Table names.
  11. Click Next to confirm your selection.
    The PII Data Scan Execution page opens. This page lists your choices from the PII Audit process.
  12. If you are satisfied with the details of the Scan to be run, select one of the following Schedule options:
    • Now
      When you click Profile, the PII Scan begins.
    • Schedule
      When you click Profile, CA TDM schedules the PII Scan to run at the time you specify.
  13. Click Profile to begin or schedule the scan.
    CA TDM creates a new job under the Job Requests tab.
Job Requests
You can review all PII Audit jobs, including those with the statuses Running, Not Started (scheduled for future start), and Complete.
  1. Click the Job Requests tab under the PII Audit section of the left-hand panel.
  2. Click the relevant Job request row to view Additional Information about that job. This shows the State of the job, the Duration of the job, the Scan Level, the number of tables and columns Scanned, and the number of tables and columns Classified as PII data.
    If the job is complete, you can click Ready for Review to go directly to the Heat Map view for that job.
Review Scan Results
You can view results of completed scan jobs. Click Ready for Review under the PII Audit section of the left-hand panel.
When CA TDM completes a scan job, the scan job appears in the table on this page. Click on the job's row in the table to see results in detail.
Heat Map view
The Heat Map provides an instant graphical view to identify the total potential risk from PII data that exists within the scanned environment. The top menu bar lists the total number of PII data found within the scanned environment and the number of tables that are marked as confirmed. Each square in the Heat Map represents a table in the Data Source. You can zoom into a specific section of the Heat Map to better view the table details.
You can filter tables in the following two ways:
  • Filter Search Tab: Use the Filter search tab to search for a table, column, tag, profile, or schema in the Heat Map.
  • Risk Slider: Use the Risk slider to filter tables based on their Risk category, according to the PII Scan.
Use the Filter search tab to filter based on a search term
Type in the search term to view a drop-down list with all matches for tables, columns, tags, connection profiles, and schemas. Click a search result to narrow the result set and redraw the Heat Map. You can also view the filters that are applied on the Heat Map. For example, type CREDIT to search for all tables that begin with CREDIT; all matches are displayed in the drop-down list. By default, the first match for any type with matches is the search criteria itself. Click any of the matches within the drop-down list to activate a filter and redraw the Heat Map.
You can use the basic wildcard characters * (matches one or more characters) and ? (matches a single character) in the search terms. All active filter types are 'ANDed' together and matched against all of the remaining types. For example, to search for a string that contains 'customer' within the schemas containing 'account' or 'accounts', enter the search term '*account*' and select the matching schema from the drop-down list. Next, enter the search term '*customer', which shows results for all matches in the remaining tables, columns, tags, and profiles. Matching is case-insensitive, so you can get results such as the following:
tables: 'LEGACY_ACCOUNT', 'Account'
columns: 'ACCOUNT_ID', 'ACCOUNT_CUSTOMER', 'Active_Account'
You can also filter tables based on their size and re-draw the Heat Map. The table size range is between small and extra large.
Use the Risk slider to filter tables based on their Risk category
Drag the edges of the slider over the Risk categories to adjust your selection and redraw the Heat Map for a more specific view of the potential PII data that are identified in the scanned environment. Depending on the number of distinct tags that are identified in a table, the tables are filtered and positioned in the Heat Map as follows:
Distinct tags    Risk Level
15+              Very High
10 - 14          High
5 - 9            Medium
1 - 4            Low
0                Very Low
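The thresholds in the table above translate directly into a lookup. The following sketch shows that mapping and how a slider selection could narrow a set of tables; the function names and the sample table names are hypothetical and are not part of CA TDM.

```python
RISK_ORDER = ["Very Low", "Low", "Medium", "High", "Very High"]


def risk_level(distinct_tags):
    """Map the number of distinct tags found in a table to its Risk Level."""
    if distinct_tags < 0:
        raise ValueError("distinct_tags cannot be negative")
    if distinct_tags >= 15:
        return "Very High"
    if distinct_tags >= 10:
        return "High"
    if distinct_tags >= 5:
        return "Medium"
    if distinct_tags >= 1:
        return "Low"
    return "Very Low"


def filter_by_risk(tag_counts, lowest, highest):
    """Keep tables whose Risk Level lies between the two slider edges (inclusive)."""
    lo, hi = RISK_ORDER.index(lowest), RISK_ORDER.index(highest)
    return [
        table
        for table, count in tag_counts.items()
        if lo <= RISK_ORDER.index(risk_level(count)) <= hi
    ]


# Hypothetical scan result: table name mapped to its number of distinct tags.
tag_counts = {"CUSTOMERS": 16, "ORDERS": 6, "AUDIT_LOG": 0, "PAYMENTS": 11}
print(filter_by_risk(tag_counts, "High", "Very High"))  # ['CUSTOMERS', 'PAYMENTS']
```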
Manually Review Data within Tables
You can review each table in the Heat Map and further investigate if the data identified as PII is correct.
Follow these steps:
  1. Select a table in the Heat Map.
    Hover your mouse over a table to view a summary of the table details and the tags that were identified as PII data. You can zoom into the Heat Map to view table details.
  2. You can perform one of the following actions on this table:
    • Click Confirm if the tags identified in a table are correct.
    • Click Not PII if a table does not contain PII data.
    • Click Investigate to see a list view that includes details of columns, tags, and the sample data matched for each column that was identified as PII data.
      From this view, you can do the following:
      • Click View Random Row to view a random row from the selected table to get a better understanding of the data available in that table.
      • Click tags that you confirm are appropriate to the column to 'pin' them. To unpin a tag, click the tag again.
      • Click the plus icon to add tags for columns that should be identified as PII data.
        When you type in the Tag Name field, a drop-down list of available tags appears. If you add your own custom tag (that is, one not from the drop-down list), your custom tag is available from the drop-down list the next time you add a new tag.
        The tags that the Audit Scan automatically assigns already have associated masking functions. User-defined tags do not have associated masking functions until you define them from Manage Data Classifiers.
      • Click the X icon associated with each tag to remove the tag from the column. You can click Remove Unpinned Tags to remove all tags that are not pinned from all columns.
        You must provide a reason when you manually add or remove tags from columns. The 'Reason' field automatically populates with the last input value.
      • Click Confirm And Review Next Table to automatically review the next table, or click Confirm And Close to manually select the next table you want to review.
        A tick mark is added to the reviewed and confirmed tables in the Heat Map.
To better understand the Profiling scan details in a Heat Map, you can download all details of a Heat Map into a CSV file. The CSV file includes details such as the Job ID, the Job Name, when the scan was initiated, the Connection Profile or Environment name that was scanned, and all the Heat Map details for matched tags and where they were found.
To download all details of a Heat Map in a CSV file, click Actions and select Download as CSV.
Create and Sign Off Report
Depending on the user persona, a scan report summarizes all the scan details, for example, the Job ID, the time when the scan was initiated, the time when the scan was completed, the environment that was scanned, the Classifier Packs that were used for the scan, the tags that were identified during the scan process, and so on.
As a TDE, after you review and confirm all the tables, perform the following steps to create a scan report and request sign-off:
  1. Log in to the CA TDM Portal and navigate to PII Audit, then Ready for Review.
  2. Click Create Report.
    The Submit for Sign-Off page appears.
  3. Click Download/View Report to download and review the report.
  4. Click Submit Report For Sign-Off.
    An email notification is sent to all the Internal Data Controllers.
As an Internal Data Controller, perform the following actions when you receive an email notification to review a scan report:
  1. Click the URL provided in the email and log in to the CA TDM Portal.
    The PII Audit Sign-Off page appears.
  2. Click Download/View Report to download and review the report.
  3. After reviewing the scan report, click Sign-Off.
    The Confirm Sign-Off dialog appears.
  4. Enter your comments and click Sign-Off.
    The PII Audit Reports page displays a list of PII Audit reports that you have signed off or that are pending approval.
When all the Internal Data Controllers sign off the scan report, an Internal Auditor can view the signed-off report. As an Internal Auditor, perform the following actions to view this report:
  1. Log in to the CA TDM Portal and navigate to PII Audit, then Reports.
  2. Click Audit Report to download the final signed-off scan report.
To view the final signed-off report, a Management User or an External Auditor requests the Audit Report from a TDE.
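The sign-off rule in this section (the report becomes final, and visible to an Internal Auditor, only after every Internal Data Controller has signed off) can be modeled in a few lines. This is a sketch under stated assumptions: the AuditReport class, its fields, and the controller names are hypothetical and do not reflect the CA TDM data model.

```python
from dataclasses import dataclass, field


@dataclass
class AuditReport:
    """Hypothetical model of a draft report that is awaiting sign-off."""

    controllers: frozenset  # names of all Internal Data Controllers
    signed_off: dict = field(default_factory=dict)  # controller name -> comment

    def sign_off(self, controller, comment):
        """Record one controller's sign-off together with the entered comment."""
        if controller not in self.controllers:
            raise PermissionError(f"{controller} is not an Internal Data Controller")
        self.signed_off[controller] = comment

    @property
    def is_final(self):
        """True once every Internal Data Controller has signed off."""
        return bool(self.controllers) and set(self.signed_off) == set(self.controllers)


report = AuditReport(controllers=frozenset({"alice", "bob"}))
report.sign_off("alice", "Findings reviewed")
print(report.is_final)  # False: bob has not signed off yet
report.sign_off("bob", "Approved")
print(report.is_final)  # True: the report can now be treated as final
```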