End-to-End Scenario for PII Audit
The PII Audit End-to-End Scenario outlines the step-by-step process for a Test Data Engineer (TDE) to identify any Personally Identifiable Information (PII) data across multiple data sources in an environment.
This video describes how a Test Data Engineer (TDE) uses the CA TDM Portal to scan Connection Profiles for PII data against one or more Classifier Packs, confirm the findings, and create a draft report to be signed off. An Internal Data Controller reviews the findings in the draft report and signs off. An Internal Auditor can download and review the final Audit Report, and a Management User or an External Auditor can request the Audit Report from the TDE.
The basic flow for PII Audit is as follows:
Overview of PII Audit process
To run a PII Audit on your data, you must execute the following steps:
- Select the required Connection Profile or Environment. For more information about Connection Profiles and Environments, see Prepare the Environment for PII Data Scan.
- Select the Classifier Packs against which the PII data is matched. For more information about Classifier Packs, see Manage Data Classifiers.
- Select the required Scan Level.
- Select the Matched Samples option if you want to view matched samples for each column in a table.
- Add known Connection Profiles, Schemas, and Tables to Include or Exclude in the PII data scan.
The Scan Level ranges from Basic to All based on the percentage of data you want to scan in your environment. For example, if you set the Scan Level to 10%, CA TDM performs a PII data scan on only 10% of your environment, with a minimum of 10 rows per table.
- Basic: Performs a PII data scan on 10 samples of data for each column in a table of your environment.
- All: Performs a PII data scan on all columns and rows for all tables in the selected environment. Running a scan on an entire Data Source may take a long time because every record is read.
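To make the percentage-based Scan Level concrete, the following sketch estimates how many rows of a single table a scan would read. It is an illustration only, not CA TDM code: the function name `rows_to_scan`, the rounding-up behavior, and the capping at the table's own row count are assumptions. Only the 10-row minimum comes from the description above.

```python
import math

MIN_ROWS_PER_TABLE = 10  # from the text: a minimum of 10 rows per table


def rows_to_scan(total_rows: int, percent: float) -> int:
    """Estimate the rows read from one table for a percentage Scan Level.

    Hypothetical helper; rounding up is an assumption, not documented behavior.
    """
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    by_percent = math.ceil(total_rows * percent / 100)
    # Apply the 10-row floor, but never exceed what the table actually holds.
    return min(total_rows, max(MIN_ROWS_PER_TABLE, by_percent))
```

Under these assumptions, a 1,000-row table at 10% yields 100 rows, while a 50-row table at 10% is raised to the 10-row minimum.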
When you select Store Matched Samples, CA TDM collects ten samples of data for each column in a table and stores them in the repository until the Internal Data Controller signs off the report. The collected data is deleted after the Internal Data Controller signs off, and no record of the data is preserved in CA TDM. The tables in the Heat Map include sample data for all columns that are identified as PII data.
Include or Exclude Connection Profiles, Schemas, and Tables
You can apply a filter to include or exclude Connection Profiles, Schemas, and Tables to reduce the size of your scan.
You can use the basic wildcard characters * (matches one or more characters) and ? (matches a single character) in the search terms to include or exclude any matching connection profile, schema, or table from the scan.
For example, when you enter *sys in the tables to be excluded, the scan excludes all tables that end in sys.
You can either include a connection profile, schema, or table in the scan or exclude it from the scan, but not both.
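The wildcard rules above can be sketched as a small matcher. This is a hedged illustration, not the product's implementation: the names `wildcard_to_regex` and `apply_table_filter` are hypothetical, and case-insensitive matching is an assumption here. The `*` wildcard is translated as "one or more characters" to follow the wording in this section.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a * / ? search term into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".+")  # one or more characters, as described above
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    # Case-insensitive matching is an assumption made for this sketch.
    return re.compile("".join(parts), re.IGNORECASE)


def apply_table_filter(tables, patterns, mode):
    """Keep (mode='include') or drop (mode='exclude') tables matching any pattern."""
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude', not both")
    regexes = [wildcard_to_regex(p) for p in patterns]

    def matches(name):
        return any(r.fullmatch(name) for r in regexes)

    if mode == "include":
        return [t for t in tables if matches(t)]
    return [t for t in tables if not matches(t)]
```

For example, `apply_table_filter(["audit_sys", "customers"], ["*sys"], "exclude")` drops `audit_sys` and keeps `customers`. The single `mode` argument mirrors the rule that an item is either included or excluded, never both.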
Perform a PII Audit
Use CA TDM to run a PII Scan as part of the PII Audit process.
Follow these steps:
- Open the CA TDM Portal as administrator for the Project.
- Select the required Project in the header bar.
- Click PII Audit. The PII Audit section of the UI expands.
- Click Get Started, or click Set-up under the PII Audit section. The PII Data Scan Set-up page opens.
- Select whether you want to run a PII Scan on:
- An Environment. Select from created Environments. For more information about how to create an environment, see Create an Environment.
- One or more Connection Profiles. Check all the Connection Profiles you want to scan. For more information about how to create a connection profile, see Create and edit Connection Profiles.
- Select one or more Classifier Packs against which the PII data is matched and click Next to confirm your selection. For more information about how to create and import classifiers, see Manage Data Classifiers.
- Choose how much data to scan. You can choose to either:
- Scan column names only: This scans only the names of columns in your environment. This scan uses only classifiers of the type 'column' (see Manage Data Classifiers).
- Scan column names and data: This scans the names and contents of columns in your environment. You have two options to set how many rows to scan from each column:
- Drag the slider to set the percentage of a column's rows to scan.
- (Optional) Set the maximum number of rows to scan for each column with the Max Rows field. This maximum overrides the value from the slider only if it is lower than the number of rows that the slider setting yields.
- (Optional) Select Store Matched Samples to store the first 10 samples that triggered each Classifier. Data for the samples is copied from the Data Source into the CA TDM repository and deleted after the Internal Data Controller signs off. Click Next.
- Under Include/Exclude Tables, select one of the following:
- Scan All Tables: The scan runs on all tables in the data sources.
- Include / Exclude: The 'Add Filters' section appears below.
- Under 'Add Filters', you can do the following:
- Click New Filter to add a filter.
- In the fields for each filter, enter the appropriate Connection Profile, Schema, and a comma-separated list of Table names.
- Click Next to confirm your selection. The PII Data Scan Execution page opens. This page lists your choices from the PII Audit process.
- If you are happy with the details of the Scan to be run, select one of the following Schedule options:
- Now: When you click Profile, the PII Scan begins.
- Schedule: When you click Profile, CA TDM schedules the PII Scan to run at the time you specify.
- Click Profile to begin or schedule the scan. CA TDM creates a new job under the Job Requests tab.
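The interaction between the percentage slider and the Max Rows field in the "Choose how much data to scan" step can be sketched as follows. This is a minimal illustration under stated assumptions: `effective_row_limit` is a hypothetical name, and `slider_rows` stands for the number of rows that the slider percentage works out to for a given column.

```python
from typing import Optional


def effective_row_limit(slider_rows: int, max_rows: Optional[int] = None) -> int:
    """Return the rows scanned per column once Max Rows is taken into account.

    Hypothetical helper: Max Rows only applies when it is lower than the
    row count derived from the slider; otherwise the slider value stands.
    """
    if slider_rows < 0 or (max_rows is not None and max_rows < 0):
        raise ValueError("row counts must not be negative")
    if max_rows is None:  # the optional field was left empty
        return slider_rows
    return min(slider_rows, max_rows)
```

For instance, a slider value that yields 500 rows combined with Max Rows set to 200 scans 200 rows, whereas Max Rows set to 1,000 leaves the 500-row slider value in effect.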
You can review all PII Audit jobs, including those with the statuses Running, Not Started (scheduled for future start), and Complete.
- Click the Job Requests tab under the PII Audit section of the left-hand panel.
- Click the relevant Job request row to view Additional Information about that job. This shows the State of the job, the Duration of the job, the Scan Level, the number of tables and columns Scanned, and the number of tables and columns Classified as PII data. If the job is complete, you can click Ready for Review to go directly to the Heat Map view for that job.
Review Scan Results
You can view results of completed scan jobs. Click Ready for Review under the PII Audit section of the left-hand panel.
When CA TDM completes a scan job, the scan job appears in the table on this page. Click on the job's row in the table to see results in detail.
Heat Map view
The Heat Map provides an instant graphical view to identify the total potential risk from PII data that exists within the scanned environment. The top menu bar lists the total number of PII data found within the scanned environment and the number of tables that are marked as confirmed. Each square in the Heat Map represents a table in the Data Source. You can zoom into a specific section of the Heat map to better view the table details.
You can filter tables in the following two ways:
Use the Filter search tab to filter based on a search term
Type in the search term to view a drop down with all matches for tables, columns, tags, connection profiles, and schemas. Click a search result to view a smaller result set and redraw the Heat Map. You can also view the filters that are applied on the Heat Map. For example, type in CREDIT to search for all tables that begin with CREDIT and all matches are displayed in the drop down. By default the first match for any type with matches will be the search criteria itself. Click on any of the matches within the drop down to activate a filter and redraw the Heat Map.
You can use the basic wild card characters such as * (used to match one or more characters) and ? (used to match a single character) in the search terms. All active filter types are 'ANDed' together and matched against all of the remaining types. For example, to search for a string that contains 'customer' within the schemas containing 'account' or 'accounts', enter the search term '*account*' and select the matching schema from the drop down. Next enter the search term '*customer' which will show results for all matches in the remaining tables, columns, tags and profiles. Matching is case insensitive, so you can get results as follows:
tables: 'LEGACY_ACCOUNT', 'Account', columns: 'ACCOUNT_ID', 'ACCOUNT_CUSTOMER', 'Active_Account'
You can also filter tables based on their size and re-draw the Heat Map. The table size range is between small and extra large.
Use the Risk slider to filter tables based on their Risk category
Drag the edges of the slider over the Risk categories to adjust your selection and redraw the Heat Map for a more specific view of the potential PII data that are identified in the scanned environment. Depending on the number of distinct tags that are identified in a table, the tables are filtered and positioned in the Heat Map as follows:
10 - 14
5 - 9
1 - 4
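The banding by distinct tag count can be sketched as a simple lookup. This is an illustration only: the function name `risk_band` is hypothetical, and because the Risk category names are not given in the ranges above, the sketch returns the range itself as a label. Tag counts outside the listed ranges return `None` rather than a guessed category.

```python
from typing import Iterable, Optional

# Ranges of distinct tags per table, as listed above (highest first).
RISK_BANDS = [(10, 14), (5, 9), (1, 4)]


def risk_band(tags: Iterable[str]) -> Optional[str]:
    """Return the range that a table's distinct tag count falls into."""
    distinct = len(set(tags))  # repeated tags on several columns count once
    for low, high in RISK_BANDS:
        if low <= distinct <= high:
            return f"{low} - {high}"
    return None  # not covered by the ranges listed in this section
```

A table tagged `EMAIL`, `EMAIL`, and `SSN` has two distinct tags and therefore falls into the `1 - 4` range.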
Manually Review Data within Tables
You can review each table in the Heat Map and further investigate if the data identified as PII is correct.
Follow these steps:
- Select a table in the Heat Map. Hover your mouse over a table to view a summary of the table details and the tags that were identified as PII data. You can zoom into the Heat Map to view table details.
- You can perform one of the following actions on this table:
- Click Confirm if the tags identified in a table are correct.
- Click Not PII if a table does not contain PII data.
- Click Investigate to see a list view that includes details of columns, tags, and the sample data matched for each column that was identified as PII data. From this view, you can do the following:
- Click View Random Row to view a random row from the selected table to get a better understanding of the data available in the selected table.
- Click the tags that you confirm are appropriate to the column to 'pin' them. To unpin a tag, click the tag again.
- Click the plus icon to add tags for columns that should be identified as PII data. When you type in the Tag Name field, a drop-down list of available tags appears. If you add your own custom tag (that is, a tag that is not in the drop-down list), your custom tag is available from the drop-down list the next time you add a tag. The tags that the Audit Scan automatically assigns already have associated masking functions. User-defined tags do not have associated masking functions until you define them from Manage Data Classifiers.
- Click the X icon associated with each tag to remove the tag from the column. You can click Remove Unpinned Tags to remove all tags that are not pinned from all columns. You must provide a reason when you manually add or remove tags from columns. The 'Reason' field automatically populates with the last input value.
- Click Confirm And Review Next Table to automatically review the next table, or click Confirm And Close to manually select the next table you want to review. A tick mark is added to the reviewed and confirmed tables in the Heat Map.
To better understand the Profiling scan details in a Heat Map, you can download all details of a Heat Map into a CSV file. The CSV file includes details such as Job ID, Job Name, when the scan was initiated, Connection Profile or Environment name that was scanned, all the Heat Map details for matched tags and where they were found.
To download all details of a Heat Map in a CSV file, click Download as CSV.
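Once downloaded, the CSV file can be summarized with a short script, for example to list the matched tags per table. The sketch below is an assumption-laden illustration: the column headers `Table` and `Tag`, the function name `tags_by_table`, and the sample rows are all invented for the example, so check the header row of your own export and adjust the column names accordingly.

```python
import csv
import io
from collections import defaultdict


def tags_by_table(csv_text: str, table_col: str = "Table", tag_col: str = "Tag") -> dict:
    """Group matched tags by table name from Heat Map CSV text.

    The default column names are hypothetical, not the documented export format.
    """
    grouped = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[table_col]].add(row[tag_col])
    return {table: sorted(tags) for table, tags in grouped.items()}


# Made-up rows that only illustrate the shape of the input.
SAMPLE = (
    "Job ID,Table,Column,Tag\n"
    "1,CUSTOMER,EMAIL_ADDR,EMAIL\n"
    "1,CUSTOMER,SSN,SSN\n"
    "1,ORDERS,CONTACT,EMAIL\n"
)

summary = tags_by_table(SAMPLE)
```

With a real export, read the file contents with `open(path, newline="")` and pass the text to the same function.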
Create and Sign Off Report
Depending on the user persona, a scan report summarizes the scan details, for example, the Job ID, the time when the scan was initiated, the time when the scan was completed, the environment that was scanned, the Classifier Packs that were used for the scan, and the tags that were identified during the scan process.
As a TDE, after you review and confirm all the tables, perform the following steps to create a scan report and request sign-off:
- Log in to the CA TDM Portal and navigate to PII Audit, Ready for Review.
- Click Create Report. The Submit for Sign-Off page appears.
- Click Download/View Report to download and review the report.
- Click Submit Report For Sign-Off. An email notification is sent to all the Internal Data Controllers.
As an Internal Data Controller, perform the following actions when you receive an email notification to review a scan report:
- Click the URL provided in the email and log in to the CA TDM Portal. The PII Audit Sign-Off page appears.
- Click Download/View Report to download and review the report.
- After reviewing the scan report, click Sign-Off. The Confirm Sign-Off dialog appears.
- Enter your comments and click Sign-Off. The PII Audit Reports page displays a list of PII Audit reports that you have signed off and the reports that are pending approval.
When all the Internal Data Controllers sign off the scan report, an Internal Auditor can view the signed-off report. As an Internal Auditor, perform the following actions to view this report:
- Log in to the CA TDM Portal and navigate to PII Audit, Reports.
- Click Audit Report to download the final signed-off scan report.
To view the final signed-off report, a Management User or an External Auditor requests the Audit Report from a TDE.