Configure Threshold Profiles
Use threshold profiles to monitor specific devices or components, such as interfaces and IPSLA tests, and to define thresholds for individual metrics or sets of metrics. When a metric fails to meet the threshold, the event rule generates a threshold violation event. When the metric returns to acceptable operation, the event rule raises a clear event, which indicates that the violation has been cleared.
You want to monitor the utilization of an interface, and trigger a threshold violation event when the utilization is above 75 percent. When the utilization drops below 75 percent, you want the threshold violation event to clear:
- Add Event Rules: Define the logic in an event rule, which determines when to generate a threshold violation event in response to a metric exceeding the threshold. The example in this topic requires at least two event rules. One event rule determines when to generate a threshold violation event, while the other determines when to clear the violation (that is, when to raise a clear event).
- Add Conditions to the Event Rule: Add conditions to the event rule. You can add up to five event rule conditions to an event rule. Event rule conditions automatically appear in a view when the conditions in a certain event rule are met. Event rule conditions include Violation and Cleared.
Threshold Best Practices
Consider the following best practices when defining threshold profiles:
- Increase the granularity and flexibility of thresholding by assigning groups with specific components to the threshold profiles instead of devices.
- Expand threshold monitoring slowly. Start with a small group of components and verify that the monitoring engine does not become degraded. For more information, see Threshold Event Processing Self-Monitoring Metrics.
- Thresholds on components with one-minute polling have a high resource cost to the system.
- Threshold evaluations can slow down after a data aggregator restart while DX NetOps Performance Management processes cached poll data from the data collectors.
Create a Threshold Profile
Users with the Create DA Threshold Profile or the Administer DA Threshold Profile role right can create threshold profiles.
You can edit or delete existing threshold profiles.
Follow these steps:
- Do one of the following:
  - (Administrators) Hover over Administration, Monitored Items Management, and then click Threshold Profiles.
  - (Users) Click the name of your user account in the upper-right corner, and then click Manage Threshold Profiles.
  The Threshold Profiles page appears.
- Do one of the following:
  - Select an existing folder under which you want to create the threshold profile.
  - Create a folder for the threshold profile by clicking New Folder, and then create your folder.
- Right-click the folder, and then click New Profile. The Create / Edit Threshold Profile page appears.
- Specify the required information, and then click Save:
  - Name: Defines the name for the threshold profile.
  - Folder: Specifies the folder under which you want to create this threshold profile.
  - Status: Defines the status of the threshold profile. Values: Enabled or Disabled. Default: Enabled
  - (Administrators Only) Owner: Specifies the owner for the threshold profile. Only the owner or a user with the Administer DA Threshold Profile role right can edit the threshold profile.
The threshold profile is created.
Next step: To enable event rules to generate threshold violation events for metrics that are part of a group, assign groups to the threshold profile.
Add Event Rules to a Threshold Profile
Event rules are based on a single metric family. The metrics determine the conditions that trigger event rules to generate threshold violation events or raise clear events. When a metric exceeds the threshold, the event rule generates a threshold violation event.
Threshold profiles require at least one event rule. You can edit or delete existing event rules.
Prerequisite: You have created or are in the process of creating a threshold profile.
Follow these steps:
- On the Create / Edit Threshold Profile page, in the Event Rules pane, do one of the following:
  - To use an existing event rule as a template, select the event rule that you want to use as a template, and then click Copy.
  The Create / Edit Event Rule dialog opens.
- Complete the following fields, and then click Save:
  - Name: Defines the name for the event rule.
  - Metric Family: Defines the metric family on which this event rule is based.
  - Duration (sec): Specifies the total amount of time, within a specified range of time (the Window), that the metric must violate the event rule condition. The poll cycles that trigger the condition do not need to be consecutive. The duration is cumulative. Default: 300 (5 minutes)
  - Window (sec): Specifies the overall range of time over which the event rule evaluates the condition. The event rule generates a threshold violation event when enough values in this range violate the condition, based on the specified Duration. The range of time is a rolling window. Default: 300 (5 minutes)
  - Aggregation: Specifies whether the threshold applies to an aggregate value of all components for the device. This field appears only when you select a supported metric family. Values: No Aggregation or Aggregate Components by Device. Default: No Aggregation
  The event rule is saved.
- Edit the event rule conditions or add a condition. You can select the Utilization (%) metric only if you have selected the CPU or Memory metric family as the family on which this event rule is based. If you select this metric, you must choose Fixed Value as the Condition Type for the event rule.
- Save the threshold profile.
The event rule is added to the threshold profile.
Threshold Assessment Logic for Window and Duration
DX NetOps Performance Management loads data (points) every minute. The event rules evaluate the threshold against these loaded data points. Threshold processing at the data layer operates in bulk. The monitoring profile's poll rate determines the frequency at which the event rules evaluate poll cycle data. For example, if you have set the poll rate to 1 minute, the event rules evaluate one-minute poll cycle data.
For more information about how to set poll rates, see Configure Monitoring Profiles.
At each poll cycle, the event rule retrieves the items that violate the threshold rules, using the following logic:
show me all items that violated x threshold profile for y duration within z window
If threshold processing returns the item, the event rule generates a threshold violation event if one is not already active.
The event rule raises clear events when the metrics meet the following conditions:
- The latest poll is below the clear threshold value.
- The condition does not meet the criteria to generate a threshold violation event.
If you have a threshold profile applied to a mix of items with both one- and five-minute poll rates, with event rules set to 60 seconds (1 minute), DX NetOps Performance Management evaluates every one-minute poll cycle data point for violations. Every five minutes, DX NetOps Performance Management evaluates the five-minute polled item data points along with the one-minute polled item data points. In this scenario, the one-minute polled items raise events against each poll cycle, the same as the five-minute polled items, while both use the lower one-minute value.
For the event rule to raise a clear event, two things must be true: it no longer has enough violating data points to meet the duration in the window, and it has enough data points that meet the clear criteria for the duration within the window. The event rule always looks at the number of data points in the window.
DX NetOps Performance Management does not create a window upon raise or clear.
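The raise and clear rules described above can be sketched in a few lines of Python. This is an illustrative model only, not the product's implementation: the function names, the boolean lists (one entry per poll cycle, newest last), and the way the window is counted in whole poll cycles are all assumptions made for the example.

```python
def polls_in_window(poll_sec: int, window_sec: int) -> int:
    """Number of poll cycles that fit in the rolling window (assumed model)."""
    return max(1, window_sec // poll_sec)


def meets_duration(flags: list[bool], poll_sec: int,
                   duration_sec: int, window_sec: int) -> bool:
    """True when the flagged polls inside the window add up to the duration.

    The duration is cumulative, so the flagged polls need not be consecutive.
    """
    recent = flags[-polls_in_window(poll_sec, window_sec):]
    return sum(recent) * poll_sec >= duration_sec


def next_action(event_active: bool, violating: list[bool], clearing: list[bool],
                poll_sec: int, duration_sec: int, window_sec: int) -> str:
    """Return 'raise', 'clear', or 'none' for the latest poll cycle."""
    if meets_duration(violating, poll_sec, duration_sec, window_sec):
        # Still violating: raise only if no violation event is already active.
        return "none" if event_active else "raise"
    cleared = meets_duration(clearing, poll_sec, duration_sec, window_sec)
    # Clear only when the latest poll meets the clear criteria and enough
    # clearing polls have accumulated within the window.
    if event_active and clearing and clearing[-1] and cleared:
        return "clear"
    return "none"
```

For example, with 5-minute polling, a Duration of 600, and a Window of 900, `next_action(True, [True, False, False], [False, True, True], 300, 600, 900)` returns `"clear"`, because only one violating poll remains in the window and two polls meet the clear criteria.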
The following are example scenarios assuming the items targeted poll at the default 5-minute poll rate:
- Duration: 300 seconds/5 minutes; Window: 900 seconds/15 minutes. In this instance, any one poll cycle out of the possible three in the Window raises the event. Because a single violating poll cycle meets the Duration, no consecutive violations are required.
- Duration: 600 seconds/10 minutes; Window: 900 seconds/15 minutes. In this instance, any two poll cycles out of the possible three in the Window raise the event. Poll cycles A+B, A+C, or B+C can trigger the condition that raises the event. The poll cycles that trigger the condition do not need to be consecutive. The duration is cumulative.
- Duration: 900 seconds/15 minutes; Window: 900 seconds/15 minutes. In this instance, all three poll cycles out of the possible three in the Window are required to raise the event. Only when poll cycles A, B, and C in a given rolling Window all violate the threshold does the event rule raise the event.
- Duration: 900 seconds/15 minutes; Window: 1800 seconds/30 minutes. In this instance, three poll cycles out of the possible six in the Window raise the event. The poll cycles can be cumulative, not consecutive. As long as any three poll cycle values out of each six evaluated for the rolling Window violate the threshold, the event rule raises an event.
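The scenarios above follow from simple arithmetic: the number of violating poll cycles required is the Duration divided by the poll interval, chosen from the poll cycles that fit in the Window. The following Python sketch enumerates the qualifying combinations. The function names and the A, B, C labels are hypothetical and exist only for illustration.

```python
import math
from itertools import combinations


def polls_required(poll_sec: int, duration_sec: int) -> int:
    """Violating poll cycles needed to accumulate the duration."""
    return math.ceil(duration_sec / poll_sec)


def triggering_sets(poll_sec: int, duration_sec: int, window_sec: int) -> list[str]:
    """List the smallest sets of poll cycles in one window that raise an event.

    Poll cycles are labeled A, B, C, ... in time order (up to 26 per window).
    """
    slots = window_sec // poll_sec
    labels = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:slots]
    need = polls_required(poll_sec, duration_sec)
    return ["+".join(combo) for combo in combinations(labels, need)]
```

With 5-minute polling, `triggering_sets(300, 600, 900)` returns `['A+B', 'A+C', 'B+C']`, which matches the second scenario, and `triggering_sets(300, 900, 900)` returns `['A+B+C']`.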
Best practice: To prevent noisy events (a flapping condition), you can:
- (Recommended) Configure event rules in DX NetOps Spectrum that recognize DX NetOps Performance Management threshold violation events as a flapping condition, and consolidate them as symptoms/events on a single threshold event. For more information about how to configure event rules in DX NetOps Spectrum, see the DX NetOps Spectrum documentation.
- Adjust the event rule to use the window/duration logic. Change the Duration from 900 (15 minutes) in a Window of 1800 (30 minutes) to a Duration of 600 (10 minutes) in a Window of 1800 (30 minutes). While this change makes the event rule more sensitive on the front end, it also makes the event rule more sensitive in terms of clearing the threshold events.
Example: Duration and Window
A monitored device has a poll cycle of 5 minutes. An associated threshold profile has an event rule with a duration of 600 (10 minutes) and a window of 3600 (1 hour). The event rule does not raise an event when a metric triggers the event rule conditions for a single poll result because the 5-minute poll does not reach the 10-minute duration. The event rule raises an event only if a metric triggers the event rule conditions for a second poll result within one hour of the first triggering poll.
When a metric breaches a threshold, the event rule creates a threshold event. When the event rule clears the threshold event, it rechecks the threshold with the next poll cycle. If a metric breaches the threshold again, the event rule creates a new threshold event.
Standard Deviation Event Rule Conditions
Event rules that use standard deviation compare the poll results to the baseline for the device or component. Event rules calculate the baseline and the standard deviation value for the specific hour of the day of the week.
For more information about these calculations, see Baseline Calculations.
An event rule with a standard deviation condition is triggered when the value of the metric differs from the baseline by more than the specified number of standard deviations. For event rules with conditions that use the Above operator, the rule is triggered when the value of the metric exceeds the baseline value plus the number of standard deviations. For event rules with conditions that use the Below operator, the rule is triggered when the value of the metric is lower than the baseline value minus the number of standard deviations.
Example: The baseline is 65% and the standard deviation is 10%. The rule states that an event triggers when CPU utilization is above 2 standard deviations. This event rule condition triggers when the CPU utilization is greater than 85% (65% + 2 × 10%).
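The same arithmetic can be written as a short Python sketch. The function names and the string operator values are assumptions chosen for this illustration and do not represent a product API.

```python
def std_dev_threshold(baseline: float, std_dev: float,
                      deviations: float, operator: str = "Above") -> float:
    """Threshold value implied by a standard deviation condition."""
    offset = deviations * std_dev
    if operator == "Above":
        return baseline + offset
    if operator == "Below":
        return baseline - offset
    raise ValueError(f"unsupported operator: {operator}")


def is_triggered(value: float, baseline: float, std_dev: float,
                 deviations: float, operator: str = "Above") -> bool:
    """True when the polled value crosses the computed threshold."""
    limit = std_dev_threshold(baseline, std_dev, deviations, operator)
    return value > limit if operator == "Above" else value < limit
```

With a baseline of 65, a standard deviation of 10, and 2 deviations, the Above threshold is 85 and the Below threshold is 45. A polled value of exactly 85 does not trigger the Above condition because the value must exceed the threshold.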
Percent of Baseline Event Rule Conditions
Event rules with conditions that use the Percent of Baseline condition type compare the poll results to the calculated baseline plus or minus a percentage of the calculated baseline for the device or component. Metrics trigger event rules when the qualifying poll data meets the criteria that you have specified for a Percent of Baseline event rule condition.
Percent of Baseline event rule conditions are useful when there is either a lot of variation or very little variation in the metric values. Consider using this event rule condition type when the standard deviation is above 3 or extremely low, like 0.1 or 0.0.
Example: The calculated baseline is 60 degrees and the specified Percent of Baseline is 50%. The rule states that an event triggers when the temperature rises more than 50% above the baseline. This event rule condition triggers when the temperature is higher than 90 degrees.
60 + (+50% × 60) = 60 + 30 = 90 degrees
Example: The calculated baseline is 60 degrees and the specified Percent of Baseline is -50%. The rule states that an event triggers when the temperature falls more than 50% below the baseline. This event rule condition triggers when the temperature is lower than 30 degrees.
60 + (-50% × 60) = 60 - 30 = 30 degrees
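As a minimal sketch, the Percent of Baseline calculation can be expressed in Python as follows. The function name is hypothetical; it mirrors the worked examples above rather than any product interface.

```python
def percent_of_baseline_threshold(baseline: float, percent: float) -> float:
    """Baseline plus a signed percentage of the baseline.

    A positive percent gives a threshold above the baseline;
    a negative percent gives a threshold below it.
    """
    return baseline + (percent / 100) * baseline
```

With a baseline of 60, a percent of 50 gives a threshold of 90.0, and a percent of -50 gives a threshold of 30.0.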
Assign a Group to a Threshold Profile
Identify the devices or components that the threshold profile monitors by assigning groups to it. Threshold profiles apply to the components of the devices in those groups that support the selected metric family. Event rules can raise threshold violation events for the metrics that are part of that group.
The event rules for threshold profiles that you assign to a collection apply only to the devices in that collection, and not to components and interfaces in the collection.
Best practice: For components and interfaces, assign groups to threshold profiles for event rules to raise events.
Follow these steps:
- On the Threshold Profiles page, select the threshold profile to which you want to assign a group from the Folder View or Table View.
- Click the Groups tab in the right-hand pane. A list of groups assigned to the threshold profile appears.
- Click Manage. The Assign Groups to Threshold Profiles dialog opens.
- Select the groups from the Available Groups tree that you want to assign to the threshold profile, and then click the right arrow to add them to the Selected list.
The groups that you added to the Selected list are assigned to the threshold profile.
View Threshold Violation Events
Event rules set conditions to raise threshold violation events, such as when a threshold is violated and when the threshold violation is cleared. You can view the threshold violation events that have been generated as a result of a specific event rule on the Threshold Profiles page. You can view the threshold violation events that have been generated as a result of all event rules in all threshold profiles on the Events Display dashboard. This procedure describes how to view threshold violation events on the Threshold Profiles page.
Follow these steps:
- On the Threshold Profiles page, select the threshold profile for which you want to view threshold violation events.
- Click the Events tab in the right-hand pane. A list of events for the threshold profile, including threshold violation events, appears.
- (Optional) Select the threshold violation event, and then click Details. The Event Details dialog opens.
- (Optional) Click Change next to the time range, and select a default time range. You can also set a different time range by selecting Custom Time Range.