Configure Threshold Profiles

Monitor specific devices or components, such as interfaces and IPSLA tests, and determine thresholds for given metrics or sets of metrics using threshold profiles.
Metrics that fail to meet the threshold trigger event rules to generate a threshold violation event. When the metric returns to acceptable operation, the event rule raises a clear event, which indicates that the violation has been cleared.
Example:
You want to monitor the utilization of an interface, and trigger a threshold violation event when the utilization is above 75 percent. When the utilization drops below 75 percent, you want the threshold violation event to clear:
  • Add Event Rules
    Define the logic in an event rule, which determines when it should generate a threshold violation event in response to a metric exceeding the threshold.
    The example in this topic requires at least two event rules. One event rule determines when it generates a threshold violation event, while the other determines when it clears the violation (it raises a clear event).
  • Add Conditions to the Event Rule
    Add conditions to the event rule. You can add up to five event rule conditions to an event rule. Event rule conditions automatically appear in a view when the conditions in a certain event rule are met. Event rule conditions include Violation and Cleared.
Threshold Best Practices
Consider the following best practices when defining threshold profiles:
  • Increase the granularity and flexibility of thresholding by assigning groups with specific components to the threshold profiles instead of devices.
  • Expand threshold monitoring slowly. Start with a small group of components and verify that the monitoring engine does not become degraded.
    For more information, see Threshold Event Processing Self-Monitoring Metrics.
  • Thresholds on components with one-minute polling have a high resource cost to the system.
  • Threshold evaluations can slow down after a data aggregator restart while DX NetOps Performance Management processes cached poll data from the data collectors.
Create a Threshold Profile
Users with the Create DA Threshold Profile or the Administer DA Threshold Profile role right can create threshold profiles.
You can edit or delete existing threshold profiles.
Follow these steps:
  1. Do one of the following:
    • (Administrators) Hover over Administration, Monitored Items Management, and then click Threshold Profiles.
    • (Users) Click the name of your user account in the upper-right corner, and then click Manage Threshold Profiles.
    The Threshold Profiles page appears.
  2. Do one of the following:
    • Select an existing folder under which you want to create the threshold profile.
    • Create a folder for the threshold profile by clicking New Folder, and then create your folder.
  3. Right-click the folder, and then click New Profile.
    The Create / Edit Threshold Profile page appears.
  4. Specify the required information, and then click Save:
    • Name
      Defines the name for the threshold profile.
    • Folder
      Specifies the folder under which you want to create this threshold profile.
    • Status
      Defines the status of the threshold profile.
      Values: Enabled or Disabled
      Default: Enabled
    • (Administrators Only) Owner
      Specifies the owner for the threshold profile. Only the owner or a user with the Administer DA Threshold Profile role right can edit the threshold profile.
The threshold profile is created.
Next step:
To enable event rules to generate threshold violation events for metrics that are part of a group, assign groups to the threshold profile.
Add Event Rules to a Threshold Profile
Event rules are based on a single metric family. The metrics determine the conditions that trigger event rules to generate threshold violation events or raise clear events. When a metric exceeds the threshold, the event rule generates a threshold violation event.
Threshold profiles require at least one event rule. You can edit or delete existing event rules.
Prerequisite:
You have created or are in the process of creating a threshold profile.
Follow these steps:
  1. On the Create / Edit Threshold Profile page, in the Event Rules pane, do one of the following:
    • Click New.
    • To use an existing event rule as a template, select the event rule, and then click Copy.
    The Create / Edit Event Rule dialog opens.
  2. Complete the following fields, and then click Save:
    • Name
      Defines the name for the event rule.
    • Metric Family
      Defines the metric family on which this event rule is based.
    • Duration (sec)
      Specifies the total amount of time, within a specified range of time (the Window), that the metric must violate the event rule condition. The poll cycles that trigger the condition do not need to be consecutive. The duration is cumulative.
      Default: 300 (5 minutes)
    • Window (sec)
      Specifies the overall range of time over which the event rule evaluates the condition. The event rule generates a threshold violation event when enough values in this range violate the condition, based on the specified Duration. The range of time is a rolling window.
      Default: 300 (5 minutes)
    • Aggregation
      Specifies whether the threshold applies to an aggregate value of all components for the device. This field appears only when you select a supported metric family.
      Values: No Aggregation or Aggregate Components by Device
      Default: No Aggregation
    • Edit the event rule conditions or add a condition.
      You can select the Utilization (%) metric only if you have selected the CPU or Memory metric family as the family on which this event rule is based. If you select this metric, you must choose Fixed Value as the Condition Type for the event rule.
    The event rule is saved.
  3. Save the threshold profile.
The event rule is added to the threshold profile.
Threshold Assessment Logic for Window and Duration
DX NetOps Performance Management loads data (points) every minute. The event rules evaluate the threshold against these loaded data points. Threshold processing at the data layer operates in bulk. The monitoring profile's poll rate determines the frequency at which the event rules evaluate poll cycle data. For example, if you have set the poll rate to 1 minute, the event rules evaluate one-minute poll cycle data.
For more information about how to set poll rates, see Configure Monitoring Profiles.
At each poll cycle, the event rule retrieves the items that violate the threshold rules, using the following logic:
show me all items that violated x threshold profile for y duration within z window
If threshold processing returns the item, the event rule generates a threshold violation event if one is not already active.
The event rule raises clear events when the metrics meet the following conditions:
  • The latest poll is below the clear threshold value.
  • The condition does not meet the criteria to generate a threshold violation event.
If a threshold profile applies to a mix of items with both one-minute and five-minute poll rates, and its event rules use Duration and Window values of 60 seconds (1 minute), DX NetOps Performance Management evaluates every one-minute poll cycle data point for violations. Every five minutes, DX NetOps Performance Management evaluates the five-minute polled item data points along with the one-minute polled item data points. In this scenario, both the one-minute and the five-minute polled items raise events against each of their poll cycles, because both use the lower one-minute Duration and Window values.
For the event rule to raise a clear event, the window must no longer contain enough violating data points to meet the duration, and it must contain enough data points that meet the clear criteria for the duration. The event rule always evaluates the number of data points in the window. DX NetOps Performance Management does not create a new window when an event is raised or cleared.
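The raise and clear behavior described above can be illustrated with a small state sketch. This is a simplified model under stated assumptions, not the product's implementation: the class name `EventRuleSketch`, its parameters, and the idea of expressing Duration as a number of polls (`needed`) are all hypothetical.

```python
from collections import deque


class EventRuleSketch:
    """Hypothetical model of raise/clear evaluation over a rolling window."""

    def __init__(self, violate_above, clear_below, needed, window_polls):
        self.violate_above = violate_above   # violation threshold
        self.clear_below = clear_below       # clear threshold
        self.needed = needed                 # polls required to meet Duration
        self.values = deque(maxlen=window_polls)  # rolling Window
        self.active = False                  # True while a violation is open

    def poll(self, value):
        """Add one poll result; return 'raise', 'clear', or None."""
        self.values.append(value)
        violating = sum(v > self.violate_above for v in self.values)
        clearing = sum(v < self.clear_below for v in self.values)
        if violating >= self.needed:
            if not self.active:
                self.active = True
                return "raise"
            return None  # a violation event is already active
        if self.active and clearing >= self.needed:
            self.active = False
            return "clear"
        return None


# Two of three polls must violate 75 percent utilization.
rule = EventRuleSketch(violate_above=75, clear_below=75, needed=2, window_polls=3)
print([rule.poll(v) for v in [80, 60, 90, 50, 40, 85]])
```

In this sketch, the event is raised on the third poll (two violating values in the window) and cleared on the fourth, once the window no longer holds two violating values and holds two values below the clear threshold.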
The following are example scenarios assuming the items targeted poll at the default 5-minute poll rate:
  • Duration: 300 seconds/5 minutes; Window: 900 seconds/15 minutes.
    In this instance, any one poll cycle out of the possible three in the Window raises the event. A single violating poll cycle is enough to meet the Duration.
  • Duration: 600 seconds/10 minutes; Window: 900 seconds/15 minutes.
    In this instance, any two poll cycles out of the possible three in the Window raise the event. In this example, poll cycles A+B, A+C, or B+C can trigger the condition that raises the event. The poll cycles that trigger the condition do not need to be consecutive.
  • Duration: 900 seconds/15 minutes; Window: 900 seconds/15 minutes.
    In this instance, three poll cycles out of the possible three in the Window raise the event. Only when poll cycles A, B, and C in a given rolling Window all violate the threshold does the event rule raise the event. Because every poll cycle in the Window must violate the threshold, the violating poll cycles are necessarily consecutive.
  • Duration: 900 seconds/15 minutes; Window: 1800 seconds/30 minutes.
    In this instance, three poll cycles out of the possible six in the Window raise the event. The violating poll cycles are counted cumulatively and do not need to be consecutive. As long as any three poll cycle values out of the six evaluated for the rolling Window violate the threshold, the event rule raises an event.
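The counting logic behind these scenarios can be sketched in a few lines of Python. This is only an illustration of the Duration and Window arithmetic, not the product's implementation. The function name `violates` and its parameters are hypothetical, and the sketch assumes that each violating poll contributes one poll interval toward the Duration.

```python
def violates(flags, poll_sec, duration_sec, window_sec):
    """Return True when the most recent window holds enough violating polls.

    flags: list of booleans, oldest first; True means the poll violated
    the event rule condition. All names here are hypothetical.
    """
    # Number of poll cycles that fit in the rolling Window (at least one).
    polls_in_window = max(1, window_sec // poll_sec)
    # Assumption: each violating poll counts for one poll interval,
    # so the Duration is met by ceil(duration / poll interval) polls.
    polls_needed = -(-duration_sec // poll_sec)
    recent = flags[-polls_in_window:]
    return sum(recent) >= polls_needed


# 5-minute polling, Duration 600, Window 900: any two of three polls.
print(violates([True, False, True], 300, 600, 900))    # A+C violate
print(violates([True, False, False], 300, 600, 900))   # only A violates
```

The first call meets the Duration with two non-consecutive violating polls. The second does not, because a single 5-minute poll cannot reach the 10-minute Duration.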
Best practice:
To prevent noisy events (a flapping condition), you can:
  • (Recommended) Configure event rules in DX NetOps Spectrum that recognize DX NetOps Performance Management threshold violation events as a flapping condition, and consolidate them as symptoms/events on a single threshold event.
    For more information about how to configure event rules in DX NetOps Spectrum, see the DX NetOps Spectrum documentation.
  • Adjust the event rule to use the window/duration logic. Change the Duration from 900 (15 minutes) in a Window of 1800 (30 minutes) to a Duration of 600 (10 minutes) in a Window of 1800 (30 minutes). While this change makes the event rule more sensitive on the front end, it also makes the event rule more sensitive in terms of clearing the threshold events.
Example: Duration and Window
A monitored device has a poll cycle of 5 minutes. An associated threshold profile has an event rule with a duration of 600 (10 minutes) and a window of 3600 (1 hour). The event rule does not raise an event when a metric triggers the event rule conditions for a single poll result because the 5-minute poll does not reach the 10-minute duration. The event rule raises an event only if a metric triggers the event rule conditions for a second poll result within one hour of the first triggering poll.
When a metric breaches a threshold, the event rule creates a threshold event. When the event rule clears the threshold event, it rechecks the threshold with the next poll cycle. If a metric breaches the threshold again, the event rule creates a new threshold event.
Standard Deviation Event Rule Conditions
Event rules that use standard deviation compare the poll results to the baseline for the device or component. Event rules calculate the baseline and the standard deviation value for the specific hour of the day of the week.
For more information about these calculations, see Baseline Calculations.
A standard deviation condition is triggered when the value of the metric differs from the baseline by the specified number of standard deviations. For event rules with conditions that use the Above operator, the rule is triggered when the value of the metric exceeds the baseline value plus the specified number of standard deviations. For event rules with conditions that use the Below operator, the rule is triggered when the value of the metric is lower than the baseline value minus the specified number of standard deviations.
Example:
The baseline is 65% and the standard deviation is 10%. The rule states that an event triggers when CPU utilization is above 2 standard deviations. This event rule condition triggers when the CPU utilization is greater than 85% (65% + 2 × 10%).
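The arithmetic in this example can be reproduced with a short sketch. The function names `std_dev_limit` and `is_triggered` and their parameters are hypothetical and serve only to illustrate the Above and Below operators described above. The baseline and standard deviation are taken as inputs here, not calculated.

```python
def std_dev_limit(baseline, std_dev, count, operator="Above"):
    """Return the value a metric must cross to trigger the condition.

    All names are hypothetical; operator is "Above" or "Below".
    """
    offset = count * std_dev
    return baseline + offset if operator == "Above" else baseline - offset


def is_triggered(value, baseline, std_dev, count, operator="Above"):
    """Compare one poll result against the computed limit."""
    limit = std_dev_limit(baseline, std_dev, count, operator)
    return value > limit if operator == "Above" else value < limit


# Example from this topic: baseline 65%, standard deviation 10%, 2 deviations.
print(std_dev_limit(65, 10, 2))      # 85
print(is_triggered(86, 65, 10, 2))   # True
print(is_triggered(85, 65, 10, 2))   # False, 85 is not greater than 85
```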
Percent of Baseline Event Rule Conditions
Event rules with conditions that use the Percent of Baseline condition type compare the poll results to the calculated baseline plus or minus a percentage of the calculated baseline for the device or component. Metrics trigger event rules when the qualifying poll data meets the criteria that you have specified for a Percent of Baseline event rule condition.
Percent of Baseline event rule conditions are useful when there is either a lot of variation or very little variation in the metric values. Consider using this event rule condition when the standard deviation is above 3 or extremely low, like 0.1 or 0.0.
Examples:
The calculated baseline is 60 degrees and the specified Percent of Baseline is 50%. The rule states that an event triggers when the temperature rises more than 50% above the baseline. This event rule condition triggers when the temperature is higher than 90 degrees.
Math: 60 + (+50%*60) = 60 + 30 degrees = 90 degrees
The calculated baseline is 60 degrees and the specified Percent of Baseline is -50%. The rule states that an event triggers when the temperature falls more than 50% below the baseline. This event rule condition triggers when the temperature is lower than 30 degrees.
Math: 60 + (-50%*60) = 60 - 30 degrees = 30 degrees
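The same math can be written as a small helper. This is a sketch only: the function names are hypothetical, and treating a positive percentage as an upper limit and a negative percentage as a lower limit is an assumption based on the two examples above.

```python
def percent_of_baseline_limit(baseline, percent):
    """Return the baseline plus (or minus) the given percentage of the baseline."""
    return baseline + (percent / 100) * baseline


def is_triggered(value, baseline, percent):
    """Assumption: a positive percent is an upper limit, a negative one a lower limit."""
    limit = percent_of_baseline_limit(baseline, percent)
    return value > limit if percent >= 0 else value < limit


# Examples from this topic: baseline of 60 degrees.
print(percent_of_baseline_limit(60, 50))    # 90.0
print(percent_of_baseline_limit(60, -50))   # 30.0
print(is_triggered(95, 60, 50))             # True, 95 is higher than 90
```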
Assign a Group to a Threshold Profile
Identify the devices or components that the threshold profile monitors by assigning groups to it. Threshold profiles apply to the components of the devices in those groups that support the selected metric family. Event rules can raise threshold violation events for the metrics that are part of that group.
The event rules for threshold profiles that you assign to a collection apply only to the devices in that collection, and not to components and interfaces in the collection.
Best practice:
For components and interfaces, assign groups to threshold profiles for event rules to raise events.
Follow these steps:
  1. On the Threshold Profiles page, select the threshold profile to which you want to assign a group from the Folder View or Table View.
  2. Click the Groups tab in the right-hand pane.
    A list of groups assigned to the threshold profile appears.
  3. Click Manage.
    The Assign Groups to Threshold Profiles dialog opens.
  4. Select the groups from the Available Groups tree that you want to assign to the threshold profile, and then click the right arrow to add them to the Selected list.
  5. Click OK.
The groups that you added to the Selected list are assigned to the threshold profile.
View Threshold Violation Events
Event rules set conditions to raise threshold violation events, such as when a threshold is violated and when the threshold violation is cleared. You can view the threshold violation events that have been generated as a result of a specific event rule on the Threshold Profiles page. You can view the threshold violation events that have been generated as a result of all event rules in all threshold profiles on the Events Display dashboard. This procedure details how to view threshold violation events on the Threshold Profiles page.
Follow these steps:
  1. On the Threshold Profiles page, select the threshold profile for which you want to view threshold violation events.
  2. Click the Events tab in the right-hand pane.
    A list of events for the threshold profile, including threshold violation events, appears.
  3. (Optional) Select the threshold violation event, and then click Details.
    The Event Details dialog opens.
  4. (Optional) Click Change next to the time range, and select a default time range.
    You can also set a different time range by selecting Custom Time Range.