Threshold Monitoring and Threshold Limiter Behavior

The threshold limiter monitors how long the evaluation engine takes to process rules in the Data Aggregator. If the evaluation time exceeds the specified percentage of the poll cycle, the evaluation engine enters a DEGRADED state. In the DEGRADED state, the evaluation engine waits for the evaluation time to drop below the specified percentage. After a specified time, the threshold monitoring engine reassesses whether to suspend threshold evaluations. If the evaluation time continues to exceed the percentage during that period, threshold evaluations are suspended. Suspended evaluations do not resume automatically, even if you restart the Data Aggregator. If the evaluation time no longer exceeds the specified percentage, the evaluation engine returns to normal operation.
Do not modify the threshold limiter settings. The default settings provide protection against potential polled data loss. For any changes to the limiter settings, contact CA Support.
View the Threshold Monitoring Dashboard
The Threshold Monitoring dashboard provides information about the state of the threshold monitoring engine. Use this dashboard to see changing trends over time. This dashboard includes two views:
  • The Number of Event Rules Evaluated - Total
    This view displays the number of actual rule evaluations that have occurred for an associated set of polled items.
  • Percentage of Poll Cycle to Complete Event Processing
    This view displays the percentage of the poll cycle that the threshold monitoring engine takes to complete event processing.
Follow these steps:
  1. Navigate to Administration, and click a Data Aggregator data source.
  2. In the Tree View tab, expand the All Data Aggregators collection, and select the same Data Aggregator data source.
  3. Click the name of the Data Aggregator on the Details tab.
  4. Click the Threshold Monitoring tab in the Data Aggregator Pages view.
Threshold Monitoring Engine Status Events
Threshold monitoring engine status events describe the status of threshold evaluations. You can see these events in the Data Aggregator Events tab. The following table shows the possible Threshold monitoring engine status events:
Event Type | Event Subtype | Description
Administration event | Threshold monitoring engine status | Threshold evaluations have been enabled.
Administration event | Threshold monitoring engine status | The Threshold Monitoring Engine has transitioned to a degraded state. This means that threshold evaluations are taking longer than the configured threshold {X} and will be suspended in {X} minutes if this condition persists.
Administration event | Threshold monitoring engine status | Threshold evaluations have been disabled by the System Administrator.
Administration event | Threshold monitoring engine status | Threshold evaluation operations are still taking longer than the configured threshold {X}. The system is suspending threshold evaluations. Please contact the System Administrator to evaluate the monitoring configuration.
Administration event | Threshold monitoring engine status | The Threshold Monitoring Engine is no longer degraded and is functioning normally.
Administration event | Threshold monitoring engine status | The Threshold Monitoring limiter has been disabled.
Administration event | Threshold monitoring engine status | The Threshold Monitoring limiter has been enabled.
Administration event | Threshold monitoring engine status | Threshold evaluation operations are longer than the configured maximum threshold {X}. The system is suspending threshold evaluations. Please contact the System Administrator to evaluate the monitoring configuration.
Take Action If Threshold Evaluations Are Suspended
If threshold evaluations are suspended, consider the following options before you resume evaluations:
  • Try to correlate the change in performance to configuration changes in the system.
  • Reduce the overall number of active event rules. Turn off event rules one at a time. Check the performance after you turn off each rule before turning off another rule.
  • Reduce the overall number of active event rules that have windows greater than 300 seconds.
  • Reduce the number of Violation event conditions within event rules.
  • Reduce the number of event rules that use a condition type of Standard Deviation.
  • Verify that only required collections are applied to the monitoring profiles or threshold profiles that contain event rules.
  • Verify that only required devices are contained within collections that are associated with these monitoring profiles or threshold profiles.
Threshold Limiter Behavior
To determine whether to suspend threshold evaluations, the limiter looks at how long the engine takes to evaluate thresholds as a percentage of the poll cycle time.
Percentage of Poll Cycle Threshold
specifies the percentage of the poll cycle that can be used to monitor thresholds. By default, the Percentage of Poll Cycle Threshold attribute value is 80 percent.
For example, for items that are polled at a 5-minute rate, 80 percent of the poll cycle is 4 minutes (240 seconds). The threshold monitoring engine becomes DEGRADED when the engine takes more than 240 seconds to complete threshold evaluations. An event is generated on the Data Aggregator item when the threshold monitoring engine becomes DEGRADED.
Recovery Interval
specifies how long the threshold monitoring engine remains in the DEGRADED state before it reassesses whether to suspend threshold evaluations. By default, the Recovery Interval attribute is 15 minutes. If the processing time does not drop below the specified percentage within the recovery interval, threshold evaluations are suspended. An event is generated on the Data Aggregator item when threshold evaluations are suspended.
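The limiter behavior described above can be sketched as a small state machine. This is an illustrative model only, not the Data Aggregator implementation: the class name LimiterSketch, the function degraded_limit_seconds, and the assumption that the engine is checked once per poll cycle are all hypothetical. The defaults (80 percent, 15 minutes) and the 5-minute poll rate come from the text.

```python
from dataclasses import dataclass

NORMAL, DEGRADED, SUSPENDED = "NORMAL", "DEGRADED", "SUSPENDED"


def degraded_limit_seconds(poll_cycle_seconds: float, percent_threshold: float = 80) -> float:
    """Evaluation time above which the engine is treated as DEGRADED."""
    return poll_cycle_seconds * percent_threshold / 100


@dataclass
class LimiterSketch:
    """Hypothetical model of the limiter; names and timing are assumptions."""
    poll_cycle_seconds: float = 300        # 5-minute poll rate from the example
    percent_threshold: float = 80          # default Percentage of Poll Cycle Threshold
    recovery_interval_minutes: float = 15  # default Recovery Interval
    state: str = NORMAL
    degraded_minutes: float = 0.0

    def observe(self, evaluation_seconds: float) -> str:
        """Record the evaluation time for one poll cycle and return the new state."""
        if self.state == SUSPENDED:
            return self.state  # suspended evaluations are only resumed manually
        limit = degraded_limit_seconds(self.poll_cycle_seconds, self.percent_threshold)
        if evaluation_seconds <= limit:
            # Back under the limit: return to normal operation.
            self.state, self.degraded_minutes = NORMAL, 0.0
        elif self.state == NORMAL:
            # First cycle over the limit: enter DEGRADED and start the clock.
            self.state, self.degraded_minutes = DEGRADED, 0.0
        else:
            # Still over the limit: assume one poll cycle of time has passed.
            self.degraded_minutes += self.poll_cycle_seconds / 60
            if self.degraded_minutes >= self.recovery_interval_minutes:
                self.state = SUSPENDED
        return self.state
```

Under these assumptions, degraded_limit_seconds(300) gives the 240-second limit from the example, and four consecutive over-limit cycles at a 5-minute poll rate move the sketch from NORMAL to SUSPENDED.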
Resume Threshold Evaluations
Threshold evaluations are not resumed automatically after they are suspended. Resume them manually.
If threshold evaluations are suspended frequently, contact CA Support.
Follow these steps:
  1. Enter the following information in a web browser:
    URL: http://DA_host:port/rest/thresholdmonitoring/config
  2. Take note of the ID value of the ThresholdMonitoringConfiguration item.
    Example:
    <ThresholdMonitoringConfigurationList>
      <ThresholdMonitoringConfiguration version="1.0.0">
        <ID>16</ID>
        <ThresholdMonitoringEnabled>true</ThresholdMonitoringEnabled>
        <PercentOfPollCycleThreshold>80</PercentOfPollCycleThreshold>
        <ThresholdMonitoringLimiterEnabled>true</ThresholdMonitoringLimiterEnabled>
        <RecoveryIntervalInMinutes>15</RecoveryIntervalInMinutes>
      </ThresholdMonitoringConfiguration>
    </ThresholdMonitoringConfigurationList>
  3. Open a REST client editor or HTTP tool that sends requests and gets responses.
  4. Set the Content-type to application/xml.
  5. Enter the following filter criteria:
    • URL: http://DA_host:port/rest/thresholdmonitoring/config/ID
      • ID
        The identification number that is assigned to the ThresholdMonitoringConfiguration item.
    • HTTP method = PUT
    • Resume threshold evaluations on the Body tab of the HTTP Request pane:
      <ThresholdMonitoringConfiguration version="1.0.0">
      <ThresholdMonitoringEnabled>true</ThresholdMonitoringEnabled>
      </ThresholdMonitoringConfiguration>
    Threshold evaluations resume.
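For readers who prefer a script to a REST client, the procedure above can be sketched in Python with the standard library. The function names config_id and resume_request are hypothetical, and the sample response is copied from the example in step 2. The sketch only builds the PUT request; sending it, and any authentication your deployment requires, is left to the caller.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Sample GET response, copied from the example in step 2.
SAMPLE_RESPONSE = """<ThresholdMonitoringConfigurationList>
<ThresholdMonitoringConfiguration version="1.0.0">
<ID>16</ID>
<ThresholdMonitoringEnabled>true</ThresholdMonitoringEnabled>
<PercentOfPollCycleThreshold>80</PercentOfPollCycleThreshold>
<ThresholdMonitoringLimiterEnabled>true</ThresholdMonitoringLimiterEnabled>
<RecoveryIntervalInMinutes>15</RecoveryIntervalInMinutes>
</ThresholdMonitoringConfiguration>
</ThresholdMonitoringConfigurationList>"""


def config_id(list_xml: str) -> str:
    """Return the ID of the ThresholdMonitoringConfiguration item (step 2)."""
    node = ET.fromstring(list_xml).find("ThresholdMonitoringConfiguration/ID")
    if node is None or not node.text:
        raise ValueError("no ThresholdMonitoringConfiguration ID in response")
    return node.text.strip()


def resume_request(base_url: str, item_id: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that resumes evaluations (step 5)."""
    body = (
        '<ThresholdMonitoringConfiguration version="1.0.0">'
        "<ThresholdMonitoringEnabled>true</ThresholdMonitoringEnabled>"
        "</ThresholdMonitoringConfiguration>"
    )
    return urllib.request.Request(
        url=f"{base_url}/rest/thresholdmonitoring/config/{item_id}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="PUT",
    )


# Usage: substitute your own http://DA_host:port for base_url, then send with
# urllib.request.urlopen(resume_request(base_url, config_id(response_text))).
```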
Change the Default Behavior of the Threshold Limiter
In some situations, CA Technologies may recommend that you modify the behavior of the threshold limiter.
Follow these steps:
  1. Navigate to the following URL:
    http://DA_host:port/rest/thresholdmonitoring/config
  2. Note the ID value of the ThresholdMonitoringConfiguration item.
  3. Open a REST client editor or HTTP tool that sends requests and gets responses.
  4. Set the Content-type to application/xml.
  5. Enter the following filter criteria:
    • URL: http://DA_host:port/rest/thresholdmonitoring/config/ID
      • ID
        The identification number that is assigned to the ThresholdMonitoringConfiguration item.
    • HTTP method = PUT
    • Set the Percentage of Poll Cycle Threshold and Recovery Interval values on the Body tab of the HTTP Request pane:
      <ThresholdMonitoringConfiguration version="1.0.0">
      <PercentOfPollCycleThreshold>percent</PercentOfPollCycleThreshold>
      <RecoveryIntervalInMinutes>minutes</RecoveryIntervalInMinutes>
      </ThresholdMonitoringConfiguration>
      • percent
        specifies the percentage value.
      • minutes
        specifies the number of minutes to wait before reassessing the threshold monitoring engine.
    The threshold limiter runs with the updated values.
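The request body in step 5 can also be generated rather than typed by hand, which avoids malformed tags. The following is a minimal sketch; the function name limiter_body is hypothetical, and the range checks are an assumption made here for safety, not documented product limits.

```python
import xml.etree.ElementTree as ET


def limiter_body(percent: int, minutes: int) -> str:
    """Build the PUT body that sets the limiter percentage and recovery interval."""
    # Assumed sanity checks; the product's accepted ranges are not stated in this topic.
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    if minutes <= 0:
        raise ValueError("minutes must be a positive number")
    root = ET.Element("ThresholdMonitoringConfiguration", version="1.0.0")
    ET.SubElement(root, "PercentOfPollCycleThreshold").text = str(percent)
    ET.SubElement(root, "RecoveryIntervalInMinutes").text = str(minutes)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")
```

For example, limiter_body(90, 20) returns a single-line XML document that can be pasted into the Body tab or sent with any HTTP tool.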
Disable the Limiter
In rare situations, CA Technologies may recommend that you disable the threshold limiter.
Follow these steps:
  1. Navigate to the following URL:
    http://DA_host:port/rest/thresholdmonitoring/config
  2. Note the ID value of the ThresholdMonitoringConfiguration item.
  3. Open a REST client editor or HTTP tool that sends requests and gets responses.
  4. Set the Content-type to application/xml.
  5. Enter the following filter criteria:
    • URL: http://DA_host:port/rest/thresholdmonitoring/config/ID
      • ID
        The identification number that is assigned to the ThresholdMonitoringConfiguration item.
    • HTTP method = PUT
    • Disable the limiter on the Body tab of the HTTP Request pane:
      <ThresholdMonitoringConfiguration version="1.0.0">
      <ThresholdMonitoringLimiterEnabled>false</ThresholdMonitoringLimiterEnabled>
      </ThresholdMonitoringConfiguration>
    The limiter is disabled and the evaluation engine cannot enter a DEGRADED state.
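To confirm a change, you can request the configuration URL from step 1 again and inspect the returned values. The sketch below parses such a response into a dictionary. The function name read_settings and the dictionary keys are hypothetical, and the sketch assumes the response uses the same element names as the example shown earlier in this topic.

```python
import xml.etree.ElementTree as ET


def read_settings(config_xml: str) -> dict:
    """Parse a threshold monitoring configuration response into a dictionary."""
    root = ET.fromstring(config_xml)
    # Accept either the list wrapper or a single configuration item.
    if root.tag == "ThresholdMonitoringConfiguration":
        item = root
    else:
        item = root.find("ThresholdMonitoringConfiguration")
    if item is None:
        raise ValueError("no ThresholdMonitoringConfiguration item found")

    def flag(tag: str) -> bool:
        return (item.findtext(tag) or "").strip().lower() == "true"

    return {
        "evaluations_enabled": flag("ThresholdMonitoringEnabled"),
        "limiter_enabled": flag("ThresholdMonitoringLimiterEnabled"),
        "percent": int(item.findtext("PercentOfPollCycleThreshold")),
        "recovery_minutes": int(item.findtext("RecoveryIntervalInMinutes")),
    }
```

After the limiter is disabled, limiter_enabled is expected to be False while evaluations_enabled remains True.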