Manage Alarms with Centralized Alarm Policies
An alarm policy defines a set of metrics and alarm conditions in a centralized location, so that monitoring administrators can view and manage alarm reporting easily. Administrators can also create alarm policies in response to new conditions and needs. They can manage all aspects of alarm behavior in an alarm policy; for example, manage the alarm thresholds, timing, and messages configured for alarms. The Alarm Policies feature lets you perform the following actions:
- View a list of alarm policies.
- Add alarm policies.
- Add and delete conditions that trigger an alarm.
- Add alarm conditions to monitor individual devices, a group of devices, or a specific monitoring technology (such as Docker).
- Configure Time Over Threshold alarming to reduce alarm noise to an actionable level.
- Customize alarm messages to provide the information you need.
Prerequisites
The following are the prerequisites for creating an alarm policy:
- Ensure that the robot version is 7.96 or later.
- If your robot version is 9.20 or 9.20S, ensure that the MCS version is 9.20. If this compatibility is not maintained, MCS profiles and alarm policies will not work.
- Ensure that the profile is an enhanced monitoring profile and that it is collecting metrics. For the complete list of supported enhanced profiles, see the Configuring Alarm Thresholds in MCS article in the probes documentation.
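The version rules above can be expressed as a small pre-check. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that versions are written as dotted numbers with an optional letter suffix (for example, 9.20S) and compare component by component.

```python
import re

def parse_version(text):
    """Split a version such as '9.20S' into ((9, 20), 'S')."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)([A-Za-z]*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    return numbers, match.group(2).upper()

def check_prerequisites(robot_version, mcs_version):
    """Return a list of problems; an empty list means the checks pass."""
    problems = []
    robot_numbers, _ = parse_version(robot_version)
    mcs_numbers, _ = parse_version(mcs_version)
    # Rule 1: the robot must be 7.96 or later.
    if robot_numbers < (7, 96):
        problems.append("robot version must be 7.96 or later")
    # Rule 2: robot 9.20 and 9.20S both require MCS 9.20.
    if robot_numbers == (9, 20) and mcs_numbers != (9, 20):
        problems.append("robot 9.20/9.20S requires MCS 9.20")
    return problems

print(check_prerequisites("9.20S", "9.20"))
print(check_prerequisites("7.80", "9.20"))
```

The check does not cover the third prerequisite (an enhanced profile that is collecting metrics), which you still need to verify in the UI.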
Create an Alarm Policy
The complete process to create an alarm policy requires you to work in UMP and Operator Console. You create an enhanced profile in UMP (with metrics collection enabled). You can create an alarm policy in Operator Console only after metrics collection starts.
Follow these steps:
- Log in to UMP.
- Create an enhanced monitoring profile with metrics collection enabled. The following screenshot shows an enhanced monitoring profile in UMP:
- In UMP, select the Operator Console option from the Actions drop-down list. The Operator Console window opens.
- In Operator Console, click Settings in the left pane.
- Click the Alarm Policies card. The Alarm Policies page opens.
- Click the plus icon at the bottom of the page.
- Enter a policy name in the Alarm policy name field. Choose a name that helps you distinguish one policy from other policies. If you are creating an alarm policy for a device or group, you can include the device name or IP address or the group name. Include key words in the name to make it easier to search for a specific policy.
- Click Add condition. The Set Condition dialog opens. This dialog lets you define alarm conditions. An alarm condition defines what is monitored. You can set alarm conditions for a group (device and container), a specific device, or a monitoring technology.
- Select the type of alarm condition on the Set Condition dialog:
- Device: Monitors the state or performance metrics for a device component. To configure an alarm condition for a device, select a device name, the metric, and the component that you want to monitor. The following example screenshot shows the settings for the Device type. When there are multiple hosts that collect the same metric, the list of monitoring hosts also displays in the Set Condition page. You can select only one host at a time to create an alarm condition. Create another condition to collect the metric on the other host. The following example screenshot shows the Select a Monitoring Host section that appears when there are multiple hosts that collect the same metric:
- Monitoring technology: Monitors metrics associated with a specific monitoring technology. To configure an alarm condition for a monitoring technology, select a monitoring technology, a configuration profile, and a metric. The following example screenshot shows the settings for the Monitoring Technology type:
- Group: Monitors the state or performance metrics for a group (container or device). Add alarm conditions that apply to all devices in a group. Groups are displayed as a navigation tree, with container groups followed by subgroups. Expand a container group to select a subgroup. When you create the condition at the container group, all subgroups (child container groups and device groups) in that container group inherit it. Support for container groups is helpful in scenarios where you want a single threshold for each metric on a device. That threshold policy can be rolled down from the container group to the device group and then to the device. To enable the alarm policy functionality for container groups, use the MCS raw configuration to set the value of the enable_container_support_for_alarm_policy parameter in the timed section to true. By default, the value is false. To configure an alarm condition for a group, select the group name and the metric that you want to monitor on all devices in a group. You can also specify whether you want to generate alarms on all the components or on specific components. By default, alarms are generated on all the components. To generate alarms on specific components, use a regular expression to filter the components. Select one of the following options depending on your requirements:
- All Components: Lets you generate alarms on all the components of all devices in a group.
- RegEx: Lets you filter the components based on a regular expression, so that alarms are generated only on the filtered components. Use metacharacters such as * and ? to construct a regular expression for pattern matching. RegEx supports regular expressions written in Perl. For example, to generate alarms on the CPU Usage of CPU-11, CPU-12, and CPU-13 on all the devices in a group, define the regular expression as CPU-1[1-3]. You can also use simple text with wildcard operators to match the target string. For example, the CPU* expression matches all the CPUs on the system (CPU-0, CPU-1, and so on up to CPU-15). There are certain limitations on how you can define specific regular expressions.
The following screenshot shows settings for the Group type:
- Click OK to save the condition information.
- Specify an appropriate priority in the Priority field to evaluate the metric condition for the alarm policy at the group level. The condition that has the highest priority is used for generating alarms on the device. The priority value ranges from 0 through 10000. You can specify the priority value only for an alarm policy at the group level, not at the device level or monitoring technology level. At the device level, the priority of the condition is set to the highest value and it takes precedence over other condition priorities for the same metric on that device. At the monitoring technology level, although the UI does not show the condition priority, CA UIM internally sets the value to 100, which cannot be changed. The default priority value is 100 at the group level and monitoring technology level. For more information about specific use cases, see the related section. The following screenshot shows the priority of a condition for a device-level alarm policy. Note that the priority is set to Highest and the value cannot be changed. The following screenshot shows the priority of a condition for a group-level alarm policy. Note that the priority field shows the default priority of 100; you can change the value in this case.
- Set the alarm threshold by entering the alarm severity, threshold type (static or dynamic), operator, threshold value, and alarm timing (Immediate or Time over Threshold), as needed. If you select Time over threshold, enter the number of minutes, hours, or days the metric needs to violate the threshold value. For example, when the Time over threshold is three hours in four hours, Infrastructure Management generates an alarm when there is a consecutive threshold violation for three hours within a four-hour time period. The following screenshot shows the alarm condition with condition priority as 100 (default value), alarm severity as Critical, threshold type as static, operator as greater than, threshold value as 80, and alarm creation timing as Immediate:
- Click the arrow next to the Alarm messages section to review the default alarm messages. You can also customize the alarm messages to contain additional information.
- Click Save (in the lower right corner) to create an alarm policy with one or more alarm conditions. This alarm policy generates alarms with the default alarm messages when the configured thresholds are violated.
The following example screenshot shows a created alarm policy:

When you create an enhanced profile in UMP and the probe template includes default threshold values, a default alarm policy is created in the Operator Console for this enhanced profile. The creator of the default alarm policy is displayed as CA default policy in the Operator Console. Additionally, when you convert your non-enhanced profile to an enhanced profile, a corresponding alarm policy is created in the Operator Console for the converted profile. Creation of this alarm policy adds threshold values that are present in the non-enhanced profile to the spooler metric (plugin_metric) section. The creator of this alarm policy is displayed as CA profile migration in the Operator Console.
FAQs
This section provides more information on some specific areas related to alarm policy.
How do I create a new alarm policy in disabled state?
When you create an alarm policy in disabled state, the alarm policy is created successfully but it is not enforced by default. This ability gives you the option to evaluate your alarm policy before you enable it to receive alarms.
Follow these steps:
- Click Settings.
- Select the Alarm Policies card. A list of existing alarm policies appears.
- Click the plus icon at the bottom of the page. The new policy screen appears.
- Enter a name in the Alarm Policy Name field.
- Click Add condition.
- Select the type of alarm condition on the Set conditions dialog.
- Select options that apply to the type of alarm condition.
- Click OK to save the condition information.
- Set the alarm threshold. Modify the alarm severity, threshold type (static or dynamic), and alarm timing, as needed.
- Click the Save and Disable button. The alarm policy is created in the disabled state, and the status tag for the newly created alarm policy displays Disabled on the Alarm Policies page.
How do I disable (or enable) an existing alarm policy?
You can disable (or enable) an existing alarm policy. When you disable an alarm policy, you no longer receive any alarms for that policy. This lets you temporarily turn off the alarm policy without deleting it. When you want to receive alarms from the disabled alarm policy again, enable it; you do not need to create a new alarm policy.
Follow these steps:
- Click Settings.
- Select the Alarm Policies card. A list of existing alarm policies appears.
- Click the required alarm policy.
- Toggle the option in the lower left corner to Disabled (or Enabled). The alarm policy is disabled (or enabled) and a relevant confirmation message is displayed. For example, the policy status displays the Disabled tag against the disabled alarm policy when you look at the list of policies on the Alarm Policies page.
The following screenshot shows an example where an existing alarm policy is disabled:
Click the Delete button (in the lower left corner) to delete an existing alarm policy.
How do I disable an alarm condition?
You can disable a specific alarm condition in an alarm policy. When a policy has multiple conditions, disabling one condition does not affect the others: alarms are no longer generated for the disabled condition, while conditions that are still enabled continue to generate alarms. For example, you have created an alarm policy for a device that the File and Directory Scan (dirscan) probe monitors. For the same metric, you have created two separate conditions with different threshold values. You now want to disable one of the conditions.
Follow these steps:
- Click Settings.
- Select the Alarm Policies card. A list of existing alarm policies appears.
- Click the required alarm policy.
- Scroll to the alarm condition that you want to disable.
- Select the Inline Menu, and then select Disable condition.
- Select Save. The condition is disabled and alarms are no longer generated for the disabled alarm condition. The status of the condition (Disabled) is displayed next to it. The following screenshot shows an example:
To enable the condition, select Enable condition, and click Save. The status of the condition is changed and the Disabled tag no longer appears.
What are the limitations for regular expressions usage?
The following regular expressions cannot filter components for a group:
- Incorrect regular expression: CPU-(0|1). Workaround: Use the regular expression CPU-[0-1]. Matches the components CPU-0 and CPU-1.
- Incorrect regular expression: CPU.11. Workaround: Use the regular expression /CPU.11/. Matches the component CPU-11.
- Incorrect regular expression: total/i. Workaround: Use the regular expression /[tT][oO][tT][aA][lL]/. Matches all occurrences of the string total irrespective of the case. That is, the expression matches total, Total, tOtal, toTal, TotAl, TOTAL, and so on.
The following regular expression has limitations on how it searches for the components:
- tmp1|tmp2: Matches all the directories starting with tmp1 (such as tmp1, tmp11, tmp14, tmp156, tmp1.x) and only tmp2.
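Before you enter a filter, it can help to try the workaround patterns against a list of component names. The following sketch uses Python's re module, which is not the engine that the product uses, so treat the results as a rough pre-check only. The component and directory names are made up for illustration. The last part shows one possible explanation for the tmp1|tmp2 behavior (anchors applied to the whole expression bind tighter than the | operator); that explanation is an assumption, not documented behavior.

```python
import re

components = ["CPU-0", "CPU-1", "CPU-2", "CPU-11", "CPU-12", "CPU-13", "CPU-14"]

def matching(pattern, names):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]

# Character classes work where grouping with (0|1) is not supported.
print(matching(r"CPU-1[1-3]", components))
print(matching(r"CPU-[0-1]", components))

# Case-insensitive match spelled out with character classes instead of /i.
labels = ["total", "Total", "TOTAL", "subtotal"]
print(matching(r"[tT][oO][tT][aA][lL]", labels))

# Assumption: if the filter is evaluated like ^tmp1|tmp2$, the first branch
# matches anything that starts with tmp1 and the second anything ending in tmp2.
directories = ["tmp1", "tmp11", "tmp156", "tmp1.x", "tmp2", "tmp21"]
loose = [d for d in directories if re.search(r"^tmp1|tmp2$", d)]
print(loose)
```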
When an alarm policy is created, all alarm policy-related information is written in the plugin_metric configuration file (..\Nimsoft\plugins\plugin_metric\plugin_metric.cfg). MCS deploys the alarm policy to spooler. Spooler reads the configuration and generates alarms based on the condition. plugin_metric.cfg is the central place for all the alarm policies related to all the probes of a robot. The following plugin_metric.cfg snippet shows the information about an alarm policy for the dirscan probe:

Alarm policy logs are available under ..\Nimsoft\probes\service\wasp. The name of the log file is policy_management.log.
How do I correct the plugin_metric file?
When you create an alarm policy or an enhanced profile, its configuration information is written in the plugin_metric file.
In robot versions prior to the secure versions, this information is sometimes not written properly in the plugin_metric file. For example, you create an alarm policy, but that alarm policy configuration is not deployed properly. In this case, the corresponding information is not updated correctly in the plugin_metric file, which causes issues. Similarly, when you delete a child profile from the UMP UI, the same information is not deleted from the plugin_metric file. This issue has been fixed in the robot version released with CA UIM 9.2.0.
To resolve such issues in your environment, you can use the plugin_metric_correction callback that is available for the mon_config_service probe. This callback re-deploys enhanced profiles and alarm policies based on your input.
Follow these steps:
- Ensure that you do not create any MCS profiles or alarm policies when you are performing this operation.
- (Optional) Open the mon_config_service raw configuration and increase the thread count to 10 in the timed section for each parameter:
- device_processing_threads
- config_deployment_threads
- Access the probe utility (pu) for the mon_config_service probe.
- Locate and select the plugin_metric_correction callback from the drop-down list.
- Enter the appropriate information for the following parameters, as required:
- process_all_devices_flag: Enter the value as true if you want to re-deploy enhanced profiles or alarm policies on all the devices. If you select this parameter, all the remaining parameters are not required.
- robot_names: Enter the specific robot name on which you want to re-deploy the enhanced profiles or alarm policies. If you want to use more than one entry, enter a comma-separated list.
- computer_system_ids: Enter the specific computer system ID (cs_id) on which you want to re-deploy the enhanced profiles or alarm policies. If you want to use more than one entry, enter a comma-separated list.
- cm_group_ids: Enter the specific group ID on which you want to re-deploy the enhanced profiles or alarm policies. All the devices that are part of that group are considered for re-deployment. If you want to use more than one entry, enter a comma-separated list.
Note: You can use any combination of robot_names, computer_system_ids, and cm_group_ids.
- Run the callback. A message appears in the right pane stating that the process has started for the devices. However, note that no completion message is displayed. The process completes all related tasks in the background. If you want to check the status, you need to verify the database.
- Verify the status by running the following queries:
- select * from ssrv2policytargetstatus where cs_id in (<ID>);
- select * from ssrv2profile where cs_id in (<ID>);
- Similarly, to find whether any error has occurred, run the following query:
- select * from ssrv2audittrail where userid like 'plugin_correction%';
You have successfully repaired the plugin_metric file.
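If you run this procedure often, you can prepare the callback parameter values and the verification queries with a small helper. The following Python sketch only builds strings; it does not call the probe utility or the database. The function names are hypothetical, and the sketch assumes that computer system IDs and group IDs are integers.

```python
def build_callback_args(process_all=False, robot_names=(),
                        computer_system_ids=(), cm_group_ids=()):
    """Assemble plugin_metric_correction parameter values as strings."""
    if process_all:
        # With the flag set, the remaining parameters are not required.
        return {"process_all_devices_flag": "true"}
    args = {}
    if robot_names:
        args["robot_names"] = ",".join(robot_names)
    if computer_system_ids:
        args["computer_system_ids"] = ",".join(str(int(i)) for i in computer_system_ids)
    if cm_group_ids:
        args["cm_group_ids"] = ",".join(str(int(i)) for i in cm_group_ids)
    if not args:
        raise ValueError("set the flag or name at least one robot, cs_id, or group")
    return args

def status_queries(cs_ids):
    """Return the two status queries from the steps above for the given cs_ids."""
    # int() rejects anything that is not a number before it reaches the SQL text.
    ids = ", ".join(str(int(i)) for i in cs_ids)
    return [
        f"select * from ssrv2policytargetstatus where cs_id in ({ids});",
        f"select * from ssrv2profile where cs_id in ({ids});",
    ]

print(build_callback_args(robot_names=["robot_a", "robot_b"], cm_group_ids=[7]))
for query in status_queries([12, 15]):
    print(query)
```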
Consider the following sample hierarchy to understand various scenarios:
[Figure: Priority Condition for Alarm Policy]

- This sample hierarchy includes a root container group (C1).
- The root container group includes child container groups (C2, C3, C4, C5, and C6).
- Two child container groups (C3 and C6) contain device groups (G1 in C3 and G2 in C6).
- These device groups include certain devices (D1 in G1 and D1, D2 in G2). The device D1 is part of the two device groups G1 and G2.
- An alarm policy condition (PC1, PC2, PC3, PG1, PC4, PC5, PC6, and PG2) is created for each group. The alarm policy conditions PG1 and PG2 are for device groups; all other alarm policy conditions are for container groups.
The following use cases describe how alarm policies are applied to the device D1 in the context of the above hierarchy:
Use Case 1: Alarm policy with the condition having the same metric and the same priority
If a device is part of multiple groups where conditions have the same metric and the same priority, then all the conditions are applied to the device.
For example, if the metrics and priorities are as follows, then all alarm policy conditions PC1, PC2, PC3, PG1, PC4, PC5, PC6, and PG2 are applied and corresponding alarms are generated. In this example, the metric M1 is present in all conditions, and all conditions have the same priority of 100. Therefore, eight alarms are generated in this case.
- PC1: Metric M1, Priority 100
- PC2: Metric M1, Priority 100
- PC3: Metric M1, Priority 100
- PG1: Metric M1, Priority 100
- PC4: Metric M1, Priority 100
- PC5: Metric M1, Priority 100
- PC6: Metric M1, Priority 100
- PG2: Metric M1, Priority 100
Use Case 2: Alarm policy with the condition having the same metric and different priorities
If a device is part of multiple groups where conditions have the same metric and different priorities, then the highest priority is taken into consideration to decide which alarm is generated. CA UIM verifies whether all the conditions for the device contain different priorities for the same metric. If so, the highest priority is taken into consideration.
For example, if the metrics and priorities are as follows, then PC2 and PC4 have the highest priority of 200 for the same metric M1. In this case, only two alarms are generated for these conditions (PC2 and PC4), because they have the highest priority out of all other conditions:
- PC1: Metric M1, Priority 100
- PC2: Metric M1, Priority 200
- PC3: Metric M1, Priority 100
- PG1: Metric M1, Priority 100
- PC4: Metric M1, Priority 200
- PC5: Metric M1, Priority 100
- PC6: Metric M1, Priority 100
- PG2: Metric M1, Priority 100
Use Case 3: Alarm policy with the condition having multiple metrics and the same priority
If a device is part of multiple groups where conditions have multiple metrics and the same priority, then all the metrics will be applied to the device.
For example, if the metrics and priorities are as follows, then two alarms are generated for the metric M1, two for M2, one for M3, one for M4, one for M5, and one for M6:
- PC1: Metric M1, Priority 100
- PC2: Metric M1, Priority 100
- PC3: Metric M2, Priority 100
- PG1: Metric M3, Priority 100
- PC4: Metric M4, Priority 100
- PC5: Metric M5, Priority 100
- PC6: Metric M6, Priority 100
- PG2: Metric M2, Priority 100
Use Case 4: Alarm policy with the condition having multiple metrics and different priorities
If a device is part of multiple groups where conditions have multiple metrics and different priorities, then the highest priority is taken into consideration and the corresponding conditions are applied.
For example, if the metrics and priorities are as follows, then two alarms are generated for the metric M1 because PC2 and PC4 have the highest priority (200):
- PC1: Metric M1, Priority 100
- PC2: Metric M1, Priority 200
- PC3: Metric M2, Priority 100
- PG1: Metric M5, Priority 100
- PC4: Metric M1, Priority 200
- PC5: Metric M1, Priority 100
- PC6: Metric M3, Priority 100
- PG2: Metric M2, Priority 100
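The four use cases follow one rule: among the conditions that reach a device for a given metric, those with the highest priority generate alarms. The following Python sketch models that rule so that you can predict the outcome for your own hierarchy. It is a simplified model, not the product's implementation; the function name is hypothetical, and it assumes that priorities are compared separately for each metric, as the same-metric wording in Use Case 2 suggests.

```python
from collections import defaultdict

def effective_conditions(conditions):
    """conditions: iterable of (name, metric, priority) that apply to one device.
    Return {metric: [names of the conditions that generate alarms]}."""
    by_metric = defaultdict(list)
    for name, metric, priority in conditions:
        by_metric[metric].append((name, priority))
    result = {}
    for metric, entries in by_metric.items():
        top = max(priority for _, priority in entries)
        # Every condition that shares the top priority is applied.
        result[metric] = [name for name, priority in entries if priority == top]
    return result

use_case_2 = [
    ("PC1", "M1", 100), ("PC2", "M1", 200), ("PC3", "M1", 100), ("PG1", "M1", 100),
    ("PC4", "M1", 200), ("PC5", "M1", 100), ("PC6", "M1", 100), ("PG2", "M1", 100),
]
use_case_4 = [
    ("PC1", "M1", 100), ("PC2", "M1", 200), ("PC3", "M2", 100), ("PG1", "M5", 100),
    ("PC4", "M1", 200), ("PC5", "M1", 100), ("PC6", "M3", 100), ("PG2", "M2", 100),
]

print(effective_conditions(use_case_2))
print(effective_conditions(use_case_4)["M1"])
```

For Use Case 2 and for metric M1 in Use Case 4, the model selects PC2 and PC4, which agrees with the examples above.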
Upgrade/Migrate Scenarios
While upgrading/migrating from a previous version to 9.2.0, the following scenarios are considered:
- When you upgrade an existing alarm policy (created in 9.0.2) to 9.2.0, the priority of the condition for the upgraded alarm policy is set to 100 at the group level and monitoring technology level and to the highest value at the device level. The behavior of the upgraded alarm policy is the same as explained in the above-mentioned use cases (Use Case 1, Use Case 2, Use Case 3, and Use Case 4).
- When you migrate a device-level legacy profile to an enhanced profile, the priority of the condition for the device-level alarm policy always gets the highest priority.
- When you migrate a group-level legacy profile to an enhanced profile, the priority of the condition for the group-level alarm policy takes the same priority as that of the profile.
Additional Considerations
Review the following considerations:
- The metric_precedence parameter in the plugin_metric.cfg file is updated with the condition priority.
- When a new container is added to the hierarchy or an existing one is deleted from the hierarchy, the alarm policy is applied based on the new hierarchy. If the condition priority is the same, all the alarm policies in the hierarchy are applied to the device.
- When an alarm policy is deleted from the hierarchy, all related entries are removed from the database and the plugin_metric.cfg file.
- For two different alarm policy conditions for the same device and the same metric, alarms are generated from both the conditions as the priority remains the same for both of them.
- If an alarm policy has multiple conditions and you make any update to the alarm policy, the priority of the conditions changes accordingly.
How do I determine if an alarm policy needs to be updated?
Observe the existing alarms in the Alarms view. You might find that too many alarms are generated for a metric, that the performance levels you want to monitor are outside the industry norm, or that you want to differentiate monitoring for regional and global locations to account for localized issues. Once you develop a monitoring strategy, you can change alarm behavior by opening the alarm policy that generates the alarms and updating, adding, or deleting the alarm thresholds. See the next topic for information about accessing a specific alarm policy.

How do I access alarm policies?
Follow these steps:
- Click Settings.
- Select the Alarm Policies card. A list of existing alarm policies appears.
- From the Alarm Policies view, click a policy name to view the configuration. Use the "Custom filter" field to quickly search for a specific policy. Click the column headings to sort policies alphabetically by technology, policy name, or creator.
The following information is provided in the policy list to help you locate a specific alarm policy.
- Monitor: Displays the monitoring technology for an alarm policy.
- Alarm policy: Provides the policy name and the metrics that are configured in the policy. The alarm policy name is either the name of the monitoring profile from which the alarm policy was generated, or the name you entered when you created the policy. Mouse over the metrics under the policy name to see a complete list of metrics configured in the policy.
- Applies to: Shows the device, group, component, combination of components monitored by a policy, and the type of target being monitored.
- Creator: Displays the name of the user who created an alarm policy, or CA default policy if Infrastructure Management generated the alarm policy automatically. The date reflects the policy creation date or the date the policy was last updated.
Can I create several alarm conditions for the same metric?
You can configure several alarm conditions for the same metric. In the same alarm policy, you could configure the same alarm condition for the same metric, but apply the metric thresholds to different groups. This provides consistent monitoring across the devices in various groups.
Example:
A monitoring administrator monitors Windows devices for the San Francisco, Chicago, and Boston business units. The Windows devices are grouped by business unit. Because alarm policies can contain alarm threshold configuration for more than one device, group, or technology, the monitoring administrator creates a single alarm policy to apply to the devices in the three business units individually.
One way to configure the alarm policy is to create an alarm condition for each group, and each metric to be monitored. The following table shows an alarm condition created for the Boston and Chicago groups:
Condition | Group | Metric | Monitoring probe | Component | Priority | Thresholds |
Generate an alarm when the configured thresholds are violated. | Boston | Up time | cdm | All components | 100 | Critical, static, greater than, 80, Immediate |
Generate an alarm when the configured thresholds are violated. | Chicago | Up time | cdm | All components | 100 | Critical, static, greater than, 80, Immediate |
Why would I change alarm thresholds?
Configured alarm thresholds are carried over from a monitoring profile during the one-time alarm policy generation process. You might want to change the threshold settings for the following reasons:
- The alarm severity is too high or low.
- Instead of receiving persistent (immediate) alarms, you want to receive alarms only after successive alarm threshold violations have occurred within a configured window of time (Time over threshold).
- You want different performance thresholds for regional groups of computers, or for older versus new devices and servers.
How do I modify, add, or delete alarm thresholds?
Generated alarm policies provide alarms based on predefined, best practices monitoring. Update the threshold settings to reflect your monitoring needs.
Follow these steps:
- In an alarm policy, scroll to the desired alarm condition.
- Click Expand (v) to view the configured thresholds.
- Modify the configured alarm severity, threshold type (static or dynamic), operator, or threshold value as needed.
- Modify the configured alarm creation timing. If you select Time over threshold, enter the number of minutes, hours, or days the metric needs to violate the threshold value. Next, enter the number of minutes, hours, or days to specify the total window of time. For example, when the Time over threshold is three hours in four hours, Infrastructure Management generates an alarm when there is a consecutive threshold violation for three hours within a four-hour time period.
- Click Add or Delete to add or delete thresholds for a metric.
- Click Save (lower right corner) to save your change to the alarm policy. Note: You cannot save updates to an alarm policy until you have entered the required information for each threshold configured in an alarm condition. If you delete a threshold, alarms that were previously generated remain in the system until the close alarm rule time frame is reached.
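To get a feel for how a Time over threshold setting behaves before you save it, you can replay recent metric values against the rule. The following Python sketch is one reading of the "three hours in four hours" example: an alarm is raised when the trailing window contains a consecutive run of violations that is at least as long as the required duration. The function name, the fixed sampling interval, and the greater-than operator are assumptions; the product's internal evaluation may differ.

```python
import math

def time_over_threshold(samples, threshold, required, window, interval=1):
    """samples: metric values, oldest first, one every `interval` time units.
    required and window use the same time unit as interval (for example, hours).
    Return True when the trailing window holds a consecutive run of
    violations (value > threshold) lasting at least `required`."""
    window_size = window // interval
    if window_size < 1:
        raise ValueError("window must cover at least one sample")
    needed = max(1, math.ceil(required / interval))
    run = 0
    for value in samples[-window_size:]:
        # A non-violating sample resets the consecutive run.
        run = run + 1 if value > threshold else 0
        if run >= needed:
            return True
    return False

# Hourly CPU usage, threshold 80, rule "three hours in four hours".
print(time_over_threshold([70, 85, 90, 95], 80, required=3, window=4))
print(time_over_threshold([85, 90, 70, 95], 80, required=3, window=4))
```

In the second call, the dip to 70 breaks the run, so three violating samples in the window are not enough under this reading.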
Can I configure more than one threshold for a metric?
You can configure more than one threshold for a metric to track different severities. The following scenario describes a case in which several thresholds for a metric alert an administrator to perform different actions to address performance issues.
Use Case
To help you keep track of the user experience or determine when to upgrade equipment, you could configure different thresholds for CPU Usage. For example, you could configure the following three thresholds to generate alarms for different purposes:
- To help you determine when equipment should be updated or replaced, configure a threshold that generates a critical alarm when CPU usage is at 95 percent for 24 hours within a 36-hour window (time over threshold alarming).
- Configure a second threshold to generate a major alarm any time CPU usage exceeds 90 percent (immediate alarm). This alarm could help you track processing jobs that should be scheduled to run after hours.
- Generate a minor alarm when CPU usage is greater than 60 percent for 4 days within a 5-day window of time (time over threshold alarming). This alarm would let you know that users are experiencing data processing delays.
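The three thresholds in this use case can be written down as data and replayed against sample values to see which severities would fire. The following Python sketch is illustrative only: it assumes hourly samples, treats "at 95 percent" as greater than or equal to 95, and treats Time over threshold as a consecutive run of violations inside the window. The names THRESHOLDS and fired_severities are hypothetical.

```python
import operator

def longest_run(flags):
    """Length of the longest consecutive run of True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best

# (severity, comparison, value, required_hours, window_hours); 0 hours means Immediate.
THRESHOLDS = [
    ("critical", operator.ge, 95, 24, 36),
    ("major", operator.gt, 90, 0, 0),
    ("minor", operator.gt, 60, 4 * 24, 5 * 24),
]

def fired_severities(hourly_values):
    """Return the severities whose threshold is violated by the samples."""
    fired = []
    for severity, compare, limit, required, window in THRESHOLDS:
        if required == 0:
            # Immediate alarm: only the latest sample matters.
            hit = compare(hourly_values[-1], limit)
        else:
            flags = [compare(value, limit) for value in hourly_values[-window:]]
            hit = longest_run(flags) >= required
        if hit:
            fired.append(severity)
    return fired

# Five days at 65 percent followed by a spike to 92 percent.
print(fired_severities([65] * 119 + [92]))
```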
The following screenshot shows several thresholds that are configured for a single metric.

How do I edit an alarm condition?
For any alarm condition, you can modify what is being monitored, the selected metric, and the threshold. You can also monitor the same metric for a device or group, or configure an alarm condition for a technology. When you configure alarm conditions for a technology, the alarm condition is applied to any device with that technology in your environment.
Follow these steps:
- Within an alarm policy, scroll to the Condition that you want to change.
- Click Edit.
- Modify any option on the Set condition dialog.
- Expand (v) Type, Device, Metric, Component, Monitoring technology, or Group.
- Select the desired setting.
- If you change the type of condition, ensure that all the options are configured.
- Click OK to save your updates.
- Expand (v) Thresholds.
- Modify existing alarm thresholds, if needed.
- Click Add threshold to configure another threshold.
- Select an alarm severity, the type of threshold, an operator, and enter a threshold value.
- Next, select the timing for an alarm.
- Click Remove threshold to delete a configured threshold.
- Click Save (lower right corner) to save the updates to the alarm policy.
How do I delete an alarm condition?
When you delete an alarm condition from a policy, alarms are no longer generated for the metric. If the metric is enabled, CA UIM continues to generate metric data. CA UIM saves the alarm history for the configured period of time.
Follow these steps:
- Scroll to the alarm condition you want to delete.
- Click the Inline Menu, and then select Delete condition. Alarms are no longer generated for the deleted alarm condition.
How do I customize alarm messages?
Each alarm policy can have up to three predefined alarm messages: a general message, a Time Over Threshold message, and a close alarm message. These predefined messages provide sufficient information to help you troubleshoot an issue. However, you can customize the alarm messages to contain additional information. For each type of predefined message, there is a list of supported variables that you can use in a message to indicate the exact device and threshold violation details. A general and close alarm message appears for each alarm policy. The Time over Threshold violation alarm message appears after a Time over Threshold alarm is configured.
The default alarm violation messages and variables are:
- Immediate threshold violation message: ${metric_name} on ${component_name} for ${device_name} is at ${metric_value} ${metric_unit}. Example: CPU monitor on C:/ for test_system is at 90 percent.
- Time Over Threshold violation message: ${metric_name} on ${component_name} for ${device_name} is at ${metric_value} ${metric_unit}. It has violated the threshold for at least ${tot_slider} ${tot_slider_unit} out of ${tot_time_frame} ${tot_time_frame_unit}. Example: CPU monitor on C:/ for test_system is at 90%. It has violated the threshold for at least 1 minute out of 5 minutes.
- Close alarm message: ${metric_name} on ${component_name} for ${device_name} is OK. Example: CPU monitor on C:/ for test_system is OK.
You can customize any of the default alarm violation messages to provide information that is relevant to your environment. You can enter text that describes a business location, or add variables that provide the information you want. For a complete list of supported variables, see the Alarm Message Variables topic.
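As a rough illustration of how the ${variable} placeholders expand, the following Python sketch renders the default immediate-violation message with the standard library's string.Template, which happens to use the same ${name} syntax. It is not how CA UIM builds messages internally. The sample values, the render_message helper, and the location text in the customized message are assumptions made for this example.

```python
from string import Template

# Default immediate-violation message, using the variables listed above.
DEFAULT_MESSAGE = Template(
    "${metric_name} on ${component_name} for ${device_name} "
    "is at ${metric_value} ${metric_unit}."
)

# Hypothetical customized message that adds a business location as plain text.
CUSTOM_MESSAGE = Template(
    "[Austin data center] ${metric_name} on ${device_name} "
    "is at ${metric_value} ${metric_unit}."
)


def render_message(template, values):
    """Fill in the variables; unknown variables are left in place, not raised."""
    return template.safe_substitute(values)


# Hypothetical values for one threshold violation.
sample = {
    "metric_name": "CPU monitor",
    "component_name": "C:/",
    "device_name": "test_system",
    "metric_value": "90",
    "metric_unit": "percent",
}

default_text = render_message(DEFAULT_MESSAGE, sample)
custom_text = render_message(CUSTOM_MESSAGE, sample)
print(default_text)
print(custom_text)
```

Rendering a draft message this way before saving it can help confirm that every variable name is spelled as listed in the Alarm Message Variables topic; a misspelled variable stays visible in the output as unexpanded ${...} text.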
Follow these steps:
- Within an alarm policy, scroll to the Alarm messages section.
- Click the Inline Menu for the message you want to change. The Alarm Messages dialog displays the alarm message and the available variables.
- Enter text and additional variables to modify the message.
- At any time, you can click Reset to Default to return the modified message to the predefined default settings.
- Click Save to update the message with your changes.
What do I need to know about alarm thresholds?
The alarm threshold settings determine when an alarm is generated. An alarm threshold consists of three elements:
- Alarm Severity: The severity of an alarm. Alarms can be critical, major, minor, warning, or informational.
- Threshold: Identifies how threshold violations are handled. A threshold is composed of a threshold type (static or dynamic), an operator, and a value.
- Threshold type: For static alarms, violations are determined based on an absolute value that is collected for a metric. Dynamic alarms are generated when the calculated average trend is a configured percentage equal to, above, or below the calculated baseline for a metric.
- Operator and Threshold Value: Identifies the acceptable state or level of performance. An alarm is generated when a sample, collected for a metric at a configured interval, violates the threshold value.
- Alarm Creation Timing: Indicates how long after a threshold violation occurs that an alarm is generated. Infrastructure Management can generate an alarm immediately after a threshold violation occurs or after a certain number of threshold violations occur within a configured time period (Time Over Threshold).
What are alarm thresholds tied to?
An alarm threshold is tied to a single metric. You can configure alarm thresholds for a device, a monitoring technology, or a group.
What is the difference between a static and a dynamic alarm?
There are two types of alarms: static and dynamic. A static alarm is generated when a metric reaches a configured threshold value. For example, when CPU Usage on a target device reaches 95%, the policy generates a critical alarm. When you are monitoring a device that has persistent issues, consider configuring a static alarm.
Dynamic alarms are generated based on the moving average of the baseline data that was collected over the previous 28 days. When you specify a threshold value for a dynamic alarm, an alarm is generated when the calculated average of the data reaches the configured percentage above or below the average trend. The calculated average trend can change over time as the collected baseline data changes. If you enter a dynamic threshold of >10% for CPU Usage, and the average trend of CPU Usage for the last 28 days is 85, an alarm is generated when the CPU Usage goes above 95%.
When you are monitoring a healthy, stable device whose resources are used in a consistent manner, configure a dynamic alarm.
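The difference between the two checks can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example above, not the product's actual computation: it assumes that the baseline is a plain average of daily values over the last 28 days and that the configured percentage is added to that baseline (85 + 10 = 95), as the worked example suggests. All function names and the sample history are hypothetical.

```python
from statistics import fmean

BASELINE_DAYS = 28  # baseline period stated in the text


def static_violation(value, threshold):
    """Static alarm: the sample is compared against a fixed value."""
    return value >= threshold


def dynamic_limit(daily_averages, percent_above):
    """Dynamic alarm limit derived from the most recent baseline data.

    Assumption: the configured percentage is added to the baseline,
    matching the worked example (baseline 85, >10% fires above 95).
    """
    baseline = fmean(daily_averages[-BASELINE_DAYS:])
    return baseline + percent_above


def dynamic_violation(value, daily_averages, percent_above):
    """Dynamic alarm: the sample is compared against the moving limit."""
    return value > dynamic_limit(daily_averages, percent_above)


# Hypothetical history: CPU Usage averaged 85 on each of the last 28 days.
history = [85.0] * 28

limit = dynamic_limit(history, 10)
print(limit)
print(static_violation(95, 95))
print(dynamic_violation(96, history, 10))
```

Because the limit is recomputed from the history, the same configured percentage yields a different alarm level as the baseline drifts, whereas the static threshold stays fixed until someone edits it.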
What is the difference between immediate and time over threshold alarming?
Infrastructure Management can generate an alarm immediately after a threshold violation occurs, or after a certain number of threshold violations occur within a configured time period (Time Over Threshold). Time Over Threshold is an event processing rule that reduces the number of alarms that are generated when threshold violation events occur. You can use Time Over Threshold to filter out data spikes and monitor problematic metrics over a set period. Instead of sending an alarm immediately after a threshold violation occurs, the Time Over Threshold function:
- Monitors the events that occur during a user-defined sliding time window.
- Tracks the length of time that the metric is at each alarm severity.
- Raises an alarm if the cumulative time the metric is in violation during the sliding window reaches the set Time Over Threshold.
For example, you could configure a static or dynamic alarm that is generated when the threshold has been continuously violated for 5 minutes in a 15-minute sliding time period. The following figure shows when the alarm is generated.
Time Over Threshold Alarm
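The sliding-window behavior described above can be approximated with a short Python sketch. It follows the cumulative-time description in the list (time in violation is summed across the window) and assumes that each violating sample counts for one full collection interval. The class name, parameters, and sample values are hypothetical and do not reflect the product's implementation.

```python
from collections import deque


class TimeOverThreshold:
    """Raise an alarm only when the cumulative violation time in the window is reached."""

    def __init__(self, threshold, required_seconds, window_seconds, interval_seconds):
        self.threshold = threshold
        self.required_seconds = required_seconds
        self.window_seconds = window_seconds
        self.interval_seconds = interval_seconds
        self.samples = deque()  # (timestamp, violated) pairs inside the window

    def add_sample(self, timestamp, value):
        """Record one sample and return True if an alarm should be raised."""
        self.samples.append((timestamp, value > self.threshold))
        # Drop samples that have slid out of the window.
        while self.samples[0][0] <= timestamp - self.window_seconds:
            self.samples.popleft()
        # Assumption: every violating sample represents one interval in violation.
        time_in_violation = sum(
            self.interval_seconds for _, violated in self.samples if violated
        )
        return time_in_violation >= self.required_seconds


# 5 minutes of violation within a 15-minute window, one sample per minute.
tot = TimeOverThreshold(
    threshold=90, required_seconds=300, window_seconds=900, interval_seconds=60
)

# Hypothetical CPU Usage samples, one per minute.
values = [95, 80, 95, 95, 80, 95, 95]
results = [tot.add_sample(minute * 60, value) for minute, value in enumerate(values)]
print(results)
```

In this run the short dips below the threshold do not reset the count, so the alarm is raised on the seventh sample, when the fifth violating minute falls inside the window. An immediate alarm would have fired on the first sample.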

Can I change the name of a monitoring profile after a corresponding alarm policy is generated?
Do not change the name of a monitoring profile after it is used to generate an alarm policy. Alarm policies are dependent on monitoring profiles. If you change the monitoring profile name or the corresponding alarm policy name, CA UIM stops generating alarms for the devices, groups, or technologies monitored by the alarm policy. Other than the lack of alarms, there is no indication or error message that a profile has been renamed.
Can I change the name of an alarm policy that was generated from a monitoring profile?
Do not change the name of an alarm policy generated from a monitoring profile. Alarm policies are dependent on monitoring profiles. If you change the monitoring profile name or the corresponding alarm policy name, CA UIM stops generating alarms for the devices, groups, or technologies monitored by the alarm policy. Other than the lack of alarms, there is no indication or error message that an alarm policy has been renamed.
Can I delete the monitoring profile after the alarm policy is generated?
Do not delete a monitoring profile associated with an alarm policy. Alarm policies are dependent on monitoring profiles. If you inadvertently delete a monitoring profile, CA UIM stops generating alarms for the devices, groups, or technologies monitored by the associated alarm policy. Other than the lack of alarms, there is no indication or error message that a profile has been deleted.
How do I search for an alarm policy?
Click Settings, and then select the Alarm Policies card. A filtering mechanism is available in the top left corner of the alarm policies list. Enter a technology, an alarm policy name, a metric name, or a creator to search for a specific alarm policy.
How many alarm thresholds can I configure for a metric?
For a single metric, you can configure as many thresholds as you need to monitor a target device.
My alarms are chatty or I'm seeing alarm flapping. What can I do?
Consider adjusting the alarm threshold settings. If you created a monitoring configuration profile using the predefined threshold settings, these settings might not be appropriate for your environment. If you are seeing alarm flapping (an alarm is generated, quickly closed, and generated again within a short time period), consider configuring the Time Over Threshold timing option for the alarm. When you configure the Time Over Threshold (TOT) option, an alarm is generated only when the TOT threshold is reached the configured number of times during the configured sliding window.
How can I reset an alarm message to default settings?
You can return a customized alarm message to the predefined alarm message at any time.
Follow these steps:
- Click Inline Action next to the desired alarm message.
- On the Alarm message dialog, click Reset to Default. The predefined message appears in the Alarm Messages panel. The next alarm that is generated displays the predefined alarm message.