Use Enterprise Manager Failover

Enterprise Manager failover works by designating Enterprise Managers as follows:
  • One Enterprise Manager as a primary computer and the other as a secondary or backup.
  • Both Enterprise Managers as primary computers.
The responsibility of the secondary is to take control when the primary fails and to relinquish control when the primary recovers. The responsibility of the primary is to take control from the secondary.
You can configure Enterprise Manager failover when a pair of Enterprise Managers shares a single Introscope installation. Typically, the two Enterprise Managers are on different computers. The Introscope installation must reside on a network file system because both Enterprise Managers need read access to the Introscope installation. However, only one Enterprise Manager at a time needs write access. For example, the Introscope installation could be shared using a Network Attached Storage (NAS) protocol such as one of the following:
  • Network File System (NFS)
    Use NFS version 4 (NFSv4) when configuring the Enterprise Manager failover. Older NFS versions can cause severe database issues.
  • Server Message Block (SMB)
To provide cluster stability, CA Technologies recommends that you configure failover for your MOM Enterprise Manager. In addition, you can configure failover for one or more Collectors and CDVs within a cluster or for a standalone Enterprise Manager. If you are configuring failover for the MOM, Collectors, and CDVs in a cluster, you configure each Enterprise Manager individually.
  • Enterprise Manager failover for Enterprise Team Center Master is scheduled for testing. CA cannot support EM failover for ETC Master until testing is complete.
  • If you create a MOM Enterprise Manager Hot Failover in a Linux environment, you must restart the server on which the primary Enterprise Manager and NFS share are deployed.
  • (Linux only) SMB shared directories are not supported for sharing metadata on Linux. Use a Network File System (NFS) instead.
To configure and use Enterprise Manager failover, perform these steps:
Enable Enterprise Manager Failover
By default, Enterprise Manager failover is not enabled.
Follow these steps:
  1. Navigate to the <EM_Home>\config directory and open the IntroscopeEnterpriseManager.properties file.
  2. Locate the Hot Failover section.
  3. Set the failover properties.
    When agents, TIMs, and Workstations try to connect to an Enterprise Manager, they try all the IP addresses for a host name. If you have defined a logical host in DNS with the IP addresses of the primary and secondary Enterprise Managers, then the agents, TIMs, and Workstations can use this for the Enterprise Manager host name and connect to whichever Enterprise Manager is running.
  4. Be sure that the secondary Enterprise Manager computer users have write permission to the primary Enterprise Manager SmartStor data directory.
    In a failover situation, this permission allows the secondary Enterprise Manager to write data to the primary Enterprise Manager SmartStor database.
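After editing the file, you may want to confirm which failover values are actually set. The following is a minimal sketch, not part of Introscope: get_prop is a hypothetical helper, and it assumes plain key=value lines with no line continuations.

```shell
#!/bin/sh
# get_prop FILE KEY
# Prints the value of KEY from a Java-style properties FILE.
# Hypothetical helper: handles plain key=value lines only
# (no line continuations, no ':' separators).
get_prop() {
    # Escape dots in the key so they match literally, not as "any character".
    key_re=$(printf '%s' "$2" | sed 's/\./\\./g')
    # Commented-out lines do not match because the key must start the line.
    # Keep the last assignment, since later entries override earlier ones.
    sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}
```

For example, get_prop "<EM_Home>/config/IntroscopeEnterpriseManager.properties" introscope.enterprisemanager.failover.enable prints the current value, or nothing if the property is not set.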
Note: In a failover environment, you can use the EMCtrl script to start a primary Enterprise Manager. The secondary Enterprise Manager should not use the control script. For example, on a UNIX platform, use the control script EMCtrl.sh to start the primary. Use the ./Introscope_Enterprise_Manager command to start the secondary.
Enterprise Manager Failover Rules
These rules apply to Enterprise Manager failover:
  • When you start Enterprise Managers that are configured for failover, start them one at a time. In a primary and secondary configuration, the primary must be started first.
  • If both Enterprise Managers are primary, then the second primary waits for the first primary to fail and does not relinquish control when the first primary recovers.
  • For failover, the primary, secondary, and installation directories are typically on different hosts. However, the Introscope installation (where Introscope is originally installed) and Enterprise Managers can reside on the same host. For more information, see Configuring Enterprise Manager failover to work on a single host.
  • If the host of an Enterprise Manager matches one of the primary hosts, then the Enterprise Manager assumes the role of primary.
  • If the host of an Enterprise Manager matches one of the secondary hosts, then the Enterprise Manager assumes the role of secondary.
  • If the host of an Enterprise Manager does not match the primary or secondary host, then the Enterprise Manager does a normal startup. However, a warning message is logged.
  • A secondary Enterprise Manager checks every two minutes (configurable) to see if a primary Enterprise Manager is waiting to start. If so, the secondary yields to the primary and shuts down.
    The property controls how often a running secondary Enterprise Manager checks to see if a primary Enterprise Manager is waiting to start.
    After yielding to the primary Enterprise Manager, the secondary Enterprise Manager shuts down. You must restart it manually by using one of the methods described in the next section.
  • A primary Enterprise Manager does not yield to a secondary Enterprise Manager or another primary Enterprise Manager.
  • The startup and shutdown sequence is as follows: When starting the failover pair, start the primary Enterprise Manager first, then the secondary. When shutting down the failover pair, stop the secondary first, then the primary.
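The host-matching rules above can be modeled in a few lines of shell. This is only an illustration of the documented behavior, not the Enterprise Manager's implementation. The failover_role function is hypothetical, the host lists are assumed to be comma-separated, and checking the primary list before the secondary list is also an assumption.

```shell
#!/bin/sh
# failover_role HOST PRIMARY_LIST SECONDARY_LIST
# Models the documented host-matching rules (lists assumed comma-separated).
# Prints "primary", "secondary", or "standalone" (a normal startup,
# for which the Enterprise Manager logs a warning).
failover_role() {
    host=$1
    # Wrap each list in commas so that only whole entries match.
    case ",$2," in
        *",$host,"*) echo primary; return 0 ;;
    esac
    case ",$3," in
        *",$host,"*) echo secondary; return 0 ;;
    esac
    echo standalone
}
```

For example, with the placeholder host names em1, em2, and em3, failover_role em3 "em1,em2" "em3" reports the secondary role.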
Restart the Secondary Enterprise Manager after Yielding to the Primary
After the secondary Enterprise Manager has yielded to the primary, it exits the JVM.
To restart the secondary Enterprise Manager, use one of the following methods:
  • Use a shell script.
  • Run the secondary Enterprise Manager as a Windows service.
    Note: The secondary exits with a status code of 23 (org.eclipse.equinox.app.IApplication.EXIT_RESTART) if it shuts down because it yields to a primary. The secondary exits with a 0 status code if it shuts down as usual because of a command-line interrupt or by the Workstation.
  • (Windows-only) Use the following command:
    :RESTART
    "Introscope Enterprise Manager"
    IF ERRORLEVEL 23 GOTO RESTART
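On UNIX, the shell script method could follow the same pattern: relaunch the secondary whenever it exits with status 23. The following is a minimal sketch under stated assumptions. The restart_on_yield function and the RESTART_DELAY variable are hypothetical names, and the pause before relaunching is an assumption rather than a product requirement.

```shell
#!/bin/sh
# restart_on_yield COMMAND [ARGS...]
# Runs COMMAND and starts it again each time it exits with status 23,
# the code the secondary returns after yielding to a primary.
# Any other exit status ends the loop and is returned to the caller.
restart_on_yield() {
    while :; do
        status=0
        "$@" || status=$?
        [ "$status" -eq 23 ] || return "$status"
        # Assumption: pause briefly before relaunching (default 5 seconds).
        sleep "${RESTART_DELAY:-5}"
    done
}
```

For example, from the directory that contains the launcher, restart_on_yield ./Introscope_Enterprise_Manager keeps the secondary available after each yield and stops looping when the Enterprise Manager shuts down with any other status, such as 0.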
Configure Enterprise Manager Failover to Work on a Single Host
You can configure Enterprise Manager failover to work on a single host rather than a shared file system. You can set up the Enterprise Manager failover on a single host to permit a second Enterprise Manager to take over when the first fails. Set up failover on a single host when you do not want to perform the following activities:
  • Run an Enterprise Manager as a Windows service
  • Restart an Enterprise Manager using a shell script.
Configuring both failover Enterprise Managers on the same host creates a single point of failure that could result in outages.
Follow these steps:
  1. On a single host, navigate to the <EM_Home>/config directory.
  2. Open the IntroscopeEnterpriseManager.properties file in a text editor.
  3. Modify the following Enterprise Manager properties as shown:
    introscope.enterprisemanager.failover.enable=true
    introscope.enterprisemanager.failover.primary=localhost
  4. To start the first primary Enterprise Manager, click Introscope Enterprise Manager.exe.
  5. To start the second primary Enterprise Manager, click Introscope Enterprise Manager.exe.
    When the second primary Enterprise Manager starts properly, the command prompt (the console) displays the message: acquiring primary lock. If this message does not appear, then either failover is incorrectly configured or the primary Enterprise Manager is not running.
Note: For information about testing failover on a single host, see TEC1845647.
MOM Failover and CA CEM TIMs
If you are deploying CA CEM in your CA APM environment, you configure the MOM to connect with each TIM. When you enable a TIM, the MOM provides the TIM with two IP addresses: the MOM IP address and the IP address of the collector running the TIM Collection Service. This collector connects with the TIM to collect the TIM data.
If you have configured the MOM failover and a TIM is running and enabled, the primary MOM provides the TIM with this information:
  • Primary MOM IP address
  • The secondary MOM IP addresses that you defined in the IntroscopeEnterpriseManager.properties file.
The TIM stores these IP addresses, which you can view in the System Setup, TIM, TIM Settings page.
For the failover MOM to connect to the TIM, the following conditions must be met:
  • The TIM is running and enabled
  • The properties in the IntroscopeEnterpriseManager.properties file are configured as follows:
    • The introscope.enterprisemanager.failover.enable property is set to true.
    • One or more primary MOM IP addresses are defined in the introscope.enterprisemanager.failover.primary property.
      Note: If you define more than one primary MOM IP address, it is unnecessary to define a secondary IP address.
    • If you define only one primary MOM IP address, define one or more secondary MOM IP addresses in the introscope.enterprisemanager.failover.secondary property.
      Note: You can define more than one secondary MOM.
Note:
You can define more than one secondary MOM.
  • If the address of the failover MOM changes, then disable and enable all the TIMs. Enabling the TIMs allows the TIMs to obtain the IP address of the new failover MOM.
    You can see the failover MOM IP addresses in the CEM console. See the TessIpAddr field in the TIM Settings page.
If these conditions are met, when the primary MOM fails, the failover MOM automatically connects to the TIM. You can configure failover to switch to another primary MOM or to a secondary MOM.
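The property-related conditions can be checked mechanically before you rely on MOM failover. The following is a minimal sketch under stated assumptions: count_entries and mom_failover_configured are hypothetical helpers, the address lists are assumed to be comma-separated, and the check does not cover whether the TIM is running and enabled.

```shell
#!/bin/sh
# count_entries LIST
# Prints the number of non-empty entries in a comma-separated LIST.
count_entries() {
    # grep -c exits non-zero when it counts 0 lines; "|| true" hides that.
    printf '%s' "$1" | tr ',' '\n' | grep -c '[^[:space:]]' || true
}

# mom_failover_configured ENABLE PRIMARY_LIST SECONDARY_LIST
# Returns 0 when the documented property conditions hold: failover is
# enabled, at least one primary is defined, and there is either a second
# primary or at least one secondary.
mom_failover_configured() {
    [ "$1" = "true" ] || return 1
    primaries=$(count_entries "$2")
    secondaries=$(count_entries "$3")
    [ "$primaries" -ge 1 ] || return 1
    [ "$primaries" -ge 2 ] || [ "$secondaries" -ge 1 ]
}
```

Pass the values of the failover.enable, failover.primary, and failover.secondary properties as the three arguments. A non-zero return means that at least one of the property conditions is not satisfied.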
If these conditions are not met, the following actions occur:
If... | When a MOM fails...
------|--------------------
TIM is not running and enabled | No connection to the TIM before or after MOM failure. The TIM must be running and enabled to be recognized by the MOM and receive the MOM connection.
introscope.enterprisemanager.failover.enable property is set to false | The collector continues to pull data from the TIM, but no alternative primary or secondary MOM tries to connect to the TIM. The alternative primary or secondary MOM connects to the collector.
No additional primary or secondary MOM IP addresses defined | The collector continues to pull data from the TIM, but no alternative primary or secondary MOM tries to connect to the TIM. Any MOM configuration changes are not sent to the TIM.