About SpectroSERVER Fault Tolerance
Fault tolerance requires more than one SpectroSERVER to manage a given landscape. A copy of the database for that landscape is loaded on each SpectroSERVER. However, only a single copy is active at any time. The SpectroSERVER with the active database is known as the primary. The inactive database runs on a standby SpectroSERVER, which is the secondary SpectroSERVER. You can also install another inactive copy of the database on a tertiary SpectroSERVER.
If the primary SpectroSERVER fails, the database on the secondary SpectroSERVER becomes active, and the secondary SpectroSERVER starts managing the network. Applications that are connected to the primary SpectroSERVER are automatically switched to the secondary SpectroSERVER. When the primary SpectroSERVER returns to service, the applications automatically switch back to the primary SpectroSERVER, and the secondary SpectroSERVER becomes inactive again.
Not all applications can exercise the full range of their capabilities when they are run from a secondary SpectroSERVER. The main reason to set up a fault-tolerant environment is to ensure continuous monitoring of the network, not to create a full copy of DX NetOps Spectrum.
SpectroSERVER Precedence in a Fault Tolerant Environment
Primary, secondary, and tertiary SpectroSERVERs that manage the same landscape must all have the same landscape handle and the same modeling catalog. The servers are distinguished from one another by a numeric precedence value. The lowest number indicates the primary SpectroSERVER. SpectroSERVERs are installed with a default precedence value of 10. To designate a SpectroSERVER as a secondary server, assign it a higher precedence number, such as 20. Likewise, a tertiary SpectroSERVER has a higher precedence number than the secondary, for example, 30.
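The precedence rule can be illustrated with a minimal Python sketch. It is not DX NetOps Spectrum code: the `Server` class, the `active_server` function, and the host labels are hypothetical names used only to show that the server in service with the lowest precedence number manages the landscape. The sketch also assumes that a tertiary server takes over in the same way if both other servers are out of service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Server:
    name: str          # hypothetical label for the host
    precedence: int    # lower number means higher priority
    in_service: bool   # whether the server is currently running


def active_server(servers: List[Server]) -> Optional[Server]:
    """Return the in-service server with the lowest precedence number."""
    candidates = [s for s in servers if s.in_service]
    return min(candidates, key=lambda s: s.precedence, default=None)


# All three servers share one landscape; only precedence differs.
landscape = [
    Server("primary", 10, in_service=True),
    Server("secondary", 20, in_service=True),
    Server("tertiary", 30, in_service=True),
]
print(active_server(landscape).name)

# Simulate a primary failure: the secondary becomes active.
after_failure = [Server("primary", 10, in_service=False)] + landscape[1:]
print(active_server(after_failure).name)
```

When the primary entry is marked in service again, the same function selects it once more, which mirrors the automatic switch back described earlier.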
When you first set up a fault tolerant environment, you can assign precedence values when you load database copies onto the standby SpectroSERVERs with the SSdbload utility.
To change precedence values later, you can use the Loaded Landscapes subview. Access this subview by selecting a local landscape in the Navigation panel, and then selecting the Information tab in the Component Detail panel.
The Loaded Landscapes subview is different from the SpectroSERVER Control subview. Access the SpectroSERVER Control subview by selecting the VNM in the Navigation panel and then selecting the Information tab in the Component Detail panel.
A single database is active at any given time in a fault tolerant DX NetOps Spectrum environment. Therefore, the other databases must be updated periodically to reflect new models and changes to attribute values in the active database. This synchronization of data is accomplished through the DX NetOps Spectrum Online Backup feature. You can run Online Backup on demand or at regularly scheduled intervals. When you run Online Backup against the primary SpectroSERVER, it creates a backup copy of the current database. Online Backup automatically loads the copy onto each designated secondary SpectroSERVER.
As in any DSS environment, each of the SpectroSERVERs in a fault tolerant environment must have the same modeling catalog installed. Online Backup copies the current modeling catalog. However, it does not copy all the .i files or other elements that are associated with individual management modules. Therefore, if you install any new management modules on your primary SpectroSERVER, also install the same new management modules on any secondary SpectroSERVER.
The EventDisp and Alertmap files that are defined in the <$SPECROOT>/custom/Events directory are propagated to fault-tolerant servers when the secondary SpectroSERVER polls the primary SpectroSERVER for status information.
Support for Fault-Tolerant Archive Manager
You can run the Archive Manager on the secondary SpectroSERVER host in a fault-tolerant SpectroSERVER environment. This secondary Archive Manager provides visibility to events in OneClick when the primary Archive Manager is down.
The primary or secondary SpectroSERVER stores events locally in the following two scenarios:
- When the primary Archive Manager is down and the primary SpectroSERVER is running. In this case, the primary SpectroSERVER stores events locally as they are created until the primary Archive Manager is up.
- When the primary SpectroSERVER host itself is down. In this case, the secondary SpectroSERVER stores events locally as they are created until the primary Archive Manager is up.
You can start the secondary Archive Manager on the secondary SpectroSERVER host to provide visibility not only to events that are created while the primary Archive Manager is down, but also to historical events.
When you start the secondary Archive Manager, it acts as a client to the primary SpectroSERVER to receive and log events as they are created. This behavior does not affect the normal connection between the primary SpectroSERVER and the primary Archive Manager. When the primary Archive Manager goes down, OneClick fails over to the secondary Archive Manager to provide event data.
When the primary SpectroSERVER host itself goes down, the secondary SpectroSERVER stores events locally and also forwards them to the secondary Archive Manager. When the primary Archive Manager comes up, the secondary SpectroSERVER transfers all the locally stored events to it.
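The storage scenarios above can be summarized as a small decision function. This is an illustrative Python sketch only: `event_destinations` and its string labels are hypothetical, and the sketch assumes that the primary Archive Manager is unavailable whenever the primary SpectroSERVER host is down.

```python
from typing import List


def event_destinations(primary_ss_up: bool,
                       primary_am_up: bool,
                       secondary_am_up: bool) -> List[str]:
    """Return where a newly created event goes in each scenario."""
    destinations = []
    if primary_ss_up:
        if primary_am_up:
            # Normal operation: events are logged by the primary Archive Manager.
            destinations.append("primary Archive Manager")
        else:
            # Held locally until the primary Archive Manager is up again.
            destinations.append("primary SpectroSERVER local store")
    else:
        # Primary host is down (primary_am_up is ignored by assumption):
        # the secondary SpectroSERVER holds events for later transfer.
        destinations.append("secondary SpectroSERVER local store")
    if secondary_am_up:
        # A running secondary Archive Manager receives a copy of each event.
        destinations.append("secondary Archive Manager")
    return destinations


print(event_destinations(primary_ss_up=True, primary_am_up=False, secondary_am_up=True))
print(event_destinations(primary_ss_up=False, primary_am_up=False, secondary_am_up=True))
```

In both outage cases the locally stored events are later transferred to the primary Archive Manager, while the secondary Archive Manager keeps only the copies it received while it was running.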
Archive Manager Data Synchronization
The secondary Archive Manager provides a best-effort copy of events. No event synchronization occurs between the primary Archive Manager and the secondary Archive Manager. When the secondary Archive Manager is running and connected to a SpectroSERVER, it receives a copy of all events as they are generated. Whenever the secondary Archive Manager is down, events are not stored for it. This behavior is distinctly different from that of the primary Archive Manager, where the SpectroSERVER stores the events for later transfer to the primary Archive Manager.
As a result, when the secondary Archive Manager is started for the first time, its DDM database does not contain any events, and no attempt is made to synchronize with the primary. Once the secondary Archive Manager has been running for the number of days set by MAX_EVENT_DAYS in the .configrc file, it is generally in sync with the primary Archive Manager database.
Generate an Alarm If the Secondary SpectroSERVER Is Not Restarted
When a primary SpectroSERVER synchronizes its database with the secondary SpectroSERVER, a Contact Lost to Secondary Server (0x00010c0e) event and alarm are generated. They are generated because the secondary SpectroSERVER has been brought down to load the new database from the primary SpectroSERVER.
You can set up a rule to process this alarm so that the alarm is generated only if the secondary SpectroSERVER is not restarted.
The EventPair rule lets you specify that a new event is generated if the Contact Lost to Secondary Server event occurs and a Contact Established to Secondary Server (0x00010c0f) event does not follow within a specified time period. You can then specify that this new event creates an event and an alarm indicating that the secondary SpectroSERVER is still down.
Follow these steps:
- Open the EventDisp file with a text editor. The EventDisp file is located in the <$SPECROOT>/SS/CsVendor/Cabletron directory.
- Find the line that reads 0x00010c0e E 50 A 2, 0x00010c0e and change this line to the following: 0x00010c0e R Aprisma.EventPair, 0x00010c0f, <numberofsecondstowait> <generatedeventcode>
  - <generatedeventcode> is the event code to generate if the secondary SpectroSERVER does not come up within the time specified in <numberofsecondstowait>.
- Add the following line to the EventDisp file: <generatedeventcode> E 50 A 2, <generatedalarmcode>
  - <generatedeventcode> is the event code generated in Step 2 if the secondary SpectroSERVER did not come up. E 50 indicates that the event is logged and has a severity value of 50. A 2 indicates that a major alarm is created. <generatedalarmcode> is the alarm code to generate based on this event.
- Create a Probable Cause file for this alarm that indicates that contact with the secondary SpectroSERVER has not been reestablished after data synchronization.
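The timing behavior that the EventPair rule is described as providing can be sketched in Python. This is a simplified model, not the rule's implementation: the `event_pair` function, the event tuples, and the generated event code `0xFFFF0001` are hypothetical and chosen only for illustration. Only the two contact event codes come from the text above.

```python
from typing import List, Tuple

CONTACT_LOST = 0x00010C0E         # Contact Lost to Secondary Server
CONTACT_ESTABLISHED = 0x00010C0F  # Contact Established to Secondary Server

Event = Tuple[int, int]  # (timestamp in seconds, event code)


def event_pair(events: List[Event], wait_seconds: int,
               generated_code: int) -> List[Event]:
    """Return one generated event per CONTACT_LOST that is not followed
    by CONTACT_ESTABLISHED within wait_seconds.

    `events` must be sorted by timestamp.
    """
    generated = []
    for i, (lost_time, code) in enumerate(events):
        if code != CONTACT_LOST:
            continue
        deadline = lost_time + wait_seconds
        followed = any(
            later_code == CONTACT_ESTABLISHED and later_time <= deadline
            for later_time, later_code in events[i + 1:]
        )
        if not followed:
            # The new event fires once the waiting period has expired.
            generated.append((deadline, generated_code))
    return generated


STILL_DOWN = 0xFFFF0001  # hypothetical code for "secondary still down"

# Secondary restarts after 120 seconds: nothing is generated.
print(event_pair([(0, CONTACT_LOST), (120, CONTACT_ESTABLISHED)], 300, STILL_DOWN))

# Secondary never restarts: the new event is generated at the deadline.
print(event_pair([(0, CONTACT_LOST)], 300, STILL_DOWN))
```

In this model, the routine restart that follows a database synchronization stays quiet, and only a secondary that remains down past the waiting period produces the event that raises the alarm.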
Secondary SpectroSERVER Readiness Levels
A secondary SpectroSERVER is considered to be at one of three levels of readiness. Readiness depends on server configuration and status. The readiness levels are defined as follows:
- Hot: The secondary SpectroSERVER is running and is available to take over immediately upon failure of the primary SpectroSERVER because it is already polling. To configure a secondary SpectroSERVER for this level of readiness, add the following line to the .vnmrc file: secondary_polling=yes. This statement causes the standby to commence polling and processing traps whenever it starts, regardless of its connection status with the primary SpectroSERVER.
- Warm: The secondary SpectroSERVER is running, but the server can take a short time to become fully available. The secondary SpectroSERVER has not been configured to start polling until it loses contact with the primary SpectroSERVER. For example, it has no secondary_polling entry in the .vnmrc file, or the entry is set to no. If the secondary_polling entry is not in the .vnmrc file or the entry is set to no, the secondary SpectroSERVER does not process traps while in standby mode.
- Cold: The secondary SpectroSERVER is not running and must be started when there is a failure of the primary SpectroSERVER. In this case, it is irrelevant whether the secondary SpectroSERVER is configured for secondary polling.
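The three levels can be expressed as a short classification function. The following Python sketch is illustrative only: `readiness_level` is a hypothetical helper, and it assumes that the .vnmrc file consists of simple key=value lines, as the secondary_polling=yes example suggests.

```python
def readiness_level(running: bool, vnmrc_text: str) -> str:
    """Classify a secondary SpectroSERVER as 'hot', 'warm', or 'cold'."""
    if not running:
        # The polling setting is irrelevant when the server is stopped.
        return "cold"
    polling = "no"  # a missing entry behaves like secondary_polling=no
    for line in vnmrc_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "secondary_polling":
            polling = value.strip().lower()
    return "hot" if polling == "yes" else "warm"


print(readiness_level(True, "secondary_polling=yes"))   # hot
print(readiness_level(True, ""))                        # warm
print(readiness_level(False, "secondary_polling=yes"))  # cold
```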