Execution Server High Availability and Scalability

Execution Server High Availability ensures that the unavailability of a specific Execution Server does not lead to the unavailability of associated Agent Servers.
Overview
Agent Server configuration includes a list of Execution Servers, or supernodes. At start-up, the Agent Server attempts to connect to an Execution Server from this list. If that Execution Server is unavailable, the Agent Server attempts to connect to an alternative Execution Server from the list. This reconnection mechanism enables automatic failover of Agent Servers to an alternative available Execution Server.
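The failover behavior described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name, the `try_connect` callback, the host names, and the assumption that the list is tried in order are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def connect_with_failover(
    supernodes: Sequence[str],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first Execution Server that accepts a connection, or None.

    Assumption: the configured list is tried in order. The source only says
    that an alternative server "from the list" is attempted.
    """
    for address in supernodes:
        if try_connect(address):
            return address
    return None


# Hypothetical host names; the first server is treated as unavailable.
available = {"es2.example.com", "es3.example.com"}
chosen = connect_with_failover(
    ["es1.example.com", "es2.example.com", "es3.example.com"],
    lambda address: address in available,
)
print(chosen)  # es2.example.com
```

In this sketch, an Agent Server that finds no reachable Execution Server gets `None` back and would be expected to retry later.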
To prevent Execution Server overload, the Execution Server rejects incoming connection attempts when the number of connected Agent Servers exceeds a configurable hard limit value.
When the number of connected Agent Servers exceeds a configurable soft limit value, connection attempts by new Agents are still accepted, but the new Agents are directed to try to connect to an alternative Execution Server.
This reconnection retry mechanism enables automatic failback of Agent Servers to an available Execution Server that has recovered from a previous failure. It also ensures fair load balancing of Agent Servers across all available Execution Servers.
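The interaction of the hard and soft limits can be illustrated with a small decision function. This is a sketch under stated assumptions: the names `Decision` and `admit` are hypothetical, and the exact boundary handling (whether the limit itself counts as "exceeded") is assumed here, not taken from the product.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    ACCEPT_AND_REDIRECT = "accept, but ask the agent to seek another supernode"
    REJECT = "reject"


def admit(connected: int, capacity: int, warn_capacity: int) -> Decision:
    """Decide how to treat one new incoming connection.

    Assumption: the new connection is judged against the count it would
    produce, so a server already at `capacity` rejects it.
    """
    would_be = connected + 1
    if would_be > capacity:
        return Decision.REJECT
    if would_be > warn_capacity:
        return Decision.ACCEPT_AND_REDIRECT
    return Decision.ACCEPT


print(admit(500, capacity=1000, warn_capacity=667).name)   # ACCEPT
print(admit(700, capacity=1000, warn_capacity=667).name)   # ACCEPT_AND_REDIRECT
print(admit(1000, capacity=1000, warn_capacity=667).name)  # REJECT
```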
You can set the following values in the Agent Server nimi_config.xml configuration file:
  •  config/nimi/keepalive/client/supernodes
    List of Execution Servers in the Agent Server configuration
  •  config/nimi/keepalive/server/
    Configurable hard limit value of connected Agent Servers
  •  config/nimi/keepalive/server/warn-capacity
    Configurable soft limit value of connected Agent Servers
Formulaically
The number of Execution Servers that are connected to a group of Agents is N.
The maximum number of Agents that can be connected to a single Execution Server is M.
If one Execution Server fails, the remaining N-1 Execution Servers must absorb all of the Agents. So, the maximum number of Agents that N Execution Servers can support while tolerating the failure of a single Execution Server is (N-1)*M.
In other words, when the hard limit capacity is set to M, the soft limit warning capacity should be set to (N-1)/N of M.
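The sizing rule can be expressed as two small helper functions. This is a sketch only: the function names are hypothetical, and rounding the soft limit up to a whole number is an assumption.

```python
def max_supported_agents(n_servers: int, hard_limit: int) -> int:
    """Total Agents that N servers can carry while surviving one failure: (N-1)*M."""
    if n_servers < 2:
        raise ValueError("High Availability needs at least 2 Execution Servers")
    return (n_servers - 1) * hard_limit


def soft_limit(n_servers: int, hard_limit: int) -> int:
    """Soft limit warning capacity: (N-1)/N of M, rounded up (rounding is an assumption)."""
    total = max_supported_agents(n_servers, hard_limit)
    return -(-total // n_servers)  # integer ceiling division


# Hypothetical group of 4 Execution Servers with a hard limit of 1,000 each.
print(max_supported_agents(4, 1000))  # 3000
print(soft_limit(4, 1000))            # 750
```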
Example
This example enables High Availability for a group of 3 Execution Servers where each Execution Server can support 1,000 connected Agents. In this case, set the soft limit warning capacity to two thirds of the 1,000 Agents, or approximately 667. In other words, the maximum total number of Agents that are connected to these 3 Execution Servers should be 2,000, or approximately 667 Agents connected to each Execution Server.
When one of these Execution Servers becomes unavailable, its approximately 667 Agents fail over and connect to the remaining 2 Execution Servers. This scenario leaves 1,000 Agents connected to each of the remaining 2 Execution Servers.
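The arithmetic of this scenario can be checked with a short simulation. This is a sketch, assuming that displaced Agents always end up on the least-loaded surviving server; the real reconnection order is not specified here, and the server names and the `fail_over` function are hypothetical.

```python
from typing import Dict


def fail_over(loads: Dict[str, int], failed: str, capacity: int) -> Dict[str, int]:
    """Move every Agent of the failed server to the least-loaded survivor."""
    survivors = {name: count for name, count in loads.items() if name != failed}
    for _ in range(loads[failed]):
        target = min(survivors, key=survivors.get)
        if survivors[target] >= capacity:
            raise RuntimeError("hard limit reached on every surviving server")
        survivors[target] += 1
    return survivors


# 2,000 Agents spread over 3 Execution Servers (hypothetical names).
before = {"es1": 667, "es2": 667, "es3": 666}
after = fail_over(before, failed="es1", capacity=1000)
print(after)  # {'es2': 1000, 'es3': 1000}
```

With 2,001 or more Agents in total, the same call would raise, which is why the total is capped at (N-1)*M.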
nimi_config.xml - High Availability Setup of 3 Execution Servers
<config>
  <nimi>
    <keepalive>
      <server>
        <capacity>1000</capacity>
        <!-- how many nodes to accept -->
        <warn-capacity>667</warn-capacity>
        <!-- over this limit, new connecting nodes will be asked to seek another supernode -->
      </server>
    </keepalive>
  </nimi>
</config>
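Before deploying such a file, the two limits can be read back and sanity-checked. The following sketch uses Python's standard `xml.etree.ElementTree` module on an inline copy of the sample above. The helper name `read_limits` and the check that the soft limit does not exceed the hard limit are illustrative assumptions, not part of the product.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

SAMPLE = """
<config>
  <nimi>
    <keepalive>
      <server>
        <capacity>1000</capacity>
        <warn-capacity>667</warn-capacity>
      </server>
    </keepalive>
  </nimi>
</config>
"""


def read_limits(xml_text: str) -> Tuple[int, int]:
    """Return (hard limit, soft limit) from a nimi_config.xml-style document."""
    root = ET.fromstring(xml_text)  # root is the <config> element
    capacity = int(root.findtext("nimi/keepalive/server/capacity"))
    warn = int(root.findtext("nimi/keepalive/server/warn-capacity"))
    if not 0 < warn <= capacity:
        raise ValueError("warn-capacity should be positive and not exceed capacity")
    return capacity, warn


print(read_limits(SAMPLE))  # (1000, 667)
```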