Strategies for HTTP Load Balancing

This topic describes strategies to consider when configuring HTTP load balancing for your gateway.
Choosing a strategy often involves several sub-components: balancing, affinity, and failure detection.
Load Balancing Strategy for New Connections
A Load Balancing strategy is also known as the "Balancing Algorithm". These are the most common strategies. 
Round-Robin
This is the simplest strategy, and the easiest to visualize and implement. It rotates incoming requests across the Gateway nodes, regardless of load. This strategy is less desirable, especially for SSL, because it causes a large increase in the number of SSL negotiations. Depending on the Load Balancer implementation, it can also prevent HTTP keep-alive from working. Lastly, this strategy does not take server load into account.
Least Connections
This method selects the service with the fewest active connections, so that the load of active requests is balanced across the services and nodes. This is a common strategy that is simple to implement and understand, and it works well in most instances. It is a good alternative to round-robin, because the number of connections can be a reasonable proxy for load when the servers in question all perform the same function.
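The two strategies above can be sketched in a few lines. This is a minimal illustration only: the node names and the connection-count dictionary are hypothetical, and a real Load Balancer tracks this state internally.

```python
from itertools import cycle

# Hypothetical Gateway node names, for illustration only.
NODES = ["gw1", "gw2", "gw3"]


def round_robin(nodes):
    """Return a picker that rotates through the nodes in order, ignoring load."""
    rotation = cycle(nodes)
    return lambda: next(rotation)


def least_connections(active):
    """Pick the node with the fewest active connections.

    `active` maps node name -> current connection count. Ties go to the
    node that appears first in the mapping.
    """
    return min(active, key=active.get)


pick = round_robin(NODES)
rr_choices = [pick() for _ in range(4)]  # rotates, then wraps around to gw1

lc_choice = least_connections({"gw1": 12, "gw2": 3, "gw3": 7})  # gw2
```

Note that `round_robin` never consults the connection counts, which is why it cannot react to an overloaded node.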
Least Load
This method selects the node with the least load. However, it is uncommon for a Load Balancer to query the real system load, using an SNMP query or something similar, because doing so is more resource intensive than other, passive methods.
Modern Load Balancers use several mechanisms to select the optimal back-end pool member, including:
  • application response time
  • pre-configured and dynamic pool member ratios
  • number of active sessions (which is different from number of connections)
  • predictive methods that analyze performance over time and anticipate growing pool member load.
Application response times are commonly used, but they have a serious downside. HTTP 500 responses and other errors from broken application servers typically have fast response times, and the errors always close the keep-alive connection. This means that a server that is down can cause the load balancing pool to favor it unnaturally, because its connection count is low and its average response time is short.
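The downside described above is easy to reproduce in a small sketch. The measurements below are invented for illustration, and the health filter is only one possible mitigation (an assumption, not a prescribed configuration).

```python
def pick_by_response_time(avg_response_ms):
    """Select the pool member with the lowest average response time."""
    return min(avg_response_ms, key=avg_response_ms.get)


# Hypothetical measurements: gw3 is broken and returns HTTP 500 almost at once.
avg_response_ms = {"gw1": 120.0, "gw2": 95.0, "gw3": 4.0}
last_check_ok = {"gw1": True, "gw2": True, "gw3": False}

# Response time alone favors the broken member.
naive_choice = pick_by_response_time(avg_response_ms)

# Assumed mitigation: drop members that failed their last health check
# before comparing response times.
healthy = {n: ms for n, ms in avg_response_ms.items() if last_check_ok[n]}
guarded_choice = pick_by_response_time(healthy)
```

Here `naive_choice` is the broken node, while `guarded_choice` is the fastest node that is actually serving requests.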
Choosing an Affinity Strategy
Affinity is how the Load Balancer chooses a server for a connection from a client that has previously connected. Different affinity strategies can affect performance.
SSL Session Affinity
SSL Session Affinity is the most flexible affinity strategy and is preferred for SSL sessions. This strategy inspects the section of the initial SSL packet that encodes the SSL session identifier. If it is empty, the packet represents a net new connection.
Almost all public and private API traffic is protected by SSL.
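As a rough sketch of the decision described above: an empty or unknown session identifier goes to the balancing algorithm, and a known one goes back to the node that issued it. The class, its method names, and the round-robin fallback are all assumptions made for illustration; a real Load Balancer does this at the packet level.

```python
class SslSessionAffinity:
    """Toy model of SSL session affinity (all names are hypothetical)."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._next = 0
        self._table = {}  # SSL session id (bytes) -> node name

    def _balance(self):
        # Fallback balancing algorithm for new connections (round-robin here).
        node = self._nodes[self._next % len(self._nodes)]
        self._next += 1
        return node

    def route(self, session_id):
        """Choose a node from the session id offered in the initial packet."""
        if session_id and session_id in self._table:
            return self._table[session_id]  # resume on the same node
        return self._balance()  # empty or unknown id: net new connection

    def remember(self, session_id, node):
        """Record the session id that the chosen node issued."""
        self._table[session_id] = node


lb = SslSessionAffinity(["gw1", "gw2"])
first = lb.route(b"")            # new client, empty session id
lb.remember(b"\x01\x02", first)  # the node issued this session id
second = lb.route(b"")           # another new client
again = lb.route(b"\x01\x02")    # returning client resumes its session
```

The returning client lands on the node that already holds its SSL session, so the full negotiation can be skipped.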
IP Affinity
IP Affinity is the default setting in most standard Load Balancer configurations. This setting is sufficient if you have a use case that is strictly Business to Consumer.
This strategy is less effective with a small number of client systems. The "averages" do not produce good load balancing if the number of client systems is within an order of magnitude of the number of server systems being balanced.
IP Affinity is also ineffective when proxy servers are used. These proxies effectively consolidate many clients behind a single IP address.
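Both weaknesses can be seen in a small sketch. The hash-based mapping below is an assumed scheme chosen for brevity (a real Load Balancer may keep a lookup table instead), and the addresses come from the ranges reserved for documentation.

```python
import hashlib

NODES = ["gw1", "gw2", "gw3"]  # hypothetical Gateway node names


def node_for_ip(source_ip, nodes=NODES):
    """Map a client source IP to a node with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# 500 distinct users behind one proxy all arrive from the proxy's address,
# so every one of them is sent to the same node.
proxied_nodes = {node_for_ip("203.0.113.10") for _user in range(500)}

# With only four direct clients and three nodes, nothing guarantees an even
# split: a node may sit idle while another serves several clients.
direct_clients = ["198.51.100.1", "198.51.100.2", "198.51.100.3", "198.51.100.4"]
assignment = {ip: node_for_ip(ip) for ip in direct_clients}
```

`proxied_nodes` always contains exactly one node, however many users sit behind the proxy.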
HTTP Session Cookie Affinity
HTTP sessions are less applicable in the API world, as relatively few modern API implementations use cookie-based sessions. OAuth and JWT credentials are the emerging trend; they are not session based, and they make messages idempotent.
To use session cookie affinity, SSL must terminate on the Load Balancer (or the traffic must be non-SSL) so that the Load Balancer can inspect the HTTP headers for the session cookies.
Most APIs are SSL protected.
No Affinity
Not using affinity is not recommended for most heavy usage situations. Even for light-usage applications where balancing is not the main focus, affinity helps with High Availability.
Failure Detection
The standard method of detecting failures is to send a request to a service. The CA API Gateway has an HTTP service that is designed specifically to work with Load Balancer active detection modes. Using this service has performance implications, and it should not be used if the design involves multiple virtual endpoints (called VIPs in F5) connecting to the same pool of servers.
For more information, see "Load Balancer Health Check" in Configuring the Load Balancer.
Some systems use ICMP ping. This does not detect many failure cases and is not recommended.
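A minimal sketch of an active HTTP check follows. The URL in the usage comment is a placeholder, not the real health check path (see Configuring the Load Balancer for that), and the `opener` parameter is a hypothetical hook added so the function can be exercised without a network.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True only if the service answers the probe with HTTP 200.

    Unlike an ICMP ping, this fails when the host is up but the HTTP
    service is down, hung, or returning errors.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


# Usage (placeholder URL, shown as a comment so nothing runs on import):
# is_healthy("https://gateway.example.com/hypothetical-health-path")
```

A production check would typically also require several consecutive failures before marking a node down, to avoid reacting to a single slow response.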
Real World Guidance
CA Technologies provides the following guidance based on experience with real-world deployments. 
Choose the Affinity and Load Balancing strategy appropriate to the use case
Consider the following issues:
  • Some Load Balancers allow you to configure the load balancing algorithm and affinity separately. This capability is crucial. Some popular older devices do not offer it, which causes serious issues. For example, Cisco® CSM and CSS do not have separate configuration.
  • Keep in mind this distinction: 
    • "Load Balancing Algorithm (Strategy)" is about choosing the back-end server that receives a request from a client with which the Load Balancer has never connected before.
    • "Affinity" is about where to send a request from a client from which the Load Balancer has already seen traffic.
  • Affinity makes a significant difference for SSL. The ability to reuse SSL sessions increases performance dramatically. Tests performed on VMware-based Virtual Appliances show a throughput increase of more than 15x (from 1.7K TPS to 28K TPS).
  • Performance tests differ from production scaling. For instance, customers often want perfect load distribution, but have a limited number of clients. As a result, performance test staff may turn off affinity to achieve better load distribution. The unintended side effect is that this may increase the CPU load on the Gateways due to the overhead of SSL session negotiation.
Set both Load Balancer timeouts and Routing Assertion connection and read timeouts appropriate to system behavior at a business level
The CA API Gateway is factory set with the following defaults:
  • 30 second connection timeout
  • 3 retries
  • 60 second read timeout
These defaults may impact user experience in high-performance environments, as users may resend requests before the 60-second read timeout expires, which increases the server load.
Decide which is preferable for your business: responding quickly with an error to a client API request, or responding more slowly with fewer errors. This decision may also encourage you to check your entire signal chain for performance.
Many Load Balancers have timeouts for how long a connection is active (meaning how long it waits for the back-end to respond). If that time is shorter than the Gateway read timeout, then you encounter the CLOSE_WAIT issue. To prevent this from happening, configure the Load Balancer to send a TCP RST if the connection closes before the read completes. 
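The comparison above can be expressed as a simple configuration check. This is a sketch under stated assumptions: the field names are hypothetical, and only the 30-second and 60-second defaults come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Timeouts:
    """Timeout settings in seconds (field names are hypothetical)."""

    lb_backend_timeout_s: float               # how long the Load Balancer waits
    gateway_connect_timeout_s: float = 30.0   # Gateway factory default
    gateway_read_timeout_s: float = 60.0      # Gateway factory default


def timeout_warnings(t):
    """Return warnings for combinations that risk the CLOSE_WAIT issue."""
    warnings = []
    if t.lb_backend_timeout_s < t.gateway_read_timeout_s:
        warnings.append(
            "Load Balancer timeout (%.0fs) is shorter than the Gateway read "
            "timeout (%.0fs): the Load Balancer may close connections while "
            "the Gateway is still waiting."
            % (t.lb_backend_timeout_s, t.gateway_read_timeout_s)
        )
    return warnings


risky = timeout_warnings(Timeouts(lb_backend_timeout_s=30))
safe = timeout_warnings(Timeouts(lb_backend_timeout_s=90))
```

With the factory defaults, a 30-second Load Balancer timeout produces one warning, and a 90-second timeout produces none.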
SSL Termination
If you terminate SSL on the Load Balancer, some use cases may not function as expected. Sometimes you can rewrite policy to make this work and sometimes you cannot.
A common specific case: Mutual Auth SSL to mobile devices using the MAG MSSO SDK is more difficult to do with Load Balancer-based SSL termination because of the complicated certificate provisioning. There are some ways around this, but they have large implications in terms of policy authoring support.
Conclusion
CA Technologies recommends using SSL session affinity with an HTTP Load Balancer. It provides the most benefits with the fewest drawbacks.