Apply Rate Limit Assertion

The Apply Rate Limit assertion allows you to limit the rate of transactions passing through the gateway for a given user, client IP address, or other identifier. When this limit is reached, the Gateway can either begin throttling requests or it can attempt to delay the requests until the rate falls below the limit. You can also set a maximum concurrency level to prevent a user from monopolizing Gateway resources.
Use this assertion only if you need to limit the flow of transactions entering the Gateway. If you have a cluster of Gateways, the limits entered in this assertion are divided among the "up" nodes in the cluster. A node is considered "up" if it has posted its status within the past 8 seconds (configurable via the ratelimit.clusterStatusInterval cluster property). The Apply Rate Limit Assertion checks the status of cluster nodes every 43 seconds (configurable via the ratelimit.clusterPollInterval cluster property).
The Gateway automatically adjusts the rates internally should nodes be added or removed from a cluster. There is no need to modify the values in this assertion. If no authenticated user is established in the policy, then the IP address of the requestor is used instead in the Apply Rate Limit Assertion.
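The division of a cluster-wide limit among "up" nodes can be sketched as follows. This is only an illustration of the arithmetic described above, not the Gateway's internal implementation; the function names, the timestamp dictionary, and the inclusive 8-second comparison are assumptions made for the example.

```python
def up_nodes(last_status_times, now, status_interval=8.0):
    """Return the nodes that posted a status within the last status_interval seconds.

    last_status_times maps a (hypothetical) node name to the time, in seconds,
    at which that node last posted its status.
    """
    return [node for node, posted in last_status_times.items()
            if now - posted <= status_interval]


def per_node_limit(cluster_limit, last_status_times, now, status_interval=8.0):
    """Split a cluster-wide limit evenly among the nodes currently considered "up"."""
    # Assumption: at least the local node always counts, so never divide by zero.
    count = max(1, len(up_nodes(last_status_times, now, status_interval)))
    return cluster_limit / count
```

With four nodes reporting on time, a limit of 100 yields 25 per node; if one node stops posting its status, the same call yields roughly 33 per node without any change to the configured value.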
Using the Assertion
  1. Do one of the following:
    • To add the assertion to the Policy Development window, see Add an Assertion.
    • To change the configuration of an existing assertion, proceed to step 2 below.
  2. Right-click Apply Rate Limit... in the policy window and choose Rate Limit Properties, or double-click the assertion in the policy window. The assertion properties are displayed.
  3. Configure the properties as follows:
    Setting
    Description
    Maximum requests per second
    Specify how many requests per second should be processed by the Gateway or cluster. You can enter a context variable that resolves to the maximum requests value.
    The context variable must either be single-value or multivalued with a specific index reference.
    Cluster wide
    If the Gateway cluster comprises more than one node, this setting determines whether the value entered in the Maximum requests per second field is split among the nodes or applied to each node.
    • Select this check box to split the value across all the nodes in the cluster. For example, if the maximum is 100, each node in a 4-node cluster will be limited to 25 requests per second. If a node drops out of the cluster, the 100 limit is redistributed across the remaining three nodes.
    • Clear this check box to allow the maximum requests value on each node. For example, if the maximum is 100, each node in a 4-node cluster will be allowed 100 requests per second, resulting in an effective maximum of 400 requests per second. If one node drops out of the cluster, the effective maximum drops to 300 requests per second (3 x 100).
    Spread limit over X sec window
    Determines whether to allow a burst of requests to be spread across a window of time or whether to enforce a hard cap.
    • Select the check box to allow requests to arrive in arbitrary bursts that exceed the Maximum requests per second rate over an X second window. This can avoid throttling during prolonged traffic bursts. You may enter a context variable containing the second window value. This variable can be either single-value or multivalued with a specific index reference.
    • Clear the check box to disallow bursts. In this scenario, the Gateway only accepts a request if at least 1/limit of a second has passed since the previous request. For example, if the Maximum requests per second is 100, at least 1/100 second must have elapsed between requests. Requests that arrive sooner are either throttled or shaped (based on the "When limit exceeded" setting). Disallowing burst traffic is recommended only for advanced users.
    It is not recommended to disable burst traffic on a counter that will be servicing multiple concurrent requests, particularly at high rates. Doing so can lead to unintended throttling or delaying of multiple requests that arrive at exactly the same time.
    The following graph illustrates how spreading the limit will allow more traffic and throttle fewer requests.
    [Graph: rate_limit_arc2]
    The effect is akin to a gas tank that slowly refills when not being used. Each request "consumes" some gas and the request fails if there is no more gas. The "Spread limit over" setting lets you control the size of the gas tank.
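The gas tank analogy resembles what is commonly called a token bucket. The following minimal sketch shows that idea under stated assumptions: the class name, the explicit `now` timestamps, and the choice of tank size as rate multiplied by window are illustrative only and are not taken from the Gateway's implementation.

```python
class GasTank:
    """Illustrative token-bucket style limiter (not the Gateway's actual code)."""

    def __init__(self, rate_per_sec, window_sec):
        self.rate = rate_per_sec
        # Assumption: the window sets the tank size as rate * window.
        self.capacity = rate_per_sec * window_sec
        self.level = self.capacity  # start with a full tank
        self.last = 0.0             # time of the previous request, in seconds

    def try_consume(self, now):
        """Refill for the elapsed time, then spend one unit of gas if available."""
        elapsed = now - self.last
        self.last = now
        self.level = min(self.capacity, self.level + elapsed * self.rate)
        if self.level >= 1:
            self.level -= 1
            return True   # request accepted
        return False      # tank is empty; request would be throttled or shaped
```

In this sketch a larger window gives a larger tank, so a longer burst is absorbed before any request is refused, while the long-run average stays bounded by the configured rate.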
    Limit each
    Specify how limiting should occur:
    • by the User or client IP address
    • by the Authenticated user name
    • by the Client IP address
    • by the SOAP operation within the request
    • by the SOAP namespace within the request
    • by the Gateway node
    • by a Custom counter value (enables a limit per value of a context variable); enter the node identifier followed by a context variable that resolves to the correct entity during run time.
      To help you construct a custom format, the entry box displays the actual node identifier and context variable associated with each of the other limit options when you select the Custom option. For example, when you first open the Rate Limit Properties, User or client IP is selected by default. Now, choose Custom and then reselect User or client IP. You see that the actual coding behind this is <node identifier>-${request.clientid}.
    The limit breakdown impacts both the maximum number of requests per second as well as the maximum concurrency.
    For example, if you choose “by client IP address” and set the maximum concurrency to 10 and maximum number of requests per second to 100, the assertion will fail if any incoming IP address exceeds either the concurrency of 10 or the 100 requests per second; however, all IP addresses combined are permitted to exceed these limits. You can combine multiple instances of this assertion to impose different limits by different breakdown factors, such as “maximum 10 per IP and maximum 100 for all combined”.
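The combination of two limits with different breakdowns, such as "maximum 10 per IP and maximum 100 for all combined", can be sketched as two keyed counters checked in sequence. Everything here is an assumption made for illustration: the `KeyedCounter` class, the fixed one-second buckets, and the `"all"` key are hypothetical and much simpler than the Gateway's own counters.

```python
from collections import defaultdict


class KeyedCounter:
    """Counts requests per key in one-second buckets (a deliberate simplification)."""

    def __init__(self, limit_per_sec):
        self.limit = limit_per_sec
        self.counts = defaultdict(int)  # never pruned in this sketch

    def allow(self, key, now):
        bucket = (key, int(now))
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


def check_request(client_ip, now, per_ip, combined):
    """Both limits must pass, as with two assertion instances placed in sequence."""
    # If the per-IP check fails, the combined counter is not touched.
    return per_ip.allow(client_ip, now) and combined.allow("all", now)
```

A single busy client is stopped by the per-IP counter, while many moderately busy clients together are stopped by the combined counter.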
    When limit exceeded
    Specify what should happen if the rate limit is exceeded:
    • Throttle: Excess requests cause this assertion to fail and send audit code 6950 (Rate limit exceeded on rate limiter XXXX) to the audit log.
    • Shape: The assertion attempts to delay requests to avoid exceeding the limit. If the API Gateway is unable to spare sufficient resources to hold a request any further, a 503 (Service Unavailable) error may still occur.
    • Log Only: The assertion logs that the rate limit has been exceeded, but the assertion does not fail. Audit message 6950 is logged.
    • Blackout for X sec: Select this check box to fail all requests for the next X seconds after the limit is exceeded, even if the rate of requests falls below the limits defined in this assertion.
      IMPORTANT: For a blackout period greater than 13 seconds, increase the ratelimit.cleanerPeriod cluster property to prevent the rate limit counters from being flushed before the blackout period ends. If the counters are flushed prematurely, the rate limits are not applied. For more information on this cluster property, see Rate Limit Cluster Properties.
    The number of threads that can be queued within a node is defined by the ratelimit.maxQueuedThreads cluster property. For more information, see Rate Limit Cluster Properties.
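The "When limit exceeded" options can be compared side by side with a small decision sketch. It is a simplified model under stated assumptions, not the Gateway's logic: the `ExceededPolicy` class and its return values are hypothetical, the blackout is modeled only together with Throttle, queueing limits for Shape are ignored, and whether Shape writes an audit record is not stated in this document, so none is recorded here.

```python
from enum import Enum

AUDIT_RATE_LIMIT_EXCEEDED = 6950  # audit code named in this document


class Action(Enum):
    THROTTLE = "throttle"
    SHAPE = "shape"
    LOG_ONLY = "log only"


class ExceededPolicy:
    """Hypothetical model of what happens to a request once the limit is exceeded."""

    def __init__(self, name, action, blackout_sec=0.0):
        self.name = name
        self.action = action
        self.blackout_sec = blackout_sec
        self.blackout_until = None
        self.audits = []  # (code, limiter name) pairs standing in for the audit log

    def handle(self, over_limit, now):
        # During a blackout every request fails, even if the rate has dropped.
        if self.blackout_until is not None and now < self.blackout_until:
            return "fail"
        if not over_limit:
            return "pass"
        if self.action is Action.SHAPE:
            return "delay"  # the request is held back instead of failing
        self.audits.append((AUDIT_RATE_LIMIT_EXCEEDED, self.name))
        if self.action is Action.LOG_ONLY:
            return "pass"   # logged, but the assertion does not fail
        if self.blackout_sec > 0:
            self.blackout_until = now + self.blackout_sec
        return "fail"
```

For example, a Throttle policy with a 5-second blackout keeps failing requests that arrive 3 seconds after the limit was exceeded, even when the rate has already dropped.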
    Maximum concurrent requests
    Indicate whether to enforce concurrency limits for a given named rate limiter (as specified by the Limit each setting).
    • Unlimited: Concurrency is not enforced. A named rate limiter can have an unlimited number of active requests simultaneously in the Gateway or cluster. This may result in someone consuming a disproportionately high amount of system resources.
    • Limited to: Ensure that no named rate limiter can have more than the specified number of concurrent requests passing through this assertion. Requests that exceed the concurrency limit will cause the assertion to fail, with the audit event 6953 (Concurrency exceeded on rate limiter XXXX).
      You can enter a context variable that contains the maximum concurrent requests value. This variable can be either single-value or multivalued with a specific index reference.
    • Cluster wide: If the Gateway cluster comprises more than one node, this setting determines whether the value entered in the Limited to field is split among the nodes or applied to each node. This setting is the default.
      • Select this check box to split the value across all the nodes in the cluster. For example, if the maximum is 10, each node in a 5-node cluster is limited to 2 concurrent requests.
      • Clear this check box to allow the maximum concurrent requests value on each node. For example, if the maximum is 10, every node in the cluster will be allowed 10 concurrent requests.
    Additional note about how the concurrency limit works:
    • The concurrency counter is incremented when a request passes through the Apply Rate Limit Assertion (even if the assertion ends up failing). The counter is decremented once the request is completely finished.
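The counting behavior in the note above can be sketched as a small counter. This is a minimal illustration, assuming a hypothetical `ConcurrencyLimiter` class with `enter` and `finish` methods; it is not the Gateway's implementation.

```python
import threading


class ConcurrencyLimiter:
    """Illustrative concurrency counter for one named rate limiter."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.active = 0
        self._lock = threading.Lock()

    def enter(self):
        """Called when a request reaches the assertion; returns False on failure."""
        with self._lock:
            # The counter is incremented even if the check then fails
            # (the case the document audits as event 6953).
            self.active += 1
            return self.active <= self.max_concurrent

    def finish(self):
        """Called once the request is completely finished, pass or fail."""
        with self._lock:
            self.active -= 1
```

One consequence visible in the sketch is that a rejected request still occupies a slot until it finishes, so new requests can keep failing until enough earlier requests, including rejected ones, have completed.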
  4. Click [OK] when done.