The Gateway Dashboard provides a wealth of information about your CA API Gateway. Separate tabs on the dashboard provide information about service metrics and cluster status.
Dashboard - Service Metrics
The Service Metrics window in the Dashboard allows you to continuously monitor performance statistics of a CA API Gateway cluster. Message processing rates and response times are displayed in real time, in a dynamic chart that can be filtered by cluster node, published service, or resolution. Policy violations and routing failures are highlighted for greater visibility.
You can print the Service Metrics window by selecting Print.
(1) The Policy Manager connection timeout is disabled while the Dashboard is open, to allow for uninterrupted viewing of metrics.
(2) If nothing is displayed in the Dashboard, check whether service metrics are disabled. Service metrics are disabled if the setting serviceMetrics.enabled [...] configured to use a MySQL database.
To open the Service Metrics window:
- From the Policy Manager Main Menu, click [View] > Dashboard (on the browser client, from the Monitor menu). The Service Metrics window is displayed.
The Service Metrics window displays traffic flow based on the default settings of All Nodes, All Services, and Fine resolution.
The Service Metrics window is divided into the following sections:
- The filters across the top let you select the node, service, and resolution for the graph
- A moving chart containing three plots: response times (top), notification indicators (middle strip), and message rates (bottom)
- A summary at the right, containing tabs for the selected interval or latest interval
The following mouse actions are available to interact with the display:
- Left click anywhere in the moving chart: Selects a time interval. Statistics about that interval are displayed in the [Selection] tab in the Interval Summary section.
- Right click anywhere in the moving chart: Lets you view the audit events that occurred during the time interval. Select Show Audit Events (<time>) from the pop-up menu. The events are displayed in a Gateway Audit Events window.
- Point at any bar: Positioning the mouse pointer over any colored bar displays a tooltip containing more information about that interval.
- Drag mouse pointer left to right over any bar: Zooms in for a closer look within a time period (see Zooming Time Intervals below).
- Drag mouse pointer right to left: Zooms out (see Zooming Time Intervals below).
Select the information that you want to view from the drop-down lists:
- Gateway Node: Select the Gateway node to monitor or use the default "<All Nodes>" to view data that is combined from all nodes.
- Published Service: Select the service to monitor or use the default "<All Services>" to view data that is combined from all services. Clicking on a service name in the "Services with problems" box highlights an interval and brings the [Selection] tab to the front with statistics of that interval. If there are routing failures or policy violations in that interval, services with those problems are listed in the "Services with problems" list box. Clicking a service name in that list box selects a single service, as if the service name had been selected from the Published Service list.
- Resolution: Select a resolution for the graph: Fine (5 sec), Hourly, or Daily.
The Response Time plot at the top of the chart shows the front end and back end response times, with minimum, maximum, and average values for each time increment. The graph is updated based on the selected resolution: Fine = every 5 seconds; Hourly = every clock hour; Daily = every calendar day. The response times are expressed in milliseconds and the corresponding numeric values are shown in the details section.
The Fine interval can be changed using the metrics.fineInterval cluster property. Restart the CA API Gateway cluster for this property change to take effect.
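To make the three resolutions concrete, here is a minimal sketch of how a timestamp could be mapped to the start of its interval. The function name `bin_start`, the resolution strings, and the `fine_seconds` parameter are hypothetical and chosen for illustration only; the Gateway's actual binning logic is not described in this document.

```python
from datetime import datetime, timedelta


def bin_start(ts: datetime, resolution: str, fine_seconds: int = 5) -> datetime:
    """Return the start of the interval that contains ts (illustrative only)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if resolution == "fine":
        # Floor to a multiple of fine_seconds, counted from midnight.
        elapsed = int((ts - midnight).total_seconds())
        return midnight + timedelta(seconds=elapsed - elapsed % fine_seconds)
    if resolution == "hourly":
        # Hourly bins follow the clock hour.
        return ts.replace(minute=0, second=0, microsecond=0)
    if resolution == "daily":
        # Daily bins follow the calendar day.
        return midnight
    raise ValueError(f"unknown resolution: {resolution}")


sample = datetime(2024, 3, 1, 14, 37, 23)
print(bin_start(sample, "fine"))    # 2024-03-01 14:37:20
print(bin_start(sample, "hourly"))  # 2024-03-01 14:00:00
print(bin_start(sample, "daily"))   # 2024-03-01 00:00:00
```

Passing a different `fine_seconds` value mimics the effect of changing the Fine interval; in the product itself that change is made through the cluster property and requires a cluster restart.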
Front End response time is the time it takes for API Gateway to receive a request from a client, then send a response back to the client. The Back End response time is the time it takes for API Gateway to forward the request to the web service, then receive a response from the web service. Thus, the front end time always includes the back end time.
The Back End response time includes all routings, if there are multiple routing assertions in the policy.
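The relationship between the two measurements can be sketched as follows. The `RequestTiming` class and its field names are hypothetical, and summing the routing durations assumes the routings run one after another; the derived "overhead" figure is an illustration, not a value the Dashboard reports.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTiming:
    front_end_ms: float       # request received from client until response sent back
    routing_ms: List[float]   # one duration per routing assertion executed

    @property
    def back_end_ms(self) -> float:
        # Back end time covers all routings in the policy (assumed sequential).
        return sum(self.routing_ms)

    @property
    def gateway_overhead_ms(self) -> float:
        # Front end time includes back end time, so the remainder is
        # time spent inside the Gateway itself.
        return self.front_end_ms - self.back_end_ms


timing = RequestTiming(front_end_ms=120.0, routing_ms=[40.0, 35.0])
print(timing.back_end_ms)          # 75.0
print(timing.gateway_overhead_ms)  # 45.0
```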
To see the data collected for a particular time interval, point to the corresponding bar and the information is displayed in a tooltip.
The notification bar is the horizontal strip in the middle of the moving chart. Its purpose is to alert you to potential problems: a red square indicates a time interval where routing failures occurred, while a yellow square indicates policy violations have occurred. The services with the problems are listed in the Interval Summary area.
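The color rules for the notification bar can be expressed as a small function. This is a sketch under assumptions: the function name is hypothetical, and because this document does not say what is drawn when both problem types occur in the same interval, the sketch simply returns both colors.

```python
from typing import List


def notification_indicators(routing_failures: int, policy_violations: int) -> List[str]:
    """Return the indicator colors that apply to one time interval."""
    indicators = []
    if routing_failures > 0:
        indicators.append("red")      # routing failures occurred
    if policy_violations > 0:
        indicators.append("yellow")   # policy violations occurred
    return indicators


print(notification_indicators(0, 0))  # []
print(notification_indicators(2, 0))  # ['red']
print(notification_indicators(0, 5))  # ['yellow']
```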
The Message Rate plot at the bottom of the chart shows the message rate, broken down by routing failure, policy violation, and successful requests. The colored bars show at a glance where problems may be occurring. The corresponding numeric values for message rates are shown in the details section. Note that the time axis displays the Gateway time, which may not be in the same time zone as the machine running the Policy Manager.
To see the data collected for a particular time interval, point to the corresponding bar and the information is displayed in a tooltip. You can also right-click any time period and select Show Audit Events. This displays a static Gateway Audit Events window containing only the audit messages for the selected time interval. This can help you isolate and troubleshoot any problems quickly. Repeat this procedure on any other time periods that you want to investigate—there is no need to close the Gateway Audit Events window first.
The panel to the right of the moving chart contains two tabs:
- The [Selection] tab displays information about the selected time interval on the Message Rate Chart. This tab expands on the information presented in the tooltip.
- The [Last <resolution>] tab displays information for the last resolution interval.
The following information is displayed in either tab:
- Interval period. This is fixed for the [Selection] tab, but updated dynamically when the [Last <resolution>] tab is selected. Note that the Gateway time zone is used; this may differ from the local time if the Policy Manager is run on a different machine.
- The response times for the indicated time period, categorized by front end and back end processing and broken down by minimum, maximum, and average values.
- The message processing statistics for the indicated time period, broken down by routing failures, policy violations, and successful requests.
- Any services with routing failures or policy violations (shown in red or yellow in the Notification Bar and Message Rates chart) are listed in the "Services with problems" box. When the [Selection] tab is selected, the problems apply to the currently selected bin. When the [Last...] tab is selected, the problems apply to the latest bin. You can always click any bin to see the service names with problems again. You can click a service name to filter the published services to only that service.
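The contents of a summary tab can be approximated from per-request records. The following is a minimal sketch; the record layout (`service`, `front_ms`, `back_ms`, `outcome`) and the function name are assumptions made for illustration and do not reflect how the Gateway stores its metrics.

```python
from statistics import mean


def summarize_interval(requests):
    """Summarize one bin: response times, message counts, services with problems."""
    if not requests:
        return None  # nothing was processed in this interval
    front = [r["front_ms"] for r in requests]
    back = [r["back_ms"] for r in requests]
    counts = {"success": 0, "policy_violation": 0, "routing_failure": 0}
    problems = set()
    for r in requests:
        counts[r["outcome"]] += 1
        if r["outcome"] != "success":
            problems.add(r["service"])
    return {
        "front_end": (min(front), max(front), mean(front)),  # min, max, average
        "back_end": (min(back), max(back), mean(back)),
        "counts": counts,
        "services_with_problems": sorted(problems),
    }


sample = [
    {"service": "Warehouse", "front_ms": 10.0, "back_ms": 5.0, "outcome": "success"},
    {"service": "Billing", "front_ms": 30.0, "back_ms": 15.0, "outcome": "routing_failure"},
]
print(summarize_interval(sample))
```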
Zooming Time Intervals
You can zoom either the Response Time or the Message Rate plot for a closer look at the time intervals.
- To zoom in, press and hold the left mouse button while dragging the pointer from left to right across one or more bars, then release the mouse button. The plot re-scales to the width of the mouse drag. You can repeat the zoom multiple times. For example, using the hourly resolution, each bar represents a one-hour period and the labels are four hours apart. If you zoom into three bars, the resulting graph shows ten-minute increments along the time line.
- To zoom out, press and hold the left mouse button and perform a short left drag motion anywhere within the chart, then release the mouse button; it is not necessary to drag over a bar. The plot re-scales back to its original resolution.
Dashboard - Cluster Status
The Cluster Status window in the Dashboard displays the status of the Gateway cluster node(s) and provides service statistics. The information in this window is automatically updated every few seconds, with the last update time shown at the bottom left corner of the window.
The Cluster Status window contains two tables:
- The Gateway Status table at the top displays node information and CPU and server statistics by Gateway node.
- The Service Statistics table at the bottom displays service activity statistics.
In either table, you can click a column heading to sort the rows in ascending or descending order based on that heading. You can print the Cluster Status window by selecting Print.
To open the Cluster Status window:
- From the Policy Manager Main Menu, click [View] > Dashboard (on the browser client, from the Monitor menu).
- Click the [Cluster Status] tab.
The Cluster Status window appears.
Gateway Status Table
The Gateway Status table displays information about each Gateway node. You can also rename or remove a node.
- Gateway Node: Name of the cluster node assigned during configuration. A status icon shows whether the node is active, inactive, or undetermined. When inactive nodes are detected, the tab name changes to bring this to your attention. While the status is undetermined, Policy Manager is assessing the node and will change the icon to active or inactive once the status is determined.
- Load Sharing %: Indicates the percentage of total cluster traffic being handled by the node over the past 60 seconds, expressed both as a percentage and as a dynamic bar graph. A value of "0" indicates no activity.
- Request Routed %: Indicates the percentage of current routing activity being handled by the node over the past 60 seconds, expressed both as a percentage and as a dynamic bar graph. A value of "0" indicates no activity.
- The average number of work processes completed over the last 60-second period. For Appliance Gateway installations, values that reach or exceed the number of CPUs on the appliance indicate that the Gateway is under heavy load and overall server performance is slow. For example, a value of 4.0 or greater on a 4-CPU appliance indicates heavy load on the Gateway. This setting does not apply to Software Gateway installations.
- The elapsed time from server start to the current time. Use this information to analyze the number of requests processed by the node per time period.
- The IP address for the direct Gateway-to-Gateway connection.
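Two of the values above are simple calculations, sketched here for illustration. The function names and the per-node request counts are hypothetical; the heavy-load check follows the rule of thumb stated above for Appliance Gateway installations (load average at or above the CPU count).

```python
from typing import Dict


def load_sharing_percent(requests_by_node: Dict[str, int]) -> Dict[str, float]:
    """Share of total cluster traffic handled by each node over one window."""
    total = sum(requests_by_node.values())
    if total == 0:
        # No activity anywhere: every node shows 0.
        return {node: 0.0 for node in requests_by_node}
    return {node: 100.0 * count / total for node, count in requests_by_node.items()}


def is_heavily_loaded(load_average: float, cpu_count: int) -> bool:
    """True when the load average reaches or exceeds the number of CPUs."""
    return load_average >= cpu_count


print(load_sharing_percent({"node1": 300, "node2": 100}))  # {'node1': 75.0, 'node2': 25.0}
print(is_heavily_loaded(4.0, 4))  # True
print(is_heavily_loaded(2.5, 4))  # False
```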
You can perform the following operations on a node:
Rename a node
The new node name is immediately reflected in the Gateway Node column and throughout the Policy Manager.
Renaming a node only changes how the name is displayed in the Policy Manager. It does not affect the actual host name of the Gateway node.
Delete a node
You can delete inactive nodes so that they no longer appear in the Gateway Cluster table. For information on making a node inactive, see Deactivating a Cluster Node.
To delete an inactive node:
The node is immediately removed from the Gateway Status window.
Deleting a node removes it from the Gateway Cluster. If a deleted node is reactivated, you must stop and restart the applicable Gateway in order to see the node in the Cluster Status window.
View log information for a node
When you are setting up the CA API Gateway, use the logs to help you diagnose issues for a Gateway node.
To view log information for any gateway node:
Service Statistics Table
The Service Statistics table initially displays information for all services.
- To filter the list of services shown, enter a service name in the Service Name box. You may also enter a partial name, wildcards, or a regular expression to achieve broader matches.
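The three kinds of matching mentioned above can be sketched as follows. This is an illustration only: the function name, the explicit `mode` argument, and the case-insensitive comparison are assumptions, and the Policy Manager may interpret the text in the Service Name box differently.

```python
import fnmatch
import re
from typing import List


def filter_services(names: List[str], pattern: str, mode: str = "partial") -> List[str]:
    """Filter service names by partial name, wildcard, or regular expression."""
    if mode == "partial":
        return [n for n in names if pattern.lower() in n.lower()]
    if mode == "wildcard":
        # Shell-style wildcards: * matches any run of characters, ? matches one.
        return [n for n in names if fnmatch.fnmatchcase(n.lower(), pattern.lower())]
    if mode == "regex":
        rx = re.compile(pattern, re.IGNORECASE)
        return [n for n in names if rx.search(n)]
    raise ValueError(f"unknown mode: {mode}")


services = ["Warehouse", "WarehouseV2", "Billing"]
print(filter_services(services, "ware"))             # ['Warehouse', 'WarehouseV2']
print(filter_services(services, "*v2", "wildcard"))  # ['WarehouseV2']
print(filter_services(services, "^B", "regex"))      # ['Billing']
```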
By default, the statistics reflect what has occurred since the cluster started.
- To restart all counters, click [Restart 'Counting since']. This resets all displayed values to zero and begins counting from that moment on. This is useful for getting a "snapshot" of the statistics without losing the cluster's cumulative totals.
- To see the statistics accumulated since the cluster was installed, click [Count since cluster install]. This is a cumulative total that is not affected by cluster starts and shutdowns.
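One way to picture the two counting modes is a cumulative counter with a movable baseline. The class below is a hypothetical sketch, not the Gateway's implementation: restarting the count only moves the baseline, so the cumulative total is never lost.

```python
class ServiceCounter:
    """Cumulative counter with a restartable 'counting since' view."""

    def __init__(self) -> None:
        self.total = 0       # cumulative count, never reset
        self._baseline = 0   # value of total at the last restart

    def record(self, n: int = 1) -> None:
        self.total += n

    def restart_counting(self) -> None:
        # The displayed value drops to zero; the cumulative total is kept.
        self._baseline = self.total

    @property
    def since_restart(self) -> int:
        return self.total - self._baseline


counter = ServiceCounter()
counter.record(10)
counter.restart_counting()
counter.record(3)
print(counter.since_restart)  # 3
print(counter.total)          # 13
```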
The following table describes the columns in the Service Statistics table.
- Name of the service assigned during configuration.
- Number of requests that passed policy assertions but failed at the back end web service.
- Number of requests that failed policy assertions.
- Number of request messages that have been successfully routed (completed).
- Success (last min.): Number of request messages that have been successfully routed in the last minute of up time.
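The column definitions above imply a simple classification of each request, sketched here. The function name, the outcome labels, and the two boolean inputs are hypothetical and used only to illustrate how the three counts relate to each other.

```python
from collections import Counter


def classify(policy_passed: bool, backend_ok: bool) -> str:
    """Classify one request according to the column definitions above."""
    if not policy_passed:
        return "policy_violation"   # failed policy assertions
    if not backend_ok:
        return "routing_failure"    # passed policy, failed at the back end service
    return "success"                # successfully routed (completed)


# (policy_passed, backend_ok) for four sample requests
requests = [(True, True), (True, False), (False, False), (True, True)]
tally = Counter(classify(p, b) for p, b in requests)
print(tally["success"], tally["routing_failure"], tally["policy_violation"])  # 2 1 1
```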