Triage and Diagnose Problems

Introscope agents collect and report application and environmental performance metrics. These metrics help you identify performance problems for an application at any time during its execution. Early detection of a problem helps you address it before degraded performance affects customers. The application diagnoser identifies performance problems to prevent downtime for application users (customers).
Perform the following steps:
View Alert Notification Messages
Alert notification messages appear automatically whenever an alert triggers a notification action. The messages that appear depend on the following alert settings:
  • Each Period While Problem Exists
    Produces a problem message every period that the simple alert is in caution or danger.
  • When Severity Increases
    Produces a problem message on any period when the state of the simple alert escalates:
    • from normal to caution
    • from normal to danger
    • from caution to danger
  • Whenever Severity Changes
    Produces a problem message, resolution message, or both on any state transition. For example, a state change of a simple alert from danger to caution produces the following messages:
    • Resolution: The danger status is resolved.
    • Problem: The caution status is still a problem.
    This type of resolution alert produces a resolution message if the state changes from caution or danger.
  • Report Only Final State Whenever Severity Changes
    Produces a problem message or resolution message only for the final state of an alert transition. For example, for a change from danger to caution, the simple alert triggers only a problem message for the final state, which is caution. This type of resolution alert produces a resolution message only if the state goes to normal.
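The four settings above can be read as rules that map a state transition to messages. The following Python sketch is one interpretation of those descriptions, not product code. The setting keys (`each_period`, `severity_increases`, `severity_changes`, `final_state_only`) and the function name are hypothetical labels chosen for this illustration.

```python
from enum import IntEnum


class State(IntEnum):
    """Simple alert states, ordered by severity."""
    NORMAL = 0
    CAUTION = 1
    DANGER = 2


def notification_messages(setting, previous, current):
    """Return the (kind, state) messages for one evaluation period.

    `setting` uses hypothetical keys for this sketch, not product identifiers.
    """
    changed = previous != current
    problem = ("problem", current.name.lower())

    if setting == "each_period":
        # A problem message every period the alert is in caution or danger.
        return [problem] if current > State.NORMAL else []

    if setting == "severity_increases":
        # A problem message only when the state escalates.
        return [problem] if current > previous else []

    if setting == "severity_changes":
        # Resolution for the state being left, problem for the state entered.
        messages = []
        if changed and previous > State.NORMAL:
            messages.append(("resolution", previous.name.lower()))
        if changed and current > State.NORMAL:
            messages.append(problem)
        return messages

    if setting == "final_state_only":
        # One message for the final state; resolution only on return to normal.
        if not changed:
            return []
        if current == State.NORMAL:
            return [("resolution", previous.name.lower())]
        return [problem]

    raise ValueError(f"unknown setting: {setting}")
```

For the danger-to-caution example, `notification_messages("severity_changes", State.DANGER, State.CAUTION)` yields a resolution for danger followed by a problem for caution, while `final_state_only` yields only the problem for caution.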
Use the information in the message to determine the source of the alert. A meaningful name for an alert provides information about the source, so you can recognize the alert by its name. Based on your familiarity with your applications or with the alerts themselves, investigate the alert as follows:
  • Investigate application and business metrics
  • Investigate network and agent metrics
    Note:
    An action must be defined for the simple alert or summary alert to generate the alert notification.
Follow these steps:
  1. In WebView, click the Alert Notifications link when a number appears in the link or when a New Alert Notification pop-up appears. The number indicates how many notification messages have been received.
    All the alert notifications appear with details such as the metric name, the threshold value, and the time at which the alert is triggered.
    Note:
    A simple alert or a summary alert status generates two types of information messages: a problem message and a resolution message.
  2. (Optional) Click Empty to clear the alert notifications.
  3. Click Close.
  4. Use tools to find more information about a problem, for example:
    • Historical metrics
    • Search
Monitor and Investigate Application and Business Metrics
Note:
The Triage Map is not available for SAP WebView users.
You can view application-centric or business process-centric metrics for your monitored applications as follows:
  • View deployed applications and business-centric metrics, in both live and historical modes.
  • Discover dependencies between application layers and constituent pieces of each layer.
  • Investigate high-level health indicators for applications and their constituent frontends, backends, and middleware.
  • Investigate aggregated health metrics for applications.
Most instrumented components report the standard metrics.
Note:
The application-centric view in the Triage Map tab displays aggregated health metrics. The agent-centric view in the Metric Browser tab displays metrics that are collected from the single host where the agent is configured.
Follow these steps:
  1. In WebView, click Investigator, Triage Map.
    A tree shows a hierarchical view of your system with the following high-level nodes:
    • By Business Service
      -- A business-centric hierarchy of your business services, processes, and transactions.
    • By Frontend
      -- An application-centric hierarchy of your applications.
      A frontend is an instance where an application makes socket-client connections to other elements. These connections are referred to as backend calls.
      Each application has two subnodes:
      • Health
        -- Aggregated metrics across the physical locations.
      • Backend Calls
        -- Metrics for calls to other elements supporting the selected application.
  2. Look at the tree for alert indicators.
    Colored alert indicators show the aggregated status of the metric or element they decorate. Caution or danger alerts indicate problems. The indicators reflect the real-time status of monitored components.
  3. Click a node of interest.
    Depending on the type of metric or element, data appears that corresponds to the node you selected.
    • Application Triage Map
      Shows a visual display of auto-discovered components and dependencies for applications that agents are monitoring:
      • Arrow connectors represent relationships between map components. Connections between a selected component and its dependencies are emphasized with darkened lines. The corresponding map element is highlighted with a shadow, and its dependencies appear in full color. Components that are not participating in the selected element appear dimmed.
      • The customer experience (CE) icon resembles a chess pawn. This icon appears next to the corresponding BT oval when transaction impact monitors are available.
      • While Balances is a Business Transaction (BT), the alert on the Balances oval corresponds to its child BTC, Check Balances.
    • Graphs
      Shows performance values over time in a graph. In real-time views, the graph dynamically displays the most recent time period that fits in the graph. If the graph displays an alert, caution and danger thresholds appear as yellow and red lines, respectively.
  4. Mouse over a map element or a graph metric line of interest.
    Tooltips show information according to your selection.
    • Map element tooltips show the alert level and metric information.
    • Graph line tooltips show the type of problem that triggered the alert and the threshold value.
    This information helps you to investigate performance problems.
  5. View agent locations and resource locations.
    The Locations table lists physical locations for agents or resources. Browse this list and look for metric spikes on individual hosts.
  6. View the health of an application:
    • Right-click a component in the application triage map and select View Health Metrics for Name.
      The tree changes for the application you selected:
      • To the Health node for By Frontend.
      • To the Business Transaction Components node for By Business Service.
      The Overview tab displays the default metric graphs, for example, Average Response Time, Responses Per Interval. Danger (red) and caution (yellow) lines appear when threshold values are exceeded.
    • Click the Health node or Backend Calls node under each application that is listed under By Frontend.
      Aggregated health metrics appear for the application.
  7. (Optional) Click Hide Alert Thresholds.
    All alert thresholds disappear from the charts. Clicking this button toggles between hiding and showing alert thresholds for all charts on the page.
View Application Status and Details
You can use the Triage Map to view the status of your application and application details.
  • The By Business Service map lets you monitor the status of business services and transactions.
  • The By Frontend map lets you monitor application status.
For example, a caution or danger alert occurs on a component in the By Frontend map. Right-click the component in the map and select Show Alert Details for Name. The Alert Details pane appears and shows all the alerts that are associated with the component with their current state. Look for the abnormal alert.
Follow these steps:
  1. In WebView, click Investigator, Triage Map.
  2. Expand the following nodes in the triage map tree:
    • By Business Service
      -- This node displays a business-centric hierarchy of your business services, processes, and transactions.
    • By Frontend
      -- This node displays an application-centric hierarchy of your applications.
  3. Click the application for which you want to see metrics.
    Auto-discovered application components and their dependencies appear in the application triage map in the right pane. You can mouse over map elements to display metrics.
  4. Right-click a component in the application triage map and select one of the following options:
    • Show Locations for Name
      Opens a pane that shows location details and statuses for the application in a table.
    • Show Alert Details for Name
      Opens a pane that shows alert details and statuses for the application in a table.
    • View Health Metrics for Name
      Changes the tree for the application you selected:
      • To the Health node for By Frontend.
      • To the Business Transaction Components node for By Business Service.
      The Overview tab displays the default metric graphs, for example, Average Response Time, Responses Per Interval. Danger (red) and caution (yellow) lines appear when threshold values are exceeded.
    • Edit Alert for Name
      Opens the Edit Alert for Name dialog.
      Note:
      To hide any open panes, right-click a component and select the Hide option that corresponds to the pane you want to close.
View Agent Locations and Metrics
You can find agent locations that are reporting data for an application. The agent locations contain nodes that correspond to application and system resources, which contain metrics.
Follow these steps:
  1. In WebView, click Investigator, Triage Map.
  2. In the Triage Map tree, click the application for which you want to view information.
    The application triage map appears.
  3. Right-click a component or a live connection arrow and select Show Locations for Name from the context menu.
    A Locations table appears in the bottom pane.
  4. Perform any of the following actions:
    • Look for locations that are in caution or danger states -- colored cells indicate where metrics exceed thresholds.
      Note:
      Alerts in the Locations table represent the status as of the last interval; they do not observe sensitivity settings.
    • View more information -- mouse over any row in the list of locations.
      Tooltips display the path to the node in the tree where you can see more metrics that the agent reported at this location.
    • Sort the list by clicking a table column heading.
  5. Jump to the agent location in the Metric Browser tree by performing any of the following actions in the Locations table:
    • Click the blue arrow icon on a row.
    • Right-click a row and select Jump Here In Metric Browser.
    The display changes to the Metric Browser tree to show the performance metrics for the selected agent.
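As a rough illustration of how the Locations table flags hosts, the following Python sketch classifies each location by comparing its last-interval value with caution and danger thresholds, then lists the flagged hosts with danger first. It assumes that higher values are worse and that a value must strictly exceed a threshold to trigger it. The row layout (`host`, `value`) and both function names are hypothetical.

```python
def classify(value, caution, danger):
    """Return 'danger', 'caution', or 'normal' for a last-interval value."""
    if value > danger:
        return "danger"
    if value > caution:
        return "caution"
    return "normal"


def locations_needing_attention(rows, caution, danger):
    """Return (host, status) pairs for rows above a threshold, danger first."""
    order = {"danger": 0, "caution": 1}
    flagged = [
        (row["host"], classify(row["value"], caution, danger))
        for row in rows
    ]
    # Drop locations that are within both thresholds.
    flagged = [item for item in flagged if item[1] != "normal"]
    # sorted() is stable, so hosts with the same status keep their input order.
    return sorted(flagged, key=lambda item: order[item[1]])
```

Because only the most recent value is inspected, the sketch mirrors the note above: it reflects the last interval and ignores any sensitivity setting.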
View Resource Locations and Metrics
You can find resource locations that are reporting data for an application, and then view health information about resources.
Follow these steps:
  1. In WebView, click Investigator, Triage Map.
  2. In the Triage Map tree, click the application for which you want to view information.
    The application triage map appears.
  3. Right-click a Resources pane and select Show Resources for Name Locations from the context menu.
    The Resource Metrics for Name Locations table appears in the bottom pane.
  4. Perform any of the following actions:
    • Look for alert indicators that are in caution or danger states. These states indicate metrics that exceed thresholds.
      Note:
      Alerts in the Locations table represent the status as of the last interval; they do not observe sensitivity settings.
    • Sort the list by clicking a table column heading.
  5. Jump to the agent resource location in the Metric Browser by performing any of the following actions in the Locations table:
    • Click the blue arrow icon on a row.
    • Right-click a row and select Jump To Agent Resources In Metric Browser.
    The display changes to the Metric Browser tree resource location. The Resources tab shows the resource metrics of the selected location.
    These metrics provide health information about resources for that location.
Monitor and Investigate Network and Agent Metrics
You can monitor network and agent-centric metrics to view the status of your networks and agents. Metric data in both tree and tab formats lets you view different types of information about the components or resources.
Follow these steps:
  1. In WebView, click Investigator, Metric Browser.
    The Metric Browser tree lists metrics and other information:
    • SuperDomain
      -- This node contains metrics for all agents that report to the Enterprise Manager. Metrics are organized in a Host|Process|Agent hierarchy.
      • Custom Metric Host (Virtual)
        -- This node represents a virtual host that contains metrics which a specific, individual agent does not report. For example, custom metrics or custom aggregated agents appear under this node. (This node does not correspond to a physical host computer.)
      • Hosts
        -- This node represents a computer that hosts an agent. Each host node contains a process node for the instance of the application being monitored, which in turn contains an agent node. The agent node contains nodes that correspond to application and system resources, which contain metrics. The application resources that appear in the agent node differ based on whether the agent type is Java or .NET.
    • Domains
      -- If the agents that report to the Enterprise Manager are organized into domains, this node contains a sub-node for each domain. Each sub-node represents an agent that is installed on an individual application server host or the equivalent.
  2. Select a node in the metric tree.
    In the right pane, metrics appear that correspond to the node you selected.
  3. In the right pane, look at the metrics for alert indicators.
    Colored alert indicators show the status of the metric or element they decorate. Caution or danger alerts indicate problems.
  4. (Optional) Click Hide Alert Thresholds.
    All alert thresholds disappear from the charts. Clicking this button toggles between hiding and showing alert thresholds for all charts on the page.
  5. Mouse over a metric value point.
    A tooltip with metrics and other information appears.
  6. View live data in the Investigator, or select a range of time to view historical data. The default view of data is Live.
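The Host|Process|Agent hierarchy described in step 1 can be handled as a pipe-delimited path when you work with exported metric names. The following Python sketch splits such a path into its parts. It assumes that the path starts at the host level and that no name contains a `|` character. The function name and the dictionary keys are hypothetical.

```python
def split_agent_path(path):
    """Split 'Host|Process|Agent|Resource...' into its named parts."""
    parts = path.split("|")
    if len(parts) < 3:
        raise ValueError("expected at least Host|Process|Agent")
    host, process, agent = parts[:3]
    # Everything after the agent is treated as the resource path.
    return {
        "host": host,
        "process": process,
        "agent": agent,
        "resource": parts[3:],
    }
```

For example, `split_agent_path("HostName|ProcessName|AgentName|Frontends")` returns the host, process, and agent names plus `["Frontends"]` as the resource path.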
View General Information for Metrics
When you select a node in the Metric Browser tree, general information about the metric appears in the right pane. You can view live or historical data depending on the node that you select. The charts and graphs reflect the time range. You can display blame point metrics (errors and stalls) at different points in time.
Follow these steps:
  1. In WebView, click Investigator, Metric Browser.
  2. Select a node in the tree.
    The General tab shows a graphic view of the metric in the right pane.
    For some nodes in the tree, you can see the path to that node object in the Investigator hierarchy. For example, when you select the Frontends node, the General tab in the right pane shows this path:
    *SuperDomain*|HostName|ProcessName|AgentName|Frontends
    For some other nodes in the tree, the General tab shows the slowest top-ten view of the selected node. For example, when you select the EJB node, the General tab shows the response times of the top ten called components. These components appear in a graph chart. Graphs plot values over time. In real-time views, the graph dynamically displays the most recent time period that fits in the graph. If the graph displays an alert, caution and danger thresholds appear as yellow and red lines, respectively.
  3. Select different resource nodes in the tree. Java resources include Servlets, JSP, EJBs, and JDBC; for .NET, resources include ASP.NET, ADO.NET, and serviced components.
    The ten slowest and worst metrics for the selected resource appear in the General tab. These metrics appear in a bar chart. Bar charts display current data values as horizontal bars. The bar chart is the default view for Top N Filtered Views. The bar chart is available for live data viewing only.
  4. View the response times of the top-ten called components. Select the following nodes in the tree: Servlet, EJB, or JSP for Java, or ASP.NET, ADO.NET, and serviced components for .NET.
    The General tab shows the response times of the top ten called components in a bar chart. If you see fewer than ten bars in the bar chart, there are fewer than ten monitored components under that resource.
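The top-ten view can be approximated outside WebView when you have response times for the components under a resource. The following Python sketch selects the slowest components with the standard library `heapq.nlargest` function. The input mapping of component name to average response time in milliseconds is a hypothetical data shape, not an Introscope export format.

```python
import heapq


def slowest_components(response_times_ms, n=10):
    """Return up to n (name, response_time) pairs, slowest first."""
    return heapq.nlargest(
        n, response_times_ms.items(), key=lambda item: item[1]
    )
```

As with the bar chart, the result has fewer than ten entries when fewer than ten components exist under the resource.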
View Live and Historical Data
As you monitor your managed applications, live data changes continuously to show the most recent data. Historical data displays show all the data that has accumulated in the historical data store.
To view historical data, you use the Time Window control to select a time range. Using a time range can help you quickly identify the time that a problem occurred. For example, if you think a problem occurred in the last hour, set the time range to 1 Hour. Look at the data from the current time backward. If you do not see the event within that hour range, use the controls to move backward to locate the time that the problem occurred.
Note:
The Time Window control is visible in WebView with a few exceptions such as transaction trace and some portions of management modules. The following procedure uses the Investigator tab as an example.
Follow these steps:
  1. In WebView, click Investigator, Metric Browser.
  2. Navigate the *SuperDomain* tree to the metric for which you want to see historical data. For example, navigate to *SuperDomain*, Host_Name, Tomcat, Tomcat (*SuperDomain*), Servlets, Average Response Time.
  3. Select a time value for the historical view from the Time Window drop-down list. The following option requires explanation:
    • Custom Range
      Opens the calendar feature, which lets you specify start and end dates and times.
    The data for that time range appears. The end time is set to the current time unless you use a custom range with a different end time.
    Note:
    The value that you select in the Time Window control applies to other metrics or dashboards in the same window. For example, if you select a 1 Hour value, this value applies as you view the Errors tab or other tabs.
  4. (Optional) Adjust the time range using the Change Start Time arrow controls.
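The relationship between the selected time range, the end time, and the start time can be sketched in Python as follows. The sketch assumes only what the steps above state: the end time is the current time unless a custom range supplies a different one. The function names and the caller-supplied `step` for moving the window backward are hypothetical, because the step size of the arrow controls is not specified here.

```python
from datetime import datetime, timedelta


def time_window(duration, now, custom_end=None):
    """Return (start, end) for a historical view of the given duration."""
    # The end time defaults to the current time unless a custom end is given.
    end = custom_end if custom_end is not None else now
    return end - duration, end


def step_back(window, step):
    """Move a (start, end) window backward by a caller-chosen step."""
    start, end = window
    return start - step, end - step
```

For example, with `now = datetime(2024, 1, 1, 12, 0)`, `time_window(timedelta(hours=1), now)` covers 11:00 to 12:00, and `step_back` with a one-hour step moves that window to 10:00 to 11:00.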
Search for Available Metrics
You can quickly find metrics to detect and fix issues.
Follow these steps:
  1. In WebView, click Investigator, Metric Browser.
  2. Select a node in the tree that contains metrics. The node sets the scope of the search. For example, if you select a Frontend in the tree, search returns only the resources with metrics under that node.
  3. In the right pane, click the Search tab.
  4. Enter either a string or a regular expression in the Search field. If you want to enter a regular expression, select the Use Regular Expression check box.
  5. Click Go.
    The Search table lists the resources with metrics that match the search argument, and the value for each. To sort the list by the contents of a column in ascending or descending order, click that column header.
  6. To find more information in the Search table, use the following shortcuts:
    1. Click Show More to display search results in the Min, Max, and Count columns.
    2. In the first column, select one or more check boxes, and click Draw Graph.
      The graph in the lower pane displays the current (live) metric data view. You can mouse over a data point to open a tooltip with more information.
    3. To view historical data in a graph, select a time range option from the Time Window drop-down list.
    4. In the second column, click the blue arrow icon to follow the metric to its default location in the metric browser tree. The General tab displays a graph of the current metric data view in the right pane.
  7. Click a different node in the tree that contains metrics. The search argument used in the previous search remains active, and is applied to the selected node.
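The scoped search in the steps above can be illustrated with a small Python sketch. It assumes that a plain string is matched as a case-sensitive substring and that a regular expression may match anywhere in the metric path below the selected node. The actual matching rules in WebView may differ. The function name and the pipe-delimited sample paths are hypothetical.

```python
import re


def search_metrics(metric_paths, scope, query, use_regex=False):
    """Return the paths under `scope` whose remainder matches `query`."""
    if use_regex:
        pattern = re.compile(query)

        def matches(name):
            return pattern.search(name) is not None
    else:

        def matches(name):
            return query in name

    # Only paths below the selected node are considered, as in step 2.
    prefix = scope + "|"
    return [
        path
        for path in metric_paths
        if path.startswith(prefix) and matches(path[len(prefix):])
    ]
```

Calling the function again with a different `scope` and the same `query` corresponds to step 7, where the previous search argument is applied to the newly selected node.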