Create Custom Plug-ins with EPAgent

You can create custom plug-ins with the EPAgent to collect additional metrics specific to your application.
Plug-in output must follow the guidelines for two kinds of data: metric data, and error or event data.
Metric Data Format
The EPAgent can parse metric data provided by plug-ins or other metric-producing programs plugged into the EPAgent using either Simple or XML formats.
Simple Format for Metric Data
Specify one metric name and value per line using the format:
<metric_name>=<value>
For example:
diskWrites=37
You can also include a reference to a resource segment:
<resource_segment>:<metric_name>=<value>
For example, with each metric on a single line:
Resource Usage|File IO:diskWrites=37
Apache Errors:LastErrorString=ERROR: Apache shutdown unexpectedly
Simple format guidelines:
  • In the simple format, the metric name must not contain an equals sign (=). If the name needs an equals sign, use the XML format.
  • A string value may contain an equals sign. The EPAgent always parses all characters up to the first equals sign (left to right) as the metric name, and all characters after the first equals sign as the value.
  • Any value composed only of numeric digits is interpreted as numeric data and reported as a CA Introscope "IntCounter" type.
  • Any value that contains anything other than numeric digits is interpreted as string data and reported as a CA Introscope "StringEvent" type.
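The simple format guidelines above can be sketched in Python. This is only an illustration of the documented rules, not the EPAgent's actual parser; the function name parse_simple_metric is hypothetical.

```python
def parse_simple_metric(line):
    """Split a simple-format line into (name, value, inferred_type).

    Illustrative only: mirrors the documented rules, not EPAgent code.
    """
    # Everything before the first "=" is the metric name; the rest is the value.
    name, sep, value = line.partition("=")
    if not sep or not name:
        raise ValueError(f"not a simple-format metric line: {line!r}")
    # Only ASCII digits -> numeric counter; anything else -> string event.
    is_numeric = value.isascii() and value.isdigit()
    inferred_type = "IntCounter" if is_numeric else "StringEvent"
    return name, value, inferred_type


print(parse_simple_metric("Resource Usage|File IO:diskWrites=37"))
print(parse_simple_metric("Apache Errors:LastErrorString=ERROR: code=500"))
```

In the second call the value itself contains an equals sign; because the split happens at the first equals sign only, the value is kept whole.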
XML Format for Metric Data
Where the simple format limits the Introscope metric types, XML style format allows the plug-in to report additional information, such as Introscope metric name, Introscope metric type, and value, as in the following example:
<metric type="LongCounter" name="Resource Usage|File IO:diskWrites" value="37" />
<metric type="StringEvent" name="Apache Errors:LastErrorString" value="ERROR: Apache shutdown unexpectedly" />
XML format guidelines:
  • This format allows full support of Introscope data types and equal signs in both metric names and values.
  • The type attribute of a metric must be one of the following:
    • PerIntervalCounter -- the value is a rate per interval where the interval can change. These metrics are aggregated over time by summing the values. For example, if there were 10 method invocations per 15 seconds followed by 15 method invocations per 15 seconds, then aggregating to 30 seconds would result in "25 method invocations per 30 seconds".
    • IntCounter -- int values can go up and down
    • IntAverage -- int value that is averaged over time
    • IntRate -- the value is a per second rate. These metrics are aggregated over time by taking the average of the values.
    • LongCounter -- long values can go up and down
    • LongAverage -- long value that is averaged over time
    • StringEvent -- represents a type which periodically generates Strings. This recorder does not have a notion of current value; it merely reports events in the order in which they are reported to it.
    • Timestamp -- a type which generates successively increasing timestamps.
  • The value of the type attribute is compared case-insensitively, which makes plug-ins easier to write. If a numeric type is supplied but the value is non-numeric, the EPAgent reports nothing to Introscope and logs an error.
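A plug-in written in Python could produce XML-format metric lines along the following lines. This is a minimal sketch: the helper name format_xml_metric is hypothetical, the type list is copied from the guidelines above, and the escaping relies on the standard library's xml.sax.saxutils.quoteattr. It does not check that numeric types carry numeric values.

```python
from xml.sax.saxutils import quoteattr

# Metric types listed in the XML format guidelines.
METRIC_TYPES = {
    "PerIntervalCounter", "IntCounter", "IntAverage", "IntRate",
    "LongCounter", "LongAverage", "StringEvent", "Timestamp",
}
_TYPES_LOWER = {t.lower() for t in METRIC_TYPES}


def format_xml_metric(metric_type, name, value):
    """Return one <metric .../> line (hypothetical helper, not an EPAgent API)."""
    # The type is matched case-insensitively, as the guidelines describe.
    if metric_type.lower() not in _TYPES_LOWER:
        raise ValueError(f"unknown metric type: {metric_type!r}")
    # quoteattr adds the surrounding quotes and escapes &, <, > and quotes.
    return (
        f"<metric type={quoteattr(metric_type)} "
        f"name={quoteattr(name)} value={quoteattr(str(value))} />"
    )


print(format_xml_metric("LongCounter", "Resource Usage|File IO:diskWrites", 37))
```

Escaping the attribute values lets metric names and values contain characters such as equals signs or angle brackets without producing malformed XML.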
Considerations With Either Type of Custom Plug-in Format
Consider the following precautions when you use either type of custom plug-in format:
  • Because the EPAgent supports both formats, it does not recognize simple-format metric names that start with the less-than sign (<).
  • With both formats, if a line cannot be parsed (for example, garbage or incorrect syntax), the EPAgent ignores the line and logs an error.
  • If the plug-in returns multiple lines, parsing continues with the next line.
  • For each metric name, only one metric type can be specified. If more than one type is specified, the following error occurs:
    mm/dd/yy hh:mm:ss PM PDT [ERROR] [EPAgent] Metric name from plugin 'Plugin <plugin_name>' is invalid: "<metric_name>" is already in use by another DataRecorder of a different type
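A plug-in can guard against the type conflict described above before it emits any output. The following sketch assumes the plug-in collects its metrics as (name, type) pairs; the function name find_type_conflicts is hypothetical.

```python
def find_type_conflicts(metrics):
    """Return metric names that appear with more than one type.

    `metrics` is an iterable of (name, type) pairs. Types are compared
    case-insensitively. Illustrative helper, not part of the EPAgent.
    """
    first_type = {}
    conflicts = []
    for name, metric_type in metrics:
        key = metric_type.lower()
        # Remember the first type seen for each name.
        seen = first_type.setdefault(name, key)
        if seen != key and name not in conflicts:
            conflicts.append(name)
    return conflicts


pending = [
    ("Resource Usage|File IO:diskWrites", "IntCounter"),
    ("Resource Usage|File IO:diskWrites", "LongCounter"),
    ("Apache Errors:LastErrorString", "StringEvent"),
]
print(find_type_conflicts(pending))
```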
Error or Event Data Format
The EPAgent can parse error or event data provided by plug-ins in two different formats:
  • Simple
  • XML
Simple Format for Error or Event Data
In general, each event line in the simple format starts with the following fixed string:
event:
The text after the colon:
  • consists of "name=value" pairs, with each pair separated by the ampersand character (&).
  • supplies optional parameters to the event.
The example below is the output of a hypothetical script that monitors the Firefox browser process and sends a notification when the browser exits.
event:type=processWentAway&processName=firefox
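A short Python sketch of how a plug-in might build such a line from a set of parameters. The helper name format_simple_event is hypothetical. The source does not describe how to escape an ampersand or equals sign inside a name or value, so this sketch simply rejects input that would make the line ambiguous; that restriction is an assumption.

```python
def format_simple_event(params):
    """Build a simple-format event line from a dict of parameters.

    Hypothetical helper. Rejecting "&" and "=" is an assumption of this
    sketch, since no escaping rule is documented for the simple format.
    """
    pairs = []
    for name, value in params.items():
        name, value = str(name), str(value)
        if "&" in name or "=" in name or "&" in value:
            raise ValueError(f"cannot encode parameter {name!r} unambiguously")
        pairs.append(f"{name}={value}")
    return "event:" + "&".join(pairs)


print(format_simple_event({"type": "processWentAway", "processName": "firefox"}))
```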
Simple XML Format for Error or Event Data
Events can also be specified in an XML format, which gives the full expressive power of events in the Agent. The simplest XML format event gives the name of a resource that generated the event (an example might be "Connection Pool" or "Java Virtual Machine"). The example below provides a notification that some event happened in "Some Resource".
<event resource="Some Resource"/>
The timestamp will be the time the event was created, and the duration of the event will be zero.
XML Format for Error or Event Data With Parameters and Time Data
You can configure event notification with an explicit timestamp and an explicit duration. The timestamp format is any Java-parsable format. The duration is in milliseconds. The example below is an event with a duration of one minute (60,000 milliseconds):
<event resource="Some Resource" startTime="123003000" duration="60000">
  <param name="urgent" value="true"/>
</event>
Create an Error Snapshot in XML Format
An error snapshot must indicate its type as an error snapshot in the parameters:
<event resource="Some Resource" startTime="123003000" duration="60000">
  <param name="Trace Type" value="ErrorSnapshot"/>
</event>
Nested Components
The example below shows an event with nested sub-components. An event can have any number of sub-components, including none, and each of those can in turn have any number of sub-components. In practice, the level of nesting tends to be small or zero.
<event resource="Some Resource">
  <calledComponent resource="Another Resource">
    <param name="isCorrelated" value="uncertain"/>
    <calledComponent resource="A Third Resource"/>
    <calledComponent resource="A Fourth Resource"/>
  </calledComponent>
</event>
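Rather than concatenating strings, a plug-in can build a nested event with an XML library so that nesting and escaping stay well-formed. The sketch below uses Python's standard xml.etree.ElementTree to reproduce the nested event from this section; the function name build_nested_event is hypothetical. Whether the EPAgent requires the whole event on a single line is not stated here, so the sketch emits it without line breaks as a cautious assumption.

```python
import xml.etree.ElementTree as ET


def build_nested_event():
    """Build the nested-components example as an XML string (illustrative)."""
    event = ET.Element("event", {"resource": "Some Resource"})
    another = ET.SubElement(event, "calledComponent", {"resource": "Another Resource"})
    # Per the schema's sequence, params come before nested calledComponent elements.
    ET.SubElement(another, "param", {"name": "isCorrelated", "value": "uncertain"})
    ET.SubElement(another, "calledComponent", {"resource": "A Third Resource"})
    ET.SubElement(another, "calledComponent", {"resource": "A Fourth Resource"})
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(event, encoding="unicode")


print(build_nested_event())
```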
EPAgent Events and Transaction Traces
You can view EPAgent events in the Event Viewer as transaction traces by selecting the Trace View tab.
Trace views are easier to understand when time information is contained in the event sent by the EPAgent. To include time information, use the startTime and offset attributes on the <event> and <calledComponent> tags.
The startTime attribute is absolute time. Its format is anything that java.util.Date.parse() can parse. Specifying startTime in the <event> element is not required -- if absent, it defaults to the value of the current time, as specified by the Java methods System.currentTimeMillis() or new Date().getTime(). Omitting startTime from a <calledComponent> element makes the time default to the time of the containing element, so that if no startTime attribute is specified anywhere, everything defaults to the current time.
The offset attribute is an integer value. It is interpreted as time in milliseconds and is added to the startTime attribute (whether startTime is default or explicit) to produce the actual time reported for the <event> or <calledComponent>.
Example 1
<event resource="Customized Web Server" startTime="123456789" duration="500">
  <calledComponent resource="Web Server Module" offset="300" duration="100"/>
</event>
The trace view of this event has "Customized Web Server" starting at 123456789 and "Web Server Module" starting at 123457089 (123456789 + 300). Specifying a duration in each element produces a useful trace view showing:
  • "Customized Web Server" running for 300 milliseconds
  • "Web Server Module" called by "Customized Web Server" and running for 100 milliseconds
  • "Customized Web Server" running for another 100 milliseconds after "Web Server Module" returns
Example 2
<event resource="Customized Web Server" duration="500">
  <calledComponent resource="Web Server Module" offset="300" duration="100"/>
</event>
This example is similar to Example 1 except that "Customized Web Server" starts at the current time, and "Web Server Module" starts 300 milliseconds later. Note how no part of this example requires the EPAgent script to know the current time.
Example 3
<event resource="Customized Web Server" startTime="123000000" offset="1000" duration="5000">
  <calledComponent resource="Web Server Module" startTime="123003000" duration="200"/>
</event>
Here "Customized Web Server" starts at 123001000 (123000000 + 1000) and "Web Server Module" starts at 123003000.
Notice again how specifying durations promotes readability and usability. Incorrect startTime, offset, and duration values can make trace views hard to read, so take care when using them. In particular, the start time of a <calledComponent> element (its startTime plus its offset) should always be after the start time of its containing <event> or <calledComponent>, and its end time (start time + duration) should always be less than the end time (start time + duration) of that containing element.
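The timing rules can be checked before a plug-in emits an event. The sketch below is an illustration only: the Component class and check_timing function are hypothetical names, times are plain integers in milliseconds, and two points the text leaves open are treated as assumptions. First, a child without a startTime inherits its container's startTime value before the container's offset is added. Second, a child that starts or ends exactly at its container's boundary is accepted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """One <event> or <calledComponent>; times are integer milliseconds."""
    resource: str
    duration: int = 0
    offset: int = 0
    start_time: Optional[int] = None  # None means "inherit the default"
    children: List["Component"] = field(default_factory=list)


def check_timing(comp, default_start, problems=None, container=None):
    """Collect messages for components that fall outside their container."""
    if problems is None:
        problems = []
    # Assumption: a missing startTime inherits the container's base startTime.
    base = comp.start_time if comp.start_time is not None else default_start
    start = base + comp.offset
    end = start + comp.duration
    if container is not None:
        c_start, c_end = container
        if start < c_start:
            problems.append(f"{comp.resource} starts before its container")
        if end > c_end:
            problems.append(f"{comp.resource} ends after its container")
    for child in comp.children:
        check_timing(child, base, problems, (start, end))
    return problems


# Example 1 from this section: the module runs from +300 ms to +400 ms.
example1 = Component(
    "Customized Web Server", duration=500, start_time=123456789,
    children=[Component("Web Server Module", duration=100, offset=300)],
)
print(check_timing(example1, default_start=0))
```

For Example 1 the function returns an empty list, because the called component starts and ends inside the 500-millisecond window of its containing event.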
XML Schema for Error or Event Data
The formal XSD schema supported is:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="event" type="eventElement">
    <xs:annotation>
      <xs:documentation>The root element for events. This element is nearly equivalent to the calledComponent element, except that the event element must occur only once, at the outermost level.</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="param">
    <xs:complexType>
      <xs:attribute name="name" type="xs:string" use="required"/>
      <xs:attribute name="value" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="calledComponent" type="eventElement">
    <xs:annotation>
      <xs:documentation>A component called by the containing element. This element is nearly equivalent to the event element, except that this element cannot occur at the outermost level.</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:complexType name="eventElement">
    <xs:sequence>
      <xs:element ref="param" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="calledComponent" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="startTime" type="xs:dateTime" use="optional"/>
    <xs:attribute name="offset" type="xs:integer" use="optional" default="0"/>
    <xs:attribute name="duration" type="xs:dateTime" use="optional" default="0"/>
  </xs:complexType>
</xs:schema>