Enterprise
provider
A diversified financial services firm deployed a VMware Smart Assurance
solution to help them manage their complex infrastructure as an integrated
system rather than as sets of unrelated silos.
The primary concern of the
financial enterprise was that more than 100 mission-critical applications were
used by their business units, such as applications to monitor their exposure to
interest and currency fluctuations. Basically, the enterprise had too many
mission-critical applications for the network teams to track. At the same time,
their network changed and evolved so quickly that the application teams could
not stay abreast of the changes.
As a result, when network failures occurred, or when network maintenance was
required, the teams had no way of knowing which applications were affected.
Attempts to track this with traditional tools, where one specified network
dependencies for each application, failed because the correlation logic had to
be maintained manually every time the network topology changed. If they missed
a change, the correlation would fail.
By deploying a VMware Smart Assurance solution, the enterprise solved their
problem in two ways. They divided the network, applications, and business
analysis into separate domains, thus breaking the problem into manageable
pieces, and they automated the most complicated part of the problem: defining
the network dependencies of the distributed applications. Their VMware Smart
Assurance solution consisted of the following:
- IP Availability Manager and IP Performance Manager — To monitor failures in the network infrastructure and identify the affected systems, automatically calculating the effect of any network path redundancies
- VMware Smart Assurance Application Services Manager — To perform root-cause and impact analysis of distributed applications based on events and topology information imported from the IP Availability Manager, IP Performance Manager, and the SAM adapters
- — To correlate failures in the technology infrastructure to the businesses they impact (for example, processing on NASDAQ trades)
- — To bring it together into a single system, where users can customize views of the data they need, and use the appropriate tools to quickly restore service
- — To provide web-based, personalized views for business unit managers (for example, in the foreign exchange unit) showing a map representation and a summary view of the status of their systems.
Enterprise solution illustrates their VMware Smart Assurance implementation.
Figure: Enterprise solution
By connecting the analysis from the network, application, and business domains, the solution provided the enterprise with an end-to-end view of causes and impacts, automatically correlating network failures with their impacts on applications and critical business processes. For example, it showed that a switch failure blocked access to a Solaris system that ran one of the two Oracle databases that supported their accounts receivable application. This resulted in degraded performance of the accounting application, and placed their ability to complete their quarterly financial reporting on time at risk.
By knowing the business impact of each problem, the operations staff effectively prioritized their support efforts on the most critical problems; that is, they aligned their efforts with the overall business objectives of the enterprise.
At the same time, because IP Availability Manager and IP Performance Manager automatically adapt to changes in the network topology, operations did not need to devote teams of people to modifying the analysis every time the network topology changed. This dramatically reduced the cost of maintaining their management systems. In fact, because all of the logic is based on the infrastructure topology, they only needed to manually maintain the business and application elements. VMware Smart Assurance software automatically adapted everything else: root-cause and impact analysis, personalized views, and so forth.
In addition to solving the problem of application and business impact, the solution also streamlined operational processes.
By identifying the root causes, the support staff eliminated most of the time-consuming fault isolation process and immediately proceeded to problem resolution. By identifying which events were impacts of other problems, the support staff avoided wasting time trying to chase down and fix symptoms of other problems. Finally, by relating the root causes with the impacts, operations staff streamlined communications. For example, when staff estimated the time to resolve a root-cause problem, that information was automatically propagated to all affected systems, applications, and business processes.
Because they identified the actual root-cause failures (for example, a card on a switch), they were able to automate the dispatch of service technicians from their maintenance provider. More importantly, because they positively identified the failed element, they avoided the delays caused by technicians who arrived on site with the wrong parts.
By integrating the VMware Smart Assurance solution with their existing system and application management tools (by way of SAM adapters), the enterprise was able to leverage their previous investments in management tools and accelerate the deployment of the overall solution.
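
The switch example above can be pictured as a walk over a dependency topology. The following Python sketch is only an illustration of that idea and does not represent how VMware Smart Assurance is implemented. The element names, the `DEPENDENTS` map, the `impacts` function, and the two status labels are all hypothetical. It assumes an element becomes "unavailable" when every element it depends on is unavailable, and "degraded" when only some of them are affected, which is a simplified stand-in for the redundancy handling the case study describes.

```python
from collections import deque

# Hypothetical topology: each element maps to the elements that depend on it.
DEPENDENTS = {
    "switch-1": ["solaris-host-1"],
    "solaris-host-1": ["oracle-db-1"],
    "oracle-db-1": ["accounts-receivable"],
    "oracle-db-2": ["accounts-receivable"],  # redundant second database
    "accounts-receivable": ["quarterly-reporting"],
}


def impacts(root_cause, dependents):
    """Return the status of the root cause and every element it impacts."""
    # Invert the map so each element knows which elements it relies on.
    providers = {}
    for provider, users in dependents.items():
        for user in users:
            providers.setdefault(user, []).append(provider)

    status = {root_cause: "unavailable"}
    queue = deque([root_cause])
    while queue:
        element = queue.popleft()
        for user in dependents.get(element, []):
            states = [status.get(p, "ok") for p in providers[user]]
            # Assumed rule: all providers lost means unavailable,
            # otherwise the element keeps running in a degraded state.
            if all(s == "unavailable" for s in states):
                new_status = "unavailable"
            else:
                new_status = "degraded"
            if status.get(user) != new_status:
                status[user] = new_status
                queue.append(user)
    return status


if __name__ == "__main__":
    for element, state in impacts("switch-1", DEPENDENTS).items():
        print(f"{element}: {state}")
```

Because the statuses are derived from the topology map alone, adding or removing an entry in `DEPENDENTS` changes the computed impacts without touching the propagation logic, which mirrors the maintenance benefit described in the case study.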