This is the second in a series that explains what tools are available to achieve end-to-end monitoring and management. Part one, “End-to-End Monitoring and Management is Worth the Effort,” explained the importance of monitoring and management and offered a five-part way to plan for it.
Implement Effective Monitoring
There are world-class tools available from IBM and other suppliers to get the job done effectively. Both IBM Tivoli NetView and IBM Tivoli OMEGAMON XE have deep monitoring capabilities. There is overlap in the product functionality, yet many System z customers use both product families simultaneously. Tivoli NetView for z/OS provides systems and network monitoring and management to address the needs of IT departments. It also has an equally important role to play in System z automation.
Tivoli OMEGAMON XE provides thorough monitoring and problem management throughout the System z and zEnterprise system. This software provides visibility that improves the usability and performance of System z and its important subsystems. In addition to the OS, there are Tivoli OMEGAMON XE products with specific support for z/VM and Linux on System z, as well as storage, CICS, DB2, IMS and messaging on the z/OS platform.
Most IT departments running System z servers have embraced monitoring. Early in their IT experiences, they saw the benefits of monitoring and were more than willing to pay the cost in software licensing and resources to run and support their monitoring approach. Where some IT departments need help today is in centralizing, consolidating and better integrating their monitoring so they can take a more holistic view of the monitoring data from all their systems.
Effectively Use the Management Toolset
We often separate monitoring from management in our discussions because they are so different. However, most system and network management products offer a combination of monitoring and management functionality. Products like Tivoli NetView and OMEGAMON are used by IT personnel who have different levels of responsibility and areas of focus. The products are organized functionally to meet the needs of level one, two and three personnel, where each level requires increasingly specialized and deeper information. Mainly through the specialized functionality of their product families, the products make it possible to dive deeply into subsystems like CICS, IMS and DB2.
A special area of the management toolset is automation software. Since the mid-1990s, many mainframe operations groups have changed their focus from operating the system through message monitoring and actions to handling exceptions for a system that has been automated to a high degree. This automation has been achieved in different ways using different tools and products. Some IT departments created projects, staffed by system programmers and operations personnel, that suppressed messages, automated the response to messages and implemented monitoring commands that included built-in automated recovery actions.
Other IT departments chose the approach that achieved a high degree of automation without a lot of staff involvement, implementing automation through products that were turned on over a weekend. Monday morning, the IT world was a different place for operations personnel. Today, IBM has a variety of products that provide system and network automation (see Table 1).
Table 1. Automation Products Offer Choices
Invest in Software Integration
Software integration in System z environments is possible using the rich variety of interfaces available. If you focus on a few ways to display and process monitoring data, then the challenge is getting the results of the monitoring into that common viewing point. That is where the interfaces come in.
z/OS messages are an important interface. Messages can be generated by Write to Operator macros or commands, and after they pass through the Message Processing Facility they can be automated by a number of products. When NetView is used, the message can be trapped in the message table and actions can be taken. The actions could use another interface to issue MVS commands to take a wide variety of actions.
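The message-driven flow described above can be sketched in a few lines of Python. This is only an illustration of the idea of a message table, not NetView automation table syntax: the message IDs (HYP001I, HYP002W), the rule structure and the command strings are all made-up placeholders, and a real implementation would issue commands through the product's own interfaces.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    """One entry in a hypothetical message table: a pattern and an action."""
    pattern: "re.Pattern[str]"
    action: Callable[["re.Match[str]"], str]


def build_table() -> List[Rule]:
    # Message IDs and command strings are invented for illustration only.
    return [
        Rule(
            re.compile(r"^HYP001I JOB (\S+) ABENDED"),
            lambda m: f"RESTART {m.group(1)}",
        ),
        Rule(
            re.compile(r"^HYP002W SPOOL (\d+)% FULL"),
            # Act only when the reported utilization is 90% or higher.
            lambda m: "PURGE OLDOUTPUT" if int(m.group(1)) >= 90 else "",
        ),
    ]


def automate(message: str, table: List[Rule]) -> Optional[str]:
    """Return the command for the first matching rule, or None."""
    for rule in table:
        match = rule.pattern.search(message)
        if match:
            command = rule.action(match)
            return command or None
    return None


if __name__ == "__main__":
    table = build_table()
    for text in ("HYP001I JOB PAYROLL ABENDED", "HYP002W SPOOL 95% FULL"):
        print(text, "->", automate(text, table))
```

The first matching rule wins, so the order of entries in the table matters, and messages that match no rule are simply passed over.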
Reuse Assets Where Possible
Since the 1990s, a lot of effort has been spent on creating application-monitoring software. Today, products like Tivoli Composite Application Manager for Application Diagnostics, Tivoli Composite Application Manager for Microsoft Applications, Tivoli Composite Application Manager for SOA Platform and Tivoli Composite Application Manager for Transactions allow IT departments to put special focus on ensuring the availability and performance of applications. At the same time, application-quality initiatives have IT departments implementing stress and performance testing using custom-built scripts and products like IBM Rational Performance Tester, which emulates multiple users to test performance.
Some IT departments are using the same scripts and product-based test scenarios to provide basic application monitoring during normal operation. This reuse is not an automatic byproduct of stress testing, because running the scripts in production requires test data and user IDs with access rights to the applications, but these challenges can be overcome. Other reuse opportunities exist where interfaces to the problem-management system have been developed. Some IT departments have expanded the use of these interfaces to the application-development community.
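As a rough illustration of reusing a test scenario for production monitoring, the Python sketch below wraps an existing scenario in a probe that records availability and response time. The names run_probe and ProbeResult, the threshold value and the injectable clock are assumptions made for this example; they do not come from Rational Performance Tester or any Tivoli product.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    name: str
    available: bool          # the scenario completed without raising
    elapsed_seconds: float   # time the scenario took to run
    within_threshold: bool   # available and no slower than the threshold


def run_probe(
    name: str,
    scenario: Callable[[], None],
    threshold_seconds: float,
    clock: Callable[[], float] = time.perf_counter,
) -> ProbeResult:
    """Run one reused test scenario as a monitoring probe."""
    start = clock()
    try:
        scenario()
        available = True
    except Exception:
        # Any failure in the scenario is treated as "application unavailable".
        available = False
    elapsed = clock() - start
    return ProbeResult(
        name, available, elapsed, available and elapsed <= threshold_seconds
    )


if __name__ == "__main__":
    # A stand-in for a real scripted login scenario.
    print(run_probe("login", lambda: time.sleep(0.05), threshold_seconds=1.0))
```

In practice the scenario would log on with a dedicated monitoring user ID and work only with designated test data, which is exactly the preparation that production use requires.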
Expand Typical IT Measurements
Because applications have become more complex, with components running in multiple environments simultaneously, IT departments are challenged to expand their IT measurements to more faithfully capture and depict what the end user is experiencing. Taking a component-by-component approach to these applications doesn’t work effectively.
Application Response Measurement (ARM) is an open standard published by the Open Group for monitoring and diagnosing performance bottlenecks within complex enterprise applications. ARM is hardly new, but it is more useful than ever for a number of today’s applications. ARM includes an API for C and Java programs that allows timing information associated with each step in processing a transaction to be recorded to a remote server for later analysis.
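The idea of timing each step of a transaction under a shared identifier can be shown with a small Python sketch. It does not use the ARM API, whose bindings are for C and Java; the class and field names here are invented for illustration, and the records are kept in a local list rather than sent to a remote server.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class StepRecord:
    transaction_id: str
    step: str
    elapsed_seconds: float
    status: str  # "good" or "failed"


@dataclass
class TransactionTimer:
    """Collects per-step timings for one transaction instance."""
    name: str
    transaction_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    records: List[StepRecord] = field(default_factory=list)

    @contextmanager
    def step(self, step_name: str) -> Iterator[None]:
        start = time.perf_counter()
        status = "good"
        try:
            yield
        except Exception:
            status = "failed"
            raise
        finally:
            # Record the step whether it succeeded or failed.
            elapsed = time.perf_counter() - start
            self.records.append(
                StepRecord(self.transaction_id, step_name, elapsed, status)
            )

    def total_seconds(self) -> float:
        return sum(r.elapsed_seconds for r in self.records)


if __name__ == "__main__":
    timer = TransactionTimer("place_order")
    with timer.step("web_tier"):
        time.sleep(0.01)
    with timer.step("database"):
        time.sleep(0.02)
    for record in timer.records:
        print(record)
```

Because every record carries the same transaction identifier, the steps can be reassembled later into an end-to-end response time instead of being read component by component.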
If you need additional incentives, consider that popular software is already instrumented with ARM calls (see Table 2).
Table 2. Already Instrumented With ARM
There are many tools and products to tackle a monitoring and management implementation. Even if you have products implemented, perhaps it is time to take a fresh look at your approach and toolset to see if you are missing an opportunity to innovate and improve your solution.
Is it time to consolidate monitoring data sources to a few centralized views to increase integration and improve insight? What about the response time of your applications: Are you gathering and consolidating that data or are you relying on component-by-component status to determine your application’s response time?
Joseph Gulla is the IT leader of Alazar Press, a publisher of children’s literature.