Monitoring and Administration Needed for CICS Effectiveness
This article focuses on CICS performance across all platforms. On the mainframe CICS is known as CICS Transaction Server (TS), but on other platforms the product is named TXSeries, and for special functions, CICS Transaction Gateway. All products are developed and supported by CICS development, and are designed to interact with each other via facilities such as CICS Intercommunication. While techniques, methodology, syntax, parameters and implementation specifics may vary, most functions, concepts and usage apply to all processing platforms. Thus, the term “CICS” in this article applies to both CICS TS and TXSeries unless otherwise stated.
Last month’s article discussed the importance of, and steps in, establishing a performance data collection, reporting and investigation infrastructure, a prerequisite to any effective performance tuning effort. This is the precursor and foundation for an efficacious performance management discipline, one of the most vital components of a responsive and effective IT system. With constant evolution of hardware, a steady stream of software enhancements and releases, shifting and increasing business volumes, and new technological innovations, only a vibrant performance monitoring and administration system can maintain IT effectiveness.
This is especially true of CICS because it’s a central component of online systems whose effectiveness is primarily driven by response time, a key factor in employee productivity and organizational profitability. Consequently, CICS is built to be tunable, populated with parameters that vary internal product mechanisms to optimize performance, adapting to differing constraints and application attributes. Additionally, CICS interacts with performance drivers like Workload Manager (WLM) and with CICS Tools, which provide system and application analysis and guidance via CICS Interdependency Analyzer and CICS Performance Analyzer.
Key questions at this stage of building a tuning infrastructure are: What happens when the performance problem isn’t due to CICS deficiencies or misconfiguration? What if the total system is overloaded and/or improperly configured? Answer: CICS tuning will have little or no positive impact on performance, so a CICS performance review is substantially a waste of time; CICS isn’t the primary culprit in the first place. Consequently, the performance data now being collected must first be used to survey overall system performance and identify performance constraints.
System Performance Monitoring and Reporting
Taking a systemwide perspective on performance requires more than data collection and monitoring; the next step is to understand system performance characteristics and how different components interact. If CICS isn’t getting sufficient processing resources, a performance study begins with comprehensive scrutiny of resource consumption by all IT work, by resource type (e.g., processor cycles, real and virtual storage, I/O that provides data for transactions, bandwidth to other CICS nodes, etc.). Tuning becomes an exercise in identifying which resources CICS is deprived of and who the major consumers of those resources are. Dispatching priorities, WLM settings, workload intensity, disk response time and many other performance measurements need review, and products like RMF, AIX Applications Manager, Windows Performance Analyzer or the multi-platform CICS Performance Analyzer can provide guidance regarding resource consumption by component.
When system performance data indicates that too much workload is running concurrently, that CICS is insufficiently prioritized or that CICS is getting inadequate resources, system tuning precedes internal CICS tuning. This tuning can take a variety of forms, such as:
- Adding more hardware resource (e.g., upgrading to a faster processor, increasing real storage, adding a specialized processor for functions like cryptography or Java, adding or allocating more disk, etc.)
- Reconfiguring and reallocating hardware resources, workloads, prioritizations, dataset layouts, job scheduling, etc.
- Restructuring WLM to increase CICS’ access to highly constrained resources
- Investigating and implementing efficiencies so other system components perform better (e.g., reorganizing data sets can significantly reduce I/O data transfer time)
- Eliminating unnecessary components, facilities or processes
- Replacing existing functions with more efficient ones
- Identifying and implementing performance changes unique to a system’s particular configuration (e.g., adding bandwidth for file transfer processes or implementing an FTP server)
- Tuning large-consumption processes that contend with CICS
- Identifying and eliminating serial processes or resource-locking mechanisms
Narrow It Down
When a significant body of performance data has been accumulated, more in-depth analyses can be performed. The first step is to evaluate resource consumption and rank consumers on an overall, daily and hourly basis, as sketched below. Not only is this valuable for identifying the primary competitors for IT resources, it can also identify times when workload can possibly be moved and times when resource consumption is most severe. Substantial workload has to stay where it is, but sometimes a relatively small reorganization can have a relatively large impact. This holds true whether the constraint is processor resource, real storage, virtual storage, I/O activity, network constriction or something else.
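As an illustration, here is a minimal sketch of that kind of ranking, assuming a hypothetical CSV export of interval records; the file name and column names are invented for the example, so substitute whatever your monitor (RMF, CICS Performance Analyzer, etc.) actually produces:

```python
# Rank resource consumers overall and by hour from interval records.
# Assumes a hypothetical CSV with columns: timestamp, workload, cpu_seconds.
import pandas as pd

df = pd.read_csv("interval_records.csv", parse_dates=["timestamp"])
df["hour"] = df["timestamp"].dt.hour

# Overall ranking: who are CICS' biggest competitors for the resource?
overall = df.groupby("workload")["cpu_seconds"].sum().sort_values(ascending=False)
print(overall.head(10))

# Hourly profile: when is consumption most severe, and which workload
# dominates each hour (a candidate for rescheduling)?
hourly = df.pivot_table(index="hour", columns="workload",
                        values="cpu_seconds", aggfunc="sum")
print(hourly.idxmax(axis=1))  # heaviest consumer in each hour
```

The same grouping works for any measured resource; swap the CPU column for I/O counts, storage frames or network bytes to rank those constraints instead.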
A key task is identifying high utilizations, because as device or path usage climbs, wait time climbs disproportionately faster; near saturation, a small increase in utilization produces a large increase in delay. A queue is a waiting line, and queuing theory examines the effect of wait time on system throughput and response, plus the cost tradeoffs of waiting versus adding resources to reduce wait time. Anyone who’s driven the Kennedy Expressway into Chicago on a Monday morning has endured the queue phenomenon, much to their dismay, and the very same effect applies to computer resource usage. As utilization increases, wait time, queue length and congestion increase; the following determinants control how values change based on activity (a worked sketch follows the list):
- Arrival rate and size: The frequency with which units of work arrive and the size of each unit of work
- Servers and server speed: Servers provide exits from a queue; their speed determines how quickly units can exit
- Distributions: The statistical patterns of entrances into and exits from queues, covering both arrival rates (the frequency and pattern at which units enter queues) and departures
- Service rates: These depend on server speed and describe the time from when a unit leaves a queue until it moves to the next path segment
- A path may consist of one or more queues and one or more servers
- Two types of queues with different characteristics exist: A single-server queue processes one unit of work at a time while all units behind it wait; a multi-server queue can process two or more units concurrently, and when any server completes a unit, the next unit in the queue is accepted
- The math to calculate throughput of a single-server queue is different, and simpler, than the math used to calculate multi-server throughput (both appear in the sketch below)
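To make the single-server versus multi-server distinction concrete, here is a minimal sketch using the classic M/M/1 and M/M/c (Erlang C) formulas; the arrival and service rates are assumptions chosen purely for illustration:

```python
# Mean response time for single-server (M/M/1) and multi-server (M/M/c)
# queues. Rates below are illustrative assumptions, not measured values.
from math import factorial

def mm1_response(lam, mu):
    """Mean response time for one server: service time plus queue wait."""
    assert lam < mu, "utilization must stay below 100%"
    return 1.0 / (mu - lam)

def mmc_response(lam, mu, c):
    """Mean response time for c identical servers, via the Erlang C formula."""
    a = lam / mu          # offered load in Erlangs
    rho = a / c           # per-server utilization
    assert rho < 1, "utilization must stay below 100%"
    top = a**c / (factorial(c) * (1 - rho))
    p_wait = top / (sum(a**k / factorial(k) for k in range(c)) + top)
    return 1.0 / mu + p_wait / (c * mu - lam)

# 90 arrivals/sec against one server completing 100/sec: 90% busy
print(mm1_response(90, 100))     # ~0.100 sec
print(mmc_response(90, 100, 2))  # two servers: ~0.013 sec
```

Note how adding a second server at 90% utilization collapses response time: of the single server’s 100 ms response, only 10 ms was actual service and the rest was queue wait, which the second exit path almost eliminates.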
Believe it or not, thanks to an Operations Research degree, years ago I crunched queuing numbers with a calculator while performing tuning reviews, but these days numerous calculators are on the internet. More importantly, many tuning tools include queuing logic for generating performance projections. While for the most part it’s useful to let the tool do the work, it’s a worthwhile exercise to do an internet search on “queuing theory calculator”; several will appear. It’s fascinating and amazingly informative to see how a small arrival rate reduction can result in major performance improvements (and vice versa), how minor service time improvements sometimes yield major performance improvements, and many other variations that often aren’t so obvious.
Queuing theory is vital for performance tuning. It illustrates the nature of information processing and the complex path a transaction takes to accomplish its work. The lessons learned from it are sometimes almost counterintuitive. For example, increasing servers probably won’t cut response time much if service time itself is relatively high, yet a slight reduction in arrival rate may significantly improve response time when utilization is high, as the numbers below show. Queuing theory points the way to improved performance.
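A minimal sketch of that sensitivity, again using the M/M/1 mean response time W = 1/(mu - lambda) with an assumed, illustrative service rate:

```python
# Sensitivity of response time to arrival rate at high utilization,
# using the M/M/1 mean response time W = 1 / (mu - lambda).
mu = 100.0  # assumed service rate: server completes 100 units/sec

for lam in (98.0, 95.0, 90.0, 50.0):
    w = 1.0 / (mu - lam)  # mean response time in seconds
    print(f"arrival rate {lam:5.1f}/sec ({lam / mu:4.0%} busy): {w * 1000:6.1f} ms")

# Prints: 500 ms at 98% busy, 200 ms at 95%, 100 ms at 90%, 20 ms at 50%.
# Trimming arrivals by roughly 3% (98 down to 95/sec) cuts response time 60%.
```

This is the Kennedy Expressway effect in four lines: near saturation, tiny changes in traffic swing delay enormously, while the same change at 50% busy is barely noticeable.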
It Ain’t Easy
Performance tuning is a highly complex process. It’s not just a matter of product knowledge, but also Operations Research principles such as queuing theory, modeling techniques and simulations. Just as important is the acquisition and usage of performance tools, diligent monitoring and reporting, and hard-won experience. Maybe a little luck, too.
Tuning takes patience, persistence, studious attention to numbers and creative intuition, because with all the working parts and interdependencies, problems are rife and answers are conditional solutions that keep shifting and changing. While a CICS performance issue may be due to internal processing and parameter settings, outside influences abound and are frequently the cause of degradation. The mixture of work that goes on within a mainframe, Linux/AIX/UNIX-based or Windows/TXSeries system is varied and complex, and CICS tuning is only part of delivering well-performing, interactive online processing.