Tuesday, June 24, 2008

z/OS New Age Performance Monitoring : Concepts & Directions

By Norman D. Hollander
In today’s complex data center environments, monitoring performance remains an important function. Methods applied in the past worked well for those old environments. This article explores some of the new age concepts and directions we must adopt to provide effective resource utilization and take the necessary corrective actions to protect our precious resources.

We need to accept several concepts that differ from the way we traditionally view performance data. Some of these fit nicely into the previously discussed historical perspectives, while some are more appropriate for how we should view performance today.

A processor (physical or logical) running at 100 percent utilization is not necessarily a problem. Workload Manager (WLM) tends to keep the processor busy. As long as Importance 1 and 2 workloads in their related service class periods are meeting their goals, why is this a problem? This does presume a heterogeneous workload and a good WLM service policy.

The WLM Performance Index (PI) is a principal indicator of system performance. We need to monitor it from both a system and a sysplex viewpoint. Are defined goals being met? Are the PIs appropriate for the workloads running in the associated service class periods? Is any subsystem’s performance suffering? Validating goals is an iterative process.
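
As a rough illustration of how PI checking might be automated, the sketch below computes a PI for a few service class periods and lists the Importance 1 and 2 periods that miss their goals. It assumes the usual definitions (actual divided by goal for response time goals, goal divided by actual for velocity goals) and ignores percentile response time goals. The class names, field names, and sample numbers are hypothetical, not taken from any real service policy.

```python
from dataclasses import dataclass


@dataclass
class ServiceClassPeriod:
    name: str          # hypothetical label, e.g. "ONLINE.1"
    importance: int    # 1 (highest) to 5
    goal_type: str     # "response_time" or "velocity"
    goal: float        # seconds for response time, percent for velocity
    actual: float      # measured value, same unit as goal


def performance_index(period):
    """Return the PI: above 1.0 the goal is missed, at or below 1.0 it is met."""
    if period.goal <= 0 or period.actual <= 0:
        raise ValueError("goal and actual must be positive")
    if period.goal_type == "response_time":
        return period.actual / period.goal
    if period.goal_type == "velocity":
        return period.goal / period.actual
    raise ValueError(f"unknown goal type: {period.goal_type}")


def missed_goals(periods, max_importance=2):
    """Names of periods at or above the given importance whose PI exceeds 1.0."""
    return [
        p.name
        for p in periods
        if p.importance <= max_importance and performance_index(p) > 1.0
    ]


# Made-up sample data for illustration only.
periods = [
    ServiceClassPeriod("ONLINE.1", 1, "response_time", goal=0.5, actual=0.4),
    ServiceClassPeriod("STC.1", 2, "velocity", goal=50, actual=40),
    ServiceClassPeriod("BATCH.1", 4, "velocity", goal=20, actual=5),
]

for p in periods:
    print(p.name, round(performance_index(p), 2))
print("Importance 1-2 periods missing goals:", missed_goals(periods))
```

In this sample only STC.1 would be reported. BATCH.1 has a much worse PI, but at Importance 4 it is exactly the kind of work a busy processor is expected to delay.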

A system utilizing real storage at 100 percent is not necessarily a problem. WLM tends to keep the real storage full. The real issue is paging and, more important, what type? Who is paging? Many mission-critical business applications don’t tolerate any kind of paging. With real storage so inexpensive, it’s easy to add real storage to solve that kind of performance challenge.

A system utilizing virtual common storage near 100 percent may be a serious problem. Many operating system components have moved their common storage requirements above the line and above the bar. System Queue Area (SQA) may be overflowing into Common Storage Area (CSA), which may not be a real problem. But when CSA becomes full, there could be a real problem.

Coupling facility performance is one of the critical metrics to overall system performance. A poorly configured, poorly tuned coupling facility will severely impact performance of the components and applications using it. The availability of multiple coupling facilities with sufficient capacity allows mission-critical applications to continue in outage scenarios. It’s important to understand who is using the coupling facilities and how well they’re servicing those users.

Cross-System Coupling Facility (XCF) performance is an important metric to overall system performance. Poorly defined XCF transport classes will negatively impact performance of the system components and applications using it. The definition of multiple transport classes with proper MESSAGELENGTH and buffers will ensure that CPU usage in the XCFAS address space will be optimal. It’s important to understand who is using the XCF facilities and how well they’re servicing those users.
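
To see why the choice of transport class lengths matters, the following sketch distributes a set of message sizes across defined classes. It assumes a simplified selection rule: a message goes to the class with the smallest defined length that can still hold it, and anything larger than every class falls into the largest one. The class names and lengths are hypothetical, and real XCF behavior (for example group-based class assignment) is not modeled.

```python
def pick_transport_class(message_len, class_lengths):
    """Return the class a message of message_len bytes would use.

    class_lengths maps a (hypothetical) class name to its defined length in
    bytes. Assumed rule: smallest class that fits, else the largest class.
    """
    fitting = {name: size for name, size in class_lengths.items() if size >= message_len}
    if fitting:
        return min(fitting, key=fitting.get)
    return max(class_lengths, key=class_lengths.get)


def class_usage(message_lengths, class_lengths):
    """Count messages per class and how many exceed the largest class length."""
    counts = {name: 0 for name in class_lengths}
    largest = max(class_lengths.values())
    oversized = 0
    for length in message_lengths:
        counts[pick_transport_class(length, class_lengths)] += 1
        if length > largest:
            oversized += 1
    return counts, oversized


# Hypothetical class definitions and observed message sizes.
classes = {"SMALL": 956, "MEDIUM": 8124, "LARGE": 62464}
messages = [400, 900, 4000, 8124, 30000, 70000]

counts, oversized = class_usage(messages, classes)
print(counts)
print("messages larger than any class:", oversized)
```

A distribution heavily skewed toward one class, or a steady count of oversized messages, would be a hint that the class lengths no longer match the traffic.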

DASD I/O should not be monitored at the individual device level (although device-level monitoring may still be applicable to robust tape environments). With large numbers of volumes in many of today’s data centers, I/O monitoring must occur on an exception basis. Monitoring at the higher storage controller level is often beneficial. The metrics to look for include:

• Contention for specific devices
• Who is waiting?
• Where’s the time being spent to do the I/O?
• Is WLM involved? (If not, maybe it should be.)

Parallel Access Volumes (PAVs), HyperPAVs, and I/O priority can help with I/O bottlenecks.
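
Exception-based I/O monitoring can be sketched as a simple filter. The example below assumes each device sample carries the four classic response time components (IOSQ, pending, disconnect, connect) in milliseconds. It reports only busy devices whose total response time exceeds a threshold, along with the component where most of the time is spent. The thresholds, volume serials, and controller names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeviceSample:
    volser: str
    controller: str
    io_rate: float   # I/Os per second
    iosq_ms: float   # queued in the operating system
    pend_ms: float   # waiting for the channel path or controller
    disc_ms: float   # disconnected, e.g. cache miss
    conn_ms: float   # transferring data

    @property
    def response_ms(self):
        return self.iosq_ms + self.pend_ms + self.disc_ms + self.conn_ms


def io_exceptions(samples, max_response_ms=5.0, min_io_rate=10.0):
    """Return (volser, controller, response, dominant component) for exceptions only."""
    found = []
    for s in samples:
        if s.io_rate < min_io_rate or s.response_ms <= max_response_ms:
            continue  # idle or healthy devices are not worth anyone's attention
        components = {
            "IOSQ": s.iosq_ms,
            "PEND": s.pend_ms,
            "DISC": s.disc_ms,
            "CONN": s.conn_ms,
        }
        dominant = max(components, key=components.get)
        found.append((s.volser, s.controller, round(s.response_ms, 2), dominant))
    return found


# Made-up interval data for illustration only.
samples = [
    DeviceSample("PROD01", "CU01", 120.0, 6.0, 0.3, 1.0, 1.2),
    DeviceSample("PROD02", "CU01", 200.0, 0.1, 0.2, 0.5, 1.0),
    DeviceSample("WORK01", "CU02", 2.0, 0.0, 0.5, 20.0, 2.0),
]

for row in io_exceptions(samples):
    print(row)
```

Here only PROD01 is reported, with IOSQ as the dominant component, which is the pattern where additional PAV aliases are most likely to help. WORK01 is slow but nearly idle, so it is deliberately left out.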

What’s Next?

Before moving into new directions for monitoring, we first need to incorporate the new age concepts just discussed. We need to:

• Display only relevant information and exceptions to the appropriate teams responsible for monitoring and managing our systems.
• Establish a new “expected client performance service” rating. Integrating more autonomic processes into our automation solutions will be important to the success of enterprise performance monitoring.

Tuesday, June 17, 2008

Tips & Tricks for z/OS Storage Managers - B. Curtis Hall

Most storage administrators didn’t choose that profession; recent headcount reductions have often shifted the responsibility to other technical support staff members. Often, storage administrators have had other primary jobs; they haven’t had the luxury of taking time from their daily job to concentrate on training and learning a new skillset.

But storage won’t just manage itself. A newly hired or newly appointed storage manager can soften the impact of doing more with less and can help a company work smarter. Planning and organization skills can help the new storage manager make the transition less painful, increase productivity, and gain measurable results. Time invested at the beginning of the new job can have significant payout in the coming months. This article presents several steps a new storage manager can take to become more productive in the OS/390 or z/OS environment.

1. Obtain a copy of the previous storage manager’s JCL library. This library will have most of the jobs, setup utilities, and prewritten procedures needed to successfully manage the new environment.

2. Track down and obtain a copy of the Standards and Procedures Guide for Storage Management. The documentation will be useful in making modifications to existing procedures such as System-Managed Storage (SMS), Automatic Class Selection (ACS) routines, Generation Data Group (GDG) builds, Volume Table of Contents (VTOC) and VSAM Volume Data Set (VVDS) sizing in the shop, storage migration schedule start and end times, restoring data sets, backing up and copying data sets, SMS Management Class retention information, and Hierarchical Storage Manager (HSM) information. If this manual isn’t currently available, start organizing one. The manual will benefit you during off-hour support issues and will be your salvation when you later try to get in some vacation time.

3. Obtain mainframe access immediately. Almost all organizations provide offsite access to the mainframe. It’s a necessity to carry both a cell phone and a pager. Time is of the essence when storage problems occur. Most companies will reimburse their employees for DSL service, a pager, and a cell phone. The financial impact of downtime, or of slow response time created by storage problems, easily offsets the expense.

4. Gain the knowledge necessary to do your job well. Education classes, hands-on training, conferences, and trade magazines are the best means of gaining knowledge. Mainframe hardware and software vendors are a good source for this education. There are also companies specializing in this field. At a minimum, take a basic Storage 101 class covering storage-related system software, and vendor education on the disk hardware and software used at the site. If you’ll also be responsible for the tape system, you must understand the hardware and software involved there, too. This tape knowledge might need to encompass managing hardware such as silos, virtual systems, and extra dense drives. You must understand all the hardware and software related to your storage systems.

Networking with other storage managers is invaluable. “Picking the brain” of existing, long-term storage managers is an important way to gain knowledge. This keeps you, as a new manager, from re-inventing the wheel. When you try to absorb the knowledge of others, you’ll be amazed by the backgrounds of those who have been in the OS/390 or z/OS world for a long time. At some time in their careers, most have been in the same position as any other novice storage manager. Network locally with other companies through any introductions you can arrange, and participate in storage user groups. Most issues you’re facing are similar to those solved by others in the field.

Be sure your reading list includes these magazines and periodicals: z/Journal, Cheryl Watson’s TUNING Letter, Information Week, and ComputerWorld. IBM once published a manual titled How to Lead an Effective Storage Management Group. It was excellent; but it’s no longer in print. Additional sources from IBM include the Redbook series. All IBM information can be accessed through their Website.

5. Automate your storage procedures. Although most mainframe shops have started tasks to perform storage management, some don’t. If a batch process controls the storage management system, ensure the jobs are executed through a scheduling product. Automate other storage procedures such as capacity planning, storage volume reporting, storage migration activities, backup activities, and recall activities. HSMs have these reporting functions built into the product, but most require manual effort to extract the information. The reporting functions should be automated for quick, easy analysis.

6. Evaluate and purchase a capable storage-reporting tool. I’ve been a storage manager for almost 20 years and have evaluated several dozen storage-reporting tools. Most tools can produce the reports you need to manage a storage environment. The biggest issue for these products is scalability. Most tools aren’t capable of scaling up and growing with the storage farms. The reporting tool should be flexible and able to report on the whole DASD farm or just a subset of it. The tool should be capable of drilling down to the storage group, storage volume, and the data set levels. It should be able to execute an action at the data set level, including migrating, recalling, deleting, renaming, backing up, and reporting. The reporting tool must be capable of summarizing and reporting on all storage management activities. The tool must be flexible enough to produce ad hoc reports based on several attributes of the stored data, including data set names, size, block-size, record format, storage group names, and management class names. This tool must be fast, easily used, and support both Time Sharing Option (TSO) and ISPF. Mainframe storage reporting should be and remain a mainframe tool. This type of tool will let you access and evaluate the entire mainframe environment in less than 30 minutes daily, and frequently during the day, if necessary.

You should have knowledge of a high-level language such as SAS or Easytrieve. This type of language lets you create solutions that significantly exceed the capabilities of IBM’s ISMF tool. You also should be familiar with a non-disruptive volume mover such as Innovation’s FDRPAS or Softek’s TDMF. These tools are useful for moving entire volumes without impacting data use or the rest of the environment. No stand-alone time is needed to accomplish the entire volume move. Depending on the hardware you use at your shop, you’ll get much value from EMC Control Center or Catalog Solution, IBM utilities, and DFDSS, to name a few. You’ll find that proficiency in some of the Microsoft software products, such as Excel, is invaluable, and will give you the power to help express yourself clearly and graphically.

A challenge you may encounter in today’s environment is gaining a thorough understanding and mastery of the many iterations of Redundant Array of Independent Disks (RAID) technology, as offered by the various hardware vendors, and what each can provide in your shop. Knowledge of RAID-1, RAID-S, RAID-5 and RAID-10 technology and the pros and cons of each can make your organization a success or failure. Further knowledge of the hardware’s storage internals, such as caching, disk adapters and hard drives, will help you when making hardware purchase decisions.

Don’t forget about the mirroring capabilities of the storage currently on the market. Be sure the hardware you choose will accommodate current and future needs. Don’t buy all the “bells and whistles” unless you plan to use them. Sometimes, you can have unused extras on hardware that can cause vendor microcode problems in your storage complex; you don’t need a “Cadillac solution for a Volkswagen implementation.”

7. Join a user group. Knowledge sharing and networking within various user groups can be beneficial. IBM’s SHARE, SAS Users Group International (SUGI), and the Computer Measurement Group (CMG) are just a few of the support groups available. Members of these organizations are willing to share information freely on storage management procedures and methods. Some of these organizations require membership dues, which your employer will probably pay. IBM and EMC each have storage symposiums that meet annually and cover current and future storage technologies. Since these are two prominent vendors, it’s wise to arrange to attend their symposiums when possible. You should also attend customer presentations in which presenters often share storage solutions they developed internally to meet their shops’ specific challenges.

8. Improve your communication skills. Since you need to interact with user groups and higher management, good communication is a requirement. If this is a weak area for you, join a support group geared toward improving your communication skills. Toastmasters International is an example. Effective communication is a skill you must constantly hone anyway, so time spent here is useful no matter which direction your career takes.

As a first-time storage manager, be sure you’re ready to enjoy learning a whole new world. Everyday challenges and rewards will quickly become apparent. The longer you explore this field, the more valuable you’ll become in managing any storage complex. If you don’t learn something new each day, you need to keep digging to advance your knowledge. Welcome to the world of storage managers. Enjoy!

Tasks for a New Storage Manager

Here’s a partial list of some common tasks a storage manager should perform on a set schedule:

Daily:
Monitor space usage for the critical storage groups
Ensure SMS volumes are in the correct SMS status (enabled/disabled)
Identify volumes that have disabled VTOC indexes and correct
Monitor backup and migration jobs and identify data sets that aren’t cataloged.

Weekly:
Defrag essential storage
Monitor third-party software
Monitor DFHSM CDS file (MCDS, BCDS, and OCDS) and DFHSM tape recycle process.

Preventive measures (daily and weekly):
Identify large Key-Sequenced Data Set (KSDS) VSAM files nearing the 4GB limit, data sets not using optimum block sizes, and under-allocated files
Monitor Partitioned Data Sets (PDSes) nearing 16 extents or in need of compression.
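
Several of these checks lend themselves to automation. The sketch below is one possible shape for such a report, assuming the space and extent figures have already been extracted from whatever reporting tool the shop uses. The record layouts, names, and warning thresholds are hypothetical. Only the 4GB and 16-extent limits come from the list above.

```python
from dataclasses import dataclass

VSAM_4GB_LIMIT = 4 * 1024 ** 3   # bytes
PDS_MAX_EXTENTS = 16


@dataclass
class StorageGroup:
    name: str
    capacity_mb: int
    used_mb: int


@dataclass
class DataSet:
    name: str
    kind: str         # "KSDS", "PDS", or anything else
    size_bytes: int
    extents: int


def full_storage_groups(groups, threshold_pct=85.0):
    """Storage groups at or above the (assumed) utilization threshold."""
    report = []
    for g in groups:
        pct = 100.0 * g.used_mb / g.capacity_mb
        if pct >= threshold_pct:
            report.append((g.name, round(pct, 1)))
    return report


def preventive_warnings(data_sets, ksds_pct=80.0, pds_extent_warn=13):
    """Flag KSDSs approaching 4GB and PDSes approaching 16 extents."""
    warnings = []
    for ds in data_sets:
        if ds.kind == "KSDS" and ds.size_bytes >= VSAM_4GB_LIMIT * ksds_pct / 100.0:
            warnings.append((ds.name, "KSDS nearing 4GB limit"))
        elif ds.kind == "PDS" and ds.extents >= pds_extent_warn:
            warnings.append((ds.name, f"PDS at {ds.extents} of {PDS_MAX_EXTENTS} extents"))
    return warnings


# Made-up sample data for illustration only.
groups = [
    StorageGroup("SGPROD", 1_000_000, 900_000),
    StorageGroup("SGTEST", 500_000, 200_000),
]
data_sets = [
    DataSet("PROD.CUSTOMER.KSDS", "KSDS", 3_600_000_000, 5),
    DataSet("PROD.SMALL.KSDS", "KSDS", 1_000_000, 1),
    DataSet("PROD.JCL.CNTL", "PDS", 5_000_000, 14),
    DataSet("PROD.LOADLIB", "PDS", 9_000_000, 3),
]

print(full_storage_groups(groups))
for warning in preventive_warnings(data_sets):
    print(warning)
```

Run through a scheduling product each morning, a report like this keeps the daily review focused on the handful of items that need action.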

Good Luck!

Monday, June 2, 2008

My daughter started pre-school today


My daughter "Nandana" started going to junior KG today and I feel excited and sad because it seems like yesterday that she was a bundle of joy in my arms. It is hard to see our little ones leave, because we are so used to them being there with us, and we are used to being there to protect them. Just the idea of not being there to protect them is enough to set off the panic. I am so sad right now. Yet I am so happy! It’s only the first day but it seems they grow up so fast and I’m panicking! Maybe the next time will be better for me.

Well, I think it's just normal that we are so concerned about our child's first day of school. We really worry about how a child will cope with the stress of school, whether she will be able to make friends, and whether she will be able to adapt to the new activities she will have from this day on. I hope I will be more relaxed tomorrow, and I hope my daughter will enjoy her time at school.

Sunday, June 1, 2008

IBM System z10 Enterprise Class

The future runs on System z: In today’s world, IT is woven into almost everything that a business does and consequently is pivotal to a business. Some of the key requirements today are the need to maximize return on investments by deploying resources designed to drive efficiencies and economies of scale; manage growth through resources that can scale to meet changing business demands; reduce risk by reducing the threat of lost productivity through downtime or security breaches; reduce complexity by reversing the trend of server proliferation; and enable business innovation by deploying resources that can help protect existing investments while also enabling those new technologies that can aid in business transformation.

The IBM System z10™ Enterprise Class (z10 EC) delivers a world class enterprise server designed to meet these business needs. The z10 EC provides new levels of performance and capacity for growth and large-scale consolidation; improved security, resiliency, and availability to reduce risk; and just-in-time resource deployment to help respond to changing business requirements. As environmental concerns raise the focus on energy consumption, the z10 EC is designed to reduce energy usage and save floor space when consolidating distributed servers. Specialty engines continue to help users expand the use of the mainframe for a broad set of applications, while helping to lower the cost of ownership. The z10 EC is at the core of the enhanced System z™ platform that delivers technologies that businesses need today along with a foundation to drive future business growth.

Benefit Feature/Function
Availability/Reliability
CICS subspace group facility
CICS subsystem storage protect
Concurrent Book Add
Concurrent ESCON, FICON, OSA-E and Coupling Link maintenance
Concurrent Hardware Management Console (HMC) and Support Element (SE) maintenance
Concurrent Licensed Internal Code (LIC) maintenance for CP, SAP, SE, PR/SM™, LPAR, HMC, OSA-E
Concurrent power and thermal maintenance
Dual Support Elements
Dynamic Change to Partition Cryptographic Coprocessor Configuration
Dynamic Channel Path Management
Dynamic I/O Reconfiguration
Dynamic memory sparing
Dynamic Oscillator Switchover
Dynamic LCSS Add
Dynamic Subchannel Set Add
Dynamic Partition Add
Enhanced Application Preservation
Enhanced Book Availability
Enhanced Driver Maintenance
Enhanced Dynamic Reconfiguration Management
Enhanced Firmware Simulation
Failure Containment for MBA
Fault Tolerant Interconnect Design
FICON Purge Path Extended
FICON Express4 Pluggable Optics for individual servicing
Fixed HSA
Frame Bolt Down Feature
Hybrid cooling
Multipath IPL—ESCON (CNC), FICON (FC) with z/OS
N+1 power supply technology
OSA-Express2 Link Aggregation Support
OSA-Express2 Network Traffic Analyzer
Partial memory restart
Point to Point SMP Fabric
QDIO Diagnostic Synchronization
Redundant I/O Interconnect
Remote operations support
Sparing for Storage Protect Preservation Keys
System Assist Processor (SAP) reassignment and sparing
System-Initiated CHPID Reconfiguration
Transparent CP Sparing

Security
Advanced encryption standard (AES) 128, 192, 256
Certified for LPAR isolation
Configurable Crypto Express2 (secure coprocessor or SSL acceleration)
CP Assist for Cryptographic Function
Crypto Express2 FIPS 140-2 Level 4 Certified
Data Encryption Standard (DES), Triple DES
Designed to meet Common Criteria Evaluation Assurance Level 5 (EAL5) certification for the security of its logical partitions
LDAP support for HMC user authentication
Open Architecture Distributed Transaction Enablement
Pseudo random number generator (PRNG)
Remote key load for ATMs
Secure hash algorithm-256 (SHA-256), -384, -512
SSL Acceleration for Linux® and z/OS
Tamper-proof Cryptographic Support

Capacity on Demand
API for Capacity Provisioning Management
API for On/Off CoD activation
Capacity Backup for both full and subcapacity CPs
Capacity Backup for zAAP, zIIP, IFL, ICF and SAP
Capacity for Planned Event (CPE)
Capacity Upgrade on Demand
Customer Initiated upgrades
On/Off Capacity on Demand
More than 200 configuration records may be staged on Support Element
Up to four active configurations stored on SEEPROM

Specialty Engines
Integrated Facility for Linux (IFL)
Internal Coupling Facility (ICF)
System z10 Application Assist Processor (zAAP)
System z10 Integrated Information Processor (zIIP)

I/O Connectivity
InfiniBand® (IFB) host bus (6 GBps each); 24 IFBs for I/O HCA2-C Fanout
IBM ESCON CTC native and basic mode
FCP support for SCSI devices by Linux, z/VM® and z/VSE (disks)
Fibre Connection (FICON) 1, 2, 4 Gbps auto-negotiation
Fibre Channel Protocol (FCP) 1, 2, 4 Gbps auto-negotiation
FICON CTC
FICON full duplex data transfer
Full fabric FCP support
IBM ESCON half duplex data transfer
Multiple Image Facility (MIF)
Multiple Subchannel Sets (MSS)
N_Port ID virtualization (NPIV) for FCP
QDIO designed for high speed networking
Up to four Logical Channel SubSystems (LCSS)

Networking
IBM HiperSockets IPv6
OSA for NCP (OSN)
OSA Layer 3 VMAC
OSA-Express2 Layer 2 Support
OSA-Express Integrated Console Controller (1000BASE-T Ethernet)
OSA-Express2 (Gigabit Ethernet, 10 Gigabit Ethernet, 1000BASE-T Ethernet)
OSA-Express3 (10 Gigabit Ethernet)

Cluster Systems
Dynamic CF Dispatching
Dynamic ICF Expansion
Shared ICFs and CPs
Transparent ICF sparing
System-Managed CF Structure Duplexing
Sysplex Distributor
GDPS
Parallel Sysplex® clustering technology
12x IB-DDR (6 GBps)
InfiniBand Coupling links
Internal Coupling channel (IC)
Integrated Cluster Bus-4 (ICB-4)
InterSystem Channel-3 (Peer mode only) (ISC-3)
MBA and HCA-O Fanout
ETR–Sysplex Timer attachment (standard)
Server Time Protocol (STP)
NTP Client support for STP
z/VM Virtual Parallel Sysplex

Performance
Compare-and-move extended
DB2® sort assist
Flexible Memory Options
Hardware-assisted data compression
Hardware Decimal Floating Point
HiperDispatch
Hipersorting
IBM Hiperbatch™
IEEE binary floating point support for advanced IBM Lotus® Domino® and Java™ performance
Long Displacement Facility
Modified Indirect Data Address Word (MIDAW) Facility
OSA Dynamic LAN idle
Perform Locked Operation for enhanced IP performance
Up to 1.5 TB memory

Management
(SE) maintenance
HMC Panel Wizard
HMC and SE IPv6 Support
Internal Battery Feature
Power/thermal
IBM Systems Director Active Energy Manager (AEM) for Linux on System z Support
LPAR Group Capacity Limits
ESCON sparing
Cancel I/O Requests
Power Monitoring Display
Power Estimation tool

z/Architecture
Intelligent Resource Director
Superscalar Processor
Tri-modal addressability
Up to 60 LPARs each with 64-bit central memory addressability