Master the Mainframe Global User Group

How do you run z/OS on a day to day basis ? Welcome to mission control

  • 1.  How do you run z/OS on a day to day basis ? Welcome to mission control

    Posted Wed August 05, 2020 03:08 AM

    I remember going into the Operations centre of a large bank - wow - it was like mission control for the moon rocket - 100+ people, lots of desks, with big screens on the walls.  There was a map of the world showing where the major data centres were: London, Beijing and New York were all green.  If there was a problem they would change colour.  There were charts showing 20,000 transactions a second with response time under 1 second.  Another transaction was "amber" because its response time was over 1.0 seconds.

    At the back of the room was the operations management team responsible for the day to day running of the systems.  In a side room off the back was "the war room", for managing critical situations - from "long response time" to "system unavailable".  I remember being involved in one "crit-sit" where the customer had hourly calls with IBM for nearly 24 hours, until they were up and running.

    The people on the desks were the "operations staff" or "applications staff".   Their job was to keep an eye on things, and sort out any problems. Automation does most of the day to day operations, but if you are told to "move work from this system to that system", you need to know how to do it.  A good day in operations is when nothing happens!

    In the operations room there were operations desks or areas for each major functional area: Mainframe, Disks, Networks, z/OS, CICS, WAS, MQ, IMS, DB2, TCP/IP, Security, Automation.  There were areas for people monitoring the performance and availability of each area.  There were desks at the back for "overall availability", for example deciding whether or not to fail over, or when to get the vendors on the phone.

    As I worked for IBM I was there during a major upgrade of MQ - just in case anything went wrong. They had planned the upgrade for 6 months, and had 4 hours to make the upgrade.  If it was not ready after 3 hours they had to roll back the changes.  If they had to roll back, the next opportunity to do the upgrade was 3 months later, so they wanted it to work.  They had a spreadsheet with every command they needed to issue - and another column for the command to undo the change.  They only used cut and paste, and did no typing, because typing is slower and error prone.  Cutting and pasting meant they could issue exactly the command they had tested with.
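
    The spreadsheet approach above (one column of commands, one column of undo commands, and a cut-off after which you roll back) could be sketched roughly as below. This is only an illustration, not what the bank actually ran: the names `Step`, `run_upgrade`, `issue` and `elapsed_hours` are hypothetical, and the sketch assumes a command that fails has changed nothing, so only the steps that completed are undone.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    do: str    # pre-tested command text, pasted verbatim (one spreadsheet column)
    undo: str  # command that reverses it (the other spreadsheet column)


def run_upgrade(
    steps: List[Step],
    issue: Callable[[str], bool],        # hypothetical: sends a command, True on success
    elapsed_hours: Callable[[], float],  # hypothetical: hours since the window opened
    rollback_after_hours: float = 3.0,   # cut-off taken from the story above
) -> bool:
    """Apply steps in order; on failure or past the cut-off, undo in reverse order."""
    applied: List[Step] = []
    for step in steps:
        # Check the clock first so no new command is issued once the cut-off has passed.
        if elapsed_hours() >= rollback_after_hours or not issue(step.do):
            # Assumption: the failed command changed nothing, so undo only completed steps.
            for done in reversed(applied):
                issue(done.undo)
            return False
        applied.append(step)
    return True
```

    Undoing in reverse order matters because later changes often depend on earlier ones.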

    I was allowed to look at screens - but not touch a keyboard.  The local team called me over to ask a question, and I had to ask them to "scroll down" to the next page - it was very hard to resist pressing the key myself!

    We had worked through the night, so after the upgrade was successful, we left at 0700, had a slap-up team breakfast, and I went to the hotel to sleep.

    Colin Paice

  • 2.  RE: How do you run z/OS on a day to day basis ? Welcome to mission control

    Posted Thu August 06, 2020 05:18 AM
    It's fascinating how these operations rooms came and went with
    automation and containerization and the overall goldfish vs pet
    attitude that now permeates the devops space (might be a perspective
    thing from where I sit). The once fashionable large monitors we had
    hanging from the ceiling that once displayed real-time operation
    metrics now show continuous delivery pipeline issues, when they
    happen, which is not often.

    Some groups now play YouTube videos of sandy beaches.

    Ricardo Bánffy

  • 3.  RE: How do you run z/OS on a day to day basis ? Welcome to mission control

    Posted Fri August 07, 2020 09:32 AM
    This could be a company size thing, but you have to be careful when discussing operations centers.  It's true that the application side moving to devops has reduced some of the babysitting.  However, if you are responsible for the hardware and software infrastructure, there are still large rooms with monitors.  Not nearly as many people are in those rooms due to remote workers, even before the COVID-19 issues, but they are still there.  Our operations center monitors multiple data centers, the nationwide network, the server infrastructure, connections to business partners and yes, those z/OS mainframes that hardly get noticed until the occasional problem kicks in.  There are several lines of business, each with their own devops groups, but when a connection to, say, a credit reporting service is not working, that moves to the main operations people to investigate and work with the partner company to get the issue resolved.  I agree with you about the goldfish / pet thing but we refer to it as cattle / pets.

    Steven Lauretti

  • 4.  RE: How do you run z/OS on a day to day basis ? Welcome to mission control

    Posted Fri August 07, 2020 09:55 AM
    @Colin Paice -  Every IBMer should have the opportunity to live in the client's world - especially in the mission critical data centers.  I lived in this world for over 20 years before joining IBM.  I learned that many IBMers do not appreciate the stresses and challenges of our clients.  IBM's new CEO made a comment the other day to IBMers.  He said IBMers need to have a 'sense of urgency'.  I could not have agreed more.  Why is a 'sense of urgency' important?  Colin just articulated it.

    Paul Newton