When I was at university 40 years ago, I remember meeting someone from IBM who was "doing the milk round", where corporations tried to get students to join their graduate schemes. I was asked "would you rather be a programmer, or a systems programmer?" At the time I didn't know what a systems programmer was, but I have spent my working life on the systems programming side of things.
When I had been at IBM for 10 years, my parents finally got around to asking what I did for a job, and why I was being sent to customers. I told them all about it. Their reply was "yes dear, would you like some more cake?"
As a systems programmer, the cost of getting something wrong can be measured in millions, or hundreds of millions, of dollars. It is easy to lose a corporation's reputation, and a government can issue huge fines.
At the top is the system architect. This person is responsible for the configuration of the environment. There will be a z/OS architect, a CICS architect, etc. These people need to ensure
The systems programmers implement the products. How many LPARs do we need? How many coupling facilities? What work do we run in each LPAR or coupling facility? Can we move work from here to there - transparently? Do we need to set up CICS/DB2/IMS/MQ so they run in a sysplex? What do we need to share? We need to set up security. This is for people logging on to the mainframe to administer it, but also for customers using the applications. What auditing do we need?
The data designer is responsible for the implementation of the database or MQ queues. For a database it may mean having accounts A to C in this table (or CF structure) and accounts D to F in that table, because we need 100 disks to hold the database, and need to spread the data across the volumes. It may mean having these databases on these disks, on a different disk subsystem to those databases. What indexes do we need to provide fast access? For MQ, the question is where to put the queues.
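As a rough illustration of the "accounts A to C here, accounts D to F there" idea, here is a minimal Python sketch of range partitioning by the first letter of the account name. The table names, and the third range covering G to Z, are made up for the example. A real design would be expressed in the database's own partitioning definitions and would take data volumes and disk layout into account.

```python
import bisect

# Hypothetical layout: the last initial each partition covers, and the
# (made-up) table that holds that range of accounts.
PARTITION_UPPER_BOUNDS = ["C", "F", "Z"]
PARTITION_TABLES = ["ACCOUNTS_A_C", "ACCOUNTS_D_F", "ACCOUNTS_G_Z"]


def table_for_account(account_name):
    """Return the table that holds this account, based on its first letter."""
    initial = account_name[:1].upper()
    if not ("A" <= initial <= "Z"):
        raise ValueError(f"cannot place account name {account_name!r}")
    # Find the first partition whose upper bound is >= the initial.
    index = bisect.bisect_left(PARTITION_UPPER_BOUNDS, initial)
    return PARTITION_TABLES[index]


if __name__ == "__main__":
    for name in ["Baker", "dixon", "Zhang"]:
        print(name, "->", table_for_account(name))
```

The point of the sketch is that the ranges are decided once, up front. Changing them later means moving a lot of data.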
If you make a mistake - it can stay forever. For example, you define a new column in an SQL table, for all of the 100 million accounts. If you get the definition wrong (for example the field is not wide enough) there is no "delete column". Copying this table to a new table without the bad column is impractical.
Capacity planning people monitor the usage of the systems, for example CPU usage and disk response time. They might say "if we can move this work from midnight to 0400, then our peak usage will go down, and we will pay less money for the CPU. If we replace our old processors with the newest ones, we can go from 8 processors down to 4 processors, and get more CPU at the same time. It will also reduce the air conditioning requirements, and power requirements". In some US states, the state will not allow more power into the site. You have to manage with what you have, so to get more processors, you need to upgrade to systems using less power; less heat requires less air conditioning, which uses less power.
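The "move this work from midnight to 0400" argument can be sketched in a few lines of Python. The workload names and CPU figures below are invented for illustration, and real capacity planning would use measured data rather than a flat number per hour.

```python
def peak_usage(workloads):
    """Return (peak CPU, hour of the first peak) for a list of workloads.

    Each workload is (start_hour, duration_hours, cpu_units) and is assumed
    to use a constant amount of CPU for every hour it runs.
    """
    totals = [0] * 24
    for start_hour, duration_hours, cpu_units in workloads:
        for offset in range(duration_hours):
            totals[(start_hour + offset) % 24] += cpu_units
    peak = max(totals)
    return peak, totals.index(peak)


# Hypothetical figures, in arbitrary CPU units.
settlement = (0, 2, 400)        # starts at midnight, runs for 2 hours
online_day = (8, 10, 500)       # daytime online work, 0800 to 1800
backups_midnight = (0, 3, 300)  # backups starting at midnight
backups_0400 = (4, 3, 300)      # the same backups moved to 0400

before = [settlement, online_day, backups_midnight]
after = [settlement, online_day, backups_0400]

if __name__ == "__main__":
    print("before:", peak_usage(before))
    print("after: ", peak_usage(after))
```

With these made-up numbers the peak drops from 700 units at midnight to 500 units during the online day, even though the total amount of work is unchanged.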
Operators keep the systems running. Their biggest challenge is knowing what to do when something unusual happens.
Installers/upgraders apply fixes, or upgrade releases. Often there is a group of SMP/E experts who install fixes, and say to the systems programmers "here you are... over to you".
System tester is an important role and requires wide experience. Some customers' test systems have more capacity than production, and they drive the test system at peak production throughput plus 25%, so they know their system can handle it without running out of capacity, hitting a database lock, or slowing down because of I/O. They may drive the "site failover test" where you turn off the power to the test system, and hope the work will seamlessly flow to the alternate system, and the end user does not notice. The test designers tend to be senior people who understand the environment, and may have been a sys prog.
There may be a performance team within each subsystem (z/OS, CICS, DB2) or one team for the whole organisation. There is some overlap in the skills - for example the CICS person needs to know about disk response times, but the z/OS person may not know how to tune CICS. The performance job is never done. If you improve one area, the bottleneck will move.
The job can range from reducing CPU, which saves money because you do not need to upgrade the system, to reducing transaction response time, to working with the different teams to move resources around. For example, this CF has contention; if you move this structure to that one, it will smooth things out.
The security team makes the system secure. This means ensuring people have enough access to do their jobs - but no more. They need to monitor violations and see if people are trying to hack in.
The network team can cover the hardware front end routers (and packet scanners), which can provide workload balancing and stop unauthorised packets. It also covers the TCP and VTAM software, including IP addresses (each LPAR has its own IP address, and the sysplex has its own external address), and tuning of the network. This includes tuning the nodes in a network to get the best performance, and to prevent flooding of packets.
The team managers run the teams.
The availability manager is responsible for the availability of the systems, from planned outages to unplanned outages, and for coordinating changes. I remember attending a week-long session when the customer crawled through scenarios. The manager had said "if we are down for a minute it costs us a million dollars. If we are down for a day, we are out of business". They found that some servers were in a locked room, and there was only one key, which someone had put in their pocket and taken home - whoops. Having to break the door down would have extended any outage.
The incident manager runs the incident or war room when a problem occurs which affects the services provided. This role may have to report to senior managers (or the board) about the status of the problem, and the actions being taken. This person needs to be calm and see the bigger picture. "If you cannot fix it in 5 minutes we must consider switching to the backup system".
There are lots of roles, and the work can be technically challenging, but I enjoyed my sys prog days. If you love coding applications, a systems programmer job may not be for you, as it is a different skill set and attitude. At the end of the day you get a feeling of "I built that".