The Advantages of z/TPF in a Hybrid Cloud Architecture

In an article published on the Servers & Storage blog, we described real-world cases where customers tried to move entire high-volume, real-time transaction workloads from IBM Z to the cloud. Each of those efforts was abandoned for one reason or another: extreme costs, the inability to scale or meet tight SLAs, or outright instability in the new environment.

The right approach, of course, is progressive modernization through a hybrid cloud architecture. To better explain why, let's use a sample workload to illustrate the fundamental differences between the IBM Z environment and an exclusively cloud native environment.

Our sample workload transaction involves three applications and three databases (DBs). Figure 1 shows the flow of that sample transaction when all of the applications and DBs reside on z/TPF:

These are the steps required to complete the transaction when it is processed on z/TPF (a code sketch of this flow follows the list):

  1. The system of engagement (SOE) application issues a REST request that flows into the z/TPF system. Application 1 starts processing the transaction.
  2. Application 1 reads data from Database 1 on z/TPF using an exclusive lock to prevent this DB record from being updated by another transaction (on this z/TPF server or other z/TPF servers in the cluster) while the transaction is being processed.    
  3. Application 1 continues processing and calls Application 2. Because the source (Application 1) and target (Application 2) are co-located, this is just a local memory call.
  4. Application 2 reads data from Database 2 on z/TPF. Note that a cached copy of the data in memory might be used, or the data might need to be read from disk.
  5. Application 2 completes its processing and returns the results to Application 1.
  6. Application 1 continues processing and calls Application 3. Because the source (Application 1) and target (Application 3) are co-located, this is just a local memory call.
  7. Application 3 reads data from Database 3 on z/TPF using an exclusive lock to prevent this DB record from being updated by another transaction.
  8. Application 3 finishes its processing, files the updates it made to Database 3, and releases the lock on that DB record.
  9. Application 3 completes its processing and returns the results to Application 1.
  10. Application 1 finishes its processing, files the updates it made to Database 1, and releases the lock on that DB record.
  11. Application 1 completes its processing and the REST reply message is sent back to the SOE.
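
To make the co-location point concrete, here is a minimal, runnable sketch of this flow. The Database class and its method names (read_with_lock, file_update) are hypothetical illustrations, not actual z/TPF APIs; the point is that every step is a local memory operation.

```python
# Minimal sketch of the co-located z/TPF flow in Figure 1. The Database
# class and its methods (read_with_lock, file_update) are hypothetical
# illustrations, not actual z/TPF APIs. Every call below is a local
# memory operation; no network is involved.

class Database:
    def __init__(self):
        self.records = {}                    # record data, keyed by record ID
        self.locks = set()                   # record IDs currently locked

    def read_with_lock(self, key):           # exclusive lock held by caller
        self.locks.add(key)
        return self.records.get(key, {})

    def read(self, key):                     # may be served from memory cache
        return self.records.get(key, {})

    def file_update(self, key, record):      # write the record, release lock
        self.records[key] = record
        self.locks.discard(key)

db1, db2, db3 = Database(), Database(), Database()

def application_2(key):                      # Steps 4-5: read DB 2, return
    return {"db2": db2.read(key)}

def application_3(key):                      # Steps 7-9: lock, update, return
    rec3 = db3.read_with_lock(key)
    db3.file_update(key, rec3)
    return {"db3": rec3}

def handle_rest_request(key):                # Step 1: REST request arrives
    rec1 = db1.read_with_lock(key)           # Step 2: exclusive lock on DB 1
    r2 = application_2(key)                  # Step 3: local memory call
    r3 = application_3(key)                  # Step 6: local memory call
    db1.file_update(key, rec1)               # Step 10: file update, unlock
    return {"reply": [r2, r3]}               # Step 11: REST reply to the SOE

print(handle_rest_request("record-1"))
```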

Now, let's look at that sample transaction in a cloud native environment consisting of three different application server clusters and three different DB server clusters, as shown in Figure 2 (a code sketch that tallies the network round trips follows the list):

  1. The SOE issues a REST request that flows into Application 1 running in a node of the Application Server 1 cluster. Application 1 starts processing the transaction.
  2. Application 1 issues a DB client API to read data from Database 1 on DB Server Cluster 1 using an exclusive lock to prevent this DB record from being updated by another transaction (on this or another application server) while this transaction is being processed.   
  3. Application 1 continues processing and calls Application 2 via a REST API.
  4. Application 2, running in a node of the Application Server 2 cluster, issues a DB client API to read data from Database 2 on DB Server Cluster 2.
  5. Application 2 completes its processing and returns the results to Application 1 in the REST API response.
  6. Application 1 continues processing and calls Application 3 via a REST API.
  7. Application 3, running in a node of the Application Server 3 cluster, issues a DB client API to read data from Database 3 on DB Server Cluster 3 using an exclusive lock to prevent this DB record from being updated by another transaction.
  8. Application 3 finishes its processing, then issues another DB client API to file the updates it made to Database 3 and releases the lock on that DB record.
  9. Application 3 completes its processing and returns the results to Application 1 in the REST API response.
  10. Application 1 finishes its processing, then issues another DB client API to file the updates it made to Database 1 and releases the lock on that DB record.
  11. Application 1 completes its processing and the REST reply message is sent back to the SOE.
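
For contrast, here is the same transaction sketched in the cloud native flow. rest_call and DbClient are hypothetical stand-ins for real REST and DB client libraries, not any particular product's API; the counter simply tallies network request/response pairs.

```python
# The same transaction in the cloud native flow of Figure 2. rest_call and
# DbClient are hypothetical stand-ins for real REST and DB client libraries.

network_round_trips = 0

def rest_call(service, payload):             # one request/response pair
    global network_round_trips
    network_round_trips += 1
    return SERVICES[service](payload)

class DbClient:
    def __init__(self):
        self.records = {}

    def _round_trip(self):
        global network_round_trips
        network_round_trips += 1

    def read(self, key, lock=False):         # client API to a DB server cluster;
        self._round_trip()                   # lock=True requests an exclusive
        return self.records.get(key, {})     # lock held on the DB server

    def file_update(self, key, record):      # second API to write and unlock
        self._round_trip()
        self.records[key] = record

db1, db2, db3 = DbClient(), DbClient(), DbClient()

def application_2(key):                      # Application Server 2 cluster
    return {"db2": db2.read(key)}            # Step 4

def application_3(key):                      # Application Server 3 cluster
    rec3 = db3.read(key, lock=True)          # Step 7: lock held over network
    db3.file_update(key, rec3)               # Step 8: write, release lock
    return {"db3": rec3}

def application_1(key):                      # Step 1: entry point from the SOE
    rec1 = db1.read(key, lock=True)          # Step 2: lock held until Step 10
    r2 = rest_call("app2", key)              # Step 3: REST call over the network
    r3 = rest_call("app3", key)              # Step 6: REST call over the network
    db1.file_update(key, rec1)               # Step 10: write, release lock
    return {"reply": [r2, r3]}

SERVICES = {"app1": application_1, "app2": application_2, "app3": application_3}

rest_call("app1", "record-1")                # the SOE's inbound REST request
print(network_round_trips)                   # prints 8
```

Note that the lock on DB Record 1 is acquired on the second round trip and not released until the eighth; every intervening hop extends the lock hold time, which ties into the contention discussion below.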

Figures 1 and 2 illustrate major differences between these architectures that impact both cost and performance:

CPU Consumed


Each network flow consumes additional CPU resources on both the client and server sides of the pipe because many protocol stacks are involved, such as TCP/IP, SSL (TLS), HTTP, REST, and DB-specific client/server protocols. For example, in the z/TPF architecture, a transaction goes through a TCP/IP stack and SSL stack two times each, whereas in the cloud native architecture, a transaction goes through a TCP/IP stack and SSL stack 26 times each. Similarly, in the z/TPF architecture, a transaction goes through an HTTP stack and REST stack two times each, whereas in the cloud native architecture, a transaction goes through an HTTP stack and REST stack 10 times each.
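
As a sanity check on one of those counts, here is one way to tally the HTTP/REST stack traversals that is consistent with the totals above. The tally model (2 traversals for a pair whose client sits outside the environment, 4 for a fully internal pair) is my own illustration, not an official accounting:

```python
# Tallying HTTP/REST stack traversals inside each environment.
# External client pair: server receive + server send = 2 traversals.
# Internal pair: client send/receive + server receive/send = 4 traversals.

def rest_traversals(client_external_flags):
    return sum(2 if external else 4 for external in client_external_flags)

print("z/TPF:", rest_traversals([True]))                  # SOE <-> z/TPF: 2
print("cloud:", rest_traversals([True, False, False]))    # 3 REST pairs: 10
```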

Latency


Each network flow adds latency to the transaction, even if the various server clusters in the cloud native environment are all within one physical data center. If some of these servers are geographically distant from the others (for example, across different public clouds), latency can be orders of magnitude higher. In the z/TPF architecture, there is only 1 request/response pair over the network (1 REST API) versus 8 request/response pairs over the network (3 REST APIs, 5 DB APIs) in the cloud native environment. Note: network latency isn't the only issue; there is also additional latency within each server node to dispatch a task once it arrives.
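
To put rough numbers on this, here is a back-of-the-envelope calculation. The round-trip times (RTTs) are illustrative assumptions, not measurements from the article:

```python
# Back-of-the-envelope network latency per transaction. The RTT values
# below are illustrative assumptions, not measurements.

PAIRS = {"z/TPF": 1, "cloud native": 8}      # request/response pairs per transaction

for scenario, rtt_ms in [("same data center", 0.5), ("cross-region", 30.0)]:
    for env, pairs in PAIRS.items():
        print(f"{scenario}, {env}: ~{pairs * rtt_ms:g} ms of network latency")
# same data center: z/TPF ~0.5 ms vs. cloud native ~4 ms
# cross-region:     z/TPF ~30 ms  vs. cloud native ~240 ms
```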

DB Lock Contention


The locks on DBs 1 and 3 are not held across any network calls in the z/TPF architecture. In the cloud native environment, the lock on DB 3 is held over 2 network flows, but the more concerning issue is that the lock for DB 1 is held across 12 network flows, which ties back into the latency concerns, especially if all the servers are not physically located near each other. The longer lock hold times dramatically increase the likelihood of DB lock contention, and as any DB architect knows, excessive DB lock contention makes it impossible to achieve true scalability.

DB Caching


z/TPF environments typically have 1 to 8 servers in the cluster, each capable of caching terabytes (TB) of information in memory. The latest IBM Z box, the z15, supports up to 40 TB of memory. This means you can cache some entire DBs in memory, with the result that most DB read operations are satisfied from memory with no physical I/O involved. For transactions reading dozens of DB records, that makes a huge difference in latency and requires less context switching overall. This becomes even more important when you are reading DBs while holding locks on other DB records. In the cloud native environment, an application server cluster typically consists of hundreds, if not thousands, of server instances, making it impractical to cache data on all those nodes, let alone keep the cached data in sync across them. This means application servers are always communicating with external DB server nodes to read DB records.
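
The caching advantage boils down to a read-through cache that can actually fit the data. A minimal sketch, assuming a dict as a stand-in for the physical DB (the CachedDatabase API is hypothetical):

```python
# Minimal read-through cache sketch (hypothetical API): with enough memory
# to hold the whole DB, every read after the first is a pure memory access.

class CachedDatabase:
    def __init__(self, disk):
        self.disk = disk                     # dict as a stand-in for DASD
        self.cache = {}                      # in-memory copies of records
        self.physical_reads = 0

    def read(self, key):
        if key not in self.cache:            # cache miss: one physical I/O
            self.physical_reads += 1
            self.cache[key] = self.disk[key]
        return self.cache[key]               # cache hit: memory only

db = CachedDatabase(disk={"record-1": {"name": "SMITH"}})
db.read("record-1")                          # first read goes to disk
db.read("record-1")                          # repeat reads are memory only
print(db.physical_reads)                     # prints 1
```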

High Availability (HA) of the Databases


While not explicitly illustrated in Figures 1 and 2, the handling of DB updates can further impact server CPU requirements, as well as latency, which in turn affects lock hold times and, thus, DB lock contention. In the z/TPF architecture, whenever a DB record is updated, z/TPF writes the data to two places in the physical DB on separate DASD control units (CUs) so that there is no single point of failure, ensuring true high availability (HA). z/TPF writes the data in parallel to both DASD CUs so as not to impact transaction latency. Contrast this with other DB architectures that either don't have HA, have eventually consistent HA (meaning the primary DB replicates data to other DB servers after the fact, creating the potential that some data is unavailable or old/stale data is read if the primary DB server fails), or have the primary DB server wait for a copy of the data to harden in at least one other DB server before completing the update operation, thus increasing I/O latency.
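
Conceptually, the duplexed write looks like the following sketch, where write_to_cu is a hypothetical stand-in for a physical DASD write, not z/TPF's actual I/O interface. The key property is that the two writes are issued in parallel, so hardening both copies adds no serial latency:

```python
# Conceptual sketch of a duplexed DB write: one record written to two
# control units in parallel, completing only when both copies are hardened.
# write_to_cu is a hypothetical stand-in, not z/TPF's actual I/O interface.

from concurrent.futures import ThreadPoolExecutor

def write_to_cu(cu_name, record):
    # placeholder for the physical write to one DASD control unit
    return f"{cu_name}: hardened {record}"

def file_record_ha(record):
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(write_to_cu, cu, record)
                   for cu in ("CU-A", "CU-B")]
        # both writes run in parallel, so duplexing adds no serial latency;
        # the update completes only after both copies are hardened
        return [f.result() for f in futures]

print(file_record_ha({"key": "record-1", "balance": 100}))
```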

Disaster Recovery (D/R) of Databases


In the z/TPF architecture, after an I/O operation completes, the DASD CU then replicates the data to the D/R site without the host (in this case, z/TPF) being involved, meaning zero z/TPF CPU is consumed to achieve D/R. Contrast this to most other DB architectures where the DB server is responsible for D/R, meaning DB server CPU is consumed to send copies of the data to the D/R site.     

To summarize, the provided examples clearly showcase a few stark differences between these architectures: most notably, the amount of CPU consumed to process a given workload, and the difference in complexity when managing and debugging an environment with a single server cluster as opposed to multiple clusters. And it's worth noting that our example is generous: a typical z/TPF workload uses dozens of applications and DBs, and attempting to scale that in a distributed environment over dozens of server clusters is drastically more complex.

What's not as obvious are the ramifications of extra latency for high volume workloads. In the sample transaction on z/TPF, let's say that DB Record 1 is locked for 1 millisecond (ms), meaning the time between Steps 2 and 10 is only 1 ms because all processing occurs locally on the same server. Thus, you can scale up to 1000 transactions per second that update that same record DB. In the cloud native environment, let's say DB Record 1 is locked for 20 ms because of the additional latency involved. You can only scale up to 50 transactions per second that update that same record DB. So, until scientists can figure out a way to cheat the speed of light, these limitations are brick walls. And while additional hardware might resolve issues with scaling your application servers, you won’t be able to fix lock contention problems this way.

Besides high lock contention, another common blocker for scaling up a DB workload is a DB server spending so much time replicating data to other servers (including the D/R site) and processing replication data received from other DB servers that it no longer has enough cycles to process all the transactional requests received from applications. Again, throwing more hardware (additional DB servers) at the problem only makes a bad situation worse: where you once had N servers and N copies of the data, you now have N+1 copies, meaning even more replication work and demand on your systems.
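
A quick sketch of why adding servers backfires here, assuming (as the paragraph above does) one full copy of the data per server plus a D/R copy; the fan-out model is my own illustration:

```python
# With full replication, each write must be propagated to every other copy
# (plus D/R), so total replication messages per write GROW as servers are
# added. Assumes one full copy of the data per server plus a D/R copy.

def replication_msgs_per_write(n_servers, dr_copies=1):
    return (n_servers - 1) + dr_copies       # peer copies plus D/R

for n in (2, 3, 4):
    print(f"{n} DB servers -> {replication_msgs_per_write(n)} "
          f"replication messages per write")
```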

To summarize the data for this sample transaction workload: z/TPF uses 1 request/response pair over the network versus 8 in the cloud native environment, traverses each protocol stack a fraction as many times, and holds the lock on DB Record 1 across zero network flows versus 12.

While z/TPF does have significant advantages in performance, cost, and manageability, there are certain workloads that are better suited to a cloud native environment. In the next part of this series, we will start to explore the types of workloads that are appropriate for hybrid cloud.