In the last post, I started talking about the resources required to run a software solution that may be distributed across many hosts. I mentioned that the risk of a solution running into resiliency issues increases with the number and complexity of its resources and with the complexity of the solution itself.
For a given solution, how do we know what resources are being used? The answer depends entirely on the solution design and implementation. For example, consider a simple solution that consists of one HTTP servlet running inside WebSphere Application Server. The servlet returns some HTML content in reply to a user HTTP request. By default, this solution uses the WebSphere WebContainer thread pool: when the user HTTP request arrives inside WebSphere Application Server, a thread from this pool is used to run the servlet. WebSphere Application Server itself, which provides the runtime for this servlet, runs inside a Java Virtual Machine, which is another resource that consumes operating-system-level resources, namely CPU, RAM, disk I/O, and network I/O. If the operating system is virtualized, as in a VMware VM or an IBM Power LPAR, then other resources associated with the virtualization infrastructure (the hypervisor) also come into play to manage virtual CPUs, memory, disk I/O, and network I/O.
As you can see, a very simple solution consisting of one servlet requires a lot of resources to run. A real-world solution that is distributed across many hosts, leveraging many different technologies, will have many more resources.
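To make the JVM layer of the servlet example a little more concrete, here is a minimal sketch that lists a few JVM-level resources using the standard java.lang.management beans. The class name JvmResourceSnapshot and the map keys are my own, chosen for illustration; this is not a WebSphere API, and it only shows what any JVM reports about itself.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of WebSphere: reports a few JVM-level resources.
public class JvmResourceSnapshot {

    // Collect a handful of figures from the standard JMX platform beans.
    static Map<String, Long> snapshot() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        Map<String, Long> values = new LinkedHashMap<>();
        values.put("availableProcessors", (long) Runtime.getRuntime().availableProcessors());
        values.put("heapUsedBytes", memory.getHeapMemoryUsage().getUsed());
        values.put("heapMaxBytes", memory.getHeapMemoryUsage().getMax()); // -1 when undefined
        values.put("liveThreads", (long) threads.getThreadCount());
        return values;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Running this inside an application server JVM would include the server's own threads in the live thread count, which is one quick way to see how much sits underneath a single servlet.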
What we are after is not just the identification of the resources required by a solution, but the identification of the resources required by each transaction the solution supports. Below are various ways to identify such resources:
A. Architecture Diagrams
Although UML diagrams come to mind when we talk about architecture diagrams, I have not found many customers who use them. UML diagrams standardize how the business and technical intent of a solution is communicated, but they are not really required to identify solution resources. I have found that many customers use freeform diagrams to communicate architectural concepts. If you google freeform diagrams, you will probably get hand-drawn diagrams. In many cases, that’s what I get in my customer engagements. In other cases, I get diagrams drawn for me on a whiteboard as needed. Note that these diagrams help identify coarse-grained resources, but not fine-grained ones. For example, they will show a WebSphere Application Server topology consisting of a deployment manager node, node agents, and solution clusters, but they will not show the fine-grained resources, such as thread pools, inside the solution cluster members that are used by the various transactions the solution supports.
What I am really trying to build are end-to-end request-response diagrams, one for each business transaction.
1. End-to-End Request-Response Diagrams
For each business transaction, identify the entry point(s) and the various hops (resources) required to support the business transaction. For example, the traditional 3-tier web application, shown in Figure 1, typically supports transactions that are started by a user at the web browser. The requests are intercepted by a load balancer, which forwards them to a web server. The web server then determines which application server JVM is to receive each request (for example, because the request belongs to a session that lives on one of the cluster members). Once the request is inside the application server, a servlet or a JSP runs to execute it and makes a JDBC call to a database to perform a CRUD (Create, Read, Update, or Delete) operation. Finally, the servlet builds a response to return to the user waiting at the web browser.

Figure 1: request-response high-level flow
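The hops of the 3-tier example can also be written down as data, which is handy when you have one flow per business transaction to keep track of. The following is only a sketch: the class TransactionFlow, the Hop type, and the role descriptions are hypothetical names I made up for illustration, and the hop list simply mirrors the coarse-grained resources described above.

```java
import java.util.List;

// Hypothetical model of one end-to-end request-response flow.
public class TransactionFlow {

    // One hop in the flow: a coarse-grained resource and what it does for the request.
    static final class Hop {
        final String resource;
        final String role;

        Hop(String resource, String role) {
            this.resource = resource;
            this.role = role;
        }
    }

    // Hops of the 3-tier example, in request order.
    static List<Hop> threeTierFlow() {
        return List.of(
            new Hop("Web browser", "user starts the transaction"),
            new Hop("Load balancer", "forwards the request to a web server"),
            new Hop("Web server", "selects the application server JVM"),
            new Hop("Application server", "runs the servlet or JSP"),
            new Hop("Database", "executes the CRUD operation over JDBC"));
    }

    // Render the request path as a single line of text.
    static String requestPath(List<Hop> hops) {
        StringBuilder path = new StringBuilder();
        for (Hop hop : hops) {
            if (path.length() > 0) {
                path.append(" -> ");
            }
            path.append(hop.resource);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(requestPath(threeTierFlow()));
    }
}
```

A list like this is a starting point only; each hop still has to be opened up to find the fine-grained resources inside it.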
What we just described is the top level of a request-response flow, highlighting the coarse-grained resources. The fine-grained resources inside each coarse-grained resource also have to be identified to ensure that they are configured properly for resiliency. Figure 2 shows two fine-grained resources inside the App Server box. Although it is not clear in the diagram, a WebContainer thread pool (WCTP) thread is used to execute the user request. This thread is busy until the user gets the reply. One of the things the thread does is acquire a connection from the connection pool (CP) to send a SQL statement to the database for execution.

Figure 2: request-response flow inside App Server
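The interaction between the two fine-grained resources can be simulated with plain JDK classes. This is a sketch under stated assumptions, not how WebSphere implements its pools: a fixed-size ExecutorService stands in for the WebContainer thread pool, a Semaphore stands in for the connection pool, and the pool sizes, the 50 ms "SQL" delay, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation of a request thread that needs a pooled connection.
public class FineGrainedResources {

    // Illustrative sizes only; real values come from the server configuration.
    static final int THREAD_POOL_SIZE = 4;
    static final int CONNECTION_POOL_SIZE = 2;

    // Stand-in for a connection pool: one permit per connection.
    static final Semaphore connectionPool = new Semaphore(CONNECTION_POOL_SIZE);

    // Simulates one request. The worker thread is occupied for the whole call,
    // including any time spent waiting for a connection.
    static String handleRequest(int id, long waitMillis) throws InterruptedException {
        if (!connectionPool.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
            return "request " + id + ": timed out waiting for a connection";
        }
        try {
            Thread.sleep(50); // pretend the database is executing the SQL statement
            return "request " + id + ": ok";
        } finally {
            connectionPool.release(); // always hand the connection back
        }
    }

    // Submits the given number of requests and returns how many completed successfully.
    static int run(int requests, long waitMillis) throws Exception {
        ExecutorService threadPool = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
        try {
            List<Future<String>> replies = new ArrayList<>();
            for (int i = 1; i <= requests; i++) {
                final int id = i;
                replies.add(threadPool.submit(() -> handleRequest(id, waitMillis)));
            }
            int ok = 0;
            for (Future<String> reply : replies) {
                if (reply.get().endsWith("ok")) {
                    ok++;
                }
            }
            return ok;
        } finally {
            threadPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("completed: " + run(4, 5000));
    }
}
```

With four threads and only two connections, two of the requests have to wait for a connection while still holding their threads. Shortening the wait passed to run makes some requests time out instead, which is the kind of behavior you want to understand before tuning either pool.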