A common question comes from customers who want to replace their APM/Observability vendor but must still cover monolith applications, and who arrive with a “Method Level Tracing” requirement.
Why do customers want it?
Customers use this analysis of every method call inside the monolith to find the longest-running methods, then point their performance engineers at those methods to make the monolith run faster. As Instana product people, we would rather find a better way to make the monolith run faster than copy other APM vendors’ approach. We have had this discussion with several customers, and they like the modern approach to modernizing the monolith. The main reason is where the performance work ends up focused: customers using the traditional tooling do not know whether their improvements will affect critical end user experiences or less performance-critical backend jobs that really don’t need to run faster.
Looking at a better approach with Instana
Let’s start with the pain point: a customer wants to make the monolith run faster. We suggest measuring the specific cases where real users were impacted (a latency threshold was exceeded, a high error rate was observed, and so on) and then improving only those cases. That way we can be sure the performance engineers are improving the parts of the monolith that impact end users, not things that do not matter to the business, such as overnight batch processing jobs. We can even measure the impact of the performance engineers’ work if we set up service level objectives (SLOs) against the monolith and track them as the end user experience improves.
The setup
First, we need to describe the various functions the monolith performs for our end users. For example, a single monolith Java application for banking may perform checking account functions, money transfer functions, and nightly reporting functions for audit purposes.
In this example, we would want to set up Instana application perspectives for each of our key services, so one for the monolith entry point for checking account functions, another for money transfer functions, and a third for nightly reporting functions.
Now we can set up SmartAlerts for each of our application perspectives, defining the boundaries beyond which we should be alerted. Let’s say any checking account function should complete within 500 milliseconds, money transfers within two seconds, and nightly reporting functions in less than one hour. Since Instana captures 100% of our real user traffic, we will get an alert for every instance in which real traffic exceeds the boundaries we have set. With this approach, we also avoid working on performance improvements that do not affect the end user experience.
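To make the boundaries concrete, here is a minimal Java sketch that models the three example thresholds as plain data and checks an observed latency against them. It is not Instana’s API or SmartAlert configuration; the class name, the perspective names, and the exceeds method are all assumptions made up for illustration.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: this is NOT Instana's API. It models the latency
// boundaries from the banking example as plain data.
class LatencyBoundaries {

    // Hypothetical perspective names mapped to the example thresholds.
    static final Map<String, Duration> BOUNDARIES = new LinkedHashMap<>();
    static {
        BOUNDARIES.put("checking-account", Duration.ofMillis(500));
        BOUNDARIES.put("money-transfer", Duration.ofSeconds(2));
        BOUNDARIES.put("nightly-reporting", Duration.ofHours(1));
    }

    // Returns true when an observed call took longer than the boundary
    // configured for its perspective.
    static boolean exceeds(String perspective, Duration observed) {
        Duration limit = BOUNDARIES.get(perspective);
        if (limit == null) {
            throw new IllegalArgumentException("Unknown perspective: " + perspective);
        }
        return observed.compareTo(limit) > 0;
    }

    public static void main(String[] args) {
        // A 750 ms checking account call breaches the 500 ms boundary.
        System.out.println(exceeds("checking-account", Duration.ofMillis(750)));  // true
        // A 1.5 s money transfer is inside the 2 s boundary.
        System.out.println(exceeds("money-transfer", Duration.ofMillis(1500)));   // false
        // A 45 minute nightly report is inside the 1 hour boundary.
        System.out.println(exceeds("nightly-reporting", Duration.ofMinutes(45))); // false
    }
}
```

The point of keeping one boundary per perspective is that the same latency can be a problem in one function and irrelevant in another: 750 ms matters for a checking account lookup but not for a nightly report.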
Improve the experience
Now, after letting Instana observe your monolith for some time, we can look at each application perspective where we are getting alerts and find the specific user experiences with the slowest latency to focus on. Clicking into the trace, we will see the stack trace where the monolith was slow, including the exact time it took and the timings of any external calls, which could be to databases or other services. Many times, this alone will point to the cause, for example a badly performing database or another slow service. If the problem is within the monolith itself, we can use our “always-on” Java profiler to see where the time was spent. You will see a stack trace of all the method-level calls, which reveals any hot spots. This is where a performance engineer can be pointed to improve the experience for just the outliers that actually impact the end user. We expect this method to work for the vast majority of performance problems in a monolith.
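As a rough illustration of how sampled stack traces surface hot spots, the Java sketch below counts how often each method appears as the innermost frame across a set of samples and ranks the results. It is a generic sampling idea, not a description of how Instana’s profiler works internally; the class name and the method names in the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a generic way to rank hot spots from sampled stacks.
// This does not describe the internals of any particular profiler.
class HotSpotSketch {

    // Each sample is one captured stack, innermost (currently executing)
    // frame first. Returns frames ranked by how often they were on top.
    static List<Map.Entry<String, Integer>> hottestFrames(List<List<String>> samples) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> stack : samples) {
            if (!stack.isEmpty()) {
                counts.merge(stack.get(0), 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((a, b) -> b.getValue() - a.getValue()); // most frequent first
        return ranked;
    }

    public static void main(String[] args) {
        // Hypothetical samples taken while slow money transfers were running.
        List<List<String>> samples = List.of(
            List.of("Ledger.recalculate", "Transfer.execute"),
            List.of("Ledger.recalculate", "Transfer.execute"),
            List.of("Jdbc.query", "Transfer.execute"),
            List.of("Ledger.recalculate", "Account.balance"));

        Map.Entry<String, Integer> top = hottestFrames(samples).get(0);
        System.out.println(top.getKey() + " seen in " + top.getValue() + " samples");
        // prints "Ledger.recalculate seen in 3 samples"
    }
}
```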
Need to dig deeper?
In the rare cases where we need more observability into the monolith, the Java performance engineer can use Instana’s configuration-based Java Trace SDK to instrument the code in areas that aren’t performing well. That deeper visibility then flows directly into the traces and SmartAlerts we have already set up.
Maintain this over time
Once the monolith is performing the way you would like, you can set up a service level objective (SLO) for each of the key services the monolith provides and track compliance over time.
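For readers new to SLOs, the arithmetic behind a latency objective can be sketched in a few lines of Java: compliance is the share of calls that finished within the threshold, compared against a target. This is a simplified model under assumed names (SloCompliance, compliance, meetsTarget) and made-up latencies and targets, not Instana’s SLO feature or its calculation.

```java
import java.util.List;

// Illustrative only: a simplified latency SLO calculation, not Instana's.
class SloCompliance {

    // Fraction of calls whose latency is at or below the threshold.
    static double compliance(List<Long> latenciesMs, long thresholdMs) {
        if (latenciesMs.isEmpty()) {
            return 1.0; // no traffic means nothing violated the objective
        }
        long good = latenciesMs.stream().filter(l -> l <= thresholdMs).count();
        return (double) good / latenciesMs.size();
    }

    // True when the measured compliance reaches the target (e.g. 0.95).
    static boolean meetsTarget(List<Long> latenciesMs, long thresholdMs, double target) {
        return compliance(latenciesMs, thresholdMs) >= target;
    }

    public static void main(String[] args) {
        // Hypothetical checking account latencies in milliseconds.
        List<Long> latencies = List.of(120L, 480L, 510L, 300L);

        System.out.println(compliance(latencies, 500));        // 0.75
        System.out.println(meetsTarget(latencies, 500, 0.95)); // false
    }
}
```

Tracking this number per key service over time shows whether the performance work is actually moving the end user experience, which is the measurement the approach above relies on.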