By Gaurav Dangaich
"CoWIN server crashes as thousands of Indians rush to register for COVID-19 vaccination" - Business Insider
"Flipkart fumbles on the big day as server fails" - The Hindu Business Line
Yes, you read that right! There are many more examples of servers crashing under heavy user traffic, and it does not mean they were not tested before going live. These servers simply received heavy traffic in a very short span, which surfaced as failures on users' screens. This phenomenon either drives users to reload repeatedly, adding even more load on the system, or pushes them away to a competitor.
Let me ask one question: have you ever noticed how long it takes for a character to appear on screen when you type on a keyboard or mobile? I would say it's very quick. Now imagine that every keystroke took 1-2 seconds to show up. I am sure you would not use that product for long before looking for alternatives.
In the fast-paced world of software development, ensuring top-notch performance of applications is non-negotiable. This is where performance testing and performance engineering come into play.
Before we delve into the details, let's take a moment to acquaint ourselves with the fundamentals.
What is Performance Testing and Performance Engineering?
Performance Testing: Imagine performance testing as a rigorous examination for the software. It's all about understanding how well the application performs under different conditions. It is a non-functional software testing technique that determines how the stability, speed, scalability, and responsiveness of an application hold up under a given workload. At its core, performance testing answers questions like:
- How quickly does your software respond to user actions?
- Can it handle a surge in users without slowing down or crashing?
- Does it maintain a consistent level of performance when subjected to heavy data loads?
Performance Engineering: It goes a step beyond testing. It's the holistic approach of designing, developing, and optimizing software for peak performance right from the outset. Performance Engineers work alongside the development team, focusing on:
- Architecting software to be inherently scalable and efficient.
- Selecting the right technologies and tools to support performance goals.
- Identifying and mitigating potential bottlenecks before they become critical issues.
Is there any difference between the two?
Performance testing is a vital component of performance engineering, but the two are not the same. Think of testing as a part of engineering. Performance testing focuses primarily on assessing how well an application performs under specific conditions. On the other hand, performance engineering encompasses the entire software development lifecycle, with an emphasis on building performance into the software from the ground up.
Why Test System Performance?
Testing the performance of a system is essential for several reasons:
- User Satisfaction: Poor performance can lead to frustrated users, resulting in loss of customers and revenue.
- Reliability: Ensures that the system works as expected under different loads.
- Scalability: Identifies scalability bottlenecks, enabling informed capacity planning.
- Stability: Measures stability under peak traffic events.
- Cost Savings: Early issue detection reduces development and maintenance costs.
How to conduct Performance Testing?
Because testers can conduct performance testing with different types of metrics, the process can vary greatly. However, a generic process may look like this:
- Identify the testing environment. This includes test and production environments, as well as testing tools. Understanding the details of the hardware, software, and network configurations helps uncover possible performance issues and aids in creating better tests.
- Identify and define acceptable performance criteria. This should include defining the SLAs, SLOs and SLIs for performance metrics, such as response time, throughput and resource allocation.
- Plan the performance test. Identify the key use cases and how usage varies among end users. Build test cases and test scripts around the chosen performance metrics (a minimal scripting sketch follows this list).
- Configure and implement the test design. Arrange the resources needed to prepare the test environment, then implement the test design.
- Run the test. While testing, developers should monitor the test and document the results for comparison.
- Analyse and retest. Look over the resulting test data and share it with the project team. After any fine-tuning, retest to see whether performance has improved or degraded.
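To make the scripting and execution steps concrete, here is a minimal sketch using Locust, an open-source, Python-based load-testing tool. The endpoints, task weights, and wait times below are placeholder assumptions, not values from any real system.

```python
# locustfile.py - minimal load-test sketch (pip install locust).
# Endpoints, weights, and wait times below are illustrative placeholders.
from locust import HttpUser, task, between


class ShopperUser(HttpUser):
    # Each simulated user pauses 1-3 seconds between actions, like a real visitor.
    wait_time = between(1, 3)

    @task(3)  # weight 3: browsing happens three times as often as viewing the cart
    def browse_catalogue(self):
        self.client.get("/products")

    @task(1)
    def view_cart(self):
        self.client.get("/cart")
```

A headless run such as `locust -f locustfile.py --headless -u 200 -r 20 --run-time 10m --host https://staging.example.com` ramps up to 200 concurrent users at 20 users per second, holds the load for 10 minutes, and prints response-time and failure statistics that can be compared against the agreed SLOs.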
Types of Performance Testing:
- Load Testing: It helps developers understand the behaviour of a system under a specific load value. In the load testing process, an organization simulates the expected number of concurrent users and transactions over a duration of time to verify expected response times and locate bottlenecks. This type of test helps developers determine how many users an application or system can handle before that app or system goes live.
- Stress Testing: It places a system under higher-than-expected traffic loads so developers can see how well the system works above its expected capacity limits. It enables software teams to understand a workload's scalability. Stress tests put a strain on hardware resources to determine the potential breaking point of an application based on resource usage; resources could include CPUs, memory, and hard disks. Stress tests have two subcategories:
  - Soak Testing: Also called endurance testing, it simulates a steady increase of end users over time to test a system's long-term sustainability. During the test, the test engineer monitors KPIs, such as memory usage, and checks for failures, such as memory shortages. Soak tests also analyse throughput and response times after sustained use to show whether these metrics are consistent with their status at the beginning of the test.
  - Spike Testing: It assesses the performance of a system under a sudden and significant increase in simulated end users. Spike tests help determine whether a system can handle an abrupt, drastic workload increase over a short period of time, repeatedly. Similar to stress tests, an IT team typically performs spike tests before a large event in which a system will likely see higher-than-normal traffic volumes (one way to script such a profile is sketched after this list).
- Scalability Testing: It measures how well performance holds up as key attributes are scaled up or down. For example, testers could perform a scalability test based on the number of user requests.
- Capacity Testing: It is similar to stress testing in that it tests traffic loads based on the number of users, but it differs in scope: capacity testing checks whether a software application or environment can handle exactly the amount of traffic it was designed to handle.
- Volume Testing: Also called flood testing, it is conducted to test how a software application performs with varying amounts of data. Volume tests are done by creating a sample file size, either a small amount of data or a larger volume, and then testing the application's functionality and performance with that file size.
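Soak and spike profiles like the ones described above can be scripted as custom load shapes. Below is a sketch of a spike profile using Locust's LoadTestShape hook, which runs alongside a user class like the one shown earlier; the user counts and timings are invented for illustration.

```python
# Spike-test load shape sketch for Locust; all numbers are illustrative only.
from locust import LoadTestShape


class SpikeShape(LoadTestShape):
    """Hold a baseline, spike sharply for one minute, then watch the recovery."""

    def tick(self):
        run_time = self.get_run_time()  # seconds since the test started
        if run_time < 120:
            return 50, 10      # 2 minutes of baseline: 50 users, spawned at 10/s
        if run_time < 180:
            return 500, 100    # 1-minute spike, e.g. a flash-sale announcement
        if run_time < 300:
            return 50, 10      # back to baseline to observe recovery
        return None            # returning None stops the test
```

Stretching the same hook out to hours with a flat user count would turn it into a soak test instead.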
Performance KPIs:
Some key performance metrics, or key performance indicators (KPIs), that help an organization evaluate the current performance of a system under test include the following (a toy calculation is sketched after this list):
- Response Time: The time it takes for the application to respond to user actions. For instance, an e-commerce site aims for quick page load times to keep users engaged. If this time starts increasing, it could indicate a bottleneck.
- Throughput: It measures how many transactions or requests the application can handle per unit of time. A drop in throughput, as seen in a customer support system during peak hours, may indicate a bottleneck.
- Error Rate: High error rates, such as frequent crashes or timeouts, signal problems in the software's stability. For instance, an online banking application should have a minimal error rate to maintain trust.
- Resource Usage: Monitoring CPU and memory usage helps identify resource-related bottlenecks. A content management system may face issues if memory usage spikes when handling concurrent content updates.
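As a toy illustration of how these KPIs fall out of raw measurements, the sketch below computes throughput, error rate, and a 95th-percentile response time from a handful of made-up samples.

```python
# Toy KPI calculation from raw request records; the sample data is made up.
import statistics

# (response_time_ms, succeeded) pairs, as a test run might log them
samples = [(120, True), (95, True), (310, True), (88, False), (142, True),
           (2050, False), (130, True), (101, True), (99, True), (115, True)]

window_s = 5.0                                     # length of the measurement window
latencies = [ms for ms, _ in samples]
throughput = len(samples) / window_s               # requests per second
error_rate = sum(not ok for _, ok in samples) / len(samples)
p95 = statistics.quantiles(latencies, n=20)[-1]    # last of 19 cut points = 95th percentile

print(f"throughput={throughput:.1f} req/s, errors={error_rate:.0%}, p95={p95:.0f} ms")
```

Real load-testing tools report these figures automatically, but tracking them per release is what makes regressions like the ones described above visible early.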
Key trends in the Performance Engineering space
The pace of change has never been this fast for Performance Engineering. Before we even realized it, Performance Testing transformed from an independent, traditional practice into a broader and deeper arena: Performance Engineering. The rate of change accelerated further with rising competition and consumers spoiled for choice. Now, more than ever, is the time to adopt best practices and track trends to ensure the product is optimized.
- Shift-left Testing: Testing performance and analysing the results only at the tail end of product development is too risky, and the technical debt that comes with it is huge. To build a sturdy, fail-proof product, Performance Engineering needs to be baked into the software from the very beginning. The practice of developers writing performant code from the start is becoming standard operating procedure and gaining acceptance. Since Performance Engineering is about understanding how all the parts of the system fit together, it's important to know the performance quality metrics from the first design; going back to test later is dangerous, as it easily becomes 'too little, too late.' Simply put, organizations should leave behind simple "record/playback" testing that happens late in the product cycle and move towards a more robust engineering approach that starts early in the cycle and runs continuously.
- Auto-scaling: There are several benefits to introducing auto-scaling into the Performance Engineering lifecycle. First and foremost, it saves the company from the embarrassment of performance degradation or a system crash. The method, associated with cloud computing, automatically adjusts the number of resources to meet traffic demand, and it saves costs by releasing resources when they are not required. However, auto-scaling comes with cons, like increased development complexity and regional limitations, which one should understand before implementing it.
- Chaos Testing: Chaos testing is a disciplined methodology that proactively simulates and identifies failures in a system to prevent unplanned downtime and ensure a positive user experience. By understanding how the application responds to failures in various parts of the architecture, it helps uncover uncertainties in the production environment.
The main objective is to assess the system's behaviour in the event of failures and identify potential issues. For instance, if one web service experiences downtime, chaos testing ensures that the entire infrastructure does not collapse.
This approach helps identify system weaknesses and address them before they reach production (a minimal fault-injection sketch follows this list).
- Performance Testing on Physical Mobile Devices: As mobile usage continues to rise dramatically worldwide, ensuring optimal performance on mobile devices has become a priority for many organisations. Thus, performance testing on physical mobile devices has become a significant trend in recent years.
Physical devices offer real-world testing conditions that are hard to replicate with emulators or simulators. They provide the most accurate insight into how an application will perform on a user's device, including interactions with various hardware and software components, network conditions, battery usage, and more. Moreover, with the increasing diversity in device types, OS versions, screen sizes, and network conditions, testing performance across a wide range of real devices becomes necessary. As the demand for high-performing mobile applications continues to grow, so does the emphasis on performance testing on physical mobile devices.
- Service Virtualisation: Service virtualisation (sometimes called service mocking or stubbing) is a technique used to emulate the behaviour of specific services in component-based applications. It provides a way to emulate services, databases, and systems that are not accessible for performance testing.
Service virtualisation in performance testing has emerged in recent years. It allows testers to test software in a controlled environment, even in the early stages of development, when all components might not be available. It enhances efficiency and productivity by negating the wait time for dependent components. It enables teams to replicate production-like conditions, which, in turn, leads to more reliable and accurate test results. As businesses move toward Agile and DevOps methodologies, the value of service virtualisation in performance testing is increasingly recognised, driving its adoption.
- Synthetic Transactions: Monitoring production tells us how long requests take on the server, but it gives no idea of the customer's experience. Synthetic transactions help us understand what a user goes through because they simulate a real user.
Synthetic accounts can even simulate actual orders for eCommerce sites. When businesses track the user experience this way, they gain a wealth of data about the issues, delays, and errors that customers face. Synthetic transactions can also be used to find production problems quickly and to help software vendors assess how users actually use their application.
- AI, Machine Learning, and Sentiment Analysis: To obtain accurate trend assessments, performance engineers need to work with the right data. Artificial intelligence makes rendering reliable data easier and faster. Furthermore, machine learning algorithms are applied to predict user patterns, curate high-quality data, and filter information to the business's requirements.
Sentiment Analysis is an underrated practice in which customers' tickets and feedback are examined to understand user perception. It tells companies what users' definition of "slow" is and allows them to set SLAs at an appropriate level, so they don't end up spending time on unnecessary amendments.
Overall, AI-powered techniques enhance the quality of performance monitoring and testing and spare humans from tedious tasks.
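Returning to chaos testing from the list above, here is a minimal fault-injection sketch. The wrapper, service name, and failure rates are all invented for illustration; real practice uses dedicated tooling against production-like environments.

```python
# Toy chaos-style fault injection; names and rates are invented for illustration.
import functools
import random
import time


def chaos(failure_rate=0.2, max_delay_s=1.0):
    """Randomly inject latency or a failure into the wrapped dependency call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if random.random() < failure_rate:
                raise ConnectionError("chaos: injected dependency failure")
            time.sleep(random.uniform(0, max_delay_s))  # injected latency
            return func(*args, **kwargs)
        return wrapper
    return decorator


@chaos(failure_rate=0.2)
def fetch_recommendations(user_id):
    # Stand-in for a call to a downstream recommendation service.
    return ["item-1", "item-2"]


# The page should degrade gracefully rather than collapse with the dependency.
try:
    items = fetch_recommendations(user_id=42)
except ConnectionError:
    items = []  # fall back to an empty recommendation shelf
```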
Developing a Performance Engineering Culture
Performance Engineering is widening and deepening in scope and scale. Businesses are recognizing its contribution to winning clients with a blink-of-an-eye digital experience and shielding users from negative experiences. With this rising significance, experts are seeing the development of a culture where responsibility for product performance extends beyond the QA team.
Everyone in the business, from developer to product owner, takes a fair share of responsibility for addressing the evolving needs of end users. This collaboration smooths the process, as Performance Engineers can easily coordinate teams, tools, and processes to maintain a continuous feedback loop.
To build a Performance Engineering culture, it's important to push the agile team to think about performance as early as possible. This helps companies deliver value at a rapid pace compared to traditional set-ups.
Conclusion
Performance testing and performance engineering are the backbone of delivering high-quality, high-performance software. Understanding their nuances, leveraging various testing types, monitoring critical KPIs, and embracing emerging trends will help us not only ensure that the software works but also that it thrives in an ever-evolving digital landscape.
Happy Testing!!
Cheers until next time!!