Enterprise Linux


Linux testing made better using a dynamic test suite

By NAGESWARA SASTRY posted Tue March 09, 2021 06:26 AM

  

Authors: Nageswara Sastry (nasastry@in.ibm.com) Geetika Moolchandani (geetika.moolchandani1@ibm.com)

 

Introduction:

Regression testing is the practice of re-running functional and non-functional tests to ensure that previously developed and tested software still functions correctly after a change; if it does not, that is called a regression. Changes that may require regression testing include bug fixes, software enhancements, and configuration changes. Regression test suites tend to grow with each defect found and each new test case added, and running them is frequently automated. Because regression testing requires executing a large number of test cases, it is considered the most expensive phase of testing in terms of human and machine time, so reducing the size of the regression suite saves cost. Researchers, testers, and developers use several techniques to reduce this cost, and there are different ways to deal with test suites, such as minimization, selection, and prioritization. Test suite minimization, or reduction, aims to reduce the number of tests to run. In this blog, we provide details about a proof of concept (PoC) that helps in creating a dynamic test suite.

Background

Current regression test suites do not change dynamically according to the fix pack that we want to test. In some cases, only a few tests need to be run. However, due to a lack of data (for example, no clear idea of the minimum set of test cases to run), testers and developers end up running a lot of unnecessary tests, which is inefficient.

Answers to the following questions can help in gathering the data needed to determine the minimum set of test cases to run. In addition, there are many other benefits, such as prioritizing test cases, identifying redundant test cases, and identifying test cases that need improvement.

  • How well has a particular piece of code been tested?
  • What is the code coverage percentage?
  • What is the code coverage efficiency of a test case?
  • Which parts of the code are not being hit by any test case?
  • What is the turnaround time for the code fix test results?
  • Which test case or suite should be run for a particular fix?

The current regression suites take one to three days to run, depending on which test suites are chosen. Such an enormous turnaround time is unacceptable, given that there are many other code fixes to be tested. Some of the test suites do not even cover all areas of the code. Further, there are test cases that target only a certain area or functionality of the code and may be unnecessary to run for every change. There is no concrete data based on which we can omit certain test cases and shrink the size of the test suite.

Currently, developers and testers find it difficult to run quick and optimized test cases to validate a specific fix.

 

Objective

The primary aim of this PoC was to create a regression test suite that can dynamically shrink to contain only such test cases that cater to a specific use case.

For our dynamic test suite PoC, we chose a coverage-based method. Using this method, the reduction rate of the test cases can reach up to 99%: although the number of test cases can be reduced from thousands to tens, there is no reduction in fault-finding capability.

We used the Path Coverage for Filtering (PCF) method for our work, though our work is not an exact implementation of it.

The procedure of this method has the following steps:

  1. Run the test cases.
  2. Identify and calculate the coverage.
  3. Remove the test cases with minimum coverage value.
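The three steps above can be sketched in a few lines of Python. This is a minimal illustration of the filtering idea, not the PoC's actual code; the test names and coverage values are invented for the example.

```python
# Sketch of coverage-based filtering: keep only the test cases whose
# coverage is above the minimum observed value. All data is hypothetical.

def filter_min_coverage(coverage_by_test):
    """Drop the test cases whose coverage equals the minimum value."""
    if not coverage_by_test:
        return {}
    minimum = min(coverage_by_test.values())
    return {name: cov for name, cov in coverage_by_test.items()
            if cov > minimum}

# Steps 1-2: run the test cases and record coverage (simulated here).
coverage = {"tm_tm-tar": 85.71, "tm_tm-sigreturn": 85.71, "smoke": 10.0}

# Step 3: remove the test cases with the minimum coverage value.
reduced = filter_min_coverage(coverage)
```

In practice the coverage values would come from the test database described later, rather than from a hard-coded dictionary.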

Some of the challenges faced with alternative techniques:

  • They take a long time to implement.
  • They are mostly complex.
  • They have lower fault-detection capability.
  • They need time to train models, thus consuming more time.
  • They need a lot more experimentation to customize.

A few examples of the alternative techniques and their drawbacks are as follows:

  • Program slicing, genetic algorithm
    • Reduce the number of test cases to run and save time, but have lower fault-detection capability.
  • Greedy algorithm
    • Needs manual intervention when optimizing large-scale test suites.
  • Fuzzy logic
    • Requires more experiments and studies.
  • Requirement-based
    • Helps in reducing the number of redundant test cases.
  • Hybrid algorithm
    • Involves high complexity in creating the model.
  • Clustering
    • Provides lower fault-detection capability.

 

The PoC helps in achieving the following:

  • Run only the required test case or suite for a fix pack or fix. This improves the turnaround time for the fix, with confidence that no bugs will be missed.
  • Create dynamic regression test suites with only the required test cases that cater to a specific use case.
  • Provide input to testers on:
    • The areas of the code that need test cases.
    • The coverage efficiency of the present test cases (this gives insight into which test cases need improvement).
    • The new files that are added (knowing which, testers can write or modify test cases to cover the new additions).

 

Figure 1 shows a flow chart with the different components and the control flow of the PoC.

 

Figure 1: Overview of the PoC

Overview of the PoC

The PoC is aimed at reducing the number of test cases using the code coverage method.

The basic idea was to set up two databases: a source database and a test database, containing information such as code coverage, test case name, and a tag composed of the file name and function name. An incoming code fix is processed and, based on the files it changes, the relevant tags are looked up. These tags are then mapped to the test cases.

Setting up the source database

The source code was tagged using Ctags, a tool that generates an index file for a variety of language objects found in the list of input files. The tags generated in this way are then filtered to keep only the functional tags, that is, the file name and the function name. This refined data is then imported into the source database.
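As a rough illustration of this filtering step, the sketch below parses lines in the Exuberant/Universal Ctags tags-file format (tab-separated: tag name, file name, search pattern, kind) and keeps only the entries whose kind is `f` (function), emitting `functionname:filename` tags like those stored in the source database. The sample lines are invented; the PoC's actual filtering code is not shown in the post.

```python
# Parse ctags tags-file lines and keep only the function tags.
# The sample tags-file content below is fabricated for illustration.

def functional_tags(tags_lines):
    tags = []
    for line in tags_lines:
        if line.startswith("!_TAG_"):   # skip tags-file metadata headers
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        name, filename, _pattern, kind = fields[0], fields[1], fields[2], fields[3]
        if kind == "f":                 # keep function tags only
            tags.append(f"{name}:{filename}")
    return tags

sample = [
    "!_TAG_FILE_FORMAT\t2\t/extended format/",
    'exit_vmx_ops\tvmx-helper.c\t/^int exit_vmx_ops(void)$/;"\tf',
    'SOME_MACRO\tvmx-helper.c\t/^#define SOME_MACRO/;"\td',
]
tags = functional_tags(sample)
```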


Figure 2: Schema of the source database

The schema of the source database consists of a field named tag, which is in the form functionname:filename.c, as shown in Figure 3.


Figure 3: Sample records from source database

 

Setting up the test database

The gcov tool is used to generate functional coverage for each available test case; it analyzes how much of a program is exercised by the test suite. Functional coverage information, such as coverage percentage, function name, and file name, along with the corresponding test name, is then populated into the test database.
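One way this population step could work is by parsing the per-function summaries that `gcov -f` prints. The sketch below extracts (function name, coverage percentage) pairs from such text; the sample output is fabricated, and the PoC's actual parsing is not shown in the post.

```python
import re

# Extract per-function coverage from `gcov -f`-style text output.
# The sample output below is invented for illustration.

FUNC = re.compile(r"^Function '(.+)'$")
LINES = re.compile(r"^Lines executed:([\d.]+)% of \d+")

def parse_gcov(output):
    records, current = [], None
    for line in output.splitlines():
        m = FUNC.match(line)
        if m:
            current = m.group(1)
            continue
        m = LINES.match(line)
        if m and current is not None:
            records.append((current, float(m.group(1))))
            current = None
    return records

sample = """Function 'exit_vmx_ops'
Lines executed:85.71% of 7
Function 'enter_vmx_ops'
Lines executed:100.00% of 5
"""
records = parse_gcov(sample)
```

Each record, together with the name of the test case that produced the coverage run, would then become a row in the test database.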



Figure 4: Schema of test database

The schema of the test database consists of a field named tag, in the form functionname:filename.c; coverage_percent, which ranges from 0 to 100; and test_name, which is the test case name. Sample records are shown in Figure 5.


Figure 5: Sample records from test database
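The schema described above could be modeled as follows. This sketch uses SQLite purely for illustration; the post does not specify which database engine the PoC uses, and the sample rows are invented.

```python
import sqlite3

# Model of the test database schema described above, using an
# in-memory SQLite database for illustration.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE test_db (
    tag TEXT,               -- functionname:filename.c
    coverage_percent REAL,  -- ranges from 0 to 100
    test_name TEXT          -- the test case name
)""")
con.executemany("INSERT INTO test_db VALUES (?, ?, ?)", [
    ("exit_vmx_ops:vmx-helper.c", 85.71, "tm_tm-tar"),
    ("exit_vmx_ops:vmx-helper.c", 85.71, "tm_tm-sigreturn"),
])
con.commit()
```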

Setting up the project model

The input of the project model is a code fix or fix pack.

The code fix is processed and the names of the files that it changes are extracted. If a file name is not already present in the source database, a notification is issued as output. If the file name is present, the corresponding tag is picked up from the source database and the matching test cases are searched for in the test database. The test cases that hit the changed part of the code are then retrieved and run to test the given fix or fix pack. Along with the test cases, information such as functional coverage percentage, file name, and function name is listed as output.

Eliminating redundant test cases

Querying for a particular tag and displaying the records from the test database helps in identifying redundant test cases. Figure 6 shows sample records from the test database for the tag /home/linux_src/linux/arch/powerpc/lib/vmx/vmx-helper.c:exit_vmx_ops. Different test cases, such as tm_tm-tar and tm_tm-sigreturn, can give the same code coverage percentage (for example, 85.71). So, instead of running all the test cases that give the same code coverage percentage, pick just one. This is how redundant test cases can be eliminated.



Figure 6: Example records that help in identifying redundant test cases
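The deduplication rule above can be sketched like this: among test cases that give the same coverage percentage for the same tag, keep just one. The records are invented for the example.

```python
# Redundancy elimination sketch: keep one test case per (tag, coverage)
# pair. Records are (tag, coverage_percent, test_name); data is invented.

def deduplicate(records):
    kept, seen = [], set()
    for tag, cov, test in sorted(records):
        if (tag, cov) not in seen:
            seen.add((tag, cov))
            kept.append(test)
    return kept

records = [
    ("exit_vmx_ops:vmx-helper.c", 85.71, "tm_tm-tar"),
    ("exit_vmx_ops:vmx-helper.c", 85.71, "tm_tm-sigreturn"),
    ("exit_vmx_ops:vmx-helper.c", 42.85, "tm_tm-exec"),
]
kept = deduplicate(records)
```

Here tm_tm-tar is dropped because tm_tm-sigreturn already covers the same tag at 85.71%.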

Improve test cases

To identify the test cases that need to be improved, query the test database for a tag and pick the record with the highest code coverage percentage. After gathering such information, the output looks as shown in Figure 7. This list helps in improving the test cases based on the code coverage they provide.


Figure 7: Example data derived from the test database that help in identifying the test cases to improve
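The selection just described can be sketched as: for each tag, keep the record with the highest coverage percentage; any tag whose best coverage is still below 100% points at a test case worth improving. The sample records are invented.

```python
# Improvement-candidate sketch: best coverage per tag, then filter
# for tags not yet fully covered. All records are hypothetical.

def best_coverage_per_tag(records):
    best = {}
    for tag, cov, test in records:
        if tag not in best or cov > best[tag][0]:
            best[tag] = (cov, test)
    return best

records = [
    ("exit_vmx_ops:vmx-helper.c", 85.71, "tm_tm-tar"),
    ("exit_vmx_ops:vmx-helper.c", 42.85, "tm_tm-exec"),
    ("enter_vmx_ops:vmx-helper.c", 100.0, "tm_tm-sigreturn"),
]
needs_improvement = {tag: rec
                     for tag, rec in best_coverage_per_tag(records).items()
                     if rec[0] < 100.0}
```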

 

Results from the analysis

We ran kernel selftests from the Linux source code and Avocado test cases on the same test environment and observed the following results:

  • 129 test cases cover most of the kernel code for the IBM PowerPC architecture.
  • 361 files out of 673 (54%) have 100% code coverage using kernel selftests and Avocado tests.
  • 312 files out of 673 (46%) have code coverage ranging from 4% to 96%; these tests need to be improved.


Contacting the Enterprise Linux on Power Team
Have questions for the Enterprise Linux on Power team or want to learn more? Follow our discussion group on IBM Community Discussions.
