Framework for Mutation Testing

1. Introduction

Mutation Testing allows us to evaluate the quality of tests prepared for a software application. It aims to determine test set thoroughness by measuring the extent to which a test set can detect slight variations in the program. If we change the source code in a way that changes its behavior and the prepared test suite doesn't detect it, then the quality of tests is probably not good enough.

There are three approaches to mutation testing:

- lexical-based, in which the source code is modified in a raw, lexical form, which in some cases may lead to producing uncompilable code

- syntax-based, in which the source code is modified while in form of a syntax tree, which is much more reliable than the lexical-based approach, but the code still needs to be compiled

- bytecode-based, in which the code is changed after compilation, on a bytecode level

2. Chosen approach

The prepared framework is based on the bytecode approach, as it allows introducing mutations without the need to recompile the code over and over again. Therefore it is much faster than the other two approaches. We’ve also chosen the ASM library (instead of BCEL) to modify bytecode, as it is smaller and faster than BCEL.

3. Design specification

One of the aims of our project was to fully integrate the mutator with Ant, so that users familiar with ant tasks may easily use our mutator. Everyone capable of configuring ant to run JUnit tests will also be capable of configuring it to run our mutation testing framework. The only additional required parameters are:

- path to jad.exe, so that the changed bytecode might be decompiled to the java source

- path to directory where report should be placed

- path to project to mutate

- path to directory for temporary files

Figure 1: Example Mutator engine configured in ANT’s build file

Figure 2: Equivalent JUnit configuration in ANT’s build file

Another asset of this approach is the possibility of including standard junit reports in mutation framework report, so that the user can easily navigate the results and see what tests failed because of the mutation.

From a developer point of view, it helped us that we didn’t have to implement timeouts (in case the mutation resulted in an infinite loop) – it was already present in the junit ant task we used.

Reports are generated by Apache Velocity templates, which might be easily adjusted to user needs. The architecture allows adding new mutations in a very straightforward way – by simply adding one class for every new mutation.

A UML diagram of developed architecture is presented on the next page.

4. Mutations description

a. Arithmetic operators replacement

All addition operators are replaced with subtraction operators and vice-versa. All multiplication operators are replaced with division operators and vice-versa. This mutator is quite simple; in projects more computational than the “dbutils” however, it may be very effective.

On a bytecode level, this means changing operators:





where X is I (for integers), D (for doubles), L (for longs) or F (for floats)

b. Numeric literals replaced

All appearances of “0” number were replaced with “1”. All other byte numbers >=1 were replaced with 0. Note that this implies replacing boolean value false by true value and vice-versa. Therefore it is a very effective mutator (producing many mutants), as boolean values are very popular in any type of code.

On a bytecode level, this means changing:


· ICONST_1 to ICONST_5 and BIPUSH x (where x is a byte number) -> ICONST_0

c. Equality and non-equality changes

Equality operators ( == ) were changed into non-equality operators ( != ) and vice-versa. This mutation was applied to both object and primitive value comparisons.

On a bytecode level, this means changing:





d. Null objects and zero number returned

In all methods returning numbers 0 was returned instead of the proper value. In all methods returning objects instead of the object, a null value was returned. In every object-oriented program this should be quite an effective mutation.

On a bytecode level, this means that for any of the codes ARETURN, IRETURN, DRETURN, FRETURN, LRETURN instead of one mnemonic we put a sequence:

· POP (the value that was to be returned is dropped)

· ACONST_NULL, ICONST_0, DCONST_0,FCONST_0, LCONST_0 (inserting 0 or null on stack)


e. Relational operators changes

Relational operators were replaced as follows:

< by >=

<= by >

\> by <=

\>= by <

Note that this mutation may lead to code falling into an infinite loop.

On a bytecode level, this means changing:





As all aforementioned mutations were applied at bytecode-level, some of them were rather unreadable after decompiling – e.g.:

+    static Class class$java$lang$Long; /* synthetic field */
[ …]

-        if(((Object) (type)).equals(((Object) (Long.TYPE))) && (java.lang.Long.class).isInstance(value)) {
+       if(((Object) (type)).equals(((Object) (Long.TYPE))) && (class$java$lang$Long == null ? class$java$lang$Long : (class$java$lang$Long = _mthclass$("java.lang.Long"))).isInstance(value)) {
             return true;

However, on the whole we find applied mutations quite useful. Although they may seem trivial, they still leave many mutants alive which may be helpful in improving prepared test suite. For such a simple project as dbutils they are completely sufficient.

5. Evaluation for Jakarta Commons-DBUtils project

The results of running implemented mutators are presented in the following table:

Mutator nameTotal number of mutationsDead mutantsLive mutantsScore [%]

Execution time of whole suite of mutators lasts 9 minutes 10 seconds.

ChangeSUB2ADDVisitor, ChangeDIV2MULVisitor and ChangeMUL2DIVVisitor do not produce any mutations, as mutated source does not contain any subtraction, division nor multiplication operators.

From the presented results we can see, that probably the most effective mutators are EqualityVisitor (implementing mutation c) and ZeroOneVisitor (implementing mutation b). They have the lowest scores (not including ChangeAdd2SUBVisitor, but it has only 4 mutations so it is not very representative). This means that they can indicate the weaknesses of tests and find carelessly designed and/or implemented parts of code.

Of course, as implemented mutations are applied at bytecode level and they are context-free, not all of them produce readable code. Therefore, for some of them it will be difficult to check if they meet all three required conditions: reachability, necessity and sufficiency.

After analysing not dead mutants we made the following observations:

- Inner class BasicRowProcessor$CaseInsensitiveHashMap is probably not well tested, as changes introduced by mutation b are not detected.

- Method “private boolean isCompatibleType(Object value, Class type)” in class BasicRowProcessor leaves many mutants alive. However, code produced by mutator c is difficult to read and it might happen that produced mutants are equivalent to proper code. On the other hand, other mutations (from mutators b and d) are also not dicovered by tests, so probably the method is not tested well.

- Other mutations on methods of BasicRowProcessor class show, that probably this class is generally lacking good tests

- Tests for classes DbUtils and QueryLoader do not detect invertion of equality operators

- Tests for QueryRunner do not detect changes introduced by most of mutators

- Other classes tests also fail to detect some of the introduced mutations, but it seems that the aforementioned 4 classes are the ones that lack tests the most.

Authors: Liliana Ziołek & Marcin Zduniak