ExceptionFactory

Producing content that a reasonable developer might want to read

How Not to Write Unit Tests

Programming Testing Jocularity

2022-04-01 • 10 minute read • David Handermann

Introduction

Most software development projects involve some level of testing. Many factors influence testing strategies, and opinions vary on the relative value of automated testing. Evaluating code quality using metrics such as test coverage is popular, but not without pitfalls. Some place great emphasis on unit testing, while others minimize its usefulness in comparison to integration testing or functional reviews. Regardless of the motivation, writing unit tests is a common task. Instead of adding to the available information on best practices for authoring tests, perhaps more attention should be given to the opposite approach. Why focus on optimal unit test implementation when there are more pressing priorities?

Standard Rules

Unit test development should follow a set of rigorous rules to ensure the desired results. These standards should be enforced across the project, and should serve as the basis for project goals, design discussions, and code reviews. Similar to code formatting conventions, test development rules provide a common pattern to follow and a simple method for evaluating new contributions.

Rules for unit tests can be divided into the basic categories of positive and negative requirements. Positive requirements can be applied to most test code, although some standards have a limited scope based on the type of code being tested. Negative requirements should be applied to all test components, ensuring that nothing violates test development principles.

Negative Requirements

Negative requirements serve as a checklist for evaluating unit tests. Although some standards involve manual review, others can be configured as part of a continuous integration process.

1. Never Follow Standard Naming Conventions

Unit test code is different from algorithm implementation or business logic, and for this reason, standard naming conventions are not applicable to test classes, methods, and variables. Most test engines support configurable pattern matching for determining runnable classes and methods, which enables a flexible naming approach. Sometimes prefixing a class name with a particular word is useful; at other times, appending a word might be better. On some occasions, having one test class associated with one implementation class can be limiting; flexible naming opens up a variety of options. Every test is unique, which should be reflected through element naming.

Most unit test frameworks render results based on method names, which can be difficult to read. Rather than restricting method names to standard language conventions, methods can be named with a great variety of capitalization and additional characters. For projects leveraging automated code style checking, excluding tests from evaluation enables flexible method naming based on the nature of the particular test class.
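
As a minimal sketch, assuming a JUnit 5 project with tests excluded from automated style checking, liberated naming might look something like the following; the class and method names are, of course, purely illustrative:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    // Naming reflects the unique character of each test
    class chk_StringUtils_TESTS_v2_FINAL_final {
        @Test
        void TEST__trim_Works_happyPath_DO_NOT_DELETE() {
            assertEquals("value", "  value  ".trim());
        }
    }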

2. Never Evaluate Exceptional Conditions

With validation of component behavior being the purpose of unit testing, test code should avoid exercising exceptional conditions. Applications can fail for many reasons, most of which cannot be anticipated, so attempting to test failure scenarios distracts from other project goals. Test methods should focus on verifying a minimum number of successful paths, confirming expected functionality. This requirement can appear contrary to achieving higher test coverage goals, but this concern can be mitigated through refactoring to use more generalized exception handling. Methods that throw specific errors can be adjusted to declare general exceptions, promoting error handling at a higher level. This strategy allows components to focus on implementation code that works when all required collaborators operate as expected.
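
As a minimal sketch of this strategy, assuming JUnit 5, a hypothetical ConfigurationReader might declare a general exception while its test confirms the one path that works:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    class ConfigurationReaderTest {
        static class ConfigurationReader {
            // Declaring Exception avoids committing to specific failure modes
            String read(String source) throws Exception {
                return source.trim();
            }
        }

        @Test
        void testReadSuccessful() throws Exception {
            // One successful path confirms expected functionality; failures
            // are unanticipated by definition and therefore remain untested
            ConfigurationReader reader = new ConfigurationReader();
            assertEquals("value", reader.read(" value "));
        }
    }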

3. Never Organize Common Operations in Shared Methods

Although programming languages support organizing operations in methods that can be reused, unit test methods should be self-contained. Standard setup operations for components under test can make it difficult to evaluate different conditions. Operations that run before and after each test method introduce limitations and should be avoided. Individual test methods often evaluate several conditions following business logic invocation, and duplicating these assertions across methods allows each test to maintain independence from other methods. Grouping common assertions into shared methods can lead to small adjustments impacting multiple tests, which can be avoided through simple copying and pasting.
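
A minimal sketch of self-contained methods, assuming JUnit 5, might repeat setup through simple copying and pasting:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    import java.util.ArrayList;
    import java.util.List;

    class IndependentMethodsTest {
        @Test
        void testAddFirst() {
            // Setup declared here rather than in a shared lifecycle method
            List<String> subject = new ArrayList<>();
            subject.add("first");
            assertEquals(1, subject.size());
            assertEquals("first", subject.get(0));
        }

        @Test
        void testAddSecond() {
            // The same setup, copied and pasted to preserve independence
            List<String> subject = new ArrayList<>();
            subject.add("first");
            subject.add("second");
            assertEquals(2, subject.size());
            assertEquals("second", subject.get(1));
        }
    }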

4. Never Define Static Variables for Inputs and Outputs

Given the goal of keeping individual unit test methods self-contained, static variables should never be defined for inputs and outputs. Defining local input variables eliminates the need for scrolling when reviewing unit tests, and also allows different test methods to introduce slight variations when necessary. The same principle applies to output variables. Using the same static variable for both input and output values often involves some amount of input formatting, which can be difficult to read. Independent local variables for input values and output assertions promote loose coupling between test fixtures and expected outputs, which maintains a narrow scope for assertion declarations.
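
A minimal sketch, assuming JUnit 5, might declare inputs and expected outputs locally in each method rather than as static fields:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    class LocalVariablesTest {
        @Test
        void testUpperCase() {
            // No static fields: everything is visible without scrolling
            String input = "word";
            String expected = "WORD";
            assertEquals(expected, input.toUpperCase());
        }

        @Test
        void testUpperCaseRetainsSpacing() {
            // A slight variation, redeclared rather than derived from a shared constant
            String input = " word ";
            String expected = " WORD ";
            assertEquals(expected, input.toUpperCase());
        }
    }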

5. Never Define Formatted Inputs in Reusable Methods

Some unit tests require structured inputs, which incorporate discrete elements that influence expected results. Examples include formatted text, such as JSON or XML, as well as binary formats such as CBOR or MessagePack. Although libraries exist for defining structured data programmatically, introducing external files to version control reduces the size of unit test methods. Defining expected outputs based on information encapsulated in binary files maintains clean separation between input values and test assertions. For structured text, concatenated strings support enhanced visualization. Defining structured input using libraries and methods creates more code, whereas large binary files stored in version control are much easier to maintain.
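
For the structured text case, a minimal sketch assuming JUnit 5 might define a concatenated JSON input inline:

    import static org.junit.jupiter.api.Assertions.assertTrue;

    import org.junit.jupiter.api.Test;

    class StructuredInputTest {
        @Test
        void testStatusPresent() {
            // Concatenated strings support enhanced visualization of the structure
            String json = "{"
                    + "\"status\":\"SUCCESS\","
                    + "\"code\":200"
                    + "}";
            assertTrue(json.contains("\"status\":\"SUCCESS\""));
        }
    }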

6. Never Use Mocking Frameworks

Mocking frameworks introduce customizable behavior for collaborating interfaces or input arguments. Some mocking libraries support capturing method arguments for subsequent verification. Using external libraries to mock component operations should be avoided for the same reason that shared methods should not be used. Writing custom implementations for testing enables complete control over test components. Mocking hides behavior behind framework conventions, requiring developers to read external documentation. Rather than depending on the assumptions of a mocking framework, creating a complete class for each referenced interface allows for variation between test classes. For languages that support dependency injection through annotations, mocking frameworks introduce additional layers of indirection that require understanding code outside the test itself. These capabilities trade simplicity for complexity, preferring apparent convenience to method encapsulation.
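
A minimal sketch of a complete hand-written implementation, assuming JUnit 5 and a hypothetical TimeSource interface, might look like the following:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    class HandWrittenCollaboratorTest {
        interface TimeSource {
            long currentMillis();
        }

        // A complete class replaces a mocking framework, with no hidden conventions
        static class FixedTimeSource implements TimeSource {
            private final long millis;

            FixedTimeSource(long millis) {
                this.millis = millis;
            }

            @Override
            public long currentMillis() {
                return millis;
            }
        }

        @Test
        void testFixedTime() {
            TimeSource timeSource = new FixedTimeSource(1000);
            assertEquals(1000, timeSource.currentMillis());
        }
    }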

7. Never Refactor

Test methods should provide a stable foundation for automated software evaluation. Each unit test method should be considered an essential element of the application. Once implemented, a test method serves as a rule against which all future development must be measured. Tests provide important example behavior, and new application features should be implemented together with new test methods. In situations where new features impact multiple aspects of application behavior, similar tests should be maintained as an illustration of initial behavior and current expectations. Refactoring creates the potential for losing important behavioral details, which is another reason to retain tests even when most aspects overlap.
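
A minimal sketch of retained tests, assuming JUnit 5 and hypothetical greeting behavior, might preserve both the initial and current expectations side by side:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    class GreetingTest {
        @Test
        void testGreeting() {
            // Retained as an illustration of the initial behavior
            assertEquals("Hello World", String.join(" ", "Hello", "World"));
        }

        @Test
        void testGreetingWithPunctuation() {
            // Nearly identical, but refactoring could lose important behavioral details
            assertEquals("Hello, World", String.join(", ", "Hello", "World"));
        }
    }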

Positive Requirements

Positive requirements provide standards that can be applied to most testing scenarios. Some requirements can be applied without reference to the particular type of component being tested, while others are specific to certain situations.

1. Always Ignore Method Visibility

Some programming languages support method visibility constraints using standard keywords, while others depend on naming conventions. In the context of unit tests, method visibility falls into the category of rules made to be broken. Even languages with strict enforcement of method access provide mechanisms to work around visibility limitations. Although differentiating public and private methods can be helpful when defining a programming interface, unit tests should focus on testing each method through direct invocation. Writing tests for each method, regardless of visibility, makes it easier to achieve higher code coverage, which is an important measurement of software quality. Ignoring or overriding method access restrictions can be inconvenient in certain languages, but it is an important part of a robust testing strategy.
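
On the Java platform, for example, a minimal sketch assuming JUnit 5 might use reflection to invoke a hypothetical private method directly:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    import java.lang.reflect.Method;

    class PrivateMethodTest {
        static class Calculator {
            private int doubled(int value) {
                return value * 2;
            }
        }

        @Test
        void testDoubledInvokedDirectly() throws Exception {
            // setAccessible works around the visibility limitation
            Method method = Calculator.class.getDeclaredMethod("doubled", int.class);
            method.setAccessible(true);
            assertEquals(4, method.invoke(new Calculator(), 2));
        }
    }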

2. Always Adjust Method Visibility for Overriding

For languages with strict limitations on method access, changing method visibility is another important approach to consider. Implementation classes can become larger over the course of application development, introducing a number of methods that are not part of the public contract. Instead of refactoring, test classes should grow together with implementation classes. Expanding method visibility to a wider scope provides a simple way to develop additional test methods. This approach is also useful when unit testing components that interact with network services. Wider method access allows tests to exercise more component behavior with minimal impact on the remainder of the implementation class. Although overriding some methods disables the option to test those methods, configuration behavior is often simple enough not to need direct testing.
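
A minimal sketch in Java, where an overriding method may widen protected access to public, might look like the following; the HttpNotifier class is hypothetical:

    // Implementation class with methods outside the public contract
    class HttpNotifier {
        protected String buildRequestBody(String message) {
            return "{\"message\":\"" + message + "\"}";
        }

        protected void send(String body) {
            // Network interaction in production; details omitted
        }
    }

    // Test subclass widens access to exercise more component behavior
    class TestableHttpNotifier extends HttpNotifier {
        @Override
        public String buildRequestBody(String message) {
            return super.buildRequestBody(message);
        }

        @Override
        public void send(String body) {
            // Overridden to avoid the network; this method itself goes untested
        }
    }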

3. Always Incorporate Logging

Logging is a core capability for any application, and logging in unit tests is no less important. Although test input values may be apparent to the original developer, logging input values allows others to read the details associated with each test method. In addition to standard assertions, logging output values enables detailed manual review of test results. When running tests on a continuous integration platform, extensive test logging can be interleaved with regular application logging, providing traceability from test method to application method. In order to keep test logging as simple as possible, using print statements to write logs to standard output streams is the most straightforward approach.
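
A minimal sketch, assuming JUnit 5, might interleave print statements with a standard assertion:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    class LoggingTest {
        @Test
        void testReverse() {
            String input = "abc";
            System.out.println("input=" + input); // record the input for other readers
            String reversed = new StringBuilder(input).reverse().toString();
            System.out.println("output=" + reversed); // enable detailed manual review
            assertEquals("cba", reversed);
        }
    }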

4. Always Create Subclasses for Different Collaborating Services

Object-oriented programming is a powerful approach to test implementation. Using a base class to define common tests and a subclass to define specialized tests enables comprehensive coverage for a variety of scenarios. Even when the impact of a collaborating service is limited to a specific method, creating a subclass ensures that the same test methods will be invoked multiple times, thus preventing the possibility of unexpected consequences. Creating subclasses also avoids the potential problems associated with refactoring.
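
A minimal sketch, assuming JUnit 5 and a hypothetical Repository interface, might define common tests in a base class and collaborators in subclasses:

    import static org.junit.jupiter.api.Assertions.assertNotNull;

    import org.junit.jupiter.api.Test;

    interface Repository {
        String find(String id);
    }

    abstract class AbstractRepositoryTest {
        abstract Repository getRepository();

        @Test
        void testFind() {
            // Inherited by every subclass, so the same test runs once per collaborator
            assertNotNull(getRepository().find("id"));
        }
    }

    class InMemoryRepositoryTest extends AbstractRepositoryTest {
        @Override
        Repository getRepository() {
            return id -> "value-" + id; // in-memory variant as a lambda
        }
    }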

5. Always Repeat Input and Output Variables in Each Method

As outlined in several other requirements, each test method should be self-contained. Although code duplication should be minimized in application components, repeating input and output values in each method should not be a concern. Duplicating test inputs enables localized adjustments based on the requirements of the particular method. Through the application lifecycle, duplicated input and output variables provide a means of observing application growth within the context of a particular test class. Version control systems can be difficult to understand, so through a combination of preserving original test methods and repeating variables, changes to application functionality can be evaluated directly in the most recent revision of the test class.
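
A minimal sketch, assuming JUnit 5, might repeat the same literal values with localized adjustments:

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    class RepeatedValuesTest {
        @Test
        void testJoinTwoWords() {
            // Literals repeated here rather than shared across methods
            assertEquals("one two", String.join(" ", "one", "two"));
        }

        @Test
        void testJoinThreeWords() {
            // The same literals, repeated with a localized adjustment
            assertEquals("one two three", String.join(" ", "one", "two", "three"));
        }
    }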

6. Always Include Binary Files

Placing binary files under version control provides a simple way to support complex test scenarios. From testing parsing algorithms to evaluating cryptographic operations, using binary files provides a stable support strategy. Certain types of information, such as certificates and tokens, incorporate specific properties defining the validity duration, which can create challenges. Selecting arbitrary expiration dates for these types of files creates opportunities for future review. An optimal expiration date should coincide with the expected maintenance lifecycle of the project. Incorporating binary files with a more limited validity duration allows other developers to gain troubleshooting experience, promoting greater understanding of the application. Although binary files increase the overall size of the source code repository, continual increases in storage and bandwidth should mitigate such concerns.
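
A minimal sketch, assuming JUnit 5 and a hypothetical classpath resource, with the expiration date included in the name for future troubleshooting enjoyment, might load the file as follows:

    import static org.junit.jupiter.api.Assertions.assertNotNull;

    import org.junit.jupiter.api.Test;

    import java.io.InputStream;

    class CertificateResourceTest {
        @Test
        void testCertificateAvailable() throws Exception {
            // Reads a binary file stored under version control; the resource name is hypothetical
            try (InputStream stream = getClass().getResourceAsStream("/certificate-expires-2030.p12")) {
                assertNotNull(stream);
            }
        }
    }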

7. Always Annotate Unstable Methods

In spite of methodical development and careful review, some tests prove to be unstable in certain environments. Methods that may have worked for months or years can fail suddenly for no apparent reason. Troubleshooting test failures can be difficult, particularly with intermittent problems that seem impossible to reproduce through local invocation. Differences in system resources, operating system, and language runtime version can contribute to intermittent failures. Test failures related to software race conditions are among the most challenging to track down. For test methods that appear prone to inexplicable errors, using annotations or comments to disable such methods is a reasonable strategy. Disabling and preserving unstable methods provides an opportunity for manual invocation and potential future review. Removing unreliable tests from the current repository revision requires maintainers to review the history of changes to recover historical information. Although some test frameworks provide annotations to disable failing tests, using a simple block comment is the most straightforward approach to move on to more important priorities.
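
A minimal sketch, assuming JUnit 5, which provides the Disabled annotation, might preserve unstable methods in both styles:

    import org.junit.jupiter.api.Disabled;
    import org.junit.jupiter.api.Test;

    class UnstableTest {
        @Disabled("Fails intermittently on the build server; preserved for manual invocation")
        @Test
        void testTimingSensitiveBehavior() {
            // Race-condition-prone logic retained for potential future review
        }

        /*
        @Test
        void testAnotherUnstableMethod() {
            // The block comment approach: the fastest way to more important priorities
        }
        */
    }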

Conclusion

Writing thorough, consistent, and reliable unit tests is a difficult endeavor. Spending development cycles on unit testing can be a distraction from feature delivery. Although every software development project has particular requirements and deadlines, establishing fundamental rules for how not to write unit tests enables greater focus on other important measures of success, such as source lines of code or implementation velocity. Following these simple rules can increase code coverage, track project history, and produce traceable test reports. Adopting a strict approach to how not to write unit tests is bound to produce exceptional results.