Wednesday, April 23, 2025

Prescribing Active Contract Testing (PACT) in Python

I’m going to start this series of articles by stating, for the record:

Note

I am very opinionated when it comes to testing code. Writing tests may not be my favorite part of software engineering, though I absolutely acknowledge that tests in general, and unit tests in particular, are a critical part of writing good, solid, usable code. To be clear, I don’t dislike writing tests; I just like that process less than writing documentation for code, and that less than writing the code itself. I also want the test processes themselves to do certain things that make writing tests easier, or at least more effective.

This series of articles is, ultimately, about a Python package — goblinfish-testing-pact — that I’ve written to try to improve my quality of life with respect to writing unit test suites. Along the way, I’ll discuss the discoveries and thought processes that led to the functionality provided by the package. I’m going to focus on unit testing, though some aspects of the discussion may also apply to other types of tests.

Why unit test anyway?

The simplest answer to this question is some variation of “ensuring that each executable component in a codebase performs as expected, accepting expected inputs and returning expected results.” The majority of the time, that will map one-to-one with the idea of eliminating bugs from the code being tested, or at least reducing their likelihood. It’s still possible for well and thoroughly tested code to have bugs surface in it, though — testing is not a guarantee of bug-free code. Writing tests so that they can be executed on demand, especially during whatever build/deploy processes apply (regression testing), also helps considerably to ensure that when changes are made to a codebase, no new bugs get introduced.

For me, that equates to spending less time hunting down bugs, freeing up more time to write new and interesting code, which aligns nicely with my preferences about how I spend the time I have available for development.

When I write tests, I want to be able to iterate over sets and collections of both good and bad values, and I want the results of those tests to be “non-blocking” for other tests against those collections. That is, if there are sixteen variations of arguments to pass, and variations 7 and 9 are going to fail, I want to see both of those failures at the same time, rather than having to run a test, fix #7, then run the test again to discover that #9 is failing too, and fix that. The built-in unittest package that ships with Python supports this by allowing subTest context managers to run multiple related but independent tests within an iteration. While I prefer pytest for its test-discovery, which makes running test-suites quite a bit easier, unittest.TestCase.subTest is, to my thinking, a much better mechanism for organizing larger groups of related tests than anything that pytest offers out of the box.
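To make that concrete, here is a minimal sketch of the subTest pattern I have in mind (the clamp function and its argument variations are purely illustrative):

```python
import unittest


def clamp(value, low, high):
    """Illustrative function under test: constrain value to [low, high]."""
    return max(low, min(high, value))


class TestClamp(unittest.TestCase):

    def test_clamp_happy_paths(self):
        # Each variation runs in its own subTest context, so a failure in
        # one variation is reported without blocking the remaining ones.
        variations = [
            ((5, 0, 10), 5),
            ((-3, 0, 10), 0),
            ((42, 0, 10), 10),
        ]
        for args, expected in variations:
            with self.subTest(args=args):
                self.assertEqual(clamp(*args), expected)


if __name__ == '__main__':
    unittest.main()
```

If two of those variations were broken, a single run would report both failures, which is exactly the non-blocking behavior described above.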

I do not like the idea of having arbitrary code-coverage metrics that can impact the pass/fail of a build process. At the same time, the coverage package provides some very useful reporting capabilities on what lines of code in the source were not exercised by a test-suite, and I do like having that available as a sanity-check, to provide visibility into testing gaps. That segues neatly into an idea that I’ve been working on for many years now, that I’ve come to think of as prescribing active contract testing: Prescribing in the sense of stating, as a rule, that the active contracts of callables (their input parameter expectations) should all be tested. To that end, I’ve been working off and on for several years towards writing a package that implements those prescriptions, with a fairly simple basic rule:

Goal

Every code-element in a project’s source tree should have a corresponding test-suite element.

In practice, that can be elaborated a bit into some more specific rules for different types of code elements:

  • Every source-code directory that contributes to an import-capable namespace should have a corresponding test-suite directory.
  • Every source-code module should have a corresponding test-suite module.
  • Every source-code callable module-member (function or class) should have a corresponding test-suite-module member.
  • Every source-code class-member that either is a method, or has methods behind it that make it work (standard @property implementations, for example, as well as any classes that implement the standard data-descriptor interface; see the Properties implementation in the Descriptor HowTo Guide for an example), should have corresponding test-methods in the test-suite.
  • Test-methods for source-code elements that accept arguments should include tests for both happy-path scenarios and unhappy paths for each parameter (see the sketch after this list).
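
To make those rules concrete, here is a minimal, hypothetical pairing: a source function and the test-suite entities the prescriptions call for. The names and the source/test module split (noted in comments) are illustrative only, not anything the package dictates; as written in a single block, the example is runnable as-is.

```python
# In a source module, e.g. src/my_package/formatting.py (hypothetical path):
def format_name(first_name: str, last_name: str) -> str:
    """Return a "last, first" display name."""
    if not isinstance(first_name, str):
        raise TypeError('first_name must be a str')
    if not isinstance(last_name, str):
        raise TypeError('last_name must be a str')
    return f'{last_name}, {first_name}'


# In the corresponding test module,
# e.g. tests/test_my_package/test_formatting.py (hypothetical path):
import unittest


class TestFormatName(unittest.TestCase):
    """Test-case class corresponding to the format_name function."""

    def test_happy_paths(self):
        # Happy-path expectations for the callable as a whole.
        self.assertEqual(format_name('Ada', 'Lovelace'), 'Lovelace, Ada')

    def test_first_name_bad_values(self):
        # Unhappy-path test for the first_name parameter.
        for bad_value in (None, 123, object()):
            with self.subTest(bad_value=bad_value):
                with self.assertRaises(TypeError):
                    format_name(bad_value, 'Lovelace')

    def test_last_name_bad_values(self):
        # Unhappy-path test for the last_name parameter.
        for bad_value in (None, 123, object()):
            with self.subTest(bad_value=bad_value):
                with self.assertRaises(TypeError):
                    format_name('Ada', bad_value)
```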

These are, I feel, just logical extensions of the object-oriented idea of contracts (or interfaces) into the overall code-structure: a set of defined rules that a given code-element is expected to conform to. Each of the code-elements noted above has a contract of sorts associated with it:

  • Modules (and their parent packages, where applicable) can be imported, as can their individual members. That implies, to my thinking, a contract, and testing for those packages, modules, and module-members should be accounted for.
  • Every function accepts some number of parameters/arguments. That is another contract, expressing the input expectations for each of those code-elements, and test-entities for those expectations should be accounted for.
  • Members of classes — methods and properties — are, under the hood, just special cases (or collections of special cases) of functions, and follow the same test-entity accountability rules.

Note, if you will, my use of accounted for in that last list. Even a fairly small codebase whose test-suite follows these prescriptions could easily yield a very large number of test-entities. A single module with two functions that have, say, half a dozen parameters between them lands on eleven test-entities: one test-module, containing two test-case classes, two happy-path test-methods between them, and six unhappy-path test-methods, one for each of the arguments for either of those functions. I’ve been writing code for decades now, and testing it for more than half of that time, and in that time I’ve seen, firsthand, that not all tests are equally important from a product or service delivery perspective. Requiring that test-entities can be verified as existing, even if they are actively skipped, provides what I believe to be a near-optimal balance between “testing everything” and real-world priorities. Even if test-methods or whole test-cases are skipped, they at least exist, and provided that the mechanism for skipping them requires some documentation — a simple string saying we chose not to test this because {some reason} — it encourages making conscious decisions about testing priorities.
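
For what it’s worth, unittest’s skip decorators already support exactly that kind of documented decision; a minimal sketch (the test-case name and the reason text are hypothetical):

```python
import unittest


class TestReportExporter(unittest.TestCase):

    @unittest.skip(
        'We chose not to test this yet: the export format is still in flux, '
        'and the feature is behind a flag. Revisit before the next release.'
    )
    def test_to_csv_happy_paths(self):
        # The test-entity exists and is accounted for, even though it is
        # skipped; the reason string documents that decision.
        ...
```

The skipped test still shows up in the run output, so the decision stays visible rather than being silently absent.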

So, what I’ve really been working towards might be described as a meta-testing package: something that analyzes the structure of the code (and of the corresponding test-suite code), and asserts that the “accounted for” rules for each entity are being followed, without having to care whether the test-entities involved are even implemented — just that they are accounted for.
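
As a rough illustration of that idea at the module level (this is not the package’s actual implementation, and the src/my_package and tests/ layout is an assumption), a structural meta-test might look something like this:

```python
import pathlib
import unittest

SOURCE_ROOT = pathlib.Path('src/my_package')  # assumed source layout
TEST_ROOT = pathlib.Path('tests')             # assumed test-suite layout


class TestModulesAreAccountedFor(unittest.TestCase):

    def test_every_source_module_has_a_test_module(self):
        for source_file in SOURCE_ROOT.rglob('*.py'):
            if source_file.name == '__init__.py':
                continue
            relative = source_file.relative_to(SOURCE_ROOT)
            expected = TEST_ROOT / relative.parent / f'test_{relative.name}'
            with self.subTest(source_module=str(relative)):
                # The test-module only has to exist (be "accounted for");
                # whether its tests pass, or are skipped, is a separate concern.
                self.assertTrue(
                    expected.exists(),
                    f'No test-module found at {expected}'
                )


if __name__ == '__main__':
    unittest.main()
```

Each missing test-module surfaces as its own subTest failure, so all of the gaps are reported in a single run.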

The balance of the articles in this series will dive into the implementations of those meta-testing processes at various levels. Here is my plan, broken out by version number:

  • v.0.0.1 will contend with the source- and test-entities whose existence is a function of the structure of files and directories in the project: Packages (directories) and modules (files).
  • v.0.0.2 will focus on determining whether the members within a given source module — the functions and classes contained — have corresponding test-case classes defined in the related test-modules (a rough sketch of that idea follows this list).
  • v.0.0.3 will implement the accountability for test-methods within the test-case classes that were verified in the previous version.
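
As a companion to the module-level sketch above, the member-level accountability that v.0.0.2 targets could be illustrated along these lines. Again, this is a hypothetical sketch, not the package’s implementation; it assumes both the source package and the tests tree are importable, and the Test{member-name} naming convention is purely illustrative:

```python
import importlib
import inspect
import unittest

# Hypothetical, importable module names; neither comes from the package itself.
SOURCE_MODULE = 'my_package.formatting'
TEST_MODULE = 'tests.test_my_package.test_formatting'


class TestMembersAreAccountedFor(unittest.TestCase):

    def test_every_member_has_a_test_case(self):
        source = importlib.import_module(SOURCE_MODULE)
        tests = importlib.import_module(TEST_MODULE)
        # Only consider functions and classes defined in the source module,
        # not names that it merely imports from elsewhere.
        members = inspect.getmembers(
            source,
            lambda item: (
                (inspect.isfunction(item) or inspect.isclass(item))
                and getattr(item, '__module__', None) == SOURCE_MODULE
            )
        )
        for name, _ in members:
            # e.g. a format_name function maps to a TestFormatName class;
            # the naming convention here is illustrative only.
            expected_name = f"Test{name.title().replace('_', '')}"
            with self.subTest(member=name):
                self.assertTrue(
                    hasattr(tests, expected_name),
                    f'No {expected_name} test-case found for '
                    f'{SOURCE_MODULE}.{name}'
                )
```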

Past that, I had originally planned for the v.0.0.4 version to incorporate the test-processes that the package provides as a logical next step, but after actually putting the package into use, I started discovering minor tweaks that needed to be made. Additional tweaks surfaced as I was putting the package to use in the code for a book I’m writing — the second edition of my Hands-On Software Engineering with Python book — and those were made and released as the v.0.0.5 version. After that, as I was working through some writing for the v.0.0.2 release, I noticed some copypasta-level mistakes, along with a general lack of attention to documentation within the code (some of which I had never actually written), all of which were addressed in the v.0.0.6 release.

That’s where things stood as I was writing this update to the original article post on LinkedIn. My next step, after I get all the article content from the broken LinkedIn attempts copied over here (and updated where necessary), is going to be picking up where I’d originally intended to go with the v.0.0.4 version. While it may be possible to test incrementally as I go, my previous efforts with this package concept over the years have shown me that I will almost certainly spend a lot more time revising existing tests than I would spend writing them all from scratch once all the package source-elements are in place. I’m tentatively expecting that v.0.0.7 will really just involve implementing unit tests for the package, along with any changes that turn out to be needed as those tests reveal issues.

After that, as things surface that need attention, I will revise the code and post new articles as needed. At some point in the foreseeable future, there will be a v.1.0 release, which may or may not warrant a dedicated article. At some point after that, once I’ve had some time to noodle on things, I’m planning to issue v.1.1 and v.1.2 releases that will include a command-line tool for stubbing out tests that follow the strategies and expectations of the package, and support for testing Pydantic models, not necessarily in that order.
