Monday, May 5, 2025

Next steps after v.0.0.3

With the release of v.0.0.3 of the package, the main functionality that I’d had in mind was complete. I’d also gotten the build-process worked out to my satisfaction: a Bitbucket Pipeline controlled by a definition file in the repo. I’d originally planned for v.0.0.4 to be spent implementing the unit tests that I’d only stubbed out, and making alterations as needed if those tests surfaced issues that needed to be corrected. However, when a real-world opportunity surfaced to apply PACT testing to code that I hadn’t been involved with, I changed those plans in order to make that application viable.

The v.0.0.4 release that fulfilled that desire focused mainly on adding the last pieces needed for the build-process to publish the package to PyPI, where it can be found today as goblinfish-testing-pact. Previous experience with PyPI-compatible repositories (JFrog’s Artifactory, specifically) had, some years back, left me with the impression that there was no guarantee that a PyPI repository would prevent overwriting an existing package-version, and that if such an overwrite occurred, it would change the SHA-256 hashes attached to the package, invalidating any existing installation. To at least try to prevent those sorts of complications from arising, I decided to implement a test-module (tests/packaging/check_pypi.py) that actively looks in a specified PyPI repository for the current package version, using the pyproject.toml file to identify the package name and version. If that version is already present, the test fails, terminating the pipeline before it can overwrite the published package. The relevant pieces of the pyproject.toml file are the project’s name and version:

# Taken from the v.0.0.4 pyproject.toml file
[project]
name = "goblinfish-testing-pact"
version = "0.0.4"

These data-points are used to build the PyPI URL, read the content of that page, and search the response for a matching package name and version. Any match indicates that a package with the current version specification has already been published, and the resulting test failure terminates the execution of the pipeline. The test-code itself is quite simple, bordering on painfully brute-force, but it works:

from http.client import HTTPSConnection
from pathlib import Path
from tomllib import loads  # Python 3.11+ standard-library TOML parser

# Resolve the project root relative to this test-module, then
# read the packaging configuration from pyproject.toml
PROJECT_ROOT = Path(__file__).parent.parent.parent
PROJECT_TOML = PROJECT_ROOT / 'pyproject.toml'

project_data = loads(PROJECT_TOML.read_text())


def test_package_new_in_pypi():
    """
    Tests that the current version of the package does not exist
    in the public pypi.org repository.
    """
    root_url = 'pypi.org'
    pypi_site = HTTPSConnection(root_url)
    # PyPI URLs use hyphens; the packaged file-names use underscores
    pypi_name = project_data['project']['name'].replace('_', '-')
    package_name = project_data['project']['name'] \
        .replace('-', '_')
    pypi_version = project_data['project']['version']
    # Fetch the "simple" index page that lists all published files
    url = f'/simple/{pypi_name}/'
    pypi_site.request('GET', url)
    response = pypi_site.getresponse()
    package_list = response.read().decode()
    # If "<package>-<version>" appears anywhere in that page, this
    # version has already been published
    search_string = f'{package_name}-{pypi_version}'
    assert package_list.find(search_string) == -1, (
        f'Found {search_string} in the versions list at '
        f'{root_url}{url}'
    )

The circumstances around applying the package’s test prescriptions brought to my attention that there were other types of data descriptors that needed to be accounted for, beyond the basic property types that were already in place. My initial efforts were limited to identifying property class-members using the inspect.isdatadescriptor function, which I expected would identify both property objects and any other objects implementing the descriptor protocol. However, I encountered an odd bit of code that isdatadescriptor did not identify, so I made changes to accommodate it, checking both for the standard property methods (fget, fset, and fdel) and for implementations that had any of the descriptor-protocol methods (__get__, __set__, and __delete__), even if those objects did not have all of them. As an interim solution to the odd case I encountered, that seemed to do the trick, but I wasn’t happy with where that left things, so I planned to re-examine it later.
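As a rough illustration of that interim check (a sketch of the approach described above, not the package’s exact code; the function name is mine):

# Sketch of the interim detection: accept anything that has any of
# the property methods or any of the descriptor-protocol methods,
# rather than relying on inspect.isdatadescriptor alone.
def looks_like_data_descriptor(member) -> bool:
    property_methods = ('fget', 'fset', 'fdel')
    protocol_methods = ('__get__', '__set__', '__delete__')
    return any(
        hasattr(member, name)
        for name in property_methods + protocol_methods
    )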

I cannot disclose the specific details of the code that raised this concern (it was work done under an NDA), but I can provide an example of the sort of thing I encountered. If a complete descriptor class is defined as:

class CompleteDescriptor:

    def __set_name__(self, owner, name):
        self._prop_name = name
        self._name = f'_{owner.__name__}__{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if hasattr(obj, self._name):
            return getattr(obj, self._name)
        raise RuntimeError(
            f'{type(obj).__name__}.{self._prop_name} '
            'has not been set'
        )

    def __set__(self, obj, value):
        setattr(obj, self._name, value)

    def __delete__(self, obj):
        if hasattr(obj, self._name):
            delattr(obj, self._name)
            return None
        raise RuntimeError(
            f'{type(obj).__name__}.{self._prop_name} '
            'has not been set'
        )

… and a class is defined that uses that plus some variations that omit the __delete__ and/or __set__ methods of the descriptor:

class Thingy:

    get_only = GetOnlyDescriptor()
    get_set_only = GetSetDescriptor()
    get_del_only = GetDelDescriptor()
    get_set_del = CompleteDescriptor()

…then the following inspect.isdatadescriptor-based code:

from inspect import isdatadescriptor

print(
    'isdatadescriptor(get_only) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_only)}'
)
print(
    'isdatadescriptor(get_set_only) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_set_only)}'
)
print(
    'isdatadescriptor(get_del_only) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_del_only)}'
)
print(
    'isdatadescriptor(get_set_del) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_set_del)}'
)

…yields the following output:

isdatadescriptor(get_only) ....... False
isdatadescriptor(get_set_only) ... True
isdatadescriptor(get_del_only) ... True
isdatadescriptor(get_set_del) .... True
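The three partial variants used in Thingy weren’t shown above; a minimal sketch consistent with that output (my reconstruction, not the NDA’d code) might look like:

class GetOnlyDescriptor:
    """Defines only __get__; a non-data descriptor."""

    def __set_name__(self, owner, name):
        self._name = f'_{owner.__name__}__{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self._name, None)


class GetSetDescriptor(GetOnlyDescriptor):
    """Adds __set__; that alone makes it a data descriptor."""

    def __set__(self, obj, value):
        setattr(obj, self._name, value)


class GetDelDescriptor(GetOnlyDescriptor):
    """Adds __delete__; that alone also makes it a data descriptor."""

    def __delete__(self, obj):
        if hasattr(obj, self._name):
            delattr(obj, self._name)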

The two key pieces of information missed in the original code that prompted this investigation come from the Descriptor HowTo Guide:

If an object defines __set__() or __delete__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are often used for methods but other uses are possible).

and

To make a read-only data descriptor, define both __get__() and __set__() with the __set__() raising an AttributeError when called. Defining the __set__() method with an exception raising placeholder is enough to make it a data descriptor.

The issue that originally sparked this was an attempt to create a descriptor that did not implement either __set__ or __delete__, which makes it a non-data descriptor, as noted above. This was a distinction that I wasn’t aware of until it surfaced in the code I was poking around in.

Important

It’s also important to note that a class that implements the descriptor protocol is not, itself, a data descriptor. A data descriptor is an instance of a class that implements the protocol!

The v.0.0.4 release also completed the stubbing-out of unit tests, providing the prescribed test-methods for the properties that had not been covered in v.0.0.3.

The v.0.0.5 release was really nothing more than adding the license and some manual setup instructions, showing how to start a PACT test-suite, and how to iterate against the various types of test failures that would occur until all the prescribed test-entities were accounted for.

The v.0.0.6 release was mostly concerned with cleaning up and adding items to the code-base in preparation for starting this series of articles.

I noticed a few other random things that I want to fix, or at least think about, in v.0.0.7, as well as a couple of items that I didn’t discuss in the post for v.0.0.3. The one item to definitely be fixed is the handling of test-method name expectations for dunder methods in the source entities, like __init__. Somehow, while working through the name-generation process for test-methods, I missed accounting for the test prefix (test_) being added to a dunder-method name, leading to three underscores in the expected test-method name: test___init__, for example, instead of test__init__. That’s not a functional issue; the test-methods are still being prescribed, but it’s one of those little things that will annoy me if I don’t fix it.
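As a sketch of the planned fix (the helper name here is hypothetical; the package’s actual name-generation code may differ):

# Hypothetical helper illustrating the fix: collapse the extra
# underscore that appears when the "test_" prefix is prepended
# to a dunder name.
def expected_test_name(member_name: str) -> str:
    if member_name.startswith('__') and member_name.endswith('__'):
        # "test_" + "__init__" would yield "test___init__";
        # dropping one leading underscore yields "test__init__"
        member_name = member_name[1:]
    return f'test_{member_name}'

assert expected_test_name('__init__') == 'test__init__'
assert expected_test_name('some_method') == 'test_some_method'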

The item that I’ll need to think on is whether or not to prescribe _unhappy_paths test-methods for methods and functions that have no arguments, or at least none that aren’t part of an object scope (self and cls). Based on some common patterns that I’ve seen and used in work projects, I would plan for the relevant _happy_paths test-methods to check for things like cached returns and returns of appropriate types, but I have a nagging suspicion that I’ve missed some potential for unhappy-path results as well.

The first of the two ideas that I implemented but didn’t think to discuss is how inherited-member tests play out. I set things up so that the prescribed test-methods for a given class would not include members inherited from some other class. The logic here is that if the PACT processes are applied as expected, there will be a prescribed test for every class-member in the test-case related to the class where that member is defined. If that source class is used as a parent class for a different class, and its members are not overridden, then the prescribed tests of the parent class’ members provide the required testing. If a parent-class member is overridden, the prescribed test requirements on the child class will detect that, and require a new test-method in the test-case class that relates to the child. The function used to determine inheritance status is already in the code, and is quite simple; its documentation is longer than its code:

from logging import getLogger

logger = getLogger(__name__)


def is_inherited(member_name: str, cls: type) -> bool:
    """
    Determines whether a member of a class, identified by its
    name, is inherited from one or more parent classes of that
    class.

    Parameters:
    -----------
    member_name : str
        The name of the member to check inheritance status of
    cls : type
        The class that the member-name is checked for.

    Returns:
    --------
    True if the named member is inherited from a parent,
    False otherwise
    """
    # Every class in the MRO after the class itself is a parent
    parents = cls.__mro__[1:]
    # The member is inherited if any parent supplies the same
    # object that attribute access on the class resolves to
    result = any(
        getattr(parent, member_name, None)
        is getattr(cls, member_name)
        for parent in parents
    )
    logger.debug(f'is_inherited({member_name}, {cls}): {result}')
    return result
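For illustration (an example of mine, not from the package’s test-suite), the expected behavior looks like this:

class Parent:
    def greet(self):
        return 'hello'

class Child(Parent):
    pass

class Overrider(Parent):
    def greet(self):
        return 'hi'

# Child inherits greet unchanged; Overrider redefines it
assert is_inherited('greet', Child) is True
assert is_inherited('greet', Overrider) is False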

While I’m on the topic of test prescriptions and their relationship to inheritance, it also feels worthwhile to note that abstract members of classes will have prescribed test requirements too. That was a conscious decision made during a previous pass at the idea that this package implements, and though I didn’t make a conscious decision about it in this iteration, the logic behind the previous decision still holds, I think: if the PACT processes are intended to assure that the contracts of source entities are tested, and given that an abstract member of a class is part of that class’ interface, it follows that there should be test-methods associated with abstract members. What has not been given any consideration yet is whether the prescribed tests should include all of the requirements expected of a concrete member. Doing so would increase the number of test-methods to no good end, I feel, so I’m inclined to add detection of the abstract state of a class-member wherever possible, and limit the test prescription accordingly, but I will have to think more on it before I settle on that as the preferred path. For example, given this abstract class:

from abc import ABC, abstractmethod

class BaseThingy(ABC):

    @abstractmethod
    def some_method(self, arg, *args, kwdonly1, **kwargs):
        pass

…the current test-prescription process would require test_some_method_happy_paths, test_some_method_bad_arg, test_some_method_bad_args, test_some_method_bad_kwdonly1, and test_some_method_bad_kwargs test-methods, all of which would really only be able to test that an object of a class deriving from BaseThingy could not be instantiated without implementing some_method.
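Detecting that abstract state is straightforward, since abc.abstractmethod marks the decorated member with __isabstractmethod__ = True; a check along these lines (a sketch of mine, not the package’s implementation) would suffice:

def is_abstract_member(cls: type, member_name: str) -> bool:
    # abc.abstractmethod sets __isabstractmethod__ on the member
    member = getattr(cls, member_name, None)
    return getattr(member, '__isabstractmethod__', False)

assert is_abstract_member(BaseThingy, 'some_method') is True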

Those notes get this series of posts up to date with respect to the package version that’s in the repo and installable from PyPI as of the beginning of May, 2025. With everything I’ve covered in this post in mind, the work for the v.0.0.7 release, whenever I can get to starting it, will include:

Implement all unit tests.
Rework the property/data-descriptor detection to handle properties specifically, and other descriptors more generally (see the sketch after this list):
    Member properties can use isinstance(target, property), and can be checked for fget, fset, and fdel members not being None.
    Other descriptors can use inspect.isdatadescriptor, but are expected to always have the __get__ member, plus at least one of the __set__ and __delete__ members shown in the descriptor-protocol docs.
    Set test-method expectations based on the presence of get, set, and delete methods discovered using this new breakout.
    Update tests as needed!
Correct the maximum number of underscores in test-method name expectations: no more than two in a row (test__init__, not test___init__).
Think about requiring an _unhappy_paths test for methods and functions that have no arguments (or none but self or cls).
Give some thought to whether to require the full set of test-methods for abstract class members, vs. just requiring a single test-method where the assertion that the source member is abstract can be made.
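As a sketch of where that second item might land (the function name and return shape are my assumptions, not settled package code):

from inspect import isdatadescriptor

# Sketch of the planned breakout: report which accessors a
# property or other descriptor actually provides.
def descriptor_capabilities(target):
    if isinstance(target, property):
        # Properties expose their wiring via fget/fset/fdel
        return {
            'get': target.fget is not None,
            'set': target.fset is not None,
            'delete': target.fdel is not None,
        }
    if isdatadescriptor(target):
        # Non-property descriptors are expected to have __get__
        # plus at least one of __set__ and __delete__
        return {
            'get': hasattr(target, '__get__'),
            'set': hasattr(target, '__set__'),
            'delete': hasattr(target, '__delete__'),
        }
    return None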

The full implementation of the existing unit tests is, to my thinking, a critical step in this. After all, part of the purpose they serve is to provide regression testing, and it feels like the balance of the code is stable enough that adding regression coverage makes sense now, rather than waiting for another round of changes. The rest of the items in that list may take a while to work through to my satisfaction: as I’m writing this, there are 90 test-methods that I have yet to examine and potentially implement just to get that first item checked off, and since that needs to happen first, I may not post about the PACT stuff again for a bit.

