Thursday, May 29, 2025

First thoughts on reworking managed attribute test-expectations

When last I wrote about the goblinfish-testing-pact package, I had a checklist of things that I was going to pursue:

Implement all unit tests.
Rework the property/data-descriptor detection to handle properties specifically, and other descriptors more generally:
Member properties can use isinstance(target, property), and can be checked for fget, fset and fdel members not being None.
Other descriptors can use inspect.isdatadescriptor, but are expected to always have the __get__, and at least one of the __set__ and __delete__ members shown in the descriptor protocol docs.
Set test-method expectations based on the presence of get, set and delete methods discovered using this new breakout.
Update tests as needed!
Correct the maximum number of underscores in test-name expectations: no more than two in a row (test__init__, not test___init__).
Think about requiring an _unhappy_paths test for methods and functions that have no arguments (or none but self or cls).
Give some thought to whether to require the full set of test-methods for abstract class members, vs. just requiring a single test-method, where the assertion that the source member is abstract can be made.

I had originally intended to address these items in the order they were listed, but as I started thinking through the implications of completing the entire unit test suite first, it occurred to me that going down that path would almost certainly create more work than I really needed. Specifically, while the idea of having a complete test-suite for the package before making changes had its appeal (providing a regression-testing mechanism is, generally speaking, a good thing), I was all but certain that I was going to have to completely re-think how I was dealing with detecting and analyzing @property members and custom data-descriptor members of classes. That would, I felt, put me in the position of writing a bunch of tests that were likely to go away in very short order. Writing tests is rarely a waste of time, but in this particular case I expected to spend a fair chunk of time writing them, then more time revising the processes behind them, and then more still reconciling the new code against the old tests, fully expecting that many of them would have to be rewritten to a significant degree.

The thinking behind the expectation that I would need to rewrite the property- (and property-like) test-name-expectations processes hinged on various future states that I wanted to support for the package. To start with, all I was concerned about was generating expected test-method names for class attributes that were managed using one of the built-in options for attribute management: the @property decorator, and custom solutions that used the built-in Descriptor protocol (which also include method-sets built with the @property decorator). A property is a data-descriptor, but not all data-descriptors implement the property interface. That is worth showing in some detail, so consider the following class and code:

from inspect import isdatadescriptor

class Person:
    @property
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        self._first_name = value

    @first_name.deleter
    def first_name(self):
        try:
            del self._first_name
        except:
            pass

    def __repr__(self):
        return (
            f'<{self.__class__.__name__} at {hex(id(self))} '
            f'first_name={self.first_name}>'
        )

inst = Person()
inst.first_name = 'Brian'
print(repr(inst))
print(
    '• isinstance(Person.first_name, property) '.ljust(48, '.')
    + f' {isinstance(Person.first_name, property)}'
)
print(
    '• isdatadescriptor(Person.first_name) '.ljust(48, '.')
    + f' {isdatadescriptor(Person.first_name)}'
)

If this is dropped into a Python module and executed, it will output something along the lines of:

<Person at 0x102247650 first_name=Brian>
• isinstance(Person.first_name, property) ...... True
• isdatadescriptor(Person.first_name) .......... True

This is expected behavior: the built-in property class implements the descriptor protocol mentioned earlier, which amounts to no more than implementing __get__, __set__, and __delete__ methods. A property object also has fget, fset and fdel attributes, which are where the property methods in the code are stored, and which are called by the __get__, __set__, and __delete__ methods of the descriptor. That is, when @property is applied to first_name in the class above, that method is stored in the fget attribute of the resulting property object, and is called by the __get__ method. Similarly, the @first_name.setter and @first_name.deleter decorations attach the methods they decorate to the fset and fdel attributes of the property, which are called by the __set__ and __delete__ methods, respectively.

Important
All properties are data descriptors.
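
To make that relationship concrete, the following lines, appended to the example above, show that the decorated getter is stored on the property object and that the descriptor-protocol method delegates to it (a small illustrative addition, not part of the original example):

prop = Person.first_name             # the property object itself
print(prop.fget.__name__)            # 'first_name' - the decorated getter
print(prop.fset is not None)         # True - a setter was attached
print(prop.fdel is not None)         # True - a deleter was attached
print(prop.__get__(inst, Person))    # 'Brian' - __get__ delegates to fget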

So, what happens if we add a data-descriptor? Here's a bare-bones implementation of one, added to the same Person class, and with updates to show the results:

from inspect import isdatadescriptor

class Descriptor:

    def __set_name__(self, owner, name):
        self.__name__ = name
        self.__private_name__ = f'_{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        try:
            return getattr(obj, self.__private_name__)
        except Exception as error:
            raise AttributeError(
                f'{obj.__class__.__name__}.{self.__name__} '
                'has not been set'
            ) from error

    def __set__(self, obj, value):
        setattr(obj, self.__private_name__, value)

    def __delete__(self, obj):
        try:
            delattr(obj, self.__private_name__)
        except Exception as error:
            raise AttributeError(
                f'{obj.__class__.__name__}.{self.__name__} does '
                'not exist to be deleted'
            ) from error

class Person:
    @property
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        self._first_name = value

    @first_name.deleter
    def first_name(self):
        try:
            del self._first_name
        except:
            pass

    last_name = Descriptor()

    def __repr__(self):
        return (
            f'<{self.__class__.__name__} at {hex(id(self))} '
            f'first_name={self.first_name} '
            f'last_name={self.last_name}>'
        )

inst = Person()
inst.first_name = 'Brian'
inst.last_name = 'Allbee'
print(repr(inst))
print(
    '• isinstance(Person.first_name, property) '.ljust(48, '.')
    + f' {isinstance(Person.first_name, property)}'
)
print(
    '• isdatadescriptor(Person.first_name) '.ljust(48, '.')
    + f' {isdatadescriptor(Person.first_name)}'
)

print(
    '• isinstance(Person.last_name, property) '.ljust(48, '.')
    + f' {isinstance(Person.last_name, property)}'
)
print(
    '• isdatadescriptor(Person.last_name) '.ljust(48, '.')
    + f' {isdatadescriptor(Person.last_name)}'
)

Running this updated module code results in:

<Person at 0x1009bea10 first_name=Brian last_name=Allbee>
• isinstance(Person.first_name, property) ...... True
• isdatadescriptor(Person.first_name) .......... True
• isinstance(Person.last_name, property) ....... False
• isdatadescriptor(Person.last_name) ........... True

From that output, it is apparent that:

Important
Not all data descriptors are properties

This is also expected behavior, based on the Descriptor protocol, which notes that:

Define any of these methods* and an object is considered a descriptor and can override default behavior upon being looked up as an attribute.

* __get__, __set__ and __delete__

That does mean, though, that there are two distinct interfaces to contend with just in the built-in options. The fact that those interfaces overlap was mildly annoying to me, since it meant that a check for whether a given class-member is a property has to happen before checking whether it is a non-property data-descriptor, but in the grander scheme of things, that is not really a big deal, I thought.
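
As a minimal sketch of that ordering (the function name here is illustrative, not the package's actual code), the property check simply has to happen before the more general data-descriptor check:

from inspect import isdatadescriptor

def classify_managed_attribute(member):
    """Classify a class attribute as a property, another data-descriptor, or neither."""
    if isinstance(member, property):
        return 'property'
    if isdatadescriptor(member):
        return 'data-descriptor'
    return None

With the Person class above, this reports 'property' for Person.first_name and 'data-descriptor' for Person.last_name.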

Earlier I mentioned that there were future states that I also wanted the package to support. One of those is Pydantic, which I find I use directly or through the Parser extra of the Lambda Powertools package. Pydantic is, at a minimum, a data validation library that allows a developer to define structured data (as classes) with attributes (Fields) that perform runtime type- and value-checking for objects. The other is the Model and Field functionality provided by Django. Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design, has been around since 2005, and is a very popular framework for developing web applications using Python.

While I wasn't going to worry about the actual implementation for those just yet, I needed to have a solid idea of whether their respective Field objects followed one of the interfaces that I'd already accounted for. The short answer to that question, for both, was no, unfortunately. In Pydantic's case, the fields, defined explicitly using a full Field call as shown in the documentation, do not even exist as named members of the class. That is, given a module with:

from pydantic import BaseModel, Field

class Person(BaseModel):
    first_name: str = Field()
    last_name: str = Field()

print(Person.first_name)

…running that module raises an AttributeError:

File
  "/.../throwaways/pydantic-test/model-test.py",
  line 10, in <module>
    print(Person.first_name)
          ^^^^^^^^^^^^^^^^^
  File "/.../pydantic-test/.../site-packages/pydantic/...
    /_model_construction.py", line 271, in __getattr__
      raise AttributeError(item)
AttributeError: first_name

As it turns out, Pydantic's model fields are stored in a class attribute, __pydantic_fields__, where they are tracked as a dictionary of field-name/field-object key/value pairs. This could be verified by adding the following code to the same module:

import inspect
from pprint import pprint

inst = Person(first_name='Brian', last_name='Allbee')
print(repr(inst))
pprint(Person.__pydantic_fields__)

for field_name, field in Person.__pydantic_fields__.items():
    print(
        f'isinstance(Person.{field_name}, property) '
        .ljust(48, '.') + f' {isinstance(field, property)}'
    )
    print(
        f'inspect.isdatadescriptor(Person.{field_name}) '
        .ljust(48, '.') + f' {inspect.isdatadescriptor(field)}'
    )

…which yielded this output:

Person(first_name='Brian', last_name='Allbee')
Brian Allbee
{'first_name': FieldInfo(annotation=str, required=True),
 'last_name': FieldInfo(annotation=str, required=True)}
isinstance(Person.first_name, property) ........ False
inspect.isdatadescriptor(Person.first_name) .... False
isinstance(Person.last_name, property) ......... False
inspect.isdatadescriptor(Person.last_name) ..... False

So, at least for the purposes of supporting test-method expectations for test suites that test Pydantic model fields, there is yet another interface involved in just finding the fields and their names. That is a problem for another day, but it did show me that I would need to implement a very flexible solution.
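
Just to capture the idea for later, here is a minimal sketch of how those fields could be enumerated to build test-name expectations, assuming Pydantic v2's __pydantic_fields__ mapping shown above and reusing the property-style test-name conventions from the package (the function name is hypothetical):

def pydantic_field_test_expectations(model_class):
    """Build expected test-method names for a Pydantic model's fields."""
    expected = set()
    for field_name in getattr(model_class, '__pydantic_fields__', {}):
        expected.add(f'test_{field_name}_happy_paths')
        expected.add(f'test_{field_name}_set_bad_value')
    return expected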

Django's Field objects are a little easier to deal with — at a minimum, they can be referenced in a more normal fashion, for example ClassName.FieldName. The exploration code used to determine how those field objects behaved was:

import inspect
import sys

from django.db import models
from django.db.models.fields import Field

class Person:
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
    birth_date = models.DateField()

    class Meta:
        ordering = ['last_name', 'first_name']
        verbose_name = 'Person'
        verbose_name_plural = 'People'

    def __repr__(self):
        return (
            f'<{self.__class__.__name__} at {hex(id(self))} '
            f'first_name={self.first_name} '
            f'last_name={self.last_name} '
            f'birth_date={self.birth_date}>'
        )

    def __str__(self):
        return f'{self.first_name} {self.last_name}'

inst = Person()
inst.first_name='Brian'
inst.last_name='Allbee'
print(repr(inst))
print(type(Person.first_name))
print(type(Person.birth_date))
print(
    'isinstance(Person.first_name, property) '.ljust(48, '.')
    + f' {isinstance(Person.first_name, property)}'
)
print(
    'isinstance(Person.first_name, Field) '.ljust(48, '.')
    + f' {isinstance(Person.first_name, Field)}'
)
print(
    'inspect.isdatadescriptor(Person.first_name) '.ljust(48, '.')
    + f' {inspect.isdatadescriptor(Person.first_name)}'
)

When run, that generated

<Person at 0x101ddbe10
    first_name=Brian last_name=Allbee
    birth_date=<django.db.models.fields.DateField>
>
<class 'django.db.models.fields.CharField'>
<class 'django.db.models.fields.DateField'>
isinstance(Person.first_name, property) ........ False
isinstance(Person.first_name, Field) ........... True
inspect.isdatadescriptor(Person.first_name) .... False

They still are not property objects, or recognized as data-descriptors with inspect.isdatadescriptor, but they could be extracted with a similar custom function that uses isinstance to check class-members against the Field base class, which appears to be the lowest common denominator for all of Django's specific field classes. That should handle all the different variations, like the CharField and DateField shown in the example above.
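
A minimal sketch of that kind of custom check, assuming Django's Field base class as the common denominator (the function name is illustrative):

from inspect import getmembers

from django.db.models.fields import Field

def django_field_members(target_class):
    """Collect class attributes that are instances of Django's Field."""
    return {
        name: member
        for name, member in getmembers(target_class)
        if isinstance(member, Field)
    }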

So, knowing now what I do about how the different variations of managed attributes (properties, descriptors, and model-field implementations for Pydantic and Django) behave, the goal is fairly straightforward: figure out a way to scan all members of a target class, check whether they fall into one of those categories, and build out an expected test-case-name set accordingly. Since Pydantic and Django aren't the only third-party packages out there that might have similar constraints, that implementation needs to be highly extensible as well; there's no telling how some other package might implement things, though I would hope that most would tend to gravitate towards the built-in data-descriptor protocol. That implementation strategy will be the topic for my next post relating to this package.
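
To sketch what "highly extensible" might look like, the detection could be driven by an ordered list of detector/expectation-builder pairs, with the property check deliberately listed ahead of the generic data-descriptor check. All of the names below are hypothetical, and the expectation-builders reuse the test-name conventions from the earlier v.0.0.3 work:

from inspect import isdatadescriptor

def _property_expectations(name, member):
    expected = {f'test_{name}_happy_paths'}
    if member.fset is not None:
        expected.add(f'test_{name}_set_bad_value')
    if member.fdel is not None:
        expected.add(f'test_{name}_invalid_del')
    return expected

def _descriptor_expectations(name, member):
    # Without fget/fset/fdel to inspect, require all three for now
    return {
        f'test_{name}_happy_paths',
        f'test_{name}_set_bad_value',
        f'test_{name}_invalid_del',
    }

# Ordered: the property check must come before the generic descriptor check
DETECTORS = [
    (lambda member: isinstance(member, property), _property_expectations),
    (isdatadescriptor, _descriptor_expectations),
    # Pydantic- and Django-specific detectors could be appended here later
]

def expected_test_names(name, member):
    for detects, build_expectations in DETECTORS:
        if detects(member):
            return build_expectations(name, member)
    return set()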

Thursday, May 15, 2025

When not to mock or patch

Despite taking a break from working on the unit test stubs for the goblinfish-testing-pact package, it hasn’t been far from my thoughts, for several reasons. I want to get this package done, first off, and I won’t consider it to be done until it has a test-suite that executes the vast majority of its code. That’s particularly true, I feel, for a package that’s intended to help make it easier to achieve the kinds of testing goals that it was written to support.

Secondly, and part of the reason I want to get to that done state, I want to be able to use it in my other personal project efforts. Technically, I could proceed with those, and add PACT testing in later, but that would ultimately put me in the same kind of position with those other projects as I’m in now with this one: scrambling to get tests written or reconciled with whatever I might already have in place, rather than working those tests out as I go. I’d much prefer the latter.

The main blocker for me on getting the PACT tests written for the package that provides them was figuring out how I would (or in some cases could) test things. Writing code with an eye towards it being testable is a skill, and something of an art at times. I tried to keep that in mind, but I made a fundamental assumption in a lot of cases: That I would be able to usefully apply either a patch or some variant of a Mock to anything that I needed to. That proved not to be the case in at least one of the tests for the ExaminesModuleMembers class, where the built-in issubclass function is used to determine whether a named test-entity in the test-module is a unittest.TestCase. That proved to be problematic in two ways. The first was in figuring out how to specify a patch target name. For example, using an approach like this:

import unittest
from unittest.mock import patch

class test_SomeClass(unittest.TestCase):

    @patch('issubclass')
    def test_patch_issubclass(self, patched_issubclass):
        patched_issubclass.return_value=True
        self.assertTrue(issubclass(1, float))

…raised an error indicating that the target name was not viable:

TypeError: Need a valid target to patch.
    You supplied: 'issubclass'

Fair enough, I thought. I know that issubclass is a built-in, and after confirming that issubclass.__module__ returned a module name (builtins), I tried this:

class test_SomeClass(unittest.TestCase):

    @patch('builtins.issubclass')
    def test_patch_issubclass(self, patched_issubclass):
        patched_issubclass.return_value = True
        self.assertTrue(issubclass(1, float))

…only to be faced with this error instead:

AttributeError: 'bool' object has no attribute '_mock_name'

Note

Just to see what would happen, I tried a similar variation with the built-in isinstance function. That raised a recursion error, because one of the first things that the patch decorator does is determine whether the target specified is a string… using isinstance.

So, with those discoveries in hand, I’ve added a new item to my list of things to keep in mind when writing tests for Python code:

Tip

Don’t assume that a built-in function, one that lives in the builtins module of a Python distribution, will be patch-able.

There is at least one way around that particular challenge: wrapping the built-in function in a local function that can be patched in the test code. For example, defining a wrapper function for issubclass like this:

def _issubclass(cls, classinfo) -> bool:
    """
    Wrapper around the built-in issubclass function, to allow a
    point for patching that built-in for testing purposes.
    """
    return issubclass(cls, classinfo)

…would allow the code that checks for that subclass relationship to be re-written like this:

    ...
    # Check that it's a unittest.TestCase
    self.assertTrue(
        _issubclass(test_case, unittest.TestCase),
        f'The {test_case.__name__} class is expected to be a '
        'subclass of unittest.TestCase, but is not: '
        f'{test_case.__mro__}'
    )
    ...

and the related test’s code could then patch the wrapper _issubclass function, allowing that code to take control over the results wherever needed:

class ExaminesModuleMembers(HasSourceAndTestEntities):
    ...

    @patch('modules._issubclass')
    def test_source_entities_have_test_cases(self, _issubclass):
        ...

This, and similar variations that would implement the same wrapper logic as a @classmethod or as a @staticmethod, would also be viable; only the scopes of the calls would change, really.

I’m not a big fan of this approach, in general, though I’ll use it if there are no better options. My main concern with it, trivial though it might be, is that it adds more code, and thus more tests. I’ve seen it asserted in various places that “every line of code written is a liability,” and there’s a fair amount of truth to that assertion. This function, one-liner though it is, still feels like it would be cluttering things up, even if only by that handful of added lines of code.

Another option would be to re-think the implementation of the target code that the test relates to. In this particular case, there is at least one option that I could think of that would remove the need for applying a patch to the main test-method: Adding another test-method to the mix-in that simply asserts that the test-case class is a subclass of unittest.TestCase. That would be a simple test-method to add, though I’d want to keep it from being duplicated across all the different classes that it applies to. Assuming another hypothetical mix-in class built for that purpose, that new class wouldn’t have to be much more than this:

class RequiresTestCaseMixIn:
    """
    Provides a common test that asserts that a derived class
    also derives from unittest.TestCase
    """
    def test_is_test_case(self):
        self.assertTrue(
            issubclass(self.__class__, unittest.TestCase),
            f'{self.__class__.__name__} is expected to be a '
            'subclass of unittest.TestCase, but is not '
            f'({self.__class__.__mro__})'
        )

For this particular scenario, testing the PACT package, I like this better than the patched/mocked wrapper function. It doesn’t add any new (and trivial) code that adds new test requirements, for starters, and this test is quite simple. What I didn’t like about it, at least on my first consideration of it, is that it moves the checking of the test-case class’ requirements into those individual classes, instead of the testing provided by ExaminesModuleMembers. My initial thoughts about going down this path centered around being concerned that test-failures would be raised later in the current manual process than I wanted, since they would be raised as the tests for those test-cases executed. However, after thinking through it again, I decided that this was not as bad a scenario as I’d initially thought. This rearrangement was, to my thinking, acceptable.

All of the options mentioned so far are predicated on sticking with an assumption that I made early on, but that may not hold true: that using patch and/or Mock objects would be the preferred way to deal with writing tests for what is, at its heart, a collection of module files. The original thought behind that was that if I didn’t need to create a fake project-structure, it would be better: there wouldn’t be an additional collection of files to manage, I wouldn’t have to contend with the additional imports, and so on. In retrospect, though, the more I saw just how extensive the collection of mocked/patched values would likely need to be, the more I grew to like the idea of actually having a fake project in place to test against.

My initial concerns that led me away from that decision really boiled down to one thing: a concern about how I would make sure that everything was tested, across a combination of required test-entities in the context of the package, and for a reasonably realistic project representation at the same time. Until I had all of the PACT processes implemented, I wasn’t sure that the test-code would be able to manage both without getting far more complicated than I wanted it to be. However, after seeing how the stubs of the tests worked out, I am much more confident that the concern is not significant. Knowing now what I do about how the processes involved took shape, I anticipate that a simple fake project structure will actually simplify things by allowing the happy-paths tests to operate against code structures that are a reasonable and direct simulation of an actual project. I also anticipate that generating unhappy-path data using patch decorators or contexts will keep those tests more manageable.

At a high level, how I expect things to unfold as I progress with the revised tests is:

  • The actual test-entities expectations will be determined by their correlation with the relevant members of the package itself.
  • The actual test executions will use classes set up to derive from the class being tested for any given test-case class, but pointing at a fake-project entity, allowing:
    • Creation of any necessary elements in the scope of the fake project in order to test the behavior of the package source-element.
    • Isolation of the code whose behavior is being tested from the package entities themselves.

I struggled to come up with a better explanation than that, so an example would probably be good. Assume a module at fake-project/module.py that starts with no code in it at all, and that the test in question is for the module_members.ExaminesSourceFunction class, responsible for checking that a happy-paths test-method and one unhappy-path test for each parameter of a function exist. The expected test-methods will be defined by inspection of the ExaminesSourceFunction class, and would include five happy-paths tests, one for each instance method and property. The unhappy-path tests, initially, will follow the expectations that were in place as of the v.0.0.3 release, with tests for invalid setter values, invalid setter instances, and invalid deleter calls, but I plan to revise the expectations as noted in my earlier post.

The actual test executions, though, will use in-test classes that point at the fake-project elements that relate. Each test-case will be concerned more with making sure to minimize the number of lines of code that are not executed in the PACT source target element, and the processes for that minimization will rely on adding whatever source-entity stubs are needed to make sure that the PACT methods are thoroughly tested. Using the same ExaminesSourceFunction mix-in as an example, that would include fake-project functions that include:

  • A function with no parameters.
  • Functions with one and two required positional parameters.
  • A function with one required and one optional positional parameter.
  • A function with an *args argument-list.
  • Functions with one and two required keyword-only parameters.
  • A function with one required and one optional keyword-only parameter.
  • A function with a **kwargs keyword-arguments list.

Those function variants may be combined. The tests, in this particular case, could patch the test_entity property of the class, so that actual test-methods are not required, or those test-methods could be stubbed out in the classes defined within their tests.
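
As a concrete illustration, the fake-project function stubs for the variants listed above might be as simple as the following sketch (all of the names are hypothetical):

# fake-project/module.py
def no_params():
    ...

def one_required(arg):
    ...

def two_required(arg1, arg2):
    ...

def required_and_optional(arg1, arg2=None):
    ...

def with_arg_list(*args):
    ...

def one_keyword_only(*, kwarg1):
    ...

def two_keyword_only(*, kwarg1, kwarg2):
    ...

def keyword_only_with_optional(*, kwarg1, kwarg2=None):
    ...

def with_kwargs(**kwargs):
    ...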

Monday, May 12, 2025

Break in my cadence

Between various obligations, a (happily) high level of job-search activities late last week, and a holiday over the weekend, I have not had time to complete the writing I had planned for today. I expect that my normal Monday/Thursday posting cycle will resume on Thursday, 15-May.

Thursday, May 8, 2025

Local AWS API Gateway development with Python

I’m going to take a break from writing posts about the goblinfish-testing-pact package for a bit. The work on it is still going on in the background, but it’s going slowly because of constraints on my time (and, if I’m being honest, because I’m not looking forward to trudging through the remaining unit tests there). I needed to change things up a bit, and write about something different in order to have something to post to meet my Monday/Thursday posting plans.

What I opted to write about is the first iteration of another package that I came up with over the course of a technical challenge for a recent job prospect. I won’t go too deeply into the specifics of the challenge — the company in question might pose it to other candidates — but my solution for it got me to thinking about how, at my previous position, we handled developing APIs using AWS’ API Gateway, Lambda Functions, and the Lambda Proxy Integration between the two. We defined our infrastructure using AWS SAM, and testing it locally was not really an option without using the sam local command. By the time I was part of the team where local development and testing would have been most useful, other ways of handling it had been devised that did not involve using sam local. I wasn’t part of the discussions that led to the approach that team used, but I would guess that the decision was made to avoid using sam local because it was slow. When I looked into sam local for my own ends, it looked like it had to build and spin up local Docker containers for every Lambda for every API request, and did so even if one had already been created.

That, then, got me to thinking about how to provide a better way. Essentially, what I was aiming for was a way to set up a local API that would accept requests, hand them directly to the relevant Lambda handler functions, and return their responses, without having to build and spin up Docker containers along the way.

The Python ecosystem is not lacking for packages that provide locally-executable HTTP API functionality. Setting aside Django, which is more an application-development environment (though there is an add-on, the Django REST Framework, that provides REST API functionality), the two that seem to be the most popular are Flask and FastAPI.

Flask

Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. It began as a simple wrapper around Werkzeug and Jinja, and has become one of the most popular Python web application frameworks.

Flask offers suggestions, but doesn’t enforce any dependencies or project layout. It is up to the developer to choose the tools and libraries they want to use. There are many extensions provided by the community that make adding new functionality easy.

FastAPI

FastAPI is a modern, fast (high-performance), web framework for building APIs with Python based on standard Python type hints.

The key features are:

  • Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette and Pydantic). One of the fastest Python frameworks available.
  • Fast to code: Increase the speed to develop features by about 200% to 300%.
  • Fewer bugs: Reduce about 40% of human (developer) induced errors.
  • Intuitive: Great editor support. Completion everywhere. Less time debugging.
  • Easy: Designed to be easy to use and learn. Less time reading docs.
  • Short: Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs.
  • Robust: Get production-ready code. With automatic interactive documentation.
  • Standards-based: Based on (and fully compatible with) the open standards for APIs: OpenAPI (previously known as Swagger) and JSON Schema.

Both offer a fairly simple decorator-based approach to providing API endpoints: Write a function, apply the appropriate decorator, and that’s all that the function needs to handle an incoming request and return a response. Both also offer a local server, allowing someone working on the API code to run and debug it locally. Both of those local servers can also pay attention to at least some of the local project-files, allowing a change to a relevant file to restart the local server. Even in cases where a change to a Lambda Function file is not picked up automatically, restarting the local API is much faster than waiting for the sam build and sam local processes to complete, and the resolution of a local API request, assuming that it can simply call the relevant Lambda handler function, is immediate, not requiring a Docker container to spin up first.

There are trade-offs, to be sure. The SAM CLI presumably supports other Serverless Application Model resources that may not have local-API equivalents. In particular, GraphQLApi, SimpleTable and StateMachine resources, if they are needed by an application, are likely to need special handling from a local development and testing perspective. All of the other API types, though, can be represented at a very basic level, accepting requests and returning responses, and Lambda Layers are just a code-import problem to be solved. The remaining SAM resource-types I cannot speak to, having never needed to use them, and any additional resources defined in a SAM template using standard CloudFormation are almost certainly not able to be represented in a local API implementation.

For the sake of this post, I’m going to start with Flask as the API provider, not because it’s necessarily better, but because I’m more familiar with it than any of the other options. A “toy” project layout will be helpful in describing what I’m trying to accomplish:

toy-project/
├─ Pipfile
│  │  # Packages managed in categories
│  ├─ [api-person-rest]
│  │  └─ ...
│  └─ [local-api]
│     └─ Flask
├─ Pipfile.lock
├─ .env
├─ src/
│  │  # The modules that define the Lambda Handler functions
│  └─ api_person_lambdas.py
│     │  # The functions that handle {HTTP-verb} requests
│     ├─ ::get_person(event, context)
│     ├─ ::post_person(event, context)
│     ├─ ::put_person(event, context)
│     ├─ ::patch_person(event, context)
│     └─ ::delete_person(event, context)
├─ local-api/
│  └─ api_person_rest.py
│     │  # Flask() object, accepts methods (e.g., 'GET', 'POST'),
│     │  # app provides a 'route' decorator.
│     ├─ ::app
│     │  # These are decorated with app.route('path', methods=[]),
│     │  # and Flask provides a request object that may be used
│     │  # in each.
│     ├─ ::api_get_person()
│     ├─ ::api_post_person()
│     ├─ ::api_put_person()
│     ├─ ::api_patch_person()
│     └─ ::api_delete_person()
└─ tests/

In this project, the Flask application lives entirely under the local-api directory, and its api_person_rest module defines a fairly typical set of CRUD operation functions for HTTP GET, POST, PUT, PATCH and DELETE requests. Each of those functions is decorated according to Flask standards; the bare bones of the code in api_person_rest.py would start with something like this, assuming a common /person route, and no other parameters defined at this point:

from flask import Flask, request

app = Flask(__name__)

@app.route('/person', methods=['GET'])
def api_get_person():
    """Handles GET /person requests"""
    # Needs to call get_person(event, context)
    ...

@app.route('/person', methods=['POST'])
def api_post_person():
    """Handles POST /person requests"""
    # Needs to call post_person(event, context)
    ...

@app.route('/person', methods=['PUT'])
def api_put_person():
    """Handles PUT /person requests"""
    # Needs to call put_person(event, context)
    ...

@app.route('/person', methods=['PATCH'])
def api_patch_person():
    """Handles PATCH /person requests"""
    # Needs to call patch_person(event, context)
    ...

@app.route('/person', methods=['DELETE'])
def api_delete_person():
    """Handles DELETE /person requests"""
    # Needs to call delete_person(event, context)
    ...

When the local API is actually running, requests to any of the /person-route endpoint functions would be received based on the HTTP verb/action involved. From there, what needs to happen is a series of steps that is simple to describe, but whose implementation may be quite a bit more complex (a rough sketch in code follows the list):

  • The API function needs to know to call the appropriate function from src/api_person_lambdas. For example, if a GET /person request is received by the API, the routing defined will tell the API to call the api_get_person function, and that function will need to call the api_person_lambdas::get_person function.
  • Before actually making that function-call, the incoming request needs to be converted into a Lambda Proxy Integration input data-structure. The Lambda Powertools Parser package could be installed and leveraged to provide a pre-defined data-model, complete with validation of the data types, to that end.
  • Since the Lambda handler also has a context argument, and that may or may not be used by the handler, creation of a Lambda context object also needs to happen.
  • Once the event and context have been created, the API function can call the Lambda handler: api_person_lambdas::get_person(event, context).
  • The Lambda handler is expected, at least in this case, to return a Lambda Proxy Integration output (which may also be represented in the Lambda Powertools models, though the naming of those models isn’t clear enough to say with any certainty whether that is the case or not).
  • The response from the Lambda handler will need to be converted to a Flask Response object, possibly using the make_response helper-function that Flask provides.
  • That response will be returned through normal Flask response processes.
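
A rough sketch of those steps in code, assuming a hypothetical lambda_proxy decorator sitting between Flask's app.route decorator and the endpoint function; the event below covers only a handful of the Lambda Proxy Integration fields, and the context object is a bare stand-in rather than a real LambdaContext:

from functools import wraps
from types import SimpleNamespace

from flask import request, make_response

def lambda_proxy(handler):
    """Wrap a Flask view function so it delegates to a Lambda handler."""
    def decorator(view_func):
        @wraps(view_func)
        def wrapper(*args, **kwargs):
            # Convert the incoming Flask request into a minimal
            # Lambda Proxy Integration event structure
            event = {
                'httpMethod': request.method,
                'path': request.path,
                'headers': dict(request.headers),
                'queryStringParameters': dict(request.args) or None,
                'body': request.get_data(as_text=True) or None,
                'isBase64Encoded': False,
            }
            # Stand-in for the LambdaContext object; building a
            # realistic one is one of the open questions below
            context = SimpleNamespace(
                function_name=handler.__name__,
                memory_limit_in_mb=128,
                aws_request_id='local-request',
            )
            result = handler(event, context)
            # Convert the Lambda Proxy Integration output back into
            # a Flask response
            response = make_response(
                result.get('body', ''),
                result.get('statusCode', 200),
            )
            for key, value in (result.get('headers') or {}).items():
                response.headers[key] = value
            return response
        return wrapper
    return decorator

Used against the toy project above, and assuming get_person had been imported from api_person_lambdas, the GET endpoint might then look like:

@app.route('/person', methods=['GET'])
@lambda_proxy(get_person)
def api_get_person():
    """Handles GET /person requests by delegating to get_person."""
    ...  # never reached; the decorator calls get_person instead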

With those in mind, the to-do list for this package effort boils down to these items, I think:

  • Figure out how to map API (Flask) endpoint-function calls to their corresponding Lambda Function handlers.
  • Figure out how to convert an API request into a Lambda Proxy Integration event structure.
    • Take a deeper look at the Lambda Powertools parsing extra to see if it provides both input/request and output/response models for that integration.
  • Figure out how to generate a meaningful, realistic LambdaContext object from an API request.
    • If it’s not possible, or not a realistic expectation for complete LambdaContext objects to be populated, define a minimum acceptable basis for creating one.
  • Determine the best approach for having a route-decorated API function call the appropriate Lambda handler. Some possibilities to explore include:
    • An additional decorator between the app.route decorator provided by Flask and the target function.
    • Extending the Flask Application object to add a new route-equivalent decorator that handles the process.
    • Overriding the existing decorator to handle the process.
    • Manually dealing with it in some fashion is acceptable as a starting-point, but not where it should end up by the time the package reaches a Development Status :: 5 - Production/Stable / v.1.0.0 release.
  • Figure out how to convert a Flask Response object to a Lambda Proxy Integration output object.
  • Implement anything that wasn’t implemented as part of the discovery above.
  • Test everything that wasn’t tested during previous implementation.
  • Release v.1.0.0
  • Figure out how to read a SAM Template file to automate the creation of endpoint-function to Lambda-handler function processes, and implement it.
  • Release v.1.1.0

And with that roadmap all defined, at least for now, this post is done, I think. As I write more on this idea, and get the package moving and released, the relevant posts will all be tagged with “local.lpi_apis” (Lambda Proxy Integration APIs) for ease of following my stream of thoughts and work on it.

Monday, May 5, 2025

Next steps after v.0.0.3

With the release of v.0.0.3 of the package, the main functionality that I’d had in mind was complete. I’d also gotten the build-process worked out to my satisfaction, built as a Bitbucket Pipeline controlled by a definition file in the repo. I’d originally planned for v.0.0.4 to be spent implementing the unit tests that I’d only stubbed out, and making alterations as needed when or if those actual tests surfaced issues that needed to be corrected. However, when a real-world opportunity to apply PACT testing to code that I hadn’t been involved with surfaced, I changed those plans in order to try to make that viable.

The v.0.0.4 release that fulfilled that desire focused mainly on adding the last pieces needed for the build-process to publish the package to PyPI, where it can be found today as goblinfish-testing-pact. Previous experience with PyPI-compatible repositories (JFrog’s Artifactory, specifically) had, some years back, left me with the impression that there was no guarantee that any PyPI repository would prevent overwriting of an existing package-version, and that if such an overwrite occurred, it would mess with the SHA-256 hashes that were attached to a package, making an existing installation invalid. In order to at least try to prevent those sorts of complications from arising, I decided to implement a test-module (tests/packaging/check_pypi.py) that would actively look in a specified PyPI repository for the current package version, using the pyproject.toml file to identify the package name and version. If that test failed, the pipeline would stop, preventing an overwrite of an existing version of the package. The relevant pieces of the pyproject.toml file were the project’s name and version:

# Taken from the v.0.0.4 pyproject.toml file
[project]
name = "goblinfish-testing-pact"
version = "0.0.4"

These data-points are used to build the PyPI URL, read the content from that page, and search for a package name and version that match in the response. If any matches are found, that indicates that a version of the package with the current version specification has already been published, and a test failure is raised, which will terminate the execution of the pipeline. The test-code itself is quite simple, bordering on painfully brute force, but it works:

# Imports needed by this excerpt; tomllib requires Python 3.11+
from http.client import HTTPSConnection
from pathlib import Path
from tomllib import loads

PROJECT_ROOT = Path(__file__).parent.parent.parent
PROJECT_TOML = PROJECT_ROOT / 'pyproject.toml'

project_data = loads(PROJECT_TOML.read_text())

def test_package_new_in_pypi():
    """
    Tests that the current version of the package does not exist
    in the public pypi.org repository.
    """
    root_url = 'pypi.org'
    pypi_site = HTTPSConnection(root_url)
    pypi_name = project_data['project']['name'].replace('_', '-')
    package_name = project_data['project']['name'] \
        .replace('-', '_')
    pypi_version = project_data['project']['version']
    url = f'/simple/{pypi_name}/'
    pypi_site.request('GET', url)
    response = pypi_site.getresponse()
    package_list = response.read().decode()
    search_string = f'{package_name}-{pypi_version}'
    assert package_list.find(search_string) == -1, (
        f'Found {search_string} in the versions list at '
        f'{root_url}{url}'
    )

The circumstances around applying the package’s test prescriptions brought to my attention that there were other types of data descriptors that needed to be accounted for than just the basic property types that were in place. My initial efforts were limited to identifying property object class members using the inspect.isdatadescriptor function, which I expected would satisfy the need to identify both property objects and any other objects that implemented the descriptor protocol. However, I encountered an odd bit of code that isdatadescriptor did not identify, so I made changes to accommodate that, checking for both the standard property methods (fget, fset, and fdel) and implementations that had any of the descriptor-protocol methods (__get__, __set__, and __delete__), even if those objects did not have all of them. As an interim solution to the odd case I encountered, that seemed to do the trick, but I wasn’t happy with where that left things, and so I planned to re-examine that later.
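
A minimal sketch, not the package’s actual code, of the interim check described above: treat a member as a managed attribute if it looks like a property, or if it carries any of the descriptor-protocol methods.

def _looks_like_managed_attribute(member) -> bool:
    """Interim check: property-style members or descriptor-protocol members."""
    if isinstance(member, property):
        return any(
            getattr(member, name) is not None
            for name in ('fget', 'fset', 'fdel')
        )
    # Caveat: this over-matches, since plain functions also define
    # __get__; that looseness is part of why the interim solution
    # needed to be revisited.
    return any(
        hasattr(member, name)
        for name in ('__get__', '__set__', '__delete__')
    )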

I cannot disclose the specific details of the code that raised this concern (it was work done under an NDA), but I can provide an example of the sort of thing I encountered. If a complete descriptor class is defined as:

class CompleteDescriptor:

    def __set_name__(self, owner, name):
        self._prop_name = name
        self._name = f'_{owner.__name__}__{name}'

    def __get__(self, obj, type=None):
        if obj is None:
            return self
        if hasattr(obj, self._name):
            return getattr(obj, self._name)
        raise RuntimeError(
            f'{type(obj).__name__}.{self._prop_name} has not been set'
        )

    def __set__(self, obj, value):
        setattr(obj, self._name, value)

    def __delete__(self, obj):
        if hasattr(obj, self._name):
            delattr(obj, self._name)
            return None
        raise RuntimeError(
            f'{type(obj).__name__}.{self._prop_name} has not been set'
        )

… and a class is defined that uses that plus some variations that omit the __delete__ and/or __set__ methods of the descriptor:

class Thingy:

    get_only = GetOnlyDescriptor()
    get_set_only = GetSetDescriptor()
    get_del_only = GetDelDescriptor()
    get_set_del = CompleteDescriptor()

…then the following inspect.isdatadescriptor-based code:

from inspect import isdatadescriptor

print(
    'isdatadescriptor(get_only) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_only)}'
)
print(
    'isdatadescriptor(get_set_only) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_set_only)}'
)
print(
    'isdatadescriptor(get_del_only) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_del_only)}'
)
print(
    'isdatadescriptor(get_set_del) '.ljust(34, '.')
    + f' {isdatadescriptor(Thingy.get_set_del)}'
)

…yields the following output:

isdatadescriptor(get_only) ....... False
isdatadescriptor(get_set_only) ... True
isdatadescriptor(get_del_only) ... True
isdatadescriptor(get_set_del) .... True
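
For reference, the omitted variants used in the Thingy class could be as simple as the following minimal sketches, which are consistent with that output: GetOnlyDescriptor defines only __get__, so it is not a data descriptor, while the other two each add one of the remaining protocol methods (error handling trimmed for brevity).

class GetOnlyDescriptor:

    def __set_name__(self, owner, name):
        self._name = f'_{owner.__name__}__{name}'

    def __get__(self, obj, type=None):
        return self if obj is None else getattr(obj, self._name)

class GetSetDescriptor(GetOnlyDescriptor):

    def __set__(self, obj, value):
        setattr(obj, self._name, value)

class GetDelDescriptor(GetOnlyDescriptor):

    def __delete__(self, obj):
        delattr(obj, self._name)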

The two key pieces of information that were missed in the original code that prompted this investigation, from the Descriptor HowTo Guide are:

If an object defines __set__() or __delete__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are often used for methods but other uses are possible).

and

To make a read-only data descriptor, define both __get__() and __set__() with the __set__() raising an AttributeError when called. Defining the __set__() method with an exception raising placeholder is enough to make it a data descriptor.

The issue that originally sparked this was an attempt to create a data descriptor that did not implement either __set__ or __delete__ — making it a non-data descriptor, as noted above. This was a distinction that I wasn’t aware of until it surfaced in the code I was poking around at.

Important

It’s also important to note that a class that implements the descriptor protocol is not, itself, a data descriptor. A data descriptor is an instance of a class that implements the protocol!

The v.0.0.4 release also completed the stubbing out of unit tests, providing the prescribed test methods for the properties that were not tested prior to that in v.0.0.3.

The v.0.0.5 release was really nothing more than adding the license and some manual setup instructions, showing how to start a PACT test-suite, and how to iterate against the various types of test failures that would occur until all the prescribed test-entities were accounted for.

The v.0.0.6 release was mostly concerned with cleaning up and adding items to the code-base in preparation for starting this series of articles.

I noticed a few other random things that I want to fix, or at least think about in v.0.0.7 as well, and a couple of items that I didn’t discuss in the post for v.0.0.3. The one item to definitely be fixed is the handling of test-method name expectations for dunder methods in the source entities, like __init__. Somehow, while working through the name-generation process for test-methods, I missed accounting for the combination of the test prefix (test_) being added to a dunder-method name, leading to three underscores in the expected test-method name: test___init__, for example, instead of test__init__. That’s not a functional issue; the test-methods are still being prescribed, but it’s one of those little things that will annoy me if I don’t fix it.

The item that I’ll need to think on is whether or not to prescribe _unhappy_paths test-methods for methods and functions that have no arguments, or at least no arguments that aren’t part of an object scope (self and cls). Based on some common patterns that I’ve seen and used in work projects, I would plan for relevant _happy_paths test-methods to check for things like cached returns and returns of appropriate types, but I have this nagging suspicion that I missed something in the potential for unhappy-path results to occur as well.

The first of the two ideas that I have implemented but didn’t think to discuss is how inherited member tests play out. I did set things up so that test-methods in a given class would not include members that were inherited from some other class. The logic here is that if the PACT processes are applied as expected, there would be a prescribed test for every class-member in the related test-case for the class that they are defined in. If that source class is used as a parent class for a different class, and the members of that parent class are not overridden, then the prescribed tests of the parent class’ members would provide the required testing. If a parent class member was overridden, then the prescribed test requirements on the child class would detect that, and require a new test-method in the test case class that relates to the child. The function used to determine inheritance status is already in the code, and is quite simple — Its documentation is longer than its code:

def is_inherited(member_name: str, cls: type) -> bool:
    """
    Determines whether a member of a class, identified by its
    name, is inherited from one or more parent classes of a
    class.

    Parameters:
    -----------
    member_name : str
        The name of the member to check inheritance status of
    cls : type
        The class that the member-name is checked for.

    Returns:
    --------
    True if the named member is inherited from a parent.
    False otherwise
    """
    parents = cls.__mro__[1:]
    result = any(
        [
            getattr(parent, member_name, None) \
                is getattr(cls, member_name)
            for parent in parents
        ]
    )
    logger.debug(f'is_inherited({member_name}, {cls}): {result}')
    return result
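
A quick illustration of how that plays out, with hypothetical classes:

class Parent:

    def greet(self):
        return 'hello'

class Child(Parent):
    pass

class Overriding(Parent):

    def greet(self):
        return 'hi'

# is_inherited('greet', Child) returns True: greet comes from Parent,
# so no new prescribed test is needed for Child.
# is_inherited('greet', Overriding) returns False: greet is overridden,
# so the test-case class for Overriding needs its own test-method.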

While I’m on the topic of test prescriptions and their relationship to inheritance, it also feels worthwhile to note that abstract members of classes will have prescribed test requirements too. That was a conscious decision made during a previous pass at the idea that this package implements, and though I didn’t make a conscious decision about it in this iteration, the logic behind the previous decision still holds true, I think: If the PACT processes are intended to assure that the contracts of source entities are tested, and given that an abstract member of a class is part of that class’ interface, it follows that there should be test-methods associated with abstract members of classes. What has not been given any consideration yet, in that area, is whether the prescribed tests should include all of the requirements expectations for a concrete member. Doing so would increase the number of test-methods to no good end, I feel, so I’m inclined to add detection of abstract state for a class-member wherever possible, and limit the test prescription accordingly, but I will have to think more on it before I settle on that as the preferred path. For example, given this abstract class:

from abc import ABC, abstractmethod

class BaseThingy(ABC):

    @abstractmethod
    def some_method(self, arg, *args, kwdonly1, **kwargs):
        pass

…the current test-prescription process would require test_some_method_happy_paths, test_some_method_bad_arg, test_some_method_bad_args, test_some_method_bad_kwdonly1, and test_some_method_bad_kwargs test-methods, all of which would really only be able to test that an object of a class deriving from BaseThingy could not be instantiated without implementing some_method.

Those notes get this series of posts up to date with respect to the package version that’s in the repo and installable from PyPI as of the beginning of May, 2025. With everything I’ve covered in this post in mind, the work for the v.0.0.7 release, whenever I can get to starting it, will include:

Implement all unit tests.
Rework the property/data-descriptor detection to handle properties specifically, and other descriptors more generally:
Member properties can use isinstance(target, property), and can be checked for fget, fset and fdel members not being None.
Other descriptors can use inspect.isdatadescriptor, but are expected to always have the __get__, and at least one of the __set__ and __delete__ members shown in the descriptor protocol docs.
Set test-method expectations based on the presence of get, set and delete methods discovered using this new breakout.
Update tests as needed!
Correct the maximum number of underscores in test-name expectations: no more than two in a row (test__init__, not test___init__).
Think about requiring an _unhappy_paths test for methods and functions that have no arguments (or none but self or cls).
Give some thought to whether to require the full set of test-methods for abstract class members, vs. just requiring a single test-method, where the assertion that the source member is abstract can be made.

The full implementation of the existing unit tests is, to my thinking, a critical step in this. After all, part of the purpose that they serve is to provide regression testing, and it just feels like the balance of the code is stable enough that regression makes sense to add now, rather than waiting for another round of changes. The balance of the items in that list may take a while to work through to my satisfaction: As I’m writing this, there are 90 test-methods that I have yet to examine and potentially implement just to get that first item checked off, and since that needs to happen first, I may not post about the PACT stuff again for a bit.

Thursday, May 1, 2025

The PACT for classes

Note

This version of the package was tagged as v.0.0.3 in the project repository, and can be examined there if desired.

The test-expectations and goals for members of classes are, fundamentally, identical to those for functions. Specifically:

Goal
  1. Every method of a class should have a corresponding happy-paths test method, testing all permutations of a rational set of arguments across the parameters of the method.
  2. Every method should also have a corresponding unhappy-path test-method for each parameter.

The main differentiator is not in what kinds of expectations are present, but in how those expectations are defined and applied to members of classes. With the exception of the main test-method (test_source_class_has_expected_test_methods), the implementation of the expected_test_entities and source_entities properties of the ExaminesSourceClass class, and the addition of an INVALID_DEL_SUFFIX used to identify test methods for invalid property and data-descriptor delete-methods, the relationships between the members of the ExaminesSourceClass class are pretty much identical to those between the members of the ExaminesSourceFunction class from the previous article. Diagrammed, those members’ relationships to each other are:

The key reason behind the differences between setting expectations between classes and functions is that classes have members, while functions do not. The members of classes that can be meaningfully tested to the extent that setting a testing expectation for them are limited to methods and data-descriptors. Methods, when it comes right down to it, are just functions that may have a common scope parameter: self for instance methods, indicating which object instance the method should act in relation to, or cls for methods that the @classmethod decorator has been applied to, indicating which class the method should act in relation to. There is also the @staticmethod decorator, which is used to attach a method to a class without either an instance or a class scope expectation. In all of these cases, the code for the method looks like a simple function, for example:

class MyClass:

    def instance_method(self):
        pass

    @classmethod
    def class_method(cls):
        pass

    @staticmethod
    def static_method():
        pass

The identification of any method in the target_class by the source_entities property uses the inspect.getmembers function to find members, the built-in callable function to detect callables, and inspect.isclass to filter out any callables that are classes rather than functions or methods. Similarly, the same property uses inspect.getmembers in conjunction with the inspect module’s isdatadescriptor function to identify members defined as properties with the property decorator.
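
A minimal sketch of the kind of inspect.getmembers filtering just described, using illustrative function names rather than the package’s actual property implementations:

import inspect

def method_members(target_class):
    """Callables that are not nested classes: instance, class and static methods."""
    return {
        name: member
        for name, member in inspect.getmembers(target_class)
        if callable(member) and not inspect.isclass(member)
    }

def data_descriptor_members(target_class):
    """Members that are properties or other data-descriptors."""
    return {
        name: member
        for name, member in inspect.getmembers(target_class)
        if inspect.isdatadescriptor(member)
    }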

Warning

As I was writing this article, and doing the relevant review, I noticed a discrepancy between how source_entities behaves in comparison with expected_test_entities. I’m not sure if it is significant or not as I’m writing this, but since this article is being written about v.0.0.3 and the current repo version is v.0.0.6, it won’t really be addressed until v.0.0.7 if it is a problem.

Properties, and data descriptors in general, are objects attached to the classes they are members of, following an interface structure that is hinted at in the Pure Python Equivalents: Properties example of the Descriptor HowTo Guide. Descriptors have __get__, __set__ and __delete__ methods, and property objects augment that with fget, fset and fdel members that contain the getter-, setter-, and deleter-methods that are written in user-generated code like so:

class MyClass:

    @property
    def name(self):
        ...

    @name.setter
    def name(self, value):
        ...

    @name.deleter
    def name(self):
        ...

Because a data-descriptor may be a property or a custom descriptor type, and thus may or may not have the property-specific fget, fset and fdel members, the expectations for test-methods for properties and descriptors may not be able to reliably determine if the related actions need to be tested. That is, a property might be defined with a getter and setter, but no deleter, in which case its fdel member will be None. A non-property descriptor is expected to always have the __get__, __set__ and __delete__ member methods, but they might not be implemented, and there may not be a good way to make that distinction without simply requiring that all non-property descriptors test all three methods’ actions. The fact that all of those members and methods are, themselves, defined as methods still allows the same parameter/argument expectations to be determined: the example name property above could be expected to have test_name_happy_paths and test_name_set_bad_value test-methods, testing the happy-path set/get processes and unhappy set-scenarios, respectively. It would also need a test_name_invalid_del test-method, assuming that the property has a deleter. If name were defined as a more generic data-descriptor, all three test-methods would be expected, since there is no reliable way to tell which of the protocol methods are actually implemented.

Tip

There are other implementations that behave in ways similar to property and general descriptor objects that may not implement the descriptor interface. I know of one example offhand: the various Field types provided by Pydantic. Those will have to be handled with more specific functionality later, but that won’t be a consideration until the v.1.0.0 version of this package is complete.

With those properties added to the MyClass class, running the test-suite before adding any of the expected test-methods results in the expected failures:

================================================================
...
AssertionError: False is not true : 
  Missing expected test-method - test_class_method_happy_paths
================================================================
...
AssertionError: False is not true :
  Missing expected test-method - test_name_set_bad_value
================================================================
...
AssertionError: False is not true :
  Missing expected test-method - test_name_invalid_del
================================================================
...
AssertionError: False is not true :
  Missing expected test-method - test_instance_method_happy_paths
================================================================
...
AssertionError: False is not true :
  Missing expected test-method - test_static_method_happy_paths
================================================================
...
AssertionError: False is not true :
  Missing expected test-method - test_name_happy_paths
----------------------------------------------------------------

After adding all of the expected test-methods reported as missing, the example project’s structure looks like this:

project-name/
├─ Pipfile
├─ Pipfile.lock
├─ .env
├─ src/
│   └─ my_package/
│      └─ module.py
│         ├─ ::MyClass
│         │  ├─ ::instance_method() # instance method
│         │  ├─ ::name              # property
│         │  ├─ ::class_method()    # class method
│         │  └─ ::static_method()   # static method
│         └─ ::my_function()
└─ tests/
    └─ unit/
       └─ test_my_package/
          ├─ test_project_test_modules_exist.py
          │  └─ ::test_ProjectTestModulesExist
          └─ test_module.py
             ├─ ::test_MyClass
             │  │  # Property tests
             │  ├─ ::test_name_happy_paths
             │  ├─ ::test_name_set_bad_value
             │  ├─ ::test_name_invalid_del
             │  │  # Method tests
             │  ├─ ::test_instance_method_happy_paths
             │  ├─ ::test_class_method_happy_paths
             │  └─ ::test_static_method_happy_paths
             └─ ::test_my_function

And, finally, the complete module_members class-diagram looks like this:

At this point, the package does everything that the most basic interpretation of my initial desires required: With adequate inclusion of the project- and module-level tests, and some short but relatively tedious manual effort to stub out the test-suite for the package itself, there are 108 test methods defined, 96 of which are still pending implementation. The results of the test-suite show that clearly:

===================== test session starts ======================
tests/unit/test_goblinfish/test_testing/test_pact/test_abcs.py 
                                              ..ssssssss  [  9%]
tests/unit/test_goblinfish/test_testing/test_pact/
    test_module_members.py 
           .ssss.sssssssssssssssssssss.ssssssssssssssss.  [ 50%]
tests/unit/test_goblinfish/test_testing/test_pact/
    test_modules.py              .ssss.sssssssssssssssss  [ 72%]
tests/unit/test_goblinfish/test_testing/test_pact/
    test_pact_logging.py                               .  [ 73%]
tests/unit/test_goblinfish/test_testing/test_pact/
    test_project_test_modules_exist.py                 .  [ 74%]
tests/unit/test_goblinfish/test_testing/test_pact/
    test_projects.py                ss.sssssssssssssssss  [100%]
================= 12 passed, 96 skipped in 0.05s ===============

Running a coverage report shows about what I’d expect at this point as well: a fair bit of the code is being called in most cases, but nowhere near what I’d like. The module_members.py missing-lines report is most of the source file, modules.py has 7 substantial chunks identified as currently untested, and projects.py has 4. Even just the simple percentages reported are a good indicator:

Name                                             Stmts   Miss  Cover
src/goblinfish/testing/pact/abcs.py                 13      3    77%
src/goblinfish/testing/pact/module_members.py      131    129     2%
src/goblinfish/testing/pact/modules.py              74     44    41%
src/goblinfish/testing/pact/pact_logging.py         18      5    72%
src/goblinfish/testing/pact/projects.py             63     15    76%

Still, as I noted in the first article in this series, I hadn’t originally planned to get tests actually implemented until after this point had been reached. After the little bits of chaos that interfered with that original plan, which I’ll get into more detail about in the next article, actual test-implementations and the fixes that would come of that were deferred until v.0.0.7, which I’ll get into in the article after next.
