When last I wrote about the goblinfish-testing-pact
package, I had a checklist of things that I was going to pursue:
- Implement all unit tests.
- Rework the property/data-descriptor detection to handle properties specifically, and other descriptors more generally:
- Member properties can use
isinstance(target, property)
, and can be checked forfget
,fset
andfdel
members not beingNone
. - Other descriptors can use
inspect.isdatadescriptor
, but are expected to always have the__get__
, and at least one of the__set__
and__delete__
members shown in the descriptor protocol docs. - Set test-method expectations based on the presence of get, set and delete methods discovered using this new breakout.
- Update tests as needed!
- Correct maximum underscores in test-name expectations: no more than two (
__init__
, not___init__
). - Think about requiring an
_unhappy_paths
test for methods and functions that have no arguments (or none butself
orcls
). - Give some thought to whether to require the full set of test-methods for abstract class members, vs. just requiring a single test-method, where the assertion that the source member is abstract can be made.
I had originally intended to address these items in the order they were
listed, but as I started thinking through the implications of completing
the entire unit test suite first, it occurred to me that going down that
path would almost certainly end up creating more work than I really needed.
Specifically, while the idea of having a complete test-suite for the package
before making changes had its appeal — providing a regression testing
mechanism is, generally speaking, a good thing — but I was all but certain that
I was going to have to completely re-think how I was dealing with detecting
and analyzing @property
members
and custom data-descriptor members of classes. That would, I felt, put me in
the position of writing a bunch of tests that were likely to go away in very
short order. Writing tests is rarely a waste of time, but in this particular
case, I expected that I would be spending a fair chunk of time writing them,
then time revising the processes behind them, then reconciling the new code
against the old tests, fully expecting that many of them would have to be
rewritten to a significant degree.
The thinking behind the expectation that I would need to rewrite the property-
(and property-like) test-name-expectations processes hinged on various future
states that I wanted to support for the package. To start with, all I was
concerned about was generating expected test-method names for class attributes
that were managed using one of the built-in options for attribute management:
the @property
decorator, and custom solutions that used the built-in Descriptor
protocol (which also include method-sets built with the @property
decorator). A property
is a data-descriptor, but not all data-descriptors
implement the property
interface. That is worth showing in some
detail, so consider the following class and code:
from inspect import isdatadescriptor
class Person:
@property
def first_name(self):
return self._first_name
@first_name.setter
def first_name(self, value):
self._first_name = value
@first_name.deleter
def first_name(self):
try:
del self._first_name
except:
pass
def __repr__(self):
return (
f'<{self.__class__.__name__} at {hex(id(self))} '
f'first_name={self.first_name}>'
)
inst = Person()
inst.first_name = 'Brian'
print(repr(inst))
print(
'• isinstance(Person.first_name, property) '.ljust(48, '.')
+ f' {isinstance(Person.first_name, property)}'
)
print(
'• isdatadescriptor(Person.first_name) '.ljust(48, '.')
+ f' {isdatadescriptor(Person.first_name)}'
)
If this is dropped into a Python module and executed, it will output something along the lines of:
<Person at 0x102247650 first_name=Brian> • isinstance(Person.first_name, property) ...... True • isdatadescriptor(Person.first_name) .......... True
This is expected behavior: The built-in property
class
implements the descriptor protocol mentioned earlier, which is really
no more, apparently, than implementing __get__
, __set__
,
and __delete__
methods. A property
object
also has fget
, fset
and fdel
methods, which are where the property
methods in the code are
stored, and that are called by the __get__
, __set__
,
and __delete__
methods of the descriptor. That is, when
the @property
is applied to first_name
in the
class above, that method is stored in the fget
method of
the resulting property
object, and is called by the __get__
method. Similarly the @first_name.setter
and @first_name.deleter
decorations attach the methods they decorate to the fset
and fdel
methods of the property, which are called by the
__set__
and __delete__
methods, respectively.
ImportantAll properties are data descriptors.
So, what happens if we add a data-descriptor? Here's a bare-bones
implementation of one, added to the same Person
class,
and with updates to show the results:
from inspect import isdatadescriptor
class Descriptor:
def __set_name__(self, owner, name):
self.__name__ = name
self.__private_name__ = f'_{name}'
def __get__(self, obj, objtype=None):
if obj is None:
return self
try:
return getattr(obj, self.__private_name__)
except Exception as error:
raise AttributeError(
f'{obj.__class__.__name__}.{name} '
'has not been set'
)
def __set__(self, obj, value):
setattr(obj, self.__private_name__, value)
def __delete__(self, obj):
try:
delattr(obj, self.__private_name__)
except Exception as error:
raise AttributeError(
f'{obj.__class__.__name__}.{name} does '
'not exist to be deleted'
)
class Person:
@property
def first_name(self):
return self._first_name
@first_name.setter
def first_name(self, value):
self._first_name = value
@first_name.deleter
def first_name(self):
try:
del self._first_name
except:
pass
last_name = Descriptor()
def __repr__(self):
return (
f'<{self.__class__.__name__} at {hex(id(self))} '
f'first_name={self.first_name} '
f'last_name={self.last_name}>'
)
inst = Person()
inst.first_name = 'Brian'
inst.last_name = 'Allbee'
print(repr(inst))
print(
'• isinstance(Person.first_name, property) '.ljust(48, '.')
+ f' {isinstance(Person.first_name, property)}'
)
print(
'• isdatadescriptor(Person.first_name) '.ljust(48, '.')
+ f' {isdatadescriptor(Person.first_name)}'
)
print(
'• isinstance(Person.last_name, property) '.ljust(48, '.')
+ f' {isinstance(Person.last_name, property)}'
)
print(
'• isdatadescriptor(Person.last_name) '.ljust(48, '.')
+ f' {isdatadescriptor(Person.last_name)}'
)
Running this updated module code results in:
<Person at 0x1009bea10 first_name=Brian last_name=Allbee> • isinstance(Person.first_name, property) ...... True • isdatadescriptor(Person.first_name) .......... True • isinstance(Person.last_name, property) ....... False • isdatadescriptor(Person.last_name) ........... True
From that output, it is apparent that:
ImportantNot all data descriptors are properties
This is also expected behavior, based on the Descriptor protocol, which notes that:
Define any of these methods* and an object is considered a descriptor and can override default behavior upon being looked up as an attribute.
*__get__
,__set__
and__delete__
That does mean, though, that there are two distinct interfaces to contend with just in the built-in options. The fact that those interfaces overlap was mildly annoying to me, since it meant that a check for whether a given class-member is a property has to happen before checking whether it is a non-property data-descriptor, but in the grander scheme of things, that is not really a big deal, I thought.
Earlier I mentioned that there were future states that I also wanted the
package to support. One of those is Pydantic,
which I find I use directly or through the
Parser extra of the Lambda Powertools package.
Pydantic is, at a minimum, a data validation library, that allows a
developer to define structured data (as classes) with attributes (Field
s)
that perform runtime type- and value checking for objects. The other
is the BaseModel
and Field
functionality
provided by Django.
Django is a high-level Python web framework that encourages rapid
development and clean, pragmatic design,
that has been around since
2010, and is a very popular framework for developing web applications
using Python.
While I wasn't going to worry about the actual implementation for those
just yet, I needed to have a solid idea of whether their respective
Field
objects followed one of the interfaces that I'd already
accounted for. The short answer to that question, for both, was no,
unfortunately. In Pydantic's case, the fields, defined explicitly using a
full Field
call as shown in the
documentation, do not even exist as named members of the class.
That is, given a module with:
from pydantic import BaseModel, Field
class Person(BaseModel):
first_name: str = Field()
last_name: str = Field()
print(Person.first_name)
…running that module raises an AttributeError
:
File "/.../throwaways/pydantic-test/model-test.py", line 10, in <module> print(Person.first_name) ^^^^^^^^^^^^^^^^^ File "/.../pydantic-test/.../site-packages/pydantic/... /_model_construction.py", line 271, in __getattr__ raise AttributeError(item) AttributeError: first_name
As it turns out, Pydantic's model fields are stored in a class attribute,
__pydantic_fields__
, where they are tracked as a dictionary
of field-name/field-object key/value pairs. This could be verified by
adding the following code to the same module:
inst = Person(first_name='Brian', last_name='Allbee')
print(repr(inst))
from pprint import pprint
pprint(Person.__pydantic_fields__)
for field_name, field in Person.__pydantic_fields__.items():
print(
f'isinstance(Person.{field_name}, property) '
.ljust(48, '.') + f' {isinstance(field, property)}'
)
print(
'inspect.isdatadescriptor(Person.{field_name}) '
.ljust(48, '.') + f' {inspect.isdatadescriptor(field)}'
)
…which yielded this output:
Person(first_name='Brian', last_name='Allbee') Brian Allbee {'first_name': FieldInfo(annotation=str, required=True), 'last_name': FieldInfo(annotation=str, required=True)} isinstance(Person.first_name, property) ........ False inspect.isdatadescriptor(Person.{field_name}) .. False isinstance(Person.last_name, property) ......... False inspect.isdatadescriptor(Person.{field_name}) .. False
So, at least for the purposes of supporting test-method expectations for test suites that test Pydenatic model fields, there is yet another interface involved just finding the fields and their names. That, though, is a problem for another day, though it did show me that I would need to implement a very flexible solution.
Django's Field
objects are a little easier to deal with — at
a minimum, they can be referenced in a more normal fashion, for example
ClassName.FieldName
. The exploration code used to determine
how those field object behaved was:
import inspect
import sys
from django.db import models
from django.db.models.fields import Field
class Person:
first_name = models.CharField(max_length=30)
last_name = models.CharField(max_length=30)
birth_date = models.DateField()
class Meta:
ordering = ['last_name', 'first_name']
verbose_name = 'Person'
verbose_name_plural = 'People'
def __repr__(self):
return (
f'<{self.__class__.__name__} at {hex(id(self))} '
f'first_name={self.first_name} '
f'last_name={self.last_name} '
f'birth_date={self.birth_date}>'
)
def __str__(self):
return f'{self.first_name} {self.last_name}'
inst = Person()
inst.first_name='Brian'
inst.last_name='Allbee'
print(repr(inst))
print(type(Person.first_name))
print(type(Person.birth_date))
print(
'isinstance(Person.first_name, property) '.ljust(48, '.')
+ f' {isinstance(Person.first_name, property)}'
)
print(
'isinstance(Person.first_name, Field) '.ljust(48, '.')
+ f' {isinstance(Person.first_name, Field)}'
)
print(
'inspect.isdatadescriptor(Person.first_name) '.ljust(48, '.')
+ f' {inspect.isdatadescriptor(Person.first_name)}'
)
When run, that generated
<Person at 0x101ddbe10 first_name=Brian last_name=Allbee birth_date=<django.db.models.fields.DateField> > <class 'django.db.models.fields.CharField'> <class 'django.db.models.fields.DateField'> isinstance(Person.first_name, property) ........ False isinstance(Person.first_name, Field) ........... True inspect.isdatadescriptor(Person.first_name) .... False
They still are not property
objects, or recognized as
data-descriptors with inspect.isdatadescriptor
, but could
be extracted with a similar custom function that uses isinstace
to check class-members against the Field
base class that
appears to be the lowest common denominator for all of Django's
specific field classes. That should handle all the different variations
like the CharField
and DateField
shown in the
example above.
So, knowing now what I do about how the different variations of managed attributes (properties, descriptors, and model-field implementations for Pydantic and Django) behave, the goal is fairly straightforward: Figure out a way to scan all members of a target class, checking whether they fall into one of those categories, and building out an expected test-case-name set accordingly. Since Pydantic and Django aren't the only third-party packages out there that might have similar constraints, that implementation needs to be highly extensible as well — there's no telling how some other package might implement things, though I would hope that they would tend to gravitate towards using the built-in data-descriptor protocol, since it is built in. That implementation strategy will be the topic for my next post relating to this package.