Summary
A discussion of the "Single Purpose" intent of a class and how to confirm a class complies with it. See the "S" in SOLID.
How to determine if a class has a single purpose
A class should have a single purpose. See the "S" in SOLID: https://www.digitalocean.com/community/conceptual-articles/s-o-l-i-d-the-first-five-principles-of-object-oriented-design
The idea is that the class encapsulates a single entity with a single purpose. Having a single purpose has strong benefits:
- The class has one responsibility and therefore one and only one reason to change. Fewer changes mean fewer bugs.
- The class will be more stable. This assumes the domain for that class is stable, so the class itself changes less often. Again, fewer changes mean fewer bugs.
- Single-purpose classes are more likely to be reused in other applications.
The problem is: how do we know the class is defined correctly?
First steps
A class contains a set of member variables and a set of functions.
An initial observation is that each member variable and each function should relate to the others and to the class' purpose. That suggests these steps:
- write an initial definition of the class' purpose
- check each member variable and determine if that variable is associated with the class purpose
- check each function and determine if that function is associated with the class purpose
But the word "associated" is hazy. And there is a risk of redefining the class purpose to be "bigger" and thereby incorporate all the variables and functions.
Boyce-Codd Normal Form
A new perspective on this is to use the concept of a "key" from database theory. Boyce-Codd Normal Form (BCNF) relies on the distinction between a key (called the "prime-key" here) and the non-prime attributes. The idea is that the values held in the prime-key uniquely identify a row of the database table.
Apply that idea to a class: are there one or more member variables that uniquely identify an instance of the class?
For example:
class Person:
    def __init__(self):
        self.first_name = None
        self.last_name = None
        self.ssn = None
        self.street = None
        self.city = None
        self.state = None
        self.zipcode = None
Can self.street be a unique identifier (a prime-key) of a Person? No. Multiple Persons can live at the same address.
Can self.first_name, self.last_name, or a combination of the two be a unique identifier of a Person? No. There can be multiple Persons with the same name.
Is self.ssn a unique identifier of a Person? Yes.
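The three questions above can be tried against sample data. What follows is a minimal sketch, not a proof: the helper name `looks_unique` and the sample records are made up for illustration, and a sample can only show that a set of fields is NOT a key. It can never prove that a set of fields is one.

```python
from collections import Counter


def looks_unique(records, fields):
    """Return True if no two records share the same values for `fields`."""
    counts = Counter(tuple(rec[f] for f in fields) for rec in records)
    return all(count == 1 for count in counts.values())


# Made-up sample data, for illustration only.
people = [
    {"first_name": "Ann", "last_name": "Lee", "ssn": "000-00-0001", "street": "1 Main St"},
    {"first_name": "Ann", "last_name": "Lee", "ssn": "000-00-0002", "street": "9 Oak Ave"},
    {"first_name": "Bob", "last_name": "Lee", "ssn": "000-00-0003", "street": "1 Main St"},
]

print(looks_unique(people, ["street"]))                   # False
print(looks_unique(people, ["first_name", "last_name"]))  # False
print(looks_unique(people, ["ssn"]))                      # True
```

The final decision still comes from knowledge of the domain, not from the data on hand.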
Attributes
Once the idea of a prime-key is laid out, BCNF also provides the idea of attributes.
Each attribute is somehow related to the prime-key values. Or better yet, each attribute is strongly associated with the prime-key.
Refine the steps
- write an initial definition of the class' purpose
- determine the variables of the prime-key
  - double-check: these variables are strongly associated with the class purpose
  - double-check: every instance of the class will have specific values that uniquely identify that instance and no others
- check all other member variables and determine if each variable is strongly associated with the prime-key
- check each function and determine if that function is associated with the class purpose
How to check if an attribute is strongly associated
Each attribute is related to the prime-key. This has to be true. If it isn't, then that attribute does not belong in this class, i.e. the class is actually two classes merged into one.
Each attribute can be directly associated, indirectly associated, or weakly associated with the prime-key. To determine which:
- check each variable against all the other variables and the prime-key
- track the "strength" of the association
- if the variable's strongest association is with the prime-key, then it should be a member of this class
- if the variable's strongest association is with another variable, then those two members are part of another class, and an instance of that class should be a member of this class
- if the variable is weakly associated with the prime-key and weakly associated with the other member variables, then it can be a member of this class or not. The decision to split it off into a separate class involves:
  - long-term maintenance effort, i.e. what are the odds in the app's domain that the member variable will stay the same over the long term?
  - reuse benefits, i.e. is it possible to use this variable and its functions in another app?
  - isolation, i.e. do the functions that use this member variable ONLY operate on this variable? If so, it is likely that this variable and those functions form a separate class
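One way to keep track of the "strength" of each association is to write the scores down and let a small helper apply the rules above. This is only a sketch: the function `classify`, the 0 to 3 scale, the scores themselves, and the extra `date_of_birth` variable are all assumptions made up for illustration. The scores remain a judgment call; the code only makes that judgment explicit.

```python
def classify(variable, key, strengths, weak=1):
    """Suggest where `variable` belongs, based on hand-assigned scores.

    `strengths` maps frozenset({a, b}) to a score:
    0 = none, 1 = weak, 2 = medium, 3 = strong (an assumed scale).
    """
    def score(a, b):
        return strengths.get(frozenset((a, b)), 0)

    # Every other known variable, in a stable order.
    others = sorted({name for pair in strengths for name in pair} - {variable, key})
    to_key = score(variable, key)
    best_other = max(others, key=lambda o: score(variable, o), default=None)
    to_other = score(variable, best_other) if best_other is not None else 0

    if max(to_key, to_other) <= weak:
        return "judgment call"
    if to_key >= to_other:
        return "keep in this class"
    return f"extract with {best_other}"


# Hypothetical scores for the Person example; "ssn" is the prime-key.
scores = {
    frozenset(("first_name", "ssn")): 2,
    frozenset(("last_name", "ssn")): 2,
    frozenset(("first_name", "last_name")): 3,
    frozenset(("street", "ssn")): 1,
    frozenset(("city", "ssn")): 1,
    frozenset(("street", "city")): 3,
    frozenset(("date_of_birth", "ssn")): 3,  # made-up extra variable
}

print(classify("date_of_birth", "ssn", scores))  # keep in this class
print(classify("first_name", "ssn", scores))     # extract with last_name
print(classify("street", "ssn", scores))         # extract with city
```

A variable with no scores above "weak" comes back as "judgment call", which is where the maintenance, reuse and isolation questions take over.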
What about the functions?
The same basic idea works for the functions as well. Each function is somehow related to the prime-key. This has to be true. If it isn't, that function does not belong in this class i.e. again, the class is actually two merged into one.
- check if each function uses the prime-key.
- if a function doesn't use the prime-key, then it should use an attribute member variable
- if a function doesn't use the prime-key or an attribute member variable, then check other classes in the domain to see if that function has a stronger association with one of them.
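The function checks above can be partly automated by listing which member variables each method touches. The sketch below uses Python's standard `ast` module for that. The helper names, the sample methods (`tax_id`, `greeting`, `format_zip`) and the verdict wording are invented for illustration. Note one limitation: a call such as `self.helper()` is also counted as an attribute, so the result is a starting point for review, not a verdict.

```python
import ast


def attributes_used_by_methods(class_source):
    """Map each method name to the set of `self.<name>` attributes it touches."""
    tree = ast.parse(class_source)
    class_def = next(n for n in tree.body if isinstance(n, ast.ClassDef))
    usage = {}
    for item in class_def.body:
        if isinstance(item, ast.FunctionDef) and item.name != "__init__":
            usage[item.name] = {
                node.attr
                for node in ast.walk(item)
                if isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id == "self"
            }
    return usage


def verdict(attrs, key_attrs):
    """Apply the three function checks to one method's attribute set."""
    if attrs & key_attrs:
        return "uses the prime-key"
    if attrs:
        return "uses a non-key attribute"
    return "uses no member variables, may belong elsewhere"


# Hypothetical class, held as text so it can be parsed.
SOURCE = '''
class Person:
    def __init__(self):
        self._ssn = None
        self._first_name = None

    def tax_id(self):
        return self._ssn

    def greeting(self):
        return "Hello " + str(self._first_name)

    def format_zip(self, raw):
        return raw.strip()[:5]
'''

for method, attrs in attributes_used_by_methods(SOURCE).items():
    print(f"{method}: {verdict(attrs, {'_ssn'})}")
# tax_id: uses the prime-key
# greeting: uses a non-key attribute
# format_zip: uses no member variables, may belong elsewhere
```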
Refine the steps
- write an initial definition of the class' purpose
- determine the variables of the prime-key
  - double-check: these variables are strongly associated with the class purpose
  - double-check: every instance of the class will have specific values that uniquely identify that instance and no others
- check all other member variables and determine if each variable is strongly associated with the prime-key
  - check each variable against all the other variables and the prime-key
  - track the "strength" of the association
  - if the variable's strongest association is with the prime-key, then it should be a member of this class
  - if the variable's strongest association is with another variable, then those two members are part of another class, and an instance of that class should be a member of this class
  - if the variable is weakly associated with the prime-key and weakly associated with the other member variables, then it can be a member of this class or not. The decision to split it off into a separate class involves:
    - long-term maintenance effort, i.e. what are the odds in the app's domain that the member variable will stay the same over the long term?
    - reuse benefits, i.e. is it possible to use this variable and its functions in another app?
    - isolation, i.e. do the functions that use this member variable ONLY operate on this variable? If so, it is likely that this variable and those functions form a separate class
- check each function and determine if that function is associated with the class purpose
  - check if each function uses the prime-key
  - if a function doesn't use the prime-key, then it should use an attribute member variable
  - if a function doesn't use the prime-key or an attribute member variable, then check other classes in the domain to see if that function has a stronger association with one of them
Applying the Steps
class Person:
    def __init__(self):
        self._first_name = None
        self._last_name = None
        self._ssn = None
        self._street = None
        self._city = None
        self._state = None
        self._zipcode = None
- The SSN is a unique id per person and is strongly associated with the first and last name, i.e. with the intent of this class. We can declare it to be the prime-key.
- The first and last names are strongly associated with each other.
- The street, city, state and zipcode are more strongly associated with each other than with the first and last names, but there is still an association between those two groups.
Applying these:
class Address:
    def __init__(self):
        self._street = None
        self._city = None
        self._state = None
        self._zipcode = None
        # TODO add country? this will make it reusable in other apps?

class Name:
    def __init__(self):
        self._first = None
        self._last = None
        # TODO add the ability to have one or more middle names, or none

class Person:
    def __init__(self):
        self._ssn = None
        self._name = Name()
        self._address = Address()
Can Person be simplified any further? At this point, no. In the future, as more information is added to it, additional attributes may be added to Address, Name or Person itself, and potentially additional attributes that are themselves classes.
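If the SSN really is the prime-key, then two Person objects with the same SSN describe the same person, whatever the other attributes say. One possible way to express that in code is to base equality and hashing on the key alone. This is a sketch using dataclasses with public field names, which is a different style from the classes above, and it assumes the SSN is never changed after the object is created.

```python
from dataclasses import dataclass, field


@dataclass
class Name:
    first: str = ""
    last: str = ""


@dataclass
class Address:
    street: str = ""
    city: str = ""
    state: str = ""
    zipcode: str = ""


@dataclass(eq=False)  # equality is written by hand, using only the prime-key
class Person:
    ssn: str
    name: Name = field(default_factory=Name)
    address: Address = field(default_factory=Address)

    def __eq__(self, other):
        if not isinstance(other, Person):
            return NotImplemented
        return self.ssn == other.ssn

    def __hash__(self):
        return hash(self.ssn)


# Made-up values, for illustration only.
a = Person("000-00-0001", Name("Ann", "Lee"), Address("1 Main St", "Springfield", "IL", "62701"))
b = Person("000-00-0001", Name("Ann", "Smith"))  # same person after a name change

print(a == b)       # True
print(len({a, b}))  # 1
```

Name and Address keep the default field-by-field comparison, since neither has a key of its own in this example.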
Another tactic
To push the idea of "strong associations" try this. Mentally remove an attribute from the class.
- does the class still make sense?
- can you update/trim the purpose of the class to make it make sense?
- is there a gap in the semantics of the class? Is that gap significant or just a nit?
If removing the attribute is, say, catastrophic to the semantics or purpose of the class, then there is clearly a strong association between that attribute and the class. It's also a strong indication the attribute may be the prime-key or a part of the prime-key.
If removing it is a minor pain or a gut-feel pause moment, then it is a weak association, i.e. still a necessary attribute but not a prime-key.
If removing it is a shrugging of the shoulders, a "who cares?" moment, then that attribute may belong somewhere else. It may pay to leave it in this class but if other classes have already been extracted, it may belong in one of them.
But Wait!
This intuitive, general notion of association and the strength of association is good. It will lead to clean, tight classes with a minimal but necessary set of attributes in a class.
But there is a potential for it to be misused. For example, because of lack of discipline or for the sake of time constraints, an "association" can be declared or a real association can be declared stronger or weaker than it actually is. In the same way that inheritance has been misused, association can be misused too.
Also, intuition can lead you astray. It is possible to misjudge the strength of an attribute's association and still feel confident that the judgment is right.
Extracting out secondary classes seems like an extra effort, but it has paid off (at least in my experience) by clarifying the architecture of the class. It clarifies the "micro-architecture" of a group of classes, and it simplifies how they relate and interact with each other. All of this leads to simpler code that is much easier to maintain and extend when necessary.
There are other kinds of classes as well. For example, the mainline of an app should be encapsulated in a class.
class App:
    def __init__(self):
        # top-level attributes
        pass

    def init(self):
        # initialization or setup code to ensure the environment the App runs in is correct
        # splitting off initialization from creation (i.e. __init__)
        # also prevents cyclic dependency issues
        pass

    def run(self):
        # the mainline
        pass

    def term(self):
        # cleanup or teardown code used to ensure the App leaves everything in a good state
        # whether the run was completed, ctrl-C, or aborted
        pass
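A possible driver for this lifecycle is sketched below. It is an assumption about how the three methods might be called, not part of the class above: the `main` function, the `log` attribute (used only to record the call order) and the exit status 130 for ctrl-C are all choices made for this illustration.

```python
class App:
    def __init__(self):
        self.log = []  # hypothetical attribute, only records the call order

    def init(self):
        self.log.append("init")

    def run(self):
        self.log.append("run")

    def term(self):
        self.log.append("term")


def main(app):
    """Run the app; term() is called whether run() completes, fails or is interrupted."""
    app.init()
    try:
        app.run()
        return 0
    except KeyboardInterrupt:
        return 130  # common shell convention for ctrl-C, assumed here
    finally:
        app.term()


app = App()
print(main(app))  # 0
print(app.log)    # ['init', 'run', 'term']
```

In this sketch a failure inside init() does not trigger term(), on the assumption that there is nothing to clean up yet. Moving the init() call inside the try block would change that.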
The steps do work in these kinds of classes, but the associations in the top-level attributes are less clear. Apps tend to have multiple overlapping behaviors and therefore the purpose is not as "single" as other kinds of classes.
For example, an MVC GUI has a data component ("Model"), the graphic components ("View") and an overall handler of responses from the User ("Controller"). But the Controller populates the Model and responds to the View's interactions with the User. And so, what is the purpose of the App class? To do a bit of everything.