Notes on Refactoring - John Arrizza

Summary

Some notes on refactoring.

Why refactor?

A useful guideline when talking about any proposed process is to keep your discussion grounded to the answer to "Why do this?" So, why refactor?

In a nutshell, we refactor to make "the internal structure of software easier to understand and cheaper to modify (Fowler)."

"Easier to understand" means things like:

make the code self-documenting
break it down into easily understood chunks
organize the various bits into cohesive semantic entities
view the code as a communication device to a future maintainer

“Cheaper to modify” means things like:

decoupling classes
simplifying class interfaces
simplifying methods
etc.

In the end, your code should say what it's doing, not how it's doing it.

Goal: choose meaningful, self-descriptive names

Choose clear, meaningful, self-descriptive names at all times.

meaningful names

There is a variable named ‘i’. What is ‘i’? What does it do? OK, it might iterate. Iterate what then?

Sometimes we can tell immediately that ‘i’ iterates over a list. While that might not be a problem in this case, in more real-world scenarios these questions routinely arise. The variable declaration and its first use could potentially be pages apart.

self-descriptive names

You always want to use self-descriptive names for everything. This means to "semantically load" the names.

Goal: ensure comments are self-explanatory

Comments need to answer most, hopefully all, questions that a reader has about the code, and about the comments themselves.

If the code is not self-documenting, then add comments.

If the comments are not self-documenting, then rewrite or add text to ensure they are stand-alone.

Ensure these:

correct spelling and grammar in comments
the comments are about what the block does rather than how it does it
the comments explain "why" for the code. These tend be more useful to readers.
Rule of thumb: if it takes longer than a few seconds to figure out what a piece of code is doing, and it cannot be explained simply with better names, use comments to clarify

Goal: functional simplicity

Write your code so simple it HAS to work. A red flag that the code needs refactoring, is the internal question "wait, how does it do that?".

The simplicity of a function can be informally tested by explaining it. If, at that higher level, you need to use complicated if-then-else descriptions, then it probably makes sense to break that function up into smaller, simpler ones.

Another red flag is not being able to easily answer "how can the function fail?"

The goal in this case, is to simplify the function and then ensure that simpler behavior handles all of the possible failure modes. The callers can then rely on that function to do what it claims to be doing. And that means that if a bug occurs, then it unlikely it is in that function.

In other words, functional simplicity reduces bugs.

Goal: focus on extract method

Use ExtractMethod as the main tool. This helps ensure each method is:

simpler
smaller
single purpose

Some characteristics to help identify which methods to refactor:

look for larger methods. More lines generally means more complexity.
many if/else statements
multiple for/while loops

Goal: focus extract class

A class has a single purpose. So to create a new class from the of methods available, look through the existing set of methods.

Look for a set of methods that operate on the same attribute. These are methods that have create, update, or delete behaviors all associated with that specific attribute.
These just don't simply use the attribute, they are behaviors that ensure the attribute's value is correct. They belong together as a single purpose class.

Look for a set of methods that operate on the same type of attribute. These are methods that belong together because the have a single purpose: to abstract the behaviors for that type. For example, a set methods that operate on a particular file. Or a set of methods that operate on a particular kind of file.

If you find similar methods, extract them into a class. This new class will become a lightning rod for moving other methods into it. Eventually the single purpose of that class becomes clearer as the methods are moved into it.

Note: that lead to an additional split of that new class into smaller, more atomic classes.

Note the extract method and extract class refactorings are symbiotic. Extract method breaks the code down. Then extract class organizes the new methods into semantic entities. And that in turns, helps clarify which methods belong to this class, and which methods need to be extracted into another new class.

Goal: classes are single purpose

Concentrate on making existing and new classes have singular purposes. This helps ensure that the code becomes static. Simple, non-changing code has fewer bugs and since it has fewer bugs, it's callers tend to have fewer bugs as well.

Acid test: a one sentence description of the class should not contain the words 'or' or 'and'.

If a class's description has conjunctions in it, then it probably should be split up.

When a class has become singular, check that its name is appropriate, rename if necessary. Rename the methods if necessary. Rename the member variables if necessary.

Goal: try to remove setters and getters

A class with many setters and getters is a red flag. Basically the class is being used as a namespace rather than behavioral entity.

To fix this, find the callers of the setter/getter and see what that class does with it. Can that behavior be moved into this current class? That encapsulates that variable into the current class and in turn provides outside callers the behavior they need.

Goal: make the code style/formatting consistent

Consistent spacing, indentation, parenthesization, brackets, spaces/tabs, comment style, naming, etc. makes reading much easier. Use a automated tool if possible.

Goal: other

Replace deeply nested if statements with methods
Replace nested if-then-else-if-then-else-etc logic with appropriately named methods. Typically there is one if-then-else per method, or at least make the "then" and "else" blocks their own methods.
For loops containing big blocks of code, make those blocks of code into methods.
Replace loop bodies with methods
Look for a few lines of code with a comment above them. Convert that comment into a function name and those lines of code into the function. The original author of that code mentally grouped those lines together, so do it for real.
Replace "instanceof" tests with polymorphism
Replace instanceof() and other explicit checking of types with polymorphisms.