We need to talk about refactoring. Or rather, about the frequent misuse of the word, the dangers that misuse brings, and what we can do about it. Time to reclaim what is an intrinsic part of software development!
As with many words in the English language, people use the word refactoring for different concepts. It’s a human thing to do, but it keeps surprising me how open to interpretation most terms are in a “scientific” field like computer science. Just try to Google an exact definition for the concept of a unit test: you’ll find about as many interpretations as there are people writing about it.
The term refactoring does have a clear definition, coined in the book on the topic. In fact there are two definitions, since you can use the term both as a noun and as a verb:
- Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behaviour.
- Refactor (verb): to restructure software by applying a series of refactorings without changing its observable behaviour.
There’s something left implicit in these definitions which you will definitely pick up on if you read the book, but which might be overlooked otherwise. A single refactoring encompasses a really small change to the software. Extracting a variable. Renaming a method. Some “composite” refactorings have a larger effect on the design of a system, but they are always composed of a series of really small steps where in between you can verify you haven’t broken anything. This is why I like the definition on the official site refactoring.com a bit better:
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behaviour. Its heart is a series of small behaviour preserving transformations. Each transformation (called a “refactoring”) does little, but a sequence of transformations can produce significant design changes in your code. Since each refactoring is small, it’s less likely to go wrong. The system is kept fully working after each small refactoring, reducing the chances that a system can get seriously broken during the restructuring.
What refactoring is NOT
Having defined what we mean by the term, we can also point out some things that aren’t technically refactorings according to this definition:
- Changing code without continually verifying you haven’t changed the observable behaviour. Ideally you have a suite of self-checking unit tests at your disposal. If not, you can manually check the system every couple of minutes. If you cannot compile your code and verify you haven’t changed any observable behaviour every couple of minutes, you are not refactoring.
- Rewriting a component from scratch. Sometimes this might be more cost-effective than trying to make sense of some especially gnarly legacy code. But if your new code isn’t hooked into the larger system from the start or you cannot verify whether you broke some existing functionality every five minutes or so, this does not qualify as refactoring.
- Rewriting an entire system from scratch. I regularly hear teams talk about “our ongoing refactoring” when actually they mean “that big rewrite we are sinking all our time and money into”. As I stated in a previous post, a big rewrite is often a costly mistake. So we should just be honest about it and call it what it is. Expectation management in action.
No big deal, right?
I’ve heard the term refactoring used for all of the above situations. No harm done as long as everyone around the table is talking about the same thing, right? That’s where I beg to differ. By associating these (often negatively connoted and costly) situations with the word refactoring we are actively harming ourselves as software developers. The term should not scare your project manager. It should not be open for discussion during scope meetings (“Do we really have to refactor this sprint?”).
Refactoring is as much a part of software development as the act of writing new code. Consequently, whether or not a team should refactor should not be up for debate. Sure, the desired level of code quality is in the end an economic decision and can be discussed with stakeholders. Are we prototyping a new idea in an unknown market? Or are we building mission critical software that will support the business for years to come? But that’s not what I’m talking about here.
Refactoring should be part of how you write software as an individual developer and as a team. Refactoring should be about safety, achieving a state of flow, receiving constant feedback on your work, constantly being in control. It’s a continuous stream of micro-improvements we make to the code.
Let me make that last point clear with a couple of examples: Say you’re working through some part of the codebase and a chunk of code isn’t immediately clear to you. Perform an extract method refactoring to give it an intent-revealing name. Can’t find a specific piece of code where you expect it? Perform the move method refactoring so you’ll be able to find it more easily in the future. Can’t easily fit that new feature in the current design? Take small steps refactoring the current design while continually verifying you haven’t changed the current behaviour, making it open for the new requirement in the process.
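To make the first of those examples concrete, here is a minimal sketch of an extract method step in Python. The function names, the amounts in cents and the discount rule are all invented for illustration; the only point is that the unclear chunk gets an intent-revealing name while the observable result stays the same.

```python
# Hypothetical example: every name and the discount rule are assumptions.

def invoice_total_before(order_lines):
    """Original version: the discount logic is an anonymous chunk."""
    total = 0
    for price_cents, quantity in order_lines:
        total += price_cents * quantity
    if total > 10_000:
        total -= total // 10
    return total


def apply_bulk_discount(total):
    """Extracted method: the chunk now has an intent-revealing name."""
    if total > 10_000:
        total -= total // 10
    return total


def invoice_total_after(order_lines):
    """Same behaviour as before, with the chunk replaced by a call."""
    total = 0
    for price_cents, quantity in order_lines:
        total += price_cents * quantity
    total = apply_bulk_discount(total)
    return total


# Verify right after the step that nothing observable changed.
for order in ([], [(1_000, 2)], [(6_000, 2)], [(10_000, 1)]):
    assert invoice_total_before(order) == invoice_total_after(order)
```

Nothing else was touched in this step. Tidying up the loop would be a separate refactoring, with its own verification afterwards.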
That’s refactoring. That’s something that should not be open for debate with managers. That’s just how we keep a codebase (and ourselves) sane as professional software developers.
How did we end up in this situation? It’s our own fault for abusing the term. We made people conflate refactoring with unnecessary gold-plating and/or costly rewrites.
Training your refactoring muscle
As with almost anything, it takes time and practice to really master a micro-skill like refactoring. I’m a fan of deliberate practice to learn technical skills like these. At first, it might seem weird to try out techniques on little toy problems. My personal experience is that training on these little problems better prepares you for when you face the exact same thing in the real world.
My favourite exercise to practice refactoring is the Gilded Rose kata. You can find a full description in the link, but in short this kata consists of two parts:
- Build a safety net of automated tests around existing code so we can safely refactor it
- Refactor the hairy code in order to implement a new feature more easily
The first part is an interesting exercise on its own, but it is really the second step where you can flex your refactoring muscles.
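One possible way to build such a safety net is a characterization (or “golden master”) test: record what the existing code does for a grid of inputs, then require every refactored version to reproduce that record. The sketch below uses a deliberately tangled stand-in function, not the actual kata source; `legacy_update`, `refactored_update` and the chosen input grid are assumptions made for illustration.

```python
import itertools


def legacy_update(name, sell_in, quality):
    # Tangled stand-in for legacy code (hypothetical, NOT the real kata).
    if name != "Aged Brie":
        if quality > 0:
            quality = quality - 1
    else:
        if quality < 50:
            quality = quality + 1
    sell_in = sell_in - 1
    if sell_in < 0:
        if name != "Aged Brie":
            if quality > 0:
                quality = quality - 1
        else:
            if quality < 50:
                quality = quality + 1
    return sell_in, quality


def snapshot(update):
    """Record the output of `update` for every combination in the grid."""
    names = ["Aged Brie", "Elixir"]
    sell_ins = [-1, 0, 1, 10]
    qualities = [0, 1, 49, 50]
    grid = itertools.product(names, sell_ins, qualities)
    return {case: update(*case) for case in grid}


# Captured once, before touching the code.
GOLDEN_MASTER = snapshot(legacy_update)


def _age_once(name, quality):
    if name == "Aged Brie":
        return quality + 1 if quality < 50 else quality
    return quality - 1 if quality > 0 else quality


def refactored_update(name, sell_in, quality):
    quality = _age_once(name, quality)
    sell_in -= 1
    if sell_in < 0:
        quality = _age_once(name, quality)
    return sell_in, quality


# Re-run after every small step; a mismatch means behaviour changed.
assert snapshot(refactored_update) == GOLDEN_MASTER
```

A net like this only protects the inputs in the grid, so widen the grid wherever the legacy code has boundaries you don’t fully understand yet.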
If you like some extra constraints to further increase the challenge:
- Try changing the code by only using your IDE’s automated refactoring support. No manual edits! You’d be amazed by how easily you can make design changes in the large by performing a series of automated micro refactorings.
- Try refactoring by hand, following the recipes in Martin Fowler’s book.
- Try refactoring by hand, but try not to change more than a single line at a time, all the while keeping all your tests green!
That last constraint might sound like overkill for such a toy problem. You can do this in 5 minutes, with your eyes closed and one hand behind your back, right? Now imagine we’re back in the real world. You have to change the public API of a monster of a legacy class. Furthermore, this class is being called at hundreds of different call sites. In scenarios like that, being able to work in small, safe, incremental steps while continually receiving feedback you haven’t broken anything makes the difference between steady progress and hours of code that does not even compile. It’s a real life saver for both your productivity and mental health. If you want a hint on how to tackle the last constraint effectively, take a look at the parallel change pattern. If you want full spoilers, I wholeheartedly recommend Sandi Metz and Katrina Owen’s latest book, 99 bottles of OOP.
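As a rough sketch of the parallel change idea (expand, migrate, contract), suppose a hypothetical `Mailer` class whose `send(recipient, body)` method should become `send_message(message)`. All names here are invented for illustration. Instead of changing the signature and breaking every call site at once, the new method is added next to the old one:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    recipient: str
    body: str


class Mailer:
    """Hypothetical legacy class in the middle of a parallel change."""

    def __init__(self):
        self.outbox = []

    # Expand: the new API is added alongside the old one.
    def send_message(self, message: Message) -> None:
        self.outbox.append((message.recipient, message.body))

    # The old API stays and delegates, so both paths behave identically.
    def send(self, recipient: str, body: str) -> None:
        self.send_message(Message(recipient, body))


mailer = Mailer()

# Migrate: call sites move over one at a time; the code keeps
# compiling and the tests stay green after each one.
mailer.send("ann@example.com", "hello")                      # old call site
mailer.send_message(Message("ann@example.com", "hello"))     # migrated one

assert mailer.outbox[0] == mailer.outbox[1]

# Contract: once no caller uses `send` any more, delete it.
```

The cost is a short period in which two APIs coexist. In return, each of those hundreds of call sites becomes its own tiny, verifiable step.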
In this post I wanted to address two points:
First, refactoring leaves a bad taste in many a project manager’s mouth and it’s our own fault. We’ve been abusing the term. So let’s stop abusing the term refactoring when we actually mean rehacktoring or worse yet, big rewrite.
Next, if you haven’t experienced first-hand the power, sense of calm and control you feel when taking small refactoring steps, set aside an hour of your practice time and try it out yourself. Try refactoring some code by only changing one line at a time. Try taking smaller steps than you’re used to right now. You’ll be surprised how it might change the way you think about code!
Do you need some help really grokking this low-level yet essential skill? Maybe you want to organize a workshop with your team at work to practice together? Contact me!
- If you want a real-life example of a bigger refactoring session that results in significant changes to the design of a system, take a look at Martin Fowler’s blog posts. This one on refactoring code that accesses an external service is an awesome example.
- If you haven’t already, read the book on the topic by the same author: Refactoring: improving the design of existing code. It’s almost 20 years old by now and most basic refactorings are automated by your IDE so the manual recipes are not that useful anymore, but don’t let that stop you from reading this gem. It’s surprisingly funny for a technical book and contains a lot of examples and insights into the design of software.
- One of the most accessible and fun-to-read books on the topic is Sandi Metz and Katrina Owen’s 99 bottles of OOP. It not only discusses how you can safely refactor your way to a better design using really small steps, it’s also one of the best books on object oriented programming. Period.