Monday morning, 6 AM. You vault out of bed and into the shower, pick out your nicest shirt and leave for work. Finally, a new project. A breath of fresh air. The excitement of starting on a new team. The joys of digging into an unfamiliar codebase!
In this post I provide some techniques I use to get a grasp on a new codebase fast. It’s always fun to start a new green-field project, but let’s be honest for a moment and acknowledge the fact that most projects you will work on as a software developer will start from an existing codebase. These tips and tricks will help you hit the ground running, even when you land in a muddy brown-field mess.
One of the perks of working as a consultant is that you get to see a lot of different codebases early on in your career. This also means that you constantly need to familiarize yourself with new technology stacks and adapt to different ways of doing things. I’ll leave the soft skills for another post; for now I would like to focus on how you can get a grip on an existing codebase and become productive in it as fast as possible.
Pair with your team members
This is the number one tip. The big one. In an ideal world, a senior team member takes you under her wing and gently shows you the ropes by pair programming with you for a while. This is an awesome way to get started and gets you up to speed on how the codebase is structured and how things are done around here. In reality, however, the project is usually way past its deadline and seriously over budget, and your new manager has never heard of Brooks’ law. The following tips will help you get started either way.
Get to know your problem domain
Are you starting on a financial trading platform? Watch the Wolf of Wall Street. Hired to fix bugs in a tax calculation engine? Read a book on the topic. My point: you cannot effectively solve the problems a business faces without actual knowledge of how the business works. It’s also a good idea to make your code reflect the problem domain as closely as possible, which is taken to its extreme in practices like Domain-Driven Design.
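To give an idea of what code that reflects the problem domain can look like, here is a minimal sketch for a made-up tax engine. The type names (`Money`, `TaxRate`), the 21% rate and the rounding rule are all assumptions for illustration; a real engine would take them from the business rules and the domain expert.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical domain types; the names mirror the words the business uses.
record Money(BigDecimal amount) {
    Money {
        if (amount.signum() < 0) {
            throw new IllegalArgumentException("amount must not be negative");
        }
    }

    static Money of(String amount) {
        return new Money(new BigDecimal(amount));
    }
}

record TaxRate(BigDecimal fraction) {
    // "21" becomes the fraction 0.21.
    static TaxRate ofPercent(String percent) {
        return new TaxRate(new BigDecimal(percent).movePointLeft(2));
    }

    // Rounding half-up to two decimals is an assumption, not a tax rule.
    Money appliedTo(Money taxableIncome) {
        BigDecimal tax = taxableIncome.amount().multiply(fraction);
        return new Money(tax.setScale(2, RoundingMode.HALF_UP));
    }
}

class DomainVocabularyExample {
    public static void main(String[] args) {
        Money taxableIncome = Money.of("1000.00");
        TaxRate rate = TaxRate.ofPercent("21");
        System.out.println(rate.appliedTo(taxableIncome).amount()); // prints 210.00
    }
}
```

Compared to passing bare numbers around, a line like `rate.appliedTo(taxableIncome)` reads much like the sentence a domain expert would say out loud, which makes it easier to check the code against what the business actually does.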
Make implicit knowledge explicit
While we’re still on the topic of getting familiar with business terms: all projects have tons of implicit tribal knowledge. Few projects I’ve worked on had an explicit glossary and/or domain model available that explained the most important business terms and their relationships. For example, when I first started working in the trading sector I had no idea what a short sell was. As I started asking around I noticed that some of the developers didn’t know either. Google around; perhaps there’s a thesaurus available for the sector you are working in. Better yet, find a domain expert on your project and keep asking those questions.
As the new guy on the team, you are in the perfect position to point out areas of knowledge that are implicit and can start making them more explicit.
You’re the new guy. People won’t bat an eye if you spend half an hour a day working on a glossary or a domain model or jotting down the intricate details of how to set up a new development machine. Share these documents with the entire team. Ask for feedback and corrections. Keep them alive and up-to-date. All of your team members, present and future, will thank you for it. If you or your team are not really into writing prose, spend a few hours mapping user stories and keep the map on a wall in the room to maintain a shared understanding of what you are actually building.
Start using the software
Working on a todo-list app? Start using it every day. If the software is not something you’d use in your day-to-day life, at least go through some scenarios (whether you find them in automated specs or handwritten test scenarios does not matter) to familiarize yourself with the application you will be working on. I’ve worked in teams where members of the development team had no clue how to achieve certain tasks through the User Interface of the application they had been working on for years. That’s not the best place to be in, least of all from a User Experience perspective.
As a fresh-from-college developer I spent my first few months performing manual regression testing. As much as I hated it then (and still do if it’s used as your main way of detecting regression in your codebase), it was one of the best ways to get up to speed with how the application was typically used. So do yourself a favor and start playing around with the app you’re working on in the first couple of days.
Explain the software architecture in 5 concepts or fewer
Have a team member draw the entire software architecture of the system you will be working on in one diagram. Ask them to use at most 5 elements to describe the system. This forces them to really boil it down to the high-level architecture and not wander off into a forest of details. Don’t have them bother with messaging protocols, interface declarations or anything else that would clutter up the diagram; the goal is to have an end-to-end overview of the system on a single page. This is just what you need to find your bearings when you’re exploring the codebase the first couple of weeks. For example, the diagram below illustrates the high-level architecture of the system I’m currently working on and helped me immensely when I was trying to find my way around the huge codebase.
Many modern software systems are written in object-oriented languages. This supposedly makes them easier to maintain than the procedural beasts that preceded them. One small drawback of OO systems might be that the massive amount of loosely coupled, tiny objects that comprise an unknown codebase can be daunting at first. Dependencies are injected, events are observed, interfaces are everywhere. I was recently surprised to find that many developers try to grok an unfamiliar codebase from a static viewpoint, i.e. by reading the code and just plain thinking really hard about it. If you feel like your head is going to explode, there is an alternative: just run the program and sprinkle the code with breakpoints. It’s easier to initially understand the runtime composition of an OO system when you’re actually running it and stepping through the execution flow.
Note that I’m not advocating that we stop using OO altogether (although the use of functional programming languages seems to be on the rise). I’m a big fan of loosely coupled, real OO programs instead of the procedural-code-written-in-an-OO-language-so-we-will-call-it-OO we encounter out in the wild all too often.
I’m also not a big fan of debugger-driven development in general. But in the context of trying to understand the basic flow of an unknown piece of OO code, stepping through it in the debugger might be a faster way to get started than statically analyzing it. If the codebase happens to have a test suite that documents how components should be used, that’s even better of course.
Dipping your toes safely
So it’s time to make your first commit to the code. Again, if your team members have time to spare to help you, ask them to pair with you for your first couple of tasks (or simply don’t pick up any tasks yourself and just tag along with someone on the team for a while). If that’s not the case, or if you are in the unfortunate situation of having inherited a codebase solely by yourself, it’s important you make changes safely. Don’t take on a large-scale feature at first. Pick a small task like a bugfix or a trivial change and prepare to start coding. Finally!
Making changes safely can be done in different stages. For example, it’s always a good idea to check a piece of code’s history in version control so you can trace back what people were thinking when they created and/or changed it. The rate of change can also be a good indicator as to how bug-prone the code has proven to be in the past.
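One rough way to get at that rate of change is to count how often each file shows up in the commit history. The sketch below assumes you have saved the output of `git log --name-only --pretty=format:` (one path per line, blank lines between commits) and read it into a list of lines; the class name `ChurnCounter` and the sample paths are made up for illustration.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ChurnCounter {
    // Counts how many times each path appears in the log output.
    // Blank lines (the separators between commits) are ignored.
    static Map<String, Long> changesPerFile(List<String> gitLogLines) {
        Map<String, Long> counts = gitLogLines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.groupingBy(line -> line, Collectors.counting()));

        // Most frequently changed files first; LinkedHashMap keeps that order.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (first, second) -> first,
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Hypothetical log output covering three commits.
        List<String> sample = List.of(
                "src/Invoice.java", "src/Pager.java", "",
                "src/Invoice.java", "",
                "src/Invoice.java", "README.md");
        changesPerFile(sample).forEach((path, count) -> System.out.println(count + " " + path));
    }
}
```

A file at the top of this list is not necessarily buggy, but it is a good place to slow down, read the commit messages and ask a team member why it changes so often.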
If you don’t immediately understand the code, perhaps performing some scratch refactoring will help. When performing scratch refactorings the point isn’t to clean up the code permanently but rather to make it more understandable for you. In the end, you can just throw away your changes if you’re not happy with them.
When you have localized the piece of code that warrants a change, cover it with characterization tests to make sure you don’t break any existing behaviour. This is generally hard to do for code that wasn’t written with tests in mind as I stated in a previous blog post, but it pays off big time. While you’re at it, cover your new code with tests.
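A characterization test pins down what the code does today, not what it should do. The sketch below uses a made-up `LegacyPager` as a stand-in for the code you are about to change, and plain checks instead of a test framework to stay self-contained; in a real project you would write this with whatever test framework the team already uses.

```java
// Hypothetical legacy code; stands in for whatever you are about to change.
class LegacyPager {
    static int pageCount(int totalItems, int pageSize) {
        return totalItems / pageSize + 1;
    }
}

class LegacyPagerCharacterization {
    static void check(int expected, int actual, String scenario) {
        if (expected != actual) {
            throw new AssertionError(scenario + ": expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // The expected values were recorded by running the existing code,
        // not taken from a specification.
        check(1, LegacyPager.pageCount(0, 10), "no items");
        check(3, LegacyPager.pageCount(25, 10), "partial last page");
        // Looks like an off-by-one, but it is pinned as-is: callers may rely on it.
        check(2, LegacyPager.pageCount(10, 10), "exactly one full page");
        System.out.println("current behaviour pinned");
    }
}
```

Note that the suspicious "exactly one full page" case is recorded rather than fixed. Once the current behaviour is pinned, changing it becomes a deliberate decision you can discuss with the team instead of an accident.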
After you have implemented your changes, drag a team member in front of your screen and ask for an honest-to-God, brutal code review. Developers tend to be more than happy to give feedback, but I have found that you often have to ask for it first.
As a new member on the team, you necessarily have a limited view of the system. This will often lead you in the direction of suboptimal design decisions, something I like to call band-aid-driven-development. What I mean by that is that typically, a developer who is unfamiliar with a codebase and its big picture is tempted to make local corrections to fight the symptoms of a problem rather than tackle the actual cause. Let me explain with an example:
When you’re looking for the location of a bug, chances are you’re using the debugger of your IDE. If you’re hunting a NullPointerException for example, your IDE will point you to the exact line where the exception is thrown. Now you have two choices to proceed: either add behaviour that deals with the null in that particular place or investigate why that pesky null suddenly shows up in this context. Perhaps the bug lies further up the call stack and a Null Object might be an elegant solution to this bug and quite possibly a boatload of others.
There’s a great quote by Richard Pattis to illustrate this point:
When debugging, novices insert corrective code; experts remove defective code.
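To make the NullPointerException example concrete, here is a minimal sketch of a Null Object. The discount domain and every name in it (`Discount`, `NoDiscount`, `DiscountCatalog`) are made up for illustration. The point is that the fix lives where the null originated, so none of the callers need a null check.

```java
import java.util.Map;

interface Discount {
    long applyTo(long priceInCents);
}

// Null Object: a real Discount that does nothing, handed out instead of null.
final class NoDiscount implements Discount {
    public long applyTo(long priceInCents) {
        return priceInCents;
    }
}

final class PercentageDiscount implements Discount {
    private final int percent;

    PercentageDiscount(int percent) {
        this.percent = percent;
    }

    public long applyTo(long priceInCents) {
        return priceInCents * (100 - percent) / 100;
    }
}

class DiscountCatalog {
    private static final Discount NONE = new NoDiscount();
    private final Map<String, Discount> byCustomer;

    DiscountCatalog(Map<String, Discount> byCustomer) {
        this.byCustomer = byCustomer;
    }

    // Band-aid version: return byCustomer.get(customerId) and let every
    // caller add its own null check. Here the lookup never returns null.
    Discount forCustomer(String customerId) {
        return byCustomer.getOrDefault(customerId, NONE);
    }
}

class NullObjectExample {
    public static void main(String[] args) {
        DiscountCatalog catalog =
                new DiscountCatalog(Map.of("alice", new PercentageDiscount(10)));
        System.out.println(catalog.forCustomer("alice").applyTo(20000)); // prints 18000
        System.out.println(catalog.forCustomer("bob").applyTo(20000));   // prints 20000
    }
}
```

Whether a Null Object is the right fix depends on the cause you find further up the call stack. Sometimes the missing value is a genuine error that should fail loudly instead.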
Whenever you are unsure of a design decision, ask a senior team member how he would solve it. This will help you get to know the architectural guidelines and the team’s way of doing things faster.
Don’t go against the grain
Slowly you start getting familiar with the codebase. You pick one of those easy-to-fix bugs from the backlog. You mutter to yourself: “Simple pagination problem. Piece of cake, I’ve fixed one of those just yesterday in a different area of the system.” You start homing in on the problem area but then BOOM. This part of the codebase looks nothing like the parts you’re already familiar with.
This is probably a symptom of some past developer’s “improvement” that did not quite make it through to the entire codebase. There’s a cool name for these partially applied improvements: lava-layer architectures. We’ve all seen them. We all hate them.
The solution? Prefer architectural consistency over your personal preferences. Apply the principle of least astonishment, even on a large scale. If you have a choice between some radically different approach and one that’s already being used in the codebase, go for the known solution.
This doesn’t mean you cannot propose any improvements at all, mind you. Make improvements you want to introduce to the code or the process a conscious decision of the entire team rather than just trying to leave your mark on the codebase in secret, all by yourself. If it’s something the whole team should be involved in, consider setting up an agreed-upon and time-boxed experiment. “I propose we’ll try to unit test all new code and bugfixes for three months. Afterward we’ll look at the defect reduction rate and evaluate if we want to keep doing this in the future”.
Be a nag
Last, but not least: constantly ask questions about everything you don’t quite understand. It could be some exotic language construct like lambda functions, a design pattern you have never seen before or some domain knowledge that does not make sense. Question anything that surprises you and doesn’t make immediate sense from your perspective. Often, you’ll have hit on something you can learn from. If the team members find some part of the code hard to explain, that might be a smell and potential room for improvement. As an added bonus: people tend to let down their walls much faster if you are genuinely interested in their work and interests. Asking questions is the easiest way to show interest and learn something new in the process. Developers love to explain things; you just have to ask.
Also: don’t stop asking questions just because you’re the “senior” on the team. We’re all constantly learning and sharing knowledge. That’s what makes our jobs so much more satisfying and meaningful!
In this post I shared some of the techniques I use to get familiar with an unknown codebase fast. It all comes down to asking a lot of questions, working in safe, predictable, small steps and soaking up as much knowledge as possible from your team members. Share and discuss your new insights with the team as you learn. As you mature in a team, become a mentor and coach the new kids that arrive on your project. Who said developers weren’t team players?
Whenever you encounter a new piece of code, how do you get familiar with it? Sound off in the comments below!
Featured image by tableatny