Working with legacy code is a challenge that every team faces.
November 13, 2018

8 Tips For Working With Legacy Code

Dealing with legacy code can be a chore. Some developers even call it “legacy code hell”.

But, unless you’re starting a project from scratch, legacy code is inevitable. And that means you need a better way to work with it.

What Is Legacy Code?

The definition of legacy code depends on who you ask. Some say code is legacy code as soon as it’s written. Others assume legacy code is old. Or that it’s spaghetti code.

The Classic Definition of Legacy Code

Legacy code refers to any code maintained by someone who didn’t write it.

New Definitions of Legacy Code

Today, legacy code is code that you don’t understand and that’s difficult to change.

This can refer to source code inherited from someone else. Or it can refer to source code inherited from an earlier version of the software. Or it can refer to code without unit tests.

Working With Legacy Code Is a Challenge

There are challenges in working with legacy code. And the biggest challenge might be your assumptions about it.

You might think the code is bad. Whoever wrote it didn’t know what they were doing. You could have done a better job.

But the truth is, there usually is a reason why legacy code is how it is. And if you didn’t write it, you might not know that reason.

That’s why care needs to be taken when improving legacy code. You can’t just put a quick fix on one area. There might be some dependencies you’re unaware of.

And that’s why it’s important to know when to maintain legacy code and when to change it.

How to Improve Legacy Code

You can’t improve legacy code overnight. But you can take gradual steps to improve it.

Whether you’re just getting started with a legacy codebase — or you’ve been working on one for a while — here are eight tips for improving it.

1. Test the Code

One way to understand the code is to create characterization tests and unit tests. You can also run a static analyzer over your code to identify potential problems.

This will help you understand what the legacy code actually does. And it will reveal any potentially problematic areas. Once you understand legacy code, you can make changes with greater confidence.
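As a minimal sketch of a characterization test, consider the Python example below. The function `legacy_shipping_cost` is a hypothetical stand-in for real legacy code, and the expected values are assumed to have been observed by running it, not taken from a specification. The point is to pin down what the code does today, quirks included, before changing anything.

```python
def legacy_shipping_cost(weight_kg, express):
    # Hypothetical stand-in for an inherited function nobody fully understands.
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost = cost - 3  # Undocumented discount; the reason is unknown.
    return cost


# Characterization tests: each expected value records current behavior,
# whether or not that behavior is "right".
def test_standard_parcel():
    assert legacy_shipping_cost(1, False) == 7


def test_express_parcel():
    assert legacy_shipping_cost(1, True) == 10.5


def test_heavy_parcel_gets_undocumented_discount():
    assert legacy_shipping_cost(25, False) == 52
```

These functions can be collected by a test runner such as pytest or called directly. If a later change makes one of them fail, that is a signal that behavior changed, and you then decide whether the change was intended.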

2. Review Documentation

Reviewing documentation of the original requirements will help you understand where the legacy code came from.

Having that documentation handy will help you improve the code — without compromising the system. Without this information, you could accidentally make changes that introduce undesirable behavior.

3. Only Rewrite Code When It’s Necessary

Rewriting legacy code is tempting. But it’s usually a mistake.

It takes too much time and too many programmers to rewrite everything. And even if you do it, rewriting code can introduce new bugs. Or it can remove hidden functionality.

4. Refactor It Instead

It’s better to refactor your legacy code than rewrite it. And it’s best to do it gradually.

Refactoring is the process of changing the structure of the code — without changing its functionality.

This cleans up the code and makes it easier to understand. It can also expose and remove potential sources of errors.

To refactor (without risking mistakes), it’s best to:

  • Refactor code that has unit tests — so you know what you have.
  • Start with the deepest point of your code (the functions with the fewest dependencies), since it will be easiest to refactor.
  • Test after refactoring — to make sure you didn’t break anything.
  • Have a safety net (e.g., version control with continuous integration) so you can catch breakage early and revert to a previous build.
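The steps above can be sketched as follows. This is an illustrative Python example, and every name in it (`total_price_legacy`, `line_total`, `total_price`, `check_same_behavior`) is hypothetical. The refactored version changes only the structure, and the same check runs before and after to confirm that the functionality is unchanged.

```python
def total_price_legacy(items):
    # Hypothetical legacy version: one loop mixing pricing, discount, and summing.
    t = 0
    for i in items:
        p = i["price"] * i["qty"]
        if i["qty"] >= 10:
            p = p * 0.9
        t = t + p
    return round(t, 2)


BULK_THRESHOLD = 10
BULK_DISCOUNT = 0.9


def line_total(item):
    """Price for one line item, with the bulk discount applied if it qualifies."""
    subtotal = item["price"] * item["qty"]
    if item["qty"] >= BULK_THRESHOLD:
        subtotal = subtotal * BULK_DISCOUNT
    return subtotal


def total_price(items):
    """Refactored version: same arithmetic, split into named pieces."""
    total = 0
    for item in items:
        total = total + line_total(item)
    return round(total, 2)


def check_same_behavior():
    # Compare old and new implementations on a few representative inputs.
    cases = [
        [],
        [{"price": 2.5, "qty": 4}],
        [{"price": 1.0, "qty": 10}, {"price": 3.0, "qty": 1}],
    ]
    for items in cases:
        assert total_price(items) == total_price_legacy(items)
```

Keeping the old function around while the new one is verified is one possible design choice. Once the check passes and callers have moved over, the legacy version can be deleted in its own small change.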

5. Make Changes in Different Review Cycles

Don’t make too many changes at once. It’s a bad idea to refactor legacy code in the same review cycle as functional changes.

Plus, keeping them separate makes code reviews easier. Isolated changes are much more obvious to the reviewer than a sea of changes.

6. Collaborate With Other Developers

You may not know the legacy codebase very well. But some of your fellow developers probably do. It’s much faster to ask questions of those who know the codebase best.

So, if it’s possible, collaborate with someone who knows the legacy code better than you do. A second set of eyes on the code may help you understand it better.

7. Keep New Code Clean

There’s a way to avoid making legacy code more problematic. And that’s by ensuring new code is clean. It ought to be written to adhere to best practices.

You can’t control the quality of legacy code. But you can make sure that the code you add is clean.

8. Do Further Research

Working with legacy code gets easier with time. A junior developer may not understand why a codebase hasn’t been refactored (and may be keen to refactor it). But a senior developer will know when to leave it alone.

Learning more about legacy code will help you improve it.

A good starting point is “Working Effectively With Legacy Code” by Michael C. Feathers. In this book, you’ll get some good examples of how to make changes to legacy code.

Another good source is “Refactoring: Improving the Design of Existing Code” by Martin Fowler. This book offers many tips for effectively refactoring code.

Tools for Working With Legacy Code

You’ll always need to work with legacy code — or work around it. After all, legacy code is there for a reason. It works. And its results may be good enough that you can let known issues go.

There are good reasons for changing legacy code, too. You might be adding a feature, fixing a bug, or improving design.

In a perfect world, you’d continually rewrite that legacy code until it’s fully debugged. But chances are, that won’t be practical.

So, what you need to do is figure out what you can change — and leave the rest alone.

Static Code Analysis and Legacy Code

One way to do this is by using a static code analysis tool. You can set a baseline on the legacy code, then run analysis on the new code to make sure it’s clean. And you can suppress results from your legacy code.

Helix QAC, for example, makes this very easy to do.

Analyze Legacy Code

Helix QAC can check your legacy code against rules, typically from a coding standard. You’ll get diagnostics of violations. And you can prioritize them by severity. This means you can focus your attention on fixing the most error-riddled pieces of legacy code first.

Set Baselines

You can also set your legacy code as a baseline. Maybe your legacy code is fine as-is, and you want to leave it alone. Setting a baseline means that legacy code won’t be pulled into your diagnostics. Instead, you can focus on finding issues in new code — and ensuring that’s clean.
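The idea behind a baseline can be sketched in a few lines of Python. This is a generic illustration, not Helix QAC's actual mechanism or file format. The `Diagnostic` type, the `new_diagnostics` function, and the rule names are all hypothetical. Findings recorded when the baseline was taken are filtered out of later runs, so only issues introduced afterwards are reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    # Hypothetical shape of one analyzer finding.
    file: str
    rule: str
    snippet: str  # Offending source text; assumed more stable than a line number.


def new_diagnostics(current, baseline):
    """Return findings in the current run that were not in the baseline."""
    known = set(baseline)
    return [d for d in current if d not in known]


# Findings recorded when the baseline was set (legacy code).
baseline = [
    Diagnostic("legacy/io.c", "RULE-A", "x = y = 0;"),
]
# A later run: the legacy finding is still there, plus one in new code.
current = baseline + [
    Diagnostic("src/new_feature.c", "RULE-B", "int n;"),
]

for d in new_diagnostics(current, baseline):
    print(f"{d.file}: {d.rule}: {d.snippet}")
```

Here only the finding in `src/new_feature.c` is printed. Matching on the source snippet instead of the line number is an assumption of this sketch, chosen so that unrelated edits that shift lines do not make old findings look new.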

[Figure: An example of a baseline for legacy code in Helix QAC.]
Helix QAC makes it easy to set baselines.

Use Suppressions

You can also use suppressions to create exceptions for your legacy code. So, you can essentially dismiss violations in legacy code. You might set your suppressions on specific rules or violations within a particular category.
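A suppression can be thought of as a filter rule applied to the analyzer's findings. The Python sketch below is a generic illustration under stated assumptions, not Helix QAC's configuration syntax or API. The `Suppression` type, `apply_suppressions`, the paths, and the rule names are hypothetical. Each suppression here matches a path prefix and, optionally, one specific rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Suppression:
    # Hypothetical suppression entry: a path prefix plus an optional rule.
    path_prefix: str
    rule: Optional[str] = None  # None suppresses every rule under the prefix.

    def matches(self, file, rule):
        if not file.startswith(self.path_prefix):
            return False
        return self.rule is None or self.rule == rule


def apply_suppressions(findings, suppressions):
    """Keep only (file, rule) findings that no suppression matches."""
    return [
        (file, rule)
        for file, rule in findings
        if not any(s.matches(file, rule) for s in suppressions)
    ]


suppressions = [
    Suppression("legacy/core/", "RULE-A"),  # One rule, in one legacy area.
    Suppression("legacy/generated/"),       # Every rule in generated legacy code.
]
findings = [
    ("legacy/core/io.c", "RULE-A"),
    ("legacy/core/io.c", "RULE-B"),
    ("legacy/generated/tables.c", "RULE-B"),
    ("src/new_feature.c", "RULE-A"),
]
remaining = apply_suppressions(findings, suppressions)
```

After filtering, `remaining` holds the `RULE-B` finding in `legacy/core/io.c` and the `RULE-A` finding in `src/new_feature.c`. Scoping each suppression narrowly keeps it from hiding the same rule in new code.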

[Figure: An example of a baseline with suppressions for legacy code in Helix QAC.]
Helix QAC lets you suppress diagnostics on legacy code.

Ensure Compliance

In some cases, you may be reusing legacy code from one project to another. But some legacy codebases weren’t developed with coding standards. And if you need to achieve compliance (e.g., with MISRA), this can create problems. By checking the code against MISRA rules in Helix QAC, it’s easy to see where your legacy code violates the standard.

Improve Legacy Code With Helix QAC