What Is Differential Analysis?
Differential analysis is an emerging approach in static analysis. Here, we break down what differential analysis is and how it works.
Read along or jump ahead to the section that interests you the most:
- What Is Differential Analysis?
- Incremental vs Differential Analysis
- Why Is Differential Analysis Important?
- Use Cases for Incremental and Differential Analysis
- How Differential Analysis Works with Klocwork
- Examples of Differential Analysis with Klocwork
What Is Differential Analysis?
Differential analysis is a form of “fast feedback” static code analysis. It uses system context data from previous analysis builds to analyze only the new and changed files. At the same time, it provides results as if the entire system had been analyzed.
This approach provides the shortest possible analysis times for the new and changed code — while maintaining the accuracy and detail of the analysis data.
Incremental vs Differential Analysis
Incremental analysis performs an "update analysis", much as a compiler toolchain performs an "update build" rather than a full, clean rebuild. It maintains "analysis objects" from a prior analysis build; each one contains the analysis data for a module, plus timestamps or checksums for the files that the module depends upon. With these, incremental analysis can run a dependency check to determine which source files, if any, have changed and need to be re-analyzed. The whole application is, in effect, re-analyzed with the minimum effort. Done correctly, incremental analysis is much faster than a full analysis, with no impact on the accuracy of the results.
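The dependency check described above can be sketched as follows. This is a minimal illustration, not Klocwork's implementation: the `checksum` and `stale_modules` helpers, the in-memory file contents, and the module names are all hypothetical, and a real tool would read files from disk and might use timestamps instead of checksums.

```python
import hashlib


def checksum(content: str) -> str:
    """Return a stable fingerprint of a file's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def stale_modules(analysis_objects: dict, current_files: dict) -> list:
    """Return the modules whose recorded dependencies have changed.

    analysis_objects maps module name -> {file path: checksum at last analysis}.
    current_files maps file path -> current file content.
    """
    stale = []
    for module, recorded in analysis_objects.items():
        for path, old_sum in recorded.items():
            content = current_files.get(path)
            # A deleted or modified dependency invalidates the module.
            if content is None or checksum(content) != old_sum:
                stale.append(module)
                break
    return sorted(stale)


# Hypothetical project: module "a" depends on a shared header, "b" does not.
files_v1 = {"a.c": "int a;", "b.c": "int b;", "util.h": "#define N 1"}
objects = {
    "a": {p: checksum(files_v1[p]) for p in ("a.c", "util.h")},
    "b": {p: checksum(files_v1[p]) for p in ("b.c",)},
}
# Only the header changes between the two analysis builds.
files_v2 = {**files_v1, "util.h": "#define N 2"}
```

With these inputs, `stale_modules(objects, files_v2)` selects only module `a` for re-analysis; module `b` keeps its existing analysis object.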
However, consider the worst case: in a badly designed system, the code changes include an update to a central module or interface that is used (or included), ultimately, by every other module in the system. The incremental analysis time is then equivalent to the full analysis time. Sadly, software systems that suffer from poor modularity, with unwanted dependencies and couplings, are common, and refactoring isn’t always possible. In very large, complex codebases with very long build and analysis times, this compounds the problem of long CI/CD pipeline runtimes.
Put simply, differential analysis is an enhanced form of incremental static code analysis, designed for use in CI/CD pipelines where codebases are large and complex, and where build and analysis times would otherwise be too long to give development teams “fast feedback”.
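To see why a change to a central interface can force a near-full re-analysis, consider a small sketch that walks an include graph backwards from the changed files. The `affected_files` function and the file names are hypothetical, chosen only to illustrate the fan-out.

```python
from collections import deque


def affected_files(includes: dict, changed: set) -> set:
    """Return changed files plus everything that depends on them, transitively.

    includes maps each file to the set of files it includes directly.
    """
    # Invert the include graph: file -> files that include it.
    dependents = {}
    for src, deps in includes.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(src)

    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


# Hypothetical include graph in which core.h is included almost everywhere.
includes = {
    "main.c": {"net.h", "core.h"},
    "net.c": {"net.h", "core.h"},
    "net.h": {"core.h"},
    "log.c": {"core.h"},
    "core.h": set(),
}
```

Changing `net.c` alone affects only that file, whereas changing `core.h` pulls every file in this graph back into the analysis.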
By using system context data from previous full analysis builds on remote build servers, the local static analyzer examines only the files that are new or that have changed as reported by the version control system (e.g. Git, or P4). Using this system context data, shared from the servers, the local static code analyzer is able to provide analysis results "as if" your entire system had been analyzed, even though only a small fraction of the system has actually been analyzed locally. Thus, differential analysis provides you with the shortest possible analysis times while reporting new issues in the changed code, and can facilitate “fast feedback” to development teams, regardless of the overall scale of the codebase, or the lack of modularity in its design.
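As a rough sketch of the first step, the list of files to analyze locally could be derived from the version control system's own report of changed paths. The example below assumes the input is text in the style of `git diff --name-only` output; the `changed_sources` helper, the extension filter, and the sample paths are hypothetical, and the article does not describe how Klocwork itself obtains this list.

```python
def changed_sources(name_only_output: str, extensions=(".c", ".cpp", ".h")) -> list:
    """Pick analyzable source files out of `git diff --name-only` style output."""
    paths = [line.strip() for line in name_only_output.splitlines()]
    # Skip blank lines and files the analyzer would not process (docs, configs).
    return [p for p in paths if p and p.endswith(tuple(extensions))]


# Hypothetical output for a feature branch compared against the main branch.
sample_output = "src/main.c\nREADME.md\n\ninclude/msg.h\n"
```

Only the files returned by `changed_sources(sample_output)` would be analyzed locally; everything else is covered by the system context data shared from the server.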
For tools like Klocwork, differential analysis also reports only the new issues detected since the last build, or with respect to the current main branch. This keeps developers focused on the new issues that would degrade the quality of the codebase and lead to compliance problems.
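One simple way to picture "new issues only" reporting is a comparison of the current findings against a baseline. The sketch below is an assumption about how such filtering could work, not a description of Klocwork's matching logic; the `Issue` fields, the checker names, and the fingerprint scheme are hypothetical.

```python
from typing import NamedTuple


class Issue(NamedTuple):
    checker: str
    path: str
    function: str
    line: int


def fingerprint(issue: Issue) -> tuple:
    """Identify an issue without its line number, which shifts as code is edited."""
    return (issue.checker, issue.path, issue.function)


def new_issues(baseline: list, current: list) -> list:
    """Return issues in current whose fingerprint is absent from baseline."""
    known = {fingerprint(i) for i in baseline}
    return [i for i in current if fingerprint(i) not in known]


# Hypothetical findings: the leak already existed and only moved by three lines.
baseline = [Issue("LEAK", "io.c", "open_log", 40)]
current = [
    Issue("LEAK", "io.c", "open_log", 43),
    Issue("NULL_DEREF", "main.c", "main", 21),
]
```

Here only the `NULL_DEREF` finding is reported as new. This fingerprint is deliberately coarse: two issues from the same checker in the same function would collapse into one, so a production tool would need a finer-grained identity.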
Why Differential Analysis Is Important
Differential analysis is important to ensure shorter analysis times and faster feedback.
Differential Analysis Accelerates Code Analysis
When you write code, you likely need to comply with quality, security, or coding standards. In general, this involves running a local or pre-commit analysis of your code.
But what if the code is already committed when the analysis is performed? You may have already moved on to another task. Fixing those issues could be pushed back further. It could even be pushed back as far as the final release cycle, when the backlog of open issues is presented to the development team.
A local or pre-commit analysis generally involves running an analysis of the entire codebase — complete with changes. It reports back the new issues.
This works perfectly well for smaller projects, but issues arise when dealing with larger codebases and longer analysis times.
Static analysis tools provide incremental analysis capabilities to solve this.
This means that a code change does not require a new analysis build of the entire system. Instead, it's an analysis of the changed files and any files with dependencies upon them.
However, that can still be inefficient. In a worst-case scenario, analyzing the dependencies is similar to analyzing your entire codebase. Depending on the project, this can range from a few minutes to several hours.
Use Cases for Incremental and Differential Analysis
To understand where developers get the most value from incremental or differential static code analysis, consider the feedback times that developers need in order to remain productive. Balance those against the constraints created by the size and complexity of the codebase, and by the number of changes and commits made since the previous build.
Feature Branch CI Pipeline
Within a feature branch, a developer typically commits several versions of the branch, and a CI job runs with each commit. In this case, differential analysis saves considerable time and provides fast feedback and focused results based on the changes.
That is, differential analysis reports the new issues or vulnerabilities in a specific change set of files (a subset of the application). For smaller applications, where build and analysis times are less of an issue, the entire application could be analyzed with each commit. Either way, this approach benefits development teams by reducing the number of issues merged to the main branch.
In development teams with multiple developers, each working on their own branch, a further CI pipeline execution may be initiated for the Dev branch after the Feature branches have been committed successfully and before all of the changes are merged to the Release branch. An incremental analysis is extremely beneficial here, since it carries all of the changes made by the various Feature branches and analyzes them “incrementally” compared to the previous Release branch revision. In many cases, where codebases are large and complex, this saves a tremendous amount of time, whilst still providing analysis results that are consistent with a full, clean build.
Additionally, as developers make daily changes to Feature branches that are eventually merged into the Dev branch, it is critical to re-establish the most up-to-date system context data on a regular cadence, so that the next differential analysis runs on new Feature branches are as relevant and correct as possible.
Release Branch CD Pipeline
Finally, with the creation of each release candidate upon the Release stream, it is usually recommended that a full, clean analysis is run to ensure that the most accurate analysis data is provided for the whole application, and with no interference from previous runs. This independence is often a requirement of safety and security standards in producing the final policy scan results and compliance reports.
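The guidance in this section can be summarized as a small branch-to-mode mapping that a CI job might consult. This is only a sketch: the `analysis_mode` function and the branch naming convention (`feature/`, `dev`, `release/`) are assumptions for illustration, not Klocwork settings.

```python
def analysis_mode(branch: str) -> str:
    """Map a branch name to the analysis mode suggested in this section."""
    if branch.startswith("feature/"):
        return "differential"  # fast feedback on each commit
    if branch == "dev":
        return "incremental"   # merged changes, results consistent with a full build
    if branch.startswith("release/"):
        return "full"          # clean analysis for policy scans and compliance reports
    return "full"              # unknown branches: fall back to the most thorough mode
```

Falling back to a full analysis for unrecognized branches is a conservative design choice: it costs time, but it never relies on context data that may not apply.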
How Differential Analysis Works with Klocwork
Here's how differential analysis works with Klocwork.
1. Connect to a Server Project
Klocwork’s differential analysis works automatically when a local desktop project workspace is connected to a server project.
2. Run an Integration Analysis
Each time you run a Klocwork integration analysis and push the results to the Klocwork server, they’re saved, and details of the interface behaviors of the existing codebase are made available to the client tools.
Each time you then run an analysis from your Klocwork desktop or CI tools, this interface behavior information serves as the baseline for the rest of the system context.
3. Run a Differential Analysis of Changed Code
When analyzing changed code, Klocwork may detect a call to another function or method that hasn’t changed and has therefore not been analyzed locally. In that case, the interface behaviors from the central server data are used to inform the analysis of your code instead.
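The lookup order described here (local results first, server data as the fallback) can be sketched with a layered mapping. The summary format below, a set of dereferenced argument positions per function, is a simplifying assumption, and `build_context` and `parseHeader` are hypothetical names; only `sendMessage` comes from the example later in this article.

```python
from collections import ChainMap


def build_context(local_summaries: dict, server_summaries: dict) -> ChainMap:
    """Combine summaries so locally analyzed functions override server records."""
    # ChainMap searches its mappings left to right, so local results win.
    return ChainMap(local_summaries, server_summaries)


# Server knowledge from the last integration analysis (hypothetical content):
# each function maps to the 0-based argument positions it dereferences.
server = {"sendMessage": {2}, "parseHeader": {0}}
# parseHeader was changed and re-analyzed locally; it no longer dereferences.
local = {"parseHeader": set()}

context = build_context(local, server)
```

With this context, `context["sendMessage"]` comes from the server because the function was not analyzed locally, while `context["parseHeader"]` reflects the fresh local result.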
Examples of Differential Analysis
Here are some examples of differential analysis with Klocwork.
The null pointer initialized on line 11 will be passed through the calls on lines 14 and 21 into sendMessage(..). There it will be swiftly dereferenced and cause this program to crash. If you were to use one of Klocwork’s desktop tools (the command-line kwcheck tool or the Klocwork Desktop GUI) to analyze just main.c, you wouldn’t see any defects.
But let’s say that you’ve got an integration build analysis checked into a project on your Klocwork server. If you were to connect your local project to that project and then re-analyze main.c, you would be able to see the defect.
The traceback doesn’t show detail from sendMessage(..), because that function hasn’t been analyzed locally. However, the knowledge base on the server does record that sendMessage(..) dereferences the 3rd argument passed to it.
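The difference between the two runs can be mimicked with a toy checker. This is a deliberately simplified model, not how Klocwork analyzes C code: the `Call` type, the `check_null_args` function, the argument values, and the single direct call are hypothetical, with `None` standing in for a null pointer.

```python
from typing import NamedTuple


class Call(NamedTuple):
    callee: str
    args: tuple  # each argument is a value, or None to model a null pointer
    line: int    # illustrative line number of the call site


def check_null_args(calls: list, summaries: dict) -> list:
    """Flag calls that pass None into a parameter the callee dereferences.

    summaries maps function name -> set of dereferenced argument positions
    (0-based). Callees without a summary are skipped rather than guessed at.
    """
    defects = []
    for call in calls:
        derefs = summaries.get(call.callee)
        if derefs is None:
            continue  # no knowledge of this callee, so nothing can be reported
        for pos, arg in enumerate(call.args):
            if arg is None and pos in derefs:
                defects.append((call.line, call.callee, pos))
    return defects


# Simplified stand-in for main.c: one call passing a null third argument.
calls_in_main = [Call("sendMessage", ("host", 8080, None), 21)]
```

Run with an empty knowledge base, the checker reports nothing. Run with a server record stating that `sendMessage` dereferences its third argument (position 2), it reports the call site, but it has no detail from inside the callee to add to the traceback.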
With this, you should have enough information now to fix the defect before you commit your code — all without having to analyze your entire codebase.
Entire Project Analysis
But what if you were to analyze the whole project at once? You would see the detail in the traceback about sendMessage(..).
The knowledge base that’s on the server is based on the most recent analysis checked in there. This is likely the last analysis of your project’s main codebase.
However, the knowledge base records for certain functions may differ from the records that would result from analyzing the code within your own workspace.
That difference is beneficial: the code you’re committing will soon be merged into your mainline, so the defects you see now are the ones that will likely appear after that merge.
With this type of analysis, you get:
- A preview of the defects you’ll get after merging.
- A chance to fix them ahead of time.
Start Using Differential Analysis
Sign up for a free trial of Klocwork and start using differential analysis today.