June 20, 2014

Your Code Review Questions Answered

Helix Swarm

DevTalk Code ReviewOn Tuesday, June 18th I presented a webinar on 3 ways to make the most of your code reviews. I covered the philosophy our team has adopted and the process we go through to make code review work for us. If you missed it, the recording is available here. We received a lot of excellent questions, so many I couldn't answer them all in the webinar. To ensure no question is left behind; I’ve answered all of the remaining questions below. If you have any additional questions or any of my answers are at all unclear, I encourage you to post on our Swarm forum. I will also be presenting code review best practices at MERGE 2014, The Perforce Conference, learn more here.

Q: Are reviewers focused on a section of code or does everyone review all of the code?

I'm fairly confident what you're asking is "do specific folks tend to review specific areas of the codebase."

We try to mix it up. Ideally we'd like everyone to have at least a passing familiarity with a broad chunk of the codebase.

That said, you inevitably end up with specific folks simply having a better understanding of certain areas (perhaps because they personally wrote them) making them the ideal candidate to do a review.

The solution we've been using is to have anyone who's available do a first pass review (avoiding any real bias for expertise or area of the codebase). The reviewer or the author is encouraged to add another person they feel is particularly expert as a required reviewer though.

As a general rule we also have one of the senior developers give anything non trivial at least a skim. Commonly they are also the expert you'd want to rope in which kills two birds with one stone.

One could read into your question "when performing a review how do folks determine how to limit what they look at", though I'm pretty sure that isn't what you meant; I'll answer it anyways :)

For our process, we focus most heavily on lines which have been added/edited/removed.

That said, to do a good job you need to look carefully at the context the code lives in. It’s not uncommon to notice some unmodified areas that would also benefit from updates and drag those into the conversation for updates.

Q: Do you suggest that the team as a whole go over the code as a group?
Q: How many people do you think are ideal for a code review?
Q: Are 'group code reviews' more efficient than individual reviews?

It’s common on our team to have a random dev do an initial round of review and an expert or senior developer do a final pass. When this occurs the later reviewer may spend a bit less time on it under the assumption its largely been looked at though that depends on the confidence of the initial reviewer and complexity of the change.

Beyond that I think it’s a matter of tradeoffs. Would more eyes find more bugs resulting in better software? Quite possibly. But at what point does the effort outweigh the improvement.

In non-trivial systems there will be bugs. You are not going to get them all. I think the approach of a single reviewer when they feel they confidently understand the work or two when the initial reviewer wants to call in an expert for a sweep hits a pretty reasonable sweet spot of quality v.s. time.

That said; I'm confident other approaches could work as well and I'd be curious to hear about your experience if you've tried larger groups.

Q: Do you have a checklist to focus the reviewers’ actions (design, requirements, logic, cascading affects, etc.)?
Q: Do you follow a review checklist of sorts? To key in on specifics?

As a team, we met and came up with the philosophy and general steps that I outlined in my talk.

We keep this information on our project's wiki allowing us to reference back to it and to get new team members up to speed.

That said, I certainly keep these points in mind but I don't personally utilize a literal checklist. Doing so, particularly when you start out, seems a fine idea if the approach appeals to you or your team though.

Q: What level of test coverage do you consider to be good enough?

This is a difficult question to answer. We shoot for an average line coverage across the project of at least 80%. At times it’s been a bit over 90% but we don't kill ourselves to keep it that high.

I'd note you can have 100% code coverage and still have a lot of bugs. Coverage just means a line was tickled not that every possible input was tested. On this basis, we shoot to verify the 'happy path' of usage has tests and at least a few edge cases and common error cases are also tested.

The 20% that remains untested commonly represents unlikely errors where the defensive code to improve logging or add grace to the failure is so unlikely to come up we don't kill ourselves on covering it.

Q: For automated deployments, that assumes post commit reviews. Do you recommend pre commit reviews?

Automated deployments work great for pre-commit reviews. As we use the term more, I come to feel Automated Deployments likely isn't the best description for the feature and may lend to your misimpression. You could think of it as an Automated Previews if that helps at all.

The idea is to provide a build or instance of the software that is running the proposed changes. For committed reviews we provide a copy that is running at the committed change.

For our particular project, a new virtual host is configured with a sample database and the proposed work is unshelved into that environment.

This allows folks looking at the review to simply click a 'try it' button and be looking at the proposed changes without that work needing to be first committed.

For compiled projects you may provide a 'try it' link to an exe or, if you're feeling fancy, a web based VNC session to a VM running the compiled software.

Q: Can you provide an example of integrating deployment automation with code review sessions?
Q: How do you integrate tests with pre-commit reviews?
Q: In which part of the process do you suggest to integrate the both of them?

At a high level, in most systems, you will be configuring a 'webhook' or other script to run on specific events.

For deployment and automated tests you want to be notified when a review is created or its files updated.

These events will trigger the code review system to call a specific web address (for example a Jenkins URL to start the tests).

On the far end you need a system to respond to this request and perform the desired actions (run tests or do a deploy in the example usages).

We provide details in the Swarm documentation on how to do this with Jenkins. I'd also note we're releasing a new version of the Perforce Jenkins plugin and updating the Swarm UI shortly to make this configuration step even easier.

We don't provide a ton of help on automated deploys at the moment beyond the infrastructure. Actually doing the deploy is left as an exercise to the reader as we anticipated the best approach would vary notably by product. That said; if you'd like to share details on your particular usage on the Swarm forum we'd be happy to advise.

On our team; we integrate both automated deployment and automated tests. They both run in parallel so an attempt will be made to deploy even if the tests fail.

Q: Is it a good practice before doing a code review run a static analysis tool? What other pre-code review tools can you suggest?

Static analysis tools are fantastic and certainly a fine addition to your pre-commit test run (particularly if they complete in a timely manner).

We utilize Jenkins to run our pre-commit review tests and this allows for a variety of steps to be defined. For our project, the Jenkins review test includes code sniffer (to enforce code styling rules), jslint, unit tests and, shortly, selenium tests.

Our goal is to make it unlikely checking the code into main would fail tests so we try to cover the same high points.

That said, our mainline tests include things like running against older server versions, enabling unicode support, enabling noisy triggers and otherwise playing with environment variations so its certainly possible for tests to fail at the later stage.

Balanced against responsiveness though, for us, the single environment tests have worked well.

Q: Do you run just smoke tests in your reviews? Or the whole suite? Have you seen an ideal maximum test run time for code review purposes?

Presently, we run all of our unit tests but only against a single server version and under a single environment. So it would be reasonable to consider it just a smoke test though, at present, it still has decent coverage.

As time goes on and we add tests I expect we'll trim back what runs on reviews a bit to keep the response time acceptable. We may also at some point stage tests so a quick one runs and, if that passes, a longer suite after.

Ideal time is hard to provide but certainly quicker is better. If tests take too long you're stuck waiting on that feedback to proceed after updates to the code. I'd shoot for under 15 minutes if you're able.

Q: I would like to have a follow up presentation with more specifics of the things we wish to catch in code review.

That's a good idea; thanks for the feedback.

We were all pleased to see how popular this presentation was and based on that it seems likely we'll host some related ones in future.

Q: How do you deal with leadership anxiety for finishing a 'fix' and at the same time do the code review?
Q: Thoughts on how to stress the importance of reviews to a team that seems to gloss them over quickly?
Q: How do you make sure reviewers perform the review?

I think leadership anxiety is mitigated by having a culture on the project where quality is seen as valuable and important. I'd note its not really enough just to say those words either. Its important people truly believe it and see the value in it (and there is certainly value to be found).

This also applies to getting developers/reviewers onboard.

If all a developer, or leader, gets told is that reviews are very import, we need to spend time to really really read the code; I don't suspect they'll be on board or do it. I wouldn't.

If you take the time to explain why you want it done, what you see as the value of it and the general motivation I think you'll have much better luck.

If a user experiences a bug, that detracts in an unrecoverable way from their impression of your product and company.

Additionally, tracking down the bug when its occurring at a customer site is difficult and time consuming (both for you and for the customer).

In my experience when we do reviews we commonly find bugs that otherwise would have slipped through. If you too find bugs in review, you'll note studies show fixing bugs that have shipped takes a great deal of time; more than catching them in review would in my experience.

We're not striving for perfection; I'm under no illusions we'll never ship bugs, we certainly do. We're trying to hit a level where bugs are uncommon enough to be acceptable and complex enough to elicit that we can be prideful of our work.

In those cases, hopefully rare, where you have a critical bug that needs fixing; that's an all hands on deck situation.

Commonly we'd consider pair programming the fix and that (with a last sweep of the change by the pair programmers) wraps a review right into the process and allows for a very timely resolution. The process doesn't have to make your reaction time to events of extreme importance slower.

Q: Do you recommend to commit then review or review (shelved) then commit?
Q: Pre-commit review or post-commit review? What do you see as the tradeoffs between the two?

I've really enjoyed having the reviews be shelved work and only committing it when it’s pretty well baked. If you're using Perforce I'd really recommend this approach.

I do imagine other approaches could work but shelving the work for review and considering the mainline a stable'ish branch that is reasonable to run off works well.

If you commit the work, I think you need to either consider main less reasonable to run off of or commit against another branch. This likely increases merging requirements.

You'll also need to deal with backing out changes, using feature flags or otherwise handling features that slip the release which in my experience adds complexity and risks stability.

That said; I'm not personally fixed on pre-commit. If I saw a nice approach for doing it another way I'd certainly give it a go. Particularly if I was using another system, for example git, pre-merge would be the best approach I'm aware of.

Q: Can regression test replace part of the review process?

I would say a resounding no to replace. At the same time though, I certainly wouldn't want to develop without tests. They are in my view a compliment to review and a very fine one at that.

Q: What are some common development tools?

I'm a fan of Perforce, and Swarm is also rather delightful :).

Jenkins, Electric Commander or other automated build/test tools are great.

Automated testing, static analysis, code sniffers are fantastic (but specific tools generally vary by language being processed)

Q: How much code do you cover in a review and for how long a time?
Q: What % of your development cycle time do you spend on code reviews?

A useless; but accurate; answer would be however long it takes.

In an attempt to not be useless though, I'd say it varies to some degree based on the complexity of the work being done and the reviewer's current familiarity with that area of the code.

If you're quite up to speed on it and the change isn't dramatic, its commonly pretty quick.

If the code is all new to you and it’s a big change (or even a modest one in a complex area) it can take a fair while.

I think our senior dev's spend around 40% of their time doing code review. We'd like to get this lower to be honest but to date we haven't found a great work around.

The less senior dev's spend less time on review, probably closer to the 10% range as a reviewer. More if you count the time they spend revving things to respond to review feedback though.

Q: Suggestions for reviewing 250k lines of legacy code?

... ugh.

I wouldn't purport to having authoritative knowledge on the topic but I can certainly tell you what I'd try.

I'd start by ignoring the bulk of the advice I provided in my presentation. You're not going to carefully look at 250k lines of code; your eyes would melt.

I'd start by focusing on automated tests. I wouldn't be satisfied till I had at least light tests of all happy path operations (so things the user is liable to do that aught to work). I'd also like to have tests of common failure paths that are worth testing (plausible to occur, can be handled with some grace when detected).

Once that's done, I'd be inclined to determine which usages of the product are most critical to the user. I'd consider giving a shallower review of code in those areas. That said; it’s a balance. If the system is established and running well I wouldn't change things for the sake of changing them I'd only fix clear bugs personally.

With that done, I'd focus on doing well going forward and fall back to my general advice. The primary caveat at this point is to keep an even broader view of your surrounding UI and code as you know its likely to be lightly if at all reviewed.

Q: You cited that the team success is related to the median skill--where did that come from?
Q: Now he's got me curious, if you can find the This American Life episode would it be possible to link to it later?

The study was referenced in episode 370, "Ruining It for the Rest of Us," of the radio program This American Life. Details are available here and you can also read a transcript.

- - - 

You can watch this webinar on-demand. Follow us on Twitter or sign up to get notifications to hear about future DevTalk webinars.