A Code Review Checklist
Thu 28 May, 2015 / By Jon D Jones
On a recent project, a client had a very in-depth peer review system using a web based code review system called Gerrit. Gerrit works with GIT and prevents unauthorized developers to push changes into master Git repository. Instead, Gerrit provides a web based front-end that will only allow commits to be merged after it has been plus two'ed, or, in more human terms, when someone has OK'ed it to be merged. If the submitted code is not up to scratch, then it gets minus 1'ed and the submitter gets an email notification. On this particular project, because deadlines were pushed and to try and improve velocity, I became the hub of reviews for the rest of the team.
As a good few hours of everyday was spent reviewing, it led to creating a check list of things I look for when I’m doing code reviews and to talk about the best way I’ve found to approach them.
What I look for during a review
DRY (don't repeat yourself) is one of the first things I'll comment on. If I see code that's duplicated in multiple places, it is usually a sign that one of the developers doesn't know about some API call, is unaware they've been working on a similar part of the project as someone else, or the architecture of their design needs looking at.
This is another big one I look for. If I can scan a bit of code and easily understand what it's doing, then, even if it's not the perfect architecture right now, when someone comes back to it in a week, month, year's time like me, they can understand what it does easily.
The biggest offenders I see are simplifications of names, 'address' can get shortened to 'add' , what does add mean? Is it a number etc.. Another big offender is people not writing descriptive names
This ties in with the single responsibility principle. In general, I think a method should be no longer than 20-25 lines max. If you don't think you can, then that's a good hint you're doing more than one thing in your method.
From experience with unit testing... the longer a method is.. the harder (sometimes impossible) to test correctly. When you use Resharper Extract to method it takes seconds, so there's no excuse
This also relates to above. The same holds true on a class. If you have a class that is longer than 500 lines you are probably trying to mix too many concepts in one file.
Using the businesses domain language for naming conventions
I've worked on so many companies that have multiple names for the same things. What usually happens is the business folk name a feature 'x', then when it comes to development, a developer decides to name something 'y' as it makes more sense to them. For the rest of the project, communication between the business and the developers gets strained as people are constantly talking on cross references. When you build a project, it's import to use the right business terminology for your classes and properties.
This is a another SOLID principle. If I see a lot of if statements in a class or pull request that always raises an issue that a pattern might be a better choice. if the system scales, then this will be a massive issue.
Programming To Interfaces
Whenever I see something new'ed up within a class, I'll always consider if it should be injected via the constructor.
Has It Got An Associated Unit Test ?
This one is pretty obvious. Like most projects, there always has to be some flex depending on how critical the component is and how tight the deadlines is.
Can It Be Unit Tested Easily ?
If it's agreed the method or function won't be unit tested now, then the code still has to be easy to unit test.
Will It Throw An Object Reference?
I see a lot of methods where parameters get passed in and are not checked if they are null. Over the years this has caused so many issues if it reaches production code. The easiest way to get around these issues is to use a Guard
that has unit tests around it. Using a fail fast approach to error detection usually means these random object reference issues never make it to production code.
This will vary depending on if we're using a fail fast or slow approach. When fast, make sure we throw exceptions, if slow, then make sure any code that might throw an exception is logged.
High Priority Issues
No Magic Numbers
The amount of code I struggle to read as there's a random 69 or other such random number. Put the magic number in a constant that describes your intentions and problem solved.
On all my projects, I always have a constant / resource manager. If you write a string in your class, unless it's a one off, I'll ask someone to put it in the resource file.
Every company has their own unique standards, single line if statements, using var etc.. having a code base that looks like it has a consist writer helps maintenance and troubleshooting later down the line.
Not everyone may agree with Resharpers best practice hints but for 8-% of the time they are usually the most optimal, things like .'Invert if to reduce nesting' etc..
Good spelling is always important.. 'nuff said.
If I ever see a comment, then I automatically assume the code around it is bad. If someone usually needs to write a comment, it's usually a sign the code is to difficult to read or understand. This is a very obvious sign that something needs extracting, renaming or refactoring.
Number of method arguments
I think if you have more than 5 parameters being passed into a method or class, then there's an issue. I've always used the code complete or clean code mantra (can't remember the source now) and extract everything into an injection DTO, or, look at an alternative to split the method up.
Complicated If Statements To Bool Property?
Sometimes we have legacy code or you need to have a very deep nested if statement. If it can't be avoided, at a minimum, extract the if to a boolean property whose name reveals the intention.
Missing Doc String
My personal preference is to not bother with doc strings. If you follow good single responsibility, good naming, small classes and methods, then doc strings are a bit obsolete if you're building an internal project.
Before we launch a project, performance deserves it's own phase. Until we're in a stage to get a measure where the bottlenecks are then this is a low. Obviously, if someone is iterating through a loop or doing something stupid, this will be flagged. If the issue is a subjective thing then it will be more likely to be allowed until the performance stage.
In a lot of projects, you will undoubtedly encounter subjective opinions in the best way to achieve something. If the feature or item in questions is not a show stopper (99% of them usually are) I'll always favour the code that's been written, or, the quickest win to get it shipped.
If a bit of code needs working on or changing at a later date, make sure there a todo next to it so it doesn't get forgotten.