Code Review Criteria
I have had the opportunity to review a lot of automation test code and to work with teams as they begin peer reviewing each other's code. I felt it would be useful to codify and share some of the criteria I use. To be clear, these are the criteria I use for automated user interface and acceptance tests for web applications. Much of this may apply to other test types or systems, but no warranties are expressed or implied if used outside of its intended use.
> Coverage
Does this test code cover the cases that we think it should? Every automated test is designed to cover one or more use cases. It is surprising, but it is not uncommon for a use case to be dropped, or not truly covered. It is just as much of a problem for a test case to cover more than it is intended to cover. See the bonus section below on how to detect extraneous coverage.
> Fixed Waits
When testing a user interface, it is common to wait for an action to have an effect before checking the results. However, waits should always be for some condition and should continue as soon as it is met. A fixed wait has to be set to the longest possible time plus some extra leeway (in case future changes require more time). The test always has to wait that long, which makes it very inefficient. And some day the test will start taking longer, your fixed wait will not be enough, and you will start getting “random” failures.
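For example, here is a minimal sketch assuming Selenium with Python; the URL, element ID, and timeout are hypothetical. The conditional wait returns the moment the results appear, while a fixed sleep always burns its full duration.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://app.example.test/search")  # hypothetical page

# Anti-pattern: time.sleep(30) always pays the full 30 seconds, and still
# breaks the day the page takes 31.

# Conditional wait: continues as soon as the results are visible, and fails
# with a clear timeout if they never appear.
results = WebDriverWait(driver, 30).until(
    EC.visibility_of_element_located((By.ID, "search-results"))
)
driver.quit()
```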
> Third Party Dependencies
In a complex application (web or mobile applications in particular), some features may depend on third-party services that are unrelated to what you are actually testing. Automated tests need to avoid being dependent on these third-party services. You have to look closely in a peer review to catch these; third-party dependencies are a major cause of unreliable tests. See the bonus section below for methods to detect third-party dependencies.
> Separation of test data / code / model
Watch for test data mixed in with your code or with your data model. These should all be kept separate to make test maintenance and troubleshooting easier.
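As a rough sketch of what that separation can look like (file names, locators, and fields are hypothetical), the test data lives in its own file, the page model lives in its own class, and the test only ties them together:

```python
import json

from selenium import webdriver
from selenium.webdriver.common.by import By


class LoginPage:
    """Model of the page: locators and actions only, no test data."""

    def __init__(self, driver):
        self.driver = driver

    def login(self, username, password):
        self.driver.find_element(By.ID, "username").send_keys(username)
        self.driver.find_element(By.ID, "password").send_keys(password)
        self.driver.find_element(By.ID, "login-button").click()


def test_standard_user_can_log_in():
    # Test data comes from its own file, not from the code or the page model.
    with open("testdata/users.json") as f:
        user = json.load(f)["standard_user"]

    driver = webdriver.Chrome()
    driver.get("https://app.example.test/login")
    LoginPage(driver).login(user["username"], user["password"])
    assert "Dashboard" in driver.title
    driver.quit()
```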
> Hard Coded Values
When you see hard-coded values sprinkled in with your test code, you should call them out. Hard-coded values should preferably be configuration driven so that you can override them on a per-run basis, or at least consolidated into a single place where they are easy to find and update. Furthermore, you should share a limited set of values rather than using unique values for each test. Again, this makes the tests easier to update and maintain.
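One way to make those values configuration driven (the variable names and defaults here are only illustrative) is to read them from the environment in a single shared module, so every test picks them up from one place and any run can override them:

```python
# config.py - the single place where run-wide values live.
import os

BASE_URL = os.environ.get("TEST_BASE_URL", "https://staging.example.test")
DEFAULT_TIMEOUT = int(os.environ.get("TEST_TIMEOUT_SECONDS", "30"))
STANDARD_USER = os.environ.get("TEST_STANDARD_USER", "qa_standard_user")
STANDARD_PASSWORD = os.environ.get("TEST_STANDARD_PASSWORD", "not-a-real-password")
```

A test then imports `config.BASE_URL` instead of repeating the URL, and a CI job can point the whole suite at another environment with something like `TEST_BASE_URL=https://qa.example.test pytest`.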
> Performance
Watch for issues that slow the tests down or add inefficiency. Look for unnecessary or duplicate steps; they should be eliminated.
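One frequent source of duplicate steps is repeating setup such as login in every test. Here is a sketch of pulling that into a shared pytest fixture, reusing the illustrative config module from the previous sketch; the locators and paths are placeholders.

```python
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

import config  # the illustrative shared configuration module sketched above


@pytest.fixture(scope="session")
def logged_in_driver():
    # Perform the login steps once per run instead of once per test.
    driver = webdriver.Chrome()
    driver.get(config.BASE_URL + "/login")
    driver.find_element(By.ID, "username").send_keys(config.STANDARD_USER)
    driver.find_element(By.ID, "password").send_keys(config.STANDARD_PASSWORD)
    driver.find_element(By.ID, "login-button").click()
    yield driver
    driver.quit()


def test_dashboard_loads(logged_in_driver):
    logged_in_driver.get(config.BASE_URL + "/dashboard")
    assert "Dashboard" in logged_in_driver.title
```

Sharing one logged-in session trades some isolation for speed, so it is worth calling that choice out explicitly in review.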
> Assertion Messages
When a test fails, it should be clear what the underlying issue is that needs to be looked at. When you see an assertion, ask yourself, “How would this appear in the error report, and would I understand it?” The failure message should make it clear what was being tested and why it failed.
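A small example, assuming pytest-style assertions (the checkout scenario and names are invented): the first form only reports that two numbers differ, while the second names the check and shows both values in the failure report.

```python
def check_order_total(order_total: float, expected_total: float) -> None:
    # Weak: on failure the report only shows that two numbers were not equal.
    # assert order_total == expected_total

    # Better: the report says what was checked and shows both values.
    assert order_total == expected_total, (
        f"Order total on the checkout page was {order_total}, "
        f"expected {expected_total} after applying the discount code"
    )
```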
> Complex Code
Complicated code is difficult to understand, review, troubleshoot, and maintain later. There is usually some way to simplify it, even if that means breaking the code up into separate, more narrowly focused functions and ending up with more total code.
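A sketch of what that breakup can look like (the flow and helper names are invented, and the bodies are elided): one long test that creates a customer, places an order, and verifies the invoice inline becomes three narrowly focused helpers that the test calls in order.

```python
def create_customer(driver, name: str) -> str:
    """Fill in the new-customer form and return the new customer id."""
    ...


def place_order(driver, customer_id: str, sku: str) -> str:
    """Add the item to the cart, check out, and return the order number."""
    ...


def verify_invoice(driver, order_number: str) -> None:
    """Open the invoice page and assert the totals and line items."""
    ...


def test_order_generates_invoice(driver):
    # Each step can now be read, reviewed, and fixed on its own.
    customer_id = create_customer(driver, "Test Customer")
    order_number = place_order(driver, customer_id, "SKU-123")
    verify_invoice(driver, order_number)
```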
> Race Conditions
Race conditions are the hobgoblins that cause many “random” test failures. They can be hard to spot in test code, but they are there. Watch for cases where things can happen in a different order from run to run, but the test depends on a single ordering.
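As an illustration (Selenium with Python again; the page and locators are hypothetical): after submitting a form, the old status text can still be on the page for a moment, so a test that reads it immediately sometimes sees the old value and sometimes the new one. Waiting on the specific condition the assertion needs removes the dependence on ordering.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://app.example.test/profile")  # hypothetical page
driver.find_element(By.ID, "save-button").click()

# Racy: the click and the status update happen asynchronously, so reading
# the text right away sometimes sees the stale value and "randomly" fails.
# status = driver.find_element(By.ID, "status").text

# Deterministic: wait for the exact condition the assertion depends on.
WebDriverWait(driver, 10).until(
    EC.text_to_be_present_in_element((By.ID, "status"), "Saved")
)
assert "Saved" in driver.find_element(By.ID, "status").text
driver.quit()
```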
> Custom Code
Watch for custom test code that does not use the agreed-upon generic routines. Your generic or library code is meant to be well tested, performant, and maintainable. Custom one-off code in individual tests needs to be caught and corrected.
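In review this often shows up as a test re-implementing a flow the shared library already provides. A sketch of the before and after (the library module, function, and fixture names are made up for illustration):

```python
# What to flag: the test rebuilds the product search flow with its own
# locators and ad hoc waits instead of the shared, well-tested routine.
#
#     driver.find_element(By.ID, "search-box").send_keys("widget")
#     driver.find_element(By.ID, "search-go").click()
#     ... custom wait logic ...
#
# What it should look like: one call into the agreed-upon library.
from testlib.pages import search_for_product  # hypothetical shared module


def test_widget_appears_in_results(driver):
    # The shared routine owns the locators, waits, and error handling.
    results = search_for_product(driver, "widget")
    assert any("widget" in name.lower() for name in results)
```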
—— Bonus ——
> How do I detect extra coverage or third-party dependencies?
This usually seems difficult, but I have found a very simple way to find the extra things a test covers or the third-party dependencies it has. Ask yourself, “What would cause this test to fail?” If the answer is something like “if the bob system is down,” then you have a dependency on the bob system in your test. And if the bob system is not part of the build you are testing, then it is a third-party dependency. If the answer is “if feature x fails,” then your test also covers feature x.
It seems simple, but I find that asking this one little question is very enlightening and helps me find unintended dependencies.