Number of Style Guide mismatches: Before collecting this metric, it’s vital to adopt a style guide and make sure the team is aware of it. After that, configure the linters’ config files in each repository to match those conventions. From then on, the number of mismatches tends to be accurate, and you can use it as an indicator.
Number of Issues found by linters: some linters do more than just style guide checks. They can look for security issues, TODO or FIXME mentions, and bad practices, like method or function declarations with too many arguments.
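
A minimal sketch of counting issues, assuming flake8 as the linter (the linter choice and the path are just examples); since it prints one violation per line, counting the non-empty lines of its output gives the total.

import subprocess

def count_linter_issues(path="."):
    # Run the linter and capture its report; flake8 prints one violation per line.
    result = subprocess.run(
        ["flake8", path],
        capture_output=True,
        text=True,
    )
    return len([line for line in result.stdout.splitlines() if line.strip()])

print(count_linter_issues("src/"))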
% of improvement: this is a nice metric that tech leads can extract periodically, like weekly or biweekly. To calculate it, you may use the formula (number_of_current_week_issues - number_of_last_week_issues) / number_of_last_week_issues * -1.
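
The same formula as a small sketch, multiplied by 100 only to express the result as a percentage (the sample numbers are made up):

def percent_improvement(current_week_issues, last_week_issues):
    # Positive result means fewer issues than last week.
    return (current_week_issues - last_week_issues) / last_week_issues * -1 * 100

print(percent_improvement(current_week_issues=80, last_week_issues=100))  # 20.0 -> 20% better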
Test Code Coverage: Test Coverage is another crucial indicator of quality. It’s essential not only to keep quality from dropping but also to measure whether a campaign to increase coverage is working. As with the metric above, you can also calculate the variation against any period.
Code Churn: Code that is rewritten or deleted shortly after being published may indicate hotspots with design issues. Knowing the level of churn is crucial to making data-driven decisions on this concern.
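
One possible proxy, sketched under the assumption that the repository history is enough: count how often each file was touched in a recent window with git log (the window below is arbitrary).

import subprocess
from collections import Counter

def churn_by_file(since="2 weeks ago"):
    # One proxy for churn: how many times each file was touched recently.
    log = subprocess.run(
        ["git", "log", "--since", since, "--name-only", "--pretty=format:"],
        capture_output=True,
        text=True,
    ).stdout
    files = [line for line in log.splitlines() if line.strip()]
    return Counter(files).most_common(10)

for path, touches in churn_by_file():
    print(path, touches)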
Number of Code Smells: Like the Number of Issues or Style Guide mismatches, it’s practically impossible to bring it down to zero. Business is dynamic and forces teams to postpone refactors and rewrites. However, choosing this number as an indicator is critical to keeping it under control.
Number of technical debts: a simple metric that sums the technical debt items currently in the backlog can be convenient. It gives an idea of how much effort the team needs to pay them off. Another metric, though, is crucial. I present it below.
New:Paid Ratio: better than the total amount, periodically measuring the pace of its evolution is crucial. For a specific period, sum all the technical debts inserted into the code base, then compare it with the number of paid debts. If the number of paid debts is higher than the number of inserted ones, then you’re at a good pace.
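
A sketch of the ratio, assuming the two counts come from your backlog tool for the same period (the sample numbers are invented):

def new_to_paid_ratio(new_debts, paid_debts):
    # A ratio below 1.0 means more debts were paid than created in the period.
    if paid_debts == 0:
        return float("inf")  # nothing was paid off
    return new_debts / paid_debts

print(new_to_paid_ratio(new_debts=4, paid_debts=6))  # ~0.67 -> good pace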
Time to Review: how much time does it take from opening a pull request to merging it? A good answer is a single summary statistic over the latest pull requests, and I recommend using the median here: averages hide too much information. Another tip: measure it in days. It reduces timezone-related problems and smooths forecasting projections.
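
A sketch of the median in days, assuming you can export opened/merged timestamps for recent pull requests (the input shape and dates are hypothetical):

from datetime import datetime
from statistics import median

def time_to_review_days(pull_requests):
    # pull_requests: iterable of (opened_at, merged_at) datetime pairs.
    durations = [(merged - opened).days for opened, merged in pull_requests]
    return median(durations)

prs = [
    (datetime(2023, 5, 1, 9), datetime(2023, 5, 3, 15)),
    (datetime(2023, 5, 2, 10), datetime(2023, 5, 2, 18)),
    (datetime(2023, 5, 4, 8), datetime(2023, 5, 9, 12)),
]
print(time_to_review_days(prs))  # 2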
Time to First Comment: This metric tells how much time it takes for the team to comment on a pull request. Unlike Time to Review, I recommend measuring this metric in hours. If the value is too high, tech leads can investigate what’s going on.
Pull Requests Size: The size of a pull request can be expressed in two ways: sum_of_lines_added + sum_of_lines_removed or number_of_changed_files. Both metrics are useful for finding out whether pull requests are massive. Extensive pull requests are evil: developers usually don’t review them thoroughly, which may end up pushing low-quality code forward.
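
Both ways of measuring it, sketched over a hypothetical per-file diff stat (the numbers are made up):

def pull_request_size(files_changed):
    # files_changed: list of (lines_added, lines_removed) per file.
    lines = sum(added + removed for added, removed in files_changed)
    return {"lines_changed": lines, "files_changed": len(files_changed)}

print(pull_request_size([(120, 30), (5, 0), (40, 12)]))
# {'lines_changed': 207, 'files_changed': 3}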
Number of Collaborators by Pull Request: this metric is simple to get: just find the mean number of collaborators per pull request. By collaborator, I mean every person who commented on the pull request. People who approved or declined but didn’t join the discussion are left out; they are counted in the Number of Approves and Declines.
Number of Comments by Pull Request: My tip is to find the mean and analyze it together with the Number of Collaborators by Pull Request. They are excellent indicators of how your team collaborates.
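
Both means in one sketch, assuming you can export commenters and comment counts per pull request from your code review tool (the field names and sample data are hypothetical):

from statistics import mean

def collaboration_means(pull_requests):
    # pull_requests: list of dicts with "commenters" (set of names) and "comments" (int).
    return {
        "collaborators_per_pr": mean(len(pr["commenters"]) for pr in pull_requests),
        "comments_per_pr": mean(pr["comments"] for pr in pull_requests),
    }

prs = [
    {"commenters": {"ana", "bruno"}, "comments": 5},
    {"commenters": {"carla"}, "comments": 2},
]
print(collaboration_means(prs))  # {'collaborators_per_pr': 1.5, 'comments_per_pr': 3.5}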
Number of Approves and Declines: most SaaS code review tools have an approve/decline feature. It’s common for members to approve a pull request and leave no comment. Sometimes that’s ok, but it can’t be the behavior of the majority. So the sum of approves and declines should be close to the Number of Collaborators by Pull Request; otherwise, the review quality may be in question.
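
One way to watch that gap, sketched over hypothetical per-pull-request aggregates; a consistently large positive gap suggests silent rubber-stamping:

from statistics import mean

def decision_vs_discussion_gap(pull_requests):
    # pull_requests: list of dicts with "decisions" (approves + declines) and
    # "collaborators" (people who commented on the pull request).
    decisions = mean(pr["decisions"] for pr in pull_requests)
    collaborators = mean(pr["collaborators"] for pr in pull_requests)
    return decisions - collaborators

prs = [{"decisions": 3, "collaborators": 1}, {"decisions": 2, "collaborators": 2}]
print(decision_vs_discussion_gap(prs))  # 1.0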
Deployment frequency: Features add new capabilities to the software, which increases end-users’ perception of value. Deploying is the final step of the development flow, and that’s why it’s so crucial. The more teams deploy in a day, the more value they add. You can track the number of deployments, calculating a mean or just summing them up. Most importantly, compare the metric among periods and keep a healthy pace.
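
A sketch of the mean number of deployments per active day, assuming you can export one date per deployment from your CI/CD tool (the dates are invented):

from collections import Counter
from datetime import date

def deployments_per_day(deploy_dates):
    # deploy_dates: list of date objects, one per deployment.
    per_day = Counter(deploy_dates)
    return sum(per_day.values()) / len(per_day)

deploys = [date(2023, 5, 1), date(2023, 5, 1), date(2023, 5, 2), date(2023, 5, 4)]
print(deployments_per_day(deploys))  # 4 deploys over 3 active days -> 1.33...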
Deployment size: You can measure the size of a deploy by looking at the number of commits, lines or files changed, or the number of work items in it, for instance. It’s crucial to keep this number low, so the frequency tends to increase. Lowering the size also reduces the risk of failure during deploys or rollbacks. Small deploys are also easier to test.
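
A sketch of the commit count between two deploys, assuming each deploy is tagged in git (the tag names are hypothetical):

import subprocess

def commits_between(previous_release, current_release):
    # Counts the commits shipped in a deploy, based on the two release tags.
    result = subprocess.run(
        ["git", "rev-list", "--count", f"{previous_release}..{current_release}"],
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())

print(commits_between("v1.4.0", "v1.5.0"))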
Bug Detection Rate: How many bugs are found in production? Anyone can find an escaped bug: end-users, Quality Assurance personnel, software engineers, anyone. Assuming there’s a process to register bugs in a proper tool, the rate is easy to calculate: sum the number of bugs created in a given period and divide it by the number of weeks or months.
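
The calculation as a sketch (the sample numbers are invented):

def bug_detection_rate(bugs_created, period_length_weeks):
    # Bugs registered in the tracker during the period, divided by the number of weeks.
    return bugs_created / period_length_weeks

print(bug_detection_rate(bugs_created=9, period_length_weeks=4))  # 2.25 bugs per week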