
To Measure or Not to Measure

  • Moscow, Russia


The question was asked on StackExchange nine years ago (just around the time the site was launched): “If not lines of code, then what is a good metric by which to measure the effectiveness of remote programmers.” The answers, not surprisingly, were all along this line: programmers are not supposed to be measured! I bet those who answered were programmers themselves. Indeed, why would a programmer be interested in being measured and being reduced to a mere number?

Better Call Saul, Season 5 (2019) by Vince Gilligan et al.

First, let’s see why putting a metric on a programmer may be considered bad practice (my opinion: these are merely excuses from overpaid programmers and managers who are just trying to keep their jobs, doing whatever they want, and wasting employers’ money):

How do you like that? “Keep paying me, but don’t expect me to give you anything back! And don’t you dare check my performance!”—this is what I hear and I’m not surprised. What’s happening is called the performance management revolution and the gist of it is this: modern management is so weak that we desperately need an official name for this chaos, to avoid confusion. Agile is the name.

Managers are not in charge anymore, programmers are. And it’s sad.

However, not everybody believes in this anarchy. Some experts think that metrics are actually helpful. Bradley Kirkman et al. claim that “recognizing a single team member seems to have a positive and contagious effect on all the other members in the team,” and the famous Vitality Curve suggested by Jack Welch in 2001 is still with us.

I believe all the negativity aimed at metrics is caused by their incorrect usage. Indeed, if the only performance metric of software engineers, for example, is the number of hours they stay in the office, that will suggest only bad things about all metrics. Such metrics do hurt, no doubt about that. But the absence of good metrics hurts even more. How to choose them is the real question. Let me suggest a few, instead of the famous and incorrect LoC. Ideally you would pick a combination from this list, or even use them all together:

  • Features Delivered. Add a “feature” label to a ticket (or a GitHub issue) and then count the number of tickets with this label closed for each programmer. Of course, each ticket must be closed by a product manager, not by the programmer.

  • Pull Requests Merged. Each pull request has to be peer reviewed to make sure it makes sense and adheres to the quality standard of the project. Pull requests that are too small or too big must be rejected.

  • Bugs Fixed. Bugs work the same way as features, but are usually smaller. The more bugs a programmer closes, the better. Of course, closing must be done by product managers or other programmers who reported the bugs.

  • Bugs Reported. Once a bug is reported and accepted by the project, the reporter must get an extra point. This is how the quality of a project grows: by encouraging everybody to report bugs.

  • Releases Published. In a disciplined project new versions are released every day; in others, every week or every few months (or never). Every release is a stressful operation and it only seems logical to reward programmers for them.

  • Uptime. There is a set of DevOps metrics that demonstrate the quality of service in production, including MTBF, MTTF, failure rate, and so on. The longer the uptime, the better the programmers and their product.

  • Cost of Pull Request. The less time it takes to merge a pull request submitted by a programmer, the better the programmer, I believe. The faster their PRs go through peer reviews and quality control, the better. Junior programmers usually submit overly large or complex PRs, causing a lot of trouble during peer review. They also cause merge conflicts and sometimes even stale branches and never-good-to-merge PRs.

  • Documentation Pages Published. FAQ pages, Javadoc blocks, Wiki pages, blog posts, and so on—they help the project get closer to users and to future developers by increasing maintainability. Of course, every piece of text must be validated before publishing.

  • Mentee Results. Senior programmers may be mentors of more junior ones and may be rewarded for the rewards received by their mentees. All metrics listed above can work this way, rewarding or punishing mentors when their students are doing better or worse.
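To make the counting rules above concrete, here is a minimal sketch of how Features Delivered, Bugs Fixed, and Bugs Reported could be tallied. The `Ticket` record and its field names are my assumptions for illustration, not the schema of any real tracker. The one rule taken from the list is that work counts only when somebody other than the programmer closed the ticket.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Ticket:
    """Hypothetical ticket record; all field names are assumptions."""
    label: str                 # "feature" or "bug"
    assignee: str              # programmer who did the work
    reporter: str              # person who opened the ticket
    closed_by: Optional[str]   # None while the ticket is still open
    accepted: bool = True      # False for rejected or duplicate reports


def tally(tickets: Iterable[Ticket]) -> dict:
    """Count features delivered, bugs fixed, and bugs reported per person."""
    counts = {
        "features": Counter(),
        "bugs_fixed": Counter(),
        "bugs_reported": Counter(),
    }
    for t in tickets:
        # Work only counts when somebody other than the assignee closed it.
        closed_by_other = t.closed_by is not None and t.closed_by != t.assignee
        if t.label == "feature" and closed_by_other:
            counts["features"][t.assignee] += 1
        elif t.label == "bug" and t.accepted:
            # An accepted report earns the reporter a point right away.
            counts["bugs_reported"][t.reporter] += 1
            if closed_by_other:
                counts["bugs_fixed"][t.assignee] += 1
    return counts
```

A ticket that a programmer closes on their own is simply ignored here, which is one way to enforce the "closed by a product manager" rule without trusting anybody.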

I can’t stress this enough: each metric must have a quality control mechanism. Just counting bugs reported without checking their quality would lead to cheating: programmers will report whatever they like, just to bump the numbers up. Each bug must be verified by an architect: duplicates or low-quality bug reports are to be rejected. The same is true for every metric: trust without control leads to cheating.

It is also worth mentioning that features, bugs, pull requests, and documentation pages may have different complexity, urgency, and severity, which also should be taken into account, increasing or decreasing the numbers in each metric.
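Both ideas, quality control and weighting, fit into a few lines. The sketch below is only an illustration: the base points in `BASE_POINTS`, the `weight` multiplier, and the `verified` flag are numbers and names I made up, and a real project would have to agree on its own.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class WorkItem:
    """Hypothetical unit of work; all fields are assumptions."""
    author: str
    kind: str        # one of the keys of BASE_POINTS
    weight: float    # multiplier for complexity, urgency, or severity
    verified: bool   # True once a reviewer or architect accepted the item


# Illustrative base values only; the article does not prescribe any numbers.
BASE_POINTS = {
    "feature": 5.0,
    "bug_fixed": 3.0,
    "bug_reported": 1.0,
    "doc_page": 2.0,
}


def score(items: Iterable[WorkItem]) -> Dict[str, float]:
    """Sum weighted points per author, skipping anything not verified."""
    totals: Dict[str, float] = {}
    for item in items:
        if not item.verified:
            continue  # no quality control passed, no points
        points = BASE_POINTS[item.kind] * item.weight
        totals[item.author] = totals.get(item.author, 0.0) + points
    return totals
```

An unverified bug report contributes nothing, so flooding the tracker with junk does not move the number.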

Most of these metrics can be collected automatically, without any human interaction, for example via the GitHub API.
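As an example of such automation, here is a rough sketch that counts merged pull requests per author through the GitHub REST search endpoint (`/search/issues`), reading the `total_count` field of the response. The repository name, the user name, and the helper function names are hypothetical. Unauthenticated requests are heavily rate-limited, so a token is most likely needed for real use.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

SEARCH_API = "https://api.github.com/search/issues"


def merged_prs_query(repo: str, author: str) -> str:
    """Build a search query for merged pull requests by one author."""
    return f"repo:{repo} is:pr is:merged author:{author}"


def search_url(query: str) -> str:
    """Encode the query into a search URL; one item per page is enough."""
    params = urllib.parse.urlencode({"q": query, "per_page": 1})
    return f"{SEARCH_API}?{params}"


def total_count(payload: dict) -> int:
    """Extract the number of matches from a search response."""
    return int(payload["total_count"])


def fetch_count(query: str, token: Optional[str] = None) -> int:
    """Call the search endpoint and return how many items matched."""
    request = urllib.request.Request(
        search_url(query),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return total_count(json.load(response))


if __name__ == "__main__":
    # Hypothetical repository and user, for illustration only.
    print(fetch_count(merged_prs_query("acme/widgets", "alice")))
```

Swapping the qualifiers (for example `is:issue is:closed label:feature assignee:alice`) should give a similar count for other metrics, though who closed the ticket still has to be checked separately.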

In an ideal world of ideal management, the project compensates the work of its programmers according to the metrics collected. Instead of salaries, programmers get money for features, bugs, documentation pages, and so on. How far your project is from this utopia is an indicator of your professionalism as a project manager. Lousy managers don’t measure anything and make everybody “happy” by keeping wages high and control low … until the project runs out of money. On the other hand, exceptionally good managers let metrics control everybody, making the best happy and the worst … quickly find their way out.

Which one are you?
