April 17, 2006

Misuse and Abuse of Bug Counts


I use bug counts to tell me something about the system-under-test.  They give me a rough idea of the volume of work remaining in different categories - how many bugs remain to be fixed at this instant, how many fixes remain to be verified, etc.  Bug counts are easy to gather and easy to chart.
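
For example, here's a minimal sketch of the kind of tally I mean, in Python.  It assumes your tracker can export a CSV with a "Status" column - the file name and column name are hypothetical, so adjust them for your own tracker.

    # Tally bug counts by status from a (hypothetical) tracker export.
    import csv
    from collections import Counter

    def count_by_status(path):
        with open(path, newline="") as f:
            return Counter(row["Status"] for row in csv.DictReader(f))

    for status, n in count_by_status("bugs.csv").most_common():
        print(f"{status:>10}: {n}")

A tally like this is trivially easy to produce - which is part of the problem, since easy numbers are the ones that get passed around.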

But I try not to publicize these bug counts.  Far too often, I see bug counts and rates relied upon, without much context, to provide information for which they are ill-suited.  I don't want people to make bad decisions based on misinterpreted or otherwise misused bug counts.  There's a saying I try to remember: there is nothing more dangerous than a Vice President with statistics.

Here are some bad ways to use bug counts or bug rates, and some reasons why I think they are bad.

Using Bug Counts as a Measure of Software Quality
  • Bug counts don't really tell much about the quality of a system.  If I told you that we have 0 Open bugs at the moment, what conclusions could you draw?  Could you really say that the system had high quality?  What if nobody had tested anything yet?  What if I had simply deferred all the open bugs to a future release?  What if all the bugs had just been marked as Fixed, but the fixes hadn't yet been verified?  What if yesterday the product had only 1 bug - "The software cannot even be installed" - and that was just fixed today?
Using Bug Counts as a Measure of a Tester's Performance
  • People are good at changing their behavior based on how they are measured.  If I tell them that "more bugs are good", they will create more Bug Reports.  But then I'll have to worry whether these are superficial bug reports, whether there are duplicates, and whether they are really finding the "important" bugs (which may take longer to find), rather than just blasting in many "less important" bugs to drive their bug count up.
  • If testers are measured on how many bugs they find, the developers may feel that they are being judged (in the opposite manner) by the same bug counts.
  • The bug count is usually strongly affected by the kind of testing tasks assigned to the QAer.  Someone assigned to test brand-new code may find lots of bugs easily, whereas someone assigned to do regression testing may find fewer.  Someone testing the UI may be able to find bugs more quickly than someone assigned to test lengthy database upgrades.
  • If people are judged based on their bug count, they may be less willing to perform all the other tasks that still need to be done - writing Test Cases, verifying fixes, helping others, etc.
Using Bug Counts as a Measure of a Developer's Performance
  • Again, people are good at tailoring their behavior to maximize their personal measures.  If they are incented to keep their individual bug count low, they can accomplish that - perhaps at the expense of other, desirable activities.  I worked in one shop that made a big deal of developers who had zero bugs at our weekly team meetings, held on Mondays.  So one developer made a habit of checking in a ton of changes at the end of the day on Friday and marking all her bugs as "Fixed".  She knew there would be no time to verify any of those fixes before the meeting, so she was always the "hero with zero" during the meeting.  Immediately after the meeting, her bug count would zoom back up as many of her quick fixes were invariably found to be defective.
  • When bugs are counted against developers, developers may spend more energy arguing over bug assignment than simply fixing the bugs.
  • Not all bugs are equal.  One really bad bug is often far more important to a business than many trivial bugs.  Bug counts usually do not express this relationship well - see the sketch after this list.
  • I never want a tester to be conflicted about writing a bug report.  If you find a bug, write the report!  No need to worry that it might adversely impact your developer friend.
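
To illustrate the "not all bugs are equal" point, here's a toy sketch in Python.  The severity weights are invented for illustration - I'm not proposing any particular weighting scheme, only showing that a raw count and a severity-aware view can tell opposite stories.

    # Toy comparison of raw vs. severity-weighted bug counts.
    # The weights below are invented for illustration only.
    WEIGHTS = {"critical": 10, "major": 5, "minor": 1}

    project_a = ["minor"] * 12            # 12 trivial bugs
    project_b = ["critical", "minor"]     # 1 showstopper, 1 trivial

    def weighted(bugs):
        return sum(WEIGHTS[severity] for severity in bugs)

    # Raw counts say A (12 bugs) is far worse than B (2 bugs).
    # The weighted view says they're nearly the same (12 vs. 11).
    print(len(project_a), weighted(project_a))   # 12 12
    print(len(project_b), weighted(project_b))   # 2 11

Even the weighted view is crude - the real point is that no single number captures what the business actually cares about.
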
Using Bug Counts as Project Completion Predictors
  • While your Release Criteria may include "Zero High-Priority Bugs" as one item, that doesn't mean you can use the current bug count, or even the rate of bug fixes, as a predictor of when you can release.  Bug counts don't predict anything.  For example, just because you have been able to close 10 bugs per week and you have 10 remaining does not mean you will be done in a week.  What happens if, at the end of the week, you find a new High-Priority bug?  Your release criterion still says you are not done.  Many measurement systems break down when the numbers are small.  (See the sketch after this list.)
  • Bug counts change in response to lots of factors that aren't always obvious in the raw numbers.  On one project, the bug count went down dramatically during one particular week near the end of the development cycle.  When I went to the weekly review, several team members pointed to that as evidence that we were almost ready to ship the product.  Unfortunately, I had to point out that the two best QAers were on vacation that same week.  Once they returned and our testing pressure was brought back up to normal levels, the bug count started to climb again.
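
Here's a small sketch of that naive projection failing.  The numbers are invented; the point is that extrapolating a closure rate ignores the bugs still being found.

    # Why "closing 10 per week, 10 remaining" doesn't mean one week left.
    # All numbers here are invented for illustration.
    open_bugs = 10
    closes_per_week = 10

    print(open_bugs / closes_per_week)  # 1.0 - the naive "weeks remaining"

    # But testing continues, and bugs are found as well as fixed.
    # One new High-Priority bug at week's end, and the release
    # criterion ("Zero High-Priority Bugs") is still unmet.
    found_this_week = 1
    open_bugs = open_bugs - closes_per_week + found_this_week
    print(open_bugs)  # 1 - not done, despite the projection

The arrival rate of new bugs is exactly what the naive projection leaves out - and near the end of a project, it's also the thing that matters most.
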
Comparing Bug Counts Across Projects
  • It's very often invalid to compare the count of bugs found in one project with the count found in another.  Projects vary in size, complexity, duration, technical challenge, staffing, etc.  Bug counts are very likely to vary as well.
Comparing Bug Counts Across Companies or Industries
  • This can be even more misleading than comparing within a company.  Companies don't build identical products, follow identical release schedules, or apply identical definitions of what is and isn't a bug.