May 22, 2007

Perhaps They Should Have Tested More - XM Radio

Starting around noon yesterday, XM Radio went off the air for over 24 hours for many of its customers. XM Radio blames a problem that occurred while loading software.


From XMRadio's Web Site:

(http://www.xmradio.com/notices/signaldeg.xmc)

XM Service Update

As many of you know, XM customers have experienced service outages or significantly degraded service since Monday mid-morning, May 21.

We quickly identified the problem and are working hard to return to our normal levels of service. The problem occurred during the loading of software to a critical component of our satellite broadcast system, which resulted in a loss of signal from one of our satellites. We expect normal service to resume midday today (eastern daylight time).

XM apologizes for any inconvenience this has caused. For updates, please go to http://www.xmradio.com/

In the meantime, you can enjoy many of our music channels on XM Radio Online (xmro.xmradio.com) if you are close to computer.

Again, we regret any inconvenience for not having your XM Radio service fully available.

May 11, 2007

So The Urgent Drives Out The Important

I heard this quote recently from a co-worker, and found it very powerful.




“So the urgent drives out the important;

the future goes largely unexplored;

and the capacity to act, rather than the capacity to think and imagine becomes the sole measure for leadership.”


- Gary Hamel and C. K. Prahalad: "Competing for the Future"

Non-reproducible Bugs

Should you write bug reports for non-reproducible errors?
Or should you refrain unless and until specific steps to reproduce the problem can be determined?

To me, the answer is obvious - report the bug!  Let me tell you why.


When we talk about non-reproducible errors, there are several possibilities:

A. There is actually no bug, and what I thought I saw didn't represent a bug
B. There was a bug, but it has been fixed since I last observed it
C. There is a bug, but I don't currently know how to make it appear on demand

If I never report the bug, that is probably Ok in cases A and B.

But if I don't report the bug in case C - I have failed to report a real bug. And now, the possibility exists that this bug will not get fixed, and will ultimately be discovered by the customer. I consider this "a bad thing"!

If I report the bug, in case A some investigation will be needed, and it will ultimately be determined that the bug report is invalid. If my company likes to (misguidedly, in my opinion) penalize testers for invalid bug reports, I may get penalized.

If I report the bug, in case B some investigation will be needed, which will (hopefully) result in the bug report being marked as Fixed. Then the usual fix verification can occur.

If I report the bug, in case C the developer may be able to determine the cause by examining the source code, or other testers may encounter the same problem. In either of these cases, the bug report would be updated with additional findings and the bug would be fixed as usual.

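Here's the table of possibilities:

Case                                             | Don't write a bug report                                 | Write a bug report
-------------------------------------------------+----------------------------------------------------------+----------------------------------------------------------------------------
(A) There is no bug                              | Ok.                                                      | Ok. Some wasted effort. Risk of penalty for writing an invalid bug report.
(B) There was a bug, but it has since been fixed | Ok.                                                      | Ok. Normal verification process.
(C) There is a bug                               | FAILURE! Bug may get to customer undetected and unfixed. | Ok. Normal fix process. Normal verification process.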

To me, the risk of not reporting this potential bug far outweighs the additional investigation that might unnecessarily be triggered.
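
The comparison in the table of possibilities can also be written down as a small lookup. This is only a minimal Python sketch: the outcomes are copied from the table, while the names `OUTCOMES` and `has_failure` and the "skip"/"report" labels are my own illustrative choices, not part of any tool.

```python
# Outcomes taken from the table of possibilities.
# Keys are (action, case); the labels "skip" and "report" are illustrative.
OUTCOMES = {
    ("skip", "A"): "ok",
    ("skip", "B"): "ok",
    ("skip", "C"): "failure: bug may get to customer undetected and unfixed",
    ("report", "A"): "ok: some wasted effort, risk of penalty for an invalid report",
    ("report", "B"): "ok: normal verification process",
    ("report", "C"): "ok: normal fix process, normal verification process",
}


def has_failure(action):
    """Return True if any case (A, B, or C) ends in failure for this action."""
    return any(
        OUTCOMES[(action, case)].startswith("failure") for case in ("A", "B", "C")
    )


if __name__ == "__main__":
    for action in ("skip", "report"):
        print(action, "can fail:", has_failure(action))
```

Only the "skip" action has an outcome that starts with "failure", which is the whole argument in one line: reporting costs some effort in case A, but it never leaves a real bug unreported.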

On my teams, I encourage my skilled testers to use their talents wisely. If they see what they believe to be a bug, they should attempt to reproduce it.

If, after their investigation, they still feel they found a bug but are unable to reproduce it, then they should file a bug report, indicating that it isn't yet reproducible.
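
As a rough illustration of what "indicating that it isn't yet reproducible" might look like in a bug record, here is a minimal Python sketch. The `BugReport` class, its field names, and the example text are all hypothetical; a real bug tracker will have its own fields for this.

```python
from dataclasses import dataclass, field


@dataclass
class BugReport:
    """Hypothetical bug record; the field names are assumptions for illustration."""
    title: str
    observed: str                                  # what the tester actually saw
    reproducible: bool = False                     # False means "not yet reproducible"
    attempts: list = field(default_factory=list)   # what was tried to reproduce it

    def summary(self):
        status = "reproducible" if self.reproducible else "not yet reproducible"
        return f"[{status}] {self.title}"


# Made-up example values, only to show the shape of such a report.
report = BugReport(
    title="Example: dialog closed unexpectedly",
    observed="Seen once during testing; could not trigger it again.",
    attempts=["Repeated the same steps several times without a failure"],
)
print(report.summary())
```

Recording the reproduction attempts alongside the observation gives the developer, or the next tester who sees the problem, something concrete to build on when the report is updated later.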

I never let my testers get penalized for bug reports like these that eventually turn out to be invalid.  If there are complaints about these bug reports, I am more than happy to take the hit.


From James Bach's Blog comes this excellent post "How to Investigate Intermittent Problems":
http://www.satisfice.com/blog/archives/34