May 25, 2006

Quality is in the doing

From "Verbing the Noun" - Dave Thomas and Andy Hunt - The Pragmatic Programmers:


Quality is another important word. We plan it, measure it, hire folks to manage it, and buy big posters about it (“Consolidated Widgets Inc., where Quality is a Word on the Wall”). But again, you shouldn’t use quality as a noun: you can’t measure, draw, or describe it. All you can measure are its effects, the things that result from doing things a certain way. People mistakenly equate (say) bug rates with quality. But in reality, bug rates simply correlate with the way the software was written. If we do things a certain way, the bug rates will be lower and we can claim to have higher quality. Quality is part of the process; it’s in the doing. Quality isn’t a set of rules or a set of metrics; it’s in the spirit of the smallest of our daily activities. Again, quality is a verb trapped in a noun’s body.
http://www.pragmaticprogrammer.com/articles/may_03_verbnoun.pdf

Perhaps They Should Have Tested More - Telus Voice Mail

Telus blames software for message service shutdown 

Last updated May 24 2006 08:32 AM PDT
CBC News


Faulty computer software was responsible for the six-hour loss of message service for about 25 per cent of Telus phone customers in B.C. and Alberta on Tuesday, the company says.

"Two pieces of equipment that need to communicate with each other failed to communicate with each other effectively," Telus spokesman Shawn Hall said, "and our technicians needed to get in there and solve that."

Hall said callers were able to leave messages normally, but mailbox owners ran into delays and busy signals when they tried to retrieve their messages. He said it took 20 technicians to fix the problem.

No messages were actually lost to the technical glitch, the company said, just delayed.

The company said the malfunction started at about 3:30 p.m. The system was restored at about 10 p.m.

Some customers are demanding financial compensation for the problem, Hall said, and those complaints are being dealt with on a case-by-case basis.

Perhaps They Should Have Tested More - Puyallup, WA Lahar Warning

False alarm: Radio broadcasts mistaken mudflow warning

The Associated Press

TACOMA – An emergency radio station mistakenly warned that a massive, volcanic-caused mudflow was headed from the flanks of Mount Rainier and that listeners in the valley below should rush to higher ground.

The emergency lahar warning was broadcast Wednesday for nearly an hour on the 1580 AM frequency in the suburban Pierce County town of Puyallup. Some listeners said they were horrified.

Authorities had no estimate how many people heard the broadcast on the weak frequency, or how many evacuated. Fewer than a dozen called Pierce County Emergency Management, the city of Orting or the Puyallup Fire Department.

Nancy Eldred heard it while driving in the Puyallup Valley and called her daughter, Renee Hutchinson, in Tacoma shortly after noon.

"I was in tears," Hutchinson told The News Tribune newspaper. "I was shaking."

Her 17-month-old son, Ethan, was in the car with his grandmother.

After Hutchinson warned co-workers, about 30 people started frantically calling loved ones. Some called their children at schools in the Puyallup Valley and told them to leave immediately, said LaNell Hoppe, the office manager.

"It was so scary," Hoppe said.

Tracy Frye, Eldred's daughter, also heard the warning as she surfed channels for traffic information.

"It kept repeating, 'This is not a test,'" she said.

The family went to schools to collect their children.

Someone called Orting City Hall, and officials there contacted federal authorities. They all confirmed that no lahar was coming.

The prerecorded radio message apparently was triggered by an error in software operated by Puyallup.

Emergency officials in communities around Mount Rainier routinely test the system that would, in the event of a real lahar from the volcano, activate 24 sirens around the valley and broadcast a radio alert. But on Wednesday, 1580 AM picked up the test signal as real and said the lahar was coming.

Officials said a software glitch apparently has caused similar false warnings in the past, according to scattered calls they received. Puyallup Fire Chief Merle Frank said the problem should be taken care of in the next few days.

Geologists have warned that a huge mudflow could break loose from Rainier's west flank with little warning and that a wall of mud and debris could swallow everything in its path and bury the Puyallup Valley floor where 60,000 people live.

In 2001, Seattle television stations falsely reported a major lahar based on unconfirmed police scanner chatter about a minor mudflow within Mount Rainier National Park. The region has had a number of lahar sirens mistakenly activated.

Copyright © 2006 The Seattle Times Company

It Works on My Machine - An Interesting Bug

An interesting bug today...

I was testing the Admin tool of a product.

For this particular release, when the Administrator logs in to the Admin tool for the first time, he is warned that the database is out of date and will automatically be upgraded.  When he clicks "Yes", the upgrade is supposed to start.

When I tried it, I got the expected warning and clicked "Yes".  But the upgrade didn't happen.  Instead, I got an error message indicating that the database could not be upgraded, and to check that the user had permission to upgrade the database. 
Very odd!

I made sure everything was set up correctly.
I uninstalled, installed the prior version, installed the current version, and tried again.
Still, it didn't work.

After a few more attempts, I collected the error message and log files, and wrote an Issue Report.

The Issue Report came back as "Not Reproducible".

I brought the developer over and showed him the bug.  It was clearly easily reproducible.

I attached a pre-upgrade version of the database I was using to the Issue Report and sent it back to the developer.
Again he tried, and again the upgrade worked fine on his machine.

I tried a few more things on my machine - still it failed each time.
I asked the developer to try a clean install, rather than the debug version he normally used - still it worked each time for him.

The developer and I tried it together on my machine, and we stumbled on the answer.

Normally the User field of the Admin tool login is case-insensitive.
It doesn't matter if I login as "administrator" or "Administrator".

In this case, I could still correctly login either way.
But, if I logged in as "administrator" (which I usually do) - the upgrade failed.
If I logged in as "Administrator" (which the developer usually does) - the upgrade worked.

It turns out that the User field was being validated during login and again just before the upgrade.  For login, the field was validated in a case-insensitive manner.  For upgrade the field was validated in a case-sensitive manner.
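A minimal Python sketch of the mismatch (the names and admin account are invented for illustration — the point is only the two inconsistent comparisons):

```python
ADMIN_USER = "Administrator"

def can_login(user: str) -> bool:
    # Login validates the user name case-insensitively.
    return user.lower() == ADMIN_USER.lower()

def can_upgrade(user: str) -> bool:
    # The pre-upgrade check (the bug) compares case-sensitively.
    return user == ADMIN_USER

# "administrator" gets through login but fails the upgrade check:
assert can_login("administrator") and not can_upgrade("administrator")
# "Administrator" passes both, which is why it worked on the developer's machine:
assert can_login("Administrator") and can_upgrade("Administrator")
```

Two code paths validating the same field under different rules is exactly the kind of inconsistency that hides until someone types the "wrong" casing.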

While it was a bit frustrating to have to circle around so many times before we determined the cause, I'm very glad that this one didn't get out to the customers!

May 22, 2006

Quotes on Quality

"Fast is fine, but accuracy is everything."
Wyatt Earp

"Quality is not an intrinsic characteristic of software. It's a relationship among the product, the people who have expectations about the product, and the world around them."
James Bach

"Quality means doing it right when no one is looking."
Henry Ford

"It is easier to do a job right than to explain why you didn't."
Martin Van Buren

"One of the rarest things that a man ever does, is to do the best he can."
Josh Billings

"Quality begins on the inside... and then works its way out."
Bob Moawad

"Quality is everyone's responsibility."
W. Edwards Deming

"Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."
William A. Foster

"The bitterness of poor quality remains long after low pricing is forgotten."
Leon M. Cautillo

"The trouble with the public is that there is too much of it; what we need in public is less quantity and more quality."
Don Marquis

"Quality is a proud and soaring thing."
Jessica Julian

"The best ad is a good product."
Alan H. Meyer

"It is easier to confess a defect than to claim a quality."
Max Beerbohm

"There is hardly anything in the world that some man cannot make a little worse and sell a little cheaper."
John Ruskin

"Almost all quality improvement comes via simplification of design, manufacturing, layout, processes, and procedures."
Tom Peters

"The bitterness of poor quality lingers long after the sweetness of meeting schedules is forgotten."
Kathleen Byle, Sandia National Laboratories

"When one finds oneself in a hole of one's own making, it is a good time to examine the quality of the workmanship."
John Renmerde

"Quality is not an act. It is a habit."
Aristotle

"Everything can be improved."
C. W. Barron

"Quality in a product or service is not what the supplier puts in. It is what the customer gets out and is willing to pay for."
Peter F. Drucker

"I consider a bad bottle of Heineken to be a personal insult to me."
Freddy Heineken, founder of Dutch beer giant

"It is a funny thing about life: if you refuse to accept anything but the best you very often get it."
W. Somerset Maugham

"Quality is free, but only to those who are willing to pay heavily for it."
Philip Crosby

"The quality of a person's life is in direct proportion to their commitment to excellence, regardless of their chosen field of endeavor."
Vince Lombardi

"Good is not good where better is expected."
Thomas Fuller

"We are what we repeatedly do. Excellence, then, is not an act, but a habit."
Aristotle

"People wish to produce quality; they wish to have pride in workmanship."
W. Edwards Deming

"Quality also marks the search for an ideal after necessity has been satisfied and mere usefulness achieved."
William A. Foster

"You can't fake quality any more than you can fake a good meal."
William S. Burroughs

"Ninety percent of everything is crud."
Theodore Sturgeon

"Be a yardstick of quality.  Some people aren't used to an environment where excellence is expected."
Steve Jobs

May 19, 2006

Software Manager Basics

I really liked this article by Mark I. Himelstein.  Virtually everything here applies to QA Managers as well.

Read more from Mark in Dr. Dobb's Journal, and in his blog: http://heavenstone.typepad.com/



Software Manager Basics

Most software managers didn't start out as managers, but began their careers as developers.

By Mark I. Himelstein,  Dr. Dobb's Journal
May 16, 2006
URL:http://www.ddj.com/dept/architect/187203587
Mark is a management consultant with 25 years' experience in the software industry. You can check out Mark's blog at www.heavenstone.us.


Most software managers began their careers as software developers. They either had some ambition, some skill recognized as good management material, or were in the right place at the right time. No one I know who manages software was trained to be a manager.

Managers serve multiple masters: The customer, the company, their own manager, their employees, and themselves—and each will tell you what they mean by a good manager. You are stuck with balancing those efforts.

For instance, when I was interviewing for the job of running Solaris Engineering at Sun Microsystems, I asked my interviewers what they would consider success. The (somewhat strange) answer I got was that if they were better managed, it would be a success.

So here's what we have so far: You haven't been, and probably won't be, trained for a job that has too many bosses who either don't know what they want or want everything! What do you do? For starters, you go back to basics—execution, communication, and empowerment.

Execution

In the end, you, your team, and your organization will be rewarded if you develop and release software that customers need in a timely fashion. This is what I label as "execution."
Here are 10 questions involving execution that let you grade yourself:
  1. Do you have your customer's requirements?
  2. Do you have an approved budget?
  3. Do you have an approved roadmap?
  4. Do you have an approved schedule?
  5. Are you delivering the product on time?
  6. Do you hire developers in a timely fashion?
  7. Is your team capable of dealing with change?
  8. Are you capable of keeping your team focused and resisting change?
  9. Do your customers encounter a lot of quality issues with released products?
  10. Do you and your team measure how well you do your work on a regular basis to find ways to improve?
Someone once asked me what they should do when management wants something that is unreasonable or impossible. My answer had two parts: First you must ensure that management is informed and understands that this is unreasonable or impossible, and second, you must decide if you can disagree and commit. If you can't, then you need to examine your own career options.

While the second and more dramatic answer may have caught your attention, it is the first answer that leads us to the next basic skill—communication.

Communication

As a manager, you must communicate with each master you serve. For each action or event that affects you or your team, you should be thinking about to whom and how you communicate it. It doesn't matter whether the item is positive or negative.

You must also learn to communicate in different ways with different constituents. For example, you might do formal presentations for your boss's boss, but be more casual with your direct reports. Or you might use e-mail for officially documenting agreements between you and your peers, but need face-to-face meetings to explain that agreement with its rationale and implications to your developers.

Here is a second set of 10 questions with which you can grade yourself on communication:
  1. Does your team understand your company's strategy?
  2. Does your team understand engineering's roadmap?
  3. Does your team understand why the roadmap meets the goals of the strategy?
  4. Do you have regular communication meetings and e-mail with your team?
  5. Are people on your team willing to tell you bad news?
  6. Do you hear information about your team from your team before you hear it from others?
  7. Do members of your team communicate with each other and the rest of the company in a respectful manner?
  8. Do you provide information to your boss before he or she has to ask for it?
  9. Do other people in the company know what your team is doing and accomplishing?
  10. Do you communicate in a positive fashion?
How you communicate is as important as communicating itself. The attitude of your words, the respect for those with whom you communicate, the body language or inflection of voice, or choice of words all contribute to whether you communicate well. Cynicism, sarcasm, and negativity could remove all of the advantage you might otherwise realize by communication.

Unlike being a developer, a large part of your job is interacting with people. Just as my final words on communication reflect a care you must have in communication, you must also show care in how you treat your team and peers. I searched for the word that represents this skill, and "empowerment" came to mind.

Empowerment

You can't do it all yourself. You must develop an organization that breeds the next managers and leaders, and an organization that can focus, innovate, and succeed.

You should communicate requirements, work with your teams to develop plans, and then let them execute those plans. If you dictate plans or micromanage the execution, your teams will not succeed. They must feel ownership. Only you can make them feel ownership.

With this in mind, here are 10 questions on empowerment:
  1. Does your team develop and buy into their schedules?
  2. Do you avoid micromanagement?
  3. Do you delegate tasks and let your reports proceed without interference?
  4. Do you make it clear what your employees are accountable for?
  5. Do you provide leadership opportunities for your employees?
  6. Does your team have a sense of urgency in addressing issues?
  7. Do you set clear roles and responsibilities for your employees?
  8. Do all the members in your team know what they need to accomplish each week before they can go home for the weekend?
  9. Do your developers understand the difference between accountability and micromanagement?
  10. Do your developers consider your organization a positive work environment?
Empowerment also requires accountability. If you delegate without some checks and balances, you and your team may never accomplish your goals. Do not confuse accountability with micromanagement. Many developers take any accountability as micromanagement, and you must disabuse them of that notion.

Here are some signs you are micromanaging:
  • Ignoring previously agreed upon reporting points and asking for status more frequently.
  • Getting angry with people for missing deliverables.
  • Constantly changing work assignments.
  • Dictating implementation instead of requirements.
You have to give people a chance to do their jobs in a positive environment. You need to look at problems as just things to be solved. You need to engender trust so you can get the truth when you ask for status.

Empowerment also means letting your employees help develop their own schedules. While you can set a goal for a release, you must rectify any mismatches between the goals for the content of the release, the goals for the timeframe for the release, and the resources at hand.

You always have the same four things you can adjust in making a schedule—resources, features, dates, and quality. If you go back to the same thing each time you are planning a release in order to make your dates, your company can get out of balance. For example:
  • If you remove too many features, you won't have a competitive product.
  • If you add too many features, you won't make your dates.
  • If you scrimp on quality, you'll get a bad reputation.
  • If you wait until the product is perfect, you'll miss the market window.
  • If you make your engineers work extra hours all the time, they'll burn out.
  • If you add too many resources, you can run out of money.
  • If you slip the schedule, you make it hard for the sales team to sell and you might miss a market window.
When you define your product or release correctly and develop aggressive but accomplishable schedules, you may find resistance. The industry is so used to unreasonable schedules and unreasonable goals that many people will think your team is not working hard enough because there is no crisis!

Companies and their customers are best served by creating teams and products that can serve them over a long period of time. The sky is always falling somewhere. You should be aggressive and demand the best from your developers, but you should not abuse them as a resource.

Final Words

Obviously, each question posed here may spawn many more questions, and they may also have responses somewhere between yes and no. Take the time to answer them—and manage wisely.

May 17, 2006

Perhaps They Should Have Tested More - History's Worst Software Bugs

History's Worst Software Bugs
By Simson Garfinkel
02:00 AM Nov. 08, 2005 EST

Last month automaker Toyota announced a recall of 160,000 of its Prius hybrid vehicles following reports of vehicle warning lights illuminating for no reason, and cars' gasoline engines stalling unexpectedly. But unlike the large-scale auto recalls of years past, the root of the Prius issue wasn't a hardware problem -- it was a programming error in the smart car's embedded code. The Prius had a software bug.

With that recall, the Prius joined the ranks of the buggy computer -- a club that began in 1947 when engineers found a moth in Panel F, Relay #70 of the Harvard Mark II system. The computer was running a test of its multiplier and adder when the engineers noticed something was wrong. The moth was trapped, removed and taped into the computer's logbook with the words: "first actual case of a bug being found."

Nearly sixty years later, computer bugs are still with us, and show no sign of going extinct. As the line between software and hardware blurs, coding errors are increasingly playing tricks on our daily lives. Bugs don't just inhabit our operating systems and applications -- today they lurk within our cell phones and our pacemakers, our power plants and medical equipment. And now, in our cars.

But which are the worst?

It's all too easy to come up with a list of bugs that have wreaked havoc. It's harder to rate their severity. Which is worse -- a security vulnerability that's exploited by a computer worm to shut down the internet for a few days or a typo that triggers a day-long crash of the nation's phone system? The answer depends on whether you want to make a phone call or check your e-mail.

Many people believe the worst bugs are those that cause fatalities. To be sure, there haven't been many, but cases like the Therac-25 are widely seen as warnings against the widespread deployment of software in safety critical applications. Experts who study such systems, though, warn that even though the software might kill a few people, focusing on these fatalities risks inhibiting the migration of technology into areas where smarter processing is sorely needed. In the end, they say, the lack of software might kill more people than the inevitable bugs.

What seems certain is that bugs are here to stay. Here, in chronological order, is the Wired News list of the 10 worst software bugs of all time … so far.

July 28, 1962 -- Mariner I space probe.
A bug in the flight software for the Mariner 1 causes the rocket to divert from its intended path on launch. Mission control destroys the rocket over the Atlantic Ocean. The investigation into the accident discovers that a formula written on paper in pencil was improperly transcribed into computer code, causing the computer to miscalculate the rocket's trajectory.

1982 -- Soviet gas pipeline.
Operatives working for the Central Intelligence Agency allegedly plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline. The Soviets had obtained the system as part of a wide-ranging effort to covertly purchase or steal sensitive U.S. technology. The CIA reportedly found out about the program and decided to make it backfire with equipment that would pass Soviet inspection and then fail once in operation. The resulting event is reportedly the largest non-nuclear explosion in the planet's history.

1985-1987 -- Therac-25 medical accelerator.
A radiation therapy device malfunctions and delivers lethal radiation doses at several medical facilities. Based upon a previous design, the Therac-25 was an "improved" therapy system that could deliver two different kinds of radiation: either a low-power electron beam (beta particles) or X-rays. The Therac-25's X-rays were generated by smashing high-power electrons into a metal target positioned between the electron gun and the patient. A second "improvement" was the replacement of the older Therac-20's electromechanical safety interlocks with software control, a decision made because software was perceived to be more reliable.

What engineers didn't know was that both the 20 and the 25 were built upon an operating system that had been kludged together by a programmer with no formal training. Because of a subtle bug called a "race condition," a quick-fingered typist could accidentally configure the Therac-25 so the electron beam would fire in high-power mode but with the metal X-ray target out of position. At least five patients die; others are seriously injured.

1988 -- Buffer overflow in Berkeley Unix finger daemon.
The first internet worm (the so-called Morris Worm) infects between 2,000 and 6,000 computers in less than a day by taking advantage of a buffer overflow. The specific code is a function in the standard input/output library routine called gets() designed to get a line of text over the network. Unfortunately, gets() has no provision to limit its input, and an overly large input allows the worm to take over any machine to which it can connect.

Programmers respond by attempting to stamp out the gets() function in working code, but they refuse to remove it from the C programming language's standard input/output library, where it remains to this day.
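The fix for this class of bug is simple in any language: never read attacker-controlled input without an explicit upper bound. A Python analogue of the gets()/fgets() distinction (illustrative only; the original flaw was in C):

```python
import io

BUF_SIZE = 512

# Simulate an oversized "line" arriving from the network, far larger
# than the receiver's buffer.
stream = io.BytesIO(b"A" * 10_000 + b"\n")

# Bounded read, like fgets(): at most BUF_SIZE bytes, no matter what
# the sender transmits. An unbounded read (like gets()) would happily
# consume all 10,001 bytes and, in C, overrun the buffer.
line = stream.readline(BUF_SIZE)
assert len(line) <= BUF_SIZE
```

The worm worked precisely because the receiving buffer's size was never communicated to the reading routine.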

1988-1996 -- Kerberos Random Number Generator.
The authors of the Kerberos security system neglect to properly "seed" the program's random number generator with a truly random seed. As a result, for eight years it is possible to trivially break into any computer that relies on Kerberos for authentication. It is unknown if this bug was ever actually exploited.
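The failure mode is easy to demonstrate: a pseudorandom generator seeded with a guessable value emits a fully predictable stream, so "random" keys derived from it are reproducible by an attacker. A hedged Python sketch (not the actual Kerberos code):

```python
import random
import secrets

# Weakly seeded PRNG: anyone who guesses the seed (here a constant,
# but a timestamp or PID is nearly as bad) reproduces the "key" stream.
weak = random.Random(12345)
attacker = random.Random(12345)
assert [weak.getrandbits(32) for _ in range(4)] == \
       [attacker.getrandbits(32) for _ in range(4)]

# The fix: draw key material from the operating system's CSPRNG,
# which cannot be replayed by guessing a seed.
key = secrets.token_bytes(8)
```

The lesson generalizes: the strength of a cryptosystem collapses to the strength of its seed.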

January 15, 1990 -- AT&T Network Outage.
A bug in a new release of the software that controls AT&T's #4ESS long distance switches causes these mammoth computers to crash when they receive a specific message from one of their neighboring machines -- a message that the neighbors send out when they recover from a crash.

One day a switch in New York crashes and reboots, causing its neighboring switches to crash, then their neighbors' neighbors, and so on. Soon, 114 switches are crashing and rebooting every six seconds, leaving an estimated 60,000 people without long distance service for nine hours. The fix: engineers load the previous software release.

1993 -- Intel Pentium floating point divide.
A silicon error causes Intel's highly promoted Pentium chip to make mistakes when dividing floating-point numbers that occur within a specific range. For example, dividing 4195835.0/3145727.0 yields 1.33374 instead of 1.33382, an error of 0.006 percent.

Although the bug affects few users, it becomes a public relations nightmare. With an estimated 3 million to 5 million defective chips in circulation, at first Intel only offers to replace Pentium chips for consumers who can prove that they need high accuracy; eventually the company relents and agrees to replace the chips for anyone who complains. The bug ultimately costs Intel $475 million.
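A version of the self-test that reportedly circulated at the time used exactly the operands above: on a correct FPU the expression below is essentially zero, while flawed Pentiums were said to return 256.

```python
# FDIV sanity check: divide, multiply back, and compare to the original.
# In correct IEEE double arithmetic the residual is on the order of
# rounding error; the flawed hardware produced a grossly wrong quotient.
x, y = 4195835.0, 3145727.0
err = x - (x / y) * y
assert abs(err) < 1e-6
```

Running this today is a small reminder that floating-point hardware, like software, is testable.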

1995/1996 -- The Ping of Death.
A lack of sanity checks and error handling in the IP fragmentation reassembly code makes it possible to crash a wide variety of operating systems by sending a malformed "ping" packet from anywhere on the internet. Most obviously affected are computers running Windows, which lock up and display the so-called "blue screen of death" when they receive these packets. The attack also affects many Macintosh and Unix systems.
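The missing sanity check is easy to state: a fragment's offset plus its payload length must never exceed the maximum IP packet size, or reassembly overruns its 64 KB buffer. A simplified sketch of that check (function name is illustrative, not from any real stack):

```python
MAX_IP_PACKET = 65535  # maximum total length of an IPv4 packet, in bytes

def accept_fragment(offset: int, payload_len: int) -> bool:
    # Reject any fragment whose end would land past the maximum
    # packet size -- the check the vulnerable reassembly code lacked.
    return offset + payload_len <= MAX_IP_PACKET

assert accept_fragment(0, 1500)            # an ordinary fragment
assert not accept_fragment(65528, 100)     # the classic "ping of death" shape
```

One comparison per fragment was the difference between a dropped packet and a kernel crash.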

June 4, 1996 -- Ariane 5 Flight 501.
Working code for the Ariane 4 rocket is reused in the Ariane 5, but the Ariane 5's faster engines trigger a bug in an arithmetic routine inside the rocket's flight computer. The error is in the code that converts a 64-bit floating-point number to a 16-bit signed integer. The faster engines cause the 64-bit numbers to be larger in the Ariane 5 than in the Ariane 4, triggering an overflow condition that results in the flight computer crashing.

First, Flight 501's backup computer crashes, followed 0.05 seconds later by a crash of the primary computer. As a result, the rocket's primary processor overpowers the rocket's engines and causes the rocket to disintegrate 40 seconds after launch.
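The core mistake was narrowing a 64-bit float into a 16-bit signed integer without a range check. A minimal sketch (the value 40000.0 is hypothetical, standing in for the larger horizontal-velocity readings the Ariane 5's faster trajectory produced):

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16(x: float) -> int:
    # Range-check before narrowing -- the guard the reused
    # Ariane 4 code omitted for this particular variable.
    n = int(x)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError("value out of signed 16-bit range")
    return n

assert to_int16(1234.5) == 1234   # in range: truncates normally

try:
    to_int16(40000.0)             # out of range: must fail loudly
    raised = False
except OverflowError:
    raised = True
assert raised
```

On the Ariane 4 the value provably stayed in range, so the check had been deliberately removed for performance; reuse invalidated that proof.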

November 2000 -- National Cancer Institute, Panama City.
In a series of accidents, therapy planning software created by Multidata Systems International, a U.S. firm, miscalculates the proper dosage of radiation for patients undergoing radiation therapy.

Multidata's software allows a radiation therapist to draw on a computer screen the placement of metal shields called "blocks" designed to protect healthy tissue from the radiation. But the software will only allow technicians to use four shielding blocks, and the Panamanian doctors wish to use five.

The doctors discover that they can trick the software by drawing all five blocks as a single large block with a hole in the middle. What the doctors don't realize is that the Multidata software gives different answers in this configuration depending on how the hole is drawn: draw it in one direction and the correct dose is calculated, draw in another direction and the software recommends twice the necessary exposure.

At least eight patients die, while another 20 receive overdoses likely to cause significant health problems. The physicians, who were legally required to double-check the computer's calculations by hand, are indicted for murder.

http://www.wired.com/news/technology/bugs/0,2924,69355,00.html

May 9, 2006

The First Bug Report - September 9, 1947

While it doesn't depict a software bug, I ran across this photo and article in Wikipedia, and thought it was interesting:




The First "Computer Bug"

Moth found trapped between points at Relay # 70, Panel F, of the Mark II Aiken Relay Calculator while it was being tested at Harvard University, 9 September 1947.

The operators affixed the moth to the computer log, with the entry: "First actual case of bug being found". They put out the word that they had "debugged" the machine, thus popularizing the term "debugging a computer program" (although the term "bug" had been in use for many years previously by engineers to indicate an indefinite problem).

In 1988, the log, with the moth still taped by the entry, was in the Naval Surface Warfare Center Computer Museum at Dahlgren, Virginia. In 1994 the logbook (with moth still attached) was acquired by the Smithsonian Institution, and is in the collection of the National Museum of American History (where more photos can be seen online under "Object ID 1994.0191.01").


(Admiral Grace Hopper received an honorary Doctorate during my college graduation.  She loved to tell this story.)

May 3, 2006

Perhaps They Should Have Tested More - National Grid, GE

Glitch hurt storm response
National Grid says software faltered during February wind damage 
 
By LARRY RULISON, Business writer
First published: Tuesday, May 2, 2006

ALBANY -- National Grid said problems with a computer software system delayed its efforts to get accurate information to the public during February's wind storm that knocked out power to more than 121,000 Capital Region customers.

In a mandatory assessment report filed April 24 with the state Public Service Commission, National Grid said that due to the "sheer volume of information flooding the system," the software used to manage power outages crashed several times.

"Unfortunately, and as a result, customer contact center representatives were unable to supply accurate restoration information to customers during those periods," the report says.

National Grid, heavily criticized by business owners and political leaders in the days following the statewide storm's destruction, says it is working hard to correct the problem.

The software, called PowerOn, is made by General Electric Co. Dennis Murphy, a spokesman for GE Energy, could not initially comment on the issues raised by National Grid but said he was looking into it.

National Grid spokesman Alberto Bianchetti said that overall, the utility responded "appropriately" to the storm in terms of the number of personnel used and the time it took to restore power, given the storm's severity.

But he said National Grid would provide additional training and add more computer hardware to support PowerOn. He said the software, in use for two years, has been effective for day-to-day operations and for regional storms. Bianchetti said the utility is happy with the software and will continue to use it.

"This was its first real-life experience with a tremendously large storm," he said.

Anne Dalton, a spokeswoman with the PSC, said the commission's executive staff has been conducting its own analysis of National Grid's storm response.

The staff will take National Grid's report into consideration if it decides to make any recommendations to the PSC board regarding the utility's storm response plan. The board could direct National Grid to make changes, but there is no indication yet if that will occur.

In its report to the PSC, National Grid describes a perfect storm that made restoration efforts extremely difficult. In all, 229,025 National Grid customers lost power statewide. It took five days to completely restore power.

The Capital Region had wind gusts of more than 65 mph, with a peak of 98 mph at Saratoga County Airport.

National Grid said the storm was more challenging than the ice storms of 2002 and 1999 because unlike those storms, the wind storm affected the company's entire service area.
Furthermore, crews from neighboring utilities were initially unavailable because the storm affected them as well.

Despite those problems, National Grid said most of its storm response went smoothly and according to its plan on file with the PSC. The company had nearly 1,600 employees dedicated to the effort, including 647 crews in the field.

State Sen. Elizabeth Little, R-Queensbury, said Monday she had not yet seen National Grid's storm response report, but she knew that it was difficult at times for the public to get accurate information during the storm. She held a public workshop in Warrensburg in March with National Grid and local officials about what could be done better during future storms and outages.

She said Monday that one idea that emerged was the creation of a radio channel that could be reserved to broadcast information to the public during outages.

"We think that could be beneficial," Little said. "There was a lot of misinformation that was out there."

http://timesunion.com/AspStories/storyprint.asp?StoryID=477409

May 1, 2006

Using {Ctrl-C} to Copy Error Dialog Text to the Clipboard

Did you know that you can use {Ctrl-C} to capture the text of some error dialogs to the clipboard?

Try this:
  • Start Notepad
  • Type some text
  • Select File, Exit from the menu
  • When the confirmation dialog appears, hold down the Ctrl key and press C
  • Click the Cancel button to cancel the Exit
  • Select Edit, Paste from the menu
You should see something like this:
---------------------------
Notepad
---------------------------
The text in the Untitled file has changed.

Do you want to save the changes?
---------------------------
Yes   No   Cancel  
---------------------------
I use this technique to easily copy error messages into my bug reports.
I also use this technique in some of my Test Automation Scripts.
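
When a script captures dialog text this way, the clipboard contents can be parsed back into their parts.  Here's a minimal sketch (the function name and the separator/spacing assumptions are mine) that splits the three dashed sections shown above into title, message, and button labels:

```python
import re

def parse_dialog(clipboard_text):
    """Split Ctrl-C dialog text into title, message, and button labels.

    Assumes the three sections (title, message, button row) are
    separated by ruled lines of hyphens, as in the Notepad example.
    """
    # Separator lines are runs of three or more hyphens on their own line.
    parts = re.split(r"^-{3,}[ \t]*$", clipboard_text, flags=re.MULTILINE)
    sections = [p.strip() for p in parts if p.strip()]
    title, message, button_row = sections
    # Button labels are separated by runs of two or more spaces.
    buttons = re.split(r"\s{2,}", button_row)
    return {"title": title, "message": message, "buttons": buttons}
```

In an automation script, the parsed pieces can then be compared against the expected dialog text, or dropped straight into a bug report.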

Writing Issue Reports That Work

It's important not only to write Issue Reports, but to write them well.


Issue Reports (Bug Reports) are one of the main communication methods that QAers use.
You are creating a statement to your stakeholders - "I have found what I think is a problem, and here's my clear explanation of what it is and how you can see it too.  Please look into this".
  • Understand the Audience for the Report
It's important to know who is going to read your Issue Reports, and what they are expected to do with them.

For some shops, the only readers will be yourself and one or more developers.  If that's true, you can use all sorts of jargon and abbreviations that you each understand.

But in many shops, there will be lots of stakeholders who need to read what you write - other QAers, Developers, Support, Product Management, Documentation, Management, etc.  It may become more important to use less jargon, and add more details.

There are a few special cases that must be kept in mind as well. 

For example, if offshore testers or developers must read your Issue Reports, you'll need to pay special attention not to use confusing jargon or colloquialisms in your writing.

If customers will eventually read your Issue Reports, you'll need to be very, very careful in choosing your words.  (By the way, this is not something I'd recommend without first sending the Issue Reports to a skilled writer for "sanitizing".)  You may even be better off having two different descriptions of the bug - one for internal consumption, and one for customers.
  • Choose a Good Summary/Title
Since the one-line Summary or Title is often what prompts someone to decide to read your Issue Report further, and is often the only piece of information about the bug showing up on summary reports, it's important to put some thought into this field.

The title should be short (because it may become truncated) and to-the-point.
It's tempting to cut corners and write just a few words here, such as "Program x crashed" - but clearly that's not useful.

Also, you don't need to include text here that is already included in other fields of the bug tracking system.  For most systems, items such as severity/priority, product, component, etc. are tracked in their own fields.  There's no use wasting valuable space in the Summary/Title on these.
  • Describe the problem concisely and effectively
In a paragraph or two, describe the problem.  Here you can use more words than the Summary/Title will allow, but still avoid wordiness.
  • Include Steps to Reproduce the Problem you are Seeing
Not every bug is fully reproducible.  But it's important to try to find a relatively minimal set of steps that reproduces the problem, and to note them in the Issue Report.
This gives the developer a fighting chance of seeing what you are seeing, and actually getting it fixed quickly.

Avoid steps that don't matter - those which have nothing to do with reproducing the bug.  Including too many steps can waste time and lead to confusion.

Include all steps which actually seem to matter.  Write them in a clear style, in a manner which avoids guesswork.

It takes a bit longer to narrow down the steps, but it's usually time well spent.

If the bug is not fully reproducible, indicate that clearly in the issue report.
  • Include the Results you Expected
Often, QAers know what is expected out of a sequence of steps better than Developers.

Sometimes, their expectations are right, sometimes they are wrong.  Either way, include what you expected to see.
  • Include the Results you Actually Observed
Since this is an Issue Report, we assume that something unexpected occurred.  Note what actually occurred.

If you observed an error message, include that.

If you observed something significant in a log file, include that portion of the log.
  • Include Enough Details for Searching
Try to think like a Support person, or a Manager, or a new QAer who isn't familiar with this Issue.  What would you search for if you saw these same symptoms happen?  If you are lucky enough to have a defect tracking system that includes full-text searching, then make sure to include these important keywords somewhere in the text of your Issue Reports.

If your defect tracking system has a "keywords" field, put them in there.
  • Explain the Effects on the Customer
Somewhere along the line, your Issue Report will be analyzed for Priority and/or Severity.

If you explain the effects of this problem on the customer, you'll have a better chance that the Issue is properly analyzed.  And if the problem makes further testing impossible, make sure to indicate that fact (in this context, you are a customer, too). 
  • Attach Anything Else that Could Help
Attach anything that could help clarify and/or debug the problem.

As they say, "a picture is worth 1000 words".  So, often attaching a picture of the problem can be very helpful.  Log files, test files, etc., can also be attached if they will help reproduce or debug the problem.

Think about whether something is better off being attached (in order to avoid an overly-wordy Issue Report), or included within the body of the Report itself (in order to be useful during a search).
  • Avoid speculation
Report the facts - what you saw, what you expected to see, and any other symptoms.  But in general, avoid speculation as to the root cause of the problem, unless you have sufficient facts to back up your speculation.

Speculation can send the reviewer on a wild goose chase and waste their time.  It can also make this bug report appear as a search result for cases where it's not relevant.
  • Be careful of the tone of your report
Don't use a negative tone in your writing.  Remember, your job is to describe the bug so that it can be fixed effectively and efficiently.  There's no benefit in criticizing the developer here, or the designer - you are their partners, you are both on the same side.  Be objective and respectful.
  • Avoid duplication - search first
You don't want to waste people's time reading issue reports that have already been covered by someone else.  And in some shops, you may be penalized for writing such bug reports.  So, search first, using the kinds of keywords that you would write into your report if needed.

If the problem has already been written into an issue report, it's sometimes useful to add additional facts if you have them.
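
The checklist above can also be treated as a completeness check before you hit "Submit".  Here's a small sketch of that idea - the field names and the choice of which fields are required are purely illustrative, not a prescription for any particular tracker:

```python
from dataclasses import dataclass, field

@dataclass
class IssueReport:
    """Fields drawn from the checklist above; names are illustrative."""
    summary: str = ""
    description: str = ""
    steps_to_reproduce: list = field(default_factory=list)
    expected_result: str = ""
    actual_result: str = ""
    keywords: list = field(default_factory=list)
    customer_impact: str = ""
    attachments: list = field(default_factory=list)

def missing_fields(report):
    """Return the names of checklist items the report leaves empty."""
    required = ["summary", "description", "steps_to_reproduce",
                "expected_result", "actual_result"]
    return [name for name in required if not getattr(report, name)]
```

A report with only a Summary - "Program x crashed" - would fail this check on every other required field, which is exactly the kind of corner-cutting the checklist is meant to catch.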