April 30, 2006

What do QAers and Testers do?


People hear about QA and Testing, but don't always know what folks in this profession do.
The real answer, of course, is "it depends".  Every company is different, every department is different.  Some shops might limit the scope of the role much more tightly than others.

That said, here are some of the things my QA Teams do, in no particular order.
  • Review and respond to requirements
  • Participate in design reviews
  • Participate in architecture reviews
  • Create and maintain Test Cases
  • Execute Test Cases
  • Exploratory testing
  • Functional testing
  • Load/Stress/Volume/Performance testing
  • Security testing
  • Internationalization testing
  • Localization testing
  • Usability testing
  • Installation testing, post-installation testing
  • Data testing 
  • Release testing
  • Analyze Results
  • Write and update Bug Reports
  • Create and distribute Bug Status Reports
  • Create and Maintain test automation assets
  • Test Help
  • Test Documentation
  • Verify bug fixes
  • Create test data
  • Participate in Code Reviews
  • Conduct "Lessons Learned" sessions
  • Mentor junior testers/QAers
  • Conduct interviews of potential team members
  • Plan for future releases
  • Provide estimates of required test efforts
  • Attend miscellaneous meetings
  • Research upcoming releases, new technologies, and new methods
  • Teach others about products which have just been tested
  • Set up and maintain test environments
  • Discuss features, bugs, etc with Developers, Support and Product Management
  • Reproduce customer-reported problems
  • Publish lists of fixed bugs for customers
  • Maintain a Knowledge Base of known problems, limitations, workarounds
  • Validate new tools
  • Validate new methods and procedures
  • Chair the Production Control Board
  • Keep track of upcoming releases to platforms of interest (such as Windows and Internet Explorer versions)

April 29, 2006

Things I Like to Have in my Test Automation Suites



I've used lots of different tools - some commercial, some open-source, some home-grown - for test automation.  I usually use a mix of such tools in my overall automation efforts.

Over the years, I have found some nice-to-have features and attributes that I end up looking for, or building, as I assemble a new Test Automation Suite.  Some of these attributes are part of the tools themselves.  Others come about because of the way I assemble my Test Suites and tools into a complete package.

(For the purposes of this article, assume I am talking only about Functional Test Automation, involving scripts.)

Some things are must-haves, and most are obvious:
  • Run in my environment
If I'm running in a Windows shop, I may not be allowed to introduce a bunch of Linux machines (and vice-versa).
  • Automate my System-Under-Test
My Test Suite must be able to automate the system I'm testing.  If the system is web-based, my scripts must be able to automate my browser (or sometimes, multiple types of browsers).  If the system is Java, the scripts must be able to automate a Java system.
  • Be able to "see" most of the objects in my System-Under-Test
Since I usually want my scripts to validate the contents of the system at various points during the automation, I need them to be able to "see" the contents.  Sometimes the scripts have a direct way to do this (usually with standard controls, it's built-in).  Sometimes, my scripts have to be creative (for example, with some non-standard controls, I might have to compare images, or copy data to the clipboard, in order to "see" the contents).
  • Usable by my test team
In general I don't favor having just "specialists" able to use the Test Automation Suite.  I strongly prefer that most of the team be able to use the test system, and contribute to building it up over time.
  • Be affordable
Obviously, the Test Suite has to be affordable.  The commercial software and maintenance fees have to be within my budget.  But also the hardware needed to run it, the training required, etc, etc - all need to be affordable.
  • Be generally more efficient than strictly manual testing
Seems pretty obvious.  Once everything is considered, if it's more efficient to perform the testing manually, then perhaps I don't need a Test Automation Suite after all.
Other things are nice-to-have:
  • Detect changes in the System-Under-Test
Bug reports, checkin comments, and build summaries provide clues as to what changed in the current build of my system-under-test.  But, often, they don't tell the whole story.

I like to be able to depend on my Test Suite to detect unexpected changes, so I can then dig in and find out if this was intentional or if it was a bug.

For example, when I build a Test Suite for a web-based system, I like to capture the non-dynamic text of each page, and compare it to a baseline.  If there's a difference, it might mean that I have an intentional change, or it might mean a bug.  If it's intentional, then I want to be able to easily update the baseline, so it's ready for the next test run. 
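
A rough sketch of that baseline-compare idea in Python, using the requests and BeautifulSoup libraries (the URL, the baselines directory, and the strip_dynamic() rule are placeholders - a real suite would mask whatever dynamic content the site actually produces):

# baseline_check.py - compare a page's visible text against a stored baseline
import pathlib
import re
import requests
from bs4 import BeautifulSoup

BASELINE_DIR = pathlib.Path("baselines")

def strip_dynamic(text):
    # Placeholder rule: mask obviously dynamic pieces (here, hh:mm:ss timestamps)
    return re.sub(r"\d{2}:\d{2}:\d{2}", "<TIME>", text)

def page_text(url):
    html = requests.get(url, timeout=30).text
    return strip_dynamic(BeautifulSoup(html, "html.parser").get_text(" ", strip=True))

def check_against_baseline(name, url, update=False):
    baseline_file = BASELINE_DIR / (name + ".txt")
    current = page_text(url)
    if update or not baseline_file.exists():
        BASELINE_DIR.mkdir(exist_ok=True)
        baseline_file.write_text(current, encoding="utf-8")   # easy baseline refresh
        return True
    return current == baseline_file.read_text(encoding="utf-8")

if __name__ == "__main__":
    ok = check_against_baseline("home_page", "http://example.com/")
    print("PASS" if ok else "FAIL - page text differs from baseline")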
  • Create Smoke Tests which run after every build
I see this as one of the basic uses for my Test Automation Suite.  I want to be able to quickly run a test after the build, so I can assess whether or not my team should bother to dig in and spend their time testing it.  If the system-under-test passes the smoke test, we can proceed.  If not, we reject the build until it is fixed.

If the builds occur overnight, I like to be able to schedule this smoke test so that it runs after the build and so that the results are ready for me when I get in the next morning.  Sometimes, this allows me to run a larger overnight test and still have the results ready for the morning.
  • Run unattended
It's important that I don't have to sit and watch my Test Suite run, or to help it along.  Otherwise, I may not be saving much time.  If the Suite can run by itself, overnight, then I can take advantage of hours and machines that might otherwise be unused.
  • Run overnight, and have a report ready the next morning
There are really two parts to this - overnight, and results the next morning.  Running overnight allows me to take advantage of "free time".  But for this to be effective, I need a good post-run log of what happened during those overnight hours.
  • Automate the boring, repetitive stuff
This is where Test Automation Suites should shine.  They should be able to simply run the same things over and over.  People get bored doing this, or they get less attentive after they have seen the same thing several times.  Automated Scripts don't have this problem.
  • Run predictably and repeatedly
I need confidence that I can run my Test Suite and expect it to run correctly each time.  It seems obvious, but this means that the System-Under-Test needs to be up and running, along with any supporting parts of the system.  If they are flaky, then I can't depend on being able to run my tests when I need them.

Additionally, I can't have a database that may or may not be available, or may have unpredictable data in it.  Ideally, I like to have my Test Suite start with an empty database, and use my scripts to populate it to a known state.  If I have to share my database with other testers, I want to be able to focus only on my part of the database, and not have other testers' actions cause my scripts to go awry.
  • Randomize
Almost all scripting languages have a randomize function.  This often turns out to be very useful in varying wait times, varying the order that tests are run, and varying the data sent to the System-Under-Test.
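
A few lines of Python show the idea (the test names and data values are just placeholders):

import random
import time

tests = ["login", "search", "checkout", "logout"]   # placeholder test names
random.shuffle(tests)                               # vary the order the tests run in

for name in tests:
    wait = random.uniform(0.5, 3.0)                 # vary think-time between steps
    time.sleep(wait)
    quantity = random.randint(1, 99)                # vary the data sent to the System-Under-Test
    print(f"running {name} with quantity={quantity} after a {wait:.1f}s wait")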
  • Perform timings
When I run my Test Suite, I want to time certain actions.  I use those timings to detect when things are starting to go awry in my System-Under-Test.

Unexpected changes in timings can point to new bugs, or sometimes just unexpected changes under the covers.
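
A minimal sketch of capturing one timing and flagging an unexpected change (the baseline value and the tolerance are assumptions - in practice they would come from earlier known-good runs):

import time

def run_search_test():
    """Placeholder for the real action being timed."""
    time.sleep(0.1)

BASELINE_SECONDS = 2.0   # assumed value from an earlier known-good run
TOLERANCE = 1.5          # flag anything more than 50% slower than the baseline

start = time.perf_counter()
run_search_test()
elapsed = time.perf_counter() - start

print(f"search took {elapsed:.2f}s")
if elapsed > BASELINE_SECONDS * TOLERANCE:
    print("WARNING: possible timing regression - worth a closer look")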
  • Run some load, stress, and volume tests
As part of my suite of Test Automation tools, I need load testing capabilities.  Sometimes this can be fulfilled (perhaps only to a small extent) by my Functional Test Automation Suite.
  • Isolate failures easily
My Test Suite needs to provide some way for me to easily isolate where bugs are occurring.  I don't want to run a long test that only has a Pass/Fail result.  Instead, I want my Suite to tell me where the real failure occurred, as much as possible.
  • Run many tests, in spite of unexpected failures along the way
Some Automated Test Suites are overly sensitive to failures.  That is, once a single problem occurs, the rest of the tests fail.  What was intended to be a 500-test Suite can effectively only test up to the first failure - the rest becomes useless.

But, the reason for running these tests is to find bugs!  Hopefully, I can find many of them - not just one!

I want my Test Suite to be able to recover and continue when it encounters a bug, or an unexpected situation.  This is not always possible, but the chances of continued testing can be greatly enhanced by having each test (or at least each group of tests) able to reset the System-Under-Test back to a known state and continue.  The better Test Suites can do this quite well, while others cannot.
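
One common way to get that behavior is to wrap each test so that a failure is logged, the System-Under-Test is reset, and the run continues.  A sketch, assuming a reset_to_known_state() helper exists for the system being tested:

def reset_to_known_state():
    """Hypothetical helper: restart the application, reload base data, etc."""
    pass

def run_all(tests):
    results = {}
    for test in tests:
        try:
            test()
            results[test.__name__] = "PASS"
        except Exception as exc:              # a bug, or an unexpected situation
            results[test.__name__] = "FAIL: " + str(exc)
            reset_to_known_state()            # recover so later tests can still run
    return results

def test_login(): pass
def test_search(): raise AssertionError("expected 10 results, got 0")
def test_logout(): pass

print(run_all([test_login, test_search, test_logout]))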
  • Start wide, build depth later
I like to build an evolving set of tests in my Test Suite.  When I first start out, I want to cover features lightly, so that I can get at least some coverage in a lot of areas.  Later on, I'll go back and add depth in the important areas.

I want a Test Suite that lets me do this simply - create a "small" script which can run and be useful, then later enhance the same script to make it more useful - without having to throw things away and start over.
  • Automate what users do first (Getting Started Manual?)
I like to try to automate important, useful things first.  When customers first use the System-Under-Test, I want them to have a good experience.  If we have a Getting Started Manual or equivalent Help page, that's often a good place to start.
  • Isolate the maintenance effort
Test Suites are constantly evolving - due to added tests, changing requirements, and changes in the System-Under-Test.  I want to be able to maintain these tests without having to constantly throw large chunks away and rewrite them.
  • Produce "readable" scripts
I want lots of people on my QA Team to be able to go in and at least understand what the Test Suite is doing.  That's often made simpler by having a scripting language that is readable.  It's often aided by having well-commented scripts, too.
  • Ability to reset the environment as needed
I like to have a Test Suite that's able to reboot a machine and continue.  I find that's often needed for a really full-featured Suite.

I also like to be able to re-initialize a database, or kill a stuck program or two.
These things allow me to create tests that can survive the unexpected, and run longer without manual intervention.
  • Avoid false failures
If my Test Suite logs a lot of "false failures" then I will be forced to spend a lot of time investigating them, before I can determine if they represent real bugs or not.  So, I want a Test Suite that can accurately log an error when there is a real error, and not when there isn't.

Also, when a single failure occurs, I don't want every test after that to fail unnecessarily.  To that end, I need my individual Test Cases to be able to set my System-Under-Test to a known state - even if a failure occurred in the previous test.
  • Extensible - since we cannot predict all uses
I never used to think extensibility would be very important.  But over time, I find more unanticipated needs for my test tools.  So I want my tools to be as flexible as possible, and able to be extended to handle objects that I hadn't anticipated - perhaps non-standard objects that we haven't yet developed. 
  • Survive trivial changes to the System Under Test
When minor changes occur in my System-Under-Test, I don't want my Test Suite to decide that every change is a bug.  That's why, for example, I avoid full screenshots for verification points.  Too many things can change on the screen - many of which are just incidental and don't represent bugs.

I want to be able to create verification points for specific needs of my Test Case, and ignore everything else.
  • Validate during tests, and at the end as appropriate
I want the option to validate aspects of my System-Under-Test as my Test Case runs, and optionally validate at the end of the run as well.

So I may need to "look" at parts of the system at any time, but I don't want my Test Suite to always look at everything.

I may want to drive my System through various actions, and check things along the way.

But sometimes, I just want to drive my System to a particular state, then look at a database export, for example.  While it's driving the System, I may not want to validate anything automatically along the way at all.
  • Ability to select and run subsets of the entire test suite
I often build a large, relatively complete regression.

But sometimes, I don't want to run that entire regression - I just want to run a subset.  Perhaps I want to quickly verify a fix, or a portion of my System on a new platform, etc.

If I've constructed my Test Suite correctly, it should be simple to select and run just a portion.
  • Ability to select and skip particular tests
It's often necessary to skip particular tests in a Test Automation Suite. 

Sometimes, it's necessary to skip a test until a bug fix is available.  Sometimes, the Test itself needs work.  Sometimes, the code that the Test exercises is being re-built and running the Test wouldn't be productive.

Skipping tests can sometimes be achieved by commenting out the statement that invokes the test; sometimes there are other methods.  Either way, this will happen, so it should be simple.
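
A skip-list is one simple way to handle this in a home-grown runner (frameworks such as pytest and unittest provide skip decorators that serve the same purpose); the test names and reasons here are placeholders:

# Tests listed here are reported as skipped instead of being run.
SKIPPED = {
    "test_export": "blocked until bug 1234 is fixed",
    "test_upgrade": "feature is being rebuilt this sprint",
}

def run_all(tests):
    for test in tests:
        reason = SKIPPED.get(test.__name__)
        if reason:
            print("SKIP", test.__name__, "-", reason)
            continue
        test()
        print("PASS", test.__name__)

def test_login(): pass
def test_export(): pass

run_all([test_login, test_export])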
  • Variable log levels (Verbose, Normal, Minimal)
The ability to log minimally sometimes, and verbosely other times is very useful.
When I run a subset of my full Regression Suite in order to narrow in on the root cause of a bug, I want lots of details in my logs.  I want to know pretty much everything that was done, and what was seen along the way.

But when I just run my full nightly Regressions, I usually want much less information - usually just what Test Cases were run, and what errors were found. 
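
Python's standard logging module handles this directly; a small sketch:

import logging
import sys

# Pick the level from the command line, e.g. "python run_suite.py DEBUG"
level_name = sys.argv[1] if len(sys.argv) > 1 else "INFO"
logging.basicConfig(level=getattr(logging, level_name.upper(), logging.INFO),
                    format="%(asctime)s %(levelname)s %(message)s")

log = logging.getLogger("suite")
log.debug("clicked the Search button")     # verbose detail, for narrowing in on a bug
log.info("TC-042 Search-by-name: PASS")    # normal nightly output
log.error("TC-043 Export-to-CSV: FAIL")    # always logged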
  • Minimize dependencies between scripts
Ideally each script in a Test Suite is independent of all others.  It can run by itself, or in any position within the entire Suite.  That's the ideal.

In reality, it can be more efficient to have some dependencies.  So, for example, one script initializes the database, then another starts the System-Under-Test, then a third populates the database to a base state.

In general, I don't want strong dependencies if it's not necessary.
  • Minimize learning curve
A QA team changes over time.  People leave, or move on to other roles.  New people arrive.

I want the QA team to be able to learn how to use the Test Automation Suite in fairly short order.  Part of that is hiring the right people.  Part of that is using a tool that can be learned relatively quickly, and that is well-documented. 
  • Minimize maintenance time
As my System-Under-Test changes, I don't want to spend too much time updating my Test Automation Suite in order to let it run against the new build.

I need to be sure that my Test Suite isn't too brittle - it must not be overly-sensitive to minor changes in the System.  But changes inevitably happen, so the Test Suite still must be easy to maintain.
  • Minimize post-run analysis time
If I run my Test Suite overnight, unattended, then I must be able to come in the next morning and quickly understand what ran, and what problems were found.  I want that to be simple and quick.  I want to be able to quickly see where the errors were, and be able to dig in, write the relevant bug reports, and get on to the rest of the day's work. 
  • Minimize dependence on golden machines
While it's not always possible to avoid completely, I don't want my Test Suite to depend on being run only on a particular machine.  I want to be able to pick the Test Suite up, and move it to another machine (perhaps several other machines) as needed.

To that end, I want to avoid hard-coded wait times (which may be inappropriate on a faster or slower machine).  I also want to place any necessary machine-specific details into variables that can be easily adapted to a new home.
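
Two habits that help here: keep machine-specific values in a config file rather than in the scripts, and wait for a condition instead of sleeping for a fixed time.  A sketch (the machine.ini file and its keys are assumptions):

import configparser
import time

# machine.ini holds anything that changes from machine to machine, e.g.:
#   [env]
#   base_url = http://testserver1/app
#   results_dir = C:\qa\results
config = configparser.ConfigParser()
config.read("machine.ini")
BASE_URL = config.get("env", "base_url", fallback="http://localhost/app")

def wait_for(condition, timeout=30.0, poll=0.5):
    """Wait for a condition instead of using a hard-coded sleep, so the same
    script works on fast and slow machines alike."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll)
    return False

# Example (server_is_up is a hypothetical check):
# wait_for(lambda: server_is_up(BASE_URL), timeout=60)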
  • Record and Playback capability
I never depend on Record-and-Playback as the sole method of developing a Script or a Test Suite.  It's just not possible to develop robust, full-featured Test Suites that way.

On the other hand, a quick recording is very often a nice way to create Script code rapidly; code which can then be edited into the final form.

I've used Test Tools which didn't provide a recording capability.  It's still usable, but not nearly as efficient.

April 23, 2006

The Cult of Quality

While this essay appeals to me, I understand that businesses need to get projects done, to deliver.  I've seen QAers who believe that it is absolutely essential to get every single bug fixed before a product is shipped.  I know that isn't true.  There must be a way to balance the need for quality from a corporate culture aspect against the need to get things done.

It's clear to me that some (few?) companies have a "Cult of Quality" and some (most?) do not.

(from "Peopleware - Productive Projects and Teams" - DeMarco and Lister)
The judgment that a still imperfect product is "close enough" is the death knell for a jelling team.  There can be no impetus to bind together with your cohorts for the joint satisfaction gained from delivering mediocre work.  The opposite attitude, of "only perfect is close enough for us," gives the team a real chance.  This cult of quality is the strongest catalyst for team formation.

It binds the team together because it sets them apart from the rest of the world.  The rest of the world, remember, doesn't give a hoot for quality.  Oh, it talks a good game, but if quality costs a nickel extra, you quickly see the true colors of those who would have to shell out the nickel.

Our friend Lou Mazzucchelli, chairman of Cadre Technologies, Inc., was in the market for a paper shredder.  He had a salesman come in to demonstrate a unit.  It was a disaster.  It was enormous and noisy (it made a racket even when it wasn't shredding).  Our friend asked about a German-made shredder he'd heard about.  The salesman was contemptuous.  It cost nearly half again as much and didn't have a single extra feature, he responded.  "All you get for that extra money," he said, "is better quality."

Your marketplace, your product consumers, your clients, and your upper management are never going to make the case for high quality.  Extraordinary quality doesn't make good short-term economic sense.  When team members develop a cult of quality, they always turn out something that's better than what their market is asking for.  They can do this, but only when protected from short-term economics.  In the long run, this always pays off.  People get high on quality and out-do themselves to protect it.

The cult of quality is what Ken Orr calls "the dirt in the oyster."  It is a focal point for the team to bind around.

April 21, 2006

Pseudo-Translation of Strings as an Aid to Internationalization Testing

Often, during the Internationalization process, all the text strings in an application or web site are extracted into separate files (sometimes called "resource files") for translation into a target language.

One technique useful in Internationalization and Localization testing is to pseudo-translate your text into strings that can still be understood by testers, but that will indicate if any strings remain untranslated.  Often a Perl script, or other automated technique, can do the pseudo-translation for you.  After translation, rebuild the application-under-test and look at it.



Any remaining untranslated strings then probably represent strings that were incorrectly hard-coded and were not extracted into resource files.

There are a number of techniques for doing this.  For example:

  • UPPERCASEIAN – this method translates each string to uppercase
  • Bracketed – this brackets each string with [ and ]
  • Egg Language – this places the string “egg” in front of each vowel
  • All Xes – this replaces each character in the string with X
  • Leading and Trailing Zs – this puts the string “zzz” in front and at the end of the phrase
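
A small script can apply any of these transforms to a resource file automatically.  A minimal Python sketch, assuming a simple key=value properties-file format (the file names are placeholders):

import re

def to_egg_language(text):
    """Pseudo-translate by placing "egg" in front of each vowel."""
    return re.sub(r"([AEIOUaeiou])", r"egg\1", text)

def pseudo_translate_file(in_path, out_path, transform=to_egg_language):
    """Apply the transform to the value side of each key=value line."""
    with open(in_path, encoding="utf-8") as src, open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            if "=" in line and not line.lstrip().startswith("#"):
                key, value = line.rstrip("\n").split("=", 1)
                dst.write(key + "=" + transform(value) + "\n")
            else:
                dst.write(line)

print(to_egg_language("All Things Quality"))   # eggAll Theggings Qeggueggaleggity
# pseudo_translate_file("strings_en.properties", "strings_xx.properties")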

How the phrase “All Things Quality” might be “translated” into various pseudo-locales for Internationalization testing purposes:




  • UPPERCASEIAN – ALL THINGS QUALITY
  • Bracketed – [All Things Quality]
  • Egg Language – eggAll Theggings Qeggueggaleggity
  • All Xes – XXX XXXXXX XXXXXXX
  • Leading and Trailing Zs – zzzAll Things Qualityzzz




How Babelfish (http://babelfish.altavista.com/) translates the phrase “All Things Quality” into various languages:

  • Chinese (Simplified) – 所有事质量
  • Chinese (Traditional) – 所有事質量
  • Dutch – Al Kwaliteit van Dingen
  • French – Toute la Qualité De Choses
  • German – Alle Sache-Qualität
  • Greek – Όλη η ποιότητα πραγμάτων
  • Italian – Tutta la Qualità Di Cose
  • Japanese – すべての事の質
  • Korean – 모든
  • Portuguese – Toda a Qualidade Das Coisas
  • Russian – Все Качество Вещей
  • Spanish – Toda la Calidad De las Cosas




And for fun, how the Dialectizer (http://rinkworks.com/dialect/) translates the phrase "All Things Quality" into various "dialects":




  • Redneck – All Thin's Quality
  • Jive – All Doodads Quality
  • Cockney – All Finks Quality
  • Elmer Fudd – Aww Dings Qwawity
  • Swedish Chef – Ell Theengs Qooeleety
  • Moron – All Diggs Kality
  • Pig Latin – Allyay Ingsthay Ualityqay
  • Hacker – all tihngz quiality



[03/23/07] An update:
It turns out that the folks at Fog Creek Software first pseudo-translated their strings into Pig Latin as they prepared to translate FogBugz into German.

http://www.joelonsoftware.com/items/2007/03/13b.html

April 20, 2006

What can get in the way of good testing?

Testability isn't just an attribute of the system-under-test. Testability can be affected by lots of things - in the software, in the processes, in the environment.


Here are a few things that can get in the way of good testing. Not to say that good testing cannot be achieved in spite of these obstacles, but at least they can make it more difficult, more expensive, or consume more time and/or resources.

In no particular order:
  • Lack of clear requirements
  • Requirements which change frequently
  • Buggy code
  • Turnover (of testers, programmers, managers, etc.)
  • Layoffs of testing (or other development) staff.
  • Organizational changes over the life of the project.
  • Number of people who influence the project
  • Low level of consensus about the project
  • Inappropriate ratio of testers-to-developers
  • Different language abilities among testers
  • Different language abilities among testers and developers
  • Distance between developers and testers
  • Tests blocked by defects
  • Low defect fix rate
  • High number of times a bug is reopened
  • Time lost to development issues
  • Broken builds
  • Time lost to emergency fixes
  • Time spent providing support for pre-release users
  • Poor test environment
  • Poor working environment
  • Lack of testing skills
  • Lack of management support

Perhaps They Should Have Tested More - Hartsfield-Jackson Atlanta International Airport

Glitch Caused Atlanta Airport Shutdown

The Associated Press
Thursday, April 20, 2006; 1:08 PM


ATLANTA -- A bomb scare that shut down security checkpoints for two hours at the world's busiest airport was the result of a computer software glitch, the nation's top security administrator said Thursday.

Transportation Security Administration Director Kip Hawley said a screener at Hartsfield-Jackson Atlanta International Airport spotted what looked like an explosive Wednesday on an X-ray machine.

She pressed a button that should have signaled a routine security test was being conducted but it failed to respond, Hawley said. The screener notified her supervisor of the suspicious image on the X-ray machine monitor.

Officials then closed security checkpoints for two hours. By the time checkpoints reopened, no planes had departed for more than an hour and all arrivals were delayed by at least 90 minutes. At least 120 flights were affected, officials said.

Hawley apologized for the delays but said he was pleased with the performance of the TSA screeners in Atlanta.



TSA Administrator Kip Hawley told reporters Thursday that the screening system at the airport sends random test images of bombs. When the system works correctly, the screeners are immediately notified that they saw a test image.

The message did not come across to a screener on Wednesday who noticed a suspicious image on the monitor of her x-ray machine. A "ground halt" of all flights was ordered to allow security officials time to determine if the item in question was dangerous.


A day after Hartsfield-Jackson International Airport shut down for several hours, the head of the TSA says the entire problem was a computer glitch.

The security incident grounded 120 flights while thousands of passengers waited for hours before receiving the all clear.

“What we’ve learned today, is that the screener did her job, but the screening system did not,” said TSA Director Kip Hawley.

TSA employees are trained to spot explosive devices, and that training never ends. A computer system projects images of bombs into random bags, and it’s up to the screener to catch them.

On Wednesday, the screener saw a bomb in a bag, but the system failed to tell her it was just an image.

Hawley had already planned to visit Hartsfield on Thursday, but he arrived just in time to answer questions about Wednesday’s shutdown.

“There was an image of a bomb that is used for testing purposes that was projected on the screen. The TSA properly ID'd it as a threat object, pressed the button to see if it was a test. There was a software malfunction,” said Hawley.

The software should have told the employee that they correctly identified the image and that it was only a test. But there was no message. So the screener called her supervisor.

All the bags were checked. No bomb was found. But they still didn't know it was a software malfunction. Not until they pored through hundreds of images of bombs on the computerized system.

“It was during the overnight audit of all the library, that we have that this one was identified,” Hawley said.

Hawley and airport manager Ben DeCosta also answered complaints passengers weren’t told anything for almost two hours.

Both said this incident highlighted that airport officials need to do a better job of letting passengers know what is going on during security incidents.


Delta: Ga. airport security glitch costly

By HARRY R. WEBER, AP BUSINESS WRITER
Published: Apr 21, 2006

ATLANTA (AP) - Delta Air Lines Inc., which is operating under bankruptcy protection, said Friday a computer software glitch at a security checkpoint at the world's busiest airport cost it more than $1.3 million.
The revelation about Wednesday's disruption at Hartsfield-Jackson Atlanta International Airport came in a letter from Joe Kolshak, Delta's executive vice president and chief of operations, to Transportation Security Administration Director Kip Hawley.

"These costs are not insignificant for an airline that is fighting for its survival," Kolshak wrote.

Hawley told reporters Thursday that the scare that shut down security checkpoints for two hours was the result of a computer glitch in testing software. He said an airport screener spotted what looked like an explosive on an X-ray machine. She pressed a button that should have signaled a routine security test was being conducted, but it failed to respond, Hawley said.

By the time checkpoints reopened, no planes had departed for more than an hour and all arrivals were delayed by at least 90 minutes. The shutdown came at peak travel time and at least 120 flights were affected.

Kolshak said in his letter to Hawley that more than 7,000 Delta customers were affected by flight cancellations, diversions and delays.

"We believe that a breakdown in the testing process as well as communications caused an undue burden on passengers and on this airline, and could have been avoided," Kolshak wrote.

In a meeting with reporters at the airport Thursday, Hawley apologized for the delays that affected passengers. But he said he is pleased with the performance of the TSA screeners in Atlanta. The screener acted properly in what should have been a routine drill, he said.

Delta, the nation's No. 3 carrier, has lost more than $12.8 billion since January 2001. The Atlanta-based company filed for bankruptcy protection in New York on Sept. 14.

April 18, 2006

It's a Bug, It's a Feature


It’s a
  • BUG
  • Feature
  • BUG
  • Feature
  • BUG
  • Feature

(August, 2007: An old Bug/Feature)



(January 2007: A new bug/feature)





(July 2015: It's a Feature)


(July 2015: It's a Maine Feature)

April 17, 2006

Misuse and Abuse of Bug Counts


I use bug counts to tell me something about the system-under-test.  They give me a rough idea of the volume of work remaining in different categories - how many bugs remain to be fixed at this instant, how many fixes remain to be verified, etc.  Bug counts are easy to gather and easy to chart.

But I try not to publicize these bug counts.  I avoid this because far too often, I see bug counts and rates being relied upon to provide information for which they are ill-suited, without a lot of context.  I don't want people to make bad decisions, based on how they mis-interpret or otherwise misuse bug counts.  There's a saying I try to remember: There is nothing more dangerous than a Vice President with statistics.

Here are some bad ways to use bug counts or bug rates, and some reasons why I think they are bad.

Using Bug Counts as a Measure of Software Quality
  • Bug counts don't really tell much about the quality of a system.  If I told you that we have 0 Open bugs at the moment, what conclusions could you draw?  Could you really say that the system had high quality?  What if nobody had tested anything yet?  What if I had simply deferred all the open bugs to a future release?  What if all the bugs had just been marked as Fixed, but the fixes hadn't yet been verified?  What if the product had only 1 bug yesterday "The software cannot even be installed", and that was just fixed today?
Using Bug Counts as a Measure of a Tester's Performance
  • People are good at changing their behavior, based on how they are measured.  If I tell them that "more bugs are good", they will create more Bug Reports.  But, then I'll have to worry if these are superficial bug reports, or if there are duplicates, or if they are really finding the "important" bugs (which may take longer to find), rather than just blasting in many "less important" bugs in order to drive their bug count up.
  • If testers are measured on how many Bugs they find, the developers may feel that they are being judged (in the opposite manner) by the same bug counts.
  • The bug count is usually strongly affected by the kind of testing tasks assigned to the QAer.  Someone assigned to test brand-new code, may find it simple to find lots of bugs, whereas someone assigned to do regression testing, may find fewer bugs.  Someone testing the UI may be able to find bugs more quickly than someone assigned to test lengthy database upgrades.
  • If people are judged based on their bug count, they may be less willing to perform all the other tasks that still need to be done - writing Test Cases, verifying fixes, helping others, etc.
Using Bug Counts as a Measure of a Developer's Performance
  • Again, people are good at tailoring their behavior to maximize their personal measures.  If they are incented to keep their individual bug count low, they can accomplish that - perhaps at the expense of other, desirable activities.  I worked in one shop that made it a big deal when developers had zero bugs at our weekly team meetings, held on Mondays.  So one developer made a habit of checking in a ton of changes at the end of the day on Friday, and marking all her bugs as "Fixed".  She knew there would be no time to verify any of these fixes before the meeting, and so she would always be the "hero with zero" during the meeting.  Immediately after the meeting, her bug count would zoom up as many of her quick fixes were invariably found to be defective.
  • When bugs are counted against developers, they may spend more energy arguing over bug assignment, rather than just fixing them.
  • All bugs are not equal.  One really bad bug is often far more important to a business than many trivial bugs.  Bug counts usually do not express this relationship well.
  • I never want a tester to be conflicted about writing a bug report.  If you find a bug, write the report!  No need to worry that it might adversely impact your developer friend.
Using Bug Counts as Project Completion Predictors
  • While your Release Criteria may include "Zero High-Priority Bugs" as one item, that doesn't mean you can use the current bug count, or even the rate of bug fixes as a predictor of when you can release.  Bug counts don't predict anything.  For example, just because you have been able to close 10 bugs per week, and you have 10 remaining, does not mean you will be done in a week.  What happens if, at the end of the week, you find a High-Priority bug?  You still have something that effectively indicates you are not done.  Many measurement systems break down when the numbers are small.
  • Bug counts change in response to lots of factors that aren't always obvious in the raw numbers.  On one project, the bug count went down dramatically during one particular week near the end of the development cycle.  When I went to the weekly review, several team members pointed to that as evidence that we were almost ready to ship the product.  Unfortunately, I had to point out that the two best QAers were on vacation that same week.  Once they returned, and our testing pressure was brought back up to normal levels, the bug count started to climb again.
Comparing Bug Counts Across Projects
  • It's very often not a valid comparison to compare the count of bugs found in one project versus the count found in another.  Projects vary in size, complexity, duration, technical challenge, staffing, etc.  Bug counts are very likely to vary as well.
Comparing Bug Counts Across Companies or Industries
  • This can be even more misleading than comparing within a company.  Companies don't build identical products, nor use identical release schedules, nor determine what is or isn't a bug identically.

There are ALWAYS Requirements

Occasionally, I hear the question - "How can I design test cases, if I don't have a Requirements Document?"

But, there are ALWAYS requirements - even if they are not formally documented. They may take some time to discover and list, but they exist. Here's one approach to finding those "hidden" requirements.

First, look for general requirements and work to document them. Some of these requirements come from previous versions of the application, some come from generally accepted usage. For example:
  • Must run on platforms x,y,z (perhaps because those platforms have always been supported)
  • Must use abc database
  • Must be able to process n records in m seconds
  • Must be at least as fast as release n - 1
  • Must not consume more memory (or other resources) than release n - 1
  • Must not crash
  • Must not corrupt data
  • Must use standards relevant to the platform (standard Windows UI, for example)
  • Must be consistent with relevant laws, regulations or business practices
  • Must not have any misspellings
  • Must be grammatically correct
  • Must incorporate the company's usual look-and-feel
  • Must be internally consistent
  • Must work in particular locales
  • Must be complete when expected by the stakeholders (perhaps for some event, such as a Beta)
If it's a web site or application, some additional requirements might include:
  • Must not be missing any images
  • Must not have any broken links
  • Must basically work the same in all browsers which are officially supported by the company
Then, interview the project manager or developers and find out what they intend to do with this release. Document the intentions and use them as requirements.

Solicit input from anyone who is a stakeholder in the project. Share everything you find with everyone and revise it as needed.

Does the product have a Help file or a User Guide? If so, that's a good source of requirements.

Do Sales materials exist for the product? Certainly the product should do what these materials say that it does.

Sometimes, writing all of this up as assumptions can go a long way toward gaining a consensus as to the "real requirements" you can use to test against.

Once the system is at all testable, do some exploratory testing. As you find "undocumented features", add them to the list of topics to be discussed.

Find out if the product is internally consistent. (This is an area I find to be very useful.) Even if I know nothing at all about a product, I assume it must be consistent within itself, and within the environment in which it must operate.

Look for external standards within which the product must operate. If it is a tax or accounting program - tax law must prevail and generally accepted accounting principles must apply.

Ideally, all of these issues have already been considered and written into the formal Requirements documentation which is handed to you. But if not, don't give up. Dig in and discover!

(also see: http://strazzere.blogspot.com/2010/04/checklist-for-good-requirements.html )

added 10/20/09:

As Catherine Powell points out in her excellent blog, even without specific requirements, software must also be "a good citizen". (http://blog.abakas.com/2009/10/good-citizen.html)

She defines the term as:
"Software that is a good citizen behaves in a manner consistent with other software, with regard to interaction with other assets with which it interacts."
And she elaborates this way - Software that is being a good citizen does:
  • support logging in a common format (e.g., NT Event logs, etc)
  • use centralized user or machine management (e.g., Active Directory or NIS)
  • does automatic log rolling
  • can be configured to start on its own after a power outage or other event
  • can be disabled or somehow turned off cleanly (to allow for maintenance, etc)
Software that is being a good citizen does not:
  • log excessively (at least, except maybe in debug, which should be used sparingly)
  • create excessive traffic on infrastructure servers (DNS, Active Directory, mail, firewall, etc)
  • send excessive notifications (e.g., a notification for every user logging in would probably be overkill)

April 15, 2006

General Input Tests for Time Fields

Here are some routine tests to try for a time field.

Decide which of the following are relevant for your input field and use them.
If you are using an automated test tool, these values can easily be used exhaustively, or randomly, by a test script:
  • 00:00:00
  • 00:00:01
  • 01:01:01
  • 11:59:59
  • 12:00:00
  • 12:01:01
  • 23:59:59
  • (now)
  • (1 second ago)
  • (1 second from now)
  • (1 minute ago)
  • (1 minute from now)
  • (1 hour ago)
  • (1 hour from now)
  • 25:01:01
  • 01:60:01
  • 01:01:60
  • 01:01:61
  • 01::01
  • 01::
  • 01:01:xx
  • 01:xx:01
  • xx:01:01
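
If the test tool can be driven from a script, the fixed values above can live in a simple list while the relative ones are computed at run time.  A sketch in Python (validate_time_field() stands in for whatever actually drives the field in your tool):

from datetime import datetime, timedelta

FIXED_VALUES = ["00:00:00", "00:00:01", "11:59:59", "12:00:00", "23:59:59",
                "25:01:01", "01:60:01", "01:01:60", "01::01", "xx:01:01"]

def relative_values(now=None):
    now = now or datetime.now()
    offsets = [timedelta(0), timedelta(seconds=-1), timedelta(seconds=1),
               timedelta(minutes=-1), timedelta(minutes=1),
               timedelta(hours=-1), timedelta(hours=1)]
    return [(now + off).strftime("%H:%M:%S") for off in offsets]

def validate_time_field(value):
    """Hypothetical hook: type the value into the field and check the response."""
    print("testing time field with", repr(value))

for value in FIXED_VALUES + relative_values():
    validate_time_field(value)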

Why "42" is The Answer to Life, the Universe, and Everything

When I'm asked for a simple, numeric answer to a complex question, I like to respond "42".

Here's why:
http://en.wikipedia.org/wiki/The_Answer_to_Life,_the_Universe,_and_Everything

According to the Hitchhiker's Guide, researchers from a pan-dimensional, hyper-intelligent race of beings, construct Deep Thought, the second greatest computer of all time and space, to calculate the Ultimate Answer to Life, the Universe, and Everything. After seven and a half million years of pondering the question, Deep Thought provides the answer: "forty-two."
"Forty-two!" yelled Loonquawl. "Is that all you've got to show for seven and a half million years' work?"
"I checked it very thoroughly," said the computer, "and that quite definitely is the answer. I think the problem, to be quite honest with you, is that you've never actually known what the question is."
Even Google agrees. Try this.

As to why it's 42, rather than 7 or 1,080,213 or such, Douglas Adams writes this:
The answer to this is very simple. It was a joke. It had to be a number, an ordinary, smallish number, and I chose that one. Binary representations, base thirteen, Tibetan monks are all complete nonsense. I sat at my desk, stared into the garden and thought '42 will do' I typed it out. End of story. 
Best, 
Douglas Adams 

General Input Tests for Listbox Fields



Here are some routine tests to try for a listbox field.
Decide which of the following are relevant for your input field and use them.

If you are using an automated test tool, these values can easily be used exhaustively, or randomly, by a test script:
  • None selected
  • Top item selected
  • Second item selected
  • Middle item selected
  • Next to bottom item selected
  • Bottom item selected
  • Top and bottom items selected
  • Several adjoining items selected
  • Several non-adjoining items selected
  • All items selected
  • All items except the top item selected
  • All items except the bottom item selected
  • All items except one middle item selected

General Input Tests for Radiobutton Fields



Here are some routine tests to try for a radiobutton field.

Decide which of the following are relevant for your input field and use them.

If you are using an automated test tool, these values can easily be used exhaustively, or randomly, by a test script:
  • On
  • Off
  • (All buttons in a group Off)
  • (One button in a group On)
  • (More than one button in a group On)
  • (One button in a group Off)
  • (All buttons in a group On)

General Input Tests for Checkbox Fields



Here are some routine tests to try for a checkbox field.

Decide which of the following are relevant for your input field and use them.

If you are using an automated test tool, these values can easily be used exhaustively, or randomly, by a test script:
  • Checked
  • Unchecked
  • 3rd State (if applicable)

General Input Tests for Date Fields




Here are some routine tests to try for a date field.

Decide which of the following are relevant for your input field and use them.

If you are using an automated test tool, these values can easily be used exhaustively, or randomly, by a test script:
  • 0/0/0000
  • (yesterday)
  • (today)
  • (today with all leading zeros, like 01/01/2006) 
  • (tomorrow)
  • (same day last week)
  • (same day next week)
  • (same day last month)
  • (same day next month)
  • (same day last year)
  • (same day next year)
  • (Jan 1 this year)
  • (Dec 31 this year) 
  • (Jan 1 last year)
  • (Dec 31 last year) 
  • (Jan 1 next year)
  • (Dec 31 next year) 
  • 1/1/1999
  • 12/31/1999
  • 1/1/2000
  • 2/28/2000
  • 2/29/2000
  • 2/30/2000
  • 2/31/2000 
  • 3/1/2000
  • 12/31/2000
  • 1/1/2001
  • 1/1/202 
  • 2/28/2004
  • 2/29/2004
  • 3/1/2004
  • 12/31/2004
  • 9/9/9999
  • 2/29/2001 
  • 4/31/2001
  • 9/31/2001 
  • 2/29/2100
  • 13/1/2006
  • 00/1/2006
  • 01/32/2006
  • 01/00/2006

  • June 5, 2001
  • 06/05/2001
  • 6/5/2001
  • 06/05/01
  • 6/5/01
  • 06-05-01
  • 06-05-2001
  • 6-5-01
  • 6-5-2001
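
The relative entries in the list (yesterday, same day next week, Jan 1 next year, and so on) can be computed at run time rather than typed by hand.  A sketch using only the Python standard library (month-arithmetic edge cases such as Jan 31 "same day next month" are deliberately left out):

from datetime import date, timedelta

def relative_dates(today=None):
    today = today or date.today()
    return {
        "yesterday": today - timedelta(days=1),
        "today": today,
        "tomorrow": today + timedelta(days=1),
        "same day last week": today - timedelta(weeks=1),
        "same day next week": today + timedelta(weeks=1),
        "Jan 1 this year": date(today.year, 1, 1),
        "Dec 31 this year": date(today.year, 12, 31),
        "Jan 1 next year": date(today.year + 1, 1, 1),
        "Dec 31 last year": date(today.year - 1, 12, 31),
    }

for label, d in relative_dates().items():
    print(label + ":", d.strftime("%m/%d/%Y"))   # 01/01/2006-style, with leading zeros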

April 14, 2006

General Input Tests for Strings


Here are some routine tests to try for a simple string field.

To use these values, you should first know the Minimum and Maximum number of characters required.

Then decide which of the following are relevant for your input field and use them.

If you are using an automated test tool, these values can easily be used exhaustively, or randomly, by a test script:
  • Nothing
  • Empty field (clear any defaults)
  • More than the maximum number of characters
  • Much more than the maximum number of characters
  • Any valid string
  • A single leading space
  • Many leading spaces
  • A single trailing space
  • Many trailing spaces
  • Leading and trailing spaces
  • A single embedded space
  • Many embedded spaces
  • Nonprinting character (e.g., Ctrl+char)
  • Operating system filename reserved characters (e.g., "*.:")
  • Language-specific reserved characters
  • Upper ASCII (128-254) (a.k.a. ANSI) characters
  • ASCII 255 (often interpreted as end of file)
  • Uppercase characters
  • Lowercase characters
  • Mixed case characters
  • Modifiers (e.g., Ctrl, Alt, Shift-Ctrl, and so on)
  • Function key (F2, F3, F5, and so on)
  • Characters special to sprintf (like %s or %d)
  • Keyboard special characters (~!@...)
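
Many of these values can be generated from the field's minimum and maximum lengths rather than typed by hand.  A rough Python sketch (the limits and the filler character are placeholders):

import random
import string

def string_test_values(min_len, max_len):
    """Generate routine and boundary values for a string field."""
    return [
        "",                                        # empty field (clear any defaults)
        "a" * max(min_len - 1, 0),                 # just under the minimum
        "a" * min_len,                             # exactly the minimum
        "a" * max_len,                             # exactly the maximum
        "a" * (max_len + 1),                       # just over the maximum
        "a" * (max_len * 10),                      # much more than the maximum
        "".join(random.choices(string.ascii_letters, k=max(min_len, 1))),  # any valid string
        " leading space",
        "trailing space ",
        " leading, trailing, and  embedded  spaces ",
        "UPPERCASE", "lowercase", "MixedCase",
        "~!@#$%^&*()",                             # keyboard special characters
        "%s %d",                                   # characters special to sprintf
        "".join(chr(c) for c in range(128, 255)),  # upper ASCII (ANSI) characters
    ]

for value in string_test_values(min_len=1, max_len=20):
    print(repr(value))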

Character Input Testing

The following is a list of characters that are, or may be, special to some portion of the processing chain for many web applications.

A robust input and display testing strategy will verify that these characters are
  • correctly rendered on input
  • if persisted in some way, correctly rendered after being persisted and retrieved from the persistent store
  • handled correctly when placed in different parts of an input string (beginning, middle, end)
Note that some of these collections overlap -- the intent is for each list to be complete in and of itself.
  • Characters that are special to file system paths   
    • . \ / : ;
  • Characters that are illegal in file system paths     
    • unknown, TBD
  • Characters that are special to XML                   
    • < > & " '
  • Characters that are illegal in XML
    • The following BNF rule describes valid characters in XML (from XML specification):
      • Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
  • Any character not in these ranges (for example #xC) is not allowed to appear anywhere in an XML document. (We still may need to handle these values correctly in some places.  Even if the user may not be able to enter them, other services sometimes return them to us. For example, some SNMP object discoveries return strings containing #x0.)
  • Characters that are special in Javascript 
    • TBD, includes " ' \ and some character combinations beginning with \, like \u \n.
  • Characters that are illegal in Javascript
    • TBD
  • Characters that are special in URLs
    • See RFC 2396, sections 2.2-2.4. The following characters are special:
      • reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
  • Characters that are illegal in URLs
    • See RFC 2396, sections 2.2-2.4. The following characters (the "unreserved" rules) are legal in URLs:
      • unreserved = alphanum | mark
      • mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
      • alphanum = A-Z a-z 0-9
    • Any characters not in this set (for example 16-bit characters, 8-bit non-US-ASCII characters, and punctuation marks not in this list, except reserved characters being used in their official capacity) must be hex-encoded to be included in a URL.
    • The following rules describe characters that are explicitly disallowed (this set is NOT exhaustive, as the previous paragraph is):
      • control = <US-ASCII control characters 00-1F and 7F hexadecimal>
      • space = <US-ASCII space character, 20 hexadecimal>
      • delims = "<" | ">" | "#" | "%" | <">
      • unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
    • Characters that are 8-bit (e.g. European characters from ISO-8859-1,2,3): a with umlaut (ä), e with accent (é), o with slash (ø)
    • Characters that are 16-bit: Korean, Japanese characters
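
A sketch of that strategy: take each special character, place it at the beginning, middle, and end of an input string, push it through whatever save-and-reload path the application provides, and check that it comes back unchanged.  The save_and_reload() function here is a hypothetical hook into the application under test:

SPECIAL_CHARS = ['.', '\\', '/', ':', ';', '<', '>', '&', '"', "'",
                 '?', '@', '=', '+', '$', ',', '#', '%', '{', '}',
                 '^', '[', ']', '`', ' ']

def placements(ch, base="abc"):
    """Place the character at the beginning, middle, and end of a base string."""
    mid = len(base) // 2
    return [ch + base, base[:mid] + ch + base[mid:], base + ch]

def save_and_reload(value):
    """Hypothetical hook: persist the value via the application and read it back.
    Here it just returns its input so the sketch runs on its own."""
    return value

for ch in SPECIAL_CHARS:
    for candidate in placements(ch):
        result = save_and_reload(candidate)
        status = "PASS" if result == candidate else "FAIL"
        print(status, repr(candidate), "->", repr(result))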