13 October 2013

Arguing for Exploratory Testing, part 2, Reuse

You can read a bit more about this series in the first post:
Arguing for Exploratory Testing, part 1, Traceability

The topic for our second Transpection Tuesday on "Arguing for Exploratory Testing" was Reuse.

We finished with two open questions:
Can we ensure we actually repeat the exact same test a second time?
How do you actually achieve reuse in exploratory testing (when it is desired)?

Reasons to reuse tests
First we tried to state reasons someone would want to reuse test cases:
  • Save time during test design
  • Functionality is changed and we want to rerun the full/part of the test scope
  • We want to verify a (bug) fix
Looking at reasons quickly led us to some preconceptions which became the topic for a big portion of the session:
  • Effort = Value
  • Equal execution = Equal value
  • Our scope is (almost) complete
  • Reuse = free testing
  • A monkey can run a test case
Preconception: Effort = Value
Since we've invested so much time (as well as money and prestige) in writing test cases they must be worth more than a single execution.
  • Even if presented with clear evidence we may reject it to defend our judgement
  • We may overestimate what a test case is useful for (we want to get the most out of our work)
  • It's my work; criticize it and you criticize me, not the work! (common and unfortunate misconception)
It takes a lot of self-esteem to say: "Yeah I screwed up, could you help me?", especially in an environment where mistakes are not accepted. Notice that many of the "so how can we make the most of this mistake" responses still communicate "so you made a mistake, now you'll have to suffer for it by telling us why you are a failure". It takes a lot of work to change this.

Preconception: Equal execution = Equal value
Let's say we execute the exact same steps in a scripted and an exploratory way, wouldn't that be two identical tests? We believe not.
  1. Your goal differs. With test cases your goal is to finish as many test cases as possible (progress). That's how you measure "how much testing you were able to perform". In exploratory testing you are judged on the information you provide, so you should be more inclined to spend a few extra minutes observing or following something up even when it's not "part of your test".
  2. Your focus differs. When you have a script you have to focus on following that script. In exploratory testing your goal is typically to find new leads to base the next test on. That means in one case your focus is on the product and in the other on an artifact. Think about the Invisible Gorilla experiment.
  3. Scripts more easily bias you not to observe. In a script you typically have verification steps, e.g. "verify X=5". We believe this could bias you to be less observant during the other steps: "this is just setup so nothing should happen that concerns me".
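The third point can be sketched in code. This is only an illustration: the state values and the function names (`run_steps`, `scripted_check`, `unchecked_observations`) are invented for the example and do not come from any real system. The idea is that a step like "verify X=5" passes happily while everything it never looks at goes unnoticed.

```python
# Hypothetical system state after running a scripted test's steps.
# All names and values here are invented for illustration.
def run_steps():
    return {
        "x": 5,                    # the value the script tells us to verify
        "response_time_ms": 9500,  # far slower than usual
        "error_log_lines": 3,      # unexpected errors logged during setup
    }


def scripted_check(state):
    """Mirrors a script step such as 'verify X=5': only x is examined."""
    return state["x"] == 5


def unchecked_observations(state, checked=("x",)):
    """Everything the scripted check never looked at."""
    return {key: value for key, value in state.items() if key not in checked}


state = run_steps()
print("scripted check passed:", scripted_check(state))
print("never examined:", unchecked_observations(state))
```

A tester focused on the product rather than the script has at least a chance of noticing the slow response or the logged errors; the check itself never will.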
Preconception: Our scope is (almost) complete
We know a feature's boundaries (specifications, requirements) so when we set the scope for testing we can, and usually will, cover almost the entire feature.
  • We can't know the boundaries of a feature:
    • We will impact and use other components not "part of the feature" e.g. other code in the application, the operating system, surrounding applications, third party plugins, hardware, hardware states etc.
    • We interpret planning documents differently: we add parts, discover things we couldn't have anticipated, correct mistakes, or interpret something differently than the author intended and/or than the tester interpreted it.
  • We can (almost) always tweak a test a little bit (e.g. change input data or timing). But testing all combinations (we recognize) is way too expensive. Also there are usually so many ways an application can be misused (intentionally or unintentionally) that even with a ton of creativity we can't figure them all out (ask any security expert .)
So our scope is basically a few small dots on a big canvas rather than a well colored map. But those dots are (hopefully) carefully selected to protect us from the greatest risks we can anticipate. Still, they are only dots.
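To put rough numbers on the "dots on a canvas" picture, here is a small back-of-the-envelope sketch. The dimensions and counts are made-up assumptions, not measurements from any real feature, and they only include the variations we happen to recognize.

```python
from math import prod

# Hypothetical input dimensions for a single feature; the counts are
# made-up assumptions, not measurements from any real system.
dimensions = {
    "browser": 4,
    "user_role": 3,
    "locale": 10,
    "input_value_class": 8,
    "network_condition": 3,
    "timing_variant": 5,
}

total_combinations = prod(dimensions.values())
tests_in_scope = 40  # assumed size of a test scope

coverage = tests_in_scope / total_combinations
print(f"combinations we recognize: {total_combinations}")
print(f"share covered by {tests_in_scope} tests: {coverage:.2%}")
```

With these assumed numbers, six modest dimensions already give 14,400 combinations, and a scope of 40 tests touches well under one percent of them.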
As testers we easily support the preconception of full coverage by answering questions like "do we cover this feature now?", "is all testing done?" etc. with a simple "yes" or "almost". The more accurate answer would be "we cover the most important risks we identified limited by our current knowledge, time available and other constraints", but that answer is not very manager friendly which leads us to...

There is a general lack of knowledge and understanding of testing in most organizations. We decided to stop there, since that was far too big a question to tackle at the point we got there. But it's an important question, so please take a moment and think about it for yourself: How can you improve understanding of and interest in testing in your organization?

A final note. Since we only cover a small part, reusing a test scope will not help us catch the bugs we missed the first time. How big a problem that is varies, but reuse the same scope a few times and the gaps may add up in a nasty way.
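A toy model of that final note, under heavy assumptions: each test exercises exactly one area, each bug lives in exactly one area, and all the area and bug names are invented. Rerunning the same scope any number of times keeps finding the same bugs and keeps missing the same ones.

```python
# Toy model: area and bug names are hypothetical.
reused_scope = {"login", "search", "checkout"}
bugs_by_area = {"login": "bug-1", "export": "bug-2", "profile": "bug-3"}


def run(scope):
    """Return the bugs that live in an area the scope actually exercises."""
    return {bug for area, bug in bugs_by_area.items() if area in scope}


found_by_reuse = set()
for _ in range(5):  # five identical reruns of the same scope
    found_by_reuse |= run(reused_scope)

missed = set(bugs_by_area.values()) - found_by_reuse
print("found after five reruns:", sorted(found_by_reuse))
print("still missed:", sorted(missed))
```

Only a change in the scope itself (a new dot on the canvas) can reach the bugs in the areas the original scope never touched.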

Preconception: Reuse = free testing
We've already written the test case, so wherever it's applicable (which should be self-explanatory) we can just paste it into our test scope and voilà! Free coverage!

The big issue here is the "self-explanatory" part. The problem is that what fitted well in one feature might not fit in another, even a similar one. Even when no tweaks are needed, we still have to figure out what the test case actually does, so that we know what we have covered with it and what we still need to cover in other ways.

This process is expensive, really expensive. Sure, we save time by not having to figure out the test and how to practically run it all over again, but consider the time it takes to find the test case, analyse what it covers, analyse what it doesn't cover, analyse how it interacts with existing test cases, analyse whether something has changed that impacts the test case since last time, and so forth.

Preconception: A monkey can run a test case
  • We all interpret things differently. Click can mean single click, double click, right click (already assuming the first two were left clicks), tab and use enter, middle button click, etc. Even a well written, simple test case can lead to different interpretations.
  • One thing we're looking for is unexpected behavior and it's in the nature of "unexpected" to be something we can't plan for. Thus to get much use of a test case we need to handle the system well enough to investigate and identify when a behavior is unexpected or undesired.
  • We make a ton more observations than we consciously think of. These observations take practice, focus and skill. For example, when you boot your computer you would react to a lot more things than you would add to a "boot test". Examples: the screen is blinking, smoke is coming out, the lights in the room flicker, you hear strange mechanical sounds. All of these should catch your attention but are unlikely to be written down.

    More skill and/or focus can lead to more valuable observations: The login screen looks different, memory calculations are wrong, it's slower than usual/expected, BIOS version is incorrect, the operating system's starting mode is wrong etc.
  • When we don't fully understand something we tend to write it down less detailed (sucks to look stupid by writing down something incorrect and we're too lazy to investigate every detail we don't understand, it's easier to investigate as we get there).
  • When we write a test case based on specifications, requirements and other "guesses" of how the end system will work, even a flawless instruction will sometimes not correspond to how the system actually works (including when it works as desired). This of course requires the person executing to be able to correct the test case, and thus to understand both the intention of the test case and how the system works.
  • If we don't understand the system we may lose a lot of time setting up fully or partly irrelevant variables to the values stated in the instructions. The immediate comment is, if we have stated irrelevant variables in the test case we've failed. Consider then that the variable might be irrelevant to the test but mandatory to the system (e.g. you have to set a valid time server). Leave that out and the person executing once again needs to understand the system.
When is reuse actually beneficial?
  • We have rewritten something from the ground up but want it to externally still work the same. Reuse could save time.
  • We have some critical paths through the system that can't break.
  • We need to quickly regression test a feature without having to dig in too deep in the feature itself.
But remember that the one executing still should understand the test and the system to ensure tweaks (using different input values, triggering different fault cases etc.) can be made and important observations are more likely to be made.

How can we achieve reuse in Exploratory Testing?
Not covered much by this particular session but a few thoughts:
  • Charters
  • Debrief notes
  • Test ideas
  • Test plans
Try creating a report specifically used as a "feature summary" including valuable operational instructions, general test ideas, impacts, lessons, problems, tools, important details, testability etc. We did something like this at my former company, where we let the test plan continuously turn into an end report as we added lessons from our testing. This would not only help when retesting a similar feature but could also serve as educational material or test plan input, for instance. It is important to stay concise, though: noise is a huge enemy! The number of readers of a document is inversely proportional to the number of pages in the document, you know .)

A few notes on test case storage
First off I love this post on having a big inventory of test cases by Kristoffer Nordström.

It's easy to think something you've already created is free, but there's no such thing. Having a large inventory of test cases costs in many different ways:
  • Storage
  • Noise (it's one more thing testers have to keep track of)
  • Another tool/part of tool for testers to stay updated with / learn / understand
  • For a test case to be fully reusable later it should be kept up to date. How many of us refactor all our old test cases as functionality is changed?
  • ... if you do, that sounds really expensive.
Reuse has its place but be careful!

Remember reuse means inheriting blind spots, has a cost and still requires the person "reusing" to know just as much about the feature, system and testing in general as if (s)he wasn't reusing old checks.

Take care, and I hope these Transpection Tuesday notes (even though somewhat messy) were helpful!

... and of course, thank you Helena!
