03 October 2013

Arguing for Exploratory Testing, part 1, Traceability

The topic for my and Helena Jeret Mäe's last Transpection Tuesday was Arguing for Exploratory Testing. What we basically wanted to achieve was to get better at explaining the pros (and cons) of exploratory testing in a concise way, as well as to identify common preconceptions about scripted versus exploratory testing.

We had defined 15 subtopics such as time estimations, credibility and making sure the important testing is done. The first item on this list was traceability, which turned out to be enough material to fill the whole two-hour session.

What is Traceability
First question was: What do we mean by traceability?

Our answer: Being able to track what has been tested, how, when and by whom.

Why do we want Traceability
The next question was why we want traceability. We quickly formed a list but reading it now makes me realize we mixed together traceability and claimed benefits of having a trunk of test cases. But anyway:
  • External demands
  • Ensure work has been performed
  • Base for further testing
  • Support handovers
  • Create a map
  • Reuse
General thoughts
One thing we came back to over and over again was: The best way (often related to level of detail) to achieve good enough traceability is highly context dependent! For example, a simple mind map with short comments is good enough for one company, while another requires every session to be recorded, with the recordings stored and indexed together with session notes, debrief summaries and saved logs. It all depends!

Another recurring theme was: "But do we really achieve that kind of traceability with test cases?" I will not bring up those discussions much in this post, but expect another one on "false assumptions about scripted and exploratory testing" soon.


Charter
A charter is basically an area to test, a way to break down a big testing mission. Notice, though, that new charters might come up as you test, so it's by no means a definite plan.

Test idea
Typically a one-liner describing one or more tests you want to do.

Session
A timeboxed, uninterrupted test sitting, typically 60-120 minutes.

Debrief
Refers to an activity happening after a session where the tester explains what has been done to, for example, a test manager. This also includes clarifying questions, feedback and other kinds of dialog to help both parties learn from the session.


Recording
We mainly refer to screen recording (video, either using a screen recording tool or an external video camera), but it could just as well mean recording audio, saving logs/traces or other ways of saving what has been done.

External demands
This refers to regulated businesses (watch the excellent presentation What is good evidence by Griffin Jones), evidence in a potential lawsuit or customers demanding test data.

Possible solutions:
  • Record the sessions, preferably with configuration (device, version, settings etc.) explained if that matters. Adding commentary might improve the value as well (communicating purpose, observations etc.). This is also typically a scenario where logs/traces can be a required addition to a video recording. Once again, watch What is good evidence.
  • Store session notes
  • Store session summaries
  • Store charters
  • Store debrief summaries
  • Store test ideas (assuming they have been covered by your testing)
Creating support for finding old information (an index) seems key as well. For this, charters, time stamps and/or categories might be useful to tag your saved material with.
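As a rough illustration of such an index, here is a minimal sketch in Python. The record fields (charter, started, categories, artifacts), the helper name find_sessions and the file paths are all made up for this example; nothing here is a prescribed format or tool, and a spreadsheet or wiki could do the same job.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SessionRecord:
    """One stored test session. All field names are hypothetical."""
    charter: str
    started: datetime
    categories: set = field(default_factory=set)
    artifacts: list = field(default_factory=list)  # notes, recordings, logs


def find_sessions(records, charter=None, category=None, since=None):
    """Return the records that match every filter that was given."""
    matches = []
    for record in records:
        if charter is not None and record.charter != charter:
            continue
        if category is not None and category not in record.categories:
            continue
        if since is not None and record.started < since:
            continue
        matches.append(record)
    return matches


# Hypothetical example data.
records = [
    SessionRecord("Login error handling", datetime(2013, 9, 24, 9, 0),
                  {"security", "login"},
                  ["notes/login-01.txt", "video/login-01.mp4"]),
    SessionRecord("Report export", datetime(2013, 9, 30, 13, 0),
                  {"export"},
                  ["notes/export-01.txt"]),
]

for record in find_sessions(records, category="login"):
    print(record.charter, record.started.date(), record.artifacts)
```

The point is only that each saved artifact carries the tags you expect to search by later; how much of this is worth maintaining is, once again, context dependent.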

Ensure work has been performed
First question raised was: Is this really something we want to encourage? Our general answer is no, the motivation being that people, in our experience, tend to do things to look good rather than do what is needed/valuable when closely monitored. But being able to know that the testers actually do their job is closely connected to credibility and transparency, so it is still a valid question.

Possible solutions:
  • Debriefs
  • Recordings
  • Notes
  • Bugs reported (a really bad metric for this but can indicate something!)
Debriefs most often seemed to be the preferred approach. During a good debrief, the person receiving the debrief (for example, a test manager) asks follow-up questions that require the tester to explain the testing done. A byproduct of this process is to ensure that the tester actually did a good job / any job at all. But once again: if your focus is on monitoring, the people monitored (testers as well as non-testers) are likely to waste time proving the job has been done rather than actually working!

Base for further testing
Let's say we've finished the prepared scope or are suddenly given an extra week to test something. If we can't go back and use already executed tests as inspiration, how do we know where to continue?

Possible solutions:
  • Having a bulk of charters as inspiration
  • Make comments about testing you've left out in your finished charters/sessions
  • Review session notes
We also brought up whether there is value in actually looking at what has been done. Often we found that the time it takes to analyse the work already done might not be worth it (the information being too detailed, making it hard to get an overview and learn from quickly). Simply exploring using knowledge we might not have had the first time, or having a different tester from when we first tested, is often more than enough to add value. After all, the time we spend analysing is time we cannot spend testing (which might or might not be well invested).

Support handovers
One tester leaves (quits, parental leave, other tasks etc.) and another has to take over. How can we manage such a change when not having a set scope of test cases? First of all, the new tester does have to spend some time getting familiar with the feature in exploratory testing, but this is also true when using test cases since we, for instance, can't predict what problems we will run into and thus can't prepare instructions for those!

But we can make it easier:
  • Charters (with status)
  • Debrief
  • Documented test ideas with already investigated ideas being marked
  • Session notes or session summaries
  • Mind maps or other test planning with already tested parts commented
  • Documenting lessons learned (like operational instructions)
Debrief in this case refers to a general debrief of what has been done, what we know is left, problems seen, lessons learned, where information is stored, who to talk to etc. by the tester leaving. Of course if the switch happens very suddenly (e.g. sickness) performing this is not possible and in that case it's important testers are professional enough to document what has been done (mind maps, short plans, visualizations, debrief/session summaries, charters). This is once again true for both exploratory and scripted testing.

Create a map
A bulk of test cases combined with statuses can somewhat be used to draw a map of what has been covered and what is left to test. How can we visualize this without test cases?

Possible solutions:
  • Charters
  • A mind map describing what has been tested
  • A picture/model of our product with comments about testing/coverage
  • Other visualizations like diagrams
  • The Low Tech Dashboard
A few important notes:
  1. You sure have a map with test cases, but is it actually anywhere near accurate? Say we have two equally complex functions. One takes 1 argument, one takes 10. We will likely have at least 10 times as many test cases to cover the second function. So if we execute all the test cases for the second function, have we really covered over 90% (with "covered" only considering these 2 functions)?
  2. Even if equally sized, that map would not cover what we didn't anticipate from the beginning, so you still need to add an up to date judgement/evaluation (e.g. "wow, that network protocol sure was more complex than we expected during the planning, we need more testing of it!").
  3. Scale is really important. Do we want to see Tartu, Estonia, Europe, the world or the Milky Way galaxy? We might need different visualizations to create all the maps we need (once again, think about value, how much time can we spare to keep these updated).
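The arithmetic behind the first note can be sketched as follows, assuming (purely for illustration) exactly one test case per argument, so 1 test case for the first function and 10 for the second. The function names and the two ways of counting are our own simplification, not a recommended coverage metric.

```python
# Hypothetical numbers: one test case per argument, as in the first note.
test_cases = {"function_a": 1, "function_b": 10}
executed = {"function_a": 0, "function_b": 10}  # only function_b has been tested

# Coverage as a test case map shows it: share of all test cases executed.
by_test_case = sum(executed.values()) / sum(test_cases.values())

# Coverage if each (equally complex) function counts the same.
by_function = sum(
    executed[name] / test_cases[name] for name in test_cases
) / len(test_cases)

print(f"By test case count: {by_test_case:.0%}")
print(f"By function:        {by_function:.0%}")
```

Counting test cases reports roughly 91% covered, while weighting the two equally complex functions the same gives 50%, since one of them has not been tested at all.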
Reuse
Later, a similar feature, or a feature impacting the one we just tested, is developed and we want to reuse the work previously done. How can we do this without test cases?

First of all, reuse is one of the places where test cases are powerful. However, you have the minesweeper problem: If you walk the same lane in a mine field over and over, as new mines are constantly added, it's likely that the number of mines beside your narrow track starts to build up while few will happen to end up in your path. Meaning, running the same tests over and over is less likely to catch new bugs than creating new tests, so the value quickly diminishes (more tests executed is not equal to more valuable ground covered).

What we often would suggest is to use knowledge acquired the first time as a foundation for new testing, to speed it up. Think about the new risks introduced and what needs to be tested based on that (like with new functionality) rather than how old test cases might fit into your testing.

Possible solutions:
  • Reuse of charters
  • Reuse of test ideas
  • Look at old session notes / summaries
  • Use old recordings (the simpler the form of the recordings the better for this; watching several hours of screen recording is probably a waste)
  • Start a wiki page/document/similar for each feature and add lessons learned, where to find info, problems etc. as you test.
There are many ways of achieving traceability (and similar potential benefits of test case trunks) in exploratory testing. Session Based Test Management principles seem to be the most straightforward way, but keeping track of test ideas or using other approaches works as well. All have their own contexts where they seem to work best (e.g. SBTM might add too much overhead for a simple project).

All in all, if someone claims "You lose traceability with exploratory testing", ask what that person means more precisely (e.g. presenting testing data to a customer) and explain the alternatives. Notice this is only based on our two-hour discussion and there is a whole lot more to add, so think for yourself as well! Also question whether you actually achieve the kind of traceability requested using a scripted approach, and at what cost. Finally, question whether the requested traceability is actually worth its cost, no matter if exploratory or scripted testing is used. Doing unnecessary work is wasteful no matter what approach you use.

Finally: There are still contexts where a highly scripted approach is likely the best option but the closer you get to a pure scripted approach the fewer and more extreme the contexts become.

Thank you for reading!

And thank you Helena, see you next week!
